Could a page with 300+ links be penalized by Google?
I have an HTML page with unique content (text, images) that also contains a calendar component at the bottom of the page. The calendar markup contains 365 anchor elements with different URLs and different content.
Could search engines penalize the webpage for having so many URLs?
The structure of pages would look like this:
page.html
calendar.php?day=20151224
calendar.php?day=20151225
...
calendar.php?day=xx
page.html contains 365 links to calendar.php, each with a different parameter value.
calendar.php won't be indexed by Google, but page.html should be indexed.
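For concreteness, here is a minimal sketch of how such a calendar block might be generated. The PHP approach, markup, and anchor labels are assumptions for illustration only, not taken from the actual page:

<?php
// Illustrative only: emit one link per day of 2015 for the calendar
// block at the bottom of page.html (labels and markup are assumed).
$start = new DateTimeImmutable('2015-01-01');
for ($i = 0; $i < 365; $i++) {
    $day = $start->add(new DateInterval("P{$i}D"));
    printf(
        "<a href=\"calendar.php?day=%s\">%s</a>\n",
        $day->format('Ymd'),  // e.g. 20151224, matching the day= values above
        $day->format('M j')   // anchor text, e.g. "Dec 24"
    );
}
?>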
3 Comments
That shouldn't be a problem unless most of the anchor text is the same. Google now penalizes sites where lots of links share the same anchor text (keywords); it sees this as "bought" or unnatural backlinks.
Hundreds of links should be fine, assuming the content isn't too repetitive, thin, or keyword-stuffed. Many sites use hundreds of links on every single page -- especially big e-commerce sites, due to their huge menus of categories/departments. Wayfair, for example, has 568 links, and they go [mostly] to diverse "high value" areas, mostly categories.
Also, it's quite common to use a router-style page like you are doing. This comes mostly in the form of index.php?go=there; Google understands it perfectly fine and won't nuke you into page 90 because of it. That said, you could squeeze a bit more rank out by making the URL structure cleaner: eliminate the router file and query-string parameter, either in your application or with .htaccess rewrites, and you avoid having to deal with the parameter teaching in the next paragraph.
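If you do take the .htaccess route, a minimal mod_rewrite sketch could look like the following; the clean-URL pattern /calendar/YYYYMMDD is an assumption chosen for illustration, not something from the question:

RewriteEngine On
# Internally map a clean URL such as /calendar/20151224
# to the router-style URL calendar.php?day=20151224
RewriteRule ^calendar/([0-9]{8})$ calendar.php?day=$1 [L,QSA]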
As far as link density versus assimilation goes, your case is a bit different from a large e-commerce site because everything points to a similar URI. So you need to make sure the day pages have enough content, that there is enough diversity in that content as well as in the page titles and metas (see the sketch after the settings below), and that Google understands the "day=" parameter. You can "teach" Google about that parameter in Webmaster Tools > Crawl > URL Parameters. First check whether day= is already in the list; if it is, click Edit, then expand "Show example URLs" to make sure it's operating correctly. If the parameter isn't there, add it. In either case, I would teach it to:
(Does this parameter change page content seen by the user?) Yes: Changes, reorders, or narrows page content
(How does this parameter affect page content?) Specifies
(Which URLs with this parameter should Googlebot crawl?) Let Googlebot decide, or you could choose to crawl only URLs with a specific value for calendar.php
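As a rough sketch of the title/meta diversity mentioned above, assuming calendar.php is plain PHP (the variable names, validation, and wording are all assumptions):

<?php
// calendar.php -- hypothetical head section that varies by day
$raw   = isset($_GET['day']) ? $_GET['day'] : date('Ymd');
$day   = preg_match('/^\d{8}$/', $raw) ? $raw : date('Ymd');
$date  = DateTime::createFromFormat('Ymd', $day);
$label = $date ? $date->format('F j, Y') : 'Calendar';
?>
<title>Events on <?php echo htmlspecialchars($label); ?> | Example Site</title>
<meta name="description"
      content="Schedule and event details for <?php echo htmlspecialchars($label); ?>.">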
Note: I'm not exactly sure why people are still recommending the 90s-era technique of using robots.txt to block parameters that obviously affect site content in a pivotal way. Think of all the sorts, limits, and filters in the world. Google knows exactly what they mean now and can decide for itself whether a given view is important. Strangely, sometimes an odd limit and sort combined with a specific filter is indeed its favorite flavor of a view. But that's okay, because those parameters all point to a canonical, right? Just like the day= pages would point to the canonical page.html, right? So instead of brute-force blocking day= and assuming it's "for the best", try the modernized parameters method first and trust that Google can comprehend much more than it used to.
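And if you did decide that each day= view should point back to one canonical page, the tag in calendar.php's head might look like the line below; whether page.html really is the right canonical for every day is a judgment call, and the domain is a placeholder:

<link rel="canonical" href="https://www.example.com/page.html">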
Several hundred links on a page is not unusual these days. Look at this visualization of the number of links on the home page of each of the top 98 websites. Most have hundreds of links. Google can't drop pages with hundreds of links without affecting all of these popular home pages.
A calendar widget can be problematic for SEO when it creates a link for each day of the year. That produces many thin-content pages. Most of the day pages are not going to have many (if any) events. In general it is better for SEO not to create pages without content. Google's Panda algorithm is particularly hard on sites that create lots of thin-content pages. Having such pages could cause your entire site to rank poorly.
You would want to make sure Googlebot doesn't crawl the pages for each day. This rule in robots.txt (inside a user-agent group) would prevent that:
User-agent: *
Disallow: /calendar.php
If Googlebot can't crawl all those links, I wouldn't expect them to hurt you in any way. It would be very unlikely that Google would try to index them. Googlebot will still be able to crawl and index page.html.
I've worked on sites that have large numbers of links on each page disallowed in robots.txt. That situation is quite common for advertising links, product attribute selection, and dynamic databases. Large numbers of such links are usually fine.