
Google Webmaster Tools: “Severe health issues are found on your site”

@Rivera981

Posted in: #Magento #RobotsTxt #WebCrawlers

I just saw the "Severe health issues are found on your site. Is robots.txt blocking important pages?" message on Google Webmaster Tools.

This is what I have in robots.txt:

User-agent: *
Allow: /*?p=
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
Allow: /media/
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /scripts/
Disallow: /shell/
Disallow: /skin/
Disallow: /stats/
Disallow: /var/
Disallow: /contents/
Disallow: /contents/fr/
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /admin/
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt
Disallow: /*.js$
Disallow: /*.css$
Disallow: /*.php$
Disallow: /*?p=*&
Disallow: /*?SID=
Disallow: /*?limit=all


I copied this from Magento's default robots.txt and I added some folders that don't need to be indexed.


@BetL925

Google shows that message when it finds that many pages it had been indexing on your site are now blocked by robots.txt. Whether the warning matters for you depends on whether you actually want those blocked pages crawled.

I'm not specifically familiar with Magento, but the following blocked URLs seem like they might contain important content that you want indexed in search engines:

Disallow: /contents/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /newsletter/


In addition to checking that those aren't disallowed by mistake, you should:


Run a site:example.com search on Google and see what they have indexed. Make sure nothing important in that list is now blocked by robots.txt.
Examine the URLs in your sitemap. Nothing in your sitemap should be blocked by robots.txt.
Use the "Fetch as Google" tool in Google Webmaster Tools to make sure that Google can actually download some of the URLs that shouldn't be blocked.
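
If you want to run the sitemap check offline, here is a minimal sketch in Python. It assumes a single "User-agent: *" group and follows the precedence Google documents for robots.txt: the longest matching pattern wins, Allow wins a tie, and a path with no matching rule is allowed. The function names, the shortened rule set, and the sample paths are made up for illustration; paste in your own robots.txt and the paths from your sitemap.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt pattern ('*' wildcard, trailing '$' anchor) to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def parse_rules(robots_txt):
    """Return (directive, pattern) pairs. Assumes one 'User-agent: *' group."""
    rules = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field in ("allow", "disallow") and value:
            rules.append((field, value))
    return rules

def is_allowed(rules, path):
    """Longest matching pattern wins; on a tie, Allow wins; no match means allowed."""
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

# A shortened version of the rules from the question.
ROBOTS = """
User-agent: *
Allow: /catalogsearch/result/
Disallow: /contents/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
Disallow: /*.css$
"""

# Hypothetical paths standing in for the URLs listed in your sitemap.
SITEMAP_PATHS = [
    "/about-us",
    "/contents/page.html",
    "/catalog/product/view/id/42",
    "/catalogsearch/result/?q=shoes",
    "/theme/site.css",
]

rules = parse_rules(ROBOTS)
for path in SITEMAP_PATHS:
    print(path, "allowed" if is_allowed(rules, path) else "BLOCKED")
```

Any sitemap path reported as BLOCKED is worth a second look. This only approximates how Google evaluates the file, so confirm anything surprising with the robots.txt tester in Webmaster Tools.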




I would advise against disallowing the crawling of all JavaScript and CSS files. You should remove the Disallow: /*.js$ and Disallow: /*.css$ lines. Google has said that they really want to be able to render your pages. They do so for screenshots as well as for ranking purposes. Your site may not rank as well if they can't do this.
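
To see which rules actually stand between Google and your scripts and stylesheets, here is a rough sketch that lists every Disallow pattern from your file that matches a given asset path. The asset paths are hypothetical (look at your own page source for the real ones), and the helper names are mine. Note that your file also disallows /js/ and /skin/, so any assets served from those directories would stay blocked even after the two wildcard lines are removed.

```python
import re

def matches(pattern, path):
    """True if a robots.txt pattern ('*' wildcard, trailing '$' anchor) matches the path."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in core.split("*"))
    return re.match(regex + ("$" if anchored else ""), path) is not None

# Disallow patterns from the question that can affect script and stylesheet URLs.
DISALLOW = ["/js/", "/skin/", "/*.js$", "/*.css$"]

def blocking_rules(path, disallow=DISALLOW):
    """Return every Disallow pattern that matches the path."""
    return [pattern for pattern in disallow if matches(pattern, path)]

# Hypothetical asset paths; replace them with the ones your pages reference.
ASSETS = [
    "/skin/frontend/default/default/css/styles.css",
    "/js/prototype/prototype.js",
    "/theme/custom.css",
    "/theme/custom.css?v=2",
]

for path in ASSETS:
    print(path, "->", blocking_rules(path) or "no matching Disallow")
```

The last path shows a side effect of the "$" anchor: a stylesheet URL with a query string does not end in .css, so the wildcard rule no longer matches it. This sketch only lists matching Disallow patterns; it does not weigh them against your Allow lines.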
