
Question about robots.txt: Google shows URLs are blocked

@Berryessa370

Posted in: #Seo #WebCrawlers

I am working on the site dealsin.us, and Google Webmaster Tools shows that about 9,800+ URLs are blocked by robots.txt. You can view the robots.txt here. I have blocked some directories that are not meant for users and are only for the staff behind the website. I am really confused and would appreciate any help with this.




1 Answer


@Yeniel560

Webmaster Tools used to show all the URLs you had blocked with robots.txt (under Crawl Errors), but that functionality no longer appears to exist. Now there is only the Crawler Access section, which lists how many URLs are blocked.

If your pages are appearing in search results without problems (a quick site: search shows that is the case), there is probably no need to worry. The large count likely stems from extra URL parameters: for example, if your 'submit' page takes a parameter for every category, then every one of those parameterized URLs shows up as blocked.
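You can verify this effect yourself with the standard library's robots.txt parser: a single Disallow rule matches every URL under that path, parameters included. The rules and URLs below are illustrative, not copied from the actual dealsin.us file.

```python
# Check whether specific URLs are blocked by robots.txt rules, using only
# the Python standard library. The rules here are a hypothetical sample.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /engine/
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# One Disallow line blocks every URL under that directory, including every
# parameterized variant -- which is how a few rules can block thousands of URLs.
print(parser.can_fetch("*", "http://dealsin.us/engine/ajax.php"))   # False
print(parser.can_fetch("*", "http://dealsin.us/deals/some-offer"))  # True
```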

However, looking at the robots.txt I do notice a few things. First, the 'Allow' line: as Ilmari noted in the comments, it does not override the rules above it, it is simply redundant. You should remove that line, since everything not disallowed is crawled by default.

Second, the 'Sitemap' line should be separated from the rest, i.e. have a blank line after it. And the * wildcard after /engine/ does nothing, since robots.txt rules match from the start of the URL path anyway.
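Putting those fixes together, a cleaned-up file might look like the sketch below. The directory names and sitemap URL are illustrative assumptions, not the site's actual rules.

```
User-agent: *
Disallow: /engine/
Disallow: /admin/

Sitemap: http://dealsin.us/sitemap.xml
```

Note there is no Allow line (crawling is the default) and no trailing wildcard, since /engine/ already matches everything under that directory.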


