
How to fix robots.txt file error in my GWT?

@Karen161

Posted in: #CrawlErrors #GoogleSearchConsole #RobotsTxt

In my blog's Webmaster Tools, there is a notification in the Crawl Errors section that says:
Google couldn't crawl your site because we were unable to access the robots.txt file.

My blog’s robots.txt file is:

User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Allow: /

Sitemap: example.blogspot.com/feeds/posts/default?orderby=UPDATED

I don't think the above file details are wrong, but I don't understand why I received such an alarming notification.


How can I fix this issue?


3 Comments


 

@Bryan171

User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Allow: /


The error comes up because of the Disallow: /search line. The * applies to every search engine bot, so by disallowing /search you are blocking them all from crawling your site's search pages. Note this part of the file:

User-agent: Mediapartners-Google
Disallow:


An empty Disallow: means that you are allowing the AdSense bot to crawl everywhere, unrestricted.

The Allow: / directive may not be interpreted by older bots, but it is well understood by Googlebot.
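To see how those rules behave in practice, here is a minimal sketch using Python's standard urllib.robotparser; example.blogspot.com and the test URLs are placeholders taken from the question, not real addresses:

from urllib.robotparser import RobotFileParser

# The rules quoted above, parsed directly from a string
# (example.blogspot.com and the test URLs are placeholders).
rules = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot falls under the * group: blocked from /search, allowed elsewhere.
print(parser.can_fetch("Googlebot", "http://example.blogspot.com/search?q=seo"))        # False
print(parser.can_fetch("Googlebot", "http://example.blogspot.com/2014/01/a-post.html")) # True

# The AdSense bot has an empty Disallow, so it may crawl everything.
print(parser.can_fetch("Mediapartners-Google", "http://example.blogspot.com/search?q=seo"))  # True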



 

@Bryan171

If you have a message saying "Google couldn't crawl your site because we were unable to access the robots.txt file", then it is not the contents of the robots.txt file that are in question; Google simply couldn't access the file. And when Google can't access a robots.txt file, it won't crawl the site.

Using Fetch as Googlebot in Webmaster Tools is a good idea. If your robots.txt file fetches successfully, the error may have been a temporary, past issue. If not, then you obviously need to look further to ensure Googlebot can access it.
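As a rough check outside of Webmaster Tools, a short script along these lines fetches the robots.txt file and reports the HTTP status; anything other than a 200 response (or a timeout) is the kind of failure that produces this message. The URL below is only a placeholder:

import urllib.request
import urllib.error

ROBOTS_URL = "http://example.blogspot.com/robots.txt"  # placeholder; use your blog's address

try:
    with urllib.request.urlopen(ROBOTS_URL, timeout=10) as response:
        print("Status:", response.status)        # 200 means the file is reachable
        print(response.read().decode("utf-8"))   # the rules crawlers actually see
except urllib.error.HTTPError as err:
    # A 4xx/5xx response means the server refused or cannot serve the file
    print("HTTP error fetching robots.txt:", err.code)
except urllib.error.URLError as err:
    # DNS failures or connection problems also make robots.txt unreachable
    print("Could not reach the server:", err.reason)

Keep in mind this only shows how the file looks from your own network; Googlebot could still be blocked by a firewall or server rule that treats it differently.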



 

@Angela700

There is no such official directive as Allow in robots.txt; by default, everything is allowed. (It is, however, possible to use Allow to carve out exceptions when you are disallowing many directory paths at once. Often there is no need for this, though.)

Not that I would expect it to cause an issue, however.

There is no reason to specify the Mediapartners-Google user agent either, as that group, too, just says to allow crawling of everything.

All your robots.txt needs from the above is the following:

User-agent: *
Disallow: /search/

User-agent: Mediapartners-Google
Disallow: /

Sitemap: latest-seo-news-updates.blogspot.com/feeds/posts/default?orderby=UPDATED

Google Webmaster Tools will report a warning saying that X number of URLs on your site were blocked by your robots.txt whenever you disallow bots from any part of your site, which in this case is /search/. You can expand that notice to see exactly which URLs were blocked, and you may well find that they are only the ones you wanted disallowed.

You can also run an application such as Xenu to crawl your site and establish exactly which URLs can be crawled. You can also use Fetch as Googlebot and test your robots.txt file from within Google Webmaster Tools, which will alert you to any further issues, or at least give complete details of them.
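If you would rather not run a full crawler, a quick sketch with Python's standard urllib.robotparser can read the live robots.txt and report which URLs it blocks for Googlebot; the addresses below are invented examples, so substitute your own blog and a few representative URLs:

from urllib.robotparser import RobotFileParser

# Placeholder address; substitute your own blog.
parser = RobotFileParser("http://example.blogspot.com/robots.txt")
parser.read()  # fetch the live file, much as a crawler would

test_urls = [
    "http://example.blogspot.com/",
    "http://example.blogspot.com/search/label/seo",
    "http://example.blogspot.com/2014/01/some-post.html",
]

for url in test_urls:
    status = "allowed" if parser.can_fetch("Googlebot", url) else "BLOCKED"
    print(status, url)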

Edit:
Upon further clarification, added Disallow directive for UA Mediapartners-Google.


