
Can Google show pages that were deleted six months ago and blocked using robots.txt?

@Cofer257

Posted in: #Problem #WebCrawlers

If I deleted some pages from my website six months ago and blocked them using robots.txt, can those non-existent pages still show up on Google?


2 Comments


@Heady270

Robots.txt does not actually prevent Google from indexing a page; it only prevents Googlebot from crawling it. Google may choose to index pages that are blocked in robots.txt if it believes a page is important enough and it can tell something about the page from the links that point to it. Google has put forward two cases to justify doing so:


The California DMV used robots.txt to block its entire site. Users still want to be able to find its homepage, as well as deep links into some important pages on the site.
del.icio.us blocked Googlebot from crawling the tag pages for the bookmarks that people had made. These pages attracted enough external links, and were important enough, that Google put them in the index anyway.


When Google indexes a page without crawling it, it tends to make up the title based on external links. It usually won't have a description or a cached version.

If you add a page that has already been indexed to robots.txt, Googlebot will stop crawling it, but the decision to stop indexing it may come much later. Google may be happy to keep the page in the index until it feels that the copy it last crawled is sufficiently out of date, and it may continue to index it indefinitely.

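If you want to remove a page from the index, you should either request removal through Webmaster Tools or use the robots noindex meta tag. Note that Googlebot can only see a noindex tag on a page it is allowed to crawl, so a page carrying the tag must not also be blocked in robots.txt.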
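
For reference, the robots noindex meta tag is a `<meta name="robots" content="noindex">` element in the page's `<head>`. Below is a minimal Python sketch for checking whether a page's HTML carries it. The helper names (`RobotsMetaParser`, `has_noindex`) and the sample page are made up for illustration; this is not how Google itself processes the tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page used only for this example.
SAMPLE_PAGE = """<html><head>
<title>Old page</title>
<meta name="robots" content="noindex, follow">
</head><body>This page should drop out of the index.</body></html>"""

print(has_noindex(SAMPLE_PAGE))
```

The check only looks at the generic `robots` meta name; crawler-specific names such as `googlebot` would need an extra comparison.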

Google views robots.txt as a "do not crawl" directive, but not a "do not index" directive.
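
To make the "do not crawl" reading concrete, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt rules, the `may_crawl` helper, and the example.com URLs are assumptions made up for this example. The point is that robots.txt can only answer "may this crawler fetch this URL?"; it has no way to express "remove this URL from the index".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt blocking a directory of deleted pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-pages/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Blocked from crawling, but this says nothing about whether the URL
# stays in a search index.
print(may_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/old-pages/deleted.html"))

# Not covered by any Disallow rule, so crawling is allowed.
print(may_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/index.html"))
```

Note that Python's parser is a generic implementation and may not match Googlebot's handling of every rule, so treat the result as an approximation.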


@Goswami781

It is possible, though unlikely, that the page content would remain cached on some Google servers for this long.

If you want to make sure this content disappears from Google entirely, you should sign up for a Google Webmaster Tools account and verify your site. From there you can ask Google to remove the content from its cache. See this article for more information: support.google.com/webmasters/bin/answer.py?hl=en&answer=1663416
