Google is displaying a URL in my site's index with the cache of that result being completely different

OK, this may turn out to be a silly question, but Google is displaying a URL from my site in its index, and the cached copy of that result is a completely different page. The URL shouldn't even be in the index in the first place.

Description:

I built a random function for the website docur.co

The function initiates with a request to:
docur.co/random
The robots are blocked from this URL:
docur.co/robots.txt
However Google has followed this URL and produced the following search result:

This is the cache:

My question is: can anyone tell me what exactly is going on here? As I said, I may have done something wrong...

Update:

Maybe adding rel="nofollow" directly to the anchor, on top of the robots directive, will ensure that Google does not follow the URL?

@Si4351233


@Berumen354

You have an error in your robots.txt file.

On line 11 you have Allow: /. The original robots.txt standard only describes what crawlers must not fetch, and its only directives are "User-agent" and "Disallow". Allow is a later extension: Google and most major crawlers honour it, but a strict parser may not recognise it.

As the Disallow: /random rule comes after that line, it is possible that a crawler which did not recognise Allow stopped processing there and treated the robots.txt file as if it didn't exist at all. Google's own documentation says that lines it does not recognise are simply ignored, so this is more likely to affect other crawlers than Googlebot.
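Whether the position of Allow: / relative to Disallow: /random matters depends on how a crawler resolves conflicting rules. The sketch below is an illustration only, not any crawler's actual code: it compares an order-sensitive "first match wins" strategy with the "longest matching path wins" strategy that Google documents. The function names and the two-rule list are assumptions made for the example, and wildcards are deliberately not handled.

```python
from typing import List, Tuple

# (directive, path prefix), e.g. ("disallow", "/random").
# Hypothetical representation used only for this sketch.
Rule = Tuple[str, str]


def first_match_allows(rules: List[Rule], path: str) -> bool:
    """Order-sensitive: the first rule whose prefix matches decides."""
    for directive, prefix in rules:
        if path.startswith(prefix):
            return directive == "allow"
    return True  # no rule matched, so fetching is allowed


def longest_match_allows(rules: List[Rule], path: str) -> bool:
    """Order-independent: the longest matching prefix decides; allow wins ties."""
    best_len = -1
    allowed = True  # default when nothing matches
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# Same order as in the question's file: Allow: / appears before Disallow: /random.
rules = [("allow", "/"), ("disallow", "/random")]
print(first_match_allows(rules, "/random"))    # True: Allow: / matched first
print(longest_match_allows(rules, "/random"))  # False: /random is the longer match
```

Under the first strategy the order of the two lines changes the outcome; under the second it does not. Putting the more specific Disallow lines first is a cheap way to get the same result from both kinds of parser.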

You can validate your robots.txt file using a tool such as the one located at tool.motoricerca.info/robots-checker.phtml

As for why the cached version differs from the live version: the cached version is what Google saw at the time its spider crawled the page, which in the case of your cached link was 6 April 2016 at 16:05:27 GMT.

A new version of your robots.txt file which you could use is...
#The date is August 29th, 1997.
#Robots have taken over the world and documentaries cease to be created by humans.
#what will happen next?

#Want to join the Docur team?
#E-mail jonbonsilver//at//gmail//dot//com

#Full access for the internet archive.

User-agent: ia_archiver
Disallow: /random

#Every robot that honours the robots.txt standard:
User-agent: *
#Request file from Docur once every second:
Crawl-delay: 1
#Disallowed urls:
#Let's not send bots on a random documentary mission:
Disallow: /random
Disallow: /new-documentaries
#Above is a temp line due to indexing problems.
Disallow: /?page
Disallow: /live-search
Disallow: /vote
Disallow: /favourite
Disallow: /watch-later
Disallow: /save-list
Disallow: /comment
Disallow: /commentlike
Disallow: /commentdislike
Disallow: /add-review
Disallow: /submit-review
Disallow: /add-to/*
Disallow: /post-list
Disallow: /edit-list
Disallow: /documentary-search
Disallow: /new-list-item
Disallow: /settings
Disallow: /notificationread
Disallow: /documentary/*/l
Disallow: */newest
Disallow: */oldest
Disallow: */highest
Disallow: */lowest
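
Before deploying a file like this, a quick local check can catch the kind of problems discussed above. The following is a minimal sketch, not a full validator: lint_robots and KNOWN_DIRECTIVES are names invented for this example, and the set of accepted directives is an assumption (Crawl-delay and Sitemap are extensions that not every crawler honours). It flags unknown directives and a blank line directly after a User-agent line, which parsers following the original record format may treat as the end of the group.

```python
from typing import List

# Assumed set of directives for this sketch; adjust for the crawlers you care about.
KNOWN_DIRECTIVES = {"user-agent", "allow", "disallow", "crawl-delay", "sitemap"}


def lint_robots(text: str) -> List[str]:
    """Return human-readable warnings for a robots.txt body."""
    warnings: List[str] = []
    awaiting_rule = False  # True after a User-agent line until a rule follows
    for number, raw in enumerate(text.splitlines(), start=1):
        if raw.strip() == "":
            if awaiting_rule:
                warnings.append(
                    f"line {number}: blank line before any rule for this User-agent"
                )
                awaiting_rule = False
            continue
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # comment-only line
        if ":" not in line:
            warnings.append(f"line {number}: missing ':' separator")
            continue
        directive = line.split(":", 1)[0].strip().lower()
        if directive not in KNOWN_DIRECTIVES:
            warnings.append(f"line {number}: unknown directive '{directive}'")
        elif directive == "user-agent":
            awaiting_rule = True
        elif directive != "sitemap":
            awaiting_rule = False
    return warnings


# Deliberately broken sample: a blank line after User-agent and a misspelt directive.
sample = "User-agent: *\n\nCrawl-delay: 1\nDisalow: /random\n"
for warning in lint_robots(sample):
    print(warning)
```

A clean result from a check like this only means the file is well formed; it does not show how a particular crawler will interpret wildcard paths such as */newest.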
