
How do I find when a URL was first indexed by Google?

@Tiffany637

Posted in: #Dates #Google #GoogleIndex #Indexing #Url

How do I find out when a particular URL was first indexed by Google? I'd prefer a solution that works even for competitors' URLs that are not owned by me.


3 Comments


@Michele947

There may not be any way to find out when an arbitrary web page was first indexed by Google — certainly I don't know of any way to do so. It's possible that Google simply does not store that information, since there's no real reason why they'd need to. Besides, even if they do store this information, they really have no particular reason to make it freely available to third parties.

(If it's your own page, and you have access to your old webserver access logs, it's easy — just search the logs for the first visit from Googlebot to that page. But otherwise there may be no way to tell for sure.)
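For your own site, the log search mentioned above could be sketched roughly as follows. This assumes the server writes the common "combined" access log format and that a user agent containing "Googlebot" really is Googlebot (user agents can be spoofed, so treat the result as an estimate). The function name `first_googlebot_visit` and the regular expression are illustrative, not taken from any particular tool.

```python
import re
from datetime import datetime

# Assumed layout (combined log format):
# IP - - [time] "METHOD path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?:\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def first_googlebot_visit(lines, path):
    """Return the earliest time a Googlebot user agent requested `path`, or None."""
    earliest = None
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        if match.group("path") != path:
            continue
        if "Googlebot" not in match.group("agent"):
            continue
        seen = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        if earliest is None or seen < earliest:
            earliest = seen
    return earliest
```

You would call it with an open log file, for example `first_googlebot_visit(open("access.log"), "/page.html")`, and repeat over any rotated older logs you still have. It only tells you when the page was first crawled, which is a lower bound on when it was indexed.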



In any case, the method described by Zistoloen and Stephen Ostermiller in their answers does not generally reveal the date when a particular URL was first indexed by Google. Rather, it shows the date on which Google thinks the content at the URL was published or last updated, and that date is often based on Google's more or less reliable attempts to "sniff" dates from the page content itself.

In this video, Google's Matt Cutts touches briefly on how these dates are chosen. For convenience, I've transcribed the relevant piece of the video (approximately from 2:09 to 2:22) below:


"... often you'll see the date, as we infer it, or when we first saw it, whenever we crawled that page, or if we can find it somewhere on the page, and we can extract that date, you'll see that right at the very beginning of the snippet."


For pages like blog posts, wiki pages or Stack Exchange questions, where the software running the site automatically reports an accurate creation / modification date on the page itself, the date reported by Google is likely to match it. For other types of pages, though, Google's date sniffer has to work harder, and it doesn't always get it right (whatever "right" may mean, in this context).

In particular, these dates are basically useless for determining how long ago a page was indexed, for two reasons:


1. If a page was modified recently, and the modification date is displayed prominently on the page, Google may pick it up as "the date" of the page, even if the modification was completely trivial.

For example, this rather old wiki page (which archive.org first indexed in 2003) is currently datestamped by Google as being from November 10, 2014, the date on which it was most recently edited, as shown at the bottom of the page. The change that happened on that date? Just removing a single link from the bottom of the page.

2. Conversely, Google seems to be happy to accept very old "publication dates" if it finds them on the page, even those that predate the launch of the World Wide Web.

For example, this page on an old programming contest is dated by Google to September 15, 1986, which is actually the date of the event described on the page. Similarly, this page documenting a student strike in 1970 is dated by Google to May 10, 1970 (the date of one of the scanned documents on the page), and, even more absurdly, this Linux manual page is dated by Google to November 4, 1989 (a random example date used on the page).

You can find plenty more such examples by using the custom date range search described by Stephen and Zistoloen, but setting the upper end of the range to, say, August 6, 1991.


@BetL925

To find the age of a URL, you can use the following link, replacing example.com with the URL you want:
www.google.com/search?tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2000&q=site%3Ahttp%3A%2F%2Fwww.example.com&safe=active&gws_rd=ssl
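If you need to do this for several URLs, the link above can be assembled programmatically. The sketch below is only an illustration: the `tbs=cdr:1,cd_min:...` and `q=site:...` parameters are copied from the link above, while the `https://` scheme, the optional `cd_max` upper bound and the function name `google_date_range_url` are my own assumptions. The `safe` and `gws_rd` parameters from the original link are left out because they do not affect the date range.

```python
from urllib.parse import urlencode


def google_date_range_url(target, min_date="1/1/2000", max_date=None):
    """Build a Google search URL restricted to a custom date range.

    `target` is the URL to look up; dates use the M/D/YYYY form seen in
    the link above. `max_date` (cd_max) is an assumed optional upper bound.
    """
    tbs = f"cdr:1,cd_min:{min_date}"
    if max_date:
        tbs += f",cd_max:{max_date}"
    params = {"tbs": tbs, "q": f"site:{target}"}
    # urlencode percent-encodes ':', ',' and '/' just like the link above
    return "https://www.google.com/search?" + urlencode(params)


print(google_date_range_url("http://www.example.com"))
```

Open the printed URL in a browser and read the date shown at the start of each snippet. Google does not publish this as a stable interface, so the parameters may stop working without notice.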

For example, here's the result from Google for the Meta site of Stack Overflow (screenshot not preserved).


Alternatively, the Wayback Machine is also a good option, though less precise in my experience.


@Heady270

Zistoloen found a way to have Google display the date when it first indexed the content of the page. I'm adding it to my answer as well because I think I can explain it more clearly.


1. Search Google for something that brings up the page you want as a result.
2. Use "Search Tools".
3. Select "Custom Range..." from the "Any time" drop down.
4. Put in a large date range such as 1/1/1900 to 1/1/2020.


Google will then show, in the search result, the date on which it discovered the content that is on the page.



If the page gets updated with new content, Google also updates this date. So it is more of a "first indexed this content" date than a "first indexed this URL" date.



The Google cache for a page shows when the page was last indexed. For example, at the time of writing it showed that the Stack Exchange home page was last indexed today (screenshot not preserved).





Another option is the Internet Archive's Wayback Machine, which shows you what a page looked like in the past. From that you can work out roughly when a page was first published, since both Google and the Internet Archive crawl a page shortly after it is first published.
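The Wayback Machine lookup can also be scripted. Here is a minimal sketch that asks the Internet Archive's CDX server for the earliest capture of a URL. It assumes the endpoint `https://web.archive.org/cdx/search/cdx` with the `url`, `output=json`, `fl` and `limit` parameters, and that results come back oldest first with a header row, which is how I understand that API to behave. The helper names are hypothetical.

```python
import json
from datetime import datetime
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"  # assumed endpoint


def cdx_query_url(page_url):
    """Build a CDX query asking for the timestamp of the first capture only."""
    params = {"url": page_url, "output": "json", "fl": "timestamp", "limit": "1"}
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_first_capture(rows):
    """Parse decoded CDX JSON: a header row followed by capture rows."""
    if len(rows) < 2:
        return None  # header only, or nothing at all: no captures
    header, first = rows[0], rows[1]
    stamp = first[header.index("timestamp")]
    # Wayback timestamps look like YYYYMMDDhhmmss
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


def first_wayback_capture(page_url):
    """Fetch the earliest capture time for `page_url` (needs network access)."""
    with urlopen(cdx_query_url(page_url), timeout=30) as resp:
        body = resp.read().decode("utf-8").strip()
    return parse_first_capture(json.loads(body) if body else [])
```

Calling `first_wayback_capture("example.com")` returns a `datetime`, or `None` if the archive has no capture. Keep in mind this is the archive's first crawl, not Google's, so it only approximates when Google could first have seen the page.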
