
How does Google recognize the publish date of a post?

@Annie201

Posted in: #GoogleSearch

When I search for something in Google, I sometimes see the publishing date of the post/article underneath the result. I've also searched for an article of my own, which I have on my WordPress-powered site, and Google recognizes its publish date as well.

When I open my website's source, I don't see any special tags or anything that indicates the publish date. It is only written in a regular div, with nothing special tagged that would tell the SE that it is the publishing date (I could have any other dates of other things around the page too).

So has Google hardcoded the exact location of the WordPress publish date in the DOM tree, or am I missing something?

I'm building a new website with my own CMS, and I'm trying to find out how to implement publish-date recognition.


5 Comments


@Ogunnowo487

I just had a problem where all of my main pages were shown as having been updated over 4 years ago, even though Google knows that's not true: the pages have been indexed for that long and change substantially from month to month. After being really puzzled, then really annoyed, then puzzled again, I finally found the problem. Our legal terms were being served in a hidden div containing "Last updated: October 30 2007", and the div was being loaded on almost all of our pages (because it pops up on registration). I've removed it, and I assume the date will now either disappear or be corrected to something more reasonable.

A cautionary tale and one more piece of evidence that they check the semantics of the site more than the technical details or their own indexing history.


@Ann8826881

I very much doubt that the published date of a post or article is based on the <lastmod> entry in an XML sitemap (as others have suggested), or on the Last-Modified HTTP header for that matter. An XML sitemap is only advisory, not authoritative. The last modified date of a document is probably not the same as the (original) publish date of an article. And, as I mentioned in my comment at the top of the page, the last modified date of a document is probably more important for caching and perhaps for determining crawl rates. The Last-Modified HTTP header of a dynamically generated page is often very close to the current date/time (as it is for WordPress blogs).

An RSS/Atom feed, on the other hand, does contain this specific nugget of information. Indeed, on WordPress sites that do not include the publish date in the content, the publish date still appears in Google's search results. And as far as I can tell, this matches the date in the RSS feed.

EDIT#1: However, an RSS feed does not necessarily contain all the pages. In most cases it should only contain the latest or most recently updated pages. But there is no reason that Google should forget what it has already read, and provided the content of that page has not changed, then neither should the last modified date.
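For a custom CMS, exposing the publish date in a feed could look roughly like the sketch below. It only illustrates writing a `<pubDate>` into an RSS 2.0 item; the helper name `build_rss_item`, the title, and the URL are made up for the example, and nothing here confirms how Google actually consumes the feed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
import xml.etree.ElementTree as ET


def build_rss_item(title: str, link: str, published: datetime) -> ET.Element:
    """Return an RSS 2.0 <item> element carrying the original publish date."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # RSS 2.0 uses RFC 822 style dates; format_datetime produces that format.
    ET.SubElement(item, "pubDate").text = format_datetime(published)
    return item


# Hypothetical post data, for illustration only.
item = build_rss_item(
    "Example post",
    "https://example.com/example-post",
    datetime(2010, 8, 27, 22, 45, tzinfo=timezone.utc),
)
rss_xml = ET.tostring(item, encoding="unicode")
print(rss_xml)
```

The key point is that the feed carries the original publish date, not the time the feed was generated.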

If there is no RSS feed, I think Google is clever enough to analyse the page content, particularly if dates are marked up 'semantically' with the help of microformats. It's perfectly feasible that Google will see the following as the authoritative published date for the article that contains it:

<abbr class="published" title="2010-08-27T15:45:00-0700">
Friday, August 27th, 2010
</abbr>
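To show that this markup is easy for a machine to read, here is a minimal Python sketch that pulls the date out of the `title` attribute of an `abbr` element with class `published`. The class and function names (`PublishedDateParser`, `extract_published`) are invented for this example, and it is only a guess at the kind of parsing a crawler might do, not a description of Google's implementation.

```python
from datetime import datetime
from html.parser import HTMLParser


class PublishedDateParser(HTMLParser):
    """Remember the title attribute of the first <abbr class="published">."""

    def __init__(self):
        super().__init__()
        self.published = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "abbr" and "published" in classes and self.published is None:
            self.published = attr_map.get("title")


def extract_published(html: str):
    """Return the machine-readable publish date, or None if it is not marked up."""
    parser = PublishedDateParser()
    parser.feed(html)
    if parser.published is None:
        return None
    # Matches the format used in the snippet above, e.g. 2010-08-27T15:45:00-0700.
    return datetime.strptime(parser.published, "%Y-%m-%dT%H:%M:%S%z")


SAMPLE = (
    '<abbr class="published" title="2010-08-27T15:45:00-0700">'
    "Friday, August 27th, 2010</abbr>"
)
print(extract_published(SAMPLE))
```

The human-readable text inside the element is ignored; only the `title` attribute is parsed.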


Google certainly does read microformats - hCard, hReview, etc.

Just to add, I don't think Google would state a publish date unless it was able to find something authoritative that would suggest this. It's not going to deduce a 'publish date' on speculative data, since an incorrect 'publish date' is no use to anybody and Google would get a lot of stick for it!

And just for the record (if @Tom is suggesting otherwise :) I think posts/articles should have the publish date visibly displayed. Many don't, and this can be frustrating for the reader, particularly when researching technology issues and you find, having read halfway through the article, that it's out of date!

EDIT#2: I have since experienced an annoyance similar to the one @mmdanziger details in his answer. On one of my old sites I have text of the form "Site Last Updated Sun 17th Jun 2012" (not marked up in any special way) at the top of every page (written to the page with JavaScript!!). This same date has been picked up by Google and now appears alongside several (but not all) of the pages that appear in the SERPs, and it certainly is not the publish date of those pages. It would seem that Google is simply scraping the page for a string of the form "last updated (datestring)" (having processed the JavaScript!!). This particular site does not have an RSS feed. The site does have a Sitemap.xml file, but the dates in it are different.

I have noticed similar behaviour on other sites also.
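To illustrate how easily an unmarked "last updated" string can be picked up, here is a rough Python sketch of that kind of scraping. The pattern and the `find_last_updated` helper are assumptions made for this example; they handle only day-month-year strings like the one quoted above and say nothing about what Google really does.

```python
import re
from datetime import date

# Month abbreviations mapped to month numbers (avoids locale-dependent parsing).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["jan", "feb", "mar", "apr", "may", "jun",
         "jul", "aug", "sep", "oct", "nov", "dec"],
        start=1,
    )
}

# "last updated", an optional weekday, then day, month name and year.
LAST_UPDATED = re.compile(
    r"last\s+updated:?\s+(?:[A-Za-z]{3,9}\s+)?"
    r"(\d{1,2})(?:st|nd|rd|th)?\s+([A-Za-z]{3})[a-z]*\s+(\d{4})",
    re.IGNORECASE,
)


def find_last_updated(text: str):
    """Return the first 'last updated' date found in text, or None."""
    match = LAST_UPDATED.search(text)
    if match is None:
        return None
    day, month_name, year = match.groups()
    month = MONTHS.get(month_name.lower())
    if month is None:
        return None
    return date(int(year), month, int(day))


print(find_last_updated("Site Last Updated Sun 17th Jun 2012"))
```

If a crawler does anything like this, a site-wide footer or hidden block containing such a string would be matched on every page, which fits the behaviour described in both answers.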


@Shanna517

You should use an XML sitemap or an RSS feed to get your publish dates indexed by the major search engines such as Google, Yahoo, and MSN. Generate an XML sitemap for your website and submit it in Webmaster Tools for indexing.


@XinRu657

I think it intelligently looks for any dates on the page and when it's confident that it's the relevant date it uses it.

It's a little difficult sometimes, as I think it can have a negative impact on SERP click-ability. I suppose it can have a temporary positive impact if it's a recent article/post, but I'm fairly sure my sites would be better off without it (Google searchers might not be better off without it, though!).

There are no options to control it via Google, only with your own methods. You can either:


Replace dates with dynamically generated images in an attempt to stop Google discovering them, but this can lead to other problems such as visual alignment, consistent font display, accessibility, etc.
Strip all dates from the pages (this again might be frustrating for visitors/users who want to find out the age of a source, if you have relevant information).


For these reasons I would just ignore it.


@Dunderdale272

I think Google uses the sitemap and RSS feed to recognize the published date.
You can implement this feature in your CMS by creating an XML sitemap according to the standard, with an entry like this for each page:

<lastmod>2011-08-18</lastmod>
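A CMS could generate such a sitemap along the lines of the sketch below. The `build_sitemap` helper and the example URL are hypothetical, and as noted elsewhere in this thread, `<lastmod>` is a last-modified date that search engines treat as advisory, so this is not guaranteed to control the date shown in results.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages) -> str:
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_modified in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # W3C date format (YYYY-MM-DD), as in the <lastmod> example above.
        ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page, for illustration only.
sitemap_xml = build_sitemap(
    [("https://example.com/example-post", date(2011, 8, 18))]
)
print(sitemap_xml)
```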
