Does Google penalize daily updated tags in sitemaps if the data is not daily updated?

@Jessie594

Posted in: #GoogleSearch #Seo #Sitemap

I've got a sitemap that is generated daily with a lot of links to product pages. These products are imported daily from another data source. Because each update throws away all current product info and replaces it with the newly imported info, the last-modified date always jumps forward a day, and that date is also used in the sitemap, even for products that haven't changed. As a result, all product pages pretend to have been updated.

Will Google penalize the website for pretending the pages have changed from day to day when they haven't?

My solution would be to change an entry only if the newly imported product data differs from the previous data. I just want to make sure this is a useful improvement to make, since I could also spend my time on other improvements. A minimal sketch of the idea follows (in Python; old_products, new_products, and old_lastmods are hypothetical dicts keyed by product id from my import pipeline):
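
from datetime import date

def compute_lastmods(old_products, new_products, old_lastmods):
    # Carry the previous lastmod forward unless the record changed.
    lastmods = {}
    for pid, record in new_products.items():
        if record == old_products.get(pid):
            lastmods[pid] = old_lastmods[pid]          # unchanged: keep old date
        else:
            lastmods[pid] = date.today().isoformat()   # changed or new product
    return lastmods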


7 Comments


 

@Sims2060225

I suggest you read Google's Best practices for XML sitemaps & RSS/Atom feeds:


Last modification time

Specify a last modification time for each URL in an XML sitemap and
RSS/Atom feed. The last modification time should be the last time the
content of the page changed meaningfully. If a change is meant to be
visible in the search results, then the last modification time should
be the time of this change.

XML sitemap uses <lastmod>
RSS uses <pubDate>
Atom uses <updated>


Be sure to set or update last modification time correctly:

Specify the time in the correct format: W3C Datetime for XML sitemaps, RFC3339 for Atom and RFC822 for RSS.
Only update modification time when the content changed meaningfully.
Don’t set the last modification time to the current time whenever the sitemap or feed is served.
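
To illustrate the W3C Datetime format for XML sitemaps, here is a small Python sketch (the timestamp is made up for the example; use the real time of the last meaningful change):

from datetime import datetime, timezone

# Hypothetical stored time of the last meaningful content change.
last_changed = datetime(2017, 5, 4, 10, 30, tzinfo=timezone.utc)

# isoformat() on a timezone-aware datetime yields a valid W3C Datetime
# value, e.g. 2017-05-04T10:30:00+00:00, ready for <lastmod>.
print(f"<lastmod>{last_changed.isoformat()}</lastmod>")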



 

@Lee4591628

I've never liked the idea of updating <lastmod> every day, as it's not just wrong, it's misleading to search engines.

In a post over on SO, Google's Gary Illyes wrote:


The lastmod tag is optional in sitemaps and in most of the cases it's ignored by search engines, because webmasters are doing a horrible job keeping it accurate.


I've generally advocated for either using <lastmod> correctly or not at all. Leaving it off (along with <changefreq> and <priority>) also makes the file smaller and quicker for search engines to read.



 

@Candy875

No, Google simply ignores the information you have provided when it is incorrect. In that case, web crawlers figure out by themselves how often they should crawl your pages.



 

@Michele947

I don't work for Google, and can't say for sure what they actually do, but the sensible way for them to treat <lastmod> timestamps would be as hints not to waste time re-crawling pages that haven't changed.

So if you report all your pages as changed every day, Googlebot will just keep crawling all your pages in whatever order it feels like, rather than only focusing on the pages that have changed. In effect, it's just as if you didn't report any last modification timestamps at all.

The main reason to provide correct <lastmod> timestamps is to make changes to your site show up faster in Google's index. If you have hundreds of pages on your site, it's going to take a while for Google to crawl them all and find any changes. However, if you tell Googlebot which pages have changed recently, it can crawl those pages first and avoid wasting so much time on the rest.

Of course, you could just bump up Googlebot's crawl rate in Webmaster Tools instead and hope for the best. But really, it shouldn't be too hard to make your update script preserve timestamps. For example (sketched in Python, with page_path() and render() standing in for however you locate and generate each page), I assume you're currently doing something like this:

for product in products:
    with open(page_path(product), "w") as f:
        # Unconditionally rewriting bumps the file's timestamp.
        f.write(render(product))


If so, just change it to something like this instead:

for product in products:
    new_content = render(product)
    try:
        with open(page_path(product)) as f:
            old_content = f.read()
    except FileNotFoundError:
        old_content = None  # page doesn't exist yet
    if new_content != old_content:
        # Only rewrite (and bump the timestamp) when something changed.
        with open(page_path(product), "w") as f:
            f.write(new_content)



 

@Karen161

Google will not penalize you for this. In order to get a penalty you really need to go black hat on Google's ass, so don't worry about that. Google will find out soon enough whether your content has changed (that's what they've been working on for the past few years) and use the lastmod property as a hint.



 

@Cofer257

No. Google will use lastmod as a hint (as it does all sitemap values), but if it decides that your content is not actually being updated daily, it will simply ignore it and revisit your pages on its own schedule.



 

@Kevin317

I've never heard anything about a penalty due to this. At worst you're wasting the spider's time, but that's part of why we have computers in the first place: doing tedious repetitive things. Still, you should ideally be addressing the issue.

This...


My solution would be to only change the entry only if the new imported product data differs from the previous data.


...is what you should be doing in the first place, regardless of external considerations like sitemaps. If your content isn't different (and I would include deleting and replacing with identical information in that description), then your lastmod date shouldn't be modified. As it stands, you're wasting your own resources. You haven't said how many products are involved, but at some point this process is going to get slow and computationally expensive.
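
One way to keep that comparison cheap as the catalogue grows (a Python sketch; storing the digest alongside each product is left to your own data layer) is to compare a fixed-size hash of each record instead of the full record:

import hashlib
import json

def fingerprint(record: dict) -> str:
    # Serialize with sorted keys so equal records always hash the same.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# During each import, treat a product as "changed" only when
# fingerprint(record) differs from the digest stored last time.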


