
How to tell Googlebot to crawl more pages from my site?

@Holmes151

Posted in: #Googlebot #GoogleSearch #Sitemap

I have a small local news site that adds hundreds of articles every day. I have set up a cron job that lists the URLs of new articles in a sitemap every hour.

For example, I have 24 sitemaps, one generated per hour.

sitemap01.xml contains the articles from 0:00 - 1:00

sitemap02.xml contains the articles from 1:00 - 2:00

<url>
<loc><![CDATA[http://www.example.com/article/'.$row['id'].']]></loc>
<lastmod>'.date("c",strtotime('now')).'</lastmod>
</url>
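The fragment above stamps every entry with the time the sitemap was generated. In the sitemap protocol, `lastmod` is meant to be the date the page itself was last modified, so here is a minimal sketch of an hourly sitemap builder that takes each article's own timestamp instead. The function name `sitemap_xml` and the `(url, modified)` input shape are assumptions for illustration, not part of the original script.

```python
from datetime import datetime, timezone
from xml.sax.saxutils import escape


def sitemap_xml(articles):
    """Build a sitemap document from (url, modified) pairs.

    `modified` is assumed to be a timezone-aware datetime holding the
    article's own last-modified time, not the time the sitemap was built.
    """
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url, modified in articles:
        lines.append("  <url>")
        # escape() turns &, < and > into XML entities, so no CDATA is needed.
        lines.append(f"    <loc>{escape(url)}</loc>")
        # isoformat() on an aware datetime yields a W3C-style timestamp.
        lines.append(
            f"    <lastmod>{modified.isoformat(timespec='seconds')}</lastmod>"
        )
        lines.append("  </url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


# Hypothetical article published at 00:30 UTC.
example = sitemap_xml(
    [("http://www.example.com/article/1",
      datetime(2012, 10, 10, 0, 30, tzinfo=timezone.utc))]
)
print(example)
```

Whether a more accurate `lastmod` changes how often Google fetches the sitemaps is not something this sketch can promise. It only keeps the file consistent with what the field is defined to mean.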


I have checked the sitemap list in Google Webmaster Tools and found that Google doesn't crawl my sitemaps every day; for some sitemaps the last crawl date is 10 days ago. So right now only 8-40 articles are added to the Google index each day.

But when I use

cat /var/log/httpd/access_log | awk '{print $1}' | sort | uniq -c | sort -n | tail


I can see 7 Googlebot IP addresses in my access_log, with 5000+ requests to my site per day in total. What are they doing?
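One way to see what those requests are for is to break them down by requested path rather than by IP address. The following is a minimal sketch, assuming the log uses Apache's "combined" format (with the user agent as the last quoted field). The names `LOG_LINE` and `googlebot_paths` are made up for this example. Matching on the user-agent string alone can be fooled by other crawlers claiming to be Googlebot, so treat the counts as approximate.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_paths(lines):
    """Count requests per path for lines whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            counts[match.group("path")] += 1
    return counts


# Two made-up log lines: one from Googlebot, one from a regular browser.
sample = [
    '66.249.66.1 - - [10/Oct/2012:13:55:36 +0000] "GET /article/123 HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2012:13:55:40 +0000] "GET /article/456 HTTP/1.1" '
    '200 1800 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/15.0"',
]
for path, hits in googlebot_paths(sample).most_common(10):
    print(hits, path)
```

To run it against the real log, pass an open file instead of `sample`, for example `googlebot_paths(open("/var/log/httpd/access_log", errors="replace"))`. If most of the hits turn out to be old articles, images, or sitemap files rather than new articles, that would explain a high request count alongside few newly indexed pages.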

And if I search with encrypted.google.com/webhp?hl=en#q=site:example.com&hl=en&tbs=qdr:h&start=0
some pages show up in the search results every hour, but they are only news articles. Once an article is published it is essentially a static page; only visitors' comments get added to it.

So how can I get Google to crawl more of my pages and crawl more efficiently, so that it fetches all my new pages and skips the older pages it has already crawled?

Thanks.


1 Comments


@Ann8826881

Google does not guarantee that it will use any sitemap that is submitted. In fact, Google is rather old school and prefers to crawl your site by following links. Most of the time, Google checks the sitemap against what its crawler is able to discover and compares the two. If there is no difference, Google will likely ignore the sitemap, except to keep comparing it with what it has discovered.

Google also visits each site according to its freshness and, in part, its SERP performance. If your site updates pages often, Google will increase the frequency of fetches for your site over time. If your site is very popular, Google may ramp that frequency up a little faster.

Some pages are determined to be timely and relevant to current events, hence your news pages. These are often understood and indexed fairly quickly, but they are not re-fetched often enough for changes such as user comments to show up. Google does prefer pages that look like conventional web pages. So-called "live" content, for example, is not preferred, and an ever-changing page of that kind is not fetched often. Google can also recognize comments and does not treat them as fully relevant or as valuable to pick up in a timely manner. So if a page is popular and collects comments, Google does not necessarily see that as an update to the content portion of the page, and it therefore only partially affects the freshness signal.

In the end, you have no control over Google and you cannot tell it to hurry up. Google will do what Google will do when Google feels it is best. My recommendation is that your site appear as old school as possible and that the actual content be kept as fresh as possible. Do not try to fool Google. Google knows the tricks by now. Just do honest work and it will all work out okay.
