
How reliable is the number of hits reported by Google's site: command?

@Bryan171

Posted in: #Google #GoogleSearch #GoogleSearchConsole

A partner of my company (I'm not sure whether I can name them here, so I'll use example.com instead) asked me yesterday why there is such a big difference when they enter the site: command at Google first with "www" and then without "www" to check the number of indexed pages.

First of all, Google Webmaster Tools shows about 9,000,000 indexed pages.

"site:example.com" gives me about 10,000,000 hits.
"site:www.example.com" gives only about 4,000,000 hits.

At first I thought they had some subdomains registered, such as "blog.example.com" or "test.example.com".
But entering "site:example.com -site:www.example.com", which should list pages on any host other than www, brings up exactly 2 hits.
Where are the other 6,000,000?

The next thing is that adding a little more to the query, such as "site:www.example.com in" (or another common word), returns more hits than the bare site: command with www ("site:www.example.com"), sometimes about 2,000,000 more.

Is the number of hits really that inaccurate?


@BetL925

All document counts returned by Google are "estimates". They are often very poor estimates, and can be wildly inaccurate.

From "How precise is the number of results in a site: query?":


site: queries attempt to estimate how many pages are in our index, but we would never claim that it is an exact amount that is completely accurate. ...Once you get past a few thousand pages, that is not all that useful as a metric.


From Matt Cutts's comment on "Is Google Guilty Of Deliberate Query Sabotage?":


We try to be very clear that our results estimates are just that--estimates. In theory we could spend cycles on that aspect of our system, but in practice we have a lot of other things to work on, and more accurate results estimates is lower on the list than lots of other things.
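
To make the inconsistency in the question concrete, here is a minimal Python sketch of the arithmetic. It uses only the rounded figures quoted in the question; the variable names are my own, and nothing here queries Google. If the counts were exact, the pages outside www would have to equal the difference between the two site: counts, which they clearly do not.

```python
# Rounded estimates quoted in the question (not exact counts).
all_hosts = 10_000_000     # site:example.com
www_only = 4_000_000       # site:www.example.com
non_www_reported = 2       # site:example.com -site:www.example.com

# With exact counts, pages outside www would equal this difference.
non_www_expected = all_hosts - www_only
discrepancy = non_www_expected - non_www_reported

print(f"expected outside www: {non_www_expected:,}")
print(f"reported outside www: {non_www_reported:,}")
print(f"unaccounted for:      {discrepancy:,}")
```

A gap of that size cannot come from a handful of missing subdomain pages, which is consistent with the counts being rough estimates rather than totals you can add and subtract.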


You also might be interested in reading more about it:


Why Google Can’t Count Results Properly
Datacenter comments - Matt Cutts talks about site: estimates at 3:00
