Mobile app version of vmapp.org

Does Google search prefer PDF files over HTML files in its search results?

@Vandalay111

Posted in: #ContentType #GoogleSearch #SearchResults

I have noticed for some time that when I search for something using Google, I get lots of hits that are all PDF links. (I know I can exclude PDF hits using -filetype:pdf.)

But the strange thing is that there may also be an HTML page with the same content, on the same site as the PDF file, yet the HTML link seems to have lower priority and is not shown until many pages later.

For example, I keep both a PDF file and an HTML page with the same content. Yet when I search for the title, Google always returns a link to the PDF file first, and I have to flip through many pages of results to see the link to the HTML page, even though the PDF itself is on that page!

My question is: is this a known bias in Google's search algorithm toward returning links to PDFs over HTML pages? If so, what is the reason behind it?

I do not have proof of this; it is based only on my own observation. It would be interesting if there were a way to verify or refute this bias for PDF links over HTML links more systematically.
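
One rough way to make such an observation more systematic is to record, for each query, the ordered list of result URLs and compare where the first PDF and the first non-PDF link appear. The sketch below assumes the ranked URLs have already been collected by hand or by some other means; it does not query Google. The function names and the sample URLs are hypothetical, and detecting a PDF by its `.pdf` path suffix is only a heuristic.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse


def is_pdf(url: str) -> bool:
    """Heuristic: treat a URL as a PDF if its path ends in .pdf."""
    return urlparse(url).path.lower().endswith(".pdf")


def first_ranks(results: List[str]) -> Dict[str, Optional[int]]:
    """Return the 1-based rank of the first PDF and the first non-PDF result.

    A value of None means no result of that kind was found in the list.
    """
    ranks: Dict[str, Optional[int]] = {"pdf": None, "html": None}
    for rank, url in enumerate(results, start=1):
        kind = "pdf" if is_pdf(url) else "html"
        if ranks[kind] is None:
            ranks[kind] = rank
    return ranks


# Hypothetical ranked results for a single query, in the order shown.
sample = [
    "https://example.com/docs/report.pdf",
    "https://other.example/blog/post",
    "https://example.com/docs/report.html",
]
print(first_ranks(sample))
```

Repeating this over many queries, and over pages known to exist in both formats, would give a count of how often the PDF outranks its HTML twin, which is more informative than a single search.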


1 Comment


@Mendez628

I don't have any evidence, but there are many reasons why Google may prefer the PDF version. For example, if that is the most linked-to version, Google may deem that copy of the content to be the best source, especially if the HTML alternative links to the PDF too. This might give more relevance to that version of the content.

Also, if the two copies of the content are largely the same, then for the sake of diversity they are not likely to be presented side by side on the search results page.

It is also possible that Google has evolved to understand the type of content that is in a PDF, and it may consider certain types of content, such as a scientific paper, best shown in that format.

I would be interested to know if anybody has any hard data!
