Will the content of a PDF on our website affect SEO?
I know that Google crawls websites including PDFs, but does the content contained within PDFs affect SEO rankings?
We want to put some PDFs on our website but don't want to see our results take a nose-dive as a result.
I know I can just add a robots.txt directive to exclude them, but I would rather not do this if I don't need to (and frankly don't trust the crawlers to not just index them anyway).
3 Comments
If the PDF content is unique, then you shouldn't have a problem. If the PDF content is exactly the same as another page, then you might have a problem.
In this situation I would use a canonical link. Unfortunately, a PDF cannot carry a canonical link element the way an HTML page can, but as indicated in this Google Webmaster Tools answer:
If you can configure your server, you can use rel="canonical" HTTP
headers to indicate the canonical URL for HTML documents and other
files such as PDFs.
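As a rough illustration of what that header looks like, here is a minimal Python sketch that builds the value of a `Link` response header pointing at the canonical URL. The function name and the example.com URL are made up for the example; how you actually attach the header depends on your server.

```python
def canonical_link_header(canonical_url: str) -> str:
    """Build the value of a Link header that marks canonical_url as canonical.

    Hypothetical helper: the resulting string would be sent as
    'Link: <value>' on the HTTP response that serves the PDF.
    """
    return f'<{canonical_url}>; rel="canonical"'


if __name__ == "__main__":
    # example.com is a placeholder for the HTML page the PDF duplicates.
    value = canonical_link_header("https://www.example.com/page.html")
    print("Link: " + value)
```

The idea is that the response serving the duplicate PDF carries this header, so the HTML page is treated as the preferred version.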
If the material in the PDF is related to the material on the site, it is not going to dilute the theme of the site. Search engines do read PDFs; for example, searching "test filetype:pdf" will bring up PDFs which contain the word "test".
To rephrase the question to make answering it easier: if the content of the PDF were in an HTML file, would it hurt the site? Generally speaking, content is good.
In the eyes of Google, a PDF is just another web page, one that offers you a prime opportunity to get your content ahead of your competitors, and offers them the same opportunity against you.
The reason I say this is that Google ranks PDF files in the SERPs, so it certainly crawls them. If the PDF content is fresh and relevant, it will improve your website's reputation. If you think the PDFs would be harmful, it is better to keep crawlers away from them.
Use robots.txt to block the files from search engine crawlers:
User-agent: *
# Block the /pdfs/ directory.
Disallow: /pdfs/
# Block PDF files by extension. Wildcards are non-standard but work for major search engines.
Disallow: /*.pdf$
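If you want to sanity-check the directory rule before deploying it, Python's standard `urllib.robotparser` can evaluate it locally. This is only a sketch: the example.com URLs are placeholders, and the wildcard rule is left out because, as far as I know, the standard-library parser only does plain prefix matching and does not understand `*` inside a path.

```python
from urllib.robotparser import RobotFileParser

# Only the directory rule from the answer above; the wildcard rule is
# omitted because this parser matches paths by simple prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
"""


def is_allowed(url: str, user_agent: str = "*") -> bool:
    """Return True if user_agent may fetch url under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URLs, one inside the blocked directory and one outside.
    for url in ("https://www.example.com/pdfs/report.pdf",
                "https://www.example.com/about.html"):
        print(url, "allowed" if is_allowed(url) else "blocked")
```

This only tells you how a well-behaved crawler should read the file; it says nothing about whether a given crawler actually honours it.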
Use nofollow on your links to the PDF:
<a href="something.pdf" rel="nofollow">Download PDF</a>
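You can also use the X-Robots-Tag HTTP header to prevent them from being indexed. (A crawler has to be able to fetch the file to see this header, so it will not work for files that are also blocked in robots.txt.)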
HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
(…)
X-Robots-Tag: googlebot: nofollow
X-Robots-Tag: otherbot: noindex, nofollow
(…)
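How you add that header depends on your server, but the logic is simple enough to sketch in Python. The helper below is hypothetical (the name and the choice of "noindex, nofollow" for every PDF are my assumptions): it returns the extra response headers for a requested path, and you would merge them into the response in whatever framework you use.

```python
from urllib.parse import urlsplit


def robots_headers(request_path: str) -> dict:
    """Return extra response headers for request_path.

    PDFs get an X-Robots-Tag asking crawlers not to index the file or
    follow its links; every other path gets no extra headers.
    """
    # Drop any query string so "/a.pdf?download=1" is still seen as a PDF.
    path = urlsplit(request_path).path
    if path.lower().endswith(".pdf"):
        return {"X-Robots-Tag": "noindex, nofollow"}
    return {}


if __name__ == "__main__":
    # Placeholder paths for illustration.
    for p in ("/files/report.pdf", "/index.html"):
        print(p, robots_headers(p))
```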
If you follow the first two points, the PDF will not affect your SEO, no matter what it contains.