Is there a legal way to serve data-dump content without incurring search engine penalties?
I have a games site and I'm writing an API that will pull relevant Q&A from my copy of the SE data dump. I will abide by all licensing rules.
That said, I'm hesitant to do this because Google will penalize my site for serving duplicate content.
Is there a legal way to serve data-dump content, while abiding by licensing terms, without incurring search engine penalties? I want to serve this content, but I don't want to compromise the main function of my website.
I plan to use Google's translation API to translate the content into my native language. I will not serve the content in English.
If you're worried, set up a robots.txt file on your server to tell Google (and other search engines) not to crawl the data dump content. Your users will still be able to access the information, but compliant crawlers will skip it.
A search on "robots.txt" will yield all the details of how to set this up.
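As a rough illustration of the robots.txt approach, here is a minimal sketch. It assumes all data-dump pages live under a hypothetical `/qa/` path on a placeholder domain (`example.com`); neither comes from the question, so adjust both to your own layout. It uses Python's standard `urllib.robotparser` to check that the rules block what you intend before you deploy them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: every data-dump page is served under /qa/.
# The path and the example.com domain are assumptions for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /qa/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Data-dump page: should be blocked for crawlers.
    print(is_crawlable(ROBOTS_TXT, "https://example.com/qa/12345"))
    # Regular site page: should remain crawlable.
    print(is_crawlable(ROBOTS_TXT, "https://example.com/games/chess"))
```

The `ROBOTS_TXT` string is what you would save as `/robots.txt` at the root of your site. Keeping the data-dump pages under a single path prefix means one `Disallow` line covers all of them without touching the rest of the site.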
From Google's article titled "Deftly dealing with duplicate content":
"...our algorithms won't view the same article written in English and
Spanish as duplicate content."
If you're reproducing licensed content in a different language from the original, Google should not treat your version as a duplicate of the original.
If you're reproducing content in the same language, you should link back to the source so that Google ranks the original higher than your version. The content reproduced on your site won't incur any "duplicate content penalties" that affect your whole site; it will simply mean that those particular same-language pages rank lower in search results than the original version.
For reliable human translation, I use and recommend myGengo.