
What is duplicate content and how can I avoid being penalized for it on my site?

@Angie530

Posted in: #CatchAll #DuplicateContent #Penalty

This is a general, community wiki question regarding duplicate content.

If your question was closed as a duplicate of this question and you feel that the information provided here does not provide a sufficient answer, please open a discussion on Pro Webmasters Meta.




What does Google consider to be duplicate content?
Will the way I am presenting my content result in a duplicate content penalty?
How can I avoid having my site's content treated as duplicate content?


@Sarah324

Google's Duplicate Content webmaster guide defines duplicate content (for purposes of search engine optimization) as "substantive blocks of content within or across domains that either completely match other content or are appreciably similar".

Google's guide goes on to list the following as examples of duplicate content:



Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices
Store items shown or linked via multiple distinct URLs
Printer-only versions of web pages
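To make "appreciably similar" more concrete, here is a minimal sketch that compares two texts by their overlapping three-word sequences (Jaccard similarity over word shingles). This is a generic near-duplicate measure chosen for illustration, not Google's actual method, and the function names and shingle size are assumptions.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = text.lower().split()
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a, b, size=3):
    """Fraction of distinct shingles shared by both texts, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


original = "the quick brown fox jumps over the lazy dog"
spun = "the quick brown fox leaps over the lazy dog"
print(jaccard_similarity(original, original))  # identical text
print(jaccard_similarity(original, spun))      # one word swapped
```

A score near 1.0 suggests the two pages are close copies. Where to draw the line between "similar" and "duplicate" is a judgment call that this sketch does not settle.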



Penalties

Search engines need to penalize some instances of duplicate content that are designed to spam their search index, such as:


Scraper sites which copy content wholesale.
Simplistic article spinning techniques which generate "new" content by selectively replacing words in existing content.


When search engines find duplicate content they may:


Penalize an entire site that contains duplicate content.
Pick one page as the canonical source of the content and either rank the other pages containing the duplication lower or leave them out of the index.
Take no punitive action and index multiple copies of the content.


Avoiding internal duplication

When asked about duplicate content, Google's Matt Cutts said that it should only hurt you if it looks spammy. However, many webmasters employ the following techniques to avoid unnecessary content duplication:


Ensure that content is only accessible under one canonical URL.
If your site must return the same content under multiple URLs (e.g. for a "print view" page), specify a canonical URL manually with a link element in the document header.
In cases where your site returns similar content based upon parameters encoded in the URL (e.g. sorting a product catalog), exclude the URL parameters in Google Webmaster Tools.
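The canonical URL techniques above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parameter names in PRESENTATION_PARAMS and the example.com URLs are hypothetical, and your own site will have a different set of parameters that do not change the content.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that change only presentation, not content.
PRESENTATION_PARAMS = {"sort", "order", "view", "print"}


def canonical_url(url):
    """Drop presentation-only query parameters and the fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in PRESENTATION_PARAMS
    ]
    # Host names are case-insensitive, so normalize them to lower case.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url):
    """Build the link element to place in the document header."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


print(canonical_link_tag("https://example.com/article?print=1"))
print(canonical_url("https://Example.com/catalog?category=shoes&sort=price#top"))
```

Every variant of a page (print view, sorted listing) would emit the same link element, which tells search engines which URL to treat as the primary copy.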


Content Syndication

Publishing content on your site that has been published elsewhere is called content syndication. Creating duplicate content through content syndication can be OK as long as:


You have permission to do so.
You tell your users what the content is and where it came from.
You link to the original source (a direct deep link from the page with the copy to the original content, not just a link to the home page of the site where the original can be found).
Your users find it useful.
You have something to add to that content, such as commentary or critique, so that users would rather find it on your site than elsewhere.
You have enough original content on your site as well (at least 50% original, but ideally 80% original).


While Google doesn't penalize for every instance of duplicated content, even non-penalized duplicate content may not help you get visitors:


You are competing with all the other copies that are out there.
Google will likely prefer the original source of the content or the most reputable copy of it.


Google will penalize duplicate content published on your website from other sources if:


It appears to be scraped or stolen (especially without attribution).
Users don't react well to it (especially by clicking back to Google after visiting your site).
There are so many copies of it out there that there is no reason to send users to your copy of it.
Your copy isn't the original, the most reputable, or the most usable, and doesn't add any commentary or critique.
Your site doesn't have enough original content to balance all the republished content.
You duplicate pages so often within your own site that Googlebot has trouble crawling the full site.


Internationalization and Geo Targeting

Content localization is one area in which duplicating content can be beneficial for SEO. It is perfectly fine to publish the same content on sites targeted at different countries that speak the same language. For example, you may have a US site, a UK site, and an Australian site, all with the same content.

With a site for each country, it is usually possible to rank better for users in that country. In addition, it is possible to specifically cater to users in each country with minor spelling differences, pricing in the currency of the country, or product shipping options. For more information on setting up geo-targeted websites see How should I structure my URLs for both SEO and localization?
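One common way to tell search engines that same-language country sites are regional variants of each other is the rel="alternate" hreflang link annotation. The answer above does not mention it, so treat the following as a supplementary sketch: the domains and the REGIONAL_SITES mapping are hypothetical.

```python
from html import escape

# Hypothetical country sites that share the same English content.
REGIONAL_SITES = {
    "en-us": "https://example.com",
    "en-gb": "https://example.co.uk",
    "en-au": "https://example.com.au",
}


def hreflang_tags(path, sites=REGIONAL_SITES):
    """Return one alternate link element per regional copy of a page."""
    return [
        f'<link rel="alternate" hreflang="{code}" '
        f'href="{escape(base + path, quote=True)}">'
        for code, base in sites.items()
    ]


for tag in hreflang_tags("/pricing"):
    print(tag)
```

Each regional copy of the page would carry the full set of tags in its document header, including the one that points to itself.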

Dealing with Content Scrapers

Other sites that steal your content and republish it without permission can occasionally cause duplicate content problems for your site. Search engines work hard to make it difficult for scraper sites to benefit from duplicating your content. If a scraper site is causing problems for you, it may be possible to get the site removed from the Google index by filing a DMCA request with Google.
