
How can I clone or mirror a site without SEO penalties for duplicate content?

@Merenda212

Posted in: #Seo #WebDevelopment

I am a web developer and I want to create clones of the sites I've developed for clients, so that I have an "original copy" on a subdomain of my own website, so that I can showcase my work to new clients.

What is the best way to keep my clients' original websites from being penalised for duplicate content?

I am planning to have a robots.txt file that disallows all robots, as well as using

<link href="http://www.client-canonical-site.com/" rel="canonical" />

in the <head> of the pages.

Is that sufficient? Should I use rel=nofollow on all the links as well?




4 Comments


 

@Angela700

robots.txt will not stop Google or other search engines from indexing the blocked pages. It only asks them not to crawl those folders.

If your main domain or a client links directly to the blocked pages, however, Google can still index them.

Be sure to add noindex and nofollow to the robots meta tag on those pages, but to be safest you should password-protect the folder they are in.
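
As a rough way to confirm that a cloned page really carries the robots meta tag described above, here is a minimal Python sketch. The sample page and the helper name robots_directives are made up for illustration. Keep in mind that a crawler can only read this tag if it is allowed to fetch the page, so a blanket robots.txt block may keep well-behaved crawlers from ever seeing it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Hypothetical helper: return the set of robots directives in a page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Illustrative page, not taken from any real site.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>Portfolio copy</title>
</head><body></body></html>"""

directives = robots_directives(SAMPLE_PAGE)
print(sorted(directives))
```

If either noindex or nofollow is missing from the printed list, the page template needs fixing before the clone goes live.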



 

@Heady270

The best solution is rel="canonical". Some robots are badly behaved and will crawl your pages anyway, then publish links to them on their own results pages, and through those links Google will find out about them. I tested this with one of my websites, and some of its links were indexed even though robots.txt contained the rule User-agent: * Disallow: /
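
To check that each cloned page points back at the client's site, something like the following Python sketch could be used. The URL is the placeholder from the question; the helper name canonical_urls and the sample markup are assumptions for illustration only.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def canonical_urls(html):
    """Hypothetical helper: return all canonical URLs declared in a page."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals


# Placeholder markup using the URL from the question.
SAMPLE_HEAD = (
    '<head><link href="http://www.client-canonical-site.com/" '
    'rel="canonical" /></head>'
)

found = canonical_urls(SAMPLE_HEAD)
print(found)
```

A page should declare exactly one canonical URL; an empty list or several different entries is worth investigating.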

Good luck



 

@Carla537

The problem with web bots is that you have to trust them to follow the rules you've set out; nothing forces them to.

If they follow any rule at all, it will be the robots.txt file, so that alone should be enough. However, the rest won't hurt.



 

@Kaufman445

Use robots.txt and put the following inside:

User-agent: *
Disallow: /


That's really all you need. Also, if the client's website is already indexed as the original, it won't be penalised for duplicates.
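
For anyone who wants to see how a compliant crawler reads those two lines, here is a small sketch using Python's standard urllib.robotparser. The hostname portfolio.example.com and the bot names are placeholders. It only shows that crawling is refused; as other answers here point out, it says nothing about whether a URL can still end up indexed via external links.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above.
RULES = """User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(user_agent, url):
    """Return True if a rule-abiding bot may fetch the URL."""
    return parser.can_fetch(user_agent, url)


# Placeholder hostname and paths, for illustration only.
print(is_allowed("Googlebot", "http://portfolio.example.com/"))
print(is_allowed("AnyOtherBot", "http://portfolio.example.com/client-a/index.html"))
```

Both calls should report that fetching is not allowed, since the wildcard rule covers every user agent and every path.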


