Spam site re-proxying my site, how do I stop this?
So there's some guy out there duplicating my bitlucid.com website in some nonstandard way, presumably for nefarious reasons. I first found out about it when I saw a "copy" spam URL in the logs for "explorebuncombe.com", and opened that URL to find an exact copy of my site.
Here we can see some of the (many other) copy domains: www.copyscape.com/?q=http%3A%2F%2Fbitlucid.com
I'm thinking that I can just slap a canonical tag on my site and it should only be to my benefit, if anything, so I have added one to the site's source. But I want to ensure that I have a way to combat this approach for other domains that I host for myself or for others. I'm not concerned about the copying of content so much as the potential effect on my ranking in Google and the hotlinking of content. How would I go about preventing direct resource proxying, if that's what this is (e.g. via nginx or Apache)?
3 Comments
Hmm, do you see the paradox here? You are asking to shut down a site that scraped you under the DMCA, yet your own license says they are allowed to copy it. I don't think you are going to get too far in the battle to shut them down. Here are the contents of your license, which is [still] in the page footer as of Mar 13 2015:
The person who associated a work with this deed has dedicated the work
to the public domain by waiving all of his or her rights to the work
worldwide under copyright law, including all related and neighboring
rights, to the extent allowed by law.
You can copy, modify, distribute and perform the work, even for
commercial purposes, all without asking permission. See Other
Information below.
So, I opened a scraper too, it's therealroyronalds.com if you wanna check it out! Just kidding... some things you can try:
Remove that license, then submit your complaint to the Google scraper report, then disavow their domains in Google Webmaster Tools. You should also report them as a phisher, since they are posing as you. Make a landing page on your site explaining what happened, and be sure to use their domain names as keywords. This will ensure you show up for their results once they disappear. It also helps onboard folks who may have been phished by them.
If they are using an origin pull (such as file_get_contents or cURL), then change the entire site for them: serve evil.js with evil.css and nasty bestiality images when the requests are coming from their domain(s) or server IPs. Heck, you can even cloak-stuff the pages with terrible spam words so they get hammered out of the results. Get creative... maybe a rant about how much they hate Google, and make them call Cutts, Mueller, Grigorik, & Far very bad names. You can do this by analyzing the request headers and rewriting the response, and/or by using .htaccess rules in a similar way to how hotlinking prevention works.
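The "serve something else when the request comes from their server" idea can be sketched roughly like this. This is only a minimal sketch, not a drop-in solution: the network below is a documentation placeholder (203.0.113.0/24), and the names SCRAPER_NETWORKS, is_scraper and choose_body are made up for illustration. You would fill in the addresses you actually see in your logs.

```python
import ipaddress

# Placeholder block list (assumption): replace with the scraper's real server addresses.
SCRAPER_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_scraper(remote_addr: str) -> bool:
    """Return True when the client address falls inside a listed network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable IP address: treat it as an ordinary client.
        return False
    return any(addr in net for net in SCRAPER_NETWORKS)


def choose_body(remote_addr: str, real_body: bytes, decoy_body: bytes) -> bytes:
    """Serve the decoy to listed addresses and the real page to everyone else."""
    return decoy_body if is_scraper(remote_addr) else real_body
```

Keep in mind that if your site sits behind a proxy or CDN, the address your application sees may be the proxy's, so the check would have to use whatever client address your setup actually exposes.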
If they are CNAME masking your site, it's a bit different. There is no origin pull in this case, but you can still pick up the request and redirect it (preserving the URI) to your domain instead. So if they came in on exploredare.com/me, it would 301 to bitlucid.com/me. Or perhaps you could route them to a bitlucid.com/you-came-from-a-scraper kind of landing page.
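Here is a rough sketch of that redirect as WSGI middleware, assuming the request really does reach your server carrying the copy domain in its Host header. The names CANONICAL_HOST and canonical_redirect are mine, and the http:// scheme is an assumption you would switch to https:// once you have a certificate.

```python
CANONICAL_HOST = "bitlucid.com"
ALLOWED_HOSTS = {CANONICAL_HOST, "www." + CANONICAL_HOST}


def canonical_redirect(app):
    """Wrap a WSGI app so requests for any other Host get a 301 to the real domain."""

    def wrapper(environ, start_response):
        # Drop an optional ":port" suffix and normalise case before comparing.
        host = environ.get("HTTP_HOST", "").split(":")[0].lower()
        if host not in ALLOWED_HOSTS:
            path = environ.get("PATH_INFO", "/") or "/"
            query = environ.get("QUERY_STRING", "")
            location = "http://" + CANONICAL_HOST + path
            if query:
                location += "?" + query
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        return app(environ, start_response)

    return wrapper
```

Note that a request with no Host header at all is also redirected here. The same effect is usually easier to get in the web server itself with a catch-all virtual host, but the logic is the same.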
If they are sniping data from utils and things such as JSON[P], you can use CORS to limit which domains are allowed to use the assets. If that is not feasible, use the same tactic: serve bait data instead of the real data when requests come from one of these scrapers/domains. Do the same for sitemaps, feeds, RSS, or anything else. You can fill them with random [porn] links or anything... their apps will probably still pull and display the data.
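A minimal sketch of the CORS allow-list, assuming you build the response headers yourself. ALLOWED_ORIGINS and cors_headers are illustrative names, and the origins listed are just the asker's domain over both schemes.

```python
# Origins allowed to read the JSON endpoints from a browser (assumption: only your own site).
ALLOWED_ORIGINS = {"http://bitlucid.com", "https://bitlucid.com"}


def cors_headers(origin):
    """Return the CORS response headers for the given Origin request header value."""
    # Vary: Origin keeps caches from reusing a response across different origins.
    headers = [("Vary", "Origin")]
    if origin in ALLOWED_ORIGINS:
        # Echo back the single matching origin rather than using "*".
        headers.append(("Access-Control-Allow-Origin", origin))
    return headers
```

CORS is enforced by the visitor's browser, so this only stops scripts running on their pages from reading your data. It does nothing against a server-side pull, which is where the bait-data tactic comes in.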
You should use an SSL certificate and explicitly reference all resources over HTTPS. This would break most of their site and help a bit to distinguish you from the nefarious ones. Be sure to set HSTS, CSP, and other security headers so that it reiterates the fact that you are you. If you have the cash, get the higher-grade cert that comes with the big green "identity bar". It's highly doubtful that they will get a high-grade cert (or a cert at all), and if they are a scraper it means they would have to go through and change every src to http every time.
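For the headers, something along these lines could work. It is only a sketch: with_security_headers is a made-up helper name, the policy values are conservative examples rather than a recommendation for this particular site, and Content-Security-Policy is included on the assumption that it is the policy header meant above.

```python
# Example values only; tune the max-age and the policy to your own site.
SECURITY_HEADERS = [
    # HSTS: browsers only honour this when it is received over HTTPS.
    ("Strict-Transport-Security", "max-age=31536000; includeSubDomains"),
    # CSP: load resources from this origin or over HTTPS, and forbid framing by other sites.
    ("Content-Security-Policy", "default-src 'self' https:; frame-ancestors 'self'"),
    # Older header with a similar anti-framing effect.
    ("X-Frame-Options", "SAMEORIGIN"),
]


def with_security_headers(headers):
    """Return a copy of the response headers with any missing security headers added."""
    present = {name.lower() for name, _ in headers}
    extra = [(n, v) for n, v in SECURITY_HEADERS if n.lower() not in present]
    return list(headers) + extra
```

Headers the application has already set are left alone, so a page that needs a looser policy can still supply its own.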
Yes. You have a problem.
Explorebuncombe.com: This is an events site. It does not appear to have a copy of your site now, but it may have had one in the past; who knows. It is not related to the next two sites.
Exploredare.com: Is a copy of your site without a frame or 301 redirect. The IP address is 50.56.48.239 which is on a Rackspace IP address block.
Exploregastonia.com: Is a copy of your site as well. I did not check for a frame or redirect, though I rather doubt these techniques are being used. The IP address is 50.56.48.239 which is on a Rackspace IP address block.
Both sites are hosted on Secureserver.net. Secureserver.net has the IP address of 97.74.104.222 which is on a GoDaddy.com IP address block.
Secureserver.net does not have a home page and gives a 404 error no matter how you access it.
The registration for Secureserver.net is under domaincontrol.com which resolves to 127.0.0.1 and does not have www, however, it is the product of GoDaddy.
This means that Exploredare.com and Exploregastonia.com as well as Secureserver.net are registered through GoDaddy.
Adding a canonical tag is like closing the gate after the horse has left the stable. I would still recommend it, but it is too late to help with this issue.
You have a license notice at the bottom of your site. I suggest removing it immediately. You may have given people permission to copy your site. This is the license you granted: creativecommons.org/publicdomain/zero/1.0/ I would not suggest claiming that this license never existed, since it appears on the site copies.
The damage is that there is duplicate content on the web and your search prowess has been sapped.
I suggest:
1] Changing your site significantly with the canonical tag on each page. This includes content, template, etc. Make sure that your new site outperforms the old site from an SEO standpoint. We can help with some of that here.
2] Immediately filing a DMCA (Digital Millennium Copyright Act) complaint with Google. Info can be found here: support.google.com/legal/answer/1120734 If this page is not helpful, go here: www.google.com/webmasters/tools/dmca-dashboard You will need a Google Webmaster Tools account.
3] Call GoDaddy (480) 505-8877 and explain what is happening. Use this page if you need to. They should be disabling the website and e-mailing the site owner immediately upon your complaint. As well, you may have the legal right to the site owner contact information including name, address, and phone number for your legal records. You may have to talk to a supervisor. If they refuse the contact information, you may need to get a lawyer to write a letter to their legal department. I would mention your intent to seek legal counsel even if you do not in the end.
4] Call Rackspace 1 (800) 961-4454 and explain what is happening. Use this page if you need to. They should shut down these sites immediately upon your complaint. As well, you may have the legal right to the site owner contact information including name, address, and phone number for your legal records. You may have to talk to a supervisor. If they refuse the contact information, you may need to get a lawyer to write a letter to their legal department. I would mention your intent to seek legal counsel even if you do not in the end.
There seems to be a lot of this these days. I am not sure what the payoff is. What is the site to gain from copying your site? Nothing.
What you should do is update your website right away and add some sort of time stamp to it for everyone to see. For example:
Website updated on: MM-DD-YYYY
Next, make the links (especially images) more absolute. So instead of links like:
abc.html
image1.jpg
Use links like:
http://www.example.com/abc.html
http://www.example.com/image1.jpg
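If your pages are generated by a script, the conversion can be done mechanically. A small sketch using the standard library's urljoin follows. BASE_URL and absolutize are illustrative names, and www.example.com stands in for your own domain.

```python
from urllib.parse import urljoin

# Assumption: the canonical address of your own site, with a trailing slash.
BASE_URL = "http://www.example.com/"


def absolutize(link: str, base: str = BASE_URL) -> str:
    """Resolve a relative link against the site's base URL."""
    return urljoin(base, link)


print(absolutize("abc.html"))    # http://www.example.com/abc.html
print(absolutize("image1.jpg"))  # http://www.example.com/image1.jpg
```

Links that are already absolute come back unchanged, so it is safe to run over every href and src on a page.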
Once that is done, install some rate-limiting module that limits the number of requests, so that future "script kiddies" won't be able to scrape your website's content so easily.
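To give an idea of what such a module does, here is a minimal in-application sketch of a sliding-window limiter keyed by client IP. It is not the server module itself (nginx, for example, ships its own request-limiting module), and the class name, the limits, and the injectable clock are all illustrative assumptions.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.hits = defaultdict(deque)  # client IP -> timestamps of recent requests

    def allow(self, client_ip: str) -> bool:
        now = self.clock()
        recent = self.hits[client_ip]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

A request handler would call limiter.allow(client_ip) and answer with a 429 status when it returns False. The state lives in one process's memory, so this only fits a single-process setup.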
Also, add a terms and conditions section and get it indexed by Google, to clearly let people know that you made your site.
If this continues, you could try filing a complaint at dmca.com against the company that owns the IP address that is copying your content, or you can do a whois lookup on the IP and send an email to the abuse department address that appears in the whois record.