
Dealing with bots that still fetch 301s months later

@Hamaas447

Posted in: #301Redirect #Redirects

Over the course of 2013 and up until September 2014, we migrated roughly a dozen domains to a new platform. In most cases the domain names remained the same; we simply pointed each domain to a new IP address where the new app servers were located. Since the underlying architecture was significantly different, we put in place upwards of 500 redirects (all 301s) so that users and bots would be directed properly to the new pages, and over time all the old links have been replaced in Google, Bing, etc. with the new links.

I recently analyzed our web server logs and found that in the past 30 days these redirects have been hit over 211,000 times, with over 208,000 of those hits coming from clients whose User-Agent identifies them as a bot. One particular bot seems to only ever hit URLs that result in a 301 response and never proceeds further. I can't find a single entry in our logs for this particular UA that results in a 2xx, 4xx, or 5xx response.
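For anyone wanting to repeat this kind of check, here is a minimal sketch of how the "only ever gets a 301" test could be run against a log. It assumes the Apache/nginx "combined" log format, which the question does not confirm, and the function names `statuses_by_agent` and `redirect_only_agents` are made up for illustration.

```python
import re
from collections import Counter

# Assumption: logs use the "combined" format. Adjust the pattern if yours differs.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def statuses_by_agent(lines):
    """Return {user_agent: Counter({status_code: hits})} for the parsed lines."""
    result = {}
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        result.setdefault(match["agent"], Counter())[match["status"]] += 1
    return result


def redirect_only_agents(lines):
    """List the User-Agents whose every logged response was a 301."""
    return sorted(
        agent
        for agent, counts in statuses_by_agent(lines).items()
        if set(counts) == {"301"}
    )
```

Passing an open log file (for example `redirect_only_agents(open("access.log"))`, where the file name is a placeholder) would list the agents that never received anything but a redirect.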

Given that we've had these redirects in place for anywhere from 9 months to 2 years, and that the vast majority of the traffic hitting them comes from bots (in many cases they are hit exclusively by bots), does it make sense to change these from, say, a 301 to a 410 (Gone) to inform these bots that the URLs are gone for good? I'd like to eventually do away with all these redirects if I can, since they just add to the complexity of our configurations.


1 Comment


@Nimeshi995

If you break these 301s, you will lose all the value of any inbound link that relies on them. You may not care, of course. But then again, you might.

Many bots work from databases that are shared, sold, and passed around. Many others are following existing links to your site. Also consider that there are a ton of scrapers from domain monetizers that will continue no matter what you do. These are the scum of the earth, along with hackers. The good news is that domain monetizers generally come from the same IP address or IP address block. Since these are not users, you can block whole ranges of IP addresses to keep the junk out of your log files without affecting real users, because real users come from ISP subscriber blocks and not from web hosts.
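As a rough illustration of block-level filtering, the sketch below checks whether a client address falls inside a blocked range using Python's standard `ipaddress` module. The ranges listed are reserved documentation networks used as placeholders, not real offenders, and `is_blocked` is a hypothetical helper name. In practice the blocking itself would normally be done in the firewall or web server configuration.

```python
import ipaddress

# Placeholder blocklist: reserved documentation ranges, not real scrapers.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(address, networks=BLOCKED_NETWORKS):
    """Return True if the address lies inside any blocked network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in networks)
```

Running the addresses from your log through a check like this before blocking anything shows how many hits a given range actually accounts for.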

Generally, what you are seeing is rather normal. Even if you present a 410, this will continue for a long time, if not forever. That is just the way it is. I still get hits for pages that have been gone for over a decade! It will mostly fade away over time, however.

You will want to satisfy worthwhile bots of course, but perhaps break anything that no longer has value. Search engines will take note of any 404 or 410, but may still follow a new link if it appears on the net. This means that while they will recognize a page is gone, they may try again from time to time.
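One way to separate the redirects worth keeping from the ones that could be retired is sketched below. It is only a heuristic under stated assumptions: the `WORTHWHILE_BOTS` and `BOT_MARKERS` lists are illustrative guesses, User-Agent strings can be spoofed, and `classify` and `retirement_candidates` are made-up names.

```python
from collections import defaultdict

# Assumptions: which crawlers count as worthwhile, and what marks a bot.
WORTHWHILE_BOTS = ("googlebot", "bingbot")
BOT_MARKERS = ("bot", "crawler", "spider")


def classify(agent):
    """Crudely label a User-Agent string by substring matching."""
    lowered = agent.lower()
    if any(name in lowered for name in WORTHWHILE_BOTS):
        return "worthwhile_bot"
    if any(marker in lowered for marker in BOT_MARKERS):
        return "other_bot"
    return "human"


def retirement_candidates(hits):
    """Return redirected paths that were hit only by non-worthwhile bots.

    hits: iterable of (path, user_agent) pairs for requests answered with a 301.
    """
    kinds = defaultdict(set)
    for path, agent in hits:
        kinds[path].add(classify(agent))
    return sorted(path for path, seen in kinds.items() if seen == {"other_bot"})
```

Paths returned by `retirement_candidates` are only candidates for a 404 or 410. Anything still reached by people or by search engine crawlers stays on its 301.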

Lastly, without seeing your log file entries, it is impossible to come up with a complete strategy for you. If you have a question regarding any access found in your log file, we will be able to assess its worth (perhaps) and help you either to block it or to determine whether you want to allow access, and potentially whether a 301, 404, or 410 is appropriate. I do that a lot here, so please feel free to post a question.
