
URL/Crawl Errors continue to increase after pharma hack

@Bryan171

Posted in: #CrawlErrors #GoogleSearchConsole #Htaccess #Spam

I'm working with a site that was hacked a couple of months ago and was sending out a ton of spam. The problem seems to have started with a dormant WordPress blog that was part of the site (the site itself is not WordPress, only the blog was). Since the blog was no longer being used, I removed it and its database, and I also found infected files in the images folder and in a JavaScript file on the main site. That dramatically decreased the 404 errors. Then they started to increase again. I tried using my .htaccess file to block referrers that were supposedly linking to this site from theirs with the Viagra links. Again, crawl errors decreased dramatically and have now started to climb again. I can't figure out what is happening here. I've removed any weird code (I looked mainly for base64_decode) and blocked referrals from spam sites, so shouldn't the crawl errors be going down and staying down, at least those from the spam sites?


@Heady270

Blocking by referrer won't help: Googlebot doesn't send a referrer header when it crawls, so it will never see your block.
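To make that concrete, here is a minimal sketch of what a referrer-based block effectively checks. It is only a model of the idea, not your actual .htaccess rules: the function name, the blocklist, and the `spam-pharma.example` domain are all made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; this domain is a placeholder, not one from the question.
BLOCKED_REFERRER_DOMAINS = {"spam-pharma.example"}


def is_blocked_by_referrer(headers, blocked_domains=BLOCKED_REFERRER_DOMAINS):
    """Return True if the request's Referer host is on the blocklist."""
    referrer = headers.get("Referer", "")
    if not referrer:
        # No Referer header at all: a referrer rule has nothing to match on.
        return False
    host = urlparse(referrer).hostname or ""
    # Match the listed domain itself or any of its subdomains.
    return any(host == d or host.endswith("." + d) for d in blocked_domains)


# A visitor who clicked a link on the spam site carries its URL as the referrer.
visitor = {
    "Referer": "http://spam-pharma.example/cheap-pills.html",
    "User-Agent": "Mozilla/5.0",
}

# A crawler request in this model carries no Referer header.
crawler = {
    "User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

print(is_blocked_by_referrer(visitor))
print(is_blocked_by_referrer(crawler))
```

The visitor is rejected, but the crawler request passes straight through and still hits whatever URL it was asked to fetch, so the 404 is still recorded.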

Here is what Google's John Mueller (who works on Webmaster Tools and Sitemaps) has to say about 404 errors that appear in Webmaster Tools:


HELP! MY SITE HAS 939 CRAWL ERRORS!!1

I see this kind of question several times a week; you’re not alone - many websites have crawl errors.


404 errors on invalid URLs do not harm your site’s indexing or ranking in any way. It doesn’t matter if there are 100 or 10 million, they won’t harm your site’s ranking. googlewebmastercentral.blogspot.ch/2011/05/do-404s-hurt-my-site.html
In some cases, crawl errors may come from a legitimate structural issue within your website or CMS. How you tell? Double-check the origin of the crawl error. If there's a broken link on your site, in your page's static HTML, then that's always worth fixing. (thanks +Martino Mosna)
What about the funky URLs that are “clearly broken?” When our algorithms like your site, they may try to find more great content on it, for example by trying to discover new URLs in JavaScript. If we try those “URLs” and find a 404, that’s great and expected. We just don’t want to miss anything important (insert overly-attached Googlebot meme here). support.google.com/webmasters/bin/answer.py?answer=1154698
You don’t need to fix crawl errors in Webmaster Tools. The “mark as fixed” feature is only to help you, if you want to keep track of your progress there; it does not change anything in our web-search pipeline, so feel free to ignore it if you don’t need it. support.google.com/webmasters/bin/answer.py?answer=2467403
We list crawl errors in Webmaster Tools by priority, which is based on several factors. If the first page of crawl errors is clearly irrelevant, you probably won’t find important crawl errors on further pages. googlewebmastercentral.blogspot.ch/2012/03/crawl-errors-next-generation.html
There’s no need to “fix” crawl errors on your website. Finding 404’s is normal and expected of a healthy, well-configured website. If you have an equivalent new URL, then redirecting to it is a good practice. Otherwise, you should not create fake content, you should not redirect to your homepage, you shouldn’t robots.txt disallow those URLs -- all of these things make it harder for us to recognize your site’s structure and process it properly. We call these “soft 404” errors. support.google.com/webmasters/bin/answer.py?answer=181708
Obviously - if these crawl errors are showing up for URLs that you care about, perhaps URLs in your Sitemap file, then that’s something you should take action on immediately. If Googlebot can’t crawl your important URLs, then they may get dropped from our search results, and users might not be able to access them either.



The 404 errors that Google reports are for your benefit. If they are not actually problems that need to be corrected, you don't need to do anything about them.
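If you want a quick way to separate the errors that matter from the hack leftovers, one option is to compare the reported URLs against the URLs in your Sitemap. The sketch below assumes you already have both as plain lists of strings; the function names, the `example.com` URLs, and the normalization rules (scheme ignored, host lowercased, trailing slash dropped) are my own assumptions, so adjust them to how your site treats URLs.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to (host, path, query) so trivial variants compare equal."""
    parts = urlsplit(url.strip())
    # Assumption: a trailing slash does not change which page is meant.
    path = parts.path.rstrip("/") or "/"
    return (parts.netloc.lower(), path, parts.query)


def triage_crawl_errors(error_urls, sitemap_urls):
    """Split crawl-error URLs into those listed in the Sitemap and the rest."""
    wanted = {normalize(u) for u in sitemap_urls}
    needs_action, ignorable = [], []
    for url in error_urls:
        if normalize(url) in wanted:
            needs_action.append(url)
        else:
            ignorable.append(url)
    return needs_action, ignorable


# Placeholder data, not taken from the site in the question.
sitemap_urls = [
    "http://example.com/",
    "http://example.com/about/",
    "http://example.com/products",
]
error_urls = [
    "http://example.com/blog/?p=123&buy-cheap-pills",
    "http://Example.com/about",
    "http://example.com/images/x.php?q=viagra",
]

needs_action, ignorable = triage_crawl_errors(error_urls, sitemap_urls)
print("Fix these:", needs_action)
print("Safe to leave as 404:", ignorable)
```

Anything in the first list is a URL you care about that Googlebot cannot reach; anything in the second list can stay a plain 404.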
