How do I prevent access to SPAM URLs and remove them from Google's index?
When I started managing my friends' website, I found a folder in it containing a bunch of garbage HTML files with names such as free-nasscamd-server-31day, gorilla-quentin-trollip-pdf, etc. Assuming someone had hacked those files in, I deleted the folder and all the HTML files in it, and checked everywhere else to make sure nothing else was lingering.
Two months later, my access logs still show requests for garbage URLs, although they now return 404 errors since the pages no longer exist.
And when I go to Google and type site:{url}, it displays a bunch of garbage URLs as well, such as {url}/pitchet-program-samsung-wave2/, {url}/rogue-pirates-of-the-caribbean-themes-nokia-x3torrent/, etc.
How do I prevent the attempted access to those URLs?
How do I remove those garbage URLs from Google?
2 Comments
Two things: First, if these URLs return a 404 error because the resource has been removed, Google will fetch them a number of times before concluding that they are really gone. This takes time, but it is the simplest way to get them dropped from the index. Optionally, you can add rules to the .htaccess file (assuming Apache) that return a 410 (Gone) error for those URLs. This would be faster, but requires work. My advice is to just let the 404 errors occur; eventually, at least for Google, Bing, Yahoo! and the like, the URLs will disappear. However, you cannot stop requests that come from links on other sites and so forth. 404 errors are annoying and do pollute the log file and analytics, so I understand wanting to get rid of them. Short of returning a 410, the best thing to do is to allow the 404 errors. Some of these requests may never go away, and there may be nothing you can do about that.
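If you do go the 410 route, the rules can be generated rather than typed by hand. This is only a minimal sketch, assuming Apache with mod_alias enabled and .htaccess overrides allowed. The function name `gone_rules` is made up for illustration, and the example paths are the ones from the question. It prints one `Redirect gone` line per path, which could then be pasted into .htaccess.

```python
def gone_rules(paths):
    """Return one mod_alias 'Redirect gone' line (HTTP 410) per URL path."""
    rules = []
    seen = set()
    for path in paths:
        # Normalise to a single leading slash and no trailing slash.
        clean = "/" + path.strip().strip("/")
        # Skip empty entries, duplicates, and paths containing whitespace
        # (those would need quoting in .htaccess, which this sketch avoids).
        if clean == "/" or clean in seen or any(c.isspace() for c in clean):
            continue
        seen.add(clean)
        rules.append(f"Redirect gone {clean}")
    return rules


if __name__ == "__main__":
    # Example paths taken from the question; replace with your own list.
    spam_paths = [
        "/pitchet-program-samsung-wave2/",
        "/rogue-pirates-of-the-caribbean-themes-nokia-x3torrent/",
    ]
    print("\n".join(gone_rules(spam_paths)))
```

`Redirect` matches on a path prefix, so a rule without the trailing slash should cover both the bare path and the trailing-slash form. Test one URL with a browser or curl before adding the full list.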
Second, run a file-level anti-virus scan of the entire file system, including a rootkit check, to make sure nothing malicious is still present. It is also likely that a software vulnerability still exists. Check the versions of all installed software, especially PHP and PHP applications, and update anything that is out of date so that you are running safe versions. These files did not show up in a vacuum. Make sure you plug the holes or the problem will continue.
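As a complement to a proper scan (not a replacement for one), it can help to list web files that changed recently, since injected files often stand out by date. The sketch below is an assumption-laden illustration: the function name `recently_modified`, the extension list, and the `/var/www/html` path are all hypothetical, and modification times can be forged by an attacker, so an empty result proves nothing.

```python
import os
import time


def recently_modified(root, days=60, extensions=(".php", ".html", ".htm", ".js")):
    """Return paths under root with a matching extension changed in the last `days` days."""
    cutoff = time.time() - days * 86400
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only look at file types that are typically served or executed.
            if not name.lower().endswith(extensions):
                continue
            full = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(full) >= cutoff:
                    hits.append(full)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical document root; adjust for your server.
    for path in recently_modified("/var/www/html", days=60):
        print(path)
```

Anything in the output that you or the site owners did not change deserves a closer look.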
Robots.txt
If you have hundreds to thousands of URLs, the easiest method is to use robots.txt to tell Google not to crawl them. Be aware that robots.txt blocks crawling rather than indexing: once a URL is disallowed, Google can no longer see that it returns a 404, so URLs that are already listed may take longer to drop out of the results. Either way, URL removals can take weeks to months depending on volume and how often Google checks back on your site.
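For a long list, the robots.txt entries can be generated and sanity-checked before uploading. This is a rough sketch: the helper names `build_robots_txt` and `is_blocked` are made up, `example.com` stands in for the real site, and the check uses Python's standard-library parser, which may not match Google's own matching rules in every edge case.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(paths):
    """Return robots.txt text that disallows crawling of each given path."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines) + "\n"


def is_blocked(robots_txt, url, agent="Googlebot"):
    """Return True if the standard-library parser says `agent` may not fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Example paths taken from the question; example.com is a placeholder.
    text = build_robots_txt([
        "/pitchet-program-samsung-wave2/",
        "/rogue-pirates-of-the-caribbean-themes-nokia-x3torrent/",
    ])
    print(text)
    print(is_blocked(text, "https://example.com/pitchet-program-samsung-wave2/"))
```

Checking a legitimate page with `is_blocked` as well is a cheap way to confirm that the rules do not accidentally block real content.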
Webmaster Tools
If you only have a few dozen URLs, the quickest and easiest method is the URL removal request tool in Google's Webmaster Tools (now Google Search Console). Again, please note that this is only a request, and it can take a while to take effect.
Hack Prevention
You should also investigate how those files got there in the first place. Deleting the files is a great start, but understanding how they got in is the most important step in preventing the issue from arising again. Your friends' web host may be able to comment on several security issues. The most common causes are SQL exploits, out-of-date plugins and content management systems, weak shared-hosting security, weak passwords, and brute-forceable login systems (ones that don't ban after X attempts). You can also keep an eye out for potential security problems by regularly checking a web application exploit database.
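To make the "ban after X attempts" idea concrete, here is a minimal in-memory sketch. It is not how any particular CMS or plugin does it; the class name `LoginThrottle` and its defaults (5 failures in 15 minutes) are assumptions, and a real site would need persistent storage shared across processes.

```python
import time


class LoginThrottle:
    """Lock a key (e.g. an IP or username) out after too many recent failures."""

    def __init__(self, max_attempts=5, window_seconds=900):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._failures = {}  # key -> list of failure timestamps

    def _recent(self, key, now):
        # Keep only failures that fall inside the sliding time window.
        cutoff = now - self.window_seconds
        recent = [t for t in self._failures.get(key, []) if t > cutoff]
        self._failures[key] = recent
        return recent

    def is_locked(self, key, now=None):
        now = time.time() if now is None else now
        return len(self._recent(key, now)) >= self.max_attempts

    def record_failure(self, key, now=None):
        now = time.time() if now is None else now
        self._recent(key, now).append(now)

    def record_success(self, key):
        # A successful login clears the failure history for that key.
        self._failures.pop(key, None)
```

The login handler would call `is_locked` before checking the password, `record_failure` on a bad password, and `record_success` on a good one.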