Magento local store deactivated, still getting GET requests from web crawlers

@Mendez628

Posted in: #ApacheLogFiles #CrawlErrors #Magento

Many months ago I deactivated one store (localised to a language) within a Magento site, leaving the English store as the only live store.

Looking in my Apache access logs, and in the Logwatch summary, I still see GET requests every day from web crawlers (e.g. Yahoo) that result in 404s, because they're trying to access pages from the deactivated store.

The sitemap for the deactivated store was removed long ago. Why are these 404s still happening and how would I go about stopping them?


2 Comments


@Becky754

This is business as usual. Search engines are notoriously slow. The reason for this is simple: the Internet is flippin' HUGE!

The best thing to do with the least amount of work is to let the 404s happen. Do not stop this process. This is the primary way that search engines learn that a page is gone.

The very best thing to do is to issue a 410 error if you can. But that can take work that is not always easy to do.

The difference between a 404 (not found) and a 410 (gone) is that a 410 is treated as permanent right away, whereas a 404 is not. With a 404, Google, for example, will retry the page several times over a period of time before concluding that the page is truly gone. That is the standard path, of course, but it is not as fast as a 410.
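
As a rough illustration of the 410 approach, here is a minimal .htaccess sketch. It assumes mod_rewrite is enabled and that every URL of the deactivated store began with a store-code prefix; `/fr/` is a made-up example, not something taken from the question. If the old store was selected some other way (a separate domain or a query parameter), the pattern would need to change accordingly.

```apache
# Assumption: all URLs of the deactivated store started with /fr/ (hypothetical prefix).
# Put this in the site's root .htaccess, above Magento's own rewrite rules.
<IfModule mod_rewrite.c>
    RewriteEngine On
    # [G] answers "410 Gone" and stops further rewriting for this request.
    RewriteRule ^fr(/|$) - [G]
</IfModule>
```

Requests for the live English store do not match the pattern and continue on to Magento as before.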

You do not want to stop or block this process. Otherwise, the search engines cannot know these pages have been removed.

If you have bots accessing your site that you do not want or appreciate, you can of course block those, and you may want to, at least for a while.
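
One way to block an unwanted bot in Apache is to refuse requests by User-Agent. This is only a sketch, assuming mod_rewrite is enabled; `ExampleBot` is a placeholder name, and a bot that lies about its User-Agent will not be caught by it.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # "ExampleBot" is a placeholder; [NC] makes the match case-insensitive.
    RewriteCond %{HTTP_USER_AGENT} ExampleBot [NC]
    # [F] answers "403 Forbidden" for every request from that agent.
    RewriteRule ^ - [F]
</IfModule>
```

Keep a rule like this away from the major search engines' crawlers, since they need to see the 404 or 410 responses to drop the old pages.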


@Heady270

A web crawler goes wherever it is allowed to go. You can keep crawlers out by:

deleting the folder,
password-protecting it with .htpasswd,
placing an .htaccess file in the root of the folder with Header set X-Robots-Tag "noindex, nofollow",
checking that URLs in this folder have no backlinks from external sites.
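
The X-Robots-Tag option from the list could look roughly like this as an .htaccess file. It is a sketch that assumes mod_headers is enabled and that the folder still exists on disk. Note that this header tells search engines not to index or follow what they fetch; crawlers still have to request the URL to see it, so it will not reduce the GET requests by itself.

```apache
# .htaccess in the root of the folder that should stay out of search indexes.
<IfModule mod_headers.c>
    # Ask search engines not to index these responses or follow their links.
    Header set X-Robots-Tag "noindex, nofollow"
</IfModule>
```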
