Why is Google crawling non-existent URLs?
I can see in the live traffic of my WordPress website that Googlebot is crawling non-existent pages, such as:
example.gr/search/search-results/password-reset%252Fpassword-reset/password-reset%252Fpassword-reset%252F&listview=2/?pg=6&dtype=prosfata&listview=2
example.gr/search/search-results/password-reset%252F&listview=1/password-reset/search/advanced-search/tag/katigoria/gaming/?pg=15&order=lcomdate&dtype=prosfata&listview=1
I can’t find out where Googlebot discovered these links, but there are thousands of them, and they are almost the only URLs that Google crawls.
I have added noindex, nofollow to these URLs, but the bot still crawls them. How can I stop this? Why does Google crawl only these URLs? I think this may be what is causing the high CPU usage.
One more question: I recently added caching to my website. Shouldn’t Google crawl the cached pages for better speed? When I use “Fetch as Google”, I can see that it crawls the non-cached pages.
More posts by @Carla537
1 Comment
Googlebot crawls any URL that it finds:
Links on your own and third-party websites
Text on the page that looks like a URL
JavaScript strings that look like they might be URLs
Check your own site for links to these pages. If there are none, the links are probably on some other site. The crawl errors report in Google Search Console may be able to tell you which site links to them.
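A noindex, nofollow directive does not stop crawling, because Googlebot has to fetch a page before it can see the directive. To stop the crawling itself, you can use robots.txt to disallow crawling of whole directories. Based on your examples, /search would be a great candidate for a disallow rule: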
User-agent: *
Disallow: /search
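Before deploying a rule like this, it can help to check locally which URLs it would block. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string and the two test URLs are assumptions made up for illustration (a simplified version of the URLs in the question), and Python's parser may differ from Googlebot's own matching in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block everything under /search for all bots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Simplified, illustrative URLs; substitute real ones from your access log.
search_ok = parser.can_fetch(
    "Googlebot", "http://www.example.gr/search/search-results/?pg=6&listview=2"
)
home_ok = parser.can_fetch("Googlebot", "http://www.example.gr/")

print("search URL crawlable:", search_ok)  # False: path starts with /search
print("home page crawlable:", home_ok)     # True: no rule matches /
```

Note that the rule is a prefix match, so it also blocks any other path that begins with /search.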
It is also possible that it isn't actually Googlebot doing the crawling. It may be a bot spoofing Googlebot to try to find vulnerabilities on your website. You can verify whether or not it is actually Googlebot by checking the IP address using the procedure here: How to identify if IP address is really google's IP
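The usual way to do this check is a reverse DNS lookup on the IP address followed by a forward lookup on the resulting hostname, confirming that it maps back to the same IP. Below is a minimal sketch of that idea in Python. The function name `is_real_googlebot`, the injectable lookup parameters, and the suffix list are assumptions for illustration; check the accepted domain names against Google's current documentation. The default forward lookup handles IPv4 only.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Googlebot; verify against Google's docs.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip: str) -> str:
    """Reverse DNS: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(host: str) -> List[str]:
    """Forward DNS: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_real_googlebot(
    ip: str,
    reverse_lookup: Callable[[str], str] = _reverse,
    forward_lookup: Callable[[str], List[str]] = _forward,
) -> bool:
    """Return True only if the IP passes both the reverse and forward checks."""
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # The hostname must resolve back to the IP we started with.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror and socket.gaierror (lookup failures).
        return False
```

Calling `is_real_googlebot(ip)` with an address taken from your access log performs real DNS queries. The lookup functions are parameters only so that the logic can be exercised without network access.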
If it isn't actually Googlebot, you can block the IP addresses it uses in .htaccess: How to block entire IPs of a VPN server by IP