Si4351233

Yandex frequently replaces page names with ampersands

@Si4351233

Posted in: #CrawlErrors #WebCrawlers #Yandex

The Yandex spider is a frequent visitor to one of the sites I manage. On occasion it replaces the page name with two ampersands and a space. So if the page is:

/mypage.aspx?param=value


then it will try to crawl it as:

/&& ?param=value


Any idea why it is doing this?

EDIT:
If I remember correctly, the IP address these requests come from is based in California, not Russia. I believe Yandex crawls US sites from a US-based IP address. Not sure if that helps.

More information about the request:

IP: 199.21.99.82
City: Palo Alto
State: California
Country: United States
ISP: Yandex Inc.
User-Agent: Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)


2 Comments


@Megan663

There can be only two explanations for this behavior:


The crawler found a link to that malformed URL, either on your site or on some other site.
The Yandex crawler has a bug.


If you aren't seeing that URL crawled by other bots or visited by real users, then I suspect it is a bug with the Yandex crawler. As for why Yandex would have that particular bug, I can't say. There is no valid reason for any user agent to make that type of substitution in a URL.
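One way to check which of the two explanations applies is to scan the access log for the malformed path and see which user agents and referers it arrives with. The following is only a minimal sketch: it assumes an Apache/Nginx style "combined" log format (an IIS site serving .aspx pages would log in a different format and need a different pattern), and the function name `malformed_hits` and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache/Nginx "combined" log format. Adjust for IIS W3C logs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>.*?) HTTP/[\d.]+" '
    r'\d{3} \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# The malformed path from the question: "/&& " in place of the page name.
# The space may be logged literally or percent-encoded as %20.
MALFORMED = re.compile(r'^/&&(?: |%20)')


def malformed_hits(lines):
    """Count malformed requests per (user agent, referer) pair."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and MALFORMED.match(m.group("path")):
            hits[(m.group("agent"), m.group("referer"))] += 1
    return hits


# Made-up sample lines; in practice, pass an open log file instead.
sample = [
    '199.21.99.82 - - [01/Jan/2013:00:00:00 +0000] '
    '"GET /&& ?param=value HTTP/1.1" 404 0 "-" '
    '"Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"',
    '203.0.113.5 - - [01/Jan/2013:00:00:01 +0000] '
    '"GET /mypage.aspx?param=value HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

for (agent, referer), count in malformed_hits(sample).items():
    print(count, agent, referer)
```

If every hit carries the YandexBot user agent and no referer, that points toward a crawler bug. If other bots or browsers request the same path, or a referer shows up, the malformed link probably exists on a page somewhere.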


@Yeniel560

It's probably mistranslating your URL. Maybe it would not happen if your URL were simpler, such as mypage.com/page-two
