Block Yandex crawler
Our site has been behaving very strangely for the last few days: lots of timeouts and similar errors. I finally think I found the cause: the Yandex bot is crawling around 10,000 pages an hour! I need to stop it ASAP; I think it's creating around 50-100 GB of bandwidth usage per day.
Blocked IPs (via myip.ms/info/bots/Google_Bing_Yahoo_Facebook_etc_Bot_IP_Addresses.html):
100.43.90.0/24, 37.9.115.0/24, 37.140.165.0/24, 77.88.22.0/25, 77.88.29.0/24, 77.88.31.0/24, 77.88.59.0/24, 84.201.146.0/24, 84.201.148.0/24, 84.201.149.0/24, 87.250.243.0/24, 87.250.253.0/24, 93.158.147.0/24, 93.158.148.0/24, 93.158.151.0/24, 93.158.153.0/32, 95.108.128.0/24, 95.108.138.0/24, 95.108.150.0/23, 95.108.158.0/24, 95.108.156.0/24, 95.108.188.128/25, 95.108.234.0/24, 95.108.248.0/24, 100.43.80.0/24, 130.193.62.0/24, 141.8.153.0/24, 178.154.165.0/24, 178.154.166.128/25, 178.154.173.29, 178.154.200.158, 178.154.202.0/24, 178.154.205.0/24, 178.154.239.0/24, 178.154.243.0/24, 37.9.84.253, 199.21.99.99, 178.154.162.29, 178.154.203.251, 178.154.211.250, 95.108.246.252, 5.45.254.0/24, 5.255.253.0/24, 37.140.141.0/24, 37.140.188.0/24, 100.43.81.0/24, 100.43.85.0/24, 100.43.91.0/24, 199.21.99.0/24
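(If you end up enforcing that list yourself rather than through Cloudflare, here is a minimal Python sketch using only the standard-library ipaddress module; the ranges shown are an illustrative excerpt of the list above, not the full set.)

import ipaddress

# Illustrative excerpt of the blocked ranges above (not the full list).
BLOCKED_RANGES = [ipaddress.ip_network(cidr) for cidr in (
    "100.43.90.0/24",
    "77.88.22.0/25",
    "95.108.150.0/23",
    "178.154.173.29/32",  # bare IPs from the list become /32 networks
)]

def is_blocked(ip: str) -> bool:
    # True if the client IP falls inside any blocked network.
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_RANGES)

print(is_blocked("100.43.90.15"))  # True
print(is_blocked("203.0.113.7"))   # False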
My robots.txt:
User-agent: Yandex
Disallow: /
User-agent: *
Disallow: ... etc
But it's apparently still crawling, as reported by Cloudflare.
What else can I do to stop it?
Right from the Yandex website:
The User-Agent string Mozilla/5.0 (compatible; Yandex...) identifies Yandex robots. Robots can send GET (for example, YandexBot/3.0) and HEAD (YandexWebmaster/2.0) requests to a server. A reverse DNS lookup can be used to check the authenticity of Yandex robots. More information can be found in the "How to check that a robot belongs to Yandex" section of the Webmaster help.
If you have any questions about our robots, please contact our support service: support@search.yandex.com. If you are experiencing technical issues with our robots, we recommend attaching your server log.
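That reverse DNS check is easy to automate. Here is a minimal Python sketch (standard library only; the Yandex domain suffixes below follow their Webmaster help and are worth re-verifying against the current docs) that confirms an IP really belongs to Yandex:

import socket

YANDEX_SUFFIXES = (".yandex.ru", ".yandex.net", ".yandex.com")

def is_yandex_bot(ip: str) -> bool:
    # Step 1: reverse (PTR) lookup of the client IP.
    try:
        host, _, _ = socket.gethostbyaddr(ip)
    except socket.herror:
        return False
    # Step 2: the hostname must sit in a Yandex domain.
    if not host.endswith(YANDEX_SUFFIXES):
        return False
    # Step 3: forward-resolve the hostname; it must map back to the same IP.
    try:
        _, _, addrs = socket.gethostbyname_ex(host)
    except socket.gaierror:
        return False
    return ip in addrs

print(is_yandex_bot("5.255.253.100"))  # True only for a genuine Yandex crawler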
You can email their team and request that they not crawl your server, or block the correct user-agent (a hard block is sketched after the examples below). If your server is overloaded and cannot keep up with the robot's download requests, you should use the Crawl-delay directive. It lets you specify the minimum amount of time (in seconds) between the search robot downloading one page and starting the next.
Examples:
User-agent: Yandex
Crawl-delay: 2 # specifies a 2 second timeout
and
User-agent: *
Disallow: /search
Crawl-delay: 4.5 # specifies a 4.5 second timeout
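If the robot keeps hitting you regardless of robots.txt, a hard block on the User-Agent header is the fallback. As a rough sketch (shown here as Python WSGI middleware; a firewall rule in Cloudflare or your web server achieves the same thing closer to the edge):

def block_yandex(app):
    # Wrap any WSGI app; answer 403 to anything identifying as Yandex.
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if "Yandex" in user_agent:
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return app(environ, start_response)
    return middleware

Note that a user-agent block only stops clients that identify themselves honestly; the IP ranges and reverse DNS check above cover the rest.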