
Server overhead caused by bots?

@Carla537

Posted in: #Bingbot #Googlebot #Plesk #Vps #WebCrawlers

I have one customer website causing overhead (http://www.modacalcio.it/en/by-kind/football-boots.html).

With htop open, I navigated the website, and most of the load comes from the AJAX links placed on the left side of the site.

The website is hosted on a VPS with 3 processors and 2GB of RAM, and with plenty of disk space.

The real problem is that this website is new and not visited much.

From the http-status module I can see that the overhead is caused by bots (Google bots, Bing bots, hrefs checkers and so on).

So I suspect those spiders are trying to crawl all those links at once. Could this be causing the overhead?

I have also put rel="nofollow" on those links, but this doesn't keep the bots away.

Is there any way, through code or Plesk, to hide those links from the bots?


@Ann8826881

The overhead is likely being caused by the data and how it's being served:

Running a diagnostic on the page indicated that each page load triggers 150 separate requests, with over 2.2MB in page size, and takes up to 9 seconds to load. Looking at your response headers, it appears you have no-cache specified in both Cache-Control and Pragma.
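To make the header check concrete, here is a minimal Python sketch that decides whether a set of response headers rules out reusing a stored response. The function name `caching_disabled` and both sample header sets are made up for illustration; they were not captured from your site.

```python
def caching_disabled(headers):
    """Return True if the headers forbid reusing a stored response as-is."""
    # Header names are case-insensitive, so normalise names and values first.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    directives = {d.strip() for d in lowered.get("cache-control", "").split(",")}
    pragma = {d.strip() for d in lowered.get("pragma", "").split(",")}
    return bool({"no-cache", "no-store"} & directives) or "no-cache" in pragma


# Hypothetical header sets, for illustration only.
uncached = {"Cache-Control": "no-store, no-cache, must-revalidate", "Pragma": "no-cache"}
cached = {"Cache-Control": "public, max-age=86400"}

print(caching_disabled(uncached))  # True
print(caching_disabled(cached))    # False
```

With headers like the first set, every visit (human or bot) has to be served in full by the server again, which is consistent with the load you are seeing.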

You might want to enable HTTP caching and gzip compression on your Nginx server (the headers identify Nginx as your server). For more, see "Setting up HTTP cache and gzip with nginx" and the Google article "How gzip compression works".
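To get a feel for why gzip helps with markup-heavy pages, here is a small Python sketch using the standard `gzip` module. The HTML is invented (a repeated product list item), so the ratio it prints is far better than a real page would achieve; treat it as an illustration of the effect, not a measurement of your site.

```python
import gzip

# Hypothetical markup: a product grid repeats the same tags many times,
# and repeated text is exactly what gzip compresses well.
item = '<li class="product"><a href="/en/item.html"><img src="/img/item.jpg" alt="item"></a></li>\n'
html = ("<ul>\n" + item * 500 + "</ul>\n").encode("utf-8")

compressed = gzip.compress(html, compresslevel=6)

print(len(html), "bytes before,", len(compressed), "bytes after gzip")

# Round-trip check: decompressing must give back the original bytes.
assert gzip.decompress(compressed) == html
```

Nginx does the same thing on the fly once compression is enabled for text responses, so the saving applies to HTML, CSS and JavaScript rather than to images that are already compressed.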

You may also want to check the configuration of Nginx's HTTP core module to make sure that keepalive_disable is set to none and that keepalive_requests is set to at least the default number (100 in older releases; 1000 since nginx 1.19.10). For Apache in Plesk, this thread will help with that.

Monitoring your system's resources after these modifications should indicate whether your VPS configuration is sufficient or should be upgraded.

If you still want to prevent robots from crawling the links, specify them as disallowed within your robots.txt file, as covered here.
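As a rough sketch of what such a rule looks like, the Python snippet below embeds a two-line robots.txt and checks it with the standard `urllib.robotparser` module. The path `/en/ajax-filter/` is only a stand-in, since the real path of the AJAX links is not given in the question; substitute whatever prefix those links actually share. Nothing here touches the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: "/en/ajax-filter/" is a placeholder for the real
# prefix of the AJAX links, which the question does not state.
ROBOTS_TXT = """\
User-agent: *
Disallow: /en/ajax-filter/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "http://www.modacalcio.it"
for agent in ("Googlebot", "bingbot"):
    # The placeholder filter URL should be refused, the category page allowed.
    blocked = not parser.can_fetch(agent, base + "/en/ajax-filter/?size=42")
    allowed = parser.can_fetch(agent, base + "/en/by-kind/football-boots.html")
    print(agent, "filter blocked:", blocked, "| category page allowed:", allowed)
```

Keep in mind that robots.txt is advisory: well-behaved crawlers honour it, but it is not an access control. Python's parser also only does plain prefix matching, so wildcard patterns that some crawlers accept cannot be verified this way.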
