Bots and image caching

@Harper822

Posted in: #Cache #Images #WebCrawlers

Once again, I'm trying to optimize my site because one test showed the time to first byte is slightly over 200ms, but I need it under 200ms to make Google happy.

I was looking at my site's code. For the longest time I've had image generation with name stamping done on the fly, and for regular website users the image is generated on the fly once and then cached for subsequent requests.
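
Roughly, the flow looks like the sketch below (simplified; the source path, cache path, and stamp text here are placeholders, not my real ones):

<?php
// Simplified sketch of the current flow: generate the stamped image once,
// then serve the cached copy on later requests. The paths and the stamp
// text are placeholders, not the real ones.

$source    = 'images/original/' . basename($_GET['img']);    // requested image
$cachePath = '/mnt/ramdisk/cache/' . md5($source) . '.jpg';  // cached stamped copy

if (!is_file($cachePath)) {
    $im = imagecreatefromjpeg($source);

    // Stamp the name in the bottom-left corner using GD's built-in font 5.
    $white = imagecolorallocate($im, 255, 255, 255);
    imagestring($im, 5, 10, imagesy($im) - 20, 'example.com', $white);

    imagejpeg($im, $cachePath, 85);
    imagedestroy($im);
}

header('Content-Type: image/jpeg');
readfile($cachePath);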

Now I'm thinking maybe I should just cache every image that's generated, regardless of what requests it, but I'm not 100% sure this is a good strategy. I only have so much space to store the cached site files on a RAM drive (about 1 GB), and if I cache for a robot and the robot requests billions of images at once (yes, I've seen a few 15 Mbit/s spikes in my bandwidth log), then I'm afraid I'll run out of cache space quickly and the site will grind to a halt until I manually reset the cache.
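
To keep the cache from filling up, the kind of pruning I have in mind would be something like this (a rough sketch; the cache path and the 1 GB figure just reflect my setup, and it prunes by modification time rather than a true LRU):

<?php
// Rough sketch of keeping the RAM-drive cache under a size cap by deleting
// the oldest cached files first. The cache path and the 1 GB budget are my
// setup; pruning by modification time is an approximation, not a true LRU.

function pruneCache(string $dir, int $maxBytes): void
{
    $total = 0;
    $byAge = [];

    foreach (glob($dir . '/*.jpg') as $file) {
        $total        += filesize($file);
        $byAge[$file]  = filemtime($file);
    }

    if ($total <= $maxBytes) {
        return; // still within budget, nothing to do
    }

    asort($byAge); // oldest first
    foreach (array_keys($byAge) as $file) {
        $total -= filesize($file);
        unlink($file);
        if ($total <= $maxBytes) {
            break;
        }
    }
}

pruneCache('/mnt/ramdisk/cache', 1024 * 1024 * 1024); // ~1 GB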

I could go with another option and reduce the image processing, but if I do that, legit users who see the image on the wrong site will wonder why it isn't at its best quality.

If anyone has an idea of how I can manage my image cache better so that it doesn't fill up when bots hit the site like crazy, I'm willing to look into it. Heck, I'll even accept code for a less processor-intensive way to stamp images.
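
For example, one direction I've been toying with is splitting bot traffic off by User-Agent and giving bots a cheaper path that never touches the cache, roughly like this (just a sketch; the pattern list is illustrative, not exhaustive, and User-Agent strings can be spoofed):

<?php
// Sketch of splitting bot traffic off by User-Agent so only images that real
// visitors view end up in the cache. The pattern list is illustrative only.

function isKnownBot(): bool
{
    $ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
    return (bool) preg_match('/googlebot|bingbot|yandex|baiduspider|duckduckbot/i', $ua);
}

if (isKnownBot()) {
    // Serve the image straight from disk: no GD work, no cache entry.
    header('Content-Type: image/jpeg');
    readfile('images/original/' . basename($_GET['img']));
    exit;
}

// Regular visitors fall through to the stamp-and-cache path above.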

I just need some kind of starting point to solve this. I'd also appreciate answers about the limits robots operate within (such as how many requests bots make in a given time frame, and when), because if I can sync my cache with their behavior, I can reset half my cache and have things running quickly again.

Any ideas?


1 Comment

@Eichhorn148

"how many requests bots make in a given time frame"


This varies for each site and each file type, based on many factors. Google calculates the average server response time for each file type and adjusts how heavily it crawls each type depending on that response time.

Suppose your HTML files fly out of the server, but your images have a longer response time because they first have to be generated by PHP. Based on this, Google will crawl more HTML files (if it finds them), but fewer images.

Whether your images are cached after they are generated is, in my opinion, not important for the crawling intensity and quality. Search Console, where you can see the crawling intensity and the amount of data downloaded, doesn't reflect your caching either. So the primary job you should tackle here is to optimize the image generation time, which depends mostly on the code (PHP, framework). On-the-fly serving will always add delay. You will get a substantial speed win if you find another way to create and store your images.
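
For example, something roughly like the sketch below (the paths and stamp text are placeholders) would stamp everything once in a batch, so the web server can afterwards serve plain static files instead of running PHP per request:

<?php
// Rough illustration of the "create and store once" idea: a one-off batch run
// that stamps every original and writes the result to a static directory the
// web server can serve directly, so requests never wait on PHP.

foreach (glob('images/original/*.jpg') as $source) {
    $target = 'images/stamped/' . basename($source);
    if (is_file($target)) {
        continue; // already generated on an earlier run
    }

    $im    = imagecreatefromjpeg($source);
    $white = imagecolorallocate($im, 255, 255, 255);
    imagestring($im, 5, 10, imagesy($im) - 20, 'example.com', $white);
    imagejpeg($im, $target, 85);
    imagedestroy($im);
}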
