Delay Loading of Webpage Content / Debounce Cache

@Angie530

Posted in: #Php #Seo

Background: I am implementing a custom cache handler in PHP. The cache script is triggered on-demand by website traffic. I want to intelligently handle an influx of simultaneous requests. To do this, I'm using the first request to trigger the cache writer. Concurrent requests are deferred until after the cache is built. I welcome debates regarding this approach, but I still want an answer to my actual question below.
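(For context, the deferral scheme described above can be sketched roughly like this; the paths, the `buildPage()` helper, and the use of `flock()` are illustrative assumptions, not part of the original setup.)

```php
<?php
// Illustrative sketch: first request builds the cache under an exclusive
// lock; concurrent requests block on the same lock, then serve the file.
$cacheFile = '/tmp/page.cache';
$lockFile  = $cacheFile . '.lock';

function buildPage(): string {
    return '<html>...</html>'; // stand-in for the real page builder
}

$lock = fopen($lockFile, 'c');
if (flock($lock, LOCK_EX | LOCK_NB)) {
    // First request in: build the cache, then release the lock.
    file_put_contents($cacheFile, buildPage());
    flock($lock, LOCK_UN);
} else {
    // Concurrent request: wait for the builder to finish, then proceed.
    flock($lock, LOCK_EX);
    flock($lock, LOCK_UN);
}
fclose($lock);
echo file_get_contents($cacheFile);
```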

Question: I'd like to know the best method to defer loading of webpage content. Here are two options I'm considering.

Option 1: Delay the response server-side. I don't like this option because I could end up having hundreds of sleeping connections, and I don't think hosts would like that.

<?php
// Hold the connection open, polling until the cache writer has finished.
while (!file_exists($cache)) sleep(1);


Option 2: Reload the page client-side. I don't like this option because the requests could be doubled, or tripled, depending on how long it takes to build the cache. With this I'm also worried about SEO. Will bots retry the page? Will they try more than once if the page still isn't ready?

<?php
header('HTTP/1.1 503 Service Temporarily Unavailable');
header('Status: 503 Service Temporarily Unavailable');
// Maybe code 408 (Request Timeout) instead
header('Retry-After: 1');
header('Refresh: 1');
// Maybe some JavaScript to reload the page instead


Or maybe there's another, better approach?




1 Comment


@Cugini213

Performance and reliability are important factors for SEO, so deferring the page load seems counterproductive, at least in SEO terms. Having said that, provided your initial response is relatively quick and includes a version of the content, even if not the latest, you should be OK. Perhaps one of these options would be a satisfactory solution:


You could rebuild the entire website cache using a cron job (or scheduled task), running once every 5 minutes (more frequently if required, or less frequently if the site has hundreds of pages). Requests are responded to immediately using the most recent cached version available. To do this, your cache might be built to a temporary file, with the 'live' cache file overwritten by it once the build process has finished.
You could run a script to build an initial cache for every page, and then on each page request send back the most recent cached version in the HTTP response. Inside the page, an AJAX call to a separate script triggers the cache writer, keeping the connection open for as long as it needs, then returning the updated cache version in the response for the page to update itself with the new content (replacing the contents of a <div></div>, for example).
If showing an older cached version is simply not an option for you (though to me this defeats the principle of a cache), then you could return the page template in the HTTP response with an empty <div></div> for the content (except perhaps a spinner and/or a message saying that content is loading), fire off an AJAX call to trigger the cache writer and fetch the content, and update the div element as soon as possible.
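The temp-file-then-overwrite step in the first option might look something like this (the paths and the `buildPage()` helper are placeholders). On the same filesystem, `rename()` is atomic, so a request reading the cache never sees a half-written file:

```php
<?php
// Placeholder builder and paths, for illustration only.
function buildPage(): string {
    return '<html>fresh content</html>';
}

$cacheFile = '/tmp/site.cache';
$tmpFile   = $cacheFile . '.tmp.' . getmypid();

file_put_contents($tmpFile, buildPage()); // build into a temp file first
rename($tmpFile, $cacheFile);             // atomically swap it into place
```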


If the AJAX calls are likely to wait more than 15 seconds for the cache writer to finish, then even though you don't really want to double or triple the number of requests to your server, you may need to send an AJAX response of 'waiting' (or something like that) and have the browser retry; otherwise the browser may look and feel like it has crashed while waiting for the response.
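A capped wait on the server side of that AJAX call could be sketched like this (the 15-second cap comes from the paragraph above; the JSON shape and file paths are assumptions):

```php
<?php
// Hypothetical AJAX endpoint logic: return the cached page once ready,
// or a 'waiting' status after a capped wait so the client retries.
function cacheResponse(string $cacheFile, int $capSeconds = 15): array {
    $deadline = time() + $capSeconds;
    while (!file_exists($cacheFile) && time() < $deadline) {
        usleep(250000); // poll every 250 ms rather than spinning
    }
    if (file_exists($cacheFile)) {
        return ['status' => 'ready', 'html' => file_get_contents($cacheFile)];
    }
    return ['status' => 'waiting']; // client should retry shortly
}

// Usage (as the AJAX endpoint):
// header('Content-Type: application/json');
// echo json_encode(cacheResponse('/tmp/site.cache'));
```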

Provided you script your AJAX so that each user has only one active AJAX call/retry in operation at a time, you should be able to keep your sleeping connections to a minimum; if they are capped at 15 seconds, they'll probably never reach a sleeping state.

Search engines only index (or keep indexed) content that is returned with an HTTP 200 status, and each time the bots are fed an HTTP 503 your page loses some SEO value. If your Retry-After header is set to 1 second, your page could well be retried every second until the cache is rebuilt, pouring away SEO value all the while. I would strongly recommend you:


Keep 503 responses reserved for scheduled maintenance rather than the normal operation of a site with slow-loading pages.
Use AJAX for any deferred loading of content; or better still, skip AJAX, include the latest cached version in the page response, and do your cache writing from a cron job instead of while the browser/user is waiting.


