Track the modification time of files with a regeneratable website
I have a website that is automatically regenerated every day from a fresh copy of an externally supplied database. The database has no concept of a modification time for individual records (e.g., OpenBSD ports). Is there some way to automatically keep track of the last modification time of all files, based on a SHA-1 signature or something similar?
The main purpose of tracking modification times is so that an XML sitemap can include them, alerting Google et al. to which pages may have changed and which have not, in order to optimize crawling.
More posts by @Murray432
I don't know of a specific tool that does this. However, since you write the generator, you could work out a simple process where you (1) generate the page into a temporary location and (2) copy the resulting page to where it goes only if it changed. This assumes your generator produces exactly the same output every time for unchanged input (i.e., you do not embed a creation date and time in your documents).
So...
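# "generate" stands for your own page generator; it writes the fresh page to /tmp
generate /tmp/page-123
# cmp -s is silent and exits non-zero if the files differ or the target is missing
if ! cmp -s /tmp/page-123 "$WEBSITE_PATH/page-123"
then
    cp /tmp/page-123 "$WEBSITE_PATH/page-123"
fi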
Then your other tools can make use of the modification time of the file under $WEBSITE_PATH, since that time only changes when the page content actually changes, not every time a new database comes in.
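To extend this to the whole site and feed the sitemap, something along these lines might work. It is only a sketch under a few assumptions: the function names publish_changed and sitemap_entries are made up for illustration, the generator is assumed to have written all fresh pages into a flat staging directory, and date is assumed to accept -r FILE (GNU date does; some BSD variants, OpenBSD among them, only take seconds there, so you would need stat or similar instead).

```shell
#!/bin/sh
# Sketch: publish only pages whose content changed, so that mtimes in the
# live tree reflect real modifications, then emit sitemap entries from them.
# publish_changed and sitemap_entries are hypothetical helper names.

# Copy each regular file from staging dir $1 into live dir $2 only if it differs.
publish_changed() {
    staging=$1
    live=$2
    for new in "$staging"/*; do
        [ -f "$new" ] || continue
        old="$live/$(basename "$new")"
        # cmp -s exits non-zero when the files differ or $old does not exist
        if ! cmp -s "$new" "$old"; then
            cp "$new" "$old"
        fi
    done
}

# Print one sitemap <url> element per regular file in live dir $1,
# prefixing each file name with base URL $2.
sitemap_entries() {
    live=$1
    base_url=$2
    for page in "$live"/*; do
        [ -f "$page" ] || continue
        # Assumption: GNU date, where -r FILE means "use FILE's mtime"
        lastmod=$(date -u -r "$page" +%Y-%m-%dT%H:%M:%SZ)
        printf '<url><loc>%s/%s</loc><lastmod>%s</lastmod></url>\n' \
            "$base_url" "$(basename "$page")" "$lastmod"
    done
}
```

You would call publish_changed /tmp/staging "$WEBSITE_PATH" after each regeneration, then wrap the output of sitemap_entries "$WEBSITE_PATH" https://example.org (placeholder URL) in the usual urlset element. Note that this relies on file content comparison rather than stored SHA-1 signatures; cmp does the same job without needing a separate hash database.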