Miguel251

Logging what was in the content of a page

@Miguel251

Posted in: #Analytics #Apache2 #Performance

I'm one of the people working on a Java-based website that has Apache httpd in front of it.

The pages are moving towards personalized content, so just about every user will get a unique page containing a different combination of blocks, possibly with different content inside each block.

What we want is logging that contains everything you see in a typical Apache httpd access log (technical details such as performance, cookies, header fields, etc.) plus the functional information about what was in the HTML of that page (i.e. which block was shown where, with which content, and why it was put there).

For each content block we have a string that describes it (let's call this the content-string).

We came up with the following ideas for measuring this:


1. Log the request id together with the content-string inside the application while each block is being generated. Later we could combine the technical measurements (from the Apache access log) with the content logs. This idea was discarded when we found it meant 'no caching' of the content blocks.
2. Send the content-string as an extra HTTP response header and log it using a standard Apache feature. This means we must hold back the content until we have everything, because once we start sending HTML to the user there is no way to add an extra response header.
3. Use a sniffer (i.e. something like what is deep inside Tealeaf or Oracle RUEI) and have it extract the content-string from inside the HTML (hidden inside HTML comments). I haven't found anything open source yet.
4. Use an Apache module to extract the content-string and place it into the log file.
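
To make idea 2 concrete, here is a minimal sketch of the Java side, under some assumptions: the header name `X-Content-Strings`, the `;` separator, the length cap and the class name are all hypothetical, not something we have in place. The application would collect one content-string per rendered block and set the joined value as a response header before any HTML is sent; Apache's `mod_log_config` can then log a response header with the `%{X-Content-Strings}o` format string.

```java
import java.util.ArrayList;
import java.util.List;

/** Collects the content-strings of the blocks rendered for one request. */
public class ContentStringHeader {
    // Hypothetical header name; Apache could log it with %{X-Content-Strings}o.
    public static final String HEADER_NAME = "X-Content-Strings";
    // Assumed cap so the value stays well below typical header size limits.
    private static final int MAX_LENGTH = 4000;

    private final List<String> contentStrings = new ArrayList<>();

    /** Call once per rendered block, in page order. */
    public void add(String contentString) {
        // Replace characters that would break the header line or the separator.
        contentStrings.add(contentString.replaceAll("[\\r\\n;]", "_"));
    }

    /** Builds the header value; it has to be set before the body is sent. */
    public String headerValue() {
        String joined = String.join(";", contentStrings);
        return joined.length() <= MAX_LENGTH ? joined : joined.substring(0, MAX_LENGTH);
    }

    public static void main(String[] args) {
        ContentStringHeader header = new ContentStringHeader();
        header.add("top:banner-42:segment=sports");   // made-up content-strings
        header.add("side:offers-7:default");
        System.out.println(HEADER_NAME + ": " + header.headerValue());
    }
}
```

The sketch does not remove the drawback described in item 2: the whole page still has to be rendered (or at least all blocks chosen) before the first byte goes out.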


So ideas 1 and 2 don't work. In addition, having two measurement points would make joining the various datasets pretty hard to do right.
For ideas 3 and 4 I have not been able to find a viable solution yet.
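
For ideas 3 and 4, the extraction step itself is simple; the hard part is where to run it. As an illustration only, assuming each block is wrapped in a comment of the hypothetical form `<!-- content-string: ... -->`, pulling the strings out of a complete HTML body could look like this in Java (a sniffer or Apache module would have to do the equivalent on the response stream, including chunked or compressed responses, which this sketch ignores):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts content-strings hidden in HTML comments of a full response body. */
public class ContentStringExtractor {
    // Hypothetical marker format: <!-- content-string: ... -->
    private static final Pattern MARKER =
        Pattern.compile("<!--\\s*content-string:\\s*(.*?)\\s*-->", Pattern.DOTALL);

    /** Returns the content-strings in the order they appear in the HTML. */
    public static List<String> extract(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = MARKER.matcher(html);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up page fragment with two marked blocks.
        String html = "<div><!-- content-string: top:banner-42 --><p>Hi</p></div>"
                    + "<div><!-- content-string: side:offers-7 --></div>";
        System.out.println(extract(html));
    }
}
```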

So I'm asking here: what is a solution that will work for a high-traffic website? What other ideas are out there, and what practical tools are available?
