Sitemap file getting double compressed

@Yeniel560

Posted in: #Apache #Gzip #HttpHeaders #XmlSitemap

I have a script that generates my XML sitemap and writes it to the file sitemap.xml.gz, i.e. an XML file compressed with gzip. The file is definitely written correctly: when I download it via FTP, it unzips fine.

However, when I download the file directly from the site (over HTTP), the resulting file appears to be doubly compressed. When I unzip it, the extracted sitemap.xml is binary. If I rename that to sitemap2.xml.gz and unzip again, I get the true XML file.
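One way to confirm the double compression without renaming files by hand is to count how many gzip layers wrap the downloaded bytes. The following is a minimal sketch, not part of the original setup: the function name `count_gzip_layers` and the `max_layers` guard are hypothetical, and it assumes the whole file fits in memory.

```python
import gzip

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def count_gzip_layers(data: bytes, max_layers: int = 5) -> int:
    """Return how many nested gzip layers wrap `data` (hypothetical helper).

    The loop decompresses while the payload still starts with the gzip
    magic bytes. `max_layers` is an arbitrary safety limit.
    """
    layers = 0
    while layers < max_layers and data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
        layers += 1
    return layers
```

A correctly served sitemap.xml.gz should report one layer; the file described above would report two.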

So I think the server (Apache2) is for some reason taking the .gz file and serving it with gzip compression again. The headers for the file come back as this:

Status: HTTP/1.1 200 OK
Date: Mon, 16 Jul 2012 00:00:47 GMT
Server: Apache
Last-Modified: Sun, 15 Jul 2012 23:35:26 GMT
ETag: "89fff2-3bc46-4c4e6c48deb80"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Connection: close
Transfer-Encoding: chunked
Content-Type: application/x-gzip
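
These headers already hint at the problem: Content-Encoding: gzip is applied on top of a body whose Content-Type says it is gzip data. A rough heuristic for spotting that combination could look like the sketch below; the function name `looks_double_gzipped` and the set of content types are assumptions made for illustration.

```python
# Content types that indicate the body is already a gzip file (assumed list).
GZIP_CONTENT_TYPES = {"application/x-gzip", "application/gzip"}


def looks_double_gzipped(headers: dict) -> bool:
    """Heuristic: True if a gzip Content-Encoding wraps a gzip Content-Type."""
    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {k.lower(): v.strip().lower() for k, v in headers.items()}
    encoding = normalized.get("content-encoding", "")
    # Drop any parameters such as "; charset=..." from the content type.
    content_type = normalized.get("content-type", "").split(";")[0].strip()
    return "gzip" in encoding and content_type in GZIP_CONTENT_TYPES
```

Applied to the response above (Content-Encoding: gzip with Content-Type: application/x-gzip), this check returns True.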


In my httpd.conf I have this:

# compress all text & html:
AddOutputFilterByType DEFLATE text/html text/plain text/xml application/xml text/css text/javascript application/x-javascript application/javascript


My VirtualHost declaration only contains some mod_rewrite rules.

Anyone have any ideas why Apache might be sending the gzip header for this file?



UPDATE: I removed the application/xml entry from the AddOutputFilterByType line, and the file now downloads normally like any other binary file. However, the problem now is that regular .xml files are no longer sent gzipped.

So it seems the server is deciding that .xml.gz files should be treated as application/xml, even though it sends them with the header Content-Type: application/x-gzip.

I also checked the /etc/mime-types file; it doesn't have an entry for gzip and has this comment at the top:


Note: Compression schemes like "gzip", "bzip", and "compress" are not
actually "mime-types". They are "encodings" and hence must not have
entries in this file to map their extensions. The "mime-type" of an
encoded file refers to the type of data that has been encoded, not the
type of encoding.


1 Comment


@Kevin317

Quoting @cyberx86 over at ServerFault (who you should go and vote up):


The .xml.gz file type may be defined as an XML file (e.g. with ForceType in a FilesMatch block), which would cause Apache to match it to the type above.

I think you can get around that by adding an exception, above it:

SetEnvIfNoCase Request_URI "\.xml\.gz$" no-gzip dont-vary
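
The second argument to SetEnvIfNoCase is a regular expression, so an unescaped `.` matches any character, while `\.` matches only a literal dot. The sketch below illustrates the difference using Python's `re` module as a stand-in for Apache's regex engine (an approximation, not Apache itself); the names `LOOSE`, `STRICT` and `matches` are hypothetical.

```python
import re

# Unescaped dots: each "." matches any single character.
LOOSE = re.compile(r".xml.gz$", re.IGNORECASE)
# Escaped dots: only a literal ".xml.gz" suffix matches.
STRICT = re.compile(r"\.xml\.gz$", re.IGNORECASE)


def matches(pattern: re.Pattern, uri: str) -> bool:
    """Return True if `pattern` is found anywhere in `uri`."""
    return pattern.search(uri) is not None
```

Both patterns match /sitemap.xml.gz, but only the loose one also matches a URI such as /sitemap-xml-gz.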
