Why are there significant differences in file sizes despite the same format and approximate pixel counts?

@Sarah814

Posted in: #FileSize

I'm not entirely sure this is the right community for this question but it doesn't seem to fit anywhere else. I also don't see any similar questions.

Image one, image two.

I've included the Dropbox links because the images are too large for Imgur or other free hosting sites. The first is a 10,830 × 4,981 PNG (53.9 megapixels) of 2,400,601 bytes (2.3 megabytes). The second is a 10,812 × 5,159 PNG (55.8 megapixels) of 16,451,853 bytes (15.7 megabytes). Jeffrey Friedl's tool reports the same information for both (shown here for the first image):

PNG
Animation no
Bit Depth 8
Color Type RGB with Alpha
Compression Deflate/Inflate
Filter Adaptive
Image Size 10,830 × 4,981
Interlace Noninterlaced
Gamma 1.7362
Profile Name Photoshop ICC profile
Pixels Per Unit X 3,780
Pixels Per Unit Y 3,780
Pixel Units meters


Can anyone explain why the two file sizes are so different despite being similar or identical in every other way? I haven't been able to find an answer anywhere else, although that might be because of my inability to phrase the question correctly or use the right terminology.


2 Comments


@RJPawlick971

The bit string 1010101010 can be encoded different ways. Each different way requires different amounts of file space. Different formats encode in different ways.
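As a rough illustration of how the same number of bits can take different amounts of space once encoded, here is a minimal run-length encoding sketch. Run-length encoding is only an easy-to-follow stand-in here (PNG itself uses Deflate), and the function name `run_length_encode` is hypothetical.

```python
def run_length_encode(bits: str) -> list[tuple[str, int]]:
    """Collapse a bit string into (bit, run_length) pairs."""
    runs: list[tuple[str, int]] = []
    for bit in bits:
        if runs and runs[-1][0] == bit:
            # Same bit as the previous one: extend the current run.
            runs[-1] = (bit, runs[-1][1] + 1)
        else:
            # Bit changed: start a new run.
            runs.append((bit, 1))
    return runs


alternating = run_length_encode("1010101010")  # ten runs of length 1
grouped = run_length_encode("1111100000")      # two runs of length 5
print(len(alternating), len(grouped))
```

Both inputs are ten bits long, but the alternating string needs ten entries while the grouped one needs two.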

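Compare two ways of handling the string above: NRZ (non-return-to-zero) signals each bit with a single level, while CRC (cyclic redundancy check) appends check bits to the data, so the CRC version occupies more space. Strictly speaking, NRZ is a line code and CRC is an error-detecting code, and neither is a compression scheme; there are many schemes for compressing data to save space over simple strings of bits.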

Here are some diagrams to help you see how counting the clock cycles can help you visualize the extra steps going from 0 to 1 and back.

NRZ and its variations

CRC and its flavours

The alignment of graphical elements changes how much data the raster scan has to describe. A horizontal path needs a beginning and an end on only one scan line, which takes up very little space in the file. A diagonal path needs a start and a stop on every scan line it crosses, which takes significantly more space to specify. The file has no more pixels, but the alternating "on" and "off" transitions on each scan line require more "instructions", and those take up more file space.
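The "start and stop on each scan line" idea can be sketched by counting runs of identical pixels per scanline. This is a simplified model on a hypothetical 16 × 16 one-bit canvas, not how Deflate actually measures cost; the helper name `count_runs` is made up for the example.

```python
from itertools import groupby

WIDTH = HEIGHT = 16  # hypothetical tiny canvas


def count_runs(rows):
    """Total number of same-valued runs, summed over all scanlines."""
    return sum(sum(1 for _ in groupby(row)) for row in rows)


# Horizontal stroke: one scanline holds the whole line (columns 2..13 of row 8).
horizontal = [[0] * WIDTH for _ in range(HEIGHT)]
for x in range(2, 14):
    horizontal[8][x] = 1

# Diagonal stroke: every scanline holds one pixel of the line.
diagonal = [[0] * WIDTH for _ in range(HEIGHT)]
for i in range(HEIGHT):
    diagonal[i][i] = 1

horizontal_runs = count_runs(horizontal)
diagonal_runs = count_runs(diagonal)
print(horizontal_runs, diagonal_runs)
```

The horizontal stroke disturbs one scanline and the diagonal disturbs all sixteen, so the run count rises from 18 to 46. A real PNG encoder can sometimes exploit the regularity of a clean diagonal, so treat the numbers as an illustration of the reasoning rather than a prediction of file size.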


@Ann6370331

So PNG uses compression and, put simply, the best case is where all bytes are identical and the worst case is random data.
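To see the two extremes, here is a small sketch using Python's `zlib` module, which implements Deflate. The buffer size and the seed are arbitrary choices for the example.

```python
import random
import zlib

SIZE = 100_000  # bytes in each test buffer (arbitrary)

uniform = bytes(SIZE)  # best case: every byte is zero
rng = random.Random(0)  # fixed seed so the run is repeatable
noisy = bytes(rng.getrandbits(8) for _ in range(SIZE))  # near worst case

uniform_compressed = len(zlib.compress(uniform, 9))
noisy_compressed = len(zlib.compress(noisy, 9))
print(uniform_compressed, noisy_compressed)
```

The uniform buffer shrinks to a tiny fraction of its size, while the pseudo-random one stays at roughly its original size.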

My very basic understanding of PNG is that it works on "scanlines": before compressing, the encoder can try several filtering schemes on each scanline and keep whichever works best for that line. The filtered data is then compressed with Deflate, the same algorithm gzip uses, which matches the "Deflate/Inflate" entry in your metadata.
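A minimal sketch of why filtering helps, assuming a hypothetical 8-bit greyscale image (one byte per pixel) and only PNG's "Sub" filter, which stores each byte as the difference from its left neighbour. A real encoder also prepends a filter-type byte to every scanline and may pick a different filter per line; both are left out here.

```python
import zlib

WIDTH, HEIGHT = 1024, 64  # hypothetical image size


def sub_filter(row: bytes) -> bytes:
    """PNG 'Sub' filter for 1-byte pixels: each byte minus its left neighbour."""
    return bytes((row[i] - (row[i - 1] if i else 0)) % 256 for i in range(len(row)))


def sub_unfilter(filtered: bytes) -> bytes:
    """Invert sub_filter by adding the reconstructed left neighbour back."""
    out = bytearray()
    for i, value in enumerate(filtered):
        out.append((value + (out[i - 1] if i else 0)) % 256)
    return bytes(out)


# A smooth horizontal gradient, identical on every scanline.
gradient_row = bytes((3 * x) % 256 for x in range(WIDTH))
raw = gradient_row * HEIGHT
filtered = sub_filter(gradient_row) * HEIGHT

raw_size = len(zlib.compress(raw, 9))
filtered_size = len(zlib.compress(filtered, 9))
print(raw_size, filtered_size)
```

After filtering, the gradient becomes a long run of the same small difference, which Deflate handles more cheaply than the original ramp of changing values.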

The most obvious thing about your two images is the difference in density of the lines. Another way to put this is that image 2 is less uniform on a line-by-line byte-by-byte basis (i.e. more randomness).
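As a stand-in for the difference in line density, the sketch below compresses two synthetic canvases of the same size, one with few marked pixels and one with many. The canvases are invented for the example and are not derived from the two images in the question.

```python
import random
import zlib

WIDTH = HEIGHT = 256  # hypothetical 8-bit greyscale canvas


def scattered_pixels(count: int, seed: int) -> bytes:
    """Return a black canvas with `count` white pixels at pseudo-random positions."""
    rng = random.Random(seed)
    canvas = bytearray(WIDTH * HEIGHT)
    for _ in range(count):
        canvas[rng.randrange(WIDTH * HEIGHT)] = 255
    return bytes(canvas)


sparse_size = len(zlib.compress(scattered_pixels(50, seed=1), 9))
dense_size = len(zlib.compress(scattered_pixels(5000, seed=1), 9))
print(sparse_size, dense_size)
```

Both canvases hold exactly the same number of pixels, yet the denser one compresses to a noticeably larger size because there is more detail to describe.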

The best way to increase compression for these two images is to reduce the color depth: try an indexed color mode or greyscale (8-bit PNG) rather than RGB + alpha. This will reduce the uncompressed data size to a third (or to a quarter if you are using alpha, as you are here).
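The arithmetic for the first image can be sketched as follows, assuming 8 bits per channel (or per palette index) and ignoring PNG's per-scanline filter byte and chunk overhead.

```python
WIDTH, HEIGHT = 10_830, 4_981  # dimensions of image one from the question

# Bytes per pixel at 8 bits per channel (or 8 bits per palette index).
BYTES_PER_PIXEL = {"RGB with alpha": 4, "RGB": 3, "indexed or greyscale": 1}

raw_sizes = {name: WIDTH * HEIGHT * bpp for name, bpp in BYTES_PER_PIXEL.items()}
for name, size in raw_sizes.items():
    print(f"{name}: {size:,} bytes")
```

Going from RGB with alpha to a single byte per pixel cuts the data the compressor has to work through from about 216 million bytes to about 54 million.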

Note also that a "2 megapixel difference" in pixel dimensions is actually a 6 MB difference in uncompressed data size for an RGB image (2 MB per channel), or 8 MB for RGB with alpha.
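Plugging in the exact figures from the question gives a sense of scale. This sketch assumes 4 bytes per pixel, based on the "RGB with Alpha" and "Bit Depth 8" entries in the metadata.

```python
# Dimensions and file sizes quoted in the question.
IMAGES = {
    "image one": {"width": 10_830, "height": 4_981, "file_bytes": 2_400_601},
    "image two": {"width": 10_812, "height": 5_159, "file_bytes": 16_451_853},
}
BYTES_PER_PIXEL = 4  # 8-bit RGB with alpha, per the metadata in the question

for info in IMAGES.values():
    info["pixels"] = info["width"] * info["height"]
    info["raw_bytes"] = info["pixels"] * BYTES_PER_PIXEL
    info["ratio"] = info["raw_bytes"] / info["file_bytes"]  # compression ratio

pixel_gap = IMAGES["image two"]["pixels"] - IMAGES["image one"]["pixels"]
raw_gap = pixel_gap * BYTES_PER_PIXEL
file_gap = IMAGES["image two"]["file_bytes"] - IMAGES["image one"]["file_bytes"]
print(pixel_gap, raw_gap, file_gap)
```

The second image has about 1.8 million more pixels (roughly 7.3 million more uncompressed bytes, a difference of around 3%), yet its file is almost seven times larger. That suggests the pixel count accounts for only a small part of the gap and that the rest comes from how well each image compresses.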

If I recall, one of the users on this stackexchange actually "wrote" PNG.
