What is the difference between 32 Bit Float and 32 Bit fixed pixel types?
I am using a real time graphics program that offers the following "pixel formats".
I understand many of them but one thing I do not understand is the difference between float and fixed.
Why would someone use one over the other? They both take up the same amount of space so I do not see how one could offer more precision.
More posts by @Kaufman565
Let's keep things simple by thinking about a monochrome image.
To represent our image we cut it up into a grid of pixels and record a number representing the light intensity at each pixel. For simplicity let's assume that the value linearly represents the amount of light (reality is a little more complicated than that, but it's close enough for now).
However we still have to actually represent that number as a bit pattern. There are an infinite number of possible light intensity values but only a finite number of possible bit patterns of a given size. Clearly we must make a compromise.
There are basically two strategies for representing numbers on computers, fixed point and floating point (integers can be considered a special case of fixed point). There is also the question of signed versus unsigned.
In a fixed point number we fix the scale factor in advance. So for example to represent numbers in the range 0 to 1 inclusive we might use a 32-bit unsigned number with a scale factor of 1/(2^32 - 1).
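To make the fixed scale factor concrete, here is a minimal Python sketch of that 0-to-1 encoding. The function names are made up for illustration, and the clamping and round-to-nearest behaviour are my assumptions; a real graphics program or GPU may handle those details differently.

```python
# Hypothetical helpers: a 32-bit unsigned value with a fixed scale of 1/(2^32 - 1).
SCALE = 2**32 - 1  # largest 32-bit unsigned value; it represents intensity 1.0


def encode_unorm32(intensity: float) -> int:
    """Quantize an intensity in [0, 1] to a 32-bit unsigned integer."""
    clamped = min(max(intensity, 0.0), 1.0)  # assumption: out-of-range input is clamped
    return round(clamped * SCALE)


def decode_unorm32(stored: int) -> float:
    """Recover the intensity by applying the fixed scale factor."""
    return stored / SCALE


if __name__ == "__main__":
    print(encode_unorm32(0.0), encode_unorm32(1.0))
    print(decode_unorm32(1))  # the smallest non-zero intensity, which is also the step size
```

The scale factor never appears in the stored data. It is part of the format definition, so all 32 bits go towards resolution inside the 0 to 1 range.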
In a floating point number we allow a range of scale factors to be chosen. We do this by splitting our bits up. Some bits are used to store a binary number and other bits are used to store what power of 2 to scale that binary number by (there is a bit of extra trickery in practice, read IEEE 754 if you really want the gory details). With floating point numbers we can store a very wide range of numbers but the precision varies depending on the value of the number.
When dealing with a camera, screen or printer our hardware determines the darkest possible and lightest possible values and typically breaks up the possible values in between in a roughly equal manner. Negative values don't really make any physical sense since there is no such thing as negative light.
On the other hand when we get into the world of 3D modelling it can be useful to represent a much wider range of brightness values. There can be a vast difference in brightness between the brightest and darkest parts of the world and while negative light doesn't physically exist that doesn't mean you can't have it in a 3D model. So formats using signed floating point numbers can start to make sense.
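Of course there is a price to pay. The 32-bit signed floating point number can represent much larger and much smaller values than the 32-bit unsigned fixed point number, but it only has 24 bits of significand. For values between 0.5 and 1 its steps are 2^-24 apart, whereas the 32-bit fixed point format described earlier has steps of about 2^-32 across its whole range. Close to zero the situation reverses and the float is the more precise of the two.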
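If you want to see that trade-off in numbers, the following Python sketch compares the gap between neighbouring 32-bit floats with the constant step of a 0-to-1 unsigned 32-bit fixed point format. The helper name float32_step is my own, and it is only meant for positive, finite values.

```python
import struct

# Constant step of a 32-bit unsigned fixed point format covering 0 to 1.
FIXED_STEP = 1 / (2**32 - 1)


def float32_step(value: float) -> float:
    """Gap between `value` (rounded to float32) and the next larger float32.

    Sketch only: assumes `value` is positive and finite.
    """
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    (current,) = struct.unpack(">f", struct.pack(">I", bits))
    (following,) = struct.unpack(">f", struct.pack(">I", bits + 1))
    return following - current


if __name__ == "__main__":
    for v in (0.001, 0.5, 1000.0):
        print(f"value={v}: float32 step={float32_step(v):.3e}, fixed step={FIXED_STEP:.3e}")
```

Near 0.001 the float step comes out smaller than the fixed step, while at 0.5 it is larger. The value 1000.0 cannot be stored in the 0-to-1 fixed format at all.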
So basically, "float" means that it stores a significand and an exponent. A "standard" 32-bit float uses 1 bit for the sign, 8 bits for the exponent and 23 bits for the significand. The key here is that the scale (the exponent) is stored with the number.
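As a rough illustration of that 1/8/23 layout, this Python sketch pulls the three fields out of a 32-bit float using the standard struct module. The function name float32_fields is invented for the example.

```python
import struct


def float32_fields(value: float):
    """Return (sign, biased exponent, fraction bits) of `value` stored as a float32."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31               # 1 bit
    exponent = (bits >> 23) & 0xFF  # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF      # 23 bits of significand after the implicit leading 1
    return sign, exponent, fraction


if __name__ == "__main__":
    for v in (1.0, 0.5, -2.0, 0.75):
        print(v, float32_fields(v))
```

For example, 1.0 and 0.5 have the same fraction bits and differ only in the exponent field, which is the "scale stored with the number" idea.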
"Fixed" is stored hi-word/low-word "packed". A 32-bit fixed point number is probably going to be a 16-bit integer before the radix point and a 16-bit integer after it (the fractional part). This does not comport with @elegent's comment about a range between 0 and 1: that link may be implementation specific, I don't know. It sounds to me like that implementation uses all 32 bits for the fractional portion and then relies on context for scale (the exponent), or that they are leveraging a color model (likely) that uses the same range.
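For what the 16.16 guess would look like in practice, here is a small Python sketch. It assumes an unsigned 16.16 layout (high word is the integer part, low word is the fraction in units of 1/65536). Whether the program in question really uses that layout is not something I can confirm, and the function names are hypothetical.

```python
ONE = 1 << 16  # 65536: the raw value that represents 1.0 in unsigned 16.16


def float_to_fixed16_16(value: float) -> int:
    """Pack a non-negative value into unsigned 16.16; assumes it fits in range."""
    return round(value * ONE) & 0xFFFFFFFF


def fixed16_16_to_float(raw: int) -> float:
    """Unpack unsigned 16.16 by dividing out the fixed scale factor."""
    return raw / ONE


if __name__ == "__main__":
    raw = float_to_fixed16_16(1.5)
    print(hex(raw), "hi word:", raw >> 16, "low word:", hex(raw & 0xFFFF))
    print(fixed16_16_to_float(raw))
```

The only difference from a 0-to-1 format is where the radix point is assumed to sit. The stored bits are a plain integer either way.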
Regarding precision: the number of bits available does not equate to better precision on its own, but I am completely unqualified to speak about the distinctions beyond the obvious overflow/underflow issues.
stackoverflow.com/questions/8638792/how-to-convert-packed-integer-16-16-fixed-point-to-float
stackoverflow.com/questions/7524838/fixed-point-vs-floating-point-number
In the context of your screen-capped menu: they are using fixed as a synonym for integer. 8-bit fixed (rgba) means 8 bits per channel, where each byte represents an integer value between 0 and 255.