Floating Point And Benford's Law, Part Two
A number of readers asked for an explanation of my offhand comment that Benford's Law can be used to show that binary is the best base for doing floating point math.
One of the desired properties of a floating point system is that the "representation error" is as small as possible. For example, suppose we want to express "one third", in ten-place fixed point decimal notation. The closest we can get is "0.3333333333", which has a representation error of 0.0000000000333333…
A useful way to get a handle on representation error is to look at the granularity of the system. By granularity I mean the smallest difference we can make between two values. For ten-place fixed point decimal notation, the smallest nonzero difference between any two numbers is 10-10. In a floating point system, the granularity changes as the exponent changes. Really large numbers have large granularity; the difference between two successive floats might be millions or billions. And really small numbers have tiny granularity. The maximum possible representation error is half the granularity. Since there is a clear relationship between granularity and representation error, I’m just going to talk about granularity from now on. Small granularity = small representation error = goodness.
Let's consider two similar systems, one which uses a binary mantissa and one which uses a hexadecimal mantissa. I’m going to continue my convention of showing binary numbers in blue, and we’ll do hex in green.
Suppose we've got on the one hand, 32 bit IEEE floats, aka "singles". That is, we've got one sign bit, nine bits of exponent biased by -255 (so exponents can be from -255 to 256), a 22 bit mantissa, and an implicit leading "1." So a number like
(0, 100000011, 1100000000000000000000) would be +1.1100 x 24 = 28
I’m getting sick of spelling out the exponents in non-biased binary. From now on, I’m just going to give them in decimal. And I’m going to give the sign bit by plus and minus, not zero and one.
We now want a hex system with roughly similar range that doesn't take up more storage. Suppose we've got one sign bit, seven bits of exponent biased by -63, and six hex digits for the mantissa. We don't get a leading "1." because not every hex number can be so expressed, so we'll have to use a leading 0. That system has roughly the same range, and if we were to "behind the scenes" express this thing as bits, we'd still be using 32 bits. One for the sign, seven for the exponent, and 24 for the mantissa.
In this system,
(+, 2, A00000) would be +0.A00000 x 162 = 160
An immediate disadvantage of this system is that numbers have multiple representations.
(+, 2, A00000) and (+, 3, 0A0000) are the same number. Let’s ignore that for now. We’ll ignore all hex mantissas with leading zeros. (And besides, the binary IEEE system wastes lots of cases for denormals and NaNs too, so it's not clear that this is any worse.)
What's the granularity of the hex system when, say, the exponent is 2? Well, consider a number like
(+, 2, A00000) – the smallest possible number higher than this is (+, 2, A00001) which is 2-16 larger. Clearly, 2-16 is the granularity for all values with an exponent of 2. More generally, if the exponent is N then the granularity is 24N-24.
How would we represent
(+, 2, A00000) in our binary system? That would be (+, 7, 0100000000000000000000) = +1.01 x 27 = 160. The next largest number that can be represented in our system is (+, 7, 0100000000000000000001) which is 2-15 larger. So the hex system has smaller granularity and hence smaller representation error, and is therefore the better system, right?
Not so fast. Let’s make a chart.
Decimal Hex Binary Binary Granularity Exponent
16 (+, 2,
100000) (+, 4, 0000000000000000000000) -18
32 (+, 2, 200000) (+, 5, 0000000000000000000000) -17
48 (+, 2, 300000) (+, 5, 1000000000000000000000) -17
64 (+, 2, 400000) (+, 6, 0000000000000000000000) -16
80 (+, 2, 500000) (+, 6, 0100000000000000000000) -16
96 (+, 2, 600000) (+, 6, 1000000000000000000000) -16
112 (+, 2, 700000) (+, 6, 1100000000000000000000) -16
128 (+, 2, 800000) (+, 7, 0000000000000000000000) -15
144 (+, 2, 900000) (+, 7, 0010000000000000000000) -15
160 (+, 2, A00000) (+, 7, 0100000000000000000000) -15
176 (+, 2, B00000) (+, 7, 0110000000000000000000) -15
192 (+, 2, C00000) (+, 7, 1000000000000000000000) -15
208 (+, 2, D00000) (+, 7, 1010000000000000000000) -15
224 (+, 2, E00000) (+, 7, 1100000000000000000000) -15
240 (+, 2, F00000) (+, 7, 1110000000000000000000) -15
The hex system always has a granularity exponent of -16. In 8/15ths of the cases, the binary system has worse granularity than the hex system. In 4/15ths of the cases, they have the same granularity, and in only 3/15ths of the cases, the binary system has better granularity. Therefore the hex system is clearly the same or better most of the time. We should use this system instead of binary floats.
Not so fast! We’ve forgotten Benford’s Law! That's true if any number is as likely as any other, but that's not realistic.
Suppose the numbers that we are manipulating obey Benford’s Law. In that case, we would expect that fully a quarter of them would begin with 1 when encoded in hex. We’d expect another quarter of them to begin with 2 or 3, another quarter to begin with 4, 5, 6 or 7, and the remaining quarter to begin with 8, 9, A, B, C, D, E and F. If we make that assumption then we must conclude that the binary system is better half the time, equal a quarter of the time, and worse a quarter of the time.
Clearly this isn't the case just for the exponent 2. For any hex exponent N, the hex system will have a granularity exponent of 4N-24. For numbers in the range expressible by the hex system with that exponent, a quarter of the time, the binary system will have a granularity exponent of 4N-26, a quarter of the time it'll be 4N-25, a quarter of the time it'll be 4N-24, and the remaining quarter it'll be 4N-23, so three-quarters of the time it'll be as good or better. On average, the binary system is considerably better if data are distributed according to Benford's Law.
This is just one example, not a proof. But we could generalize this example and show that for any system with a given number of bits, binary mantissas yield smaller representation errors on average than mantissas in any other base.