Benford's Law
Created | Updated Dec 7, 2008
Benford's Law, simply stated, says that if a number is selected at random from a series with a fixed upper limit greater than 10, the probability of that number starting with the digit 1 is higher than the probability of it starting with 9. In fact, across many real-world collections of numbers, about 30% start with the digit 1, about 18% with the digit 2, about 13% with the digit 3 and so on, down to about 5% starting with the digit 9.
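The percentages above match the usual logarithmic statement of the law, in which the expected share of numbers with first digit d is log10(1 + 1/d). That formula is not spelled out in this entry, so treat the following as a minimal sketch under that assumption; the function name benford_probability is made up for illustration.

```python
import math


def benford_probability(digit: int) -> float:
    """Expected share of numbers whose first digit is `digit` (1 to 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    # Assumed form of the law: P(d) = log10(1 + 1/d)
    return math.log10(1 + 1 / digit)


# Print the expected share for each possible first digit
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

The nine shares add up to 1, since every non-zero number has exactly one first digit.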
Why does this happen?
Suppose we count upwards from one to nine. If a number is selected at random from that range, each digit has an equal chance of being the first digit. By the time we get to 20, however, the digit 1 has built up a significant lead: 11 of the 20 numbers so far, or 55%, start with 1. The other digits do not catch up until we reach 99. Then we get into the hundreds and the digit 1 builds up that lead again. The chance that the upper limit of a range consists entirely of nines is low indeed, hence the distribution seen above.
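The counting argument can be checked directly. The sketch below counts how many of the whole numbers from 1 up to a chosen upper limit start with a given digit; the helper names leading_digit and share_starting_with are illustrative only.

```python
def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive whole number."""
    while n >= 10:
        n //= 10
    return n


def share_starting_with(digit: int, upper: int) -> float:
    """Share of the numbers 1..upper whose first digit is `digit`."""
    count = sum(1 for n in range(1, upper + 1) if leading_digit(n) == digit)
    return count / upper


# The lead of the digit 1 grows and shrinks as the upper limit moves
for upper in (9, 20, 99, 200, 999):
    print(upper, round(share_starting_with(1, upper), 3))
```

With an upper limit of 9, 99 or 999 the digit 1 gets an even one-ninth share; at 20 or 200 it accounts for more than half of the numbers counted so far.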
Whence Came this Discovery?
It was first discovered in 1881, when an astronomer called Simon Newcomb wrote an article in the American Journal of Mathematics noting that the early pages in books of logarithms were more smudged than the later ones. The tables were ordered by first digit, so numbers beginning with lower digits appeared earlier in the book. The early pages were smudged because they were used more, which meant that numbers beginning with lower digits came up more often in calculations. Like all good scientific discoveries, this was completely ignored until he was far too dead to accept any credit for it. In 1938 a physicist named Frank Benford reopened the file and did a bit of statistical analysis. He found the distribution above in nearly every set of data he examined, but was at a loss to explain why. It was not until the mid-1990s that the law was put on a rigorous mathematical footing by Theodore Hill of the Georgia Institute of Technology.
What Use is It?
Well, if this distribution is evident in every large set of random numbers, then a set of numbers which you believe to be randomly generated but which does not fit the distribution probably isn't. This can be used for anything from detecting insurance fraud to sifting extra-terrestrial radio transmissions for signs of intelligence; and if you can find a bookie who does not know of this rule, you can make some money with some well-placed bets.
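One way such a check might be carried out is to count the first digits in a set of whole numbers and compare them with the expected shares using a chi-square statistic. This is only a sketch: the entry does not say which test is used in practice, the logarithmic formula is the assumed form of the law, and the names first_digit, benford_chi_square and CRITICAL_5_PERCENT are invented for this example.

```python
import math
from collections import Counter

# Assumption: chi-square cut-off for 8 degrees of freedom at the 5% level
CRITICAL_5_PERCENT = 15.51


def first_digit(n: int) -> int:
    """First decimal digit of a non-zero whole number."""
    return int(str(abs(n))[0])


def benford_chi_square(values) -> float:
    """Chi-square distance between observed first digits and the expected shares."""
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    total = len(digits)
    statistic = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)  # assumed form of the law
        statistic += (counts[d] - expected) ** 2 / expected
    return statistic


# Powers of two are a standard example of numbers that fit the distribution
fits = [2 ** n for n in range(1, 201)]
# Every three-digit number once: each first digit is equally common
does_not_fit = list(range(100, 1000))

print(benford_chi_square(fits) < CRITICAL_5_PERCENT)
print(benford_chi_square(does_not_fit) > CRITICAL_5_PERCENT)
```

A statistic above the cut-off is a reason to look more closely at the numbers, not proof that anything is amiss.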