What is a Random Number?

WARNING!

This entry is currently being rewritten from scratch. Therefore, it'd be wise not to refer to this entry until the rewrite is done. In the meantime, I welcome suggestions as to what you'd like to see in this article, and what you'd rather not see in it.

Introduction

Random numbers are strange things because they must satisfy a number of seemingly contradictory demands. On one hand, we want them to be unpredictable. On the other hand, once we know a number in a sequence of random numbers, it is no longer unpredictable, so is it still random? (After all, it was before we knew it.)

Take another example. Imagine a lottery that is only held once. A single number is drawn. No one can examine the machine that draws the number. Is the number “random”? (For example, the number might be the initial configuration of our universe.)

Let's see if we can clear up this confusion.

Executive Summary

  • We must know what equidistribution is.
  • A sequence of numbers is random if we can pick an arbitrary subsequence, and it's equidistributed. But:
  • We mustn't cheat and look at the sequence before specifying the subsequence. However,
  • We can look at all past members of the sequence before we decide whether the next (yet unknown) member belongs in the subsequence or not. Still,
  • Once we have decided that a member of the sequence belongs or does not belong in the subsequence, we cannot undo that decision. Finally,
  • We must be able to write a computer program for the rule that chooses the next member of the subsequence.


And all this just to realize that you can't beat the odds. Wow!

Problem Definition

We will consider infinite sequences of real numbers


(1) 〈Un〉 = U0, U1, U2, ..., where 0 ≤ Ui < 1 for all i



and we want to ask, is this a sequence of random numbers? (We will therefore be talking about the entire sequence, not about any individual member.) The sequence is infinite (there is no last element), but once we know what a random infinite sequence is, we can perhaps find out when to call a finite sequence “random”, too. The reason for going about things in this order is that it is actually easier to talk about infinite sequences first, and apply the ideas that we get there to finite sequences later (though not in this article). Also, we're talking about sequences of real numbers first. Sequences of dice throws and the like are addressed below in Definition R5 and following.

Outline

The remainder of this entry will follow the discussion by a mathematician/computer scientist named Donald E. Knuth. He has written what is thought by almost everyone in the field to be the definitive source of information on the analysis of algorithms, a work called “The Art of Computer Programming”, currently published in three volumes (seven are planned, and volume four has three sub-volumes, 4A, 4B, and 4C). The discussion we are paraphrasing takes place in Section 3.5 of Volume 2.1

First, we will show you two possible definitions of randomness for sequences of the form of Eq. (1). We will show how the first definition is too wide whereas the second is too narrow. We will then gradually tighten the first definition and loosen the second until both converge to a definition that should be reasonable for all purposes.

Two Statements

The first statement comes from the mathematician D. H. Lehmer, uttered in 1951:

A random sequence is a vague notion embodying the idea of a sequence in which each term is unpredictable to the uninitiated and whose digits pass a certain number of tests, traditional with statisticians and depending somewhat on the use to which the sequence is to be put.

The second definition is from J. N. Franklin, made in 1962:

The sequence (1) is random if it has every property that is shared by all infinite sequences of independent samples of random variables from the uniform distribution.

Lehmer's statement has a few problems. The worst is that it is not very precise. That's OK for a cocktail-party definition, but it's definitely not OK for mathematicians (who never go to cocktail parties anyway). You really shouldn't say "vague notion", "somewhat", or "uninitiated" in a mathematical definition if you don't want to be sneered at by other mathematicians. (Lehmer is a mathematician of the highest caliber, and he did not give this statement as a formal definition.)

What mathematicians want in this case is a short list of mathematical properties shared by every sequence that is intuitively random. But the list must not be too short, because we must agree that any sequence satisfying the conditions in that list is “random”.

The first requirement is that the numbers in our random sequence (1) are equally distributed between zero and one. How can we write this mathematically?

Let u and v be two real numbers with 0 ≤ u < v < 1. If U is a random variable uniformly distributed between 0 and 1, the probability that U is between u and v is v - u. To see this, think of the numbers between 0 and 1 as written around the circumference of a wide disc. Then mark off u and v on the circumference and roll the disc on its edge. The probability that the disc stops on a point between u and v is then v - u: if u = 0 and v = 1, the entire circumference is marked, and since the disc has to stop somewhere on its circumference, it is certain to stop on a number between 0 and 1. If, e.g., u = 0.25 and v = 0.75, half the circumference is marked, hence the probability of the disc stopping on the marked section is 1/2. And so on.
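If you'd rather spin the disc than take our word for it, here is a little Python sketch (our own, not part of Knuth's discussion; the function name is made up) that estimates the probability by sampling:

```python
import random

def estimate_interval_probability(u, v, trials=100_000, seed=42):
    """Estimate P(u <= U < v) for U uniform on [0, 1) by sampling."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if u <= rng.random() < v)
    return hits / trials

# With u = 0.25 and v = 0.75, half the "circumference" is marked,
# so the estimate should land close to 0.5.
p = estimate_interval_probability(0.25, 0.75)
```

With 100,000 trials the estimate is well within a percentage point of v - u.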

We want to transfer this statement about the number U to a sequence like (1). We can do this by noting that in this case, the average number of numbers between u and v should approach v - u if we sample ever more of the sequence (1). More precisely:

Definition U. Let #(n) be the number of values of j, 0 ≤ j < n, such that

u ≤ Uj < v.

Then, if #(n)/n tends to v - u as n tends to ∞, we call (1) “equidistributed”.
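Definition U can be tried out on a concrete sequence. The sketch below (our own; the function name is made up) uses U_n = nα mod 1 with α the golden ratio; for irrational α this sequence is known to be equidistributed, a classical theorem we won't prove here:

```python
import math

def fraction_in_interval(seq, u, v):
    """#(n)/n from Definition U: the fraction of terms with u <= U_j < v."""
    hits = sum(1 for x in seq if u <= x < v)
    return hits / len(seq)

# U_n = n * alpha mod 1 is equidistributed for irrational alpha
# (Weyl's theorem; not proved here).
alpha = (math.sqrt(5) - 1) / 2
seq = [(n * alpha) % 1 for n in range(100_000)]
ratio = fraction_in_interval(seq, 0.2, 0.7)   # should approach v - u = 0.5
```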

We can generalize this notion of equidistribution to general predicates. Let S(n) be a statement about the integer n and the sequence (1). For example, S(n) might be “u ≤ Un < v”.

Definition A. Let #(n) be the number of values of j, 0 ≤ j < n, such that S(j) is true. We say that S(n) is true with probability p if the limit as n tends to infinity of #(n)/n is p. Symbolically: Pr(S(n)) = p if limn → ∞ #(n)/n = p.

Why is this definition important? It enables us to talk about predicates defined over the entire sequence (1), when the predicate might be true for some members and not true for others. It enables us to quantify the confidence that we can have in a statement about the sequence (1). For example, consider this lemma.

Lemma A. The sequence (1) is equidistributed in the sense of definition U if and only if Pr(Un < x) = x for all x between 0 and 1.

Proof. First assume that the sequence (1) is equidistributed in the sense of Definition U. Setting u = 0 and v = x in the definition gives the claim of the lemma. Conversely, assume that Pr(Un < v) = v for all v between 0 and 1. Observe that the number of indices j such that u ≤ Uj < v equals the number of j's such that Uj < v, minus the number of j's such that Uj < u: #(u ≤ Un < v) = #(Un < v) - #(Un < u). Dividing by n and letting n tend to ∞, the limit of #(u ≤ Un < v)/n exists and equals v - u. Q.E.D.
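The counting identity at the heart of the proof can be checked on a small example (a sketch of our own, with made-up sample values):

```python
def count(seq, pred):
    """Number of terms of seq satisfying the predicate."""
    return sum(1 for x in seq if pred(x))

u, v = 0.3, 0.8
seq = [0.1, 0.35, 0.5, 0.79, 0.8, 0.95, 0.2]

# #(u <= U_n < v) = #(U_n < v) - #(U_n < u)
lhs = count(seq, lambda x: u <= x < v)
rhs = count(seq, lambda x: x < v) - count(seq, lambda x: x < u)
```

Both counts come out equal, as the proof requires.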

This, by the way, is not the only sensible definition of probability. The probability theory of Bruno DeFinetti2 for example treats probability as wholly subjective, with no objective meaning whatsoever. This means that in this theory, the question “What is the probability of event X?” cannot be asked in a meaningful way; the only question of this kind that this theory admits is “What do You think is the probability of event X?” DeFinetti claims, however, that the distinction between objective and subjective standpoints is not important,

but rather the resulting reversals of the rôles and meanings of many concepts, and, above all, of what is ‘rigorous’, both logically and mathematically.3

Under DeFinetti's definition of probability, we can meaningfully talk about the probability of an event that we cannot repeat, not even just reasoning about it, such as the one-time lottery mentioned at the beginning of this entry. (An event like this we will call singular.) DeFinetti says that as long as one is consistent, one is free to assign any probability to an event. This is because under his theory, the probability of an event is not (despite linguistic appearances to the contrary) a property of the event, but rather an opinion of You. For purposes of this discussion, however, we will stick with Knuth's definition, because we fear that it is otherwise totally impossible to talk about what random numbers are. DeFinetti, for all his eloquence, skirts the question of what random numbers are, and we assume this is because the question cannot be meaningfully asked in a subjective theory. (We could be wrong, however. If this is the case, please notify the author and/or the h2g2 editing staff.)

All the above means that we will not talk about the probability of a singular event, because the definition of probability that we will use has a limit built in, which means many (conceptually infinitely many) repetitions of an experiment.

We now return you to our regularly scheduled program.

Equidistribution, ∞-distribution, and Randomness

The question now is, does “equidistributed” equal “random”? The answer to this question is no, and it's not hard to see why. (That's what mathematicians always say when they have worked for hours on a proof.) For if U0, U1, ... and V0, V1, ... are equidistributed sequences, then so is the sequence W0 = 0.5U0, W1 = 0.5 + 0.5V0, W2 = 0.5U1, W3 = 0.5 + 0.5V1, ..., but the terms of this combined sequence are alternately less than and greater than 0.5, which is not random by any reasonable definition. Therefore, we will need a stronger property than equidistribution.
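Here is the interleaving trick in Python (a sketch of our own), showing that the combined sequence flips below and above 0.5 on every step:

```python
import random

rng = random.Random(0)
U = [rng.random() for _ in range(1000)]
V = [rng.random() for _ in range(1000)]

# Interleave: even positions get 0.5*U_n (always < 0.5),
# odd positions get 0.5 + 0.5*V_n (always >= 0.5).
W = []
for u, v in zip(U, V):
    W.append(0.5 * u)
    W.append(0.5 + 0.5 * v)

# Every even-indexed term is below 0.5 and every odd-indexed term is at
# or above 0.5 -- a pattern no random sequence should show.
alternates = all(w < 0.5 if i % 2 == 0 else w >= 0.5
                 for i, w in enumerate(W))
```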

One key insight is that we should not only demand that the unit line be filled uniformly, but higher-order unit cubes, too. This is an extension of the notion of equidistribution to two and more dimensions, and is called k-distribution:

Definition B. The sequence (1) is said to be “k-distributed” for some integer k if


Pr(u1 ≤ Un < v1, ..., uk ≤ Un + k - 1 < vk) = (v1 - u1)...(vk - uk)

for all real numbers 0 ≤ ui < vi < 1 and 1 ≤ i ≤ k. A sequence that is k-distributed for all k is said to be “∞-distributed”. A 1-distributed sequence is equidistributed in the sense of Definition U.
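We can estimate the k = 2 case of Definition B on a finite prefix of a pseudo-random sequence (a sketch of our own; Python's built-in generator stands in for the sequence (1), and the function name is made up):

```python
import random

def pair_fraction(seq, u1, v1, u2, v2):
    """Fraction of overlapping windows (U_j, U_{j+1}) with
    u1 <= U_j < v1 and u2 <= U_{j+1} < v2, as in Definition B, k = 2."""
    n = len(seq) - 1
    hits = sum(1 for j in range(n)
               if u1 <= seq[j] < v1 and u2 <= seq[j + 1] < v2)
    return hits / n

rng = random.Random(1)
seq = [rng.random() for _ in range(200_000)]
# For a 2-distributed sequence this should approach
# (v1 - u1)(v2 - u2) = 0.5 * 0.5 = 0.25.
est = pair_fraction(seq, 0.0, 0.5, 0.25, 0.75)
```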

There is an impressive number of important theorems that are fulfilled by ∞-distributed sequences (we won't go into the details here, see Knuth if you want them). Therefore, ∞-distributed sequences are certainly important from a purely mathematical point of view. So we'll try this:

Definition R1. The sequence (1) is called “random” if it is an ∞-distributed sequence.

One of the impressive theorems alluded to above states that an ∞-distributed sequence will certainly pass many of the known statistical tests for randomness, and probably many more besides. (Therefore, Definition R1 is akin to Lehmer's statement at the beginning of this section.)

Now, if we want to adopt Definition R1 as the official definition of randomness, we must ask ourselves, is every truly random sequence ∞-distributed? (The reverse question is, do we accept any ∞-distributed sequence as random? As we have seen above, those sequences will pass a great many statistical tests, so they won't be distinguishable from truly random sequences by statistical means.) The answer here is no.

If there is indeed such a thing as a process that generates truly random numbers, we should be able to make the assumption that every sequence of real numbers between 0 and 1 is equally likely to occur in the output of such a generator.4 Unfortunately, some of the sequences are not even equidistributed.5 (If a sequence is (k + 1)-distributed, it is also k-distributed. That means that a sequence that is not k-distributed cannot be (k + 1)-distributed. Therefore, if a sequence is not even equidistributed, it cannot be ∞-distributed.)

On the other hand, ∞-distributed sequences are much more common than those sequences that are not even equidistributed. (We won't prove this seemingly innocent statement.) Therefore, we conclude that a random sequence should be ∞-distributed with probability one. There is an important distinction between things that happen certainly, and things that happen with probability one. An event happens with probability one when the ways that it can not happen are so few that they can make no contribution. To put it another way, the set of exceptions must have measure zero. For example, any finite set has measure zero. The set of rational numbers also has measure zero, even though there are infinitely many of them, and even though every real number can be approximated arbitrarily well by rational numbers.

Once we have this out of the way, we can formalize Franklin's statement at the beginning of this section as follows.

Definition R2. The sequence (1) is “random” if the following is true: Let P be a property such that P(〈Vn〉) holds with probability one for a truly random (i.e., independent and uniformly distributed) sequence 〈Vn〉, then P(〈Un〉) also holds.

This is just a longwinded way of saying that every property that holds for a truly random sequence (with probability one) must be shared by a sequence (1) for it to be called random.

This certainly sounds reasonable. After all, we want our random sequences to be indistinguishable from truly random sequences in every respect. Unfortunately, adopting this definition, we are forced to conclude that there are no random sequences at all! Therefore, Definition R2 is certainly too strong.

The argument is very simple. Let P(x0) be the property that the sequence does not begin with x0. Then this property is certainly true with probability one for all truly random sequences and for all x0 with 0 ≤ x0 < 1. If we choose x0 = U0, however, the property P does not hold for the sequence (1) which therefore fails our definition.

So, Definition R2 is certainly too strong. We have not (yet) shown that Definition R1 is too weak, however. Let's attack it some more.

Since almost all real numbers are irrational, we should perhaps insist that Pr(Un is rational) = 0. Generalizing Definition U, we could say that a sequence is equidistributed if, for every S ⊆ [0,1) of measure μ, Pr(Un ∈ S) = μ. In particular, the set of rational numbers has measure zero, so no sequence of rational numbers is equidistributed by that definition. Unfortunately, this definition is again too strict, because the set S = {U0, U1, ...} is of measure zero, yet Pr(Un ∈ S) = 1 for any sequence. Therefore, no sequence would be random by this definition.

OK, so far Definition R1 seems to be OK. Another angle of attack is to notice that if (1) is a random sequence, every subsequence should also be random. For example, if (1) is a random sequence, the subsequence

U0, U3, U6, U9, U12, ..., U3n, ...

should also be a random sequence. We can therefore try this:

Definition R3. The sequence (1) is said to be “random” if each infinite subsequence is ∞-distributed.

Unfortunately, this is again too strong. For every equidistributed sequence has a monotonic subsequence with Us0 < Us1 < Us2 < ... (Warning: do not try to prove this seemingly obvious fact.)

Hmmm... We seem to be at an impasse. As soon as we know the sequence (1), we can choose subsequences that make the sequence fail the definition, even if it is a truly random sequence. But here lies a possible way out: Maybe we shouldn't be allowed to look at the sequence before specifying the subsequence. Or in other words, what if we were forced to specify the indices constituting the subsequence before we could look at the sequence itself? This gives rise to the following definition:

Definition R4. The sequence (1) is said to be “random” if, for every effective algorithm that specifies an infinite sequence of distinct nonnegative integers sn for n ≥ 0, the subsequence Us0, Us1, Us2 ... is ∞-distributed.

What is this talk about effective algorithms? This means that we must be able to write a computer program that outputs sn, given n, after some finite time. The subsequences 2n, n + 1 etc. are all computable by effective algorithms. By this definition, the sequence 〈πn mod 1〉, i.e., the part of πn after the decimal point, would not be random, because it is either not equidistributed or there is an effective algorithm that computes a monotonically increasing subsequence. This is puzzling at first until we realize that no explicitly given sequence can be random by this definition. That's OK, however, since we should agree that no explicitly given sequence can really be random.
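The mechanics of “specify the indices before looking at the sequence” look like this in Python (a sketch of our own, with a deliberately non-random stand-in sequence so the selections are easy to check):

```python
def subsequence(seq, index_rule, count):
    """Extract U_{s_0}, U_{s_1}, ... where s_n = index_rule(n), an
    effective algorithm chosen before looking at the sequence itself."""
    return [seq[index_rule(n)] for n in range(count)]

U = list(range(1000))                           # stand-in: U_n = n
evens = subsequence(U, lambda n: 2 * n, 10)     # s_n = 2n
shift = subsequence(U, lambda n: n + 1, 10)     # s_n = n + 1
```

Because the index rules are fixed in advance, they cannot be tailored to defeat any particular sequence.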

To make things a bit more confusing (as if they weren't already), we can say that the sequence 〈xn mod 1〉 will be random by this definition for almost all real numbers x. But that's OK too, since almost all real numbers are uncomputable by algorithms. If you're interested in the details, you should look up the articles by J. F. Koksma6 and H. Niederreiter and R. F. Tichy.7

All this sounds very reasonable, but it still has a flaw. The flaw is that we must specify the entire subsequence beforehand, when in fact we should be allowed to look at the values U0, ..., Uk - 1 before we decide whether or not sn should be k. We call such a rule a “subsequence rule” and say that a subsequence rule is computable if there is an effective algorithm that decides whether sn should be k, as above. Unfortunately, computable subsequence rules cannot cope well with arbitrary real numbers. For example, if a real number x is given by its radix-10 expansion, there is no effective algorithm that will decide whether x < 1/3. (If x = 1/3 exactly, the algorithm would have to examine every digit in the expansion.) Therefore, we have this definition, which is based on b-ary sequences, which are sequences of integers written in base b:

Definition R5. A b-ary sequence is said to be “random” if every infinite subsequence defined by a computable subsequence rule is 1-distributed. Furthermore, the sequence (1) is said to be “random” if the b-ary sequence 〈⌊b Un⌋〉 is “random” for all integers b ≥ 2. (The signs surrounding “b Un” are floor signs, meaning (for nonnegative numbers) to cut off everything after the radix point.)
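Here are the two ingredients of Definition R5 in Python (a sketch of our own; the sample values and the rule are made up purely to show the mechanics):

```python
import math

def b_ary_digits(us, b):
    """The b-ary sequence <floor(b * U_n)> from Definition R5."""
    return [math.floor(b * u) for u in us]

def apply_rule(digits, rule):
    """A computable subsequence rule: for each index k, decide from the
    past values X_0, ..., X_{k-1} alone whether X_k joins the subsequence."""
    return [digits[k] for k in range(len(digits)) if rule(digits[:k])]

us = [0.7, 0.2, 0.4, 0.9, 0.1, 0.6]
bits = b_ary_digits(us, 2)                 # binary digits: [1, 0, 0, 1, 0, 1]
# Example rule: pick X_k whenever the previous digit was 0.
picked = apply_rule(bits, lambda past: bool(past) and past[-1] == 0)
```

Note that the rule only ever sees digits that come before the one it is deciding about, so the decision cannot be undone or based on peeking ahead.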

OK, now we're almost there. Just a few more lines and we're done. First note that the “1-distributed” in the above definition is not a typo. Why 1-distributed and not ∞-distributed? Well ... just because, OK? If you're interested in the details, you can look them up in Knuth.

One last objection is that it is not clear that every sequence that satisfies R5 will also satisfy R4. The reason is that a computable subsequence rule specifies the indices of the subsequence as a monotonic sequence: if 〈Xsn〉 is the subsequence, we always have s0 < s1 < ..., but that's not necessarily the case for R4. We can combine Definitions R4 and R5 in order to overcome this objection:

Definition R6. A b-ary sequence 〈Xn〉 is said to be “random” if, for every effective algorithm that specifies an infinite sequence of distinct nonnegative integers 〈sn〉 as a function of n and the values of Xs0, Xs1, ..., Xsn - 1, the subsequence 〈Xsn〉 corresponding to this algorithm is “random” in the sense of Definition R5. Furthermore, the sequence (1) is said to be “random” if the b-ary sequence 〈⌊b Un⌋〉 is “random” for all integers b ≥ 2.

Hooray! We're done. Well, almost. We still have to see whether sequences that satisfy R6 actually exist. The good news is that yes, there are such sequences. For sequences satisfying R5, an even stronger result has been shown, namely an algorithm that constructs such sequences. The technical aspects of this are, however, ... well, too technical! See Knuth if you really want to get down to it.

You may now ask yourself whether Definition R6 is the ultimate in random sequence definitions. Not necessarily. Definition R6 is an attempt to make precise a very murky concept, namely that of randomness. As we have seen with DeFinetti's definition, there is more than one way to define probability, so it's not unreasonable to assume that there is also more than one way to define randomness. Discussions with peer reviewers have certainly confirmed this! It is still a good definition, though. Many objections have been addressed, and it's not easy to criticise the definition on technical grounds.

1 Donald E. Knuth, Seminumerical Algorithms, volume 2 of The Art of Computer Programming, Third Edition, Addison-Wesley, 1998.
2 Bruno DeFinetti, Theory of Probability, Volume 1, John Wiley & Sons, 1974, ISBN 0-471-92611-6.
3 Ibid., p. 5.
4 This is a purely hypothetical machine. It is used only as a device to find out what kinds of properties the sequences generated by such a machine might have. It is not a pseudo-random generator or necessarily a computer program. It is more like a fantastic ERNIE, where we are allowed to examine the entire infinite sequence of numbers all at once.
5 For example, the sequence consisting only of zeroes.
6 Compositio Math. 2 (1935), 250–258.
7 Mathematica 32 (1985), 26–32.
