# Bigger and Bigger Infinities

Created | Updated Jan 28, 2002

It's a commonly held belief that infinity is a pretty weird concept, mixed up with mysticism and staring up into the night sky until the early hours of the morning. This entry is not about that sort of infinity. It is about the idea of a set of things that never ends, that no matter how many of them you count, there's always going to be more you've not looked at yet. This kind of infinity (the mathematical kind) is a lot easier to get a hold on than the vague stargazing kind, and allows us to show some surprising things about infinity, in particular that there are *different* infinities. There are infinities that really are bigger than other infinities, despite the fact that they all 'go on forever'. *The Hitchhiker's Guide to the Galaxy* has this to say about the word 'Infinite':

Bigger than the biggest thing ever and then some. Much bigger than that in fact, really amazingly immense, a totally stunning size, real 'wow, that's big' time. Infinity is just so big that by comparison, bigness itself looks really titchy. Gigantic multiplied by colossal multiplied by staggeringly huge is the sort of concept we're trying to get across here.

Even this doesn't really get across the difference between the finite and the infinite. You have to hijack the word 'bigger' and fly it to places it never thought it would go to talk about the difference between finite and infinite. To talk about how much *more* enormous the bigger infinities are than the 'smaller' infinities requires the same sort of terrorist activities against common sense definitions of size.

#### Warning

This entry is quite technical. In the course of it, the existence of bigger infinities is proved. This is, unavoidably, going to be pretty hard. However, no prior knowledge is assumed, and terms used are defined either in the main text or in footnotes at the point the term is first used. It might be an idea to read the definitions footnotes first and refer back to them if needed. Those uncomfortable with maths are probably better skipping to the last paragraph, and just letting the weirdness wash over them.

This entry assumes you believe in the existence of **N**, the set of all natural numbers, {0,1,2,3,...}^{1}. So how many *elements*^{2} does **N** have? Clearly it's not any finite number of elements, so it must be (by definition) infinite. We^{3} write |X| for the 'size' of any set X. If X is finite, with (say) 42 elements, we would say |X| = 42. For infinite sets we haven't yet given these 'sizes'^{4} names, so we just write |**N**| for the size of **N**. Now the question is, are there any sets X which have strictly 'more' elements than **N**? Ie is there some X such that |X| > |**N**|?^{5}

### More?

Well, first we need to say what we really mean by 'more'. We need to do this in a way such that we don't need to count the elements of the sets (which is somewhat difficult for infinite things). To say that X has less than or an equal size to Y (ie |X| ≤ |Y|), we can try pairing up each element of X with an element of Y. If we run out of elements of X before we run out of elements of Y, then Y obviously has more elements in it.

To formalise this, we need a *function*^{6} f mapping elements of X to elements of Y, which is '*one-to-one*', meaning that a and b (which are different elements of X) will get sent to different elements of Y by f (note that f(a) is different to f(b) because a and b are different). Here we are pairing up each a in X with its f(a) in Y, and if we can do that for all the a in X, then we must run out of elements of X before we run out of elements of Y. We need f to be one-to-one, or else we would be trying to pair more than one element of X with an element of Y. So to summarise, we say '|X| ≤ |Y|' when we have a one-to-one function f (say) from X to Y.

So that's what 'more' means, defined in a way that doesn't require us to count the things. If we have both |X| ≤ |Y| and |Y| ≤ |X| then we can say |X| = |Y|^{7}.

All this is pretty obvious for finite sets (once you've disentangled all that stuff about functions). What is maybe less obvious is that for infinite sets, you can have such a function f (from **N** to **N** say) which is one-to-one, but not *onto*^{8}. For example, if the function f just doubles a number, the function f does the following for the first few terms:

1 goes to 2

2 goes to 4

3 goes to 6

4 goes to 8

5 goes to 10

...

It's pretty easy to see that if a is different from b, then 2*a is different from 2*b, so f(a) is different to f(b), ie f is one-to-one. But, for example, nothing goes to 3, so f is not *onto*. This sort of thing cannot happen with finite sets.

### Anyway, Back to the Plot

Next we define a way to get a new set from one we are given, and show that the new set is really bigger than the old one.

For a set X, the *power set* of X (written 'P(X)') is defined to be the set of all *subsets*^{9} of X. So, if X were the three element set:

{burger,chocolate,doughnut}

then P(X) is the set containing all the subsets of X, ie:

P(X) = { {burger,chocolate,doughnut},

{burger,chocolate},

{burger,doughnut},

{chocolate,doughnut},

{burger},

{chocolate},

{doughnut},

{} }

Note that last element - the empty set, *is* a subset of X. It's the subset containing zero elements. Also note that although X is a member of P(X) (because the entire set X is, strictly speaking, a subset of X), this doesn't automatically tell us that P(X) has more members than X. *burger* is not an element of P(X), although {*burger*} (the set containing only *burger*) is - there is a difference between these two things.

You might want to check that there are no other subsets of X that aren't listed above; drawing some diagrams may help. Note that the order in which we write the elements down doesn't matter.

For finite sets, it's not too hard to show that |P(X)| = 2^{|X|}^{10} and so |X| < |P(X)|. That's a strict less than - in terms of our functions definition, |X| < |Y| means that there is a one-to-one function from X to Y but there is no possible one-to-one function from Y to X, ie Y is just too big to 'fit inside' X.

If you think about what this says for finite sets, you'll see that this makes sense as a definition. Ok here's the hard bit: we can show that |X| < |P(X)| for *all* sets - not just finite ones. Here's the proof, first worked out by Cantor^{11}:

#### Cantor's Proof

First we can pretty easily show that |X| ≤ |P(X)| by taking the function F(x) = {x}, where F is the function (think 'machine' again) that on input of an x in X, outputs the subset of X consisting only of x). This is one-to-one because if F(x) = F(y), then {x}={y} (ie the set containing only x is equal to the set containing only y) so therefore x = y.

We now want to show that there is no possible one-to-one function from P(X) to X. Start off by supposing we *do* have such a function f - it seems quite plausible that it might exist. Using this f we will get a contradiction, which shows that the f is impossible. For more information on the method of proof by contradiction, see Basic Methods of Mathematical Proof. Here's the clever self-referential bit:

Define a subset S of X by:

S = {a in X such that there is some subset A of X with f(A) = a, and a is not in A}

In other words, S consists of all elements of X that something gets sent to by f and that the thing they get sent to doesn't contain the thing you send. To find out what S is, you go through all the elements a of X, find out if any subset A of X gets sent to a by f, and also if a is not in that A then we know a is a member of S. That may not have helped work out what S is much, but this the crux of the proof.

Unfortunately it's a little hard to give examples of what S might look like, because the whole point of this proof is to show that f is impossible, and S is defined in terms of f. We can cheat a bit though: suppose we take as an example: X is the set consisting of *burger*, *chocolate*, and *doughnut*, ie:

X = {burger,chocolate,doughnut}

Suppose f is a function from P(X) to X which does the following:

{burger,chocolate} goes toburger

{burger,doughnut} goes tochocolate

{doughnut} goes todoughnut

f would have to do something with all the other things in P(X) (subsets of X), but we'll forget about that for the sake of this example. In this case, S = {*chocolate*}, because the other two are inside the subsets of X that get sent to them by f.

Now S is a subset of X, so it is a member of P(X) (remember P(X) is the set of all subsets of X, and strictly speaking, X is a subset of X), so we can see where it gets sent to under our function f. Let s = f(S). Now the question is whether or not s is in S (it will turn out that both of these are impossible, so we will have to conclude that assuming that f could exist in the first place was wrong).

The next two paragraphs are where we find out how the definition of S is so clever. So it would be a good idea to refer back to it when going through them.

Suppose s is in S. Then, by the definition of S, there is some subset A of X with f(A) = s, and s is not in A. Well, we can find out what that A is. We know that f is one-to-one, so f(A) = s = f(S), so A = S, and s is not in A, so s is not in S. But we started this paragraph assuming that s *is* in S. Contradiction, therefore we can't have s is in S.

Now assume s is *not* in S. Well there is some subset A of X with f(A) = s (we can take A = S), so for s to not be in S, the last bit 'a is not in A' must be false. So 's is not in A' is also false. So 's is not in S' is false too. So s *is* in S. Another contradiction.

So, the whole thing just doesn't work when we assumed we had f in the first place, we must have been making a mistake. No such f can possibly exist because the consequences are contradictory. So |X| < |P(X)|.

### Bigger and Bigger and Bigger Infinities

So we know that |X| < |P(X)| for any set X at all. If we stick **N** (the natural numbers) in there, we get:

|N| < |P(N)|

**N** is an infinite set, so P(**N**) must be infinite as well, but it's also strictly bigger than **N**. So it must be a *bigger* infinity. It turns out that

|P(N)| = |R|

which is the number of points on the real number line.

Are there any *even bigger* infinities, or have we hit the top yet? Well, we can use Cantor's proof again to show that:

|P(P(N))| > |P(N)|

and again to show

|P(P(P(N)))| > |P(P(N))|

and so on, to get infinitely many bigger and bigger infinities. 'How many is "infinitely many"?' you ask^{12}? Cantor's proof will get you a different one for each natural number, just by putting that many Ps before **N**. But there are even more than that. There are unimaginably many different-sized infinities. For any infinite number, you can find more than that many different infinities. It's even provably impossible to form the set of all infinite numbers (which you might want to do in order to try and find out how big it is), it's just 'too big' to even be a set, and the axioms of set theory break down with collections of things this big. It's *big*.

^{1}The curly brackets '{' and '}' are used to denote a set. So we would write the set consisting of

*burger*,

*chocolate*, and

*doughnut*as {

*burger*,

*chocolate*,

*doughnut*}. Sets can also be written in the form of 'all things in some other set which have some property' - so (for example) the set of all even natural numbers would be written {x in

**N**such that x is even}.

^{2}An element of a set is simply a member of it. So

*chocolate*is an element of the first set from footnote 1. Two sets are equal if they have the same elements.

^{3}'We' as in mathematicians, and now you as well!

^{4}The jargon for 'sizes' is 'cardinalities'.

^{5}The symbol '>' is read to mean 'is greater than', '≥' to mean 'is greater than or equal to', '<' means 'is less than', and '≤' means 'is less than or equal to'.

^{6}A function from a set X to a set Y can be thought of as a machine that takes elements of X in, and gives you back an element of Y. For a function (think machine) f, when you put an element x of X in, the element of Y you get out is called f(x).

^{7}It is possible to prove that with one-to-one functions f from X to Y and g from Y to X you can make a

*one-to-one*and

*onto*function h from X to Y (onto means that every element of Y gets outputted by the function (think 'machine') for some element of X, so there are no elements of Y 'left over' with nothing getting sent to them), which means that h pairs up each element of X with exactly one element of Y, and everything gets paired with something else, so X and Y really must be the same size. This proof however (the Schröder Bernstein theorem), is beyond the scope of this entry.

^{8}As also defined in footnote 7, an onto function from X to Y is one such that every element of Y gets outputted by the function. This means that a function that is

*not*onto is one such that some elements of Y get nothing sent to them at all.

^{9}A subset of a set X is another set which contains only elements from X.

^{10}To pick a subset of X, we can go through the elements in turn, deciding if that element is going to be in our subset or not. So if we have |X| elements, we make |X| choices between two options ('is this element in or out?'). The number of ways we can make these choices is then 2

^{|X|}.

^{11}Georg Ferdinand Ludwig Philipp Cantor 1845 - 1918

^{12}Having just proved that there are quite a few of them about, this is a reasonable question to ask.