Showing posts with label maths. Show all posts

Saturday, 14 June 2008

A surprising connection.

Galois connections are often hidden behind well-behaved areas of mathematics, and they are often produced in a standard way from simple binary relations. Here's one that seems to produce mathematics out of thin air.

Ultrafilters on a set $X$ are a bit like generalised points of that set. Of course, principal ultrafilters correspond to points of the set in an obvious way. If $X$ lives in some model of set theory, and we take an ultrapower of that model by some ultrafilter, then in the ultrapower $X$ gains a generic point, with respect to which the ultrafilter is principal.

Sometimes we might want to associate actual points of the set to these ultrafilters: we are interested in relations $R$ between $X$ and the set $\beta(X)$ of ultrafilters on $X$. Such a relation is a set of ordered pairs $(x, {\cal U})$ with ${\cal U}$ an ultrafilter on $X$, and $x$ a point of $X$. ${\cal U}$ may be thought of as specifying a set of subsets of $X$ to which some imaginary point belongs. Unless ${\cal U}$ is the principal ultrafilter at $x$, there will be some sets containing $x$ but not in ${\cal U}$. Such sets are witnesses of the fact that ${\cal U}$ isn't $x$, and so I'll call them inconsistent with the pairing $(x, {\cal U})$. All other sets are consistent with the pairing.

This consistency relation induces a Galois connection from the set of relations $R$ of the type described above to the power set of the power set of $X$. It is here, on this bleak mountaintop of abstraction, that there is a surprise. The sets of subsets of $X$ which are closed with respect to this connection are precisely the topologies on $X$.
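For concreteness, here is one way to write out the two maps of the connection. The names $\Phi$ and $\Psi$ are labels I am assuming for illustration; the definitions themselves are just the consistency condition read in each direction.

```latex
% \Phi sends a relation to the family of sets consistent with every pair in it;
% \Psi sends a family of sets to the relation of all pairs consistent with every set in it.
\Phi(R) = \{\, O \subseteq X : x \in O \Rightarrow O \in {\cal U}
            \text{ for all } (x, {\cal U}) \in R \,\}
\qquad
\Psi({\cal S}) = \{\, (x, {\cal U}) \in X \times \beta(X) : x \in O \Rightarrow O \in {\cal U}
            \text{ for all } O \in {\cal S} \,\}
```

In this notation a family ${\cal S}$ of subsets of $X$ is closed when $\Phi(\Psi({\cal S})) = {\cal S}$, which happens exactly when ${\cal S} = \Phi(R)$ for some relation $R$.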

Proof: Let ${\cal T}$ be a set of subsets of $X$ closed with respect to the connection. Then there is a relation $R$ which is taken to ${\cal T}$ by the connection. That is, ${\cal T}$ is the set of subsets of $X$ consistent with $R$; ${\cal T}$ is the set of sets $O$ such that, for all $x$ and ${\cal U}$ with $xR{\cal U}$, if $x \in O$ then $O \in {\cal U}$. In particular, as the empty set $\emptyset$ contains no points, it is in ${\cal T}$. As $X$ is in every ultrafilter, $X \in {\cal T}$. If $A$ and $B$ are in ${\cal T}$, $xR{\cal U}$, and $x \in A \cap B$, then $x$ is in both, so both are in ${\cal U}$. But then $A \cap B \in {\cal U}$, as ${\cal U}$ is an ultrafilter. That is, $A \cap B \in {\cal T}$. If each set in a family $(A_i)_{i \in I}$ is in ${\cal T}$, $xR{\cal U}$, and $x \in \bigcup_{i \in I}A_i$, then $x$ is in one of them, so one of them (and hence their union) is in ${\cal U}$. That is, ${\cal T}$ is closed under arbitrary unions. Putting it all together, ${\cal T}$ is a topology on $X$.

Suppose that ${\cal T}$ is a topology on $X$, let $R$ be the relation that ${\cal T}$ is taken to by the Galois connection, and suppose that the connection takes $R$ to ${\cal T}'$. It is enough to show that ${\cal T}' = {\cal T}$. Evidently ${\cal T} \subseteq {\cal T}'$, so it's enough to show that, for any set $C \not\in {\cal T}$, we have $C \not\in {\cal T}'$. Let $C$ be such a set, and let $O$ be the interior of $C$. As $C$ isn't open, there is some $x \in C \setminus O$. Let ${\cal F}$ be the set of all open neighbourhoods of $x$, together with the complement of $C$. Any finite intersection of sets in ${\cal F}$ is nonempty, since a finite intersection of open neighbourhoods of $x$ is again an open neighbourhood of $x$, and no such neighbourhood is contained in $C$. So ${\cal F}$ can be extended to an ultrafilter ${\cal U}$. Any open neighbourhood of $x$ is in ${\cal U}$, so $xR{\cal U}$. But $x \in C$ and $C \not\in {\cal U}$ (its complement is in ${\cal U}$), so $C$ is inconsistent with the pairing $(x, {\cal U})$ and isn't in ${\cal T}'$, as required.
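The theorem can be checked by brute force on a small finite set. The sketch below assumes $X = \{0, 1, 2\}$; on a finite set every ultrafilter is principal, so a pair `(x, y)` stands for "$x$ is related to the principal ultrafilter at $y$", and a set $O$ is consistent with that pair when $x \in O$ implies $y \in O$. All the names (`consistent_sets`, `is_topology`, and so on) are hypothetical helpers for this illustration.

```python
from itertools import combinations, product

X = (0, 1, 2)
SUBSETS = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]
# (x, y) stands for the pairing of the point x with the principal ultrafilter at y.
PAIRS = [(x, y) for x in X for y in X]

def consistent_sets(relation):
    """Family of subsets O with: x in O implies y in O, for every (x, y) in the relation."""
    return frozenset(
        O for O in SUBSETS
        if all(y in O for (x, y) in relation if x in O)
    )

def is_topology(family):
    """Topology axioms on a finite set: pairwise unions suffice for arbitrary ones."""
    return (
        frozenset() in family
        and frozenset(X) in family
        and all(A & B in family and A | B in family for A in family for B in family)
    )

# Image of the connection: one family for each of the 2^9 relations.
closed_families = {
    consistent_sets([p for p, keep in zip(PAIRS, bits) if keep])
    for bits in product((False, True), repeat=len(PAIRS))
}

# Independent enumeration: test all 2^8 families of subsets directly.
topologies = {
    frozenset(F)
    for r in range(len(SUBSETS) + 1)
    for F in combinations(SUBSETS, r)
    if is_topology(frozenset(F))
}

print(closed_families == topologies, len(topologies))  # True 29
```

The two enumerations agree, and both find the 29 topologies on a three-point set. This is only a sanity check of the finite case, where the ultrafilter machinery collapses to a preorder on points; it says nothing about non-principal ultrafilters.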

This result is remarkable enough, but there's more. It turns out that compactness and Hausdorffness correspond closely with similar properties of relations. Say a relation $R$ is surjective if, for every ${\cal U}$, there is at least one $x$ with $xR{\cal U}$. Say $R$ is injective if, for any ${\cal U}$, there is at most one $x$ with $xR{\cal U}$. If $R$ is a function, these definitions are exactly the usual definitions of injectivity and surjectivity.

Claim 1: Let $R$ be a relation, as above, and let ${\cal T}$ be the topology that $R$ is taken to by the Galois connection. Then ${\cal T}$ is compact if $R$ is surjective.
Proof: By contradiction. Pick any open cover of $X$ with no finite subcover, and let ${\cal F}$ be the set of complements of the sets in the cover. Then any finite intersection of sets in ${\cal F}$ is nonempty, so ${\cal F}$ may be extended to an ultrafilter ${\cal U}$ on $X$. By surjectivity, there is some point $x$ with $xR{\cal U}$. This $x$ must lie in some set $O$ of the original cover. But $O$ can't be in ${\cal U}$ (its complement is), contradicting the definition of ${\cal T}$.

Claim 2: Let ${\cal T}$ be any topology on $X$, and let $R$ be the relation that ${\cal T}$ is taken to by the Galois connection. Then ${\cal T}$ is compact iff $R$ is surjective.
Proof: The 'if' follows from Claim 1 and the fact that ${\cal T}$ is closed with respect to the Galois connection. To prove the 'only if', suppose that ${\cal T}$ is compact, and let ${\cal U}$ be any ultrafilter on $X$. Suppose for a contradiction that every $x \in X$ has an open neighbourhood not in ${\cal U}$. These neighbourhoods form an open cover, which therefore has a finite subcover. The complements of the sets in this subcover are in ${\cal U}$, and their intersection is empty, contradicting the fact that ${\cal U}$ is an ultrafilter. So there is an $x$ such that every open neighbourhood of $x$ is in ${\cal U}$, so that $xR{\cal U}$. As ${\cal U}$ was arbitrary, $R$ is surjective.

Claim 3: Let $R$ be a relation, as above, and let ${\cal T}$ be the topology that $R$ is taken to by the Galois connection. Then $R$ is injective if ${\cal T}$ is Hausdorff.
Proof: By contradiction. Let ${\cal U}$ be an ultrafilter, and let $x \neq y \in X$ be such that $xR{\cal U}$ and $yR{\cal U}$. Then we can find disjoint open sets $O$ and $P$ with $x \in O$ and $y \in P$. Then $xR{\cal U}$ implies that $O \in {\cal U}$, and $yR{\cal U}$ implies that $P \in {\cal U}$. But $O \cap P = \emptyset \not\in {\cal U}$, contradicting the fact that ${\cal U}$ is an ultrafilter.

Claim 4: Let ${\cal T}$ be any topology on $X$, and let $R$ be the relation that ${\cal T}$ is taken to by the Galois connection. Then $R$ is injective iff ${\cal T}$ is Hausdorff.
Proof: The 'if' part follows from Claim 3 and the fact that ${\cal T}$ is closed with respect to the Galois connection. To prove the 'only if', suppose ${\cal T}$ isn't Hausdorff, and let $x$ and $y$ in $X$ be distinct but not separated by any pair of open sets. Let ${\cal F}$ be the set of open sets containing either $x$ or $y$. Any finite intersection of sets in ${\cal F}$ is an intersection of an open set containing $x$ with one containing $y$, so is nonempty. Hence ${\cal F}$ can be extended to some ultrafilter ${\cal U}$. Then $xR{\cal U}$ and $yR{\cal U}$, so $R$ isn't injective.

The converses to Claims 1 and 3 are false. This remarkable pattern is a shadow of a pair of adjoint functors, which I hope to say a little more about soon.

Tuesday, 3 June 2008

Another splinter of mathematics.

I'm jealous of the part IB maths students, who are taking their exams at the moment. Part of the reason for this is that one of them mentioned one of the questions from today's exam to me, and it's such a neat bit of maths that I have to share it.

The question is about a village idiot, who has to paint a fence. The fence is made up of a large number of slats in a circular arrangement. The idiot begins with a particular slat; let us call it his favourite. After painting any slat, he paints one of the two adjacent slats. He decides which of these two slats to paint next at random, for example by tossing a coin. He does not care whether he has painted a slat before: He follows the decision of the coin and, if necessary, adds multiple coats to some slats. Eventually, of course, he will have painted all of the slats; some particular slat will have the distinction of being painted last. Before he starts painting, can we tell which slat this is likely to be? That is, can we work out, for any given slat, the probability that it will be the last one to be painted?

We can, and the answer is striking and counterintuitive. Obviously, the chance that the idiot's favourite slat is the last to be painted is $0$. Let's fix our attention on some other slat, which I'll call 'the slat in question'. At some point, as the fool meanders about, he will paint one of the slats adjacent to the slat in question. The first time this happens, he will be painting the slat on one side without having begun work on the slat on the other side. So, at this point, the chance that the slat in question is the last to be painted is the chance that the idiot will, in his meanderings around the circle, reach the slat on the other side before he reaches the slat in question. In order to do so, he must paint all of the other slats.

Let us call this chance $p$. So when, as he must, the fool eventually paints one of the adjacent slats, we can say with certainty that the chance we are interested in is $p$. But this number $p$ doesn't depend at all upon how the idiot moves about until the point when he paints a neighbouring slat. So we may as well say, from the start, that the probability of the fool painting the slat in question last is $p$.

Now it is worth noting that nothing in the above argument, not even the value of $p$, depends on which slat we were considering, except that it cannot be the idiot's favourite slat. So the probability of any of the other slats being painted last is the same: $p$. Say there are $n$ slats in total. Then there are $n-1$ slats that we might have considered, each of which the fool paints last with probability $p$. Only one of the slats is painted last, so the probability that at least one is painted last is $(n-1)p$. But this certainly happens, so $(n-1)p = 1$. That is, $p = \frac{1}{n-1}$.
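If the uniform answer seems hard to believe, a quick simulation is a reasonable sanity check. This is a minimal sketch, assuming a fair coin and slats numbered $0$ to $n-1$ with slat $0$ the favourite; the function names and the fixed seed are choices made for the illustration.

```python
import random
from collections import Counter

def last_painted(n, rng):
    """Run one walk around a fence of n slats and return the slat painted last."""
    position = 0            # the favourite slat is painted first
    painted = {0}
    while len(painted) < n:
        # Fair coin: step to one of the two adjacent slats.
        position = (position + rng.choice((-1, 1))) % n
        painted.add(position)
    # The loop ends at the moment the final unpainted slat receives its first coat.
    return position

def last_slat_frequencies(n, trials, seed=0):
    """Empirical frequency with which each slat is the last to be painted."""
    rng = random.Random(seed)
    counts = Counter(last_painted(n, rng) for _ in range(trials))
    return [counts[k] / trials for k in range(n)]

frequencies = last_slat_frequencies(6, 20000)
print(frequencies)
```

With $n = 6$ the favourite slat has frequency exactly $0$, and the other five frequencies should each sit close to $\frac{1}{5}$, including the slats right next to the favourite.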

This rather neat argument illustrates the power of the ideas in the Markov chains course. You may have thought I was doing something a little fishy in the fourth paragraph above. The main point of the course is that what I did was not fishy at all, and can be made completely rigorous. Finally, here's a question for those who know their stuff: What happens if the fool always moves clockwise with some probability $q$, and anticlockwise with probability $1-q$? I'm afraid you'll need to do a little calculation to get at the answer.

Saturday, 31 May 2008

Getting the bends.

This is a rather pretty piece of mathematics that I've always wanted to write up. It's a proof of the generalised Descartes circle theorem, one of very few mathematical theorems to have been immortalised in verse:

For pairs of lips to kiss maybe
Involves no trigonometry.
'Tis not so when four circles kiss
Each one the other three.
To bring this off the four must be
As three in one or one in three.
If one in three, beyond a doubt
Each gets three kisses from without.
If three in one, then is that one
Thrice kissed internally.

Four circles to the kissing come.
The smaller are the benter.
The bend is just the inverse of
The distance from the center.
Though their intrigue left Euclid dumb
There's now no need for rule of thumb.
Since zero bend's a dead straight line
And concave bends have minus sign,
The sum of the squares of all four bends
Is half the square of their sum.

To spy out spherical affairs
An oscular surveyor
Might find the task laborious,
The sphere is much the gayer,
And now besides the pair of pairs
A fifth sphere in the kissing shares.
Yet, signs and zero as before,
For each to kiss the other four
The square of the sum of all five bends
Is thrice the sum of their squares.

And let us not confine our cares
To simple circles, planes and spheres,
But rise to hyper flats and bends
Where kissing multiple appears,
In n-ic space the kissing pairs
Are hyperspheres, and Truth declares -
As n + 2 such osculate
Each with an n + 1 fold mate
The square of the sum of all the bends
Is n times the sum of their squares.

Why should this rather pretty fact be true? Well, here's one way to think about it. If the $m$th sphere has centre at $O_m$ and radius $r_m$ then the distance from $O_m$ to $O_l$ is $r_m + r_l$. To put it another way, the points $O_m$ lie at the vertices of an $(n+1)$-simplex, whose side lengths are determined by the various radii. This $(n+1)$-simplex has to fit somehow into $n$-dimensional space, so it must have hypervolume $0$. The first step will be to study how the hypervolume of a simplex depends on the lengths of its sides.

It won't be necessary to actually derive a formula. Surprisingly enough, all we'll need is a qualitative description of the kind of formula involved. To make the derivation easier, we can suppose that one of the vertices is at the origin. Then the hypervolume is proportional to the determinant of the $(n+1)$ by $(n+1)$ matrix $A$ whose columns are the vectors corresponding to the remaining vertices. What do we know about these vectors? Well, we certainly know their lengths. Also, for any pair $v$ and $w$, we know the length of $v-w$. But $|v-w|^2 = |v|^2 + |w|^2 - 2 v \cdot w$, so we know the dot product of any pair.

These dot products are the entries of the matrix you get when you multiply $A$ by its transpose. So we can work out the determinant of that matrix, which is the square of the determinant of $A$. What kind of expression does this give for the hypervolume? Well, it's $\sqrt{P}$, where $P$ is a polynomial in the lengths of the sides. As the determinant is calculated as a homogeneous polynomial of degree $n + 1$, and the dot products are expressions of degree $2$ in the lengths of the sides, we get that $P$ is homogeneous of degree $2n + 2$.
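The recipe above is easy to try out on small cases. The following sketch builds the matrix of dot products from a table of side lengths and takes its determinant exactly, using rational arithmetic. The function names are hypothetical, and the radii $\frac{1}{2}, \frac{1}{2}, \frac{1}{3}, \frac{1}{15}$ are an assumed example of four mutually tangent circles in the plane (their bends $2, 2, 3, 15$ satisfy the relation in the poem).

```python
from fractions import Fraction

def det(M):
    """Determinant by expansion along the first row (fine for tiny matrices)."""
    if not M:
        return Fraction(1)
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )

def gram_determinant(lengths):
    """det of the dot-product matrix of a simplex, with vertex 0 moved to the origin.

    lengths[i][j] is the distance between vertices i and j.
    Uses v.w = (|v|^2 + |w|^2 - |v - w|^2) / 2 for each pair of edge vectors.
    """
    d2 = [[Fraction(x) ** 2 for x in row] for row in lengths]
    k = len(lengths)
    gram = [
        [(d2[0][i] + d2[0][j] - d2[i][j]) / 2 for j in range(1, k)]
        for i in range(1, k)
    ]
    return det(gram)

def tangent_lengths(radii):
    """Distances between centres of mutually (externally) tangent spheres."""
    return [
        [0 if i == j else r + s for j, s in enumerate(radii)]
        for i, r in enumerate(radii)
    ]

# A 3-4-5 triangle: the determinant is (2 * area)^2 = 12^2.
print(gram_determinant([[0, 3, 4], [3, 0, 5], [4, 5, 0]]))  # 144

# Four mutually tangent circles in the plane: the tetrahedron of centres is flat.
radii = [Fraction(1, 2), Fraction(1, 2), Fraction(1, 3), Fraction(1, 15)]
print(gram_determinant(tangent_lengths(radii)))  # 0
```

For a simplex with $k$ edge vectors at the origin the hypervolume is $|\det A| / k!$, so the determinant printed here is $(k!)^2$ times the squared hypervolume; that constant is the "proportional to" in the text.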

Now, in the case we're considering, the sides are of the form $r_l + r_m$. So the hypervolume is the square root of some homogeneous polynomial $Q$ of degree $2n + 2$ in the radii of the spheres. What can we say about $Q$? Well, it must be symmetric, since it defines a symmetric quantity. We can say more. Let's fix all of the radii except one, $r_m$, and let $r_m$ grow very large. The simplex will now no longer fit into $n$-space: It will have positive hypervolume $V$ which will also grow very large. How does the growth of $V$ depend on $r_m$? Pick another vertex $O_l$. We can find some fixed $\rho$ such that all of the vertices except $O_m$ are always in the sphere $S$ centred at $O_l$ with radius $\rho$. Taking another sphere $T$, of the same radius, about $O_m$, it is clear that the whole simplex is contained in the convex hull of these two spheres. This hull consists of a hypercylinder of height $r_l + r_m$ and radius $\rho$, with two hemihyperspherical caps. Its volume is a linear function of $r_m$.

That is, the hypervolume of the simplex grows no faster than some linear function of $r_m$. So $Q$, the square of that hypervolume, grows no faster than some quadratic function of $r_m$. That is, $Q$ contains no terms of degree greater than two in any of the variables $r_m$. Let $b_m$ be the bend of the $m$th hypersphere, that is, the reciprocal of $r_m$. Then $Q$ times the product of the squares of the bends of all the hyperspheres must be a polynomial, $R$, in those bends. What's more, $R$ is symmetric and homogeneous of degree $2(n+2)-(2n+2) = 2$.

We are now very close; there just aren't many polynomials that $R$ could be. $R$ has to be a linear combination of the sum of the squares of the bends and the square of their sum. What's more, the above argument shows that there is a collection of mutually tangent hyperspheres with a given collection of bends if and only if $R$ of that collection is $0$. We can tell what linear combination $R$ has to be by considering the degenerate situation with $b_m = 1$ for $m$ at most $n$, and $b_{n+1} = b_{n+2} = 0$. Here we have a collection of spheres of equal radius wedged between two parallel hyperplanes. The sum of the squares of the bends is $n$, and the square of their sum is $n^2$, $n$ times as big. So $R$ must be proportional to the difference of the square of the sum of the bends and $n$ times the sum of their squares, completing the proof.

Before looking at some consequences, there is another result which drops out of the proof. In the case $n=1$, $R$ is proportional to $(b_1 + b_2 + b_3)^2 - b_1^2 - b_2^2 - b_3^2 = 2(b_1 b_2 + b_2 b_3 + b_3 b_1)$, so $Q$ is proportional to $r_1 r_2 r_3 (r_1 + r_2 + r_3)$. Now, given any triangle with sides $a$, $b$ and $c$, and semiperimeter $s$, circles centred at the vertices with radii $s-a$, $s-b$ and $s-c$ will be mutually tangent. So by the above work, the square of the area of the triangle should be proportional to $(s-a)(s-b)(s-c)(s-a + s-b + s-c) = s(s-a)(s-b)(s-c)$. To find out what the scale factor is, consider the right-angled triangle with sides $3$, $4$ and $5$ units. Then $s = 6$, $s-a = 3$, $s-b = 2$, $s-c = 1$, $s(s-a)(s-b)(s-c) = 36$, and the square of the area is $6^2$, which is also $36$. It follows that for any triangle the area is $\sqrt{s(s-a)(s-b)(s-c)}$. This is Heron's formula.

Suppose we have a plane with three mutually tangent spheres, each of radius $1$, on it. Sitting nestled between the spheres and the plane is a sphere of radius $r$. We can find $r$ using the method outlined above. Let the bend of the smaller sphere be $b$. Then $(b + 3)^2 = 3(b^2 + 3)$, so $2b^2 - 6b = 0$, so $b = 3$, and the inner sphere has radius $\frac{1}{3}$. Although we had to solve a quadratic equation, we ended up with a rational value. But that isn't surprising in this case: We already knew one of the roots would be $0$, which is rational, as there is clearly another plane tangent to all three spheres and parallel (that is, tangent at infinity) to the other plane. The rationality of the other root follows from the fact that the sum of the roots of a quadratic equation in $x$ is minus the coefficient of $x$ divided by the leading coefficient, which is rational whenever the coefficients are.
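The same calculation works for any starting configuration: given $n + 1$ of the bends, the relation is a quadratic in the last one. Writing $S$ for the sum of the known bends and $T$ for the sum of their squares, $(x + S)^2 = n(x^2 + T)$ rearranges to $(n-1)x^2 - 2Sx + (nT - S^2) = 0$. Here is a small sketch of that, assuming $n \geq 2$ and a configuration that really exists (so the discriminant is non-negative); `remaining_bends` is a hypothetical name.

```python
import math

def remaining_bends(n, bends):
    """Both solutions x of (x + S)^2 = n * (x^2 + T) for the given n + 1 bends.

    S is the sum of the known bends and T the sum of their squares.
    Assumes n >= 2 and a non-negative discriminant.
    """
    S = sum(bends)
    T = sum(b * b for b in bends)
    # Quarter of the discriminant of (n-1)x^2 - 2Sx + (nT - S^2) = 0.
    root = math.sqrt(n * (S * S - (n - 1) * T))
    return ((S - root) / (n - 1), (S + root) / (n - 1))

# A plane (bend 0) and three unit spheres in 3-space.
print(remaining_bends(3, [0, 1, 1, 1]))  # (0.0, 3.0)

# Three mutually tangent circles with bends 2, 2, 3 in the plane.
print(remaining_bends(2, [2, 2, 3]))     # (-1.0, 15.0)
```

The first call recovers the second plane and the nestled sphere of radius $\frac{1}{3}$. In the second, the negative root is the enclosing circle, with the sign convention from the poem that concave bends count as negative.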

Applying this in the rather well behaved $3$-dimensional setting gives another pleasant result in the same spirit as the Descartes theorem. Fix three mutually tangent spheres of bends $u$, $v$ and $w$. Take some sphere $S_0$ tangent to all three. What happens when we construct a sequence of spheres, taking each $S_{k+1}$ tangent to $S_k$ and to the original three spheres, but with $S_{k+1}$ different from $S_{k-1}$? Well, the bends $b_{k-1}$ and $b_{k+1}$ of $S_{k-1}$ and $S_{k+1}$ are the roots of the quadratic $(x + b_k + u + v + w)^2 = 3(x^2 + b_k^2 + u^2 + v^2 + w^2)$. Dividing through to make the coefficient of $x^2$ equal to $1$, they must sum to minus the coefficient of $x$, namely $u + v + w + b_k$. This gives $b_{k+1} = u + v + w + b_k - b_{k-1}$. Letting $K = u + v + w$, this gives the successive equations:
$$
\begin{array}{lcccl}
b_2 &=& K + b_1 - b_0 && \\
b_3 &=& K + b_2 - b_1 &=& 2K - b_0 \\
b_4 &=& K + b_3 - b_2 &=& 2K - b_1 \\
b_5 &=& K + b_4 - b_3 &=& K + b_0 - b_1 \\
b_6 &=& K + b_5 - b_4 &=& b_0 \\
b_7 &=& K + b_6 - b_5 &=& b_1
\end{array}
$$
and so on, with period $6$. So we get a chain of $6$ spheres threading around the original three. What's more, the spheres which are opposite in this chain have bends summing to $2K$, twice the sum of the bends of the original three. This result has also been cast in verse: 'The mean of the bends of each opposite pair is the sum of the three through the thoroughfare'. This couplet is taken from a longer poem in a somewhat obscure book that I don't have to hand at the moment. There's a neat illustration, together with a hint at another proof that you always get a chain of length $6$, here.
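The recurrence is simple enough to run by hand, but here is a short sketch that iterates it. As an assumed example I start from three unit spheres ($u = v = w = 1$), a plane they all sit on ($b_0 = 0$), and the nestled sphere of bend $3$ found earlier ($b_1 = 3$); `chain_bends` is a hypothetical name.

```python
def chain_bends(u, v, w, b0, b1, length):
    """Iterate b_{k+1} = K + b_k - b_{k-1} with K = u + v + w."""
    K = u + v + w
    bends = [b0, b1]
    while len(bends) < length:
        bends.append(K + bends[-1] - bends[-2])
    return bends

chain = chain_bends(1, 1, 1, 0, 3, 8)
print(chain)  # [0, 3, 6, 6, 3, 0, 0, 3]

# Opposite members of the chain have bends summing to 2K = 6.
print([chain[k] + chain[k + 3] for k in range(3)])  # [6, 6, 6]
```

In this example the chain consists of the two parallel planes tangent to the three unit spheres, two spheres of radius $\frac{1}{3}$ and two of radius $\frac{1}{6}$, after which the bends repeat.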