Saturday, 31 May 2008

Getting the bends.

This is a rather pretty piece of mathematics that I've always wanted to write up. It's a proof of the generalised Descartes circle theorem, one of very few mathematical theorems to have been immortalised in verse:

For pairs of lips to kiss maybe
Involves no trigonometry.
'Tis not so when four circles kiss
Each one the other three.
To bring this off the four must be
As three in one or one in three.
If one in three, beyond a doubt
Each gets three kisses from without.
If three in one, then is that one
Thrice kissed internally.

Four circles to the kissing come.
The smaller are the benter.
The bend is just the inverse of
The distance from the center.
Though their intrigue left Euclid dumb
There's now no need for rule of thumb.
Since zero bend's a dead straight line
And concave bends have minus sign,
The sum of the squares of all four bends
Is half the square of their sum.

To spy out spherical affairs
An oscular surveyor
Might find the task laborious,
The sphere is much the gayer,
And now besides the pair of pairs
A fifth sphere in the kissing shares.
Yet, signs and zero as before,
For each to kiss the other four
The square of the sum of all five bends
Is thrice the sum of their squares.

And let us not confine our cares
To simple circles, planes and spheres,
But rise to hyper flats and bends
Where kissing multiple appears,
In n-ic space the kissing pairs
Are hyperspheres, and Truth declares -
As n + 2 such osculate
Each with an n + 1 fold mate
The square of the sum of all the bends
Is n times the sum of their squares.

Why should this rather pretty fact be true? Well, here's one way to think about it. If the $m$th sphere has centre at $O_m$ and radius $r_m$ then the distance from $O_m$ to $O_l$ is $r_m + r_l$. To put it another way, the points $O_m$ lie at the vertices of an $(n+1)$-simplex, whose side lengths are determined by the various radii. This $(n+1)$-simplex has to fit somehow into $n$-dimensional space, so it must have hypervolume $0$. The first step will be to study how the hypervolume of a simplex depends on the lengths of its sides.

It won't be necessary to actually derive a formula. Surprisingly enough, all we'll need is a qualitative description of the kind of formula involved. To make the derivation easier, we can suppose that one of the vertices is at the origin. Then the hypervolume is proportional to the determinant of the $(n+1)$ by $(n+1)$ matrix $A$ whose columns are the vectors corresponding to the remaining vertices. What do we know about these vectors? Well, we certainly know their lengths. Also, for any pair $v$ and $w$, we know the length of $v-w$. But $|v-w|^2 = |v|^2 + |w|^2 - 2 v \cdot w$, so we know the dot product of any pair.

These dot products are the entries of the matrix you get when you multiply the transpose of $A$ by $A$. So we can work out the determinant of that matrix, which is the square of the determinant of $A$. What kind of expression does this give for the hypervolume? Well, it's proportional to $\sqrt{P}$, where $P$ is a polynomial in the lengths of the sides. As the determinant is calculated as a homogeneous polynomial of degree $n + 1$ in the entries, and the dot products are expressions of degree $2$ in the lengths of the sides, we get that $P$ is homogeneous of degree $2n + 2$.
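The recipe above can be sketched in a few lines of Python. This is only an illustration, not part of the proof: the function names `det` and `squared_volume` are made up for this sketch, and the constant of proportionality is taken to be the standard $1/k!$ for a $k$-simplex. Side lengths should be given as integers or `Fraction`s so that the arithmetic stays exact.

```python
from fractions import Fraction
from math import factorial


def det(matrix):
    """Determinant by Gaussian elimination, exact over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size = len(m)
    result = Fraction(1)
    for col in range(size):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return result


def squared_volume(dist):
    """Squared hypervolume of a simplex from its side lengths.

    dist[i][j] is the distance between vertices i and j; vertex 0 plays
    the role of the origin. The dot products come from
    v.w = (|v|^2 + |w|^2 - |v - w|^2) / 2.
    """
    k = len(dist) - 1
    gram = [
        [
            (Fraction(dist[0][i]) ** 2 + Fraction(dist[0][j]) ** 2
             - Fraction(dist[i][j]) ** 2) / 2
            for j in range(1, k + 1)
        ]
        for i in range(1, k + 1)
    ]
    # det(gram) = det(A)^2, and the volume of a k-simplex is |det(A)| / k!.
    return det(gram) / factorial(k) ** 2


# The 3-4-5 right-angled triangle has area 6, so squared area 36.
print(squared_volume([[0, 3, 4], [3, 0, 5], [4, 5, 0]]))  # 36
```

Note that the square root never has to be taken: everything the argument needs is already visible in the polynomial under it.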

Now, in the case we're considering, the sides are of the form $r_l + r_m$. So the hypervolume is the square root of some homogeneous polynomial $Q$ of degree $2n + 2$ in the radii of the spheres. What can we say about $Q$? Well, it must be symmetric, since it defines a symmetric quantity. We can say more. Let's fix all of the radii except one, $r_m$, and let $r_m$ grow very large. The simplex will now no longer fit into $n$-space: it will have positive hypervolume $V$ which will also grow very large. How does the growth of $V$ depend on $r_m$? Pick another vertex $O_l$. We can find some fixed $R$ such that all of the vertices except $O_m$ are always in the sphere $S$ centred at $O_l$ with radius $R$. Taking another sphere $T$, of the same radius, about $O_m$, it is clear that the whole simplex is contained in the convex hull of these two spheres. This hull consists of a hypercylinder of height $r_l + r_m$ and radius $R$, with two hemihyperspherical caps. Its volume is a linear function of $r_m$.
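As a sanity check on this picture, here is a small Python sketch that builds the matrix of dot products for a simplex whose sides are $r_l + r_m$ and takes its determinant, which is $Q$ up to a constant factor. The names `det` and `tangent_q` are invented for this sketch, and the radii $\frac{1}{2}, \frac{1}{2}, \frac{1}{3}, \frac{1}{15}$ are just one convenient example of four mutually tangent circles in the plane.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination, exact over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size = len(m)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return result


def tangent_q(radii):
    """Determinant of the dot-product matrix when side (l, m) has length r_l + r_m.

    This equals Q up to a constant factor. Vertex 0 is taken as the origin.
    """
    r = [Fraction(x) for x in radii]
    k = len(r) - 1

    def side_sq(i, j):
        return Fraction(0) if i == j else (r[i] + r[j]) ** 2

    gram = [
        [(side_sq(0, i) + side_sq(0, j) - side_sq(i, j)) / 2
         for j in range(1, k + 1)]
        for i in range(1, k + 1)
    ]
    return det(gram)


# Four mutually tangent circles in the plane: the 3-simplex is flat.
flat = [Fraction(1, 2), Fraction(1, 2), Fraction(1, 3), Fraction(1, 15)]
print(tangent_q(flat))  # 0

# Radii 1, 2, 3 give the 3-4-5 triangle, with determinant 4 * area^2 = 144.
print(tangent_q([1, 2, 3]))  # 144
```

Because $Q$ has degree at most two in each radius, its third finite difference in any single radius should vanish, which is an easy thing to test with this function.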

That is, the hypervolume of the simplex grows no faster than some linear function of $r_m$. So $Q$, the square of that hypervolume, grows no faster than some quadratic function of $r_m$. That is, $Q$ contains no terms of degree greater than two in any of the variables $r_m$. Let $b_m$ be the bend of the $m$th hypersphere, that is, the reciprocal of $r_m$. Then $Q$ times the product of the squares of the bends of all the hyperspheres must be a polynomial, $R$, in those bends. What's more, $R$ is symmetric and homogeneous of degree $2(n+2)-(2n+2) = 2$.

We are now very close; there just aren't many polynomials that $R$ could be. $R$ has to be a linear combination of the sum of the squares of the bends and the square of their sum. What's more, the above argument shows that there is a collection of mutually tangent hyperspheres with a given collection of bends if and only if $R$ of that collection is $0$. We can tell what linear combination $R$ has to be by considering the degenerate situation with $b_m = 1$ for $m$ at most $n$, and $b_{n+1} = b_{n+2} = 0$. Here we have a collection of spheres of equal radius wedged between two parallel hyperplanes. The sum of the squares of the bends is $n$, and the square of their sum is $n^2$, $n$ times as big. So $R$ must be proportional to the difference of the square of the sum of the bends and $n$ times the sum of their squares, completing the proof.
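The finished relation is easy to try out numerically. The following Python sketch (the name `descartes_residual` is made up here) evaluates the square of the sum of the bends minus $n$ times the sum of their squares, taking $n$ to be two less than the number of bends supplied.

```python
def descartes_residual(bends):
    """(sum of bends)^2 - n * (sum of squared bends) for n + 2 bends.

    The value is zero exactly when the bends satisfy the relation
    proved above.
    """
    n = len(bends) - 2
    total = sum(bends)
    return total * total - n * sum(b * b for b in bends)


# Circles in the plane (n = 2); the bend -1 is an enclosing circle.
print(descartes_residual([-1, 2, 2, 3]))   # 0
print(descartes_residual([2, 2, 3, 15]))   # 0

# The degenerate case from the proof: n unit spheres between two hyperplanes.
print([descartes_residual([1] * n + [0, 0]) for n in range(1, 6)])  # [0, 0, 0, 0, 0]
```

Using integers or `Fraction`s for the bends keeps the comparison with zero exact.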

Before looking at some consequences, there is another result which drops out of the proof. In the case $n=1$, $R$ is proportional to $(b_1 + b_2 + b_3)^2 - b_1^2 - b_2^2 - b_3^2 = 2(b_1 b_2 + b_2 b_3 + b_3 b_1)$, so $Q$ is proportional to $r_1 r_2 r_3 (r_1 + r_2 + r_3)$. Now, given any triangle with sides $a$, $b$ and $c$, and semiperimeter $s$, circles centred at the vertices with radii $s-a$, $s-b$ and $s-c$ will be mutually tangent. So by the above work, the square of the area of the triangle should be proportional to $(s-a)(s-b)(s-c)\bigl((s-a) + (s-b) + (s-c)\bigr) = s(s-a)(s-b)(s-c)$. To find out what the scale factor is, consider the right-angled triangle with sides $3$, $4$ and $5$ units. Then $s = 6$, $s-a = 3$, $s-b = 2$, $s-c = 1$, $s(s-a)(s-b)(s-c) = 36$, and the square of the area is $6^2$, which is also $36$. It follows that for any triangle the area is $\sqrt{s(s-a)(s-b)(s-c)}$. This is Heron's formula.
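Here is the same calculation as a short Python sketch, going via the radii of the tangent circles rather than quoting Heron's formula directly. The function names are invented for the illustration, and the sides are assumed to be integers or `Fraction`s.

```python
from fractions import Fraction


def tangent_radii(a, b, c):
    """Radii s - a, s - b, s - c of the mutually tangent circles at the vertices."""
    s = Fraction(a + b + c, 2)
    return s - a, s - b, s - c


def squared_area(a, b, c):
    """Squared area as r1 * r2 * r3 * (r1 + r2 + r3), with scale factor 1."""
    r1, r2, r3 = tangent_radii(a, b, c)
    return r1 * r2 * r3 * (r1 + r2 + r3)


print(tangent_radii(3, 4, 5))   # the radii 3, 2 and 1
print(squared_area(3, 4, 5))    # 36
print(squared_area(5, 12, 13))  # 900
```

The radii really do recover the sides: the two circles at the ends of the side of length $a$ have radii $s-b$ and $s-c$, which sum to $a$.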

Suppose we have a plane with three mutually tangent spheres, each of radius $1$, on it. Sitting nestled between the spheres and the plane is a sphere of radius $r$. We can find $r$ using the method outlined above. Let the bend of the smaller sphere be $b$. Then $(b + 3)^2 = 3(b^2 + 3)$, so $2b^2 - 6b = 0$, so $b = 3$, and the inner sphere has radius $\frac{1}{3}$. Although we had to solve a quadratic equation, we ended up with a rational value. But that isn't surprising in this case: we already knew one of the roots would be $0$, which is rational, as there is clearly another plane tangent to all three spheres and parallel (that is, tangent at infinity) to the other plane. The rationality of the other root follows from the fact that the sum of the roots of a monic quadratic equation in $x$ is minus the coefficient of $x$: here $b^2 - 3b = 0$, so the roots sum to $3$.
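The same method works for any $n + 1$ known bends: writing $s$ for their sum and $t$ for the sum of their squares, the unknown bend $x$ satisfies $(x + s)^2 = n(x^2 + t)$, that is, $(n-1)x^2 - 2sx + (nt - s^2) = 0$. A minimal Python sketch follows, with a made-up function name and assuming $n \geq 2$ so that the equation really is quadratic. It uses floating point, so the results are only approximate in general.

```python
from math import sqrt


def missing_bends(known):
    """Both solutions for the last bend, given the other n + 1 bends.

    Solves (n - 1) x^2 - 2 s x + (n t - s^2) = 0, where s is the sum of
    the known bends and t the sum of their squares.
    """
    n = len(known) - 1
    if n < 2:
        raise ValueError("need at least three known bends")
    s = sum(known)
    t = sum(b * b for b in known)
    # A quarter of the discriminant: s^2 - (n - 1)(n t - s^2).
    quarter_disc = n * s * s - n * (n - 1) * t
    if quarter_disc < 0:
        raise ValueError("no real solution for these bends")
    root = sqrt(quarter_disc)
    return (s - root) / (n - 1), (s + root) / (n - 1)


# Three unit spheres on a plane: the other plane (bend 0) and the inner sphere.
print(missing_bends([1, 1, 1, 0]))  # (0.0, 3.0)

# Three circles of bends 2, 2, 3: the enclosing circle and the inner circle.
print(missing_bends([2, 2, 3]))     # (-1.0, 15.0)
```

The two roots always sum to $2s/(n-1)$, which is the fact exploited in what follows.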

Applying this in the rather well behaved $3$-dimensional setting gives another pleasant result in the same spirit as the Descartes theorem. Fix three mutually tangent spheres of bends $u$, $v$ and $w$. Take some sphere $S_0$ tangent to all three. What happens when we construct a sequence of spheres, taking each $S_{k+1}$ tangent to $S_k$ and to the original three spheres, but with $S_{k+1}$ different from $S_{k-1}$? Well, the bends $b_{k-1}$ and $b_{k+1}$ of $S_{k-1}$ and $S_{k+1}$ are the roots of the quadratic $(x + b_k + u + v + w)^2 = 3(x^2 + b_k^2 + u^2 + v^2 + w^2)$. Expanding and dividing through by $2$ puts this in the monic form $x^2 - (u + v + w + b_k)x + c = 0$, where $c$ does not depend on $x$, so the roots must sum to minus the coefficient of $x$, namely $u + v + w + b_k$. This gives $b_{k+1} = u + v + w + b_k - b_{k-1}$. Letting $K = u + v + w$, this gives the successive equations:
\[
\begin{array}{lcccl}
b_2 &=& K + b_1 - b_0 && \\
b_3 &=& K + b_2 - b_1 &=& 2K - b_0 \\
b_4 &=& K + b_3 - b_2 &=& 2K - b_1 \\
b_5 &=& K + b_4 - b_3 &=& K + b_0 - b_1 \\
b_6 &=& K + b_5 - b_4 &=& b_0 \\
b_7 &=& K + b_6 - b_5 &=& b_1
\end{array}
\]
and so on, with period $6$. So we get a chain of $6$ spheres threading around the original three. What's more, the spheres which are opposite in this chain have bends summing to $2K$, twice the sum of the bends of the original three. This result has also been cast in verse: 'The mean of the bends of each opposite pair is the sum of the three through the thoroughfare'. This couplet is taken from a longer poem in a somewhat obscure book that I don't have to hand at the moment. There's a neat illustration, together with a hint at another proof that you always get a chain of length $6$, here.
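The recurrence is simple enough to run by hand, but a few lines of Python make the period visible at a glance. This is only a sketch with an invented function name. It assumes the caller supplies starting bends $b_0$ and $b_1$ that really do belong to tangent spheres. The example reuses the three unit spheres on a plane from earlier, with $b_0 = 0$ for the plane and $b_1 = 3$ for the small sphere.

```python
def bend_chain(u, v, w, b0, b1, length=8):
    """Bends b_0, b_1, ... generated by b_{k+1} = (u + v + w) + b_k - b_{k-1}."""
    k_sum = u + v + w
    bends = [b0, b1]
    while len(bends) < length:
        bends.append(k_sum + bends[-1] - bends[-2])
    return bends


chain = bend_chain(1, 1, 1, 0, 3)
print(chain)  # [0, 3, 6, 6, 3, 0, 0, 3]

# Opposite members of the chain have bends summing to 2K = 6.
print([chain[i] + chain[i + 3] for i in range(3)])  # [6, 6, 6]
```

In this example the chain consists of the two parallel planes, two spheres of radius $\frac{1}{3}$ and two of radius $\frac{1}{6}$.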
