The First Case of Fermat's Last Theorem

D. R. Heath-Brown

Fermat, around 1637, stated that the Diophantine equation $x^n + y^n = z^n$ has no solutions in positive integers if $n > 2$. This statement, which has never been proved or disproved, has come to be known as "Fermat's Last Theorem" (which we shall abbreviate to FLT). A considerable literature has arisen from Fermat's assertion; the interested reader should refer to the books by Edwards [3] and Ribenboim [9]. The purpose of this article is to describe some recent work on FLT where, for the first time, the theory of the distribution of primes is brought into play. We shall see how the connection arises, and look at some of the prime number theory on which it is necessary to call.

First, however, let us review a few other results on the problem. FLT is at present known to hold for all exponents $n \le 125000$ (Wagstaff [10]). However, the most impressive result concerning FLT is perhaps the theorem below, which is a special case of the recent work of Faltings [4] (or see Spencer Bloch's article in The Mathematical Intelligencer, Vol. 6, No. 2). To express the result succinctly we shall say that $x, y, z$ is a "primitive" solution of $x^n + y^n = z^n$ if $xyz \ne 0$ and $(x,y,z) = 1$. (Hence any non-trivial solution may be reduced to a primitive one.) We then have:

THEOREM (Faltings): For each exponent $n \ge 3$ the equation $x^n + y^n = z^n$ has at most a finite number of primitive solutions.

Since it is known that there are no solutions for $n = 4$ one may restrict attention to prime exponents $p$. We then have the customary division into the first and second cases. One says that the first case of FLT holds for the exponent $p$ if there are no primitive solutions of $x^p + y^p = z^p$ in integers $x, y, z$ coprime to $p$. Similarly, the second case holds if there are no primitive solutions with $p \mid xyz$. The first case, which we shall be concerned with here, is generally reckoned to be the easier. Indeed it is known to hold for all $p \le 6 \times 10^9$ (Lehmer [8]). The earliest result on the first case, and the one which introduced the division into cases, is the following:

THEOREM (Sophie Germain, 1832): The first case of FLT holds for p whenever 2p + 1 is also prime.

This deals satisfactorily with $p = 3, 5, 11$ for example, but not with $p = 7$ or $13$. Many other criteria for the first case have been given, perhaps the most widely known being:
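As a small illustration of Sophie Germain's criterion, the following Python sketch lists which small primes $p$ have $2p + 1$ prime. The helper names `is_prime` and `germain_applies` are illustrative choices, not taken from the article.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def germain_applies(p: int) -> bool:
    """True when p and 2p + 1 are both prime, so the criterion covers p."""
    return is_prime(p) and is_prime(2 * p + 1)


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, 2 * p + 1, germain_applies(p))
```

For $p = 7$ and $p = 13$ the numbers $2p + 1 = 15$ and $27$ are composite, which is why the criterion is silent there.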

THEOREM (Wieferich, Mirimanoff): The first case of FLT holds for $p$ unless

$$m^{p-1} \equiv 1 \pmod{p^2} \tag{1}$$

for both $m = 2$ and $m = 3$.

(In fact the first case holds for $p$ unless (1) is satisfied for each of $m = 2, 3, \ldots, 36$, as was, essentially, shown by Morishima.) No primes are known for which the congruence (1) holds for both $m = 2$ and $m = 3$; and it seems most unlikely that any such prime exists. Indeed the only primes $p \le 6 \times 10^9$ for which $2^{p-1} \equiv 1 \pmod{p^2}$ are $p = 1093$ and $p = 3511$. The following crude heuristic argument suggests that at most a finite number of primes satisfy (1) for both $m = 2$ and $m = 3$.

THE MATHEMATICAL INTELLIGENCER VOL. 7, NO. 4, 1985

One has $2^{p-1} = 1 + ap$, $3^{p-1} = 1 + bp$, by Fermat's "Little Theorem". The residues modulo $p$ of $a$ and $b$ seem to be randomly distributed. Thus one might expect the "probability" of having $p \mid a$ and $p \mid b$ to be $1/p^2$, and so the "expected" number of primes for which (1) holds for both $m = 2$ and $m = 3$ is $\sum_p p^{-2} < \infty$.
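The congruence (1) is easy to test by machine. The sketch below, with illustrative names `satisfies_congruence` and `primes_below`, searches a small range for primes satisfying (1) with $m = 2$ and then checks $m = 3$ for the same primes; the search limit 4000 is an arbitrary choice that happens to include both primes quoted above.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def satisfies_congruence(m: int, p: int) -> bool:
    """Check m^(p-1) == 1 (mod p^2), i.e. congruence (1)."""
    return pow(m, p - 1, p * p) == 1


def primes_below(limit: int) -> list:
    return [n for n in range(2, limit) if is_prime(n)]


if __name__ == "__main__":
    small_primes = primes_below(4000)
    base2 = [p for p in small_primes if satisfies_congruence(2, p)]
    both = [p for p in base2 if satisfies_congruence(3, p)]
    print("m = 2:", base2)
    print("m = 2 and m = 3:", both)
    # Partial sum of the convergent series in the heuristic argument.
    print(sum(1.0 / (p * p) for p in small_primes))
```

Three-argument `pow` performs the modular exponentiation without ever forming the huge integer $m^{p-1}$.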

In spite of the apparent strength of the above criteria it is still conceivable that they fail for all primes from some point onward. Indeed it has only recently been shown, by Adleman, Fouvry and the author, that the first case of FLT holds for infinitely many primes, and this is the result with which the present article will be concerned. It is still an open question whether the second case holds for infinitely many primes. The difficulty lies in the fact that criteria such as (1) are very hard to incorporate into an argument which involves more than one prime $p$. Indeed the criteria are very hard to get any sort of grip on.

Such a problem, asking for infinitely many primes with a certain property, clearly falls into the realm of the analytic number theorist. Looking at Sophie Germain's criterion, for example, we would like to know that $2p + 1$ is prime for infinitely many $p$. Analytic number theorists have long been familiar with this question, and know quite a lot about it. (But they haven't solved it!) The purpose of this article is to look at some of the prime number theory related to a generalization of Sophie Germain's criterion, and to show how it leads to the following result.

THEOREM (Adleman, Fouvry and Heath-Brown [1], [5]): The first case of Fermat's Last Theorem holds for infinitely many primes.

Indeed, if we define

$$S = \{p : \text{the first case holds for } p\},$$

then the method will in fact show that

$$\#\{p \in S : p \le x\} \gg x^{2/3}.$$

This notation means that the left hand side is at least $c\,x^{2/3}$ for some constant $c > 0$. Thus there are a "decent number" of primes for which the first case holds. (Remember that the total number of primes up to $x$ is asymptotically $x/\log x$, by the Prime Number Theorem.)

Generalization of Sophie Germain's Criterion

Let us begin by considering our criterion for the first case. Suppose that $k$ is a positive integer not divisible by $p$, and let $2kp + 1 = q$ be prime. (We shall use $p$ and $q$ to denote primes throughout.) Let $x^p + y^p = z^p$ with $(x,y,z) = 1$ and $p \nmid xyz$. We shall need a result due to Furtwängler, which states that if $x, y, z$ are as above then $d^{p-1} \equiv 1 \pmod{p^2}$ for any divisor $d$ of the product $xyz$. (Thus, for example, one may deduce the case $m = 2$ of (1), since $x, y, z$ cannot all be odd.) Let us see whether it is possible for $q$ to divide $xyz$. By Furtwängler's theorem this would entail $q^{p-1} \equiv 1 \pmod{p^2}$. However, by the Binomial Theorem we have

$$q^{p-1} = (2kp + 1)^{p-1} \equiv 1 + 2kp(p-1) \not\equiv 1 \pmod{p^2},$$

since $p \ge 3$ and $p \nmid k$. Hence none of $x, y, z$ can be divisible by $q$.
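The binomial congruence just used can be spot-checked numerically. The sketch below compares both sides modulo $p^2$ for a few odd primes $p$ and multipliers $k$; the function name `binomial_check` is illustrative. The congruence itself does not need $2kp + 1$ to be prime, so primality of $q$ is not tested here.

```python
def binomial_check(p: int, k: int):
    """Return ((2kp+1)^(p-1) mod p^2, (1 + 2kp(p-1)) mod p^2)."""
    modulus = p * p
    q = 2 * k * p + 1
    lhs = pow(q, p - 1, modulus)
    rhs = (1 + 2 * k * p * (p - 1)) % modulus
    return lhs, rhs


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        for k in range(1, 15):
            lhs, rhs = binomial_check(p, k)
            # The residue differs from 1 exactly when p does not divide k.
            print(p, k, lhs == rhs, lhs != 1)
```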

We next choose $y'$ such that $yy' \equiv 1 \pmod q$. This is possible since we now have $q \nmid y$. Then

$$(xy')^p + 1 \equiv (xy')^p + (yy')^p = (zy')^p \pmod q.$$

Let $X = (xy')^p$, $Y = (zy')^p$, so that $X + 1 \equiv Y \pmod q$. On observing that $q \nmid xy'$ we have $X^{2k} = (xy')^{2kp} = (xy')^{q-1} \equiv 1 \pmod q$, by Fermat's "Little Theorem", and similarly we find $Y^{2k} \equiv 1 \pmod q$. Hence $q \mid X^{2k} - 1$ and $q \mid (X+1)^{2k} - 1$. It follows that $q \mid R_k$, where $R_k$ is the resultant of the polynomials $X^{2k} - 1$ and $(X+1)^{2k} - 1$.

We shall need to know for which $k$ one has $R_k = 0$. This happens if and only if the two polynomials have a common complex root, $\alpha$ say. Then $|\alpha| = |\alpha + 1| = 1$ so that $\alpha = \exp(\pm 2\pi i/3)$. It follows that $3 \mid k$, since we have $\alpha^{2k} = 1$. Thus $R_k$ will be non-zero whenever $k$ is relatively prime to 3. We shall also need an estimate for the size of $R_k$. We note that $R_k$ is defined by a determinant of order $4k$, whose entries are either zeros or coefficients of the polynomials $X^{2k} - 1$ and $(X+1)^{2k} - 1$. Hence the entries are bounded by $\binom{2k}{k}$, and it follows that

$$|R_k| \le (4k)!\binom{2k}{k}^{4k} \le (4k)^{4k}(2^{2k})^{4k} \le (2^{4k})^{4k}(2^{2k})^{4k} = 2^{24k^2}. \tag{2}$$
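For small $k$ the resultant $R_k$ can be computed directly from the determinant of order $4k$ described above. The following Python sketch builds the Sylvester matrix and evaluates its determinant exactly with rational arithmetic; the names `sylvester_matrix`, `exact_determinant` and `resultant_R` are illustrative, and the sign of the result depends on the row ordering convention, so only $|R_k|$ is meaningful here.

```python
from fractions import Fraction
from math import comb


def sylvester_matrix(f, g):
    """Sylvester matrix of f and g (coefficient lists, highest degree first)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = []
    for i in range(n):  # n shifted copies of f
        rows.append([0] * i + f + [0] * (size - m - 1 - i))
    for i in range(m):  # m shifted copies of g
        rows.append([0] * i + g + [0] * (size - n - 1 - i))
    return rows


def exact_determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            if factor:
                for j in range(c, n):
                    a[r][j] -= factor * a[c][j]
    return int(det)


def resultant_R(k: int) -> int:
    """Resultant of X^(2k) - 1 and (X+1)^(2k) - 1, up to sign."""
    e = 2 * k
    f = [1] + [0] * (e - 1) + [-1]
    g = [comb(e, j) for j in range(e + 1)]
    g[-1] -= 1  # subtract 1 from the constant term
    return exact_determinant(sylvester_matrix(f, g))


if __name__ == "__main__":
    for k in range(1, 5):
        r = resultant_R(k)
        print(k, r, abs(r) <= 2 ** (24 * k * k))
```

For $k = 1$ and $k = 2$ this gives $|R_1| = 3$ and $|R_2| = 375 = 3 \cdot 5^3$, while $R_3 = 0$, in line with the discussion of common roots; the bound (2) is very generous for such small $k$.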

We now define

$$T_k = \{p : 2kp + 1 \text{ is prime, but } p \notin S\},$$

$$U_k = \{p : 2kp + 1 = q \text{ is prime, and either } p \mid k \text{ or } q \mid R_k\}.$$

Then from what we have seen above, we have $T_k \subseteq U_k$. Moreover, if $3 \nmid k$, then $U_k$ is a finite set, which can in principle be determined. In fact $U_k$ will often be empty. Dénes [2] has used this approach to generalize Sophie Germain's criterion to the extent that it suffices for $2kp + 1$ to be prime for any $k \le 52$ with $3 \nmid k$. If we knew that $T_k$ would always be empty for $3 \nmid k$, we could show that the first case of FLT must hold for all primes $p$. One would merely take $k = 3j - p$ (whence $3 \nmid k$) and use Dirichlet's theorem on primes in arithmetic progressions to find a value of $j > p/3$ such that $6pj + 1 - 2p^2 = 2pk + 1$ is prime. (Note that $(6p, 1 - 2p^2) = 1$.) Then $p$ would have to lie in $S$, as $T_{3j-p} = \emptyset$.
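The criterion above can be tried out without computing $R_k$ at all. Since $2k$ divides $q - 1$, every root of $X^{2k} - 1$ over the field of $q$ elements already lies in that field, so $q \mid R_k$ exactly when some residue $X$ modulo $q$ satisfies both $X^{2k} \equiv 1$ and $(X+1)^{2k} \equiv 1$. The sketch below uses this brute-force test; the names `has_common_root` and `certifying_k` are illustrative, and the default limit of 52 merely echoes the range quoted for Dénes, not his method.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def has_common_root(k: int, q: int) -> bool:
    """True if X^(2k) = 1 and (X+1)^(2k) = 1 share a root modulo q."""
    e = 2 * k
    return any(pow(x, e, q) == 1 and pow(x + 1, e, q) == 1
               for x in range(1, q))


def certifying_k(p: int, k_max: int = 52):
    """Smallest k <= k_max showing that p lies outside U_k, else None.

    Requires 3 not dividing k, p not dividing k, q = 2kp + 1 prime,
    and q not dividing R_k (tested via common roots modulo q).
    """
    for k in range(1, k_max + 1):
        if k % 3 == 0 or k % p == 0:
            continue
        q = 2 * k * p + 1
        if is_prime(q) and not has_common_root(k, q):
            return k
    return None


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 17, 19):
        print(p, certifying_k(p))
```

The primes 7 and 13, which escaped Sophie Germain's original criterion, are handled with $k = 2$, using the primes $q = 29$ and $q = 53$.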

Unfortunately it requires numerical investigation for each individual $k$ to show that $T_k$ is empty. However it is easy merely to estimate the size of $U_k$. Clearly $k$ can have at most $k$ prime factors $p$. Moreover, since $2kp + 1 \ge 2$, it follows from (2) that $R_k$ has at most $24k^2$ prime factors $q$, if $3 \nmid k$. Hence

$$\#T_k \le \#U_k \le 24k^2 + k, \qquad (3 \nmid k). \tag{3}$$

To put this into a more useful form we introduce the counting functions

$$\pi(x;u,v) = \#\{q \le x : q \equiv v \pmod u\}$$

and, for $3 \nmid u$,

$$\pi^*(x;u) = \#\{q \le x : q \equiv 1 \pmod u,\ 3 \nmid q - 1\}.$$

Hence if $3 \nmid u$ we have

$$\pi^*(x;u) = \pi(x;u,1) - \pi(x;3u,1). \tag{4}$$
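Identity (4) can be confirmed by direct counting. The sketch below sieves the primes up to $x$ and compares both sides for a few moduli $u$ not divisible by 3; the function names `pi_ap` and `pi_star` are illustrative stand-ins for $\pi(x;u,v)$ and $\pi^*(x;u)$.

```python
def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, flag in enumerate(sieve) if flag]


def pi_ap(primes, u: int, v: int) -> int:
    """Number of listed primes q with q = v (mod u)."""
    return sum(1 for q in primes if (q - v) % u == 0)


def pi_star(primes, u: int) -> int:
    """Number of listed primes q = 1 (mod u) with 3 not dividing q - 1."""
    return sum(1 for q in primes if (q - 1) % u == 0 and (q - 1) % 3 != 0)


if __name__ == "__main__":
    primes = primes_up_to(100000)
    for u in (5, 7, 10, 11, 13):
        print(u, pi_star(primes, u),
              pi_ap(primes, u, 1) - pi_ap(primes, 3 * u, 1))
```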

The function $\pi(x;u,v)$ is one of the basic objects of study in prime number theory. According to the prime number theorem for arithmetic progressions, if $u$ and $v$ are fixed and $x \to \infty$, one has

$$\pi(x;u,v) \sim \frac{\mathrm{Li}(x)}{\varphi(u)}, \qquad (u,v) = 1. \tag{5}$$

Here $\varphi(u)$ is Euler's function and $\mathrm{Li}(x) = \int_2^x (\log t)^{-1}\,dt$. Since $\varphi(p) = p - 1$ and $\varphi(3p) = 2p - 2$ for $p \ne 3$, it follows from (4) that

$$\pi^*(x;p) \sim \frac{\mathrm{Li}(x)}{2p - 2}. \tag{6}$$

We proceed to show how the bound (3) for the size of $T_k$ leads to an estimate for the sum

$$\Sigma_1 = \sum_{\substack{y < p \le x \\ p \notin S}} \pi^*(x;p).$$

Here we shall take $y = x^{\theta}$ with a suitably chosen constant $\theta$ in the range $0 < \theta < 1$. By the definition of $\pi^*(x;p)$ we see that $\Sigma_1$ is the number of pairs of primes $p, q$ for which $y < p \le x$, $p \notin S$, $q \le x$, $p \mid q - 1$ and $3 \nmid q - 1$. If we write $q - 1 = 2kp$, which must be possible, since $p$ and $q$ are odd, we see that $k \le x/(2p) < x/(2y)$ and that $3 \nmid k$. Now, if $p, q, k$ are as above, we will have $p \in T_k$, and since $q$ is determined by $p$ and $k$ we deduce that

$$\Sigma_1 \le \sum_{\substack{k < x/2y \\ 3 \nmid k}} \#T_k \le \sum_{k < x/2y} (24k^2 + k) \ll (x/y)^3, \tag{7}$$

on applying (3).

It turns out that, by quite different means, one can show that the sum corresponding to $\Sigma_1$, but with the condition $p \notin S$ omitted, is $O(x/\log x)$. Consequently (7) tells us nothing about the set $S$ unless $y = x^{\theta}$ with $\theta > 2/3$. However for values of $\theta$ close to 1, the bound (7) puts severe restrictions on the primes $p \notin S$. To make this more precise we shall consider further the sum

$$\Sigma_2 = \sum_{y < p \le x} \pi^*(x;p),$$

which was just ment ioned . Let us suppose, for illus- trative purposes , that (6) can be used uniformly for all p in the range y < p ~ 2 y - - a n assumpt ion which we cannot in fact justify. Then, if 2y ~ x, we would have

Li(x) U(x) ~ 2 ~ ~ ~*(x,/~)- ~ 2 p - 2 >~ ~ 4 y - 2 " y<p~2y y<p~2y y<p~2y

H o w e v e r the n u mb er of pr imes p be tween y and 2y is asymptot ical ly y/log y = y/(O log x), and Li(x) x/log x, wh en ce

y x 1 x

E 2 >~ 0 log-----x " log x" 4y----'U~- 2 >~ (log x) - - - - - - ~ ' (8)

since 0 is constant . We n o w compare the est imates (7) and (8). By taking

y = x ~ wi th 0 > 2/3 we see that E 1 = O(x 3-3~ = o(x(log X ) - 2 ) . Consequent ly one has E 1 < ~2 as soon as x is large enough , and so there mus t exist a prime p E S in the range y < p ~ 2y. It follows, by using an in- creasing sequence of values for x, that S is infinite. Notice particularly that it is essential to have 0 > 2/3 here. One cannot make do wi th 0 = 2/3 even.
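To see why $\theta > 2/3$ is needed, and how large "large enough" may be, one can compare the orders of magnitude $x^{3-3\theta}$ and $x(\log x)^{-2}$ numerically. The sketch below ignores all implied constants, so it illustrates only the growth rates; the function name `magnitude_ratio` is illustrative.

```python
import math


def magnitude_ratio(x: float, theta: float) -> float:
    """Ratio x^(3 - 3*theta) / (x / (log x)^2), constants ignored."""
    return x ** (3 - 3 * theta) / (x / math.log(x) ** 2)


if __name__ == "__main__":
    for exponent in (30, 60, 120):
        x = 10.0 ** exponent
        print(exponent,
              magnitude_ratio(x, 0.7),
              magnitude_ratio(x, 2 / 3))
```

With $\theta = 0.7$ the ratio eventually tends to zero, although only slowly, whereas with $\theta = 2/3$ it equals $(\log x)^2$ and grows without bound.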

The Bombieri-Vinogradov Prime Number Theorem

We must now consider the range of values of $u$ for which one can use (5), and consequently the range of $p$ for which (6) holds. The problem may be illustrated by examining the estimate

$$\pi(x;u,v) = \frac{\mathrm{Li}(x)}{\varphi(u)}\bigl\{1 + O(\,\cdots)\bigr\}, \qquad (u,v) = 1,$$

which is a form of (5) in which the dependence of the error term on $u$ has been made explicit. For fixed $u$ this does indeed imply the asymptotic formula (5). However, as soon as $u \gg \log x$, one cannot even deduce that $\pi(x;u,v)$ is non-zero. It turns out that better estimates are available, but the sharpest result to date suffices only to show that $\pi(x;u,v)$ is positive for $u \le x^{1/17}$ (and $(u,v) = 1$, of course), while we are interested in the range $u \approx x^{2/3}$. (What we have here is a form of Linnik's Theorem: If $(u,v) = 1$ then there exists a prime $p \equiv v \pmod u$ for which $p \ll u^{17}$. This is a quantitative version of Dirichlet's theorem on primes in arithmetic progressions. The exponent 17 is due to Chen.) It is conjectured that the estimate (5) holds uniformly for $u \le x^{1-\varepsilon}$, for any positive constant $\varepsilon$, but we are far from a proof of this as yet.


One way to make progress is to call on the Bombieri-Vinogradov Theorem. This states that

$$\sum_{u \le x^{\phi}} \max_{z \le x} \max_{(u,v)=1} \left| \pi(z;u,v) - \frac{\mathrm{Li}(z)}{\varphi(u)} \right| \ll \frac{x}{(\log x)^{10}} \tag{9}$$

for $\phi \le \tfrac12 - \varepsilon$, where $\varepsilon$ is a fixed positive constant. As we shall see in a moment, the theorem essentially says that (5) holds uniformly for "almost all" values of $u$ in the range $u \le x^{1/2}$. For our purposes we take $z = x$, $v = 1$, and we use only the terms $u = p, 3p$. Then

$$\sum_{p \le x^{\phi}} \left| \pi(x;p,1) - \frac{\mathrm{Li}(x)}{p - 1} \right| \ll \frac{x}{(\log x)^{10}},$$

and similarly with $p$ replaced by $3p$. Thus (4) leads to

$$\sum_{3 < p \le x^{\phi}} \left| \pi^*(x;p) - \frac{\mathrm{Li}(x)}{2p - 2} \right| \ll \frac{x}{(\log x)^{10}}. \tag{10}$$

(In fact we have $p \le x^{\phi}$ in one estimate and $3p \le x^{\phi}$ in the other, so a little "cleaning up" is needed.)

We now take $y = x^{\theta}$ and define $\phi$ by $2y = x^{\phi}$. Then $\theta < \tfrac12$ implies $\phi \le \tfrac12 - \varepsilon$, for a suitable $\varepsilon = \varepsilon(\theta) > 0$, if $x$ is large enough. We shall use (10) to show that (6) holds for "almost all" $p$ in the range $y < p \le 2y$. Specifically let $\delta > 0$ be given, and suppose there are $N_{\delta}$ such primes for which

$$\left| \pi^*(x;p) - \frac{\mathrm{Li}(x)}{2p - 2} \right| > \delta\,\frac{\mathrm{Li}(x)}{2p - 2}. \tag{11}$$

Here we have

$$\delta\,\frac{\mathrm{Li}(x)}{2p - 2} \gg \frac{x/\log x}{y} = \frac{x}{y \log x},$$

since $\delta$ is constant. Hence (10) yields

$$N_{\delta}\,\frac{x}{y \log x} \ll x(\log x)^{-10},$$

and therefore $N_{\delta} = O(y(\log x)^{-9})$. Since the total number of primes $y < p \le 2y$ is asymptotically $y/\log y$ we see that (11) can hold for only a small proportion of these primes, as claimed. In particular we must have

$$\left| \pi^*(x;p) - \frac{\mathrm{Li}(x)}{2p - 2} \right| \le \delta\,\frac{\mathrm{Li}(x)}{2p - 2} \tag{12}$$

for at least $\tfrac12 \cdot y/\log y$ primes, if $x$ is large enough. On choosing $\delta = \tfrac12$ we see from (12) that $\pi^*(x;p) \ge \tfrac12 \cdot \mathrm{Li}(x)/(2p - 2)$ for at least $\tfrac12 \cdot y/\log y$ primes, and consequently the estimate (8) follows as before.

We now have a rigorous proof that (8) holds. Unfortunately the admissible range is $\theta < 1/2$, whereas we would like to take $\theta > 2/3$. The question naturally arises: Is it possible to extend the set of values of $\phi$ for which the Bombieri-Vinogradov Theorem (9) is valid? It is conjectured that any $\phi < 1$ is admissible. Some small but significant progress has been made recently, by Iwaniec and others, in certain special cases. However this does not relate directly to the sum $\Sigma_2$ in which we are interested, although, as we shall see later, there is a less obvious connection.

An Argument of Chebychev, and the Brun-Titchmarsh Theorem

There is a method due to Chebychev which allows one to get lower bounds for $\sum \pi^*(x;p)$ out of upper bounds. We start from the fact that

$$\sum_{p^m \mid n} \log p = \log n. \tag{13}$$

Here the sum on the left counts $\log p$ once for each power of $p$ dividing $n$, so that, if $p^r$ is the largest power of $p$ present, the corresponding contribution is $r(\log p)$, as required. We now observe that

$$\sum_{\substack{p^m \le x \\ p \ne 3}} \pi^*(x;p^m) \log p = \sum_{\substack{p^m \le x \\ p \ne 3}} \#\{q \le x : p^m \mid q - 1,\ 3 \nmid q - 1\} \log p = \sum_{\substack{q \le x \\ 3 \nmid q - 1}} \ \sum_{\substack{p^m \le x,\ p \ne 3 \\ p^m \mid q - 1}} \log p,$$

on changing the order of summation. The conditions $p^m \le x$ and $p \ne 3$ in the inner sum are redundant, since $q \le x$ and $3 \nmid q - 1$, and hence we deduce that

$$\sum_{\substack{p^m \le x \\ p \ne 3}} \pi^*(x;p^m) \log p = \sum_{\substack{q \le x \\ 3 \nmid q - 1}} \log(q - 1),$$

by (13). We shall need an asymptotic formula for the sum on the right. We have now reached the point in this discussion where it is no longer appropriate to give full details of all the estimations needed. For the sum in question, it would be a little tedious, but not at all difficult, to construct a complete argument. Roughly speaking, one has here a sum of $1 + \pi(x;3,2)$ terms (since either $q = 3$ or $q \equiv 2 \pmod 3$). By (5) this number is asymptotically $\tfrac12 \cdot x/\log x$. (Note that we are using (5) correctly, with a fixed value $u = 3$.) Moreover the vast majority of terms have $x/\log x \le q \le x$, so that $\log(q - 1) \sim \log x$. Thus one obtains

$$\sum_{\substack{p^m \le x \\ p \ne 3}} \pi^*(x;p^m) \log p = \sum_{\substack{q \le x \\ 3 \nmid q - 1}} \log(q - 1) \sim \tfrac12 x. \tag{14}$$
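The exact identity behind (14) can be checked for a modest value of $x$ by summing both sides directly. In the sketch below the name `chebyshev_sides` is illustrative; it returns the left-hand side (a sum over prime powers $p^m \le x$ with $p \ne 3$) and the right-hand side (a sum of $\log(q-1)$ over admissible primes $q$).

```python
import math


def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, flag in enumerate(sieve) if flag]


def chebyshev_sides(x: int):
    """Both sides of the identity preceding (14), computed by brute force."""
    primes = primes_up_to(x)
    # Primes q <= x with 3 not dividing q - 1.
    admissible = [q for q in primes if (q - 1) % 3 != 0]
    lhs = 0.0
    for p in primes:
        if p == 3:
            continue
        power = p
        while power <= x:
            count = sum(1 for q in admissible if (q - 1) % power == 0)
            lhs += count * math.log(p)
            power *= p
    rhs = sum(math.log(q - 1) for q in admissible)
    return lhs, rhs


if __name__ == "__main__":
    x = 2000
    lhs, rhs = chebyshev_sides(x)
    print(lhs, rhs, rhs / x)
```

The two sides agree up to rounding, and the ratio of the common value to $x$ can be compared with the limiting value $\tfrac12$ in (14), which is approached only slowly.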

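We shall also evaluate the sum $\sum \pi^*(x;p^m) \log p$ for the shorter range $p^m \le x^{\phi}$, where $\phi$ is a constant less than $\tfrac12$. Our starting point is the estimate

$$\sum_{\substack{p^m \le x^{\phi} \\ p \ne 3}} \left| \pi^*(x;p^m) - \frac{\mathrm{Li}(x)}{2\varphi(p^m)} \right| \ll \frac{x}{(\log x)^{10}},$$

which is a straightforward generalization of (10). Since $\log p \le \log x$ for $p^m \le x^{\phi}$, we find that

$$\sum_{\substack{p^m \le x^{\phi} \\ p \ne 3}} \left| \pi^*(x;p^m) \log p - \frac{\mathrm{Li}(x) \log p}{2\varphi(p^m)} \right| \ll \frac{x}{(\log x)^{9}},$$

whence

$$\sum_{\substack{p^m \le x^{\phi} \\ p \ne 3}} \pi^*(x;p^m) \log p = \frac{\mathrm{Li}(x)}{2} \sum_{\substack{p^m \le x^{\phi} \\ p \ne 3}} \frac{\log p}{\varphi(p^m)} + O\!\left(\frac{x}{(\log x)^{9}}\right).$$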

We have now to estimate the sum on the right, and again it is inappropriate to give full details. The terms with $m \ge 2$ make a contribution $O(1)$ to the sum; indeed the corresponding double infinite sum for all $p$ and all $m \ge 2$ converges. Moreover, $\log p/\varphi(p) \sim (\log p)/p$ and

$$\sum_{p \le z} \frac{\log p}{p} \sim \log z.$$

Thus one has

$$\sum_{\substack{p^m \le x^{\phi} \\ p \ne 3}} \frac{\log p}{\varphi(p^m)} \sim \log x^{\phi},$$

and consequently

$$\sum_{\substack{p^m \le x^{\phi} \\ p \ne 3}} \pi^*(x;p^m) \log p \sim \tfrac12 \mathrm{Li}(x)(\log x^{\phi}) \sim \frac{\phi}{2}\,x.$$

We now subtract this from (14), whence

$$\sum_{\substack{x^{\phi} < p^m \le x \\ p \ne 3}} \pi^*(x;p^m) \log p \sim \frac{1 - \phi}{2}\,x.$$

We must make allowance here for the terms $p^m$ with $m \ge 2$. Although we do not give full details, the basic idea is that in any range $X < p^m \le 2X$ the total number of such terms is $O(X^{1/2})$, which is negligible compared to the number of primes in the interval. Consequently we may neglect the terms $p^m$, $m \ge 2$, without altering the asymptotic formula. It follows that

$$\sum_{x^{\phi} < p \le x} \pi^*(x;p) \log p \sim \frac{1 - \phi}{2}\,x, \tag{15}$$

and therefore that

$$\Sigma_2 = \sum_{x^{\phi} < p \le x} \pi^*(x;p) \gg \frac{x}{\log x} \tag{16}$$

for $\theta = \phi < \tfrac12$.

The estimate (16) may be compared with (7) in the same way as we used (8) earlier. As yet there is no advantage in this, since the allowable range $\theta < \tfrac12$ is the same as that arising from the analysis of the previous section. Further progress is now possible, however, because of the explicit constant $(1 - \phi)/2$ occurring in (15). We shall give an upper bound for the contribution to (15) arising from primes $x^{\phi} < p \le x^{\theta}$ and, providing that this upper estimate is less than $\tfrac12 x(1 - \phi)$, we shall still be able to deduce (16). The point here is that only an upper bound for $\pi^*(x;p)$ is required, and not an asymptotic formula.

The bounds for $\pi^*(x;p)$ that we shall use are forms of the Brun-Titchmarsh Theorem, this being the statement that $\pi(x;u,v) \ll \mathrm{Li}(x)/\varphi(u)$ uniformly for $u \le x^{1-\delta}$ (where $\delta$ is any positive constant). In particular one has $\pi^*(x;p) \ll x/(p \log x)$ for $p \le x^{1-\delta}$. The constant implied by the $\ll$ notation may of course depend on $\delta$, and in practice the bounds one proves have the shape

$$\pi^*(x;p) \le \frac{C(r)\,x}{p \log x}, \qquad \left(r = \frac{\log p}{\log x},\ r < 1\right). \tag{17}$$

It is interesting to compare this estimate with the asymptotic formula (5). Both estimates have the same order of magnitude. However, in all known forms of (17) the constant $C(r)$ tends to infinity as $r \to 1$. Moreover, the best known value of $C(\tfrac12)$, for example, is $C(\tfrac12) = 1.6$, whereas if (5) were available for primes $p$ of order $x^{1/2}$ one could essentially take $C(\tfrac12) = 0.5$. Thus in one sense the Brun-Titchmarsh Theorem is much weaker than (5). However (5) is not uniform at all, while (17) is applicable for all $p < x$.

Let us see how (17) may be used. We have

$$\sum_{x^{\phi} < p \le x^{\theta}} \pi^*(x;p) \log p \le \frac{x}{\log x} \sum_{x^{\phi} < p \le x^{\theta}} \frac{\log p}{p}\,C\!\left(\frac{\log p}{\log x}\right).$$

The sum on the righthand side can be estimated by the technique of partial summation, using the Prime Number Theorem. Rather than give the details here, let us merely observe that the density of primes near the number $t$ is roughly 1 in $\log t$, by the Prime Number Theorem. Thus it seems plausible to replace the sum $\sum f(p)$ by a corresponding integral $\int f(t)\,dt/\log t$, and this is precisely what partial summation allows one to do. This leads to

$$\sum_{x^{\phi} < p \le x^{\theta}} \pi^*(x;p) \log p \lesssim \frac{x}{\log x} \int_{x^{\phi}}^{x^{\theta}} \frac{\log t}{t}\,C\!\left(\frac{\log t}{\log x}\right) \frac{dt}{\log t} = x \int_{\phi}^{\theta} C(r)\,dr.$$

(Here we use the notation $A(x) \lesssim B(x)$ to mean that $A(x) \le (1 + o(1))B(x)$ as $x \to \infty$.)

Many different versions of the Brun-Titchmarsh Theorem have been established, with various constants $C(r)$. The simplest form of the theorem (see Halberstam and Richert [6; Theorem 3.7]) gives $C(r) = (1 - r)^{-1} + \varepsilon$ for any $\varepsilon > 0$, and $x \ge x(\varepsilon)$. It follows from this that

$$\sum_{x^{\phi} < p \le x^{\theta}} \pi^*(x;p) \log p \lesssim x \log \frac{1 - \phi}{1 - \theta}.$$

On comparing this with (15) one finds that

$$\sum_{x^{\theta} < p \le x} \pi^*(x;p) \log p \gtrsim \left(\frac{1 - \phi}{2} - \log \frac{1 - \phi}{1 - \theta}\right) x.$$

Since $\phi$ may be taken as close to $\tfrac12$ as required, we conclude that $\Sigma_2 \gg x/\log x$ providing that $\log(1/(2 - 2\theta)) < \tfrac14$. This results in an admissible range for (16) given by $\theta < 1 - \tfrac12 e^{-1/4} = 0.611\ldots$. So we still may not take $\theta > 2/3$, but we are getting closer!
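The admissible range can be recovered numerically: with $\phi = \tfrac12$ one looks for the value of $\theta$ at which $\int_{\phi}^{\theta} C(r)\,dr$ reaches $(1 - \phi)/2$. The sketch below does this by bisection with a midpoint-rule integral; the function names and the step counts are illustrative choices, and the $\varepsilon$ in $C(r)$ is ignored.

```python
import math


def integral(C, a: float, b: float, steps: int = 2000) -> float:
    """Midpoint-rule approximation to the integral of C over [a, b]."""
    h = (b - a) / steps
    return h * sum(C(a + (i + 0.5) * h) for i in range(steps))


def admissible_theta(C, phi: float = 0.5, upper: float = 0.999) -> float:
    """Largest theta with integral of C over [phi, theta] below (1 - phi)/2."""
    target = (1 - phi) / 2
    low, high = phi, upper
    for _ in range(50):
        mid = (low + high) / 2
        if integral(C, phi, mid) < target:
            low = mid
        else:
            high = mid
    return low


if __name__ == "__main__":
    theta = admissible_theta(lambda r: 1.0 / (1.0 - r))
    print(theta, 1 - 0.5 * math.exp(-0.25))
```

The same function accepts any other bound $C(r)$, so the effect of a sharper Brun-Titchmarsh constant on the admissible range can be read off directly.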

Sieve Methods

We must now look briefly at sieves and their application to the Brun-Titchmarsh Theorem. The general sieve problem is as follows. We are given a finite set $A$ of positive integers, and a parameter $z > 1$. We then wish to give upper estimates for the quantity $S(A,z) = \#\{n \in A : (n,P) = 1\}$, where

$$P = \prod_{p \le z} p.$$

Clearly the number of primes in the set $A$ is at most $S(A,z) + z$ for any $z > 1$.

The fundamental idea of the sieve method is to pick out the condition $(n,P) = 1$ using carefully chosen coefficients $\lambda_d$ such that $\lambda_1 = 1$ and

$$\sum_{d \mid n} \lambda_d \ge 0 \tag{18}$$

for all $n \ge 1$. Thus

$$\sum_{d \mid (n,P)} \lambda_d \ge \begin{cases} 1, & (n,P) = 1, \\ 0, & (n,P) > 1, \end{cases}$$

whence

$$S(A,z) \le \sum_{d \mid P} \lambda_d\,\#\{n \in A : n \equiv 0 \pmod d\}.$$

One possibility would be to take $\lambda_d = \mu(d)$, the Möbius function. We would then have equality in (18) for $n \ge 2$. However, as we shall see later, one can make better choices for the $\lambda_d$'s.
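With the choice $\lambda_d = \mu(d)$ the sieve inequality becomes an exact identity (inclusion-exclusion over the primes up to $z$). The sketch below checks this for a small set $A$; the names `sifted_count` and `moebius_sieve_sum` are illustrative. Note that the sum runs over all $2^{\pi(z)}$ squarefree divisors of $P$, which is why the approach is only feasible for very small $z$.

```python
from itertools import combinations
from math import gcd, prod


def primes_up_to(n: int) -> list:
    """Primes not exceeding n, by trial division (n is small here)."""
    return [m for m in range(2, n + 1)
            if all(m % d for d in range(2, int(m ** 0.5) + 1))]


def sifted_count(A, z: int) -> int:
    """S(A, z): members of A coprime to the product of primes up to z."""
    P = prod(primes_up_to(z))
    return sum(1 for n in A if gcd(n, P) == 1)


def moebius_sieve_sum(A, z: int) -> int:
    """Sum over d | P of mu(d) * #{n in A : d | n}."""
    primes = primes_up_to(z)
    total = 0
    for r in range(len(primes) + 1):
        for subset in combinations(primes, r):
            d = prod(subset)  # squarefree divisor of P, with mu(d) = (-1)^r
            total += (-1) ** r * sum(1 for n in A if n % d == 0)
    return total


if __name__ == "__main__":
    A = range(1, 1001)
    z = 13
    print(sifted_count(A, z), moebius_sieve_sum(A, z),
          2 ** len(primes_up_to(z)))
```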

We shall now assume the existence of approximation formulae of the shape

$$\#\{n \in A : n \equiv 0 \pmod d\} = \rho(d)X + R_d.$$

Here $X$ is an approximation to $\#A$, the function $\rho(d)$ is multiplicative (i.e., $\rho(mn) = \rho(m)\rho(n)$ whenever $(m,n) = 1$) and $R_d$ is a remainder term which has to be "small" in an appropriate sense. (Thus, for example, one might take $A$ to be the set of numbers $p - 1$, for primes $p \le x$. Then, according to (5), one should take $X = \mathrm{Li}(x)$ and $\rho(d) = \varphi(d)^{-1}$.) One now has

$$S(A,z) \le X \sum_{d \mid P} \lambda_d \rho(d) + \sum_{d \mid P} \lambda_d R_d. \tag{19}$$

It will turn out that the first term of this upper bound produces a leading term, while the second term (the "remainder sum") is relatively small. This, however, depends on a judicious choice of the $\lambda_d$'s, as we shall see later.

For ease of exposition we shall illustrate the estimates not with $\pi^*(x;p)$, but with the simpler function $\pi(x;p,1)$. Consequently we shall take $A = \{n \le x : n \equiv 1 \pmod p\}$. We choose $X = x/p$, so that $\#A = X + O(1)$. If $p \mid d$ then $\{n \in A : n \equiv 0 \pmod d\} = \emptyset$, and so we take $\rho(d) = 0$, $R_d = 0$ in this case. When $p \nmid d$ the conditions $n \equiv 1 \pmod p$, $n \equiv 0 \pmod d$ define a single residue class modulo $pd$, by the Chinese Remainder Theorem. In this case we therefore have

$$\#\{n \in A : n \equiv 0 \pmod d\} = \frac{x}{pd} + O(1) = \frac{X}{d} + O(1).$$

Thus we define $\rho(d) = d^{-1}$ for $p \nmid d$ and $\rho(d) = 0$ otherwise, and we have $R_d = O(1)$ in both cases. Let us now explore the consequences of the choice $\lambda_d = \mu(d)$. The remainder sum $\sum \lambda_d R_d$ will be bounded by the number of divisors $d$ of $P$, which is $2^{\pi(z)}$. (Here $\pi(z)$ is the number of primes not exceeding $z$.) If we are to obtain even the estimate $\pi(x;p,1) \ll x/p$ we shall need to have $2^{\pi(z)} \ll X$. However, since $\pi(z) \sim z/\log z$, this entails $z \ll (\log X)(\log\log X)$. Unfortunately, it turns out (we shall not prove it here) that the "main term" $X \sum \lambda_d \rho(d)$ is of order $X(\log z)^{-1}$. Thus, with the choice $\lambda_d = \mu(d)$, the estimate (19) can at best be $O(X(\log\log X)^{-1})$. However the Brun-Titchmarsh Theorem requires a bound $O(X(\log X)^{-1})$ for $S(A,z)$, and so a better set of coefficients $\lambda_d$ is desirable.

Here we see an example of the underlying problem of sieve methods: to choose the $\lambda_d$'s so as to make the main term as small as possible, while still having the remainder sum under control. For a wide class of sets $A$, including the one described above, there is a construction due to Rosser which gives essentially an optimal result. One chooses a parameter $D$ and defines a set $S_D$ in a rather intricate manner, but such that $d \le D$ for every $d \in S_D$. Rosser then takes $\lambda_d = \mu(d)$ for $d \in S_D$ and $\lambda_d = 0$ otherwise. The construction of $S_D$ is such that (18) holds for each $n \ge 1$. Now, if $R_d = O(1)$ for all $d$, this choice of $\lambda_d$ will make the remainder sum $O(D)$. Moreover, for the particular function $\rho(d)$ introduced in connection with $\pi(x;p,1)$, it turns out that

$$X \sum_{d \mid P} \lambda_d \rho(d) \sim 2X\,\frac{p}{p - 1}\,(\log D)^{-1}, \qquad \text{as } D \to \infty, \tag{20}$$

uniformly for $D^{1/3} \le z \le D$.

Up to now, where we have omitted details they have genuinely been minor matters. However, the assertions above concerning the Rosser sieve are quite different. They represent very considerable quantities of technically difficult mathematics, for which the interested reader should consult Iwaniec [7]. Fortunately, the outcome, namely the estimate $O(D)$ for the remainder sum, together with the asymptotic formula (20) for the main term, is easy to apply. If we select $z = D = X(\log X)^{-2}$ we immediately have

$$\pi(x;p,1) \le S(A,z) + z \lesssim 2X\,\frac{p}{p - 1}\,(\log D)^{-1} + O(D) + z \sim 2X\,\frac{p}{p - 1}\,(\log X)^{-1},$$

as $x/p = X$ tends to infinity. Indeed if we allow both $x/p$ and $p$ to tend to infinity we have $\pi(x;p,1) \lesssim 2X(\log X)^{-1}$. Pursuing a similar analysis for the function $\pi^*(x;p)$, one would obtain the same bound, but without the factor 2. Consequently one finds that (17) holds with $C(r) = (1 - r)^{-1} + \varepsilon$, as described earlier, at least when $\varepsilon \le r \le 1 - \varepsilon$.

The Averaged Brun-Titchmarsh Theorem

There are many ways of making improvements to the Brun-Titchmarsh Theorem. The constant $C(r)$ in (17) arises from the main term in the sieve bound (19). Since this main term is evaluated by means of (20), the only way that we can reduce $C(r)$ is by increasing $D$. Thus one has to use a non-trivial bound for the remainder sum in (19), to show that it is still smaller than the main term, even though $D$ may be larger than before.

One important way of doing this, due to Hooley, is to use (19) for the set

$$A = \{n \le x : n \equiv 1 \pmod p\} \quad (= A_p, \text{ say})$$

and to sum over the primes $p$ from an interval $Q < p \le 2Q$. In this way one bounds

$$\sum_{Q < p \le 2Q} \pi(x;p,1)$$

rather than the individual terms $\pi(x;p,1)$. However this is quite enough for our treatment of $\Sigma_2$ (except, of course, that one wants $\pi^*(x;p)$ in place of $\pi(x;p,1)$). The parameter $X$, the function $\rho(d)$ and the remainders $R_d$, which we described in connection with the set $A = A_p$, depended on the prime $p$; we shall denote them by $X_p$, $\rho(d;p)$ and $R_{d,p}$ respectively. On the other hand we shall take $z$ and $D$ to depend on $Q$ but not on the individual values of $p$. In particular $\lambda_d$ will be independent of $p$. One then has

$$\sum_{Q < p \le 2Q} S(A_p, z) \le \sum_{Q < p \le 2Q} X_p \sum_{d \mid P} \lambda_d \rho(d;p) + \sum_{Q < p \le 2Q} \sum_{d \mid P} \lambda_d R_{d,p}.$$

We may now evaluate the first double sum on the right by using (20) as before. The result will have order of magnitude $x(\log x)^{-2}$, since $\log Q$ and $\log X$ will both be of order $\log x$ in our application. (Here we use the facts that $X_p = O(x/Q)$ and that the number of primes $p$ occurring is $O(Q/\log Q)$.) Thus one wants the double remainder sum $\sum\sum \lambda_d R_{d,p}$ to be $o(x(\log x)^{-2})$ for as large a value of $D$ as possible.

If one uses only the estimate $R_{d,p} = O(1)$ then the double sum is $O(Q/\log Q \cdot D)$ since $\lambda_d = 0$ for $d > D$. This permits one to use $D = (x/Q)(\log x)^{-2}$, which is essentially the same value as before. However a superior treatment of the remainder sum may be given by using the variable $p$ non-trivially. On recalling the definitions of $X_p$, $\rho(d;p)$ and $R_{d,p}$ we have

$$\sum_{Q < p \le 2Q} R_{d,p} = \sum_{Q < p \le 2Q} \bigl(\#\{n \in A_p : n \equiv 0 \pmod d\} - \rho(d;p)X_p\bigr)$$

$$= \sum_{\substack{n \le x \\ d \mid n}} \#\{p : Q < p \le 2Q,\ n \equiv 1 \pmod p\} - x d^{-1} \sum_{\substack{Q < p \le 2Q \\ p \nmid d}} p^{-1}$$

$$= N(x + 1; d, -1) - \omega(d)x,$$

say, where

$$N(y;d,a) = \sum_{\substack{m \le y \\ m \equiv a \pmod d}} c_m, \tag{21}$$

with

$$c_m = \#\{p : Q < p \le 2Q,\ m \equiv 0 \pmod p\}$$

and

$$\omega(d) = d^{-1} \sum_{\substack{Q < p \le 2Q \\ p \nmid d}} p^{-1}. \tag{22}$$

Hence we have

$$\sum_{Q < p \le 2Q} \sum_{d \mid P} \lambda_d R_{d,p} \le \sum_{d \le D} \bigl| N(x + 1; d, -1) - \omega(d)x \bigr|. \tag{23}$$

The sum on the right has similarities to that in the Bombieri-Vinogradov Theorem (9). We have $c_m$ in place of the characteristic function of the primes, $N(y;d,a)$ is the analogue of $\pi(y;d,a)$, and $\omega(d)$ corresponds to $1/\varphi(d)$. It turns out that, for a certain class of coefficients $c_m$, and in particular for the $c_m$ defined by (22), the estimate corresponding to (9) does indeed hold, and with the same range $\phi < \tfrac12$. Such an estimate shows that the double remainder sum (23) is $O(x(\log x)^{-10}) = o(x(\log x)^{-2})$ for $D = x^{1/2 - \varepsilon}$, with any fixed $\varepsilon > 0$. Previously we were using $D = X(\log X)^{-2}$ with $X = x/p$. Thus the new method permits a larger value of $D$ whenever $p \ge x^{1/2 + \varepsilon}$. In fact, we have improved the result by a factor which is essentially $(\log x/p)/(\log x^{1/2}) = 2(1 - r)$ for $p = x^r$. Thus (17) holds on average, with $C(r) = 2 + \varepsilon$, for any $\varepsilon > 0$. This leads to an admissible range $\theta < 5/8 = 0.625$ for (16), by the same argument as before. Slowly we get closer to $\theta = 2/3$!

Further Improvements

There are several more ingredients which have to be incorporated into the sieve method in order to establish (16) for a value $\theta > 2/3$. They are all extremely complicated technically, and it would be out of place to describe them in detail. The principal idea is to use generalizations of the Bombieri-Vinogradov Theorem involving functions such as the sum $N(y;d,a)$ given by (21). We mentioned earlier, in connection with the estimates (9) and (10), that it would be highly desirable to extend the admissible range $\theta < \tfrac{1}{2}$. This has not yet proved possible for the Bombieri-Vinogradov Theorem (9) itself. However for the particular function $N(y;d,a)$ given by (21) and (22), one may indeed establish a satisfactory estimate for the sum (23) for certain $D > x^{1/2}$, at least for suitable values of $Q$. To see this one has only to look at the extreme case in which $Q = 1$. Then $c_m = 0$ or $1$ according as $m$ is odd or even, since $p = 2$ is the only relevant prime. Thus $N(x+1;d,-1)$ is $0$ for even $d$, and $x/(2d) + O(1)$ for odd $d$. However $\omega(d)$ is $0$ or $1/(2d)$ in these two cases, and one concludes that

$$\sum_{d\le D} \bigl|N(x+1;d,-1) - \omega(d)x\bigr| \ll D \ll x(\log x)^{-10}$$

for $D \le x^{1-\varepsilon}$. For the more relevant values of $Q$, namely those between $x^{1/2}$ and $x^{2/3}$, a much more complicated argument is needed, and the resulting range for $D$ is not as good.
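The extreme case $Q = 1$ is simple enough to check by machine. The following Python sketch is an illustration only: it assumes that $m$ runs over the positive integers in (21), and the names `c`, `N`, `omega` and `remainder`, as well as the sample values of $x$ and $D$, are ours rather than part of the argument. It computes each term $N(x+1;d,-1) - \omega(d)x$ exactly and confirms that every term is bounded, so that the sum over $d \le D$ is $O(D)$.

```python
from fractions import Fraction

def c(m):
    # With Q = 1 the only prime in (Q, 2Q] is p = 2, so c_m = 1 exactly when m is even.
    return 1 if m % 2 == 0 else 0

def N(y, d, a):
    # N(y; d, a): sum of c_m over 1 <= m <= y with m congruent to a modulo d.
    # Assumption: m runs over positive integers only.
    return sum(c(m) for m in range(1, y + 1) if (m - a) % d == 0)

def omega(d):
    # omega(d) = (1/d) * (1/2) when 2 does not divide d, and 0 otherwise.
    return Fraction(1, 2 * d) if d % 2 == 1 else Fraction(0)

def remainder(x, d):
    # The term N(x + 1; d, -1) - omega(d) * x that is summed over d.
    return N(x + 1, d, -1) - omega(d) * x

x, D = 10_000, 200  # illustrative values only
terms = [abs(remainder(x, d)) for d in range(1, D + 1)]
print("largest single term:", float(max(terms)))
print("sum over d <= D:", float(sum(terms)))
```

Exact rational arithmetic is used so that the comparison with $\omega(d)x$ involves no rounding. For even $d$ both quantities vanish, which the sketch reproduces.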

It turns out that the theory of exponential sums can be brought to bear on the problem, and in particular that one requires information about the Kloosterman sum, defined by

$$S(m,n;c) = \sum_{\substack{k=1 \\ (k,c)=1}}^{c} \exp\bigl(2\pi i(mk + n\bar{k})/c\bigr).$$

(Here $\bar{k}$ is a solution of $k\bar{k} \equiv 1 \pmod c$.) This sum was considered by Weil. As a consequence of his proof of the "Riemann Hypothesis" for curves over finite fields, he showed that $S(m,n;c) \ll c^{1/2+\varepsilon}$ (for any $\varepsilon > 0$) providing that $(m,n,c) = 1$. Thus one has a sum of roughly $c$ terms, all of unit modulus, which cancels to the extent that the sum is $O(c^{1/2+\varepsilon})$. This cancellation effect feeds back to give superior estimates for the double remainder sum in the sieve problem. In this way Iwaniec was able to establish a number of improved versions of the averaged Brun-Titchmarsh Theorem, leading to the admissible range $\theta < 0.638$ for (16). In fact Weil's estimates for exponential sums have many applications in diverse problems in analytic number theory, a situation which owes much to the work of Hooley. By comparison, there are, as yet, rather few applications of Deligne's bounds for multiple exponential sums.
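The cancellation in the Kloosterman sum is easy to observe numerically. The sketch below is an illustration only: the function name `kloosterman` and the sample moduli are our own choices. It evaluates $S(m,n;c)$ directly from the definition and compares $|S(1,1;p)|$ with $2\sqrt{p}$ for a few primes $p$; the inequality $|S(m,n;p)| \le 2\sqrt{p}$ for a prime $p$ not dividing $mn$ is the standard sharp form of Weil's bound, which is not stated in this form in the text above.

```python
import cmath
from math import gcd, sqrt

def kloosterman(m, n, c):
    """S(m, n; c): sum of exp(2*pi*i*(m*k + n*kbar)/c) over 1 <= k <= c with gcd(k, c) = 1."""
    total = 0j
    for k in range(1, c + 1):
        if gcd(k, c) == 1:
            # kbar is the inverse of k modulo c (three-argument pow, Python 3.8+).
            kbar = pow(k, -1, c) if c > 1 else 0
            total += cmath.exp(2j * cmath.pi * (m * k + n * kbar) / c)
    return total

for p in (101, 499, 997):  # sample prime moduli, chosen arbitrarily
    s = kloosterman(1, 1, p)
    print(p, "S =", round(s.real, 4), " |S|/sqrt(p) =", round(abs(s) / sqrt(p), 4))
```

The sum is real, since replacing $k$ by $-k$ conjugates each term, so only the real part is printed. The trivial bound would allow $|S|$ to be as large as $p - 1$, whereas the printed ratio $|S|/\sqrt{p}$ never exceeds 2.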

In the context of the generalized Bombieri-Vinogradov Theorem, the Kloosterman sums $S(m,n;c)$ occur, or can be made to occur, as certain averages over $m$, $n$ and $c$. It is natural to ask whether any saving can be obtained from cancellations in these averages. For example the Weil estimate alone yields

$$\sum_{c\le C} S(1,1;c) \ll C^{3/2+\varepsilon},$$

but one might hope for a sharper estimate for the sum on the left, since the terms $S(1,1;c)$ have varying signs. Indeed it turns out that the sum above is $O(C^{7/6+\varepsilon})$, as was shown by Kuznietsov. This result has been generalized in many ways by Deshouillers and Iwaniec, so as to include various kinds of averages over the parameters $m$ and $n$ as well as $c$. As before there are consequent improvements in the averaged Brun-Titchmarsh Theorem, which now suffice to yield (16) for $\theta < 0.6563$.
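The varying signs of the terms $S(1,1;c)$ can be seen in a small experiment. The following sketch is a numerical illustration under our own choice of names (`kloosterman_11`, `partial_sum`) and sample values of $C$. It tabulates the partial sums $\sum_{c \le C} S(1,1;c)$ alongside $C^{3/2}$ and $C^{7/6}$ for comparison. Such small values of $C$ cannot, of course, confirm an asymptotic estimate; they only show the sums staying well below the size that termwise bounds would permit.

```python
import cmath
from math import gcd

def kloosterman_11(c):
    """Real value of S(1, 1; c), computed directly from the definition."""
    total = 0j
    for k in range(1, c + 1):
        if gcd(k, c) == 1:
            # kbar is the inverse of k modulo c (three-argument pow, Python 3.8+).
            kbar = pow(k, -1, c) if c > 1 else 0
            total += cmath.exp(2j * cmath.pi * (k + kbar) / c)
    return total.real  # the imaginary part vanishes up to rounding error

def partial_sum(C):
    """Sum of S(1, 1; c) over 1 <= c <= C."""
    return sum(kloosterman_11(c) for c in range(1, C + 1))

for C in (50, 100, 200):  # sample cut-offs, chosen arbitrarily
    print(C, round(partial_sum(C), 3), round(C ** 1.5, 1), round(C ** (7 / 6), 1))
```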

Just as Weil's estimate for the Kloosterman sum led to improvements of many results in analytic number theory, so also have these bounds for averages for Kloosterman sums had important repercussions. This field, nicknamed "Kloostermania" on account of the way it has so rapidly been applied to such a large range of problems, is perhaps the most exciting recent development in analytic number theory. Nonetheless, it must be said that the technicalities involved have proved too forbidding for all but a few pioneers.

As far as the techniques for estimating these sums of Kloosterman sums are concerned, let two comments suffice. First, $S(m,n;c)$ arises as a coefficient in certain modular functions $f(x + iy)$. (These functions are defined on the upper half plane $\{z \in \mathbb{C} : \operatorname{Im}(z) > 0\}$ and are invariant under $\mathrm{PSL}_2(\mathbb{Z})$.) Secondly, one can extract information about the coefficients of such functions by studying their eigenfunction expansions with respect to the non-Euclidean Laplacian operator $-y^2(\partial^2/\partial x^2 + \partial^2/\partial y^2)$, this operator also being invariant under $\mathrm{PSL}_2(\mathbb{Z})$.

Conclusion

We have now seen how a generalization of Sophie Germain's criterion leads to an upper bound (7) for $\Sigma_1$ while sieve methods provide a contrasting estimate (16) for $\Sigma_2$. We have met a variety of techniques which improve the range of validity of (16). However, in order to find primes $p \notin S$ we need to show that $\Sigma_1 < \Sigma_2$ and this requires a value of $\theta > 2/3$. In fact almost

