Matthijs J. Warrens

On Irrational and Transcendental Numbers

Bachelor thesis, 9 August 2012

Supervisor: Dr. Jan-Hendrik Evertse

Mathematical Institute, Leiden University

Contents

1 Introduction
    1.1 History
    1.2 The Riemann zeta function
    1.3 Outline

2 Irrational numbers
    2.1 A theorem for the numbers e and π
    2.2 Auxiliary results for the irrationality of ζ(3)
    2.3 The irrationality of ζ(3)

3 The Hermite-Lindemann approach
    3.1 Hermite’s identity
    3.2 The number e
    3.3 Algebraic integers and the house of an algebraic number
    3.4 The number π
    3.5 The Lindemann-Weierstrass theorem

4 The Gel’fond-Schneider theorem
    4.1 Some auxiliary results
    4.2 The case α, β ∈ R, α > 0
    4.3 The general case

1 Introduction

1.1 History

Number theory is the branch of mathematics that is devoted to the study of integers, subsets of the integers like the prime numbers, and objects made out of the integers. An example of the latter are the rational numbers, the numbers that are fractions, ratios of integers. Examples are 1/2 and 22/7. The irrational numbers are those numbers that cannot be represented by fractions of integers. An example of an irrational number that was already known in Ancient Greece is √2. The irrationality of e, the base of the natural logarithms, was established by Euler in 1744. The irrationality of π, the ratio of the circumference to the diameter of a circle, was established by Lambert in 1761 (see Baker 1975).

A generalization of the integers are the algebraic numbers. A complex number α is called algebraic if there is a polynomial f(x) ≠ 0 with integer coefficients such that f(α) = 0. If no such polynomial exists α is called transcendental. The numbers √2 and i are algebraic numbers since they are zeros of the polynomials $x^2 - 2$ and $x^2 + 1$ respectively. The first to prove the existence of transcendental numbers was Liouville in 1844, using continued fractions. The so-called Liouville constant $\sum_{n=1}^{\infty} 10^{-n!} = 0.1100010\ldots$ was the first decimal example of a transcendental number (see Burger, Tubbs 2004).

In 1873 Hermite proved that e is transcendental. This was the first number to be proved transcendental without having been specifically constructed for the purpose. Building on Hermite’s result, Lindemann showed that π is transcendental in 1882. He thereby solved the ancient Greek problem of squaring the circle. The Greeks had sought to construct, with ruler and compass, a square with area equal to that of a given circle. If a unit length is prescribed this amounts to constructing two points in the plane at a distance √π apart. In 1837 Wantzel showed that the constructible numbers are a subset of the algebraic numbers. Since π is transcendental, so is √π, and hence the construction is impossible. For a historical overview, see Burger and Tubbs (2004), and Shidlovskii (1989).

In 1874 Cantor showed that the set of algebraic numbers is countably infinite. This follows from the fact that the polynomials with integer coefficients form a countable set and that each polynomial has a finite number of zeros. In the same paper Cantor also showed that the set of real numbers is uncountably infinite. Since the algebraic numbers are countable while the real numbers are uncountable, it follows that most real numbers are in fact transcendental (see Dunham 1990).

At the Second International Congress of Mathematicians in 1900, Hilbert posed a set of 23 problems “the study of which is likely to stimulate the further development of our science”. In the 7th of these problems he conjectured that if α and β are algebraic numbers, α ≠ 0, 1 and β irrational, then $\alpha^{\beta}$ is transcendental. In 1934 Gel’fond and Schneider independently, and using different methods, obtained a proof of Hilbert’s conjecture. It follows from the Gel’fond-Schneider theorem that the numbers $2^{\sqrt{2}}$ and $e^{\pi}$ are transcendental (see Shidlovskii 1989).

Since irrational and transcendental numbers are defined by what they are not, it may be difficult, despite their abundance, to show that a specific number is irrational or transcendental. For example, although e and π are irrational, it is unknown whether e + π, e − π, eπ, $2^e$, $\pi^e$ or $\pi^{\sqrt{2}}$ are irrational.

1.2 The Riemann zeta function

The Riemann zeta function is defined as

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \frac{1}{1^s} + \frac{1}{2^s} + \frac{1}{3^s} + \cdots \]

for a complex number s with Re s > 1. For any positive even integer 2n we have the expression

\[ \zeta(2n) = (-1)^{n+1} \frac{B_{2n} (2\pi)^{2n}}{2 (2n)!}, \]

where $B_{2n}$ is the 2n-th Bernoulli number (see Abramowitz, Stegun 1970, chapter 23). The first few Bernoulli numbers are $B_0 = 1$, $B_1 = -1/2$, $B_2 = 1/6$, $B_4 = -1/30$, $B_6 = 1/42$ and $B_8 = -1/30$. The Bernoulli numbers of odd index $B_3, B_5, \ldots$ are zero. The expression for ζ(2n) is due to Euler (Dunham 1990). It is unknown whether there is such a simple expression for the values of ζ at odd positive integers.
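As a numerical illustration of Euler's expression, the following sketch computes the Bernoulli numbers exactly and compares the formula for ζ(2n) with a truncated series. The recurrence used for the Bernoulli numbers and the helper names (`bernoulli`, `zeta_even`, `zeta_partial`) are assumptions of this example and are not taken from the thesis.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli(m):
    """Return [B_0, ..., B_m] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(0)] * (m + 1)
    B[0] = Fraction(1)
    for k in range(1, m + 1):
        # Standard recurrence: sum_{j=0}^{k} C(k+1, j) * B_j = 0 for k >= 1.
        B[k] = -sum(comb(k + 1, j) * B[j] for j in range(k)) / (k + 1)
    return B

def zeta_even(n, B):
    """Evaluate zeta(2n) with Euler's expression in terms of B_{2n}."""
    return (-1) ** (n + 1) * float(B[2 * n]) * (2 * pi) ** (2 * n) / (2 * factorial(2 * n))

def zeta_partial(s, terms=20000):
    """Truncated defining series of zeta(s), for comparison only."""
    return sum(1.0 / k ** s for k in range(1, terms + 1))

B = bernoulli(8)
```

For instance, `zeta_even(1, B)` can be compared with π²/6, and `zeta_even(2, B)` with `zeta_partial(4)`.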

Since π is a transcendental number and the Bernoulli numbers $B_{2n}$ are non-zero rational numbers, it follows from the above expression that ζ(2n) is transcendental for every positive integer n. In 1979 Apéry showed that the number ζ(3) is irrational. It is unknown if ζ(3) is also transcendental. Furthermore, it is unknown whether ζ(5), ζ(7), ζ(9) and ζ(11) are all irrational, although Zudilin (2001) showed that at least one of them is irrational. Moreover, Rivoal (2000) showed that infinitely many of the numbers ζ(2n + 1), where 2n + 1 is an odd integer, are irrational.

1.3 Outline

In this bachelor thesis we consider the proofs of some results on irrational and transcendental numbers. The thesis is organized as follows. In Section 2 we consider the irrationality of e, π and ζ(3). In Section 3 we prove the transcendence of the numbers e and π. We also consider the Lindemann-Weierstrass theorem in this section. In Section 4 we discuss the Gel’fond-Schneider theorem.

2 Irrational numbers

2.1 A theorem for the numbers e and π

In this subsection all integers are rational integers. Let $c \in \mathbb{R}_{>0}$. Suppose f(x) is a function that is continuous on [0, c] and positive on (0, c). Furthermore, suppose there is associated with f an infinite sequence $\{f_i\}_{i=1}^{\infty}$ of anti-derivatives that are integer-valued at 0 and c and satisfy $f_1' = f$ and $f_i' = f_{i-1}$ for i ≥ 2. Theorem 1 shows that if such an f exists for c, then the number c is irrational. The proof comes from Parks (1986). It is an extension of a simple proof by Niven (1947) that π is irrational.

Theorem 1. Let c and f(x) be as above. Then c is irrational.

Proof: Suppose c is rational. Since c > 0, there are positive integers m and n such that c = m/n.

First, let $P_c$ be the set of polynomials p(x) ∈ R[x] such that p(x) and all its derivatives are integer-valued at 0 and c. The set $P_c$ is closed under addition. Furthermore, repeated application of the product rule shows that $P_c$ is also closed under multiplication. Consider

\[ p_0(x) = m - 2nx. \]

Since $p_0(0) = m$, $p_0(c) = -m$ and $p_0'(x) = -2n$ are all integers, we have $p_0(x) \in P_c$. Next, let $k \in \mathbb{Z}_{\geq 1}$ and let

\[ p_k(x) = \frac{x^k (m - nx)^k}{k!}. \]

Using induction on k we will show that $p_k(x) \in P_c$. For $p_1(x) = x(m - nx)$ we have $p_1(0) = p_1(c) = 0$ and $p_1'(x) = p_0(x)$. Hence, $p_1(x) \in P_c$. Next, suppose $p_\ell(x) \in P_c$ and consider

\[ p_{\ell+1}(x) = \frac{x^{\ell+1} (m - nx)^{\ell+1}}{(\ell+1)!}. \]

We have $p_{\ell+1}(0) = p_{\ell+1}(c) = 0$. Furthermore, using the chain rule we have

\[ p_{\ell+1}'(x) = \frac{x^{\ell} (m - nx)^{\ell}}{\ell!} (m - 2nx) = p_\ell(x)\, p_0(x). \]

Since $P_c$ is closed under multiplication, and since $p_0(x)$ and $p_\ell(x)$ are in $P_c$, it follows that $p_{\ell+1}(x) \in P_c$.
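The membership of the polynomials x^k (m − nx)^k / k! in the set of polynomials that are, together with all their derivatives, integer-valued at 0 and c = m/n can be checked by exact arithmetic for small cases. The sketch below is only an illustration; the sample values m = 22, n = 7 and all function names are assumptions of this example.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (ascending powers)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def derivative(a):
    return [i * a[i] for i in range(1, len(a))]

def evaluate(a, x):
    return sum(coef * x ** i for i, coef in enumerate(a))

def p(k, m, n):
    """Coefficients of x^k (m - n x)^k / k!."""
    base = [Fraction(0), Fraction(m), Fraction(-n)]  # x (m - n x) = m x - n x^2
    out = [Fraction(1)]
    for _ in range(k):
        out = poly_mul(out, base)
    return [coef / factorial(k) for coef in out]

def in_Pc(a, c):
    """True if the polynomial and all its derivatives are integers at 0 and at c."""
    while a:
        if evaluate(a, Fraction(0)).denominator != 1 or evaluate(a, c).denominator != 1:
            return False
        a = derivative(a)
    return True

m, n = 22, 7                      # hypothetical sample values with c = m/n
c = Fraction(m, n)
results = [in_Pc(p(k, m, n), c) for k in range(1, 6)]
```

By contrast, a polynomial such as x²/2 with c = 1 fails the test, since its value at c is 1/2.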

Next, since f(x) is continuous on [0, c], it attains a maximum on [0, c]. Let M denote this maximum. Furthermore, since $p_k(x)$ is a polynomial for all k it is continuous and differentiable on [0, c]. Hence, $p_k(x)$ attains a maximum on [0, c], either in an endpoint or in the interior (0, c) where $p_k'(x) = 0$. The derivative of $p_k(x)$ is $p_{k-1}(x) p_0(x)$. Since $p_{k-1}(x)$ is only zero at x = 0 and x = c, we must have $p_0(x) = 0$, or x = m/2n, in order to have $p_k'(x) = 0$. At x = m/2n we have

\[ p_k\!\left(\frac{m}{2n}\right) = \frac{1}{k!} \left(\frac{m^2}{4n}\right)^{k}. \]

Replacing both f(x) and $p_k(x)$ by their maxima, we obtain

\[ \int_0^c f(x) p_k(x)\,dx \leq \frac{M}{k!} \left(\frac{m^2}{4n}\right)^{k} \int_0^c dx = \frac{Mc}{k!} \left(\frac{m^2}{4n}\right)^{k}. \]

The expression on the right-hand side of the inequality goes to 0 when k → ∞. Hence, for sufficiently large k we have the strict inequality

\[ \int_0^c f(x) p_k(x)\,dx < 1. \]

On the other hand, using integration by parts we obtain

\[ \int_0^c f(x) p_k(x)\,dx = f_1(x) p_k(x) \Big|_{x=0}^{c} - \int_0^c f_1(x) p_k'(x)\,dx. \]

The first term on the right-hand side is an integer by hypothesis. By repeating integration by parts a number of times equal to the degree of $p_k(x)$, repeatedly integrating the ‘f(x)’ part while differentiating the ‘$p_k(x)$’ part, we obtain a sum of integers. Hence, the integral $\int_0^c f(x) p_k(x)\,dx$ is an integer for all k.

Since $\int_0^c f(x) p_k(x)\,dx$ is an integer, since f(x) is positive on (0, c), and since $p_k(x)$ is positive at c/2 and equal to zero only at 0 and c for all k, it follows that $\int_0^c f(x) p_k(x)\,dx$ is a positive integer, that is,

\[ \int_0^c f(x) p_k(x)\,dx \geq 1, \]

for all k. Hence, we have a contradiction, and we conclude that c is irrational. □

Corollary 2. π is irrational.

Proof: π is a positive real number, and sin(x) is continuous on [0, π] and positive on (0, π). As a sequence of anti-derivatives of sin x we may take − cos x, − sin x, cos x, sin x, etc., which all have values from {−1, 0, 1} at x = 0 and x = π. □

Corollary 3. Let $a \in \mathbb{R}_{>0}$, a ≠ 1. If log a is rational, then a is irrational.

Proof: Since 1/a is rational if and only if a is rational, and log(1/a) = − log(a) is rational if and only if log a is rational, it suffices to prove the corollary for a > 1.

Suppose a is rational. Then there are positive integers m and n such that a = m/n. Since a > 1, we have log a > 0. Let c = log a and apply Theorem 1 with $f(x) = n e^x$. Then we may take the anti-derivatives of f all equal to f. We have f(0) = n and

\[ f(c) = f\!\left(\log \frac{m}{n}\right) = m, \]

which are both integers. It follows from Theorem 1 that log a is irrational. This contradicts the hypothesis. Hence, we conclude that a is irrational. □

Corollary 4. e is irrational.

Proof: e is a positive real number, e ≠ 1. Since log e = 1 is a rational number, it follows from Corollary 3 that e is irrational. □

2.2 Auxiliary results for the irrationality of ζ(3)

In the next subsection we show that

\[ \zeta(3) = \sum_{n=1}^{\infty} \frac{1}{n^3} = 1 + \frac{1}{8} + \frac{1}{27} + \cdots \]

is irrational. We give a proof by Beukers (1979). We first prove some lemmas.

Lemma 5. If f(x) ∈ Z[x], then for any $j \in \mathbb{Z}_{\geq 0}$ all the coefficients of the j-th derivative $f^{(j)}(x)$ are divisible by j!.

Proof: Since differentiation is a linear operation, it suffices to prove the lemma for the polynomial $x^k$ for k ≥ 0. The case j = 0 is trivial. The j-th derivative is 0 if j > k, and if j ∈ {1, 2, . . . , k} then it is equal to

\[ \frac{k!}{(k-j)!} x^{k-j} = j! \binom{k}{j} x^{k-j}, \]

in which $\binom{k}{j}$ is an integer. □
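The divisibility statement of Lemma 5 is easy to test on a concrete polynomial. The following sketch differentiates an integer polynomial repeatedly and checks that every coefficient of the j-th derivative is a multiple of j!; the sample coefficients are arbitrary and chosen only for illustration.

```python
from math import factorial

def derivative(coeffs):
    """Derivative of a polynomial given by ascending integer coefficients."""
    return [i * coeffs[i] for i in range(1, len(coeffs))]

def jth_derivative(coeffs, j):
    for _ in range(j):
        coeffs = derivative(coeffs)
    return coeffs

f = [3, -1, 4, 1, -5, 9, 2, -6]  # arbitrary sample polynomial in Z[x]
checks = [
    all(c % factorial(j) == 0 for c in jth_derivative(f, j))
    for j in range(len(f) + 1)
]
```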

Lemma 6. Let ε > 0. Then there is an $N_\varepsilon$ such that if $n \geq N_\varepsilon$, then

\[ d_n := \operatorname{lcm}(1, 2, \ldots, n) < e^{(1+\varepsilon)n}. \]

Proof: Let p be a prime number and r a positive integer. If $p^r$ divides a number in the set {1, 2, . . . , n}, then $p^r \leq n$, and we have r ≤ log n/ log p. On the other hand, $p^{[\log n/\log p]}$ does divide one such number, namely itself. Thus,

\[ d_n = \prod_{p \leq n} p^{[\log n/\log p]}. \]

Let π(n) be the prime-counting function that gives the number of primes less than or equal to n. The prime number theorem states that

\[ \lim_{n \to \infty} \frac{\pi(n) \log n}{n} = 1. \]

Hence, for n sufficiently large, we have

\[ d_n = \prod_{p \leq n} p^{[\log n/\log p]} \leq \exp\Big( \sum_{p \leq n} \log n \Big) = e^{\pi(n) \log n} < e^{(1+\varepsilon)n}. \]

□
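To get a feeling for the growth rate in Lemma 6, one can compute lcm(1, . . . , n) directly and look at the ratio log(d_n)/n. The sketch below does this for a few values of n; the chosen values of n and the function name `d` are assumptions of this example.

```python
from math import gcd, log

def d(n):
    """Least common multiple of 1, 2, ..., n."""
    out = 1
    for k in range(1, n + 1):
        out = out * k // gcd(out, k)
    return out

# log(d_n) / n for a few sample values of n; Lemma 6 says that this ratio
# eventually stays below 1 + epsilon for every epsilon > 0.
ratios = {n: log(d(n)) / n for n in (10, 100, 1000, 2000)}
```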

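Lemma 7. Let $r, s \in \mathbb{Z}_{\geq 0}$. If r > s, then

\[ \int_0^1 \int_0^1 \frac{-\log(xy)}{1 - xy}\, x^r y^s \,dx\,dy \tag{1} \]

is a rational number whose denominator when reduced divides $d_r^3$. If r = s we have

\[ \int_0^1 \int_0^1 \frac{-\log(xy)}{1 - xy}\, x^r y^s \,dx\,dy = 2\left( \zeta(3) - \sum_{k=1}^{r} \frac{1}{k^3} \right). \]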

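Proof: Using the identity

\[ \frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k = 1 + x + x^2 + \cdots, \qquad |x| < 1, \]

we obtain that (1) is equal to

\[ -\int_0^1 \int_0^1 \sum_{k=0}^{\infty} \log(xy)\, x^{r+k} y^{s+k} \,dx\,dy. \tag{2} \]

Since x, y ∈ [0, 1], the series

\[ \sum_{k=0}^{\infty} x^{r+k} y^{s+k} \]

is convergent, and it follows that

\[ \int_0^1 \sum_{k=0}^{\infty} \left| \log(xy)\, x^{r+k} y^{s+k} \right| dx < \infty. \]

Hence, applying Fubini’s theorem we obtain that (2) is equal to

\[ -\int_0^1 \left( \sum_{k=0}^{\infty} \int_0^1 \log(xy)\, x^{r+k} y^{s+k} \,dx \right) dy. \tag{3} \]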

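Let k ≥ 0. Using integration by parts we obtain

\begin{align*}
\int_0^1 (\log x)\, x^{r+k} \,dx &= \lim_{\varepsilon \to 0} \int_\varepsilon^1 (\log x)\, x^{r+k} \,dx \\
&= \lim_{\varepsilon \to 0} \left. \frac{(\log x)\, x^{r+k+1}}{r+k+1} \right|_{x=\varepsilon}^{1} - \lim_{\varepsilon \to 0} \int_\varepsilon^1 \frac{x^{r+k}}{r+k+1} \,dx \\
&= 0 - \lim_{\varepsilon \to 0} \left. \frac{x^{r+k+1}}{(r+k+1)^2} \right|_{x=\varepsilon}^{1} = \frac{-1}{(r+k+1)^2}.
\end{align*}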

Using this identity and log(xy) = log x + log y in (3), we obtain that the expression in (3) and hence (1) is equal to

\[ -\sum_{k=0}^{\infty} \int_0^1 \left( \frac{y^{s+k} \log y}{r+k+1} - \frac{y^{s+k}}{(r+k+1)^2} \right) dy. \]

Integrating next with respect to y we obtain, in a similar fashion,

\[ \sum_{k=0}^{\infty} \left( \frac{1}{(r+k+1)(s+k+1)^2} + \frac{1}{(r+k+1)^2 (s+k+1)} \right). \tag{4} \]

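For r > s, we have

\begin{align*}
\frac{r-s}{(r+k+1)(s+k+1)^2} + \frac{r-s}{(r+k+1)^2 (s+k+1)}
&= \frac{r-s}{(r+k+1)(s+k+1)} \left( \frac{1}{s+k+1} + \frac{1}{r+k+1} \right) \\
&= \left( \frac{1}{s+k+1} - \frac{1}{r+k+1} \right) \left( \frac{1}{s+k+1} + \frac{1}{r+k+1} \right) \\
&= \frac{1}{(s+k+1)^2} - \frac{1}{(r+k+1)^2}.
\end{align*}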

Hence, if r > s, (4) and hence (1) can be written as

\begin{align*}
\frac{1}{r-s} \sum_{k=0}^{\infty} \left( \frac{1}{(s+k+1)^2} - \frac{1}{(r+k+1)^2} \right)
&= \frac{1}{r-s} \sum_{k=1}^{\infty} \left( \frac{1}{(s+k)^2} - \frac{1}{(r+k)^2} \right) \\
&= \frac{1}{r-s} \sum_{k=1}^{r-s} \frac{1}{(s+k)^2}.
\end{align*}

The least common multiple of $(r-s)(s+1)^2, (r-s)(s+2)^2, \ldots, (r-s)r^2$ is a divisor of $d_r^3$, which completes the first part of the lemma.

Finally, if r = s, (4) and hence (1) becomes

\[ 2 \sum_{k=0}^{\infty} \frac{1}{(r+k+1)^3} = 2 \sum_{k=1}^{\infty} \frac{1}{(r+k)^3} = 2\left( \zeta(3) - \sum_{k=1}^{r} \frac{1}{k^3} \right). \]

□
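The first part of Lemma 7 can be illustrated with exact rational arithmetic: for r > s the closed form obtained in the proof is a finite sum, and its denominator should divide the cube of lcm(1, . . . , r). The sketch below checks this for small r and s and compares the closed form with a partial sum of the series (4). The function names are assumptions of this example.

```python
from fractions import Fraction
from math import gcd

def d(n):
    """Least common multiple of 1, 2, ..., n."""
    out = 1
    for k in range(1, n + 1):
        out = out * k // gcd(out, k)
    return out

def closed_form(r, s):
    """Value of integral (1) for r > s >= 0, from the last display of the proof."""
    return Fraction(1, r - s) * sum(Fraction(1, (s + k) ** 2) for k in range(1, r - s + 1))

def series_form(r, s, terms):
    """Partial sum of series (4), in floating point."""
    return sum(
        1.0 / ((r + k + 1) * (s + k + 1) ** 2) + 1.0 / ((r + k + 1) ** 2 * (s + k + 1))
        for k in range(terms)
    )

denominators_ok = all(
    (d(r) ** 3 * closed_form(r, s)).denominator == 1
    for r in range(1, 9)
    for s in range(r)
)
```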

Lemma 8. Let D = {(u, v, w) : u, v, w ∈ (0, 1)}. Then the function f given by

\[ f(u, v, w) = \left( u, v, \frac{1-w}{1-(1-uv)w} \right) \]

is a bijection from D to D. Furthermore, its Jacobian determinant is

\[ \frac{\partial f(u, v, w)}{\partial(u, v, w)} = \frac{-uv}{(1-(1-uv)w)^2}. \]

Proof: Note that f is defined on D. We first show that f(D) ⊂ D. Let (u, v, w) ∈ D. Since 0 < 1 − uv < 1, we have 0 < 1 − w < 1 − (1 − uv)w < 1, or

\[ 0 < \frac{1-w}{1-(1-uv)w} < 1, \]

and hence f(u, v, w) ∈ D, and it follows that f is well-defined.

Next, let $f^2 = f \circ f$ denote the two-fold iteration of f. We have

\begin{align*}
f^2(u, v, w) &= f\left( u, v, \frac{1-w}{1-(1-uv)w} \right)
= \left( u, v, \frac{1 - \frac{1-w}{1-(1-uv)w}}{1 - (1-uv)\frac{1-w}{1-(1-uv)w}} \right) \\
&= \left( u, v, \frac{1-(1-uv)w - (1-w)}{1-(1-uv)w - (1-uv)(1-w)} \right) = (u, v, w),
\end{align*}

that is, f is self-inverse. In particular, f is bijective.

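Finally, if we denote f(u, v, w) = (x, y, z), then we have

\[ \frac{\partial z}{\partial w} = \frac{-uv}{(1-(1-uv)w)^2}, \]

and the Jacobian determinant equals

\[ \frac{\partial(x, y, z)}{\partial(u, v, w)} = \det \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} & \frac{\partial z}{\partial w} \end{pmatrix} = \frac{\partial z}{\partial w} = \frac{-uv}{(1-(1-uv)w)^2}. \]

□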
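The self-inverse property of the map in Lemma 8 can be confirmed with exact fractions at a sample point of D. This is only a sanity check under the assumption of an arbitrarily chosen point; it does not replace the algebraic computation above.

```python
from fractions import Fraction

def f(u, v, w):
    """The transformation of Lemma 8."""
    return (u, v, (1 - w) / (1 - (1 - u * v) * w))

# Arbitrary sample point in the open unit cube D.
point = (Fraction(1, 3), Fraction(2, 5), Fraction(3, 7))
image = f(*point)
back = f(*image)   # applying f twice should return the original point
```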

Lemma 9. In the region D = {(u, v, w) : u, v, w ∈ (0, 1)}, the function

\[ f(u, v, w) = \frac{u(1-u)\,v(1-v)\,w(1-w)}{1-(1-uv)w} \]

is bounded from above by 1/27.

Proof: Let (u, v, w) ∈ D. Using the arithmetic-geometric means inequality we obtain the inequality

\[ 1 - (1-uv)w = (1-w) + uvw \geq 2\sqrt{1-w}\,\sqrt{uvw}. \]

Hence, we have

\[ f(u, v, w) \leq \frac{u(1-u)\,v(1-v)\,w(1-w)}{2\sqrt{1-w}\,\sqrt{uvw}} = \frac{1}{2}\, \sqrt{u}\,(1-u)\, \sqrt{v}\,(1-v)\, \sqrt{w(1-w)}. \]

For t ∈ [0, 1], the maximum of $\sqrt{t}\,(1-t)$ occurs at t = 1/3 and the maximum of $\sqrt{t(1-t)}$ occurs at t = 1/2. Hence, we have

\[ f(u, v, w) \leq \frac{1}{2} \cdot \frac{1}{\sqrt{3}} \left(1 - \frac{1}{3}\right) \cdot \frac{1}{\sqrt{3}} \left(1 - \frac{1}{3}\right) \cdot \sqrt{\frac{1}{2}\left(1 - \frac{1}{2}\right)} = \frac{1}{27}. \]

□
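A coarse grid search gives a quick plausibility check of the bound in Lemma 9. The sketch below evaluates the function on a grid of midpoints in the unit cube; the grid resolution `N` is an arbitrary choice of this example, and a grid search is of course no proof.

```python
def f(u, v, w):
    """The function of Lemma 9 on the open unit cube."""
    return u * (1 - u) * v * (1 - v) * w * (1 - w) / (1 - (1 - u * v) * w)

N = 40                                        # grid resolution (assumption)
grid = [(i + 0.5) / N for i in range(N)]      # midpoints, all strictly inside (0, 1)
best = max(f(u, v, w) for u in grid for v in grid for w in grid)
bound = 1 / 27
```

The largest value found on the grid stays below `bound`, in line with the lemma.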

2.3 The irrationality of ζ(3)

Theorem 10. The number ζ(3) is irrational.

Proof: The n-th shifted Legendre polynomial is given by

\[ P_n(x) = \frac{1}{n!} \frac{d^n}{dx^n} \left( x^n (1-x)^n \right). \]

The first three polynomials are

\begin{align*}
P_1(x) &= 1 - 2x, \\
P_2(x) &= 1 - 6x + 6x^2, \\
P_3(x) &= 1 - 12x + 30x^2 - 20x^3.
\end{align*}
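The coefficients of the shifted Legendre polynomials can be generated directly from the defining formula. The following sketch expands x^n (1 − x)^n with the binomial theorem, differentiates n times and divides by n!; the function name `shifted_legendre` is an assumption of this example.

```python
from math import comb, factorial

def shifted_legendre(n):
    """Coefficients (ascending powers) of (1/n!) d^n/dx^n [x^n (1 - x)^n]."""
    # x^n (1 - x)^n = sum_k (-1)^k C(n, k) x^(n + k)
    poly = [0] * n + [(-1) ** k * comb(n, k) for k in range(n + 1)]
    for _ in range(n):
        poly = [i * poly[i] for i in range(1, len(poly))]
    # By Lemma 5 every coefficient is divisible by n!, so the division is exact.
    return [c // factorial(n) for c in poly]

first_three = [shifted_legendre(n) for n in (1, 2, 3)]
```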

Consider the double integral

\[ \int_0^1 \int_0^1 \frac{-\log(xy)}{1 - xy}\, P_n(x) P_n(y) \,dx\,dy. \]

It follows from Lemma 5 that $P_n(x) \in \mathbb{Z}[x]$. Since $P_n(x)$ is of degree n, the quantity $P_n(x) P_n(y)$ is a sum of terms of the form $a_{ij} x^i y^j$ where i, j ∈ {0, 1, . . . , n}, and $a_{ij} \in \mathbb{Z}$. Since $a_{ii}$ is a non-zero square for each i, we have $a_{ii} > 0$ for each i. Note that the double integral can be written as a sum of double integrals of the form in Lemma 7. It follows from Lemma 7 that the double integral is a sum of rational numbers whose denominators divide $d_n^3$ plus a positive integer multiple of ζ(3). Hence, there exist integers $A_n$ and $B_n > 0$ such that the double integral equals $(A_n + B_n \zeta(3))/d_n^3$.

Next, we find a second expression for the double integral. Since

\[ \frac{-\log(xy)}{1 - xy} = \left. \frac{-\log(1-(1-xy)z)}{1 - xy} \right|_{z=0}^{1} = \int_0^1 \frac{1}{1-(1-xy)z} \,dz, \]

the double integral becomes

\[ \int_0^1 \int_0^1 \int_0^1 \frac{P_n(x) P_n(y)}{1-(1-xy)z} \,dx\,dy\,dz. \tag{5} \]

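For k ∈ {0, 1, . . . , n − 1} the derivative $\frac{d^k}{dx^k}\left(x^n(1-x)^n\right)$ can be expressed as a sum of terms each having both x and 1 − x as a factor, so that it vanishes at x = 0 and x = 1. Switching the order of integration and integrating by parts repeatedly with respect to x, the triple integral (5) becomes

\begin{align*}
& \frac{1}{n!} \int_0^1 \int_0^1 \int_0^1 P_n(y)\, \frac{\frac{d^n}{dx^n}\left(x^n(1-x)^n\right)}{1-(1-xy)z} \,dx\,dy\,dz \\
&= \frac{1}{n!} \int_0^1 \int_0^1 \int_0^1 P_n(y)\, \frac{1}{1-(1-xy)z} \; d\!\left( \frac{d^{n-1}}{dx^{n-1}}\left(x^n(1-x)^n\right) \right) dy\,dz \\
&= \frac{1}{n!} \int_0^1 \int_0^1 \int_0^1 P_n(y)\, yz\, \frac{\frac{d^{n-1}}{dx^{n-1}}\left(x^n(1-x)^n\right)}{(1-(1-xy)z)^2} \,dx\,dy\,dz \\
&= \cdots = \frac{1}{n!} \int_0^1 \int_0^1 \int_0^1 P_n(y)\, n!\,(yz)^n\, \frac{x^n(1-x)^n}{(1-(1-xy)z)^{n+1}} \,dx\,dy\,dz \\
&= \int_0^1 \int_0^1 \int_0^1 \frac{x^n y^n z^n (1-x)^n P_n(y)}{(1-(1-xy)z)^{n+1}} \,dx\,dy\,dz. \tag{6}
\end{align*}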

Applying the transformation of Lemma 8 we have u = x, v = y,

\[ z^n = \frac{(1-w)^n}{(1-(1-uv)w)^n} \]

and

\[ (1-(1-xy)z)^{n+1} = \left( 1 - (1-uv)\frac{1-w}{1-(1-uv)w} \right)^{n+1} = \frac{(uv)^{n+1}}{(1-(1-uv)w)^{n+1}}. \]

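The triple integral then becomes

\begin{align*}
& \int_0^1 \int_0^1 \int_0^1 \frac{u^n v^n (1-u)^n (1-w)^n P_n(v)\, (1-(1-uv)w)^{n+1}}{(1-(1-uv)w)^n\, (uv)^{n+1}} \cdot \frac{uv}{(1-(1-uv)w)^2} \,du\,dv\,dw \\
&= \int_0^1 \int_0^1 \int_0^1 (1-u)^n (1-w)^n \frac{P_n(v)}{1-(1-uv)w} \,du\,dv\,dw.
\end{align*}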

With the same arguments we used to show that the triple integral in (5) is equal to the integral in (6), but now with respect to v instead of x, we finally obtain the identity

\begin{align*}
& \int_0^1 \int_0^1 \frac{-\log(xy)}{1 - xy}\, P_n(x) P_n(y) \,dx\,dy \\
&= \int_0^1 \int_0^1 \int_0^1 \frac{u^n(1-u)^n\, v^n(1-v)^n\, w^n(1-w)^n}{(1-(1-uv)w)^{n+1}} \,du\,dv\,dw.
\end{align*}

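Applying Lemma 9 and Lemma 7 (with r = s = 0) we obtain

\begin{align*}
0 < \int_0^1 \int_0^1 \frac{-\log(xy)}{1 - xy}\, P_n(x) P_n(y) \,dx\,dy
&\leq \left(\frac{1}{27}\right)^{n} \int_0^1 \int_0^1 \int_0^1 \frac{du\,dv\,dw}{1-(1-uv)w} \\
&= \left(\frac{1}{27}\right)^{n} \int_0^1 \int_0^1 \frac{-\log(uv)}{1 - uv} \,du\,dv \\
&= 2\zeta(3) \left(\frac{1}{27}\right)^{n}.
\end{align*}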

Hence, for every positive integer n, the integers $A_n$ and $B_n$ found above satisfy

\[ 0 < \frac{|A_n + B_n \zeta(3)|}{d_n^3} \leq 2\zeta(3) \left(\frac{1}{27}\right)^{n}. \]

Assume now that ζ(3) = a/b for some integers a, b with b > 0. By Lemma 6, we have, for sufficiently large n,

\begin{align*}
0 < |bA_n + aB_n| &\leq 2\zeta(3) \left(\frac{1}{27}\right)^{n} d_n^3\, b \\
&< 2\zeta(3) \left(\frac{1}{27}\right)^{n} (2.8)^{3n}\, b = 2\zeta(3) \left(\frac{(2.8)^3}{27}\right)^{n} b < 2\zeta(3)\,(0.9)^n\, b.
\end{align*}

Since $bA_n + aB_n$ is an integer, we obtain a contradiction for sufficiently large n. Hence, ζ(3) is irrational. □
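The structure of the proof can be traced numerically for small n. The sketch below expands the product of two shifted Legendre polynomials and evaluates each term of the double integral with the formulas of Lemma 7, keeping the rational part and the coefficient of ζ(3) separate. The explicit coefficient formula for the Legendre polynomials, the floating-point value `ZETA3` and all function names are assumptions of this example; the integral (1) is symmetric in x and y, so the case r < s is reduced to r > s.

```python
from fractions import Fraction
from math import comb, gcd

ZETA3 = 1.2020569031595942  # numerical value of zeta(3), used only for size checks

def d(n):
    """Least common multiple of 1, 2, ..., n."""
    out = 1
    for k in range(1, n + 1):
        out = out * k // gcd(out, k)
    return out

def legendre_coeffs(n):
    """Coefficients of the n-th shifted Legendre polynomial (ascending powers)."""
    return [(-1) ** k * comb(n, k) * comb(n + k, k) for k in range(n + 1)]

def lemma7(r, s):
    """Integral (1) as a pair (rational part, coefficient of zeta(3))."""
    if r == s:
        return (-2 * sum(Fraction(1, k ** 3) for k in range(1, r + 1)), 2)
    r, s = max(r, s), min(r, s)
    value = Fraction(1, r - s) * sum(Fraction(1, (s + k) ** 2) for k in range(1, r - s + 1))
    return (value, 0)

def linear_form(n):
    """Return (R, Z) such that the double integral equals R + Z * zeta(3)."""
    a = legendre_coeffs(n)
    R, Z = Fraction(0), 0
    for i in range(n + 1):
        for j in range(n + 1):
            rational, zeta_coeff = lemma7(i, j)
            R += a[i] * a[j] * rational
            Z += a[i] * a[j] * zeta_coeff
    return R, Z

forms = {n: linear_form(n) for n in (1, 2, 3)}
```

For each n in `forms`, multiplying the rational part by `d(n) ** 3` gives an integer, and `float(R) + Z * ZETA3` can be compared with the bound 2ζ(3)/27^n from the proof.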

3 The Hermite-Lindemann approach

In this section we prove the transcendence of the numbers e and π. We also present the Lindemann-Weierstrass theorem. In our proof we follow Baker (1975) and Shidlovskii (1989). We first prove a lemma.

3.1 Hermite’s identity

Lemma 11. Let f ∈ C[x] with deg f = m, u ∈ C, and let

\[ I(u; f) = \int_0^u e^{u-t} f(t) \,dt \tag{7} \]

be the integral along the line segment from 0 to u. Then

\[ I(u; f) = e^u \sum_{j=0}^{m} f^{(j)}(0) - \sum_{j=0}^{m} f^{(j)}(u). \tag{8} \]

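Proof: Using integration by parts we obtain the relation

\begin{align*}
I(u; f) &= \left. -e^{u-t} f(t) \right|_{t=0}^{u} + \int_0^u e^{u-t} f'(t) \,dt \\
&= e^u f(0) - f(u) + \int_0^u e^{u-t} f'(t) \,dt.
\end{align*}

If we repeat this process m more times, and use that $f^{(m+1)} = 0$, we obtain identity (8). □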

Identity (8) is also called Hermite’s identity (Shidlovskii 1989).
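Hermite's identity can be checked numerically for a real value of u by comparing a quadrature approximation of the integral in (7) with the right-hand side of (8). The sketch below uses the composite Simpson rule; the sample polynomial, the value of u and the function names are assumptions of this example.

```python
from math import exp

def poly_eval(c, x):
    """Evaluate a polynomial with ascending coefficients c at x."""
    return sum(ck * x ** k for k, ck in enumerate(c))

def derivative(c):
    return [k * c[k] for k in range(1, len(c))]

def I_numeric(u, c, steps=2000):
    """Composite Simpson approximation of the integral in (7) for real u."""
    h = u / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * exp(u - t) * poly_eval(c, t)
    return total * h / 3

def I_closed(u, c):
    """Right-hand side of (8): e^u * sum f^(j)(0) - sum f^(j)(u)."""
    at_zero = at_u = 0.0
    while c:
        at_zero += poly_eval(c, 0.0)
        at_u += poly_eval(c, u)
        c = derivative(c)
    return exp(u) * at_zero - at_u

coeffs = [1.0, -3.0, 0.0, 2.0]   # sample polynomial 2t^3 - 3t + 1
u = 1.5
difference = abs(I_numeric(u, coeffs) - I_closed(u, coeffs))
```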

3.2 The number e

The proof of Theorem 12 is a simplified version of the original proof by Hermite. This version can be found in Baker (1975) and Shidlovskii (1989).

Theorem 12. e is transcendental.

Proof: Suppose e is algebraic. Then there are $a_0, a_1, \ldots, a_n \in \mathbb{Z}$ with $a_0 \neq 0$ such that

\[ \sum_{k=0}^{n} a_k e^k = a_0 + a_1 e + \cdots + a_n e^n = 0. \tag{9} \]

Let p be a prime number with $p > \max\{n, |a_0|\}$ and define

\[ f(x) = x^{p-1} (x-1)^p \cdots (x-n)^p. \tag{10} \]

Using this f with deg f = m = (n + 1)p − 1 and I(u; f) in (7), we define the quantity

\[ J = \sum_{k=0}^{n} a_k I(k; f) = a_0 I(0; f) + a_1 I(1; f) + \cdots + a_n I(n; f). \]

We first derive an algebraic lower bound for |J|. Since (9) holds, the contribution of the first summand on the right-hand side of (8) to J is 0, and we have

\[ J = -\sum_{k=0}^{n} \sum_{j=0}^{m} a_k f^{(j)}(k). \]

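The polynomial f(x) in (10) has 0 as a root of multiplicity p − 1 and 1, 2, . . . , n as roots of multiplicity p. Hence, $f^{(j)}(0) = 0$ for j < p − 1 and $f^{(j)}(k) = 0$ for j < p and k ∈ {1, 2, . . . , n}, and we have

\[ J = -\sum_{j=p-1}^{m} a_0 f^{(j)}(0) - \sum_{j=p}^{m} \sum_{k=1}^{n} a_k f^{(j)}(k). \tag{11} \]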

Since f(x) in (10) can be written as

\[ f(x) = x^{p-1} \left( (-1)(-2)\cdots(-n) + b_1 x + b_2 x^2 + \cdots + b_n x^n \right)^p \]

for some $b_1, \ldots, b_n \in \mathbb{Z}$, we have

\[ f^{(p-1)}(0) = (p-1)!\,(-1)^{np} (n!)^p. \]

Due to Lemma 5 each term on the right-hand side of (11) is divisible by p!, except for the term containing $f^{(p-1)}(0)$, which is divisible by (p − 1)! but, since p > n, not by p. Furthermore, since $p > |a_0|$, it follows that J is an integer which is divisible by (p − 1)! but not by p. Hence, J is a non-zero integer with |J| ≥ (p − 1)!.

Next, we derive an analytic upper bound for |J|. On the interval x ∈ [0, n] each of the factors x − k for k ∈ {0, 1, . . . , n} is bounded in absolute value by n. Thus,

\[ |f(x)| = |x^{p-1} (x-1)^p \cdots (x-n)^p| \leq n^{(n+1)p-1} \leq \left(n^{n+1}\right)^p, \]

for x ∈ [0, n]. Moreover, we have

\[ |I(k; f)| \leq \int_0^k |e^{k-t} f(t)| \,dt \leq \left( \int_0^k dt \right) e^k \max_{t \in [0,k]} |f(t)| \leq k e^k \left(n^{n+1}\right)^p \]

for k ∈ {0, 1, . . . , n} and, using the triangle inequality,

\[ |J| \leq \sum_{k=0}^{n} |a_k|\,|I(k; f)| \leq \sum_{k=0}^{n} |a_k|\, k e^k \left(n^{n+1}\right)^p \leq c_1 c_2^p, \]

for some constants $c_1$ and $c_2$ that are independent of p. Since we also have |J| ≥ (p − 1)!, we obtain a contradiction for sufficiently large p. The contradiction proves the theorem. □
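The divisibility properties used for the lower bound on |J| can be verified by hand for small parameters. The sketch below builds the polynomial (10) for the hypothetical values n = 2 and p = 5 and tabulates all derivatives at the points 0, 1, . . . , n; the function names are assumptions of this example.

```python
from math import factorial

def poly_mul(a, b):
    """Multiply two integer polynomials (ascending coefficients)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def derivative(a):
    return [i * a[i] for i in range(1, len(a))]

def evaluate(a, x):
    return sum(c * x ** i for i, c in enumerate(a))

def hermite_f(n, p):
    """Coefficients of x^(p-1) (x-1)^p ... (x-n)^p."""
    f = [0] * (p - 1) + [1]
    for k in range(1, n + 1):
        for _ in range(p):
            f = poly_mul(f, [-k, 1])
    return f

def derivative_values(f, points):
    """Map j to the list of values of the j-th derivative at the given points."""
    table, j = {}, 0
    while f:
        table[j] = [evaluate(f, x) for x in points]
        f = derivative(f)
        j += 1
    return table

n, p = 2, 5                       # sample parameters with p prime and p > n
table = derivative_values(hermite_f(n, p), range(n + 1))
```

In this table the entry for j = p − 1 at the point 0 equals (p − 1)! (−1)^{np} (n!)^p, while every other non-zero entry is divisible by p!.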

3.3 Algebraic integers and the house of an algebraic number

Recall that a complex number α is called algebraic if there is a non-zero polynomial f with integer coefficients such that f(α) = 0. There is a unique polynomial $F_\alpha \in \mathbb{Z}[x]$ such that $F_\alpha(\alpha) = 0$, $F_\alpha$ is irreducible in Q[x], the leading coefficient of $F_\alpha$ is positive, and the coefficients of $F_\alpha$ have greatest common divisor 1. This polynomial $F_\alpha$ is called the minimum polynomial of α. The other zeros in C of the minimum polynomial of α are called the conjugates of α.

An algebraic number α is said to be an algebraic integer if its minimum polynomial has leading coefficient 1. The algebraic integers form a subring of C. If α is algebraic we have

\[ b_n \alpha^n + b_{n-1} \alpha^{n-1} + \cdots + b_1 \alpha + b_0 = 0 \]

for certain $b_0, \ldots, b_n \in \mathbb{Z}$ with $b_n \neq 0$. If we multiply this equation by $b_n^{n-1}$ we obtain

\[ (b_n\alpha)^n + b_{n-1} (b_n\alpha)^{n-1} + \cdots + b_n^{n-2} b_1 (b_n\alpha) + b_n^{n-1} b_0 = 0. \]

Hence, if α is an algebraic number and $b_n$ is the leading coefficient of its minimum polynomial, then $b_n \alpha$ is an algebraic integer.
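The scaling argument can be made concrete with a small example. The sketch below turns the integer coefficients of a polynomial with root α into the coefficients of the monic integer polynomial satisfied by the scaled number, following the displayed equation; the sample polynomial 2x² − 2x − 1 and the function name are assumptions of this example.

```python
from math import sqrt

def monic_for_scaled_root(b):
    """Given b = [b_0, ..., b_n] (ascending) with b_n != 0, return the ascending
    coefficients of the monic polynomial satisfied by b_n * alpha."""
    n = len(b) - 1
    lead = b[n]
    return [b[i] * lead ** (n - 1 - i) for i in range(n)] + [1]

b = [-1, -2, 2]                    # 2x^2 - 2x - 1, with root alpha = (1 + sqrt(3)) / 2
alpha = (1 + sqrt(3)) / 2
monic = monic_for_scaled_root(b)   # polynomial satisfied by 2 * alpha
residual = sum(c * (2 * alpha) ** i for i, c in enumerate(monic))
```

Here `monic` describes y² − 2y − 2, which indeed has 2α = 1 + √3 as a root.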

Let $\alpha_1 \in \mathbb{C}$ be an algebraic number and let $\alpha_i$ for i ∈ {2, 3, . . . , n} denote the conjugates of $\alpha_1$ in C. The house of $\alpha_1$, denoted by $\overline{|\alpha_1|}$, is defined as

\[ \overline{|\alpha_1|} = \max\{ |\alpha_1|, |\alpha_2|, \ldots, |\alpha_n| \}. \]

The following lemma will be used in the proof of the Gel’fond-Schneider theorem in Section 4.

Lemma 13. Let $\alpha_1 \in \mathbb{C}$, $\alpha_1 \neq 0$ be algebraic and $\deg \alpha_1 = n$. Let T ∈ Z, T > 0 be such that $T\alpha_1$ is an algebraic integer. Then

\[ |\alpha_1| \geq \frac{1}{T^n\, \overline{|\alpha_1|}^{\,n-1}}. \]

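Proof: Let $\alpha_i$ for i ∈ {2, 3, . . . , n} denote the conjugates of $\alpha_1$. Since the numbers $T\alpha_i$ for i ∈ {1, 2, . . . , n} are algebraic integers, the number $T\alpha_1 T\alpha_2 \cdots T\alpha_n = T^n \alpha_1 \cdots \alpha_n$ is an algebraic integer. Since the minimum polynomial of $T\alpha_1$ is given by

\[ (x - T\alpha_1)(x - T\alpha_2) \cdots (x - T\alpha_n) \in \mathbb{Z}[x], \]

it follows that $T^n \alpha_1 \cdots \alpha_n \in \mathbb{Z}$, and thus, since $\alpha_1 \neq 0$, that $|T^n \alpha_1 \cdots \alpha_n| \geq 1$. Hence,

\[ |\alpha_1| \geq \frac{|\alpha_1 \cdots \alpha_n|}{\overline{|\alpha_1|}^{\,n-1}} = \frac{|T^n \alpha_1 \cdots \alpha_n|}{T^n\, \overline{|\alpha_1|}^{\,n-1}} \geq \frac{1}{T^n\, \overline{|\alpha_1|}^{\,n-1}}. \]

□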

From here on we make a distinction between rational integers, which are simply elements of Z, and algebraic integers.

3.4 The number π

The proof of Theorem 14 is a simplified version of the original proof by Lindemann. This version can be found in Baker (1975) and Shidlovskii (1989).

Theorem 14. π is transcendental.

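Proof: Suppose π is algebraic. Then πi is also algebraic. Let $\alpha_1 = \pi i$ with $\deg \alpha_1 = d$, and let $\alpha_2, \ldots, \alpha_d$ be the conjugates of $\alpha_1$. Since $1 + e^{\pi i} = 0$, we obtain

\[ \prod_{\ell=1}^{d} (1 + e^{\alpha_\ell}) = (1 + e^{\alpha_1}) \cdots (1 + e^{\alpha_d}) = 0. \]

If we expand this product, we obtain

\[ \prod_{\ell=1}^{d} (1 + e^{\alpha_\ell}) = \sum_{\varepsilon_1=0}^{1} \cdots \sum_{\varepsilon_d=0}^{1} e^{\varepsilon_1\alpha_1 + \cdots + \varepsilon_d\alpha_d}. \]

The exponents inside the multiple sum include some which are non-zero, for example, $\varepsilon_1 = 1$ and $\varepsilon_2 = \cdots = \varepsilon_d = 0$, and also some which are zero, for example, $\varepsilon_1 = \cdots = \varepsilon_d = 0$. Call the exponents $\theta_1, \theta_2, \ldots, \theta_{2^d}$ and let the first n be the non-zero ones. We have $n < 2^d$, and

\[ 2^d - n + e^{\theta_1} + e^{\theta_2} + \cdots + e^{\theta_n} = 0. \tag{12} \]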

It turns out that the numbers $\theta_1, \ldots, \theta_n$ are the zeros of a polynomial g(x) ∈ Z[x] of degree n. We have the polynomial

\[ h(x) = \prod_{\varepsilon_1=0}^{1} \cdots \prod_{\varepsilon_d=0}^{1} \left( x - (\varepsilon_1\alpha_1 + \cdots + \varepsilon_d\alpha_d) \right) \]

with $\deg h = 2^d$. If we consider h(x) as a polynomial in $\alpha_1, \ldots, \alpha_d$, then h(x) is symmetric in $\alpha_1, \ldots, \alpha_d$. Since $\alpha_1, \ldots, \alpha_d$ are a complete set of conjugates, it follows from the theory of elementary symmetric functions that h(x) ∈ Q[x]. The zeros of h(x) are $\theta_1, \ldots, \theta_n$, and 0 with multiplicity $2^d - n$. Hence, the polynomial $h(x)/x^{2^d - n} \in \mathbb{Q}[x]$ of degree n has precisely the numbers $\theta_1, \ldots, \theta_n$ as its zeros. If we let r be the least common denominator of the coefficients of $h(x)/x^{2^d - n}$, then the polynomial

\[ g(x) = \frac{r}{x^{2^d - n}}\, h(x) \in \mathbb{Z}[x] \]

also has precisely $\theta_1, \ldots, \theta_n$ as its zeros.

Next, let p be a prime number, let b be the leading coefficient of g(x), and define

\[ f(x) = b^{(n-1)p} x^{p-1} g^p(x) = b^{np} x^{p-1} (x - \theta_1)^p \cdots (x - \theta_n)^p \]

with deg f = m = (n + 1)p − 1. Furthermore, using I(u; f) in (7) we define

\[ J = \sum_{k=1}^{n} I(\theta_k; f) = I(\theta_1; f) + I(\theta_2; f) + \cdots + I(\theta_n; f). \]

We first derive an algebraic lower bound for |J|. Using (8) and (12) we can write J as

\[ J = -\left(2^d - n\right) \sum_{j=p-1}^{m} f^{(j)}(0) - \sum_{j=p}^{m} \sum_{k=1}^{n} f^{(j)}(\theta_k). \tag{13} \]

It turns out that the inner sum over k is a rational integer. Indeed, first note that since $b\alpha_\ell$ for ℓ ∈ {1, 2, . . . , d} is an algebraic integer, $b\theta_k$ for k ∈ {1, 2, . . . , n} is also an algebraic integer. Furthermore, since g(x) ∈ Z[x] we have that f(x) ∈ Z[x]. Hence, since the sum over k is a symmetric polynomial in $b\theta_1, \ldots, b\theta_n$ with coefficients in Z and thus a symmetric polynomial with rational integer coefficients in the $2^d$ numbers $b(\varepsilon_1\alpha_1 + \cdots + \varepsilon_d\alpha_d)$, it follows from the theory of elementary symmetric functions that the sum over k is a rational integer.

Since $f^{(j)}(\theta_k) = 0$ for j < p, it follows from Lemma 5 that the double sum in (13) is a rational integer divisible by p!. Furthermore, we have $f^{(j)}(0) = 0$ for j < p − 1 and $f^{(j)}(0)$ is divisible by p! for j ≥ p due to Lemma 5. It follows from the theory of elementary symmetric functions that

\[ f^{(p-1)}(0) = b^{np} (p-1)!\,(-1)^{np} (\theta_1 \theta_2 \cdots \theta_n)^p \]

is a rational integer divisible by (p − 1)!. However, if p is sufficiently large $f^{(p-1)}(0)$ is not divisible by p!. Hence, if in addition $p > 2^d - n$, it follows that |J| ≥ (p − 1)!.

Similar to the proof of Theorem 12 we can derive that $|J| \leq c_1 c_2^p$ where $c_1$ and $c_2$ are constants that are independent of p. We get a contradiction, which completes the proof. □

3.5 The Lindemann-Weierstrass theorem

Theorems 12 and 14 on the transcendence of e and π are special cases of a more general result which Lindemann sketched in 1882. The result was later rigorously demonstrated by Weierstrass in 1885 (see Baker 1975). The proof of Theorem 15 comes from Baker (1975). Here $\overline{\mathbb{Q}}$ denotes the set of algebraic numbers.

Theorem 15. For any distinct numbers $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$, and non-zero numbers $\beta_1, \ldots, \beta_n \in \overline{\mathbb{Q}}$, we have $\beta_1 e^{\alpha_1} + \beta_2 e^{\alpha_2} + \cdots + \beta_n e^{\alpha_n} \neq 0$.

Proof: Suppose

\[ \beta_1 e^{\alpha_1} + \beta_2 e^{\alpha_2} + \cdots + \beta_n e^{\alpha_n} = 0. \tag{14} \]

We can assume that the $\beta_i$ are rational integers. If this is not the case, we consider the product of all the expressions formed by substituting for one or more of the $\beta_j$ one of its conjugates. Suppose $\beta_j$ has degree $m_j$, let its $m_j$ conjugates be denoted by $\beta_j(i_j)$ for $i_j \in \{1, 2, \ldots, m_j\}$, and put

\[ M = \prod_{j=1}^{n} m_j. \]

The product is given by

\[ \prod_{i_1=1}^{m_1} \cdots \prod_{i_n=1}^{m_n} \left( \beta_1(i_1) e^{\alpha_1} + \cdots + \beta_n(i_n) e^{\alpha_n} \right) = \sum_{j_1, \ldots, j_n} \beta(j_1, \ldots, j_n)\, e^{j_1\alpha_1 + \cdots + j_n\alpha_n}, \]

where the latter sum is taken over all tuples of non-negative integers $(j_1, \ldots, j_n)$ with $j_1 + \cdots + j_n = M$ and $\beta(j_1, \ldots, j_n)$ is a polynomial expression in $\beta_1(1), \ldots, \beta_n(m_n)$ which has rational integer coefficients and which is invariant under any permutation of $(\beta_i(1), \ldots, \beta_i(m_i))$ for i ∈ {1, 2, . . . , n}. Hence, all $\beta(j_1, \ldots, j_n) \in \mathbb{Q}$. Let $\gamma_1, \ldots, \gamma_t$ be the distinct numbers among the $j_1\alpha_1 + \cdots + j_n\alpha_n$. Then the product becomes

\[ \delta_1 e^{\gamma_1} + \cdots + \delta_t e^{\gamma_t}, \]

where each $\delta_i$ is the sum of some of the terms $\beta(j_1, \ldots, j_n)$. Hence, $\delta_1, \ldots, \delta_t \in \mathbb{Q}$. To obtain rational integer coefficients, we multiply these rational numbers by a common denominator.

We now show that at least one of the new coefficients $\delta_j$ is non-zero. To this end, we define on C a lexicographic ordering ≺ such that ζ ≺ η if Re ζ < Re η, or Re ζ = Re η and Im ζ < Im η. If $\zeta_1, \ldots, \zeta_r, \eta_1, \ldots, \eta_r$ are complex numbers with $\zeta_1 \prec \eta_1, \ldots, \zeta_r \prec \eta_r$, then it holds that $\zeta_1 + \cdots + \zeta_r \prec \eta_1 + \cdots + \eta_r$. We assume without loss of generality that $\alpha_1 \prec \cdots \prec \alpha_n$ and $\gamma_1 \prec \cdots \prec \gamma_t$. Hence, we have $\gamma_t = M\alpha_n$ and $j_1\alpha_1 + \cdots + j_n\alpha_n \prec \gamma_t$ for $(j_1, \ldots, j_n) \neq (0, \ldots, 0, M)$, and thus $\delta_t = (\beta_n(1) \cdots \beta_n(m_n))^{m_1 \cdots m_{n-1}} \neq 0$.

Next, we can assume that the set $\{\alpha_1, \ldots, \alpha_n\}$ is closed under conjugation, that is, it contains all conjugates of each element occurring in it, and moreover, for any two indices j and k such that $\alpha_j$ and $\alpha_k$ are conjugates, we have $\beta_j = \beta_k$.

If this is not the case, let K be any finite normal extension of Q containing $\alpha_1, \ldots, \alpha_n$, and let $\{\sigma_1, \ldots, \sigma_m\}$ be the Galois group of K/Q. Then clearly,

\[ \prod_{i=1}^{m} \left( \beta_1 e^{\sigma_i(\alpha_1)} + \cdots + \beta_n e^{\sigma_i(\alpha_n)} \right) = 0. \]

By expanding the product on the left-hand side, we get

\[ \sum_{i_1=1}^{n} \cdots \sum_{i_m=1}^{n} \beta_{i_1} \cdots \beta_{i_m} \exp\left( \sigma_1(\alpha_{i_1}) + \cdots + \sigma_m(\alpha_{i_m}) \right) = 0. \]

By grouping together those terms for which the exponents $\sigma_1(\alpha_{i_1}) + \cdots + \sigma_m(\alpha_{i_m})$ have equal values we obtain an identity of the form

\[ \delta_1 e^{\gamma_1} + \cdots + \delta_t e^{\gamma_t} = 0, \]

where $\gamma_1, \ldots, \gamma_t$ are the distinct numbers among the exponents $\sigma_1(\alpha_{i_1}) + \cdots + \sigma_m(\alpha_{i_m})$. Clearly, $\{\gamma_1, \ldots, \gamma_t\}$ is closed under conjugation, and $\delta_j = \delta_k$ whenever $\gamma_j$ and $\gamma_k$ are conjugate to one another.

It remains to show that at least one of the numbers $\delta_k$ is non-zero, and for this, we use the argument from above. For i ∈ {1, 2, . . . , m}, let $j_i$ be the index j for which $\sigma_i(\alpha_j)$ is the largest among $\sigma_i(\alpha_1), \ldots, \sigma_i(\alpha_n)$ in the lexicographic ordering. This index $j_i$ is unique since $\alpha_1, \ldots, \alpha_n$ are distinct. Then $\sigma_1(\alpha_{j_1}) + \cdots + \sigma_m(\alpha_{j_m}) = \gamma_k$ is in the lexicographic ordering larger than all other exponents $\sigma_1(\alpha_{i_1}) + \cdots + \sigma_m(\alpha_{i_m})$ and thus, the coefficient $\delta_k = \beta_{j_1} \cdots \beta_{j_m} \neq 0$.

For the remainder of the proof we can now assume that

\[ \beta_1 e^{\alpha_1} + \beta_2 e^{\alpha_2} + \cdots + \beta_n e^{\alpha_n} = 0, \tag{15} \]

where $\alpha_1, \ldots, \alpha_n$ are distinct and the $\beta_i$ are rational integers, and that there are integers $0 = n_0 < n_1 < \cdots < n_r = n$ such that $\alpha_{n_t+1}, \ldots, \alpha_{n_{t+1}}$ is a complete set of conjugates for each t, and

\[ \beta_{n_t+1} = \beta_{n_t+2} = \cdots = \beta_{n_{t+1}}. \]

Since the $\alpha_1, \ldots, \alpha_n$ and $\beta_1, \ldots, \beta_n$ are algebraic, we can choose a non-zero rational integer b such that $b\alpha_1, \ldots, b\alpha_n$ and $b\beta_1, \ldots, b\beta_n$ are algebraic integers. Let p be a prime number and define for i ∈ {1, 2, . . . , n} the functions

\[ f_i(x) = b^{np}\, \frac{\left[ (x - \alpha_1) \cdots (x - \alpha_n) \right]^p}{x - \alpha_i} \]

with $\deg f_i = m = np - 1$. Using these $f_i(x)$ and I(u; f) in (7) we define for i ∈ {1, 2, . . . , n} the quantities

\[ J_i = \sum_{k=1}^{n} \beta_k I_i(\alpha_k; f_i) = \beta_1 I_i(\alpha_1; f_i) + \cdots + \beta_n I_i(\alpha_n; f_i). \]

We first derive an algebraic lower bound for $|J_1 \cdots J_n|$. Using (8) and (15) we obtain

\[ J_i = -\sum_{j=0}^{m} \sum_{k=1}^{n} \beta_k f_i^{(j)}(\alpha_k). \]

Using a modification of Lemma 5, we find that $f_i^{(j)}(\alpha_k)$ is p! times an algebraic integer unless j = p − 1 and k = i. In this particular case we have

\[ f_i^{(p-1)}(\alpha_i) = b^{np} (p-1)! \prod_{k=1,\, k \neq i}^{n} (\alpha_i - \alpha_k)^p. \]

Hence, $f_i^{(p-1)}(\alpha_i)$ is an algebraic integer divisible by (p − 1)! but not by p! if p is sufficiently large. It then follows that $J_i$ is an algebraic integer that is divisible by (p − 1)!.

Next, we show that $J_i \neq 0$. For sufficiently large p, the number $J_i$ can be written as

\[ J_i = -\sum_{j=0}^{m} \sum_{t=0}^{r-1} \beta_{n_t+1} \left[ f_i^{(j)}(\alpha_{n_t+1}) + \cdots + f_i^{(j)}(\alpha_{n_{t+1}}) \right]. \]

Note that by construction, $f_i(x)$ can be written as a polynomial whose coefficients are polynomials in the $\alpha_i$, with rational integer coefficients independent of the $\alpha_i$. Thus, noting that the $\alpha_i$ form a complete set of conjugates and using the fundamental theorem on symmetric polynomials as in the previous proof, we see that the product of the $J_i$ is in fact a rational number. Since it is an algebraic integer, it is an integer. Thus, $J_1 \cdots J_n$ is a rational integer, and it is divisible by $((p-1)!)^n$. Thus, $|J_1 \cdots J_n| \geq [(p-1)!]^n$.

Finally, using the triangle inequality we have, for each $i$,

$$|J_i| \le \sum_{k=1}^{n} |\beta_k|\,|I_i(\alpha_k; f_i)|.$$

Hence, similar to the proofs of Theorems 12 and 14, we can derive that $|J_i| \le c_1 c_2^p$, where $c_1$ and $c_2$ are constants that are independent of $p$. For sufficiently large $p$ this contradicts the lower bound $|J_1 \cdots J_n| \ge [(p-1)!]^n$, which completes the proof. □

The transcendence of e and π follows directly from Theorem 15. We also have the following corollaries.

Corollary 16. If $\alpha \neq 0$ is algebraic, then $e^{\alpha}$ is transcendental.

Proof: If $e^{\alpha} = \beta$ is algebraic, then we have $e^{\alpha} - \beta e^{0} = 0$, which contradicts Theorem 15. □

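Corollary 17. If $\alpha \neq 0$ is algebraic, then $\sin\alpha$ and $\cos\alpha$ are transcendental.

Proof: We have

$$\sin\alpha = \frac{e^{i\alpha} - e^{-i\alpha}}{2i}, \quad\text{and}\quad \cos\alpha = \frac{e^{i\alpha} + e^{-i\alpha}}{2}.$$

If $\sin\alpha = \beta$ is algebraic, then $e^{i\alpha} - e^{-i\alpha} - 2i\beta e^{0} = 0$, which contradicts Theorem 15. Similarly, if $\cos\alpha = \beta$ is algebraic, then $e^{i\alpha} + e^{-i\alpha} - 2\beta e^{0} = 0$, which again contradicts Theorem 15. □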

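Corollary 18. If $\alpha \in \mathbb{C} \setminus \{0, 1\}$ is algebraic, then $\log\alpha$ is transcendental for every branch of the logarithm.

Proof: If $\log\alpha = \beta$, then $e^{\beta} = \alpha$, and $\beta \neq 0$ because $\alpha \neq 1$. If $\beta$ were algebraic, then $e^{\beta} = \alpha$ would be transcendental by Corollary 16. Since $\alpha$ is algebraic, $\beta$ must be transcendental. □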

4 The Gel’fond-Schneider theorem

In this section we prove the Gel’fond-Schneider theorem. We first prove some analytic lemmas. Before presenting the lemmas we introduce the following notation.

Let $w \in \mathbb{C}$, $R \in \mathbb{R}_{>0}$, and let

$$D(R, w) = \{z \in \mathbb{C} : |z - w| < R\}$$

and

$$\overline{D}(R, w) = \{z \in \mathbb{C} : |z - w| \le R\}.$$

If $w = 0$ we write $D(R)$ and $\overline{D}(R)$. Furthermore, let the maximum of $|f(z)|$ on $\overline{D}(R, w)$ be denoted by $M(R, w, f)$. If $w = 0$ we write $M(R, f)$. If $f(z)$ is analytic on $D(R)$ and continuous on $\overline{D}(R)$, then it follows from the maximum modulus principle that $|f(z)|$ attains its maximum on $|z| = R$. If $f(z)$ is analytic on $\overline{D}(R, w)$, then $N(R, w, f)$ will be used to denote the number of zeros of $f(z)$ in $\overline{D}(R, w)$.

4.1 Some auxiliary results

Lemma 19. Let $a_1(t), \ldots, a_n(t)$ be non-zero polynomials in $\mathbb{R}[t]$ of degrees $d_1, \ldots, d_n$ respectively. Let $w_1, \ldots, w_n$ be pairwise distinct real numbers. Then

$$f(t) = \sum_{j=1}^{n} a_j(t) e^{w_j t}$$

has at most $n - 1 + \sum_{j=1}^{n} d_j$ real zeros.

Proof: By multiplying through by $e^{-w_n t}$ if necessary, we may suppose that $w_n = 0$ and $w_j \neq 0$ for $j \in \{1, 2, \ldots, n-1\}$. Let $E = n + \sum_{j=1}^{n} d_j$. We proceed by induction on $E$.

If $E = 1$, then $n = 1$ and $d_1 = 0$. In this case there are no zeros, that is, there are at most $E - 1 = 0$ zeros.

Next, suppose the lemma holds whenever $n + \sum_{j=1}^{n} d_j = \ell$ with $\ell \in \{1, 2, \ldots, E-1\}$, and consider $\ell = E$. We have the first derivative

$$f'(t) = \sum_{j=1}^{n-1} \left[ a_j'(t) + w_j a_j(t) \right] e^{w_j t} + a_n'(t).$$

Since the $w_j$ are pairwise distinct, and since $w_j \neq 0$ for $j \in \{1, 2, \ldots, n-1\}$, the polynomial $a_j'(t) + w_j a_j(t)$ has exactly degree $d_j$ for $j \in \{1, 2, \ldots, n-1\}$. Furthermore, since we may suppose that $w_n = 0$, the derivative $a_n'(t)$ has degree $d_n - 1$. It follows from the induction hypothesis that $f'(t)$ has at most $(n-2) + \sum_{j=1}^{n} d_j$ real zeros.

Finally, let $N$ denote the number of real zeros of $f(t)$, and let $b_1 < b_2 < \cdots < b_N$ denote these zeros. Since $f(t)$ is continuous on the intervals $[b_i, b_{i+1}]$ and differentiable on $(b_i, b_{i+1})$ for $i \in \{1, 2, \ldots, N-1\}$, it follows from Rolle’s theorem that $f'(t)$ has at least $N - 1$ real zeros. Hence, $N - 1 \le (n-2) + \sum_{j=1}^{n} d_j$, or $N \le (n-1) + \sum_{j=1}^{n} d_j$. □
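As an illustration of Lemma 19, the following minimal sketch counts sign changes of a sample exponential polynomial on a grid and compares the count with the bound $n - 1 + \sum d_j$. The sample function, the interval and the helper name `count_sign_changes` are assumptions made for this example only; a sign-change count gives a lower bound on the number of real zeros, so it can only confirm that the bound is not violated.

```python
import math

def count_sign_changes(f, a, b, steps):
    """Count sign changes of f on a uniform grid over [a, b]."""
    changes = 0
    prev = f(a)
    for k in range(1, steps + 1):
        cur = f(a + (b - a) * k / steps)
        if prev * cur < 0:
            changes += 1
        if cur != 0:
            prev = cur
    return changes

# Hypothetical example: a_1 = a_2 = 1 and a_3 = -3 (all of degree 0),
# with exponents w = (1, -1, 0), so n = 3 and every d_j = 0.
def f(t):
    return math.exp(t) + math.exp(-t) - 3.0

n, degrees = 3, [0, 0, 0]
bound = n - 1 + sum(degrees)   # bound of Lemma 19 on the number of real zeros
found = count_sign_changes(f, -5.0, 5.0, 1000)
print(found, bound)
```

Here the two zeros are $t = \pm\operatorname{arccosh}(3/2)$, so the bound of Lemma 19 is attained for this choice.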

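Lemma 20. Let $r, R \in \mathbb{R}$ with $1 \le r \le R$. Let $f_1(z), f_2(z), \ldots, f_m(z)$ be analytic in $D(R)$ and continuous on $\overline{D}(R)$. Let $y_1, y_2, \ldots, y_m \in \mathbb{C}$ with $|y_i| \le r$ for $i \in \{1, 2, \ldots, m\}$. Then the determinant

$$\Delta = \det \begin{pmatrix} f_1(y_1) & \cdots & f_m(y_1) \\ \vdots & \ddots & \vdots \\ f_1(y_m) & \cdots & f_m(y_m) \end{pmatrix}$$

satisfies the inequality

$$|\Delta| \le \left( \frac{R}{r} \right)^{-m(m-1)/2} m! \prod_{j=1}^{m} M(R, f_j).$$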

Proof: Consider the determinant

$$h(z) = \det(f_j(y_i z)) = \det \begin{pmatrix} f_1(y_1 z) & \cdots & f_m(y_1 z) \\ \vdots & \ddots & \vdots \\ f_1(y_m z) & \cdots & f_m(y_m z) \end{pmatrix}.$$

Since the $y_i$ satisfy $|y_i| \le r$, the functions $f_j(y_i z)$ are analytic in $D(R/r)$ and continuous on $\overline{D}(R/r)$. Since it is a sum of products of the $f_j(y_i z)$, the determinant $h(z)$ itself is analytic in $D(R/r)$ and continuous on $\overline{D}(R/r)$.

Next, let $K = m(m-1)/2$. Since the $f_j(y_i z)$ are analytic functions on $D(R/r)$ they can be expanded into power series on $D(R/r)$. It follows that

$$f_j(y_i z) = \sum_{k=0}^{K-1} b_k(j)\, y_i^k z^k + z^K g_{ij}(z),$$

where $b_k(j) \in \mathbb{C}$ for each $k$ and $g_{ij}(z)$ is analytic in $D(R/r)$ and continuous on $\overline{D}(R/r)$. Since the determinant is linear in each of its columns, we can view $h(z)$ as $z^K$ times an analytic function on $D(R/r)$ plus terms involving the factor

$$z^{n_1 + n_2 + \cdots + n_m} \det\left( y_i^{n_j} \right) = z^{n_1 + n_2 + \cdots + n_m} \det \begin{pmatrix} y_1^{n_1} & \cdots & y_1^{n_m} \\ \vdots & \ddots & \vdots \\ y_m^{n_1} & \cdots & y_m^{n_m} \end{pmatrix},$$

where $n_1, n_2, \ldots, n_m \in \{0, 1, \ldots, K-1\}$. The determinant in the last expression is zero if two of the $n_j$ are identical. Therefore, the non-zero terms of this form satisfy

$$n_1 + n_2 + \cdots + n_m \ge 0 + 1 + \cdots + (m-1) = \frac{m(m-1)}{2} = K.$$

Hence, we deduce that $h(z)$ is divisible by $z^K$.

Finally, since $h(z)$ is analytic in $D(R/r)$ and continuous on $\overline{D}(R/r)$, and since $h(z)$ is divisible by $z^K$, it follows that $h(z)/z^K$ is analytic in $D(R/r)$ and continuous on $\overline{D}(R/r)$. By the maximum modulus principle, $|h(z)/z^K|$ attains its maximum value on the boundary $\partial D(R/r)$. Hence, for $w \in \overline{D}(R/r)$, we have the inequality

$$\left| \frac{h(w)}{w^K} \right| \le M\left( \frac{R}{r}, \frac{h(z)}{z^K} \right) = \left( \frac{r}{R} \right)^K M(R/r, h(z)).$$

For $|z| = R/r$ we have $|y_i z| \le R$. The determinant of an $m \times m$ matrix is the sum of $m!$ products, where each product consists of $m$ entries, such that for each row and column only one entry is part of a product. For each column index $j$ we have $|f_j(y_i z)| \le M(R, f_j)$ for $i \in \{1, 2, \ldots, m\}$. Thus,

$$M(R/r, h(z)) \le m! \prod_{j=1}^{m} M(R, f_j).$$

Since $|\Delta| = |h(1)|$ and $1 \le R/r \le R$ we obtain

$$|\Delta| \le \left( \frac{r}{R} \right)^K M(R/r, h(z)) \le \left( \frac{r}{R} \right)^K m! \prod_{j=1}^{m} M(R, f_j),$$

from which the desired inequality follows. □
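A small numerical check of Lemma 20 can make the inequality concrete. The sketch below assumes the sample choice $m = 3$, $f_j(z) = z^{j-1}$ (so that $M(R, f_j) = R^{j-1}$ and $\Delta$ is a Vandermonde determinant), $r = 2$, $R = 10$, and the points $y_i$ equal to $r$ times the cube roots of unity. These values and the helper `det3` are illustrative assumptions, not part of the proof.

```python
import cmath
import math

def det3(a):
    """Determinant of a 3x3 matrix given as nested lists (cofactor expansion)."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

r, R, m = 2.0, 10.0, 3
# Sample points y_i on the circle |z| = r (cube roots of unity scaled by r).
ys = [r * cmath.exp(2j * math.pi * k / 3) for k in range(3)]
# Row i of the matrix is (f_1(y_i), f_2(y_i), f_3(y_i)) with f_j(z) = z^(j-1).
matrix = [[y ** j for j in range(m)] for y in ys]
delta = abs(det3(matrix))
# M(R, f_j) = R^(j-1) for these monomials.
max_product = math.prod(R ** j for j in range(m))
bound = (R / r) ** (-m * (m - 1) / 2) * math.factorial(m) * max_product
print(delta, bound)
```

For this choice the bound simplifies to $6r^3 = 48$, while $|\Delta| = 3\sqrt{3}\,r^3 \approx 41.57$, so the inequality holds with little room to spare.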

4.2 The case α, β ∈ R, α > 0

We first present a proof of the Gel’fond-Schneider theorem for α, β ∈ R and α > 0. The proof comes from course notes by Filaseta (2011). The proof is based on the method of interpolation determinants developed by Laurent (1994).

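Theorem 21. If $\alpha, \beta \in \overline{\mathbb{Q}} \cap \mathbb{R}$ with $\alpha > 0$ and $\alpha \neq 1$, and $\beta \notin \mathbb{Q}$, then $\alpha^{\beta}$ is transcendental.

An equivalent formulation of Theorem 21 is the following. Assume that $\alpha, \beta, \alpha^{\beta} \in \overline{\mathbb{Q}} \cap \mathbb{R}$ with $\alpha > 0$ and $\alpha \neq 1$. Then $\beta \in \mathbb{Q}$.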

Proof: Part of our arguments will also be needed in the proof of the general Gel’fond-Schneider theorem (Theorem 25), where the condition $\alpha, \beta \in \mathbb{R}$, $\alpha > 0$ is not needed. It is only when we apply Lemma 19 above that we have to assume $\alpha, \beta \in \mathbb{R}$, $\alpha > 0$. For the moment we assume $\alpha, \beta, \alpha^{\beta} \in \overline{\mathbb{Q}}$ with $\alpha \neq 0, 1$, where $\alpha^{\beta} = e^{\beta \log\alpha}$ for any choice of the branch of the logarithm. When we reach the point where we apply Lemma 19, we use the assumption $\alpha, \beta, \alpha^{\beta} \in \mathbb{R}$ and deduce that $\beta \in \mathbb{Q}$.

Let $L_0, L_1, S \in \mathbb{Z}_{\ge 2}$, $L = (L_0 + 1)(L_1 + 1)$, and $K \in \mathbb{R}$. $K$, $L_0$, $L_1$ and $L$ will be increasing functions of $S$, and repeatedly we will choose $S$ sufficiently large so that $K$ is sufficiently large. The functions $K$, $L_0$ and $L_1$ of $S$ must be chosen such that the inequalities

$$K L_0 \log S \le L, \quad K L_1 S \le L \quad\text{and}\quad L \le (2S-1)^2$$

hold. This can be done by taking, for example, $S$ large, $L_0 = \lfloor S \log S \rfloor$, $L_1 = \lfloor S/\log S \rfloor$ and $K = \log\log S$. If we combine the first two inequalities we obtain

$$K L (L_0 \log S + L_1 S) \le 2L^2. \tag{16}$$
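The example choice of parameters can be checked numerically. The following sketch, with hypothetical helper names `parameters` and `inequalities_hold`, evaluates $L_0 = \lfloor S \log S \rfloor$, $L_1 = \lfloor S/\log S \rfloor$, $K = \log\log S$ for a given $S$ and tests the three required inequalities together with (16). It only confirms the inequalities for the sampled values of $S$; it does not address how large $S$ must be for $K$ to exceed the constants chosen later in the proof.

```python
import math

def parameters(S):
    """Example parameter choice from the text: L0, L1, K and L as functions of S."""
    L0 = math.floor(S * math.log(S))
    L1 = math.floor(S / math.log(S))
    K = math.log(math.log(S))
    L = (L0 + 1) * (L1 + 1)
    return L0, L1, K, L

def inequalities_hold(S):
    """Check K*L0*log S <= L, K*L1*S <= L, L <= (2S-1)^2 and inequality (16)."""
    L0, L1, K, L = parameters(S)
    first = K * L0 * math.log(S) <= L
    second = K * L1 * S <= L
    third = L <= (2 * S - 1) ** 2
    combined = K * L * (L0 * math.log(S) + L1 * S) <= 2 * L ** 2
    return first and second and third and combined

for S in (10, 100, 1000):
    print(S, parameters(S), inequalities_hold(S))
```

Since $K = \log\log S$ grows very slowly, a requirement such as $K \ge 4c_1$ may force $S$ to be far larger than the sample values used here.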

Next, consider some arrangement

$$(s_1(i), s_2(i))_{i=1}^{(2S-1)^2}$$

of the pairs $(s_1, s_2) \in \{-(S-1), \ldots, S-1\} \times \{-(S-1), \ldots, S-1\}$, that is, of the integer pairs with $|s_1|, |s_2| < S$. Furthermore, let

$$(u(j), v(j))_{j=1}^{L}$$

be an arrangement of the pairs $(u, v) \in \{0, \ldots, L_0\} \times \{0, \ldots, L_1\}$. We define the $(2S-1)^2 \times L$ matrix

$$\mathcal{M} = \left( (s_1(i) + s_2(i)\beta)^{u(j)} \left( \alpha^{s_1(i) + s_2(i)\beta} \right)^{v(j)} \right).$$

Let

$$f_j(z) = z^{u(j)} \alpha^{v(j) z} = z^{u(j)} e^{v(j) z \log\alpha}$$

for $j \in \{1, 2, \ldots, L\}$ be functions in the complex variable $z$, let

$$y_i = s_1(i) + s_2(i)\beta$$

for $i \in \{1, 2, \ldots, L\}$, and consider the determinant $\Delta = \det(f_j(y_i))$ of an arbitrary $L \times L$ submatrix $(f_j(y_i))$. We will show that all $L \times L$ submatrices of $\mathcal{M}$ have determinant zero. Under the assumption that $\Delta \neq 0$ we derive an analytic upper bound and an algebraic lower bound for $\log|\Delta|$, from which we derive a contradiction.

We first derive an upper bound. We have $\alpha^{v(j) z} = \exp(v(j) z \log\alpha)$, and $f_j(z)$ represents an entire function for each $j \in \{1, 2, \ldots, L\}$. For $z_1, z_2 \in \mathbb{C}$ we have

$$|e^{z_1 z_2}| = e^{\operatorname{Re}(z_1 z_2)} \le e^{|z_1 z_2|} = e^{|z_1||z_2|}.$$

Hence, for any $R \in \mathbb{R}_{>0}$, we have

$$M(R, f_j) = M(R, z^{u(j)} e^{v(j) z \log\alpha}) \le R^{u(j)} e^{v(j) R |\log\alpha|}. \tag{17}$$

Taking the log on both sides of (17) we obtain the inequality

$$\log M(R, f_j) \le u(j) \log R + v(j) R |\log\alpha| \le L_0 \log R + L_1 R |\log\alpha|. \tag{18}$$

Next, applying Lemma 20 to $\Delta$ with $r = S(1 + |\beta|)$ and $R = e^2 r$ we obtain the inequality

$$|\Delta| \le e^{-L(L-1)} L! \prod_{j=1}^{L} M(R, f_j). \tag{19}$$

Taking the log on both sides of (19), and using inequality (18), we obtain

$$\begin{aligned} \log|\Delta| &\le -L(L-1) + \log L! + \sum_{j=1}^{L} \log M(R, f_j) \\ &\le -L^2 + L + L \log L + L \max_{1 \le j \le L} \{\log M(R, f_j)\} \\ &\le -L^2 + L\left(1 + \log L + L_0 \log R + L_1 R |\log\alpha|\right) \end{aligned}$$

or

$$\log|\Delta| \le -L^2 + c_1 L (L_0 \log S + L_1 S) \tag{20}$$

for some absolute constant $c_1 \in \mathbb{R}$ independent of $S$. If we choose $S$ such that $K \ge 4c_1$, inequality (16) becomes

$$c_1 L (L_0 \log S + L_1 S) \le \frac{L^2}{2}.$$

Combining this inequality with (20) we obtain

$$\log|\Delta| \le -\frac{L^2}{2}, \tag{21}$$

which specifies an upper bound for $\log|\Delta|$.

Next, we derive a lower bound for $\log|\Delta|$ under the assumption that $\Delta \neq 0$. Fix $T \in \mathbb{Z}_{>0}$ such that $T\alpha$, $T\beta$ and $T\alpha^{\beta}$ are algebraic integers. Then $T^{L_0 + 2L_1 S}$ has the property that $T^{L_0 + 2L_1 S}$ times any element of $\mathcal{M}$, and hence $T^{L_0 + 2L_1 S}$ times any element of the matrix describing $\Delta$, is an algebraic integer. Hence, $T^{L(L_0 + 2L_1 S)}\Delta$ is an algebraic integer in $\mathbb{Q}(\alpha, \beta, \alpha^{\beta})$. Thus $T^{L(L_0 + 2L_1 S)}\Delta$ is a zero of a monic polynomial of degree $N$, where $N$ is at most the product of the degrees of the minimal polynomials of $\alpha$, $\beta$ and $\alpha^{\beta}$.

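The house $\overline{|\Delta|}$ is the maximum of the absolute values of $\Delta$ and its conjugates. We have the upper bound

$$\overline{|\Delta|} \le L!\, S^{L_0 L} \left(1 + \overline{|\beta|}\right)^{L_0 L} \left(1 + \overline{|\alpha|}\right)^{L_1 L S} \left(1 + \overline{|\alpha^{\beta}|}\right)^{L_1 L S}.$$

In the latter inequality we have taken into consideration that $\overline{|\beta|}$, $\overline{|\alpha|}$ and $\overline{|\alpha^{\beta}|}$ may be smaller than 1. If $\Delta \neq 0$, it follows from Lemma 13 that

$$|\Delta| \ge T^{-NL(L_0 + 2L_1 S)}\, \overline{|\Delta|}^{\,1-N} \ge T^{-NL(L_0 + 2L_1 S)}\, \overline{|\Delta|}^{\,-N}.$$

Combining these two inequalities we obtain

$$|\Delta| \ge T^{-NL(L_0 + 2L_1 S)} (L!)^{-N} S^{-N L_0 L} \left(1 + \overline{|\beta|}\right)^{-N L_0 L} \left(1 + \overline{|\alpha|}\right)^{-N L_1 L S} \left(1 + \overline{|\alpha^{\beta}|}\right)^{-N L_1 L S}.$$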

Since $N \log L! \le N L \log L$, we obtain, after taking the log on both sides, the inequality

$$\begin{aligned} \log|\Delta| \ge{} & -NL(L_0 + 2L_1 S) \log T - NL \log L - N L_0 L \log S \\ & - N L_0 L \log\left(1 + \overline{|\beta|}\right) - N L_1 L S \log\left(1 + \overline{|\alpha|}\right) - N L_1 L S \log\left(1 + \overline{|\alpha^{\beta}|}\right). \end{aligned}$$

Since $N$, $T$, $\log(1 + \overline{|\beta|})$, $\log(1 + \overline{|\alpha|})$ and $\log(1 + \overline{|\alpha^{\beta}|})$ are constants that only depend on $\alpha$ and $\beta$, there is an absolute constant $c_2 \in \mathbb{R}$ independent of $S$ for which

$$\log|\Delta| \ge -c_2 L (L_0 + \log L + L_0 \log S + L_1 S).$$

If we choose $S$ sufficiently large we obtain the inequality

$$\log|\Delta| \ge -c_3 L (L_0 \log S + L_1 S), \tag{22}$$

for some absolute constant $c_3 \in \mathbb{R}$ independent of $S$. Furthermore, if we choose $S$ such that $K \ge 6c_3$, inequality (16) becomes

$$c_3 L (L_0 \log S + L_1 S) \le \frac{L^2}{3}.$$

Combining this inequality with (22) we obtain

$$\log|\Delta| \ge -\frac{L^2}{3}, \tag{23}$$

which specifies a lower bound for $\log|\Delta|$. We now get a contradiction between the upper bound in (21) and the lower bound in (23). Since $\Delta$ was the determinant of an arbitrary $L \times L$ submatrix, this shows indeed that all $L \times L$ submatrices of $\mathcal{M}$ have determinant zero.

Since $\Delta = \det(f_j(y_i)) = 0$ for any sub-determinant $\Delta$, it follows that the columns of the matrix $(f_j(y_i))$ are linearly dependent over $\mathbb{R}$. Hence, there exist $b_1, b_2, \ldots, b_L \in \mathbb{R}$, not all 0, such that

$$\sum_{j=1}^{L} b_j f_j(y_i) = 0, \quad\text{for } i \in \{1, 2, \ldots, (2S-1)^2\}. \tag{24}$$

Since $f_j(y_i) = y_i^{u(j)} \alpha^{v(j) y_i}$, identity (24) becomes

$$\sum_{j=1}^{L} b_j y_i^{u(j)} \alpha^{v(j) y_i} = 0, \quad\text{for } i \in \{1, 2, \ldots, (2S-1)^2\}. \tag{25}$$

If we consider identity (25) for all pairs $(u, v)$ with $u \in \{0, 1, \ldots, L_0\}$ and $v \in \{0, 1, \ldots, L_1\}$, we obtain

$$\sum_{v=0}^{L_1} \left( \sum_{u=0}^{L_0} b_{(L_0+1)v+u+1}\, y_i^u \right) \alpha^{v y_i} = 0, \quad\text{for } i \in \{1, 2, \ldots, (2S-1)^2\}. \tag{26}$$

Choosing

$$a_v(t) = \sum_{u=0}^{L_0} b_{(L_0+1)v+u+1}\, t^u, \quad w_v = v \log\alpha, \quad\text{and}\quad t = y_i = s_1(i) + s_2(i)\beta,$$

we can write the left-hand side of (26) as

$$\sum_{v=0}^{L_1} \left( \sum_{u=0}^{L_0} b_{(L_0+1)v+u+1}\, y_i^u \right) \alpha^{v y_i} = \sum_{v=0}^{L_1} a_v(t) e^{w_v t}.$$

Each of the $(2S-1)^2$ values $y_i$ is a zero of $\sum_{v=0}^{L_1} a_v(t) e^{w_v t}$. Note that this sum consists of $L_1 + 1$ polynomials, each of degree $L_0$. We now at last use our assumption $\alpha, \beta \in \mathbb{R}$, $\alpha > 0$, and apply Lemma 19. By that lemma, there are at most

$$L_0(L_1 + 1) + (L_1 + 1) - 1 = L - 1$$

distinct zeros. Since $L - 1 < L \le (2S-1)^2$, two of the $y_i$ must be the same, and we have

$$s_1(i) + s_2(i)\beta = s_1(i') + s_2(i')\beta \quad\text{for some } i, i' \text{ with } 1 \le i < i' \le (2S-1)^2.$$

Since the pairs $(s_1(i), s_2(i))$ and $(s_1(i'), s_2(i'))$ are distinct, we have $s_2(i) \neq s_2(i')$, and it follows that

$$\beta = \frac{s_1(i') - s_1(i)}{s_2(i) - s_2(i')} \in \mathbb{Q}.$$

□

4.3 The general case

In this subsection we consider the Gel’fond-Schneider theorem for the complex case. In the proof for the real case in the previous subsection, the assumptions α, β ∈ R and α > 0 were used only in Lemma 19. The idea is to replace Lemma 19 by Proposition 22 for complex numbers. The following result comes from Tijdeman (1971).

Proposition 22. Let $a_1(z), \ldots, a_n(z)$ be non-zero polynomials in $\mathbb{C}[z]$ of degrees $d_1, \ldots, d_n$ respectively, let $w_1, \ldots, w_n$ be pairwise distinct complex numbers, let

$$f(z) = \sum_{k=1}^{n} a_k(z) e^{w_k z},$$

and put

$$E = n + \sum_{k=1}^{n} d_k, \quad\text{and}\quad m = \max_k |w_k|.$$

Furthermore, let $R, s, t \in \mathbb{R}_{>0}$, $s > 1$, and let $y \in \mathbb{C}$. Then

$$N(R, y, f) \le \frac{1}{\log s} \left( (E-1) \log\frac{st + s + t}{t} + (st + s + 2t) R m + \frac{1}{s} \right). \tag{27}$$

We first prove the following result of Tijdeman (1971, Lemma 1).

Lemma 23. Let $R, s, t \in \mathbb{R}_{>0}$, $s > 1$, and let $f \neq 0$ be analytic on $\overline{D}((st + s + t)R)$. Then

$$N(R, f) \le \frac{1}{\log s} \log\frac{M((st + s + t)R, f)}{M(tR, f)}.$$

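Proof: Let $w \in \overline{D}(tR)$ be such that $|f(w)| = M(tR, f)$. It then follows that

$$\overline{D}(R) \subset \overline{D}((1 + t)R, w) \tag{28}$$

and

$$\overline{D}((st + s)R, w) \subset \overline{D}((st + s + t)R). \tag{29}$$

By Jensen’s formula (Greene, Krantz 2006, p. 279) we have

$$\int_0^{sR} \frac{N(r, w, f)}{r}\, dr = \frac{1}{2\pi} \int_0^{2\pi} \log\left| f\left(w + sRe^{i\theta}\right) \right| d\theta - \log|f(w)|.$$

We also have

$$\int_0^{sR} \frac{N(r, w, f)}{r}\, dr \ge \int_R^{sR} \frac{N(R, w, f)}{r}\, dr = N(R, w, f) \log s.$$

Combining the two previous formulas we obtain

$$N(R, w, f) \le \frac{1}{\log s}\, M\left( sR, w, \log\frac{|f|}{|f(w)|} \right).$$

This inequality, together with the inclusions (28) and (29), implies that

$$N(R, f) \le N((1 + t)R, w, f) \le \frac{1}{\log s}\, M\left( sR(1 + t), w, \log\frac{|f|}{|f(w)|} \right) \le \frac{1}{\log s} \log\frac{M((st + s + t)R, f)}{M(tR, f)}.$$

□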

Lemma 23 is used in the proof of Proposition 22. The following result from Balkema and Tijdeman (1973, Theorem 2) is also used in the proof of Proposition 22.

Lemma 24. Let $a_1(z), \ldots, a_n(z)$ be non-zero polynomials in $\mathbb{C}[z]$ of degrees $d_1, \ldots, d_n$ respectively, let $w_1, \ldots, w_n$ be pairwise distinct complex numbers, let

$$f(z) = \sum_{k=1}^{n} a_k(z) e^{w_k z},$$

and put

$$E = n + \sum_{k=1}^{n} d_k, \quad\text{and}\quad m = \max_k |w_k|.$$

Furthermore, let $R, \gamma \in \mathbb{R}_{>0}$, $\gamma > 1$. Then

$$M(\gamma R, f) \le \frac{\gamma^E - 1}{\gamma - 1}\, e^{Rm(\gamma + 1)} M(R, f).$$

We are now ready to present the proof of Proposition 22.

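Proof of Proposition 22: Let $\gamma \in \mathbb{R}_{>1}$. Using Lemma 24 we obtain the inequality

$$M(\gamma t R, f) \le \frac{\gamma^E - 1}{\gamma - 1}\, e^{tRm(\gamma + 1)} M(tR, f).$$

Taking $\gamma = (st + s + t)/t$ we have

$$\frac{\gamma^E - 1}{\gamma - 1} \le \frac{t}{st + s} \left( \frac{st + s + t}{t} \right)^E \le \left( 1 + \frac{1}{s} \right) \left( \frac{st + s + t}{t} \right)^{E-1}.$$

Combining the previous two inequalities we obtain

$$M((st + s + t)R, f) \le \left( 1 + \frac{1}{s} \right) \left( \frac{st + s + t}{t} \right)^{E-1} e^{(st + s + 2t)Rm} M(tR, f).$$

Combining this inequality with Lemma 23, and using $\log(1 + 1/s) \le 1/s$, we obtain the desired inequality. □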
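The algebra behind the choice $\gamma = (st + s + t)/t$ can be verified with exact rational arithmetic. The sketch below, using a hypothetical helper `gamma_chain` and the sample values $s = 5$, $t = 1/5$, $E = 10$ (assumptions chosen only for illustration), checks the chain of inequalities for $(\gamma^E - 1)/(\gamma - 1)$ and the identity $t(\gamma + 1) = st + s + 2t$ that produces the exponent in the final bound.

```python
from fractions import Fraction

def gamma_chain(s, t, E):
    """Return gamma and the three quantities compared in the proof."""
    gamma = (s * t + s + t) / t
    left = (gamma ** E - 1) / (gamma - 1)
    middle = t / (s * t + s) * gamma ** E
    right = (1 + 1 / s) * gamma ** (E - 1)
    return gamma, left, middle, right

# Sample values; exact fractions avoid floating-point rounding.
s, t, E = Fraction(5), Fraction(1, 5), 10
gamma, left, middle, right = gamma_chain(s, t, E)
chain_holds = left <= middle <= right
exponent_matches = t * (gamma + 1) == s * t + s + 2 * t
print(gamma, chain_holds, exponent_matches)
```

Using `Fraction` rather than floats is a deliberate choice here, because the first two quantities differ only by $1/(\gamma - 1)$, which is tiny relative to $\gamma^E$.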

In the complex case, the Gel’fond-Schneider theorem holds for every branch of the complex logarithm. By replacing Lemma 19 with Proposition 22 in the proof of Theorem 21 we obtain the following result.

Theorem 25. Let $\alpha, \beta \in \overline{\mathbb{Q}}$ with $\alpha \neq 0, 1$ and $\beta \notin \mathbb{Q}$. For any branch of $\log z$ we have that $\alpha^{\beta} = e^{\beta \log\alpha}$ is transcendental.

Proof: Following the same set-up and arguments as in the proof of Theorem 21, we find at the end that each of the $(2S-1)^2$ values of $y_i$ is a zero of $f(z) = \sum_{v=0}^{L_1} a_v(z) e^{w_v z}$. Let $E$, $R$ and $m$ be as defined in Proposition 22, and let 0 be the center of the disc $\overline{D}(R)$. Taking $s = 5$ and $t = \frac{1}{5}$ in inequality (27), and using that

$$\frac{\log 31}{\log 5} < 2.2 \quad\text{and}\quad \frac{32}{5 \log 5} < 3.98,$$

while the remaining term $1/(5 \log 5) < 0.13$ is absorbed because $E \ge 2$, we obtain

$$N(R, f) \le 3(E - 1) + 4Rm. \tag{30}$$
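The numerical constants obtained from (27) with $s = 5$ and $t = 1/5$ can be evaluated directly. The sketch below computes the three coefficients $\log 31/\log 5$, $32/(5 \log 5)$ and $1/(5 \log 5)$ and compares the resulting bound with $3(E-1) + 4Rm$; the function names `bound_27` and `bound_30` are illustrative assumptions.

```python
import math

s, t = 5.0, 0.2
ratio = (s * t + s + t) / t                    # (st + s + t)/t, equal to 31 here
coeff_E = math.log(ratio) / math.log(s)        # multiplies (E - 1) in (27)
coeff_Rm = (s * t + s + 2 * t) / math.log(s)   # multiplies R*m in (27)
constant = (1 / s) / math.log(s)               # the term 1/(s log s)

def bound_27(E, Rm):
    """Right-hand side of (27) for s = 5, t = 1/5."""
    return coeff_E * (E - 1) + coeff_Rm * Rm + constant

def bound_30(E, Rm):
    """Right-hand side of (30)."""
    return 3 * (E - 1) + 4 * Rm

print(round(coeff_E, 4), round(coeff_Rm, 4), round(constant, 4))
```

The coefficient of $Rm$ evaluates to roughly 3.977, so it lies below 4 but not below 3.9; the constant term is absorbed by the slack in the coefficient of $E - 1$ as soon as $E \ge 2$.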

The sum $f(z)$ consists of $L_1 + 1$ polynomials, each of degree $L_0$. Hence,

$$E = L_1 + 1 + L_0(L_1 + 1) = L.$$

The complex numbers are of the form $y_i = s_1(i) + s_2(i)\beta$ where $s_1, s_2 \in \mathbb{Z}$ with $|s_1|, |s_2| < S$. Hence, since we consider the disc with center 0, we have the upper bound

$$R \le S(1 + |\beta|).$$

Finally, we have

$$m = \max_{v \in \{0, \ldots, L_1\}} |w_v| = \max_{v \in \{0, \ldots, L_1\}} |v \log\alpha| = L_1 |\log\alpha|.$$

Using the values of $E$ and $m$ and the upper bound for $R$ in inequality (30) we obtain that the number of zeros of $f$ satisfies

$$N(f) \le 3(L - 1) + 4 S L_1 (1 + |\beta|) |\log\alpha|.$$

Using here the specific definitions $L_0 = \lfloor S \log S \rfloor$ and $L_1 = \lfloor S/\log S \rfloor$, we obtain for sufficiently large $S$ that

$$N(f) \le 3\left( S^2 + S \log S + \frac{S}{\log S} \right) + \frac{4S^2}{\log S} (1 + |\beta|) |\log\alpha| < 4S^2 - 4S + 1 = (2S-1)^2.$$

Hence, at least two of the $y_i$ must be the same, which completes the proof. □
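The phrase "for sufficiently large $S$" in the final estimate can be made tangible with a short computation. The sketch below evaluates the upper bound for $N(f)$ and compares it with $(2S-1)^2$; the sample values $\alpha = 2$, $\beta = \sqrt{2}$, the use of the principal branch of the logarithm, and the helper names are assumptions made for this example.

```python
import cmath
import math

def zero_bound(S, alpha, beta):
    """Evaluate 3(S^2 + S log S + S/log S) + (4 S^2/log S)(1 + |beta|)|log alpha|."""
    log_S = math.log(S)
    main = 3 * (S ** 2 + S * log_S + S / log_S)
    # Principal branch of the complex logarithm (an assumption of this sketch).
    extra = 4 * S ** 2 / log_S * (1 + abs(beta)) * abs(cmath.log(alpha))
    return main + extra

def enough_points(S, alpha, beta):
    """True when the bound on the number of zeros is below (2S-1)^2."""
    return zero_bound(S, alpha, beta) < (2 * S - 1) ** 2

for S in (100, 10 ** 6):
    print(S, enough_points(S, 2, math.sqrt(2)))
```

For these sample values the inequality fails at $S = 100$ and holds at $S = 10^6$, which reflects that the term $4S^2/\log S$ only becomes small relative to $S^2$ once $\log S$ is large.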

References

Abramowitz M, Stegun IA (1970) Handbook of Mathematical Functions (with Formulas, Graphs and Mathematical Tables). Dover Publications, New York.

Apéry R (1979) Irrationalité de ζ(2) et ζ(3). Astérisque, 61, 11-13.

Baker A (1975) Transcendental Number Theory. Cambridge University Press, Cambridge.

Balkema AA, Tijdeman R (1973) Some estimates in the theory of exponential sums. Acta Mathematica Academiae Scientiarum Hungaricae, 24, 115-133.

Beukers F (1979) A note on the irrationality of ζ(2) and ζ(3). Bulletin of the London Mathematical Society, 11, 268-272.

Burger EB, Tubbs R (2004) Making Transcendence Transparent. Springer, New York.

Cantor GFLP (1874) Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen. Journal für die reine und angewandte Mathematik, 77, 258-262.

Dunham W (1990) Journey through Genius: The Great Theorems of Mathematics. John Wiley & Sons.

Filaseta M (2011) Transcendental Number Theory. Course notes, University of South Carolina.

Greene RE, Krantz SG (2006) Function Theory of One Complex Variable. American Mathematical Society, Rhode Island.

Laurent M (1994) Linear forms in two logarithms and interpolation determinants. Acta Arithmetica, 66, 181-199.

Niven I (1947) A simple proof that π is irrational. Bulletin of the American Mathematical Society, 53, 509.

Parks AE (1986) Pi, e and other irrational numbers. The American Mathematical Monthly, 93, 722-723.

Rivoal T (2000) La fonction zêta de Riemann prend une infinité de valeurs irrationnelles aux entiers impairs. Comptes Rendus de l’Académie des Sciences. Série I. Mathématique, 331, 267-270.

Shidlovskii AB (1989) Transcendental Numbers. Walter de Gruyter, Berlin.

Tijdeman R (1971) On the number of zeros of general exponential polynomials. Indagationes Mathematicae, 74, 1-7.

Zudilin W (2001) One of the numbers ζ(5), ζ(7), ζ(9), ζ(11) is irrational. Russian Mathematical Surveys, 56, 774-776.
