THE DIVERGENCE OF THE SUM OF RECIPROCALS OF PRIMES AND MERTENS’ THEOREM. A Writing Project Presented to The Faculty of the Department of Mathematics, San José State University, In Partial Fulfillment of the Requirements for the Degree Master of Arts, by Simon A. Ward, May 2009

Divergence of Prime Reciprocals


A thorough investigation of the sum over prime reciprocals.


THE DIVERGENCE OF THE SUM OF RECIPROCALS OF PRIMES AND MERTENS’ THEOREM

A Writing Project

Presented to

The Faculty of the Department of Mathematics

San Jose State University

In Partial Fulfillment

of the Requirements for the Degree

Master of Arts

by

Simon A. Ward

May 2009


© 2009

Simon A. Ward

ALL RIGHTS RESERVED


APPROVED FOR THE DEPARTMENT OF MATHEMATICS

Dr. Daniel Goldston

Dr. Marylin Blockus

Dr. Mohammed Saleem

APPROVED FOR THE UNIVERSITY


ABSTRACT

THE DIVERGENCE OF THE SUM OF RECIPROCALS OF PRIMES AND MERTENS’ THEOREM

by Simon A. Ward

The divergence or convergence of the series of prime reciprocals was finally resolved by Euler in 1744. Euler showed directly that this series is divergent, which shed some light on the density of primes. The divergence shows that the primes are not so sparse that the sum of their reciprocals converges. The asymptotic formula for the partial sums of reciprocals of primes is the cornerstone of Mertens’ theorem. The formula is derived and the constant in the formula is analyzed to show that the probability that a random integer is prime decreases to zero with large numbers. This probability estimate is known as the product form of Mertens’ theorem. Many proofs of the divergence of the series of prime reciprocals are reviewed in detail, including modern proofs by Erdős, Dux, Clarkson and Niven. Chebyshev’s theorem provides an upper and lower bound for the number of primes up to any given number. The proof of this theorem is explained in detail, and the divergence of the series of prime reciprocals is shown to be a consequence. This paper details many of the important results in the history of the study of the distribution of prime numbers. The divergence of the sum of prime reciprocals is shown from many different approaches, and along the way many supporting lemmas and other useful results in elementary number theory are discussed and proved. The reader will come away with a good understanding of the problem of counting prime numbers and a motivation for understanding and proving the prime number theorem.


ACKNOWLEDGEMENTS

I would like to thank Daniel Goldston for the guidance and resources he provided. Thank you to committee members Marylin Blockus and Mohammed Saleem for many helpful suggestions regarding this project. Thank you to my wife Nora and my family for their support while I completed this project.


TABLE OF CONTENTS

§1: Introduction

1.1: Prime Numbers
1.2: A Formula for Prime Numbers
1.3: The Prime Number Theorem
1.4: Euler’s Product
1.5: The Divergence of the Sum of Prime Reciprocals

§2: The Divergence of $\sum_p \frac{1}{p}$

2.1: Euler’s Proof
2.2: Erdős’ Proof
2.3: Dux’s Proof
2.4: Clarkson’s Proof
2.5: Niven’s Proof

§3: Chebyshev’s Theorem

3.1: Theorem (Chebyshev)
3.2: The Chebyshev Function Theorem
3.3: Proof of Chebyshev’s Theorem
3.4: Another Small Step Towards the Prime Number Theorem

§4: Mertens’ Theorem

4.1: Introduction
4.2: Preliminary Lemmas
4.3: Proof of Mertens’ Theorem

§5: The Product Form of Mertens’ Theorem

5.1: The Euler-Mascheroni Constant
5.2: The Product Form as a Probability of a Number Being Prime
5.3: Proof of the Product Form of Mertens’ Theorem
5.4: Another Proof of the Product Form of Mertens’ Theorem

§6: Conclusion

6.1: Final Remarks
6.2: Further Results

References


LIST OF FIGURES

Figure

1.3.1 A PDF figure

1.3.2 A PDF figure

1.5.1 A PDF figure

1.5.2 A PDF figure

5.2.1 A PDF figure

5.2.2 A PDF figure


§1: Introduction

1.1 Prime Numbers

The prime numbers are the natural numbers which have exactly two distinct positive divisors, one and themselves. The Chinese thought of the prime numbers as “macho” numbers, which resisted any attempt to break them down into a product of smaller integers [17]. From the fundamental theorem of arithmetic, we can conclude that the prime numbers are the most basic elements of the integers. That is, every integer greater than 1 can be written uniquely as a product of primes, with the prime factors in the product written in nondecreasing order [15]. A composite number is a natural number greater than 1 that is not prime. Thus 1 is the only natural number which is neither prime nor composite. The integers also form an integral domain, whose field of quotients is the rational numbers, which in turn can be used to construct the real numbers [11]. The fundamental importance of prime numbers to mathematics makes them worthy of special study.

The ancient civilizations which emerged in Babylon, Egypt and China have left evidence that they had an understanding of prime numbers. However, the properties of prime numbers were first formally documented by the Pythagoreans before 500 B.C. in Greece [21]. In the third century B.C., the Greek Eratosthenes invented his famous sieve, which could, with relative ease, determine all the primes up to a given number. For example, suppose we want to create a list of primes up to a modest number, say 100. The Sieve of Eratosthenes works by starting with 2 and deleting all numbers greater than 2 that are multiples of 2 (the even numbers), then all numbers greater than 3 that are multiples of 3, and so on, deleting for each successive prime not exceeding $\sqrt{100} = 10$ all larger multiples of that prime. We do not need to consider a prime greater than 10, since a composite number all of whose prime factors are greater than 10 would be at least $11^2 = 121$, which exceeds 100. The numbers that remain in the list must be prime. This method allowed mathematicians from antiquity to begin making tables of prime numbers [17]. Around the same time as Eratosthenes, Euclid’s book Elements was published. This book contained a proof that there are infinitely many primes. The elegance of the proof allows us to present it here in one short paragraph.
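The sieve just described can be sketched in a few lines of Python. This is a minimal illustration, not a historical reconstruction: the function name `sieve_of_eratosthenes` is an assumption, and the sketch starts crossing out at $p^2$ rather than at $2p$, a standard shortcut that gives the same list because the smaller multiples of $p$ have already been removed by smaller primes.

```python
def sieve_of_eratosthenes(limit):
    """Return the list of all primes p with 2 <= p <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:  # only primes up to sqrt(limit) are needed
        if is_prime[p]:
            # Cross out the multiples of p, starting at p*p.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n in range(2, limit + 1) if is_prime[n]]


primes_to_100 = sieve_of_eratosthenes(100)
print(primes_to_100)
```

For a limit of 100, only the primes 2, 3, 5 and 7 are ever used for crossing out, in agreement with the bound $\sqrt{100} = 10$.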

Suppose there are only $n$ primes, and list them as $2, 3, 5, \ldots, p_n$. Take the product of these $n$ primes, $2 \cdot 3 \cdots p_{n-1} \cdot p_n$, and add one to it, forming the integer $N = 2 \cdot 3 \cdots p_{n-1} \cdot p_n + 1$. By the fundamental theorem of arithmetic, $N$ must have a prime divisor. But if the prime divisor is one of the listed primes, say $p_i$, then $p_i$ must divide $N - 2 \cdot 3 \cdots p_{n-1} \cdot p_n = 1$. This implies that $p_i$ divides one, which is a contradiction. Thus our assumption that there are finitely many primes is false, and hence there must be infinitely many primes. Euclid was responsible for the only known proof of the infinitude of the primes for over 2,000 years!
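Euclid’s construction can be tried on the first six primes. The sketch below is only an illustration, and the helper name `smallest_prime_factor` is hypothetical. It shows that $N$ need not itself be prime, but that its prime divisors are never among the primes that were multiplied together.

```python
def smallest_prime_factor(n):
    """Return the smallest prime divisor of n (for n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor found up to sqrt(n), so n itself is prime


first_primes = [2, 3, 5, 7, 11, 13]
N = 1
for p in first_primes:
    N *= p
N += 1  # product of the listed primes, plus one

q = smallest_prime_factor(N)
print(N, q, q in first_primes)
```

Each listed prime leaves remainder 1 when dividing $N$, which is exactly the step used in the proof.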

1.2 A Formula for Prime Numbers?

A natural question one might have when considering the sequence of prime numbers $2, 3, 5, \ldots, p_n, \ldots$ is whether or not there is a formula that will yield the $n$th prime. That is, can one find a formula involving $n$ which will produce the $n$th prime? Mathematicians have tried for centuries to find such a formula; however, no simple formula is known. It seems unlikely that one exists, since the gaps between consecutive primes appear to be irregular. One might then search for a weaker function, that is, a function that assumes only prime values, not necessarily sequentially. Over the course of mathematical history, there have been a few formulas that have been conjectured to be prime valued. One such function was proposed by Pierre de Fermat (1605-1665), and produces the $n$th “Fermat” number $F_n = 2^{2^n} + 1$. The first four Fermat numbers after $F_0 = 3$, namely 5, 17, 257 and 65,537, are all prime. The reader might recognize $F_n$ from a well known theorem of Carl Friedrich Gauss (1777-1855): if $F_n$ is prime, then a regular polygon of $F_n$ sides can be inscribed in a circle with only a compass and straightedge. Fermat’s conjecture was proved false by Euler in 1732, when he showed that 641 is a divisor of the fifth Fermat number $F_5 = 2^{32} + 1 = 4{,}294{,}967{,}297$. Another theorem of Fermat provides a way to find divisors of very large numbers [15].
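Euler’s counterexample is easy to confirm with exact integer arithmetic. The following is a minimal sketch; the function name `fermat_number` is an assumption, and the index is taken to start at $n = 0$, so that the number divisible by 641 is $F_5$.

```python
def fermat_number(n):
    """Return the Fermat number F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


small_fermat_numbers = [fermat_number(n) for n in range(5)]  # F_0, ..., F_4
F5 = fermat_number(5)
quotient, remainder = divmod(F5, 641)

print(small_fermat_numbers)
print(F5, quotient, remainder)
```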

Fermat’s Little Theorem: If $p$ is a prime and $a$ is a positive integer with $(a, p) = 1$, then $a^{p-1} \equiv 1 \pmod{p}$, where $(m, n)$ denotes the greatest common divisor of $m$ and $n$.

Proof: Consider the $p - 1$ integers $a, 2a, 3a, \ldots, (p-1)a$. None of these integers is divisible by $p$, since if $p \mid ka$, then since $(a, p) = 1$, $p \mid k$. But $1 \le k \le p - 1$, and hence it is not possible that $p \mid k$. Furthermore, no two of these integers are congruent mod $p$. For if $sa \equiv ta \pmod{p}$ with $1 \le s < t \le p - 1$, then since $(a, p) = 1$, $s \equiv t \pmod{p}$, which is impossible because $0 < t - s < p$. Since none of the integers is divisible by $p$ and no two are congruent to each other, they represent, in some order, the nonzero residue classes mod $p$. That is, they are congruent to $1, 2, \ldots, p - 1$ in some order. Therefore, by the multiplication property of congruences, we have $a^{p-1}(p-1)! \equiv (p-1)! \pmod{p}$. Now since $((p-1)!, p) = 1$, it follows that $a^{p-1} \equiv 1 \pmod{p}$.

The power of this theorem is that it gives information about the divisors of very large integers, for Fermat’s little theorem states that $a^{p-1} - 1$ is divisible by $p$ whenever $(a, p) = 1$. For example, since $(6, 97) = 1$, $6^{96} - 1$ is divisible by 97, and therefore composite.
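The example with $p = 97$ can be checked directly. This is a small sketch using Python’s built-in three-argument `pow` for modular exponentiation; the helper name `fermat_residue` is hypothetical.

```python
from math import gcd


def fermat_residue(a, p):
    """Return a^(p-1) mod p using fast modular exponentiation."""
    return pow(a, p - 1, p)


p = 97
# Collect the residues a^(p-1) mod p over every a coprime to p.
residues = {fermat_residue(a, p) for a in range(1, p) if gcd(a, p) == 1}
remainder = (6 ** 96 - 1) % 97

print(residues, remainder)
```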

Another formula that is used today to compute the largest known primes is the formula $2^n - 1$. This function is certainly not always prime valued, since $2^4 - 1 = 15 = 3 \cdot 5$. However, when this number is prime, then $n$ must be a prime by the following [10].

Theorem: If $n > 1$ and $a^n - 1$ is prime, then $a = 2$ and $n$ is prime.

Proof: Since $a^n - 1 = (a - 1)(a^{n-1} + a^{n-2} + \cdots + a + 1)$, if $a > 2$ then both factors exceed 1 and $a^n - 1$ is composite. Therefore $a = 2$. If $n$ were composite, say $n = st$ where $1 < s \le t < n$, then $2^n - 1 = (2^s)^t - 1 = (2^s - 1)\left((2^s)^{t-1} + (2^s)^{t-2} + \cdots + 2^s + 1\right)$. Since this number is prime, we must have $2^s - 1 = 1$, that is, $s = 1$. This is a contradiction, and hence $n$ is prime.

When $2^p - 1$ is prime, it is known as a “Mersenne” prime, named after the French monk Marin Mersenne (1588-1648). Mersenne claimed that this formula gave prime numbers for $n = 2, 3, 5, 7, 13, 17, 19, 31, 67, 127$ and $257$. It turns out that 67 and 257 do not give Mersenne primes. However, once a new prime $P$ is found, then one knows that $2^P - 1$ is a candidate for a large prime. The largest known prime to date is the Mersenne prime $2^{43{,}112{,}609} - 1$ [18], which has nearly 13,000,000 digits!
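The theorem and the start of Mersenne’s list can be compared by brute force for small exponents. The sketch below uses plain trial division, which is an assumption made for simplicity and is far too slow for the record-size primes mentioned above; the names `is_prime` and `mersenne_exponents` are illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small inputs used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# Exponents n <= 31 for which 2^n - 1 is prime.
mersenne_exponents = [n for n in range(2, 32) if is_prime(2 ** n - 1)]
print(mersenne_exponents)

# Prime exponents n <= 31 for which 2^n - 1 is nevertheless composite.
failures = [n for n in range(2, 32) if is_prime(n) and not is_prime(2 ** n - 1)]
print(failures)
```

Every exponent found is prime, as the theorem requires, while the second list shows that a prime exponent is necessary but not sufficient.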


Other formulas that have been conjectured to be only prime valued are $n^2 - n + 41$, which is prime for all $0 \le n \le 40$, and $n^2 - 79n + 1601$, which is prime for all $0 \le n \le 79$. One can stop the search for a prime valued polynomial function with integral coefficients by the following [10].

Theorem: No nonconstant polynomial $f(n)$ with integral coefficients can be prime for all $n$, or for all sufficiently large $n$.

Proof: We can assume that the leading coefficient of $f(n)$ is positive, so that $f(n) \to \infty$ as $n \to \infty$. Choose $N$ so large that $f(n) > 1$ for all $n > N$, fix such an $n$, and write

$$f(n) = a_s n^s + a_{s-1} n^{s-1} + \cdots + a_1 n + a_0 = M > 1.$$

Then

$$f(kM + n) = a_s (kM + n)^s + a_{s-1} (kM + n)^{s-1} + \cdots + a_1 (kM + n) + a_0$$

is divisible by $M$ for every integer $k$, because expanding each power shows that $f(kM + n) \equiv f(n) \equiv 0 \pmod{M}$. Since $f(kM + n) \to \infty$ as $k \to \infty$, we have $f(kM + n) > M$ for all large $k$.

Hence $f(n)$ is composite for infinitely many values of $n$. It follows that no such polynomial can be strictly prime valued.
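Both the claims about the two quadratic polynomials and the divisibility step of the proof can be checked numerically. This is a hedged sketch: the names `f`, `g` and `is_prime` are chosen here for illustration, and the choice $n = 2$ for the divisibility check is arbitrary.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small inputs used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def f(n):
    return n * n - n + 41


def g(n):
    return n * n - 79 * n + 1601


f_all_prime = all(is_prime(f(n)) for n in range(0, 41))  # 0 <= n <= 40
g_all_prime = all(is_prime(g(n)) for n in range(0, 80))  # 0 <= n <= 79

# Divisibility step of the proof: with M = f(n), M divides f(kM + n).
n = 2
M = f(n)
divisible = all(f(k * M + n) % M == 0 for k in range(1, 6))

print(f_all_prime, g_all_prime, M, divisible)
```

The first values outside the stated ranges, $f(41)$ and $g(80)$, are both equal to $41^2$ and so are composite.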

Not to be dissuaded, one might search for a function whose range contains an infinity of primes. A trivial example is $f(n) = n$. G. Lejeune Dirichlet (1805-1859) has given us the following.

Theorem: If $a$ and $b$ are relatively prime, then there are infinitely many primes of the form $an + b$.

There is no known simple proof of this theorem. Dirichlet’s original proof used complex variables. A very complicated elementary proof was found around 1950 by Atle Selberg (1917-2007) [15]. We will prove a special case here, that is

Theorem: There are infinitely many primes of the form $4n + 3$.

To prove this theorem we will require the following lemma.

Lemma: If $a$ and $b$ are integers of the form $4n + 1$, then $ab$ is also of this form.

Proof: If $a = 4r + 1$ and $b = 4s + 1$, then $ab = 16rs + 4r + 4s + 1 = 4(4rs + r + s) + 1$, which is of the form $4n + 1$.

Now we shall prove the desired result.

Proof: Suppose there are only finitely many primes of the form $4n + 3$, say $p_0 = 3, p_1, \ldots, p_k$. Let

$$Q = 4 p_1 \cdot p_2 \cdots p_k + 3.$$

Then by the fundamental theorem of arithmetic, $Q$ must have a prime factorization. Since $Q$ is odd, every prime in this factorization is odd, and at least one of them must be of the form $4n + 3$, because the odd primes must be in either the residue class $\{1\}$ or $\{3\}$ mod 4. If none were in $\{3\}$, the lemma above would imply that $Q$ is also of the form $4n + 1$, which would be a contradiction, since $Q$ is of the form $4n + 3$. Now, none of $p_0 = 3, p_1, \ldots, p_k$ divides $Q$. The prime 3 does not divide $Q$, for if $3 \mid Q$ then $3 \mid (4 p_1 \cdot p_2 \cdots p_k)$, which is impossible because none of the primes $p_1, \ldots, p_k$ equals 3. Similarly, none of the $p_i$ with $1 \le i \le k$ can divide $Q$, because if $p_i \mid Q$, then $p_i \mid 3$, which is clearly a contradiction. Thus $Q$ has a prime factor of the form $4n + 3$ that is not in our list, and therefore there are infinitely many primes of the form $4n + 3$. It is still an open question whether simple quadratic forms such as $n^2 + 1$ assume infinitely many prime values.
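The construction of $Q$ can be illustrated on a short list of primes of the form $4n + 3$. This is only a sketch under stated assumptions: the list `listed` is a partial list chosen for illustration (the proof assumes a complete one), and `prime_factors` is a hypothetical helper based on trial division.

```python
def prime_factors(n):
    """Return the prime factors of n, with multiplicity, in increasing order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # leftover factor is prime
    return factors


listed = [3, 7, 11, 19]  # p_0 = 3 followed by p_1, p_2, p_3
product = 4
for p in listed[1:]:     # p_0 = 3 is deliberately left out of the product
    product *= p
Q = product + 3

factors = prime_factors(Q)
new_primes = [q for q in factors if q % 4 == 3]
print(Q, factors, new_primes)
```

As the proof predicts, $Q$ has at least one prime factor congruent to 3 mod 4, and no such factor belongs to the list that was used to build $Q$.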

In the last 50 years or so, several techniques have been developed for generating prime numbers involving sieving and systems of Diophantine equations. However, the formulas are very complicated and not practical for use [9].

1.3 The Prime Number Theorem

As was shown in 1.2, finding a prime number formula, or even a formula that outputs infinitely many primes, is a very difficult problem. Mathematicians therefore aimed for a more realistic goal: approximating the number of primes up to any given $N$. The search for primes has gone on for thousands of years, and extensive tables have been constructed. With centuries of tables of primes as empirical data, mathematicians had a lot of evidence to work with. The famous mathematician Carl Friedrich Gauss once told a friend that when he had 15 minutes to spare, he would count the number of primes in a chiliad (a range of 1,000 numbers). By the end of his mathematical career, Gauss had computed all primes up to 3,000,000 [7]. In 1792, at the age of fifteen, Gauss made his now famous prime number conjecture, which stated that if $\pi(N)$ represents the number of primes less than or equal to $N$, then $\pi(N) \sim \frac{N}{\log N}$. By $\sim$ we mean “asymptotic”. That is, the ratio of $\pi(N)$ to $\frac{N}{\log N}$ approaches 1 as $N$ grows without bound. Gauss did not prove this, so a conjecture it remained. The great mathematician Pafnuty Chebyshev (1821-1894) proved a result now known in number theory simply as “Chebyshev’s theorem”. He showed that there exist constants $a$ and $A$, $0 < a < 1 < A$, such that if $N$ is sufficiently large, we have

$$a \frac{N}{\log N} < \pi(N) < A \frac{N}{\log N}.$$

This theorem was an important step in the direction of proving the prime number conjecture. We will review “Chebyshev’s theorem” in section 3 of this paper, which will also provide a direct proof that the series of prime reciprocals is divergent. Chebyshev’s theorem was still a long way from proving the prime number theorem, which was finally proven independently by Hadamard and de la Vallée Poussin in 1896 [7]. In making his conjecture, Gauss actually used a different and more accurate estimation for $\pi(N)$ known as the logarithmic integral of $N$, or $\mathrm{Li}(N)$. From his table of primes, Gauss noticed that given a number $N$, the probability that a number sufficiently close to $N$ is prime is approximately $\frac{1}{\log N}$. Since the count $\pi(N)$ increases by exactly one when $N$ is prime, Gauss obtained the estimation

$$\pi(N) \approx \mathrm{Li}(N) = \int_2^N \frac{1}{\log x}\,dx = \frac{N}{\log N} - \frac{2}{\log 2} + \int_2^N \frac{1}{(\log x)^2}\,dx,$$

where the second equality follows from integration by parts. This integral is much more accurate than $\frac{N}{\log N}$. However, as the prime number theorem states, they are both asymptotic to $\pi(N)$. The formula $\frac{N}{\log N}$ is easier to use, and when dealing with large enough numbers is usually sufficient for the estimation of $\pi(N)$.

Fig. 1.3.1: $\left\{\frac{N+1}{\log(N+1)}\right\}$ (bottom), $\{\pi(N+1)\}$ (middle), $\{\mathrm{Li}(N+1)\}$ (top) for $1 \le N \le 1{,}000$.

Fig. 1.3.2: $\left\{\frac{N+1}{\log(N+1)}\right\}$ (bottom), $\{\pi(N+1)\}$ (middle), $\{\mathrm{Li}(N+1)\}$ (top) for $1 \le N \le 5{,}000$.

It is interesting to notice that $\mathrm{Li}(N)$ is a much better estimate, and appears to slightly overestimate $\pi(N)$. Empirically, this holds for extremely large numbers, and it was at one time widely conjectured that $\mathrm{Li}(N)$ always overestimates $\pi(N)$. However, John Littlewood (1885-1977) proved that $\mathrm{Li}(N) - \pi(N)$ actually changes sign infinitely many times. The best modern estimates for the first such $N$ are of the order $1.397 \cdot 10^{316}$ [1].
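The comparison shown in the two figures can be reproduced numerically. The sketch below is an approximation under stated assumptions: `prime_count` counts primes with a simple sieve, and `log_integral` approximates $\int_2^N dx/\log x$ with the midpoint rule, a numerical method chosen here for brevity rather than anything used in the paper.

```python
import math


def prime_count(n):
    """Return pi(n), the number of primes not exceeding n."""
    if n < 2:
        return 0
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            for m in range(p * p, n + 1, p):
                flags[m] = False
    return sum(flags)


def log_integral(n, steps=100_000):
    """Midpoint-rule approximation of the integral of 1/log(x) from 2 to n."""
    h = (n - 2) / steps
    return h * sum(1.0 / math.log(2 + (i + 0.5) * h) for i in range(steps))


for N in (1000, 5000):
    print(N, N / math.log(N), prime_count(N), log_integral(N))
```

For both values of $N$ the three numbers appear in the same order as the three curves in the figures, with $N/\log N$ lowest and $\mathrm{Li}(N)$ highest.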


1.4 Euler’s Product

After the fall of Ancient Greece, history saw little advancement in the theory of prime numbers until the 17th century, with significant contributions from Fermat and Mersenne, and later when the Swiss mathematician Leonhard Euler (1707-1783) published Variae observationes circa series infinitas in 1744 [16]. Euler introduced the famous product that now bears his name,

$$\sum_{n=1}^{\infty} \frac{1}{n} = \prod_p \left(1 - \frac{1}{p}\right)^{-1}. \tag{1.4.1}$$

Throughout this paper the index $p$ runs over all the primes. Fortunately, there were a few important results from medieval mathematics, and perhaps the most important result is due to Nicole Oresme (1323-1382): the summation on the left hand side of (1.4.1), the “harmonic” series, is divergent. Oresme noticed an inequality when he doubled the index of the partial sums,

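$$\begin{aligned}
S_2 &= 1 + \frac{1}{2} \\
S_4 &= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} > 1 + \frac{1}{2} + \underbrace{\frac{1}{4} + \frac{1}{4}}_{\frac{1}{2}} \\
S_8 &= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} > 1 + \frac{1}{2} + \underbrace{\frac{1}{4} + \frac{1}{4}}_{\frac{1}{2}} + \underbrace{\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}}_{\frac{1}{2}} \\
&\;\;\vdots \\
S_{2^n} &> 1 + \frac{n}{2} \quad (n \ge 2)
\end{aligned}$$

and since $n$ was arbitrary, the series diverges. From this known result, (1.4.1) yielded the first new proof of the infinitude of prime numbers in two millennia. This is due to the simple fact that for the product on the right hand side to diverge, we must necessarily have infinitely many factors (primes) in the product.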
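Oresme’s bound can be confirmed with exact rational arithmetic for the first few powers of two. This is a minimal sketch; the function name `harmonic` is an assumption, and exact fractions are used only to avoid any rounding doubt.

```python
from fractions import Fraction


def harmonic(m):
    """Return the partial sum S_m = 1 + 1/2 + ... + 1/m as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, m + 1))


# Compare S_{2^n} with the lower bound 1 + n/2 for n = 1, ..., 8.
for n in range(1, 9):
    partial_sum = harmonic(2 ** n)
    bound = 1 + Fraction(n, 2)
    print(n, float(partial_sum), float(bound), partial_sum >= bound)
```

The bound is an equality for $n = 1$ and strict from $n = 2$ onward, and since it grows without limit, so do the partial sums.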

Euler’s product (1.4.1) is actually not very difficult to derive, at least formally. Euler started by supposing that the harmonic sum converges to $S$, that is

$$S = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots$$

and dividing by two,

$$\frac{S}{2} = \frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \frac{1}{8} + \frac{1}{10} + \cdots$$

Subtracting this from $S$, therefore,

$$\frac{S}{2} = 1 + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{9} + \cdots \tag{1.4.2}$$

Since all the terms in (1.4.2) have odd denominators, in a similar fashion to the Sieve of Eratosthenes, we can eliminate all denominators that are multiples of 3 by dividing (1.4.2) by 3, to get

$$\frac{S}{2 \cdot 3} = \frac{1}{3} + \frac{1}{9} + \frac{1}{15} + \frac{1}{21} + \cdots \tag{1.4.3}$$

then subtracting (1.4.3) from (1.4.2) to obtain

$$\frac{1 \cdot 2}{2 \cdot 3} S = 1 + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \frac{1}{13} + \cdots$$

Continuing this process indefinitely, Euler obtained

$$\frac{1 \cdot 2 \cdot 4 \cdots (p-1) \cdots}{2 \cdot 3 \cdot 5 \cdots p \cdots} S = 1,$$

and cross multiplying gives

$$S = \frac{2 \cdot 3 \cdot 5 \cdots p \cdots}{1 \cdot 2 \cdot 4 \cdots (p-1) \cdots} = \sum_{n=1}^{\infty} \frac{1}{n} = \prod_p \left(1 - \frac{1}{p}\right)^{-1},$$

which is (1.4.1).

Euler generalized (1.4.1) for any real number $s > 1$ [4]:

$$\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \left(1 - \frac{1}{p^s}\right)^{-1}. \tag{1.4.4}$$

This product opened up many doors into exploring the distribution of the prime numbers. The left hand side is the Riemann zeta-function evaluated at $s$. When Bernhard Riemann (1826-1866) began plugging complex numbers in for $s$ in (1.4.4), a whole new mathematical landscape opened up. Riemann found that information in the complex zeros of the zeta-function could be used to find the best known formula for counting primes up to $N$, and best of all, he found information about the zeros which corresponded to the exact error in his formula for the number of primes up to $N$. He had found it: an exact formula for the number of primes up to any given $N$. To complete this program, one needs to prove what is now known as the “Riemann Hypothesis”, that is, that the real part of any non-trivial zero of the Riemann zeta-function is $\frac{1}{2}$ [17]. The proof of the hypothesis is so challenging that a correct proof will win a one-million-dollar prize offered by the Clay Mathematics Institute. This paper will make no further attempt to describe the very deep topic of the Riemann Hypothesis. We have noted it here because it is at the crux of the distribution of the primes, and it illustrates the importance of Euler’s product, which will come into play in some of the proofs in this paper.
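For $s = 2$, where both sides of (1.4.4) converge, the identity can be checked numerically against the known value $\pi^2/6$. The sketch below truncates both the sum and the product at $10{,}000$, an arbitrary cutoff chosen for illustration; the function names are assumptions.

```python
import math


def primes_up_to(limit):
    """Return all primes not exceeding limit, using a simple sieve."""
    if limit < 2:
        return []
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            for m in range(p * p, limit + 1, p):
                flags[m] = False
    return [n for n in range(2, limit + 1) if flags[n]]


def euler_product(s, limit):
    """Truncated right hand side of (1.4.4): product over primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


def zeta_partial_sum(s, terms):
    """Truncated left hand side of (1.4.4): sum of 1/n^s for n <= terms."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))


print(zeta_partial_sum(2, 10_000), euler_product(2, 10_000), math.pi ** 2 / 6)
```

The three printed values agree to about three decimal places, with the truncated product closer to the limit than the truncated sum.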

1.5 The Divergence of the Sum of Prime Reciprocals

The central topic of this paper is the divergence of

$$\sum_p \frac{1}{p}. \tag{1.5.1}$$

The divergence of this series is related to the distribution of the primes. While we know that the harmonic series is divergent and, from the prime number theorem, that primes become increasingly rare, the divergence of (1.5.1) shows us that the primes do not thin out fast enough for the sum to converge. Moreover, since (1.5.1) diverges, in the sense of contributing to a sum there are more prime numbers than square numbers, since we know from another well known result of Euler [16] that

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$

Euler succeeded in showing the divergence of (1.5.1), and many mathematicians have added more proofs. Each proof is interesting in its own right since each offers a different strategy and insight into the problem. In this paper we will present many proofs of the divergence of (1.5.1). We have selected proofs from Euler, Erdős, Dux, Clarkson, Niven and Chebyshev. Knowing the divergence of (1.5.1) is important. However, a theorem due to the Polish mathematician Franz Mertens (1840-1927) not only shows the divergence of (1.5.1), but explains exactly how (1.5.1) diverges. A section of this paper will be devoted to proving this theorem. There are actually four results that are collectively referred to as Mertens’ theorem: as $N \to \infty$ we have

$$\sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1); \qquad \sum_{p \le N} \frac{\log p}{p} = \log N + O(1), \tag{1.5.2}$$

$$\int_1^N \frac{\psi(t)}{t^2}\,dt = \log N + O(1), \tag{1.5.3}$$

$$\sum_{p \le N} \frac{1}{p} = \log\log N + C + O\left(\frac{1}{\log N}\right), \tag{1.5.4}$$

$$\prod_{p \le N} \left(1 - \frac{1}{p}\right) = \frac{e^{-\gamma}}{\log N}\,(1 + o(1)), \tag{1.5.5}$$

where $\Lambda(N)$ is the “von Mangoldt” function, $\psi(N)$ is one of “Chebyshev’s” functions (section 3), $\gamma$ is the Euler-Mascheroni constant, and $C$ is a constant (sections 4 & 5). A function $f(N)$ is “big O” of $g(N)$, or $O(g(N))$, if and only if there exist constants $K > 0$ and $N_0$ such that $|f(N)| \le K|g(N)|$ for all $N \ge N_0$. A function $f(N)$ is “little o” of $g(N)$, or $o(g(N))$, if and only if for any $K > 0$, there exists $N_0$ such that $\left|\frac{f(N)}{g(N)}\right| < K$ for all $N \ge N_0$. Notice that theorem (1.5.4) shows that for large $N$, the partial sums of the series of prime reciprocals grow like $\log\log N$ plus a constant $C$, where $C \approx 0.26$ [13]. Theorem (1.5.5) follows from (1.5.4) and gives an intuitive probability of a random integer being prime, and will be referred to as the “product form of Mertens’ theorem.” All of the theorems (1.5.2)-(1.5.5) will be proved in sections 4 & 5 of this paper.


Fig. 1.5.1: The stars are the graph of $\{\log\log(N+1) + .26\}$, and the diamonds are the graph of $\sum_{p \le N} \frac{1}{p}$, for $1 \le N \le 20$ (horizontal axis $N$, vertical axis $S$).

Fig. 1.5.2: The stars are the graph of $\{\log\log(N+1) + .26\}$, and the diamonds are the graph of $\sum_{p \le N} \frac{1}{p}$, for $1 \le N \le 100$ (horizontal axis $N$, vertical axis $S$).

Fig. 1.5.1 and Fig. 1.5.2 provide evidence that the two functions approach each other rather quickly, with the functions being very close at only $N = 100$. At $N = 1{,}000$ the graphs are virtually indistinguishable.
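The agreement visible in the figures can be tabulated directly. In the sketch below the constant is taken as $C \approx 0.2615$, a slightly more precise value than the two digits quoted in the text and therefore an assumption of this example; the function names are illustrative.

```python
import math

MERTENS_C = 0.2615  # assumed four-digit value of the constant C in (1.5.4)


def prime_reciprocal_sum(N):
    """Return the sum of 1/p over all primes p <= N."""
    total = 0.0
    for n in range(2, N + 1):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):  # n is prime
            total += 1.0 / n
    return total


def mertens_estimate(N):
    """Return log log N + C, the main term of (1.5.4)."""
    return math.log(math.log(N)) + MERTENS_C


for N in (20, 100, 1000):
    exact = prime_reciprocal_sum(N)
    estimate = mertens_estimate(N)
    print(N, exact, estimate, exact - estimate)
```

The difference shrinks as $N$ grows, consistent with the error term $O(1/\log N)$ in (1.5.4).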


§2: The Divergence of $\sum_p \frac{1}{p}$

2.1 Euler’s Proof

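Euler first proved that $\prod_p \left(1 - \frac{1}{p}\right)^{-1}$ diverges, then deduced from this that $\sum_p \frac{1}{p}$ diverges [1]. This is a direct proof and shows why the series is divergent. Define

$$P(N) = \prod_{p \le N} \left(1 - \frac{1}{p}\right)^{-1}, \qquad S(N) = \sum_{p \le N} \frac{1}{p}, \qquad N \ge 2.$$

Recall the formula for the sum of a finite geometric series

$$1 + x + x^2 + \cdots + x^m = \frac{1 - x^{m+1}}{1 - x} < \frac{1}{1 - x}, \quad \text{for } 0 < x < 1. \tag{2.1.1}$$

Letting $x = \frac{1}{p}$, then $0 < x < 1$, and from (2.1.1) we get

$$1 + \frac{1}{p} + \frac{1}{p^2} + \ldots + \frac{1}{p^m} < \frac{1}{1 - \frac{1}{p}} = \left(1 - \frac{1}{p}\right)^{-1}. \tag{2.1.2}$$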

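Repeatedly applying the multiplication property for inequalities over all primes $p \le N$, we get

$$\prod_{p \le N} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \ldots + \frac{1}{p^m}\right) < P(N). \tag{2.1.3}$$

Now each $n \le N$ has a prime factorization $n = 2^{\alpha_1} \cdot 3^{\alpha_2} \cdots p_k^{\alpha_k}$, where $p_i$ denotes the $i$th prime and $p_k \le N$. Therefore, we must have for each $\alpha_i$, $p_i^{\alpha_i} \le N$, or $\alpha_i \le \frac{\log N}{\log p_i} \le \frac{\log N}{\log 2}$. Hence if we choose $m$ such that $2^m \ge N$, then every $\alpha_i \le m$, and the expansion of the product on the left hand side of (2.1.3) will contain at least all terms of

$$\sum_{n=1}^{N} \frac{1}{n},$$

hence

$$\prod_{p \le N} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^m}\right) \ge \sum_{n=1}^{N} \frac{1}{n}. \tag{2.1.4}$$

Now, the harmonic series on the right hand side of (2.1.4) is divergent, as shown in section 1.4, which shows that $P(N)$ is divergent. But in order to show the divergence of the series of prime reciprocals, we need to use the following from integral calculus,

$$\sum_{n=1}^{N} \frac{1}{n} > \int_1^N \frac{dt}{t} = \log N. \tag{2.1.5}$$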


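From (2.1.3), (2.1.4) and (2.1.5), we obtain $P(N) > \log N$, hence $\prod_p \left(1 - \frac{1}{p}\right)^{-1}$ diverges as $N$ grows without bound.

To prove the divergence of the series, first consider the Maclaurin series expansion $\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^{n-1}\frac{x^n}{n} + \cdots$. Replacing $x$ by $-x$ and changing sign gives

$$\log\left(\frac{1}{1 - x}\right) = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n} + \cdots \tag{2.1.6}$$

which is convergent by the ratio test for all $x$ satisfying $0 < x < 1$. Hence from (2.1.6) we obtain

$$\log\left(\frac{1}{1 - x}\right) - x = \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n} + \cdots < \frac{x^2}{2}\left(1 + x + x^2 + \cdots + x^n + \cdots\right), \tag{2.1.7}$$

and since the geometric series on the right hand side is convergent with sum $\frac{1}{1 - x}$, we have

$$\log\left(\frac{1}{1 - x}\right) - x < \frac{x^2}{2(1 - x)}, \quad 0 < x < 1. \tag{2.1.8}$$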

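Now setting $x = \frac{1}{p}$, so that $\frac{x^2}{2(1 - x)} = \frac{1}{2p(p - 1)}$, and adding the inequalities over all $p \le N$, we obtain

$$\log P(N) - S(N) < \frac{1}{2} \sum_{p \le N} \frac{1}{p(p - 1)} < \frac{1}{2} \sum_{n=2}^{\infty} \frac{1}{n(n - 1)} = \frac{1}{2}. \tag{2.1.9}$$

The equality on the right hand side follows from the convergent telescoping series $\sum_{n=2}^{\infty} \left(\frac{1}{n - 1} - \frac{1}{n}\right) = 1$. Now from $P(N) > \log N$ and rearranging (2.1.9) we get that

$$\log\log N - \frac{1}{2} < \log P(N) - \frac{1}{2} < S(N),$$

so that $\sum_p \frac{1}{p}$ is divergent.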
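The three inequalities that drive Euler’s proof can be spot-checked numerically. This is a hedged sketch: the function names `P` and `S` mirror the definitions of $P(N)$ and $S(N)$ above, and the sample values of $N$ are arbitrary.

```python
import math


def primes_up_to(limit):
    """Return all primes not exceeding limit, using a simple sieve."""
    if limit < 2:
        return []
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            for m in range(p * p, limit + 1, p):
                flags[m] = False
    return [n for n in range(2, limit + 1) if flags[n]]


def P(N):
    """Product of (1 - 1/p)^(-1) over primes p <= N."""
    product = 1.0
    for p in primes_up_to(N):
        product *= 1.0 / (1.0 - 1.0 / p)
    return product


def S(N):
    """Sum of 1/p over primes p <= N."""
    return sum(1.0 / p for p in primes_up_to(N))


for N in (10, 100, 1000, 10_000):
    print(N, P(N), math.log(N), math.log(P(N)) - S(N), S(N))
```

In each row $P(N)$ exceeds $\log N$, the gap $\log P(N) - S(N)$ stays below $\frac{1}{2}$, and $S(N)$ exceeds $\log\log N - \frac{1}{2}$, matching the chain of inequalities at the end of the proof.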


2.2 Erdős’ Proof

Paul Erdős (1913-1996) published the following proof in 1938 [6]. This proof is notable for its lack of series manipulations. It is the first of four proofs by contradiction that I will review. For the next four sections, slight variations of the following definitions will be used. Let $a, b$ be given positive integers and, with $p$ always denoting a prime, let

$$P = \{p : a \le p \le b\}, \quad M = \{n \in \mathbb{Z}^+ : p \mid n \Rightarrow p \in P\}, \quad M_n = \{m \in M : m \le n\}.$$

In words, $M$ is the set of integers generated by $P$ under multiplication, and $M_n$ is a truncation of $M$ up to and including $n$.

Now suppose that $a = 1$ and $\sum \frac{1}{p}$ is convergent. Then there is a large enough $b$ such that

$$\sum_{p > b} \frac{1}{p} < \frac{1}{2}.$$

Let $m \in M_n$. From the prime factorization of $m$ and the parity of the exponents, we can always write $m = k^2 r$ where $r$ is square-free. That is,

$$r = \prod_{p \in S} p, \quad \text{where } S \subseteq P.$$

Given any finite set $A$ of $n$ elements, there are $2^n$ possible subsets. This is easily seen by induction, because adding one more element to the set doubles the number of subsets. Since $P$ is a finite set of primes, there are $2^{|P|} - 1$ nonempty subsets of $P$. Including the possibility that $r = 1$, there are $2^{|P|}$ possible values for $r$. Since $m = k^2 r$, we have $k^2 \le m \le n$, which implies $k \le \sqrt{m} \le \sqrt{n}$, so that there are at most $\sqrt{n}$ possible values of $k$, and thus $|M_n| \le 2^{|P|}\sqrt{n}$.

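It is clear that for any fixed prime $p$, there can be no more than $\frac{n}{p}$ positive integers not exceeding $n$ that are divisible by $p$. An integer $m \le n$ that is not in $M_n$ must be divisible by some prime $p > b$, so the number of such integers satisfies

$$n - |M_n| \le \sum_{b < p \le n} \frac{n}{p} < \frac{n}{2}.$$

So that

$$\frac{n}{2} < |M_n| \le 2^{|P|}\sqrt{n} \quad \text{or} \quad \sqrt{n} < 2^{|P|+1},$$

which is clearly a contradiction for large enough $n$.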
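The two counting facts in this proof, the decomposition $m = k^2 r$ and the bound $|M_n| \le 2^{|P|}\sqrt{n}$, can be illustrated for a small set of primes. The sketch below assumes $P = \{2, 3, 5, 7\}$ and $n = 1000$ purely for illustration, and the helper names are hypothetical.

```python
import math


def square_free_decomposition(m):
    """Return (k, r) with m == k*k*r and r square-free."""
    k, r = 1, 1
    d = 2
    while d * d <= m:
        exponent = 0
        while m % d == 0:
            m //= d
            exponent += 1
        k *= d ** (exponent // 2)  # pairs of the prime d go into the square part
        if exponent % 2:
            r *= d                 # an odd exponent leaves one copy of d in r
        d += 1
    if m > 1:
        r *= m                     # leftover prime factor occurs exactly once
    return k, r


def count_generated(n, primes):
    """Return |M_n|: how many 1 <= m <= n have all prime factors in `primes`."""
    count = 0
    for m in range(1, n + 1):
        rest = m
        for p in primes:
            while rest % p == 0:
                rest //= p
        if rest == 1:
            count += 1
    return count


P = [2, 3, 5, 7]
n = 1000
size = count_generated(n, P)
bound = 2 ** len(P) * math.sqrt(n)
print(square_free_decomposition(72), size, bound)
```

The count stays well below the bound, and because the bound grows only like $\sqrt{n}$, it eventually falls below $\frac{n}{2}$, which is the source of the contradiction.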


2.3 Dux’s Proof

This proof was published in 1956 by Erich Dux [6]. This proof is interesting since it involves rearranging the harmonic series. Borrowing from Erdős’s proof, the sets used in this proof are defined as:

$$P = \{p : 1 < p \le b\}, \quad M = \{n : p \mid n \Rightarrow p \in P\}.$$

Assume that $\sum \frac{1}{p}$ is convergent. Then there must be a $b$ large enough so that $\sum_{p > b} \frac{1}{p} = A < 1$. Dux defines $M' = \{n' > 1 : p \mid n' \Rightarrow p > b\}$ and $M'' = \overline{M} \cap \overline{M'}$ (where $\overline{M}$ is the complement of $M$ in the positive integers); that is, $M''$ is the set of all integers that have both a prime divisor in $P$ and a prime divisor greater than $b$. Since $P$ is a finite set of primes, we have

$$\sum_{M} \frac{1}{n} \le \prod_{P} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots\right) = \prod_{P} \left(1 - \frac{1}{p}\right)^{-1} < \infty,$$

since each term in the sum on the left hand side must appear at least once as a term in the expansion of the product on the right hand side. This is due to the definition of $M$ as the set of all integers divisible only by primes not exceeding $b$. By a similar argument, we find an upper bound for sums over $M'$, that is

$$\sum_{M'} \frac{1}{n'} \le \sum_{p > b} \frac{1}{p} + \left(\sum_{p > b} \frac{1}{p}\right)^2 + \cdots = \frac{A}{1 - A} < \infty,$$

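by the initial assumption of convergence and the formula for the sum of an infinite geometric series. Now, it follows that

$$\sum_{M''} \frac{1}{n''} = \sum_{M} \frac{1}{n} \sum_{M'} \frac{1}{n'} - \sum_{M'} \frac{1}{n'} < \infty.$$

Every $n'' \in M''$ factors uniquely as $n'' = n n'$ with $n \in M$, $n > 1$, and $n' \in M'$; since $1 \in M$, Dux subtracts off the reciprocals over $M'$ to ensure a sum over integers which are in $M''$. Since $\mathbb{N} = M \cup M' \cup M''$ and the sets $M$, $M'$ and $M''$ are pairwise disjoint, we must have

$$\sum_{n=1}^{\infty} \frac{1}{n} = \sum_{M} \frac{1}{n} + \sum_{M'} \frac{1}{n'} + \sum_{M''} \frac{1}{n''} < \infty,$$

which contradicts the divergence of the harmonic series. Therefore our initial assumption was false and $\sum \frac{1}{p}$ diverges.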


2.4 Clarkson’s Proof

James Clarkson published the following proof in 1966 [6]. This proof is similar to the proof by Erdős, but with an interesting twist. Clarkson’s proof employs the trick that Euclid used in his proof that there are infinitely many primes. This proof only requires the set $P$ from the previous two proofs, that is $P = \{p : a \le p \le b\}$. Start by assuming that $\sum_p \frac{1}{p}$ is convergent. Therefore, there must be a large enough $a$ such that

$$\sum_{P} \frac{1}{p} < \frac{1}{2}$$

for all $b$. Now define

$$Q = \prod_{p < a} p.$$

Then for any fixed $r$, there is a large enough $b$ such that all primes which divide $1 + iQ$ for $1 \le i \le r$ are in $P$, since as in Euclid’s proof, $p < a$ implies that $p \nmid 1 + iQ$. Each term of

$$\sum_{i=1}^{r} \frac{1}{1 + iQ}$$

which has a denominator that is a product of $j$ (not necessarily distinct) primes occurs at least once in the expansion of

$$\left(\sum_{P} \frac{1}{p}\right)^j < 2^{-j},$$

where the bound holds by assumption and repeated application of the multiplication property of inequalities. Summing over all $j \ge 1$, we get

$$\sum_{i=1}^{r} \frac{1}{1 + iQ} < \sum_{j \ge 1} 2^{-j} = 1,$$

since the right hand side is the geometric series with first term $2^{-1}$. On the other hand, since $\frac{1}{1 + iQ} > \frac{1}{2Qi}$,

$$\sum_{i=1}^{r} \frac{1}{1 + iQ} > \sum_{i=1}^{r} \frac{1}{2iQ} = \frac{1}{2Q} \sum_{i=1}^{r} \frac{1}{i}.$$

Because $r$ was arbitrary and the right hand side is a fixed fraction of a partial sum of the harmonic series, the left hand side is unbounded, which contradicts the upper bound of 1 for large enough $r$.


2.5 Niven’s Proof

The following is a proof published by Ivan Niven in 1971 [14]. This proof is very short and employs the series expansion for $e^x$. As we saw in Erdős' proof, every positive integer can be expressed as the product of a square-free integer $r$ and a square $k^2$. Let $n$ be any positive integer and $S = \{r < n : r \text{ is square-free}\}$. Then we have

\[ \left(\sum_{S} \frac{1}{r}\right)\left(\sum_{j<n} \frac{1}{j^2}\right) \ge \sum_{q<n} \frac{1}{q}. \]

The inequality follows from the fact that every term on the right hand side will appear at least once as a term in the expansion of the product on the left hand side. Now the second sum is a convergent p-series, and is therefore bounded, but since the right hand side is the unbounded harmonic series as $n \to \infty$, the first sum over square-free integers must be unbounded. Now suppose that $\sum \frac{1}{p}$ converges to $\beta$. From the Maclaurin series expansion $e^x = 1 + x + \frac{x^2}{2} + \cdots$, we obtain $e^x > 1 + x$ for $x > 0$. The following chain of inequalities results:

\[ e^{\beta} > e^{\sum_{p<n} \frac{1}{p}} = \prod_{p<n} e^{\frac{1}{p}} > \prod_{p<n} \left(1 + \frac{1}{p}\right) \ge \sum_{S} \frac{1}{m}, \]

which contradicts the unboundedness of the series over square-free integers, hence $\sum \frac{1}{p}$ diverges.
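The two inequalities used in this argument can be checked for small $n$. The following is a minimal Python sketch with exact rational arithmetic; it is an illustration only, not part of Niven's proof, and the helper names (`primes_below`, `is_squarefree`, `niven_check`) are hypothetical.

```python
from fractions import Fraction

def primes_below(n):
    """Return all primes p < n using a simple sieve."""
    if n < 3:
        return []
    sieve = [True] * n
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]

def is_squarefree(r):
    """True if no square k^2 > 1 divides r."""
    k = 2
    while k * k <= r:
        if r % (k * k) == 0:
            return False
        k += 1
    return True

def niven_check(n):
    """Return (square-free sum, sum of 1/j^2, harmonic sum, prime product) for n."""
    s_sum = sum(Fraction(1, r) for r in range(1, n) if is_squarefree(r))
    sq_sum = sum(Fraction(1, j * j) for j in range(1, n))
    h_sum = sum(Fraction(1, q) for q in range(1, n))
    prod = Fraction(1)
    for p in primes_below(n):
        prod *= 1 + Fraction(1, p)
    return s_sum, sq_sum, h_sum, prod

for n in (10, 50, 200):
    s_sum, sq_sum, h_sum, prod = niven_check(n)
    assert s_sum * sq_sum >= h_sum   # first displayed inequality
    assert prod >= s_sum             # last step of the chain
    print(n, float(s_sum), float(prod))
```

The square-free sum keeps growing with $n$, which is exactly what a finite bound $e^{\beta}$ would forbid.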


§3: Chebyshev’s Theorem

As was stated in the introduction, Chebyshev made an important step toward proving the prime number theorem, although a modest one relative to the full proof. He found asymptotic upper and lower bounds for the prime counting function $\pi(N)$ in terms of $\frac{N}{\log N}$; that is, he proved the following theorem [4].

3.1 Theorem (Chebyshev)

There exist constants $a$ and $A$, $0 < a < 1 < A$, such that if $N$ is sufficiently large, we have

\[ a \frac{N}{\log N} < \pi(N) < A \frac{N}{\log N}. \tag{3.1.1} \]

It also follows from this theorem that there are infinitely many primes, and the sum of reciprocals of primes is divergent. In order to prove (3.1.1), we will require some definitions and lemmas. Chebyshev defined the following useful functions:

\[ \theta(N) = \sum_{p \le N} \log p, \quad N > 0, \ p \text{ a prime}, \]

and

\[ \psi(N) = \sum_{m \ge 1,\ p^m \le N} \log p. \]

Here, if $m$ is the largest integer such that $p^m \le N$, then $\log p$ occurs exactly $m$ times in the sum. For example, $\psi(9) = 3 \log 2 + 2 \log 3 + \log 5 + \log 7$. It can be easily seen that $e^{\theta(N)}$ is the product of all primes less than or equal to $N$, and $e^{\psi(N)}$ is the least common multiple of all positive integers $\le N$.

If we let $m$ be the largest integer such that $2^m \le N$, then no prime $p$ satisfies $p^{m+1} \le N$, since $p^{m+1} \ge 2^{m+1} > N$. Hence every prime power $p^k \le N$ has exponent $k \le m \le \frac{\log N}{\log 2}$. Now, if we let $m - i_n$ be the largest integer such that $p_n^{m-i_n} \le N$, then $p_n \le N^{\frac{1}{m-i_n}}$, where $p_n$ is the $n$th prime for which $p_n \le N$. It follows that in the formula for $\psi(N)$, $\log p_n$ occurs exactly $m - i_n$ times. On the other hand, $\log p_n$ occurs exactly $m - i_n$ times in the sum

\[ \theta(N) + \theta(N^{\frac{1}{2}}) + \theta(N^{\frac{1}{3}}) + \cdots + \theta(N^{\frac{1}{m-i_n}}) + \cdots + \theta(N^{\frac{1}{m}}), \]

hence

\[ \psi(N) = \theta(N) + \theta(N^{\frac{1}{2}}) + \theta(N^{\frac{1}{3}}) + \cdots + \theta(N^{\frac{1}{m}}). \tag{3.1.2} \]

This sum terminates since $\theta(N) = 0$ for $N < 2$, and specifically, $N^{\frac{1}{k}} < 2$ when $k > \frac{\log N}{\log 2}$.

If $p^m \le N < p^{m+1}$, $N \ge 1$, then $m \log p \le \log N < (m+1) \log p$, and $m \le \frac{\log N}{\log p} < m + 1$, hence $\log p$ occurs exactly $m$ times in $\psi(N)$, and $m = \left\lfloor \frac{\log N}{\log p} \right\rfloor$. Thus there is another way to express $\psi(N)$, that is

\[ \psi(N) = \sum_{p \le N} \left\lfloor \frac{\log N}{\log p} \right\rfloor \cdot \log p. \tag{3.1.3} \]

It turns out that Chebyshev's functions are closely related to $\pi(N)$, as seen in the following theorem, which is crucial in the proof of (3.1.1).
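The definitions of $\theta$ and $\psi$ and the identities (3.1.2) and (3.1.3) can be explored numerically. The following is a minimal Python sketch, assuming integer arguments; the function names are illustrative and not from the text. The exponent $\left\lfloor \frac{\log N}{\log p} \right\rfloor$ is computed with integer arithmetic, because a floating-point quotient of logarithms can round to the wrong side of an integer.

```python
import math

def primes_up_to(n):
    """Return all primes p <= n using a simple sieve."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]

def theta(n):
    """theta(n): sum of log p over primes p <= n."""
    return sum(math.log(p) for p in primes_up_to(n))

def largest_exponent(p, n):
    """Largest m >= 0 with p**m <= n, using exact integer arithmetic."""
    m, power = 0, p
    while power <= n:
        m += 1
        power *= p
    return m

def psi(n):
    """psi(n) via (3.1.3): each prime p <= n contributes m * log p."""
    return sum(largest_exponent(p, n) * math.log(p) for p in primes_up_to(n))

def iroot(n, k):
    """Largest integer x with x**k <= n (theta only sees primes <= x)."""
    x = int(round(n ** (1.0 / k)))
    while x ** k > n:
        x -= 1
    while (x + 1) ** k <= n:
        x += 1
    return x

def psi_from_theta(n):
    """psi(n) via (3.1.2): theta(n) + theta(n^(1/2)) + ... + theta(n^(1/m))."""
    m = largest_exponent(2, n)
    return sum(theta(iroot(n, k)) for k in range(1, m + 1))

# psi(9) should match log(lcm(1, ..., 9)) = log(2520), up to rounding.
print(psi(9), math.log(2520), psi_from_theta(9))
```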

3.2 The Chebyshev Function Theorem

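Let

\[ l_1 = \liminf_{N \to \infty} \frac{\pi(N)}{N / \log N}, \qquad L_1 = \limsup_{N \to \infty} \frac{\pi(N)}{N / \log N}, \]

\[ l_2 = \liminf_{N \to \infty} \frac{\theta(N)}{N}, \qquad L_2 = \limsup_{N \to \infty} \frac{\theta(N)}{N}, \]

\[ l_3 = \liminf_{N \to \infty} \frac{\psi(N)}{N}, \qquad L_3 = \limsup_{N \to \infty} \frac{\psi(N)}{N}. \]

Then $l_1 = l_2 = l_3$, and $L_1 = L_2 = L_3$.

Proof. From (3.1.2), we get that $\theta(N) \le \psi(N)$, and from (3.1.3) that

\[ \psi(N) \le \sum_{p \le N} \frac{\log N}{\log p} \cdot \log p = \log N \sum_{p \le N} 1, \]

that is $\psi(N) \le \pi(N) \log N$, therefore $\theta(N) \le \psi(N) \le \pi(N) \log N$. Dividing through by $N$ and taking upper limits, we obtain

\[ L_2 \le L_3 \le L_1. \tag{3.2.1} \]

Next, choose a constant real number $\alpha$ such that $0 < \alpha < 1$. Let $N > 1$, then

\[ \theta(N) \ge \sum_{N^{\alpha} < p \le N} \log p, \]

and since $\log p > \log N^{\alpha} = \alpha \log N$ for each such $p$, we have

\[ \theta(N) \ge \alpha \log N \sum_{N^{\alpha} < p \le N} 1, \]

which implies that $\theta(N) \ge \alpha \log N \, (\pi(N) - \pi(N^{\alpha}))$. Since there can be no more primes than there are integers, we have $\pi(N^{\alpha}) < N^{\alpha}$, so that $\theta(N) > \alpha \pi(N) \log N - \alpha N^{\alpha} \log N$. Dividing through by $N$, we get

\[ \frac{\theta(N)}{N} > \alpha \pi(N) \frac{\log N}{N} - \frac{\alpha \log N}{N^{1-\alpha}}. \]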


Now since $0 < \alpha < 1$, it follows that $\frac{\log N}{N^{1-\alpha}} \to 0$ as $N \to \infty$. So it follows that $L_2 \ge \alpha L_1$ for every real number $\alpha$ such that $0 < \alpha < 1$. Therefore $L_2 \ge L_1$, and combining this with (3.2.1) we can conclude that $L_1 = L_2 = L_3$.

The proof that $l_1 = l_2 = l_3$ is similar. From the same inequality by which we obtained (3.2.1), we must have $l_2 \le l_3 \le l_1$. By the same steps as above, we obtain $l_2 \ge \alpha l_1$, therefore $l_2 \ge l_1$ and thus $l_1 = l_2 = l_3$.

This theorem states that if any one of the three functions

\[ \frac{\pi(N)}{N / \log N}, \qquad \frac{\theta(N)}{N}, \qquad \frac{\psi(N)}{N} \]

has a limit as $N$ tends to infinity, then so do the others, and all the limits are the same. In other words, the three functions are asymptotic to each other. As was stated in the introduction, the existence of the limit of any one of these functions is the real difficulty in proving the prime number theorem.
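To see how the three ratios compare in practice, they can be tabulated for a few values of $N$. The sketch below is a rough numerical illustration under the assumption that $N$ is an integer with $N \ge 2$; `chebyshev_ratios` is a hypothetical helper name. It also reflects the chain $\theta(N) \le \psi(N) \le \pi(N) \log N$ used in the proof.

```python
import math

def chebyshev_ratios(n):
    """Return (pi(n) log n / n, theta(n) / n, psi(n) / n) for an integer n >= 2."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Strike out the multiples of i starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    count, theta, psi = 0, 0.0, 0.0
    for p in range(2, n + 1):
        if sieve[p]:
            log_p = math.log(p)
            count += 1
            theta += log_p
            power = p
            while power <= n:   # one copy of log p for each power p^m <= n
                psi += log_p
                power *= p
    return count * math.log(n) / n, theta / n, psi / n

for n in (10**2, 10**3, 10**4, 10**5):
    pi_ratio, theta_ratio, psi_ratio = chebyshev_ratios(n)
    print(n, round(pi_ratio, 4), round(theta_ratio, 4), round(psi_ratio, 4))
```

In such a table the $\theta$ and $\psi$ ratios sit below the $\pi$ ratio, and all three drift slowly toward a common value, consistent with the theorem.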

3.3 Proof of Chebyshev’s Theorem

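We are now ready to prove (3.1.1). If we let

\[ l = \liminf_{N \to \infty} \frac{\pi(N)}{N / \log N}, \quad \text{and} \quad L = \limsup_{N \to \infty} \frac{\pi(N)}{N / \log N}, \]

then we will prove the theorem by showing $L \le 4 \log 2$ and $l \ge \log 2$. As we saw in the proof of the previous theorem, these inequalities can be exchanged for the following:

\[ L = \limsup_{N \to \infty} \frac{\theta(N)}{N} \le 4 \log 2, \tag{3.3.1} \]

\[ l = \liminf_{N \to \infty} \frac{\psi(N)}{N} \ge \log 2. \tag{3.3.2} \]

We will start by proving (3.3.1). We consider the binomial coefficient

\[ M = \binom{2m}{m} = \frac{(2m)!}{(m!)^2} = \frac{(m+1)(m+2) \cdots (2m)}{1 \cdot 2 \cdots m}. \]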

In order to understand the properties of $M$, it is useful to find the prime factorization of $M$. We can find this factorization by the formula for the prime factorization of $N!$ for any positive integer $N > 1$. To derive this formula, we first consider how many times a given prime $p$ divides $N!$. The formula for this number is easily motivated by considering a special case: take $N = 5$, then $5! = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 2^3 \cdot 3 \cdot 5 = 120$. Note that the exponent of 2 is determined by the number of multiples of 2 among the factors, that is $\left\lfloor \frac{5}{2} \right\rfloor = 2$, added to the number of multiples of its square, 4, among the factors, $\left\lfloor \frac{5}{4} \right\rfloor = 1$. It can be shown by induction on $N$ that in general, the number of times that a prime $p$ divides $N!$ is

\[ \left\lfloor \frac{N}{p} \right\rfloor + \left\lfloor \frac{N}{p^2} \right\rfloor + \cdots + \left\lfloor \frac{N}{p^s} \right\rfloor = \sum_{k \ge 1} \left\lfloor \frac{N}{p^k} \right\rfloor \tag{3.3.3} \]

where $s$ is the largest integer for which $p^s \le N$. This series terminates since once $p^k > N$, $\left\lfloor \frac{N}{p^k} \right\rfloor = 0$. Therefore, the prime factorization of $N!$ is

\[ N! = \prod_{p \le N} p^{\left\lfloor \frac{N}{p} \right\rfloor + \left\lfloor \frac{N}{p^2} \right\rfloor + \cdots + \left\lfloor \frac{N}{p^s} \right\rfloor}. \tag{3.3.4} \]
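Formulas (3.3.3) and (3.3.4) translate directly into a short computation. The following Python sketch is an illustration only; the function names `legendre_exponent` and `factorial_factorization` are hypothetical labels, and primes are found by naive trial division for brevity.

```python
import math

def legendre_exponent(n, p):
    """Exponent of the prime p in n!, via (3.3.3): sum of floor(n / p^k)."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total

def factorial_factorization(n):
    """Return {p: exponent} for n!, as in (3.3.4)."""
    primes = [q for q in range(2, n + 1)
              if all(q % d for d in range(2, math.isqrt(q) + 1))]
    return {p: legendre_exponent(n, p) for p in primes}

# The special case from the text: 5! = 2^3 * 3 * 5.
print(factorial_factorization(5))   # {2: 3, 3: 1, 5: 1}
```

Multiplying the prime powers back together and comparing with `math.factorial(n)` gives a quick consistency check for any small `n`.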

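We can use (3.3.4) to write

\[ M = \binom{2m}{m} = \frac{(2m)!}{(m!)^2} = \prod_{p \le 2m} p^{\left( \left\lfloor \frac{2m}{p} \right\rfloor - 2 \left\lfloor \frac{m}{p} \right\rfloor \right) + \left( \left\lfloor \frac{2m}{p^2} \right\rfloor - 2 \left\lfloor \frac{m}{p^2} \right\rfloor \right) + \cdots + \left( \left\lfloor \frac{2m}{p^s} \right\rfloor - 2 \left\lfloor \frac{m}{p^s} \right\rfloor \right)} \tag{3.3.5} \]

where $s$ is the largest integer for which $p^s \le 2m$. Let $r \le s$ be the largest integer for which $p^r \le m$, so that

\[ 2 \left\lfloor \frac{m}{p^{r+1}} \right\rfloor + \cdots + 2 \left\lfloor \frac{m}{p^s} \right\rfloor = 0. \]

In the prime factorization of $M$ above, the expression $\left\lfloor \frac{2m}{p^k} \right\rfloor - 2 \left\lfloor \frac{m}{p^k} \right\rfloor$, $1 \le k \le s$, occurs many times as a term in the exponent. We can determine what the values of this expression may be. We drop the floor notation and write $\left\lfloor \frac{m}{p^k} \right\rfloor = \frac{m}{p^k} - \varepsilon$ for some $0 \le \varepsilon < 1$, and $\left\lfloor \frac{2m}{p^k} \right\rfloor = \frac{2m}{p^k} - \delta$ for some $0 \le \delta < 1$. Then $2 \left\lfloor \frac{m}{p^k} \right\rfloor = \frac{2m}{p^k} - 2\varepsilon$ and we get $\left\lfloor \frac{2m}{p^k} \right\rfloor - 2 \left\lfloor \frac{m}{p^k} \right\rfloor = 2\varepsilon - \delta$, thus

\[ -1 < \left\lfloor \frac{2m}{p^k} \right\rfloor - 2 \left\lfloor \frac{m}{p^k} \right\rfloor < 2, \quad \text{hence} \quad \left\lfloor \frac{2m}{p^k} \right\rfloor - 2 \left\lfloor \frac{m}{p^k} \right\rfloor = 0 \text{ or } 1, \tag{3.3.6} \]

since the expression is an integer.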

From (3.3.5), $M$ is always an even integer, because if $r$ is the largest integer such that $2^r \le m$, then $m < 2^{r+1} \le 2m$, and $2m < 2^{r+2} \le 4m$, thus $s = r + 1$ and

\[ \left\lfloor \frac{2m}{2^{r+1}} \right\rfloor - 2 \left\lfloor \frac{m}{2^{r+1}} \right\rfloor = 1. \]

Therefore 2 is always a factor of $M$, and hence $M$ is always even. Since $2m$ is even, $M$ is the unique middle term in the binomial expansion

\[ (1+1)^{2m} = 1 + 2m + \binom{2m}{2} + \cdots + \binom{2m}{m} + \cdots + \binom{2m}{2m-2} + 2m + 1. \]

$M$ is also the largest term in the expansion. Since $(1+1)^{2m} = 2^{2m}$, combining this with the fact that there are $2m + 1$ terms in the expansion of $(1+1)^{2m}$, we obtain

\[ M < 2^{2m}, \quad \text{and} \quad 2^{2m} < (2m+1) M. \tag{3.3.7} \]


From the product (3.3.5) we see that since the denominator of $M$ is $(m!)^2$, the denominator cannot be divisible by any prime $p$ with $m < p \le 2m$, while every such prime divides the numerator $(2m)!$. It is therefore clear from (3.3.5) that $M \ge \prod_{m < p \le 2m} p$, thus

\[ \log M \ge \sum_{m < p \le 2m} \log p = \theta(2m) - \theta(m). \]

Now from (3.3.7) we obtain $\log M < 2m \log 2$, hence

\[ \theta(2m) - \theta(m) < 2m \log 2. \tag{3.3.8} \]

If we set $m = 1, 2, 2^2, \ldots, 2^{k-1}$ in (3.3.8), we obtain a list of inequalities

\[ \begin{aligned} \theta(2) - \theta(1) &< 2 \log 2 \\ \theta(4) - \theta(2) &< 4 \log 2 \\ \theta(8) - \theta(4) &< 8 \log 2 \\ &\ \ \vdots \\ \theta(2^k) - \theta(2^{k-1}) &< 2^k \log 2. \end{aligned} \]

Adding these inequalities together, and using

\[ \sum_{r=1}^{k} 2^r = 2^{k+1} - 2 < 2^{k+1}, \]

we get

\[ \theta(2^k) - \theta(1) < \log 2 \sum_{r=1}^{k} 2^r < 2^{k+1} \log 2. \]

Now, since $\theta(1) = 0$, this can be rewritten as

\[ \theta(2^k) < 2^{k+1} \log 2. \tag{3.3.9} \]

Now let $N > 1$, and let $k$ be the positive integer such that $2^{k-1} \le N < 2^k$. The function $\theta$ is never decreasing, so (3.3.9) gives us

\[ \theta(N) \le \theta(2^k) < 2^{k+1} \log 2 \le 4N \log 2. \]

Hence $\frac{\theta(N)}{N} < 4 \log 2$, which implies that

\[ L = \limsup_{N \to \infty} \frac{\theta(N)}{N} \le 4 \log 2, \]

thus by Theorem 3.2, $L = \limsup_{N \to \infty} \frac{\pi(N) \log N}{N} \le 4 \log 2$, which proves (3.3.1).


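We now begin to prove (3.3.2), the second part of Chebyshev's theorem. To prove (3.3.2) we need to consider the binomial coefficient from (3.3.5) and its prime factorization, that is

\[ M = \binom{2m}{m} = \frac{(2m)!}{(m!)^2} = \prod_{p \le 2m} p^{\left( \left\lfloor \frac{2m}{p} \right\rfloor - 2 \left\lfloor \frac{m}{p} \right\rfloor \right) + \left( \left\lfloor \frac{2m}{p^2} \right\rfloor - 2 \left\lfloor \frac{m}{p^2} \right\rfloor \right) + \cdots + \left( \left\lfloor \frac{2m}{p^s} \right\rfloor - 2 \left\lfloor \frac{m}{p^s} \right\rfloor \right)}. \]

This product shows that $M$ is divisible by $p$ exactly $v_p$ times, where $v_p$ can be written

\[ v_p = \sum_{k \ge 1} \left( \left\lfloor \frac{2m}{p^k} \right\rfloor - 2 \left\lfloor \frac{m}{p^k} \right\rfloor \right). \]

Therefore

\[ M = \prod_{p \le 2m} p^{v_p}. \]

Now since $\left\lfloor \frac{2m}{p^k} \right\rfloor = \left\lfloor \frac{m}{p^k} \right\rfloor = 0$ when $p^k > 2m$, that is when $k > \left\lfloor \frac{\log 2m}{\log p} \right\rfloor$, we have

\[ v_p = \sum_{k=1}^{M_p} \left( \left\lfloor \frac{2m}{p^k} \right\rfloor - 2 \left\lfloor \frac{m}{p^k} \right\rfloor \right), \quad M_p = \left\lfloor \frac{\log 2m}{\log p} \right\rfloor. \tag{3.3.10} \]

Combining this result with (3.3.6) we get that $v_p \le M_p$, hence

\[ M = \prod_{p \le 2m} p^{v_p} \le \prod_{p \le 2m} p^{M_p}. \tag{3.3.11} \]

Now we also have from (3.1.3) and (3.3.10) that

\[ \psi(2m) = \sum_{p \le 2m} \left\lfloor \frac{\log 2m}{\log p} \right\rfloor \cdot \log p = \sum_{p \le 2m} M_p \log p, \]

so it follows that

\[ e^{\psi(2m)} = \prod_{p \le 2m} p^{M_p}, \]

and by (3.3.11), $\log M \le \psi(2m)$. It follows from (3.3.7) that $\log M > 2m \log 2 - \log(2m+1)$, therefore for every positive integer $m$, we have

\[ \psi(2m) > 2m \log 2 - \log(2m+1). \tag{3.3.12} \]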
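The inequalities (3.3.7), (3.3.8), (3.3.11) and (3.3.12) can be checked with exact integer arithmetic, using the earlier observations that $e^{\theta(N)}$ is a product of primes and $e^{\psi(N)}$ is a least common multiple. The sketch below is a sanity check for small $m$, not a proof; `check_central_binomial` is a hypothetical name, and `math.comb` assumes Python 3.8 or later.

```python
import math

def primes_up_to(n):
    """Return all primes p <= n using a simple sieve."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]

def check_central_binomial(m):
    """Return a tuple of booleans, one per inequality, for M = C(2m, m)."""
    M = math.comb(2 * m, m)
    prime_product = 1            # equals exp(theta(2m) - theta(m))
    for p in primes_up_to(2 * m):
        if p > m:
            prime_product *= p
    lcm = 1                      # equals exp(psi(2m))
    for k in range(1, 2 * m + 1):
        lcm = lcm * k // math.gcd(lcm, k)
    return (
        M < 4 ** m < (2 * m + 1) * M,   # (3.3.7)
        prime_product <= M,             # M >= product of primes in (m, 2m]
        prime_product < 4 ** m,         # (3.3.8), exponentiated
        M <= lcm,                       # (3.3.11): log M <= psi(2m)
        (2 * m + 1) * lcm > 4 ** m,     # (3.3.12), exponentiated
    )

assert all(all(check_central_binomial(m)) for m in range(1, 100))
print("all inequalities hold for 1 <= m < 100")
```

Working with integers avoids any floating-point doubt about strict inequalities between logarithms.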

To finish the proof, let $N$ be a positive integer, $N > 2$, and let $m = \left\lfloor \frac{N}{2} \right\rfloor \ge 1$; then $m = \frac{N}{2} - \varepsilon$, for $0 \le \varepsilon < 1$. Thus $m > \frac{N}{2} - 1$, and $2m + 2\varepsilon = N$, hence $2m \le N$. From (3.3.12), we get

\[ \psi(N) \ge \psi(2m) > (N - 2) \log 2 - \log(N + 1), \]

and dividing through by $N$,

\[ \frac{\psi(N)}{N} > \frac{N - 2}{N} \log 2 - \frac{\log(N+1)}{N}, \]

hence

\[ l = \liminf_{N \to \infty} \frac{\psi(N)}{N} \ge \log 2, \]

which, combined with Theorem 3.2, proves Chebyshev's theorem.

It follows from Chebyshev's theorem that $\pi(N)$ diverges, and the theorem provides another proof that $\sum_p \frac{1}{p}$, taken over all primes $p$, diverges. The divergence of $\pi(N)$ follows since there exists a constant $0 < a < 1$ such that for sufficiently large $N$, $\pi(N) > a \frac{N}{\log N}$. For the divergence of the prime reciprocals, let $p_n$ be the $n$th prime, so that $\pi(p_n) = n$; from Chebyshev's theorem, we have

\[ \pi(N) > a \frac{N}{\log N} \]

for sufficiently large $N$. Temporarily assuming $n$ to be a real variable, and differentiating $f(n) = \log n - \sqrt{n}$, we get $f'(n) = \frac{2 - \sqrt{n}}{2n} < 0$ for $n > 4$, and $f(4) = 2 \log 2 - 2 < 0$. Therefore $\log n < \sqrt{n}$, or $\sqrt{n} \log n < n$, and hence $\frac{n}{\log n} > \sqrt{n}$ for all $n \ge 4$. It follows that

\[ n = \pi(p_n) > a \frac{p_n}{\log p_n} > \sqrt{p_n} \]

if $n$ is sufficiently large (the second inequality holds for large $n$ because $\frac{\log x}{\sqrt{x}} \to 0$ as $x \to \infty$). Therefore $\log n > \frac{1}{2} \log p_n$, hence $\log p_n < 2 \log n$, and it follows that

\[ a p_n < n \log p_n < 2n \log n, \]

or

\[ \frac{1}{p_n} > \frac{a}{n \log p_n} > \frac{a}{2n \log n} \]

for large enough $n$. Thus the series

\[ \sum_{n=1}^{\infty} \frac{1}{p_n} \]

diverges, by comparison with the divergent series

\[ \sum_{n=2}^{\infty} \frac{1}{n \log n}. \]

3.4 Another Small Step Towards the Prime Number Theorem

In section 3.3 we showed that for large $N$, $\pi(N)$ does not stray too far from $\frac{N}{\log N}$ in an asymptotic sense, that is for sufficiently large $N$,

\[ a < \frac{\pi(N)}{N / \log N} < A. \]

In this section we will do even better: we will show that if $\frac{\pi(N)}{N / \log N}$ tends to a limit as $N \to \infty$, then this limit is 1 [10], from which the prime number theorem would follow. This result will be readily shown by applying the asymptotic formulas from 3.2 and the following result, which will be proven in 4.3, that is

\[ \int_1^N \frac{\psi(t)}{t^2} \, dt = \log N + O(1). \tag{3.4.1} \]

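From (3.4.1) we will deduce that

\[ \liminf_{N \to \infty} \frac{\psi(N)}{N} \le 1, \qquad \limsup_{N \to \infty} \frac{\psi(N)}{N} \ge 1. \]

This proof is by contradiction. If we assume that $\liminf_{N \to \infty} \frac{\psi(N)}{N} = 1 + \alpha$ for some $\alpha > 0$, then we have $\psi(N) > \left(1 + \frac{\alpha}{2}\right) N$ for all $N$ greater than some $N_0$. Therefore

\[ \int_1^N \frac{\psi(t)}{t^2} \, dt > \int_1^{N_0} \frac{\psi(t)}{t^2} \, dt + \int_{N_0}^N \frac{1 + \frac{\alpha}{2}}{t} \, dt = \left(1 + \frac{\alpha}{2}\right) \log N + O(1), \]

which is a contradiction to (3.4.1), so that $\liminf_{N \to \infty} \frac{\psi(N)}{N} \le 1$.

Now suppose that $\limsup_{N \to \infty} \frac{\psi(N)}{N} = 1 - \alpha$, $\alpha > 0$. Then $\psi(N) < \left(1 - \frac{\alpha}{2}\right) N$ for all $N$ greater than some $N_0$. Therefore

\[ \int_1^N \frac{\psi(t)}{t^2} \, dt < \int_1^{N_0} \frac{\psi(t)}{t^2} \, dt + \int_{N_0}^N \frac{1 - \frac{\alpha}{2}}{t} \, dt = \left(1 - \frac{\alpha}{2}\right) \log N + O(1), \]

a contradiction to (3.4.1), thus $\limsup_{N \to \infty} \frac{\psi(N)}{N} \ge 1$. Therefore, if $\lim_{N \to \infty} \frac{\psi(N)}{N}$ exists then it is equal to 1. Now applying Theorem 3.2 we see that if the limit exists, then

\[ \lim_{N \to \infty} \frac{\pi(N)}{N / \log N} = 1. \]

Therefore, all that remains in proving the prime number theorem, and showing that $\pi(N) \sim \frac{N}{\log N}$, is the existence of this limit. As stated in the introduction, this is quite difficult, and was not proven until independent proofs were given by Hadamard and de la Vallée Poussin in 1896.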


§4: Mertens’ Theorem

4.1 Introduction

Franz Mertens (1840-1927) was a Polish mathematician. Mertens studied under Kronecker and Kummer at the University of Berlin, where he received his doctorate in 1865. In 1874, Mertens proved three famous results on the distribution of primes now collectively referred to as "Mertens' Theorem". Before we state this theorem we will need a few definitions. Hans von Mangoldt (1854-1925) was a German mathematician who made contributions to the proof of the prime number theorem. He defined the following function, now known as the "von Mangoldt" function,

\[ \Lambda(n) = \begin{cases} \log p, & \text{if } n = p^m,\ m \text{ a positive integer} \\ 0, & \text{otherwise.} \end{cases} \]

Notice how this function is related to the Chebyshev function,

\[ \psi(N) = \sum_{m \ge 1,\ p^m \le N} \log p = \sum_{n \le N} \Lambda(n). \]

With these definitions, we can now state [4]:

Mertens' Theorem

As $N \to \infty$ we have

\[ \sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1); \qquad \sum_{p \le N} \frac{\log p}{p} = \log N + O(1), \tag{4.1.1} \]

\[ \int_1^N \frac{\psi(t)}{t^2} \, dt = \log N + O(1), \tag{4.1.2} \]

\[ \sum_{p \le N} \frac{1}{p} = \log \log N + C + O\left(\frac{1}{\log N}\right), \tag{4.1.3} \]

where $C$ is a constant. The proof of this theorem will require several lemmas, which will be the content of the next section.
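Before turning to the proof, the second formula in (4.1.1) and formula (4.1.3) can be illustrated numerically. The Python sketch below is an illustration under the assumption of an integer $N \ge 2$; `mertens_sums` is a hypothetical helper name. It prints the differences $\sum_{p \le N} \frac{\log p}{p} - \log N$ and $\sum_{p \le N} \frac{1}{p} - \log \log N$, which the theorem says remain bounded, the second one settling toward the constant $C$ (numerically about 0.2615).

```python
import math

def mertens_sums(n):
    """Return (sum of log p / p, sum of 1/p) over primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    log_sum = recip_sum = 0.0
    for p in range(2, n + 1):
        if sieve[p]:
            log_sum += math.log(p) / p
            recip_sum += 1.0 / p
    return log_sum, recip_sum

for n in (10**3, 10**4, 10**5):
    log_sum, recip_sum = mertens_sums(n)
    print(n, log_sum - math.log(n), recip_sum - math.log(math.log(n)))
```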

4.2 Preliminary Lemmas

We will begin with a weak form of Stirling's formula, that is, as $m \to \infty$

\[ \log m! = m \log m + O(m). \tag{4.2.1} \]

The strong version of Stirling's formula provides a better approximation to $m!$, but this is more than is needed in the proof of Mertens' theorem. For a complete proof of Stirling's formula see [20].

Proof: To prove this result, we will show that $m \log m - \log m! < m$ for all $m \ge 2$. We do not need absolute values because this difference is always positive for $m \ge 2$, since

\[ \underbrace{\log m + \log m + \cdots + \log m}_{m \text{ terms}} > \log m + \log(m-1) + \cdots + \log 2 + \log 1. \]

Since $\log t$ is always increasing, $\int_{m-1}^{m} \log t \, dt < \log m$, hence $\int_1^m \log t \, dt < \log 2 + \cdots + \log m = \log m!$. Using integration by parts, one finds the antiderivative of $\log t$ to be $t \log t - t$. Using the fundamental theorem of calculus, we get $\int_1^m \log t \, dt = m \log m - m + 1$, so that $m \log m - \log m! < m - 1 < m$ for all $m \ge 2$. Therefore, $\log m! = m \log m + O(m)$, which is (4.2.1).
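The bound $0 < m \log m - \log m! < m - 1$ obtained in this proof is easy to check numerically. The short Python sketch below is illustrative only; it relies on the standard identity $\log m! = \log \Gamma(m+1)$, available as `math.lgamma`, and `stirling_gap` is a hypothetical name.

```python
import math

def stirling_gap(m):
    """Return m*log(m) - log(m!), using lgamma(m + 1) = log(m!)."""
    return m * math.log(m) - math.lgamma(m + 1)

for m in (2, 10, 100, 10**4, 10**6):
    gap = stirling_gap(m)
    assert 0 < gap < m - 1    # the bound shown in the proof
    print(m, gap, gap / m)    # gap / m stays below 1, as O(m) requires
```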

The next result that will be needed is, as $m \to \infty$

\[ \psi(m) = O(m). \tag{4.2.2} \]

This result follows easily from the theorems proved in 3.2 and 3.3. By Chebyshev's theorem, for large enough $m$, $a \le \frac{\pi(m) \log m}{m} \le A$, thus by Theorem 3.2 $\frac{\psi(m)}{m} \le A$, thus $\psi(m) \le Am$, and hence $\psi(m)$ is $O(m)$.

The final result that will be required in the proof of Mertens' theorem is the formula for "partial summation", or "Abel summation". This formula is useful to number theorists as a systematic approach to the computation of finite sums of number theoretic functions. Let $0 \le \lambda_1 \le \lambda_2 \le \cdots$ be any divergent sequence of real numbers, and let $a_n$ be a sequence of complex numbers. Let

\[ A(N) := \sum_{\lambda_n \le N} a_n, \]

and let $\varphi(N)$ be a complex-valued function defined for $N \ge 0$. Then if $\varphi(N)$ has a continuous derivative in $(0, \infty)$, and $N \ge \lambda_1$, then

\[ \sum_{\lambda_n \le N} a_n \varphi(\lambda_n) = A(N) \varphi(N) - \int_{\lambda_1}^{N} A(t) \varphi'(t) \, dt. \tag{4.2.3} \]

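Proof: Let us define $A(\lambda_0) = 0$, then

\[ \begin{aligned} A(\lambda_1) - A(\lambda_0) &= a_1 \\ A(\lambda_2) - A(\lambda_1) &= a_2 \\ &\ \ \vdots \\ A(\lambda_n) - A(\lambda_{n-1}) &= a_n. \end{aligned} \]

Substituting these expressions for $a_n$ into the sum, we see that

\[ \begin{aligned} \sum_{n=1}^{k} a_n \varphi(\lambda_n) &= (A(\lambda_1) - A(\lambda_0)) \varphi(\lambda_1) + (A(\lambda_2) - A(\lambda_1)) \varphi(\lambda_2) + \cdots + (A(\lambda_k) - A(\lambda_{k-1})) \varphi(\lambda_k) \\ &= A(\lambda_k) \varphi(\lambda_k) - \sum_{n=1}^{k-1} A(\lambda_n) (\varphi(\lambda_{n+1}) - \varphi(\lambda_n)) \\ &= A(\lambda_k) \varphi(\lambda_k) - \sum_{n=1}^{k-1} A(\lambda_n) \int_{\lambda_n}^{\lambda_{n+1}} \varphi'(t) \, dt \\ &= A(\lambda_k) \varphi(\lambda_k) - \sum_{n=1}^{k-1} \int_{\lambda_n}^{\lambda_{n+1}} A(t) \varphi'(t) \, dt \\ &= A(\lambda_k) \varphi(\lambda_k) - \int_{\lambda_1}^{\lambda_k} A(t) \varphi'(t) \, dt. \end{aligned} \tag{4.2.4} \]

The last equation follows since $\varphi(t)$ has a continuous derivative on $(0, \infty)$. Now, if we let $k$ be the largest integer such that $\lambda_k \le N$, then from integration by parts we have

\[ \int_{\lambda_k}^{N} A'(t) \varphi(t) \, dt = A(N) \varphi(N) - A(\lambda_k) \varphi(\lambda_k) - \int_{\lambda_k}^{N} A(t) \varphi'(t) \, dt. \]

Since $A(t)$ is a step function, it is constant on $\lambda_k \le t < \lambda_{k+1}$, so the left hand side of the above equation is zero, and rearranging we obtain

\[ A(\lambda_k) \varphi(\lambda_k) = A(N) \varphi(N) - \int_{\lambda_k}^{N} A(t) \varphi'(t) \, dt. \]

Substituting this back into (4.2.4), and again using the fact that $A(t)$ is a step function, we obtain our result (4.2.3):

\[ \sum_{\lambda_n \le N} a_n \varphi(\lambda_n) = A(N) \varphi(N) - \int_{\lambda_1}^{N} A(t) \varphi'(t) \, dt. \]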
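A concrete check of (4.2.3) may help fix the idea. The Python sketch below assumes the special case $\lambda_n = n$ and an integer upper limit $N$, so that $A(t)$ is constant on each interval $[n, n+1)$ and the integral reduces exactly to the sum in the second line of (4.2.4). The function names are hypothetical, and exact fractions are used so that both sides can be compared for equality.

```python
from fractions import Fraction

def abel_rhs(a, phi, n_max):
    """Right-hand side of (4.2.3) for lambda_n = n and integer N = n_max.

    A(t) is constant on [n, n + 1), so the integral of A(t) * phi'(t) over
    that interval is exactly A(n) * (phi(n + 1) - phi(n)).
    """
    partial = 0     # running value of A(n)
    integral = 0    # integral of A(t) * phi'(t) from 1 to n_max
    for n in range(1, n_max):
        partial += a(n)
        integral += partial * (phi(n + 1) - phi(n))
    partial += a(n_max)    # now partial == A(N)
    return partial * phi(n_max) - integral

def direct_sum(a, phi, n_max):
    """Left-hand side of (4.2.3): the sum of a_n * phi(n) for n <= N."""
    return sum(a(n) * phi(n) for n in range(1, n_max + 1))

a = lambda n: n * n + 1           # an arbitrary test sequence
phi = lambda t: Fraction(1, t)    # phi(t) = 1/t
assert abel_rhs(a, phi, 50) == direct_sum(a, phi, 50)
print(direct_sum(a, phi, 50))
```

The choice of `a` and `phi` here is arbitrary; the identity is purely algebraic in this discrete setting and holds for any sequence.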

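4.3 Proof of Mertens' Theorem

We now have all the ingredients necessary to prove Mertens' theorem. We will begin by proving (4.1.1), that is

\[ \sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1); \qquad \sum_{p \le N} \frac{\log p}{p} = \log N + O(1). \]

Recall in (3.3.4) we showed that

\[ N! = \prod_{p \le N} p^{\left\lfloor \frac{N}{p} \right\rfloor + \left\lfloor \frac{N}{p^2} \right\rfloor + \cdots + \left\lfloor \frac{N}{p^s} \right\rfloor}, \]

where $s$ is the largest integer for which $p^s \le N$. Taking logarithms of both sides of this formula we obtain

\[ \log N! = \sum_{(p, r):\ p^r \le N} \left\lfloor \frac{N}{p^r} \right\rfloor \log p = \sum_{n \le N} \left\lfloor \frac{N}{n} \right\rfloor \Lambda(n). \tag{4.3.1} \]

As in the proof of Chebyshev's theorem, we write

\[ \left\lfloor \frac{N}{n} \right\rfloor = \frac{N}{n} - \varepsilon_n, \quad \text{where } 0 \le \varepsilon_n < 1. \]

Substitution in (4.3.1) gives us

\[ \log N! = N \sum_{n \le N} \frac{\Lambda(n)}{n} - \sum_{n \le N} \varepsilon_n \Lambda(n), \]

and by (3.3.1) and Theorem 3.2, we have

\[ \sum_{n \le N} \varepsilon_n \Lambda(n) \le \sum_{n \le N} \Lambda(n) = \psi(N) = O(N), \]

thus

\[ \log N! = N \sum_{n \le N} \frac{\Lambda(n)}{n} + O(N). \]

By (4.2.1)

\[ \log N! = N \log N + O(N), \]

and dividing through by $N$ we get

\[ \sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1), \tag{4.3.2} \]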

which is the first formula in (4.1.1). To get the second formula in (4.1.1) we bound the difference between (4.3.2) and our summation of interest by a constant, that is

\[ \left| \sum_{n \le N} \frac{\Lambda(n)}{n} - \sum_{p \le N} \frac{\log p}{p} \right| \le \sum_{p \le N} \left( \frac{1}{p^2} + \frac{1}{p^3} + \cdots \right) \log p < \sum_{p} \frac{\log p}{p(p-1)} < \sum_{n \ge 2} \frac{\log n}{n(n-1)}. \tag{4.3.3} \]

Here the first inequality in (4.3.3) follows since the von Mangoldt function is nonzero only for $n = p^m$, hence all terms in both sums agree except those with denominators which are powers of primes with $m \ge 2$. The final series on the right hand side of (4.3.3) is convergent: since $\frac{1}{n(n-1)} \le \frac{2}{n^2}$, and, from the proof of Chebyshev's theorem, $\log n < \sqrt{n}$ for $n \ge 2$, it follows that

\[ \sum_{n \ge 2} \frac{\log n}{n(n-1)} < 2 \sum_{n \ge 2} \frac{\log n}{n^2} < 2 \sum_{n \ge 2} \frac{\sqrt{n}}{n^2} = 2 \sum_{n \ge 2} \frac{1}{n^{3/2}} = C < \infty. \]

The last series is the convergent p-series with $p = \frac{3}{2}$. This shows the second formula in (4.1.1) holds, since the difference between the two summations will always be bounded above by this constant. That is, by (4.3.3),

\[ \left| \sum_{n \le N} \frac{\Lambda(n)}{n} - \sum_{p \le N} \frac{\log p}{p} \right| = O(1). \]

Now

\[ \begin{aligned} \sum_{p \le N} \frac{\log p}{p} &= \sum_{n \le N} \frac{\Lambda(n)}{n} + \left( \sum_{p \le N} \frac{\log p}{p} - \sum_{n \le N} \frac{\Lambda(n)}{n} \right) = \log N + O(1) + O(1) \\ &= \log N + O(1). \end{aligned} \]

Therefore

\[ \sum_{p \le N} \frac{\log p}{p} = \log N + O(1), \]

which completes (4.1.1).

Next we shall prove (4.1.2), that is

\[ \int_1^N \frac{\psi(t)}{t^2} \, dt = \log N + O(1). \]

Recall that in section 3.4, we used this formula without proof to show that

\[ \limsup_{N \to \infty} \frac{\pi(N)}{N / \log N} \ge 1 \quad \text{and} \quad \liminf_{N \to \infty} \frac{\pi(N)}{N / \log N} \le 1. \]

Therefore, if

\[ \lim_{N \to \infty} \frac{\pi(N)}{N / \log N} \]

exists, then the limit must be equal to 1. From this, we obtain the asymptotic formula in the prime number theorem.

To begin the proof, recall the previously mentioned relation

\[ \psi(t) = \sum_{n \le t} \Lambda(n). \]

We substitute this expression for $\psi(t)$ into the integral to obtain

\[ \int_1^N \frac{\psi(t)}{t^2} \, dt = \int_1^N \sum_{n \le t} \Lambda(n) \frac{dt}{t^2}. \]


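Now, since $\psi(t)$ is constant between consecutive integers, splitting the range of integration at $t = 1, 2, 3, \ldots, N$ we see that we can rewrite

\[ \int_1^N \sum_{n \le t} \Lambda(n) \frac{dt}{t^2} = \int_1^2 \Lambda(1) \frac{1}{t^2} \, dt + \int_2^3 (\Lambda(1) + \Lambda(2)) \frac{1}{t^2} \, dt + \cdots + \int_{N-1}^{N} (\Lambda(1) + \cdots + \Lambda(N-1)) \frac{1}{t^2} \, dt, \]

which can be rewritten

\[ \Lambda(1) \int_1^N \frac{1}{t^2} \, dt + \Lambda(2) \int_2^N \frac{1}{t^2} \, dt + \cdots + \Lambda(N-1) \int_{N-1}^{N} \frac{1}{t^2} \, dt, \]

thus

\[ \int_1^N \frac{\psi(t)}{t^2} \, dt = \int_1^N \sum_{n \le t} \Lambda(n) \frac{dt}{t^2} = \sum_{n \le N} \Lambda(n) \int_n^N \frac{dt}{t^2}. \tag{4.3.4} \]

The integral on the right hand side of (4.3.4) is equal to $\frac{1}{n} - \frac{1}{N}$. Substituting this back in and simplifying, we obtain

\[ \int_1^N \frac{\psi(t)}{t^2} \, dt = \sum_{n \le N} \frac{\Lambda(n)}{n} - \frac{\psi(N)}{N}. \]

Now applying the first part of (4.1.1) and (4.2.2), we have

\[ \int_1^N \frac{\psi(t)}{t^2} \, dt = \log N + O(1), \]

which is (4.1.2).

Now we can prove (4.1.3), that is

\[ \sum_{p \le N} \frac{1}{p} = \log \log N + C + O\left(\frac{1}{\log N}\right). \]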

The proof requires defining the proper sequences and functions for use in Abel's summation formula (4.2.3). Let $\lambda_n = p_n$, where $p_n$ is the sequence of prime numbers in natural order. We define

\[ A(N) = \sum_{p_n \le N} a_n, \quad \text{where } a_n = \frac{\log p_n}{p_n}, \]

and $\varphi(t) = \frac{1}{\log t}$, so that $\varphi(p_n) = \frac{1}{\log p_n}$. Abel summation gives us

\[ \sum_{p_n \le N} \frac{1}{p_n} = \frac{A(N)}{\log N} + \int_2^N \frac{A(t)}{t (\log t)^2} \, dt. \]

Now from the second formula in (4.1.1), we may write $A(N) = \log N + E(N)$, so that

\[ \sum_{p_n \le N} \frac{1}{p_n} = 1 + \frac{E(N)}{\log N} + \int_2^N \frac{\log t + E(t)}{t (\log t)^2} \, dt, \]

where by (4.1.1), $E(N)$ is $O(1)$, thus $|E(N)| \le K$ for all $N \ge 2$, where $K$ is some constant. We can rearrange this to get

\[ \sum_{p_n \le N} \frac{1}{p_n} = 1 + \frac{E(N)}{\log N} + \int_2^N \frac{dt}{t \log t} + \int_2^N \frac{E(t)}{t (\log t)^2} \, dt. \]

Now the first integral is

\[ \int_2^N \frac{dt}{t \log t} = \log \log N - \log \log 2, \]

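and substituting this back in and rearranging, we see that

\[ \sum_{p_n \le N} \frac{1}{p_n} = \log \log N + 1 - \log \log 2 + \frac{E(N)}{\log N} + \int_2^N \frac{E(t)}{t (\log t)^2} \, dt. \]

Using the fact that $E(N) = O(1)$, we have $\frac{|E(N)|}{\log N} \le \frac{K}{\log N}$. It remains to show that

\[ \int_2^N \frac{E(t)}{t (\log t)^2} \, dt \]

is equal to a constant plus $O\left(\frac{1}{\log N}\right)$. We write

\[ \int_2^N \frac{E(t)}{t (\log t)^2} \, dt = \int_2^{\infty} \frac{E(t)}{t (\log t)^2} \, dt - \int_N^{\infty} \frac{E(t)}{t (\log t)^2} \, dt. \]

Now the improper integral

\[ \int_2^{\infty} \frac{E(t)}{t (\log t)^2} \, dt \]

is convergent since

\[ \left| \int_2^{\infty} \frac{E(t)}{t (\log t)^2} \, dt \right| \le K \int_2^{\infty} \frac{dt}{t (\log t)^2} = \frac{K}{\log 2}. \]

Now we rewrite once again

\[ \sum_{p_n \le N} \frac{1}{p_n} = \log \log N + \left( 1 - \log \log 2 + \int_2^{\infty} \frac{E(t)}{t (\log t)^2} \, dt \right) + \frac{E(N)}{\log N} - \int_N^{\infty} \frac{E(t)}{t (\log t)^2} \, dt, \]

where

\[ \left| \int_N^{\infty} \frac{E(t)}{t (\log t)^2} \, dt \right| \le \frac{K}{\log N}. \]

Hence

\[ \left| \frac{E(N)}{\log N} - \int_N^{\infty} \frac{E(t)}{t (\log t)^2} \, dt \right| \le \frac{2K}{\log N}, \]

for $N \ge 2$, which shows the error term is $O\left(\frac{1}{\log N}\right)$. Setting

\[ C = 1 - \log \log 2 + \int_2^{\infty} \frac{E(t)}{t (\log t)^2} \, dt, \]

(4.1.3) follows. This completes the proof of Mertens' theorem.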


§5: The Product Form of Mertens’ Theorem

The goal of this section is to prove an important result that follows from the proof of Mertens' theorem in 4.3, that is [10]

\[ \prod_{p \le N} \left(1 - \frac{1}{p}\right) = \frac{e^{-\gamma}}{\log N} (1 + o(1)) \sim \frac{e^{-\gamma}}{\log N}, \tag{5.0} \]

where $\gamma$ is the Euler-Mascheroni constant.

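5.1 The Euler-Mascheroni Constant

The Euler-Mascheroni constant, or Euler's constant, was defined by Euler in 1734 as

\[ \gamma = \lim_{N \to \infty} \left[ \sum_{n=1}^{N} \frac{1}{n} - \log N \right] = \int_1^{\infty} \left( \frac{1}{\lfloor t \rfloor} - \frac{1}{t} \right) dt. \tag{5.1.1} \]

The symbol $\gamma$ was first used by the geometer Lorenzo Mascheroni, who in 1790 used the symbol $\gamma$ instead of Euler's $C$, hence the name Euler-Mascheroni. We can use the Abel summation formula (4.2.3) to give us an approximation to $\gamma$. Let $a_n = 1$, so that $A(N) = \lfloor N \rfloor$, and let $\varphi(t) = \frac{1}{t}$; then Abel summation gives us

\[ \sum_{n \le N} \frac{1}{n} = \frac{\lfloor N \rfloor}{N} + \int_1^N \frac{\lfloor t \rfloor}{t^2} \, dt. \tag{5.1.2} \]

Letting $\lfloor N \rfloor = N - \varepsilon_N$ and $\lfloor t \rfloor = t - \varepsilon_t$, where $0 \le \varepsilon_N < 1$, $0 \le \varepsilon_t < 1$, and substituting into (5.1.2) we get

\[ \sum_{n \le N} \frac{1}{n} - \log N = 1 - \frac{\varepsilon_N}{N} - \int_1^N \frac{\varepsilon_t}{t^2} \, dt. \tag{5.1.3} \]

Letting $N \to \infty$ gives us another expression for $\gamma$,

\[ \gamma = 1 - \int_1^{\infty} \frac{\varepsilon_t}{t^2} \, dt, \tag{5.1.4} \]

thus we have $0 < \gamma \le 1$. Now if we subtract and add $\int_N^{\infty} \frac{\varepsilon_t}{t^2} \, dt$ on the right hand side of (5.1.3), we can use (5.1.4) to write

\[ \sum_{n \le N} \frac{1}{n} = \log N + \gamma + E(N), \]

where

\[ |E(N)| = \left| \int_N^{\infty} \frac{\varepsilon_t}{t^2} \, dt - \frac{\varepsilon_N}{N} \right| \le \left| \int_N^{\infty} \frac{\varepsilon_t}{t^2} \, dt \right| + \left| \frac{\varepsilon_N}{N} \right| < \frac{2}{N}. \]

Thus we have

\[ \sum_{n \le N} \frac{1}{n} - \log N - \frac{2}{N} < \gamma < \sum_{n \le N} \frac{1}{n} - \log N + \frac{2}{N}. \]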


This inequality provides a method to compute the value of $\gamma$ to several places. To fifteen decimal places, the value of $\gamma$ is 0.577215664901533. [3]
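The two-sided bound above can be evaluated directly. The Python sketch below is a simple illustration of that inequality, assuming an integer $N \ge 1$; `gamma_bounds` is a hypothetical name, and `math.fsum` is used only to limit rounding error in the harmonic sum.

```python
import math

def gamma_bounds(n):
    """Return (lower, upper) bounds for gamma from the inequality with N = n."""
    harmonic = math.fsum(1.0 / k for k in range(1, n + 1))
    centre = harmonic - math.log(n)
    return centre - 2.0 / n, centre + 2.0 / n

for n in (10, 10**3, 10**5):
    lower, upper = gamma_bounds(n)
    print(n, lower, upper)
```

The interval has width $4/N$, so this method gains roughly one decimal place for each tenfold increase in $N$; it illustrates the bound rather than offering an efficient way to compute $\gamma$.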

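The constant $\gamma$ is closely connected to the Gamma function,

\[ \Gamma(N) = \int_0^{\infty} e^{-x} x^{N-1} \, dx. \tag{5.1.5} \]

This connection is due to the mathematician Karl Weierstrass (1815-1897) and his formula

\[ \frac{1}{\Gamma(N)} = N e^{\gamma N} \prod_{k > 0} \left(1 + \frac{N}{k}\right) e^{-\frac{N}{k}}. \tag{5.1.6} \]

It requires some complex analysis to show that (5.1.5) and (5.1.6) are equivalent representations for $\Gamma(N)$ [19]. Differentiating (5.1.5) with respect to $N$ we obtain

\[ \Gamma'(N) = \int_0^{\infty} e^{-x} x^{N-1} \log x \, dx, \qquad \Gamma'(1) = \int_0^{\infty} e^{-x} \log x \, dx. \tag{5.1.7} \]

Logarithmically differentiating (5.1.6) we obtain

\[ \begin{aligned} -\log(\Gamma(N)) &= \log N + \gamma N + \sum_{k > 0} \left( \log\left(1 + \frac{N}{k}\right) - \frac{N}{k} \right), \\ -\frac{\Gamma'(N)}{\Gamma(N)} &= \frac{1}{N} + \gamma + \sum_{k > 0} \left( \frac{1}{k + N} - \frac{1}{k} \right), \\ -\Gamma'(1) &= \gamma, \end{aligned} \tag{5.1.8} \]

since from (5.1.5), $\Gamma(1) = 1$, and the sum telescopes to $-1$ when $N = 1$. Therefore, from (5.1.7) and (5.1.8),

\[ \gamma = -\Gamma'(1) = -\int_0^{\infty} e^{-x} \log x \, dx. \tag{5.1.9} \]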

5.2 The Product Form as a Probability of a Number Being Prime

Intuitively, the product in (5.0) can be interpreted as the probability that a random number $N$ is not divisible by any primes $p$ up to $N$; in other words, the product represents the probability that $N$ is prime [8]. To see this, first consider the probability that a number is divisible by 2, which is $\frac{1}{2}$; thus the probability that a number is not divisible by 2 is $\left(1 - \frac{1}{2}\right) = \frac{1}{2}$. The probability that a number is divisible by 3 is naturally $\frac{1}{3}$, and therefore the probability that a number is not divisible by 3 is $\left(1 - \frac{1}{3}\right) = \frac{2}{3}$. Now, what is the probability that a number is divisible by neither 2 nor 3? Define the two events A: a number is divisible by 2; and B: a number is divisible by 3. As a special case, consider the first 20 numbers, $1, 2, 3, \ldots, 20$; then the probability of A is $\frac{10}{20} = \frac{1}{2}$, and the probability of B is $\frac{6}{20} = \frac{3}{10}$. The probability that a number is divisible by 3, given that it is divisible by 2, is $\frac{3}{10}$, which is just the probability of B, hence the events A and B are independent of each other. Therefore the probability that a given number is divisible by neither 2 nor 3 is $\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right) = \frac{1}{3}$. Similarly, the probability that a number is not divisible by 5 is $\left(1 - \frac{1}{5}\right)$, therefore the probability that any random integer is not divisible by 2, 3 or 5 is $\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{5}\right) = \frac{4}{15}$. Continuing this process for all primes $p \le N$ gives the probability that $N$ is not divisible by any of the $p \le N$ as

\[ \prod_{p \le N} \left(1 - \frac{1}{p}\right). \]

The product form of Mertens' theorem verifies that this product tends to zero asymptotically as $\frac{1}{e^{\gamma} \log N}$. The product does take a surprisingly long time to approach zero, since the product form of Mertens' theorem shows that at $N = 100{,}000$, $\frac{1}{e^{0.577216} \log 100{,}000} \approx 0.049$; thus according to this formulation, we still have about a 5% chance that the integer is prime.
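The comparison between the product and its asymptotic estimate can be reproduced with a few lines of code. The Python sketch below is an illustration only, assuming an integer $N \ge 2$; the names `mertens_product` and `mertens_estimate` are hypothetical, and the value of $\gamma$ is hard-coded as a constant.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant

def mertens_product(n):
    """Product of (1 - 1/p) over all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    product = 1.0
    for p in range(2, n + 1):
        if sieve[p]:
            product *= 1.0 - 1.0 / p
    return product

def mertens_estimate(n):
    """The asymptotic value e^(-gamma) / log n from (5.0)."""
    return math.exp(-EULER_GAMMA) / math.log(n)

for n in (10, 100, 10**3, 10**5):
    print(n, mertens_product(n), mertens_estimate(n))
```

For the first three primes the product is $\frac{4}{15}$, matching the hand computation above, and the two columns move closer together as $N$ grows.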

Fig.5.2.1: The diamonds are the graph of $\prod_{p \le N+1} \left(1 - \frac{1}{p}\right)$, and the stars are the graph of $\frac{1}{e^{0.577216} \log(N+1)}$, for $1 \le N \le 10$. (Horizontal axis: $N$; vertical axis: $p$.)

Fig.5.2.2: The diamonds are the graph of $\prod_{p \le N+1} \left(1 - \frac{1}{p}\right)$, and the stars are the graph of $\frac{1}{e^{0.577216} \log(N+1)}$, for $1 \le N \le 100$. (Horizontal axis: $N$; vertical axis: $p$.)

The reader may notice a sort of "paradox" between this interpretation of (5.0) and the prime number theorem. The prime number theorem asserts that the density of prime numbers is $\frac{1}{\log N}$, whereas Mertens' theorem gives $\frac{1}{e^{\gamma} \log N}$, which is not asymptotic to $\frac{1}{\log N}$. In fact, compared to the prime number theorem, since $e^{\gamma} \approx 1.78$, the probability underestimates the density of primes by nearly a factor of 2. One explanation is that the Sieve of Eratosthenes is more efficient than random. Recall from the introduction that we find all primes $p \le N$ by sieving with primes $p \le \sqrt{N}$. Thus a number $N$ can be determined to be prime by testing whether it is divisible by any of the primes $p \le \sqrt{N}$. If in our probability formulation, we take the probability of $N$ being prime to be

\[ \prod_{p \le \sqrt{N}} \left(1 - \frac{1}{p}\right) \sim \frac{2}{e^{\gamma} \log N} \approx \frac{1.12}{\log N}, \]

then Mertens' theorem is much closer to $\frac{1}{\log N}$, but now overestimates the number of primes by 12%. Our intuition in formulating the probability in this section relied on the assumption of independent events, and it turns out that when using the Sieve of Eratosthenes approach, these events are not completely independent. In light of the prime number theorem, $\frac{1}{\log N}$ is the true density of primes; but the density obtained by Mertens' theorem is a decent estimate, and shows that the product tends to zero.
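The factor $\frac{2}{e^{\gamma}} \approx 1.12$ comes from applying (5.0) at $\sqrt{N}$, since $\log \sqrt{N} = \frac{1}{2} \log N$. The Python sketch below checks this numerically; it is a rough illustration under the assumption that $N \ge 4$ is an integer, and the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant

def sieve_product(limit):
    """Product of (1 - 1/p) over primes p <= limit, for limit >= 2."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    product = 1.0
    for p in range(2, limit + 1):
        if sieve[p]:
            product *= 1.0 - 1.0 / p
    return product

def scaled_sqrt_product(n):
    """log(n) times the product over primes p <= sqrt(n)."""
    return math.log(n) * sieve_product(math.isqrt(n))

target = 2.0 * math.exp(-EULER_GAMMA)   # the limiting factor, about 1.12
for n in (10**4, 10**6, 10**8, 10**10):
    print(n, scaled_sqrt_product(n), target)
```

Only primes up to $\sqrt{N}$ are sieved, so even $N = 10^{10}$ requires a sieve of size $10^5$.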

5.3 Proof of the Product form of Mertens’ Theorem

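Recall that in section 4.3 we found that

\[ \sum_{p \le N} \frac{1}{p} = \log \log N + C + O\left(\frac{1}{\log N}\right). \]

Because $\frac{1}{\log N} = o(1)$, Mertens' theorem can be restated as

\[ \sum_{p \le N} \frac{1}{p} = \log \log N + C + o(1), \tag{5.3.1} \]

where

\[ C = 1 - \log \log 2 + \int_2^{\infty} \frac{E(t)}{t (\log t)^2} \, dt. \]

(5.0) will follow from (5.3.1) and the following theorem, which gives an alternate form of the constant $C$ in (5.3.1).

Theorem:

\[ C = \gamma + \sum_{p} \left[ \log\left(1 - \frac{1}{p}\right) + \frac{1}{p} \right]. \tag{5.3.2} \]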


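If $\alpha \ge 0$, we have, by (2.1.8),

\[ 0 < -\log\left(1 - \frac{1}{p^{1+\alpha}}\right) - \frac{1}{p^{1+\alpha}} < \frac{1}{2 p^{1+\alpha} (p^{1+\alpha} - 1)} \le \frac{1}{2p(p-1)}. \]

We saw in (2.1.8) that the series $\sum_p \frac{1}{2p(p-1)}$ is convergent. Thus by the Weierstrass M-test, the series

\[ F(\alpha) = \sum_{p} \left[ \log\left(1 - \frac{1}{p^{1+\alpha}}\right) + \frac{1}{p^{1+\alpha}} \right] \tag{5.3.3} \]

converges uniformly for all $\alpha \ge 0$. Since the series is uniformly convergent for all $\alpha \ge 0$, $F(\alpha)$ is continuous, thus $F(\alpha) \to F(0)$ as $\alpha \to 0$ through positive values. If we now suppose that $\alpha > 0$, then from the equality between Euler's product and the zeta-function discussed in the introduction, we have

\[ F(\alpha) = g(\alpha) - \log \zeta(1 + \alpha), \]

where

\[ g(\alpha) = \sum_{p} \frac{1}{p^{1+\alpha}}. \]

We again call upon the Abel summation formula (4.2.3), with the following definitions: let the sequence $\lambda_n$ be the sequence of positive integers starting from $\lambda_1 = 2$,

\[ a_n = \begin{cases} \frac{1}{n}, & \text{if } n \text{ is prime} \\ 0, & \text{otherwise} \end{cases} \]

and $\varphi(N) = \frac{1}{N^{\alpha}}$. Now, from the proof of Mertens' theorem in 4.3 we have that

\[ A(N) = \sum_{p \le N} \frac{1}{p} = \log \log N + C + E(N); \]

with these definitions, (4.2.3) becomes

\[ \sum_{p \le N} \frac{1}{p^{1+\alpha}} = \frac{A(N)}{N^{\alpha}} + \alpha \int_2^{N} \frac{A(t)}{t^{1+\alpha}} \, dt. \]

If we let $N \to \infty$, we have

\[ \begin{aligned} g(\alpha) &= \alpha \int_2^{\infty} \frac{A(t)}{t^{1+\alpha}} \, dt \\ &= \alpha \int_2^{\infty} \frac{\log \log t}{t^{1+\alpha}} \, dt + \alpha C \int_2^{\infty} \frac{1}{t^{1+\alpha}} \, dt + \alpha \int_2^{\infty} \frac{E(t)}{t^{1+\alpha}} \, dt. \end{aligned} \]

Now,

\[ \alpha \int_2^{\infty} \frac{\log \log t}{t^{1+\alpha}} \, dt = \alpha \int_1^{\infty} \frac{\log \log t}{t^{1+\alpha}} \, dt - \alpha \int_1^{2} \frac{\log \log t}{t^{1+\alpha}} \, dt. \]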


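Making the change of variables $t = e^{u/\alpha}$,

\[ \alpha \int_1^{\infty} \frac{\log \log t}{t^{1+\alpha}} \, dt = \int_0^{\infty} e^{-u} \log\left(\frac{u}{\alpha}\right) du = -\gamma - \log \alpha \]

by (5.1.9), and

\[ \alpha \int_1^{\infty} \frac{1}{t^{1+\alpha}} \, dt = 1. \]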
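For reference, the substitution can be carried out step by step as follows. This is a worked sketch of the computation, using only (5.1.9) and the elementary fact $\int_0^{\infty} e^{-u} \, du = 1$.

```latex
\begin{aligned}
t = e^{u/\alpha} \ &\Longrightarrow\ dt = \tfrac{1}{\alpha} e^{u/\alpha} \, du, \qquad
\log\log t = \log\frac{u}{\alpha}, \qquad
t^{1+\alpha} = e^{u/\alpha} e^{u}, \\
\alpha \int_1^{\infty} \frac{\log\log t}{t^{1+\alpha}} \, dt
 &= \alpha \int_0^{\infty} \log\frac{u}{\alpha} \cdot \frac{e^{u/\alpha}}{\alpha \, e^{u/\alpha} e^{u}} \, du
  = \int_0^{\infty} e^{-u} \log\frac{u}{\alpha} \, du \\
 &= \int_0^{\infty} e^{-u} \log u \, du \;-\; \log\alpha \int_0^{\infty} e^{-u} \, du
  = -\gamma - \log\alpha .
\end{aligned}
```

The limits change from $t \in [1, \infty)$ to $u \in [0, \infty)$ because $u = \alpha \log t$.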

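Therefore

\[ g(\alpha) + \log \alpha - C + \gamma = \alpha \int_2^{\infty} \frac{E(t)}{t^{1+\alpha}} \, dt - \alpha \int_1^{2} \frac{\log \log t + C}{t^{1+\alpha}} \, dt. \]

It was shown in 4.3 that $E(t) = O\left(\frac{1}{\log t}\right)$; using this, and setting $T = e^{1/\sqrt{\alpha}}$,

\[ \left| \alpha \int_2^{\infty} \frac{E(t)}{t^{1+\alpha}} \, dt \right| < K \alpha \int_2^{T} \frac{dt}{t} + \frac{K \alpha}{\log T} \int_T^{\infty} \frac{dt}{t^{1+\alpha}} < K \alpha \log T + \frac{K}{\log T} \le 2K \sqrt{\alpha} \to 0 \text{ as } \alpha \to 0. \]

We also have

\[ \left| \int_1^{2} \frac{\log \log t + C}{t^{1+\alpha}} \, dt \right| < \int_1^{2} \frac{|\log \log t| + |C|}{t} \, dt = K', \]

a finite constant, since the integral converges at $t = 1$; hence $\alpha$ times this integral tends to 0. Therefore $g(\alpha) + \log \alpha \to C - \gamma$ as $\alpha \to 0$.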

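Recall from section 1.2 the zeta-function, which can be written in the following form:

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^{s}} = \int_{1}^{\infty} x^{-s}\,dx + \sum_{n=1}^{\infty} \int_{n}^{n+1} \left(n^{-s} - x^{-s}\right) dx. \tag{5.3.4} \]

Now, since $s > 1$, we have

\[ \int_{1}^{\infty} x^{-s}\,dx = \frac{1}{s-1}. \]

Also

\[ 0 < n^{-s} - x^{-s} = \int_{n}^{x} s t^{-s-1}\,dt < \int_{n}^{x} s t^{-2}\,dt = \frac{s(x-n)}{nx} < \frac{s}{n^{2}}, \]

if $n < x < n+1$, and so

\[ 0 < \int_{n}^{n+1} \left(n^{-s} - x^{-s}\right) dx < \int_{n}^{n+1} \frac{s}{n^{2}}\,dx = \frac{s}{n^{2}}; \]

and the last term in (5.3.4) is positive and less than

\[ s \sum_{n=1}^{\infty} \frac{1}{n^{2}} = \frac{s\pi^{2}}{6}. \]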
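The bound on the correction term in (5.3.4) can be illustrated numerically. The sketch below is an illustration only, with hypothetical function names: it evaluates each integral $\int_n^{n+1}(n^{-s} - x^{-s})\,dx$ in closed form and sums finitely many of them, which approximates $\zeta(s) - \frac{1}{s-1}$ from below.

```python
import math


def correction_term(n, s):
    """Exact value of the integral of n^(-s) - x^(-s) over [n, n+1], for s > 1."""
    # integral of x^(-s) over [n, n+1] is (n^(1-s) - (n+1)^(1-s)) / (s - 1)
    integral_of_power = (n ** (1 - s) - (n + 1) ** (1 - s)) / (s - 1)
    return n ** -s - integral_of_power


def zeta_minus_pole(s, terms=100000):
    """Partial sum of the correction terms; approximates zeta(s) - 1/(s - 1)."""
    return sum(correction_term(n, s) for n in range(1, terms + 1))


if __name__ == "__main__":
    for s in (2.0, 1.5, 1.1):
        value = zeta_minus_pole(s)
        print(s, value, "upper bound:", s * math.pi ** 2 / 6)
```

Each term is positive and below $s/n^{2}$, so the partial sums stay between $0$ and $s\pi^{2}/6$, in line with the estimate above.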

Therefore

\[ \zeta(s) = \frac{1}{s-1} + O(s), \]

and on taking logarithms, for $|s-1| < 1$,

\[ \log \zeta(s) = \log\frac{1}{s-1} + \log\left[1 + O(s-1)\right] = \log\frac{1}{s-1} + O(s-1). \tag{5.3.5} \]

Now, from (5.3.5) we get $\log\zeta(1+\alpha) + \log\alpha \to 0$ as $\alpha \to 0$, and so $F(\alpha) \to C - \gamma$. Therefore

\[ C = \gamma + F(0), \]

which is (5.3.2).

It is now relatively easy to prove (5.0), by means of (5.3.1) and (5.3.2). To see this, using our new form of the constant $C$ in (5.3.2) in (5.3.1) we write

\[ \sum_{p \le N} \frac{1}{p} = \log\log N + \gamma + \sum_{p} \left[ \log\left(1 - \frac{1}{p}\right) + \frac{1}{p} \right] + o(1). \tag{5.3.6} \]

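Now, (5.3.6) is equal to

\[ \sum_{p \le N} \frac{1}{p} = \log\log N + \gamma + \sum_{p \le N} \left[ \log\left(1 - \frac{1}{p}\right) + \frac{1}{p} \right] + \sum_{p > N} \left[ \log\left(1 - \frac{1}{p}\right) + \frac{1}{p} \right] + o(1). \]

The sum over $p > N$ is the tail of a convergent series, so it tends to zero as $N \to \infty$ and can be absorbed into the $o(1)$ term. Thus

\[ \sum_{p \le N} \frac{1}{p} = \log\log N + \gamma + \sum_{p \le N} \log\left(1 - \frac{1}{p}\right) + \sum_{p \le N} \frac{1}{p} + o(1). \]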

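Canceling and moving some terms around we get

\[ \sum_{p \le N} \log\left(1 - \frac{1}{p}\right) = -\log\log N - \gamma - o(1); \]

exponentiating we get

\[ e^{\sum_{p \le N} \log\left(1 - \frac{1}{p}\right)} = e^{-\log\log N - \gamma - o(1)}; \]

using properties of logarithms and exponentials we obtain

\[ \prod_{p \le N} e^{\log\left(1 - \frac{1}{p}\right)} = e^{-\log\log N}\, e^{-\gamma}\, e^{-o(1)}; \]

finally

\[ \prod_{p \le N} \left(1 - \frac{1}{p}\right) = \frac{1}{\log N}\, e^{-\gamma} (1 + o(1)) \sim \frac{e^{-\gamma}}{\log N}, \]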

which is (5.0).
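The product form of Mertens’ theorem can be observed numerically. The following sketch is an illustration under stated assumptions, not part of the proof: the function names are hypothetical and the value of $\gamma$ is hard-coded to double precision. It computes the ratio of $\prod_{p \le N}(1 - 1/p)$ to $e^{-\gamma}/\log N$, which the theorem says tends to 1.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded (assumed precision: double)


def primes_up_to(n):
    """Return the primes <= n using the Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def mertens_ratio(N):
    """Return prod_{p<=N}(1 - 1/p) divided by e^(-gamma) / log N."""
    # Sum logarithms instead of multiplying to keep the product numerically stable.
    log_product = sum(math.log1p(-1.0 / p) for p in primes_up_to(N))
    return math.exp(log_product) * math.log(N) * math.exp(EULER_GAMMA)


if __name__ == "__main__":
    for N in (10**2, 10**4, 10**6):
        print(N, mertens_ratio(N))
```

The printed ratios should approach 1 as $N$ grows, although the convergence guaranteed by the $o(1)$ term is slow.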

5.4 Another Proof of the Product Form of Mertens’ Theorem

The following proof uses many of the same steps and lemmas as the previous proof; but the integral evaluations for finding the expressions in the sum $F(\alpha)$ are simplified by an exponential substitution. This proof also does not require using the Gamma function form of $\gamma$ given by (5.1.9), only the definition of $\gamma$. We begin by using (5.3.5) to write

\[ \log \zeta(1+\alpha) = \log\frac{1}{\alpha} + O(\alpha). \]

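From the Maclaurin series expansion,

\[ 1 - e^{-\alpha} = 1 - \left(1 - \alpha + \frac{\alpha^{2}}{2!} - \frac{\alpha^{3}}{3!} + \cdots\right) = \alpha - \frac{\alpha^{2}}{2!} + \frac{\alpha^{3}}{3!} - \cdots = \alpha\left(1 + \alpha\left(-\frac{1}{2!} + \frac{\alpha}{3!} - \cdots\right)\right) = \alpha\,(1 + O(\alpha)), \quad \text{as } \alpha \to 0. \]

Taking logarithms we obtain

\[ \log \zeta(1+\alpha) = -\log\left(1 - e^{-\alpha}\right) + O(\alpha) = \sum_{n=1}^{\infty} \frac{e^{-\alpha n}}{n} + O(\alpha), \]

where the last line follows from the Maclaurin series expansion for $-\log(1-x)$ as shown in (2.1.8). This relationship is a key difference between the two proofs and simplifies the analysis considerably.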
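Both facts used in this step are easy to check numerically: that $-\log(1 - e^{-\alpha})$ differs from $\log\frac{1}{\alpha}$ by $O(\alpha)$, and that $-\log(1 - e^{-\alpha})$ equals the series $\sum_n e^{-\alpha n}/n$. The sketch below uses hypothetical function names and a truncated series, so it illustrates the claims rather than proving them.

```python
import math


def log_gap(alpha):
    """Return -log(1 - e^(-alpha)) - log(1/alpha), which should be O(alpha)."""
    # -expm1(-alpha) computes 1 - e^(-alpha) accurately for small alpha.
    return -math.log(-math.expm1(-alpha)) + math.log(alpha)


def truncated_series(alpha, terms):
    """Partial sum of e^(-alpha*n)/n for n = 1..terms."""
    return sum(math.exp(-alpha * n) / n for n in range(1, terms + 1))


if __name__ == "__main__":
    for alpha in (0.1, 0.01, 0.001):
        # The ratio gap/alpha stays bounded as alpha -> 0.
        print(alpha, log_gap(alpha) / alpha)
    alpha = 0.1
    print(truncated_series(alpha, 2000), -math.log(-math.expm1(-alpha)))
```

A bounded ratio `log_gap(alpha) / alpha` is exactly what the $O(\alpha)$ term asserts.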

From (5.3.3) we have the second representation of $\log\zeta(1+\alpha)$:

\[ \log \zeta(1+\alpha) = \sum_{p} \frac{1}{p^{1+\alpha}} - F(\alpha). \]

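Now letting

\[ H(t) = \sum_{n \le t} \frac{1}{n}, \qquad P(t) = \sum_{p \le t} \frac{1}{p}, \]

by Abel summation

\[ \sum_{p} \frac{1}{p^{1+\alpha}} = \alpha \int_{0}^{\infty} P(e^{t})\, e^{-\alpha t}\,dt; \]

and

\[ \sum_{n=1}^{\infty} \frac{e^{-\alpha n}}{n} = \alpha \int_{0}^{\infty} H(t)\, e^{-\alpha t}\,dt. \]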

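Therefore

\[ \log \zeta(1+\alpha) = \alpha \int_{0}^{\infty} H(t)\, e^{-\alpha t}\,dt + O(\alpha) \]

and

\[ \log \zeta(1+\alpha) = \alpha \int_{0}^{\infty} P(e^{t})\, e^{-\alpha t}\,dt - F(\alpha). \]

Subtracting to eliminate $\log\zeta(1+\alpha)$:

\[ \alpha \int_{0}^{\infty} e^{-\alpha t} \left(H(t) - P(e^{t})\right) dt = -F(\alpha) + O(\alpha). \tag{5.4.1} \]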

By (5.1.1)-(5.1.4) we have

\[ H(t) = \log t + \gamma + O\left(\frac{1}{t}\right). \]

We have from (5.3.1) that

\[ P(e^{t}) = \log t + C + o(1). \]

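Therefore

\[ \alpha \int_{0}^{\infty} e^{-\alpha t} \left(H(t) - P(e^{t})\right) dt = \alpha \int_{0}^{\infty} e^{-\alpha t} \left(\gamma - C + o(1)\right) dt = \alpha \left[ \left[ \frac{e^{-\alpha t}}{-\alpha} \right]_{0}^{\infty} \left(\gamma - C + o(1)\right) \right] = \gamma - C + o(1). \]

Thus by (5.4.1), letting $\alpha \to 0^{+}$,

\[ \gamma - C = -F(0) \]

or

\[ C = \gamma + F(0), \]

which is (5.3.2). As in the previous proof, once this form of the constant $C$ in Mertens’ theorem is known, the asymptotic formula follows.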
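The identity $C = \gamma + F(0)$ can be illustrated by computing the constant in two ways: from the partial sums $\sum_{p \le N} 1/p - \log\log N$, and from the series $\gamma + \sum_{p \le N}\left[\log(1 - 1/p) + 1/p\right]$. This is a sketch with hypothetical function names and a hard-coded value of $\gamma$; both estimates should be close to the commonly quoted value of about 0.2615 for this constant.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded (assumed precision: double)


def primes_up_to(n):
    """Return the primes <= n using the Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def constant_from_partial_sums(N):
    """Estimate C as sum_{p<=N} 1/p - log log N."""
    return sum(1.0 / p for p in primes_up_to(N)) - math.log(math.log(N))


def constant_from_series(N):
    """Estimate C as gamma + sum_{p<=N} [log(1 - 1/p) + 1/p], a truncation of gamma + F(0)."""
    return EULER_GAMMA + sum(math.log1p(-1.0 / p) + 1.0 / p for p in primes_up_to(N))


if __name__ == "__main__":
    for N in (10**3, 10**5, 10**6):
        print(N, constant_from_partial_sums(N), constant_from_series(N))
```

The series form converges much faster, since its terms are of size about $1/(2p^{2})$, while the partial-sum form carries the slowly decaying error term $E(N)$.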

6. Conclusion

6.1 Final Remarks

The distribution of the prime numbers throughout the integers is a fundamental problem in number theory. While the prime number theorem gives us certainty that the density is approximately $\dfrac{1}{\log N}$, this is only an asymptotic estimate. The actual difference between $\pi(N)$ and $\dfrac{N}{\log N}$, for large $N$, can be very large. The better approximation $\operatorname{Li}(N) = \displaystyle\int_{2}^{N} \frac{dt}{\log t}$ can still differ from $\pi(N)$ considerably, and how considerably is the subject of a Clay Institute million dollar prize: the Riemann Hypothesis. Euler laid the foundation for the modern analysis of prime numbers. His wonderful product

\[ \sum_{n=1}^{\infty} \frac{1}{n^{s}} = \prod_{p} \left(1 - \frac{1}{p^{s}}\right)^{-1} \]

provided the connection between the zeta-function and the primes. We relied on Euler’s product in 5.3 when we proved the product form of Mertens’ theorem.
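The gap between $\pi(N)$ and its two approximations $N/\log N$ and $\operatorname{Li}(N)$ mentioned above can be seen with a short computation. The sketch below is illustrative only: the function names are hypothetical, and $\operatorname{Li}(N)$ is approximated by Simpson’s rule after the substitution $t = e^{u}$, which is a choice made here for convenience.

```python
import math


def primes_up_to(n):
    """Return the primes <= n using the Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def Li(N, steps=10000):
    """Approximate the integral of dt/log t over [2, N] (steps must be even)."""
    # With t = e^u the integrand becomes e^u / u on [log 2, log N].
    a, b = math.log(2.0), math.log(N)
    h = (b - a) / steps
    f = lambda u: math.exp(u) / u
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


if __name__ == "__main__":
    for N in (10**3, 10**4, 10**5, 10**6):
        pi_N = len(primes_up_to(N))
        print(N, pi_N, N / math.log(N), Li(N))
```

In this range $\operatorname{Li}(N)$ tracks $\pi(N)$ far more closely than $N/\log N$ does, although neither difference is small in absolute terms.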

As was shown throughout this paper, the infinite sum of prime reciprocals is interwoven in the theory of the distribution of prime numbers. As we observed in section 2.1, Euler used his product to prove that

\[ \sum_{p \le N} \frac{1}{p} > \log\log N - \frac{1}{2}, \quad \text{for all } N \ge 2. \]
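The lower bound $\sum_{p \le N} 1/p > \log\log N - \frac{1}{2}$ can be spot-checked over a range of $N$. The following sketch, with hypothetical function names, returns the smallest margin by which the inequality holds for $2 \le N \le$ `limit`; a positive result is consistent with the bound but of course does not prove it.

```python
import math


def primes_up_to(n):
    """Return the primes <= n using the Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def min_margin(limit):
    """Smallest value of sum_{p<=N} 1/p - (log log N - 1/2) over 2 <= N <= limit."""
    prime_set = set(primes_up_to(limit))
    partial_sum = 0.0
    worst = math.inf
    for N in range(2, limit + 1):
        if N in prime_set:
            partial_sum += 1.0 / N
        worst = min(worst, partial_sum - (math.log(math.log(N)) - 0.5))
    return worst


if __name__ == "__main__":
    print(min_margin(10**5))
```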

In 3.2 the sum’s divergence was shown to be a consequence of Chebyshev’s theorem, which provides upper and lower bounds for $\pi(N)$. Further, we could not have proved the asymptotic formulas for the product form of Mertens’ theorem without the formula

\[ \sum_{p \le N} \frac{1}{p} = \log\log N + C + O\left(\frac{1}{\log N}\right), \]

as we saw in 4.3. The divergence of prime reciprocals also shows that the prime numbers are more numerous than the square numbers, and in fact more numerous than the integers of the form $n^{1+\alpha}$ for any $\alpha > 0$, since the sum of the reciprocals of such numbers always converges.

6.2 Further Results

An intriguing continuation of the thesis of this paper is to attempt to do the same sort of analysis, but instead summing over reciprocals of twin primes. Twin primes are primes that have a difference of two. We commonly refer to twin primes as twin prime pairs, $(p, p+2)$. The first five twin prime pairs are $(3, 5)$, $(5, 7)$, $(11, 13)$, $(17, 19)$, $(29, 31)$. Notice that 5 occurs in two pairs, and is the only instance of a prime occurring in two different twin prime pairs. This is because there can be no other instances of “triple” primes aside from 3, 5, 7, since any other sequence of three consecutive odd numbers must contain a number divisible by 3. Therefore, there will be no other repetitions of twin primes in the twin prime pairs. It is still only a conjecture that there are infinitely many twin prime pairs. Most mathematicians believe that there are infinitely many, but a proof still remains elusive.

In 1919, the Norwegian mathematician Viggo Brun (1885-1978) proved that, in stark contrast to the divergence of prime reciprocals,

\[ \sum_{p \text{ a twin prime}} \frac{1}{p} < \infty. \tag{6.2.1} \]

The proof of Brun’s Theorem is historically important because it introduced powerful sieving techniques for estimating the size of a sifted set of integers (recall the Sieve of Eratosthenes). It takes a lot of work to develop Brun’s sieve, but the key result that is used to prove (6.2.1) is that

\[ \pi_{2}(N) = O\left( \frac{N}{(\log N)^{2}} (\log\log N)^{2} \right), \]

where $\pi_{2}(N)$ is the number of primes $p \le N$ for which $p + 2$ is also prime (the number of twin prime pairs with first number in the pair less than or equal to $N$). There is no known easy proof of this result; perhaps the most accessible is a proof by Edmund Landau (1877-1938), who proved this result by a detailed analysis of the properties of certain alternating binomial coefficients [12]. Brun’s theorem at least tells us that there are not too many twin prime pairs.
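The quantities in Brun’s theorem are easy to tabulate for small $N$. The sketch below uses hypothetical function names: it counts $\pi_{2}(N)$, sums the reciprocals of the distinct primes belonging to a twin prime pair with first member at most $N$, and evaluates the comparison function $N(\log\log N)^{2}/(\log N)^{2}$. It illustrates how slowly the reciprocal sum grows; it says nothing about the implied constant in the $O$-bound.

```python
import math


def primes_up_to(n):
    """Return the primes <= n using the Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def twin_prime_stats(N):
    """Return (pi_2(N), sum of 1/p over distinct primes in twin pairs with first member <= N)."""
    primes = primes_up_to(N + 2)
    prime_set = set(primes)
    firsts = [p for p in primes if p <= N and p + 2 in prime_set]
    # Collect both members of each pair; the set counts 5 only once.
    members = set(firsts) | {p + 2 for p in firsts}
    reciprocal_sum = sum(1.0 / p for p in sorted(members))
    return len(firsts), reciprocal_sum


def brun_comparison(N):
    """Return N (log log N)^2 / (log N)^2, the size appearing in Brun's bound."""
    return N * math.log(math.log(N)) ** 2 / math.log(N) ** 2


if __name__ == "__main__":
    for N in (10**2, 10**4, 10**6):
        count, total = twin_prime_stats(N)
        print(N, count, total, brun_comparison(N))
```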

References

[1] Bays, C. and Hudson, R., (1999). A new bound for the smallest x with π(x) > li(x). Mathematics of Computation vol. 69, 231: pp. 1285-1296.

[2] Bell, E.T., (1937). Men of Mathematics. Simon and Schuster Inc., New York, New York.

[3] Brent, R.P., (1977). Computation of the regular continued fraction for Euler’s constant. Mathematics of Computation vol. 31: pp. 771-777.

[4] Chandrasekharan, K., (1968). Introduction to Analytic Number Theory. Berlin Heidelberg New York: Springer-Verlag.

[5] Cojocaru, A. C., and Murty, M.R., (2006). An Introduction to Sieve Methods and their Applications. Cambridge University Press, New York, New York.

[6] Eynden, C.V., (1980, May). Proofs that $\sum \frac{1}{p}$ diverges. The American Mathematical Monthly vol. 87, No. 5: pp. 394-397.

[7] Goldstein, L.J., (1973). A history of the prime number theorem. American Mathematical Monthly 80 (June-July): pp. 599-615.

[8] Goldston, D.A., Are there infinitely many twin primes? Article.

[9] Halberstam, H. and Richert, H.-E., (1974). Sieve Methods. Academic Press, New York.

[10] Hardy, G.H., and Wright, E.M., (1962). The Theory of Numbers, fourth edition. Oxford University Press, Amen House, London.

[11] Hungerford, T., (1974). Algebra. Springer Science+Business Media, LLC, New York, New York.

[12] Landau, E., (1958). Elementary Number Theory. Chelsea Publishing Company, New York, N.Y.

[13] Lindqvist, P. and Peetre, J., (1997). On the remainder in a series of Mertens. Expositiones Mathematicae 15, 467-477.

[14] Niven, I., (1971, May). A proof of the divergence of $\sum \frac{1}{p}$. The American Mathematical Monthly vol. 78, No. 3: pp. 272-273.

[15] Rosen, K.H., (2000). Elementary Number Theory, 4th edition. AT&T Laboratories and Kenneth Rosen.

[16] Sandifer, E., (2006, March). How Euler did it. Mathematical Association of America Online. June 27, 2008. http://fermatslasttheorem.blogspot.com/2006/08/euler-product-formula.html

[17] Sautoy, M.D., (2003). Music of the Primes, 1st edition. Harper Collins Publishers Inc., New York, New York.

[18] The Great Internet Mersenne Prime Search. July 16, 2008 at http://www.mersenne.org/prime.htm

[19] Titchmarsh, E.C., (1939). The Theory of Functions, second edition. Oxford University Press, Amen House, London.

[20] Whittaker, E. T., and Watson, G. N., (1963). A Course in Modern Analysis, fourth edition. Cambridge University Press, New York, New York.

[21] Wolframscience.com. November 18th, 2008 at http://www.wolframscience.com/nksonline/page-908b-text?firstview=1