
Math 506, Introduction to Number Theory

Kansas State University

Spring 2016

March 21, 2016

Todd Cochrane

Department of Mathematics, Kansas State University

Contents

Notation for Math 506

Chapter 1. Divisibility, Congruences and Induction
1.1. Introduction
1.2. A brief look at the Axioms sheet
1.3. Divisibility Properties of Z
1.4. Greatest common divisors and least common multiples
1.5. The Euclidean algorithm
1.6. Euclidean Algorithm
1.7. Linear Combinations and the GCDLC theorem
1.8. Euclid’s Lemma
1.9. Linear Equations in two variables
1.10. Introduction to Congruences
1.11. Principle of Induction

Chapter 2. Primes and Unique Factorization
2.1. Fundamental Theorem of Arithmetic
2.2. Gaussian Integers
2.3. Distribution of primes

Chapter 3. Arithmetic Functions
3.1. Multiplicative Functions
3.2. Perfect, Deficient and Abundant Numbers
3.3. Properties of multiplicative functions
3.4. The Mobius Function

Chapter 4. More on Congruences
4.1. Counting Solutions of Congruences
4.2. Linear Congruences
4.3. Multiplicative inverses (mod m)
4.4. Chinese Remainder Theorem
4.5. Fermat’s Little Theorem and Euler’s Theorem
4.6. Orders of elements (mod m)
4.7. Decimal Expansions
4.8. Primality Testing
4.9. Public-Key Cryptography

Chapter 5. Polynomial Congruences
5.1. Lifting solutions from (mod p) to (mod p^e)
5.2. Counting Solutions of congruences
5.3. Solving congruences (mod p)
5.4. Quadratic Residues and the Legendre Symbol
5.5. Quadratic Reciprocity
5.6. Representing primes as sums of two squares

Appendix A. Axioms for the set of Integers Z

Appendix B. A Little Bit of Logic for Math 506

Notation for Math 506

N = {1, 2, 3, 4, 5, . . . } = Natural numbers

Z = {0, ±1, ±2, ±3, . . . } = Integers

E = {0,±2,±4,±6, . . . } = Even integers

O = {±1,±3,±5, . . . } = Odd integers

Q = {a/b : a, b ∈ Z, b ≠ 0} = Rational numbers

R = Real numbers

C = Complex numbers

Z[i] = {a+ bi : a, b ∈ Z} = Gaussian integers

P = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, . . . } = Primes

a ≡ b (mod m) means a is congruent to b mod m, that is, m|(b− a)

[a]_m = {a + mk : k ∈ Z} = Residue class of a mod m

Z_m = {[0]_m, . . . , [m − 1]_m} = Ring of integers mod m

a^{−1} (mod m) = multiplicative inverse of a (mod m)

φ(m) = Euler phi-function

(a, b) = gcd(a, b) = greatest common divisor of a and b

[a, b] = lcm[a, b] = least common multiple of a and b

a|b means a divides b

p^e ‖ m means p^e is the largest power of p dividing m

π(x) = the number of primes less than or equal to x.

τ(n) = d(n) = number of positive divisors of n

σ(n) = sum of the positive divisors of n

φ(n) = Euler phi-function = number of positive integers a ≤ n with (a, n) = 1

µ(n) = Möbius function

(a/p) = Legendre symbol

|S| = order or cardinality of a set S; the number of elements in S


∩ intersection ∪ union

∅ empty set ⊆ subset

∃ there exists ∃! there exists a unique

∀ for all ⇒ implies

⇔ equivalent to iff if and only if

∈ element of ≡ congruent to

CHAPTER 1

Divisibility, Congruences and Induction

1.1. Introduction

N = Natural Numbers: 1, 2, 3, 4, 5, 6, . . . .

Kronecker: “God created the natural numbers. Everything else is man’s handiwork.”

Gauss: “Mathematics is the queen of sciences, and number theory is the queen of mathematics.”

Number Theory: The study of the natural numbers.

Questions: It is very easy to ask questions about the natural numbers. Let’s ask a few about the set of primes P = {2, 3, 5, 7, 11, 13, . . . }.

Q1. Are there infinitely many primes?
Q2. How many primes are there up to a given value x?
Q3. Are there infinitely many twin primes? (3,5), (5,7), (11,13), (17,19), etc.
Q4. Which primes can be expressed as a sum of two squares? 5 = 1^2 + 2^2, 13 = 2^2 + 3^2, etc.

Which of these problems are easy to solve and which are hard?

Q1: We can answer this question affirmatively at the beginning of this semester. The proof goes back to Euclid.

Q2: This is a difficult problem, but there are now excellent estimates for the number of primes up to x. See Example 1.1.11.

Q3: This is still an open problem, but exciting progress has been made in the past couple of years. We now know that there are infinitely many pairs of consecutive primes p_n, p_{n+1} with p_{n+1} − p_n ≤ 246.

Q4: This is a problem that you can investigate right now. Make a table with the primes up to 43 and test them. What conjecture do you make? We will be able to answer this question fully by the end of the semester.
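The table suggested in Q4 can also be produced by machine. The following is only a sketch for experimenting; the helper names `is_prime` and `two_squares` are made up for this illustration and are not part of the course material.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))

def two_squares(n):
    """Return (a, b) with a <= b and a*a + b*b == n, or None if there is no such pair."""
    for a in range(isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest and a <= b:
            return (a, b)
    return None

# One row per prime up to 43: the prime, then a representation (or None).
for p in range(2, 44):
    if is_prime(p):
        print(p, two_squares(p))
```

Reading down the output and comparing the primes that print `None` with those that do not is a reasonable way to arrive at a conjecture.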

1. Theory: Axioms, Properties, Theorems, Beauty, Art form, Depth.

2. Puzzles, Patterns and Games: Amateur mathematicians of all ages enjoy such problems. This is important in early school education to get children interested in mathematics and in thinking. People enjoy mathematical puzzles more than is generally believed. Chess, Checkers, Tic-Tac-Toe, Card Games, Crossword Puzzles, etc. all involve elements of mathematical reasoning and build valuable skills.

3. Applications: Communications (cryptography and error-correcting codes). Physics and chemistry (atomic theory, quantum mechanics). Music (musical scales, acoustics in music halls). Radar and sonar camouflage.

Example 1.1.1. Theory. Triangular numbers: 1, 3, 6, 10, 15, 21, 28, 36, 45, . . . , n(n+1)/2. Squares: 1, 4, 9, 16, 25, 36, . . . , n^2. Pentagonal numbers: 1, 5, 12, 22, 35, . . . , n(3n−1)/2. Fermat (1640) Polygonal number conjecture: Every whole number is a sum of at most 3 triangular numbers, at most 4 squares, at most 5 pentagonal numbers, at most 6 hexagonal numbers, etc. Lagrange proved the case of squares, Gauss the case of triangular numbers, and Cauchy the general case.

The next few examples are patterns and puzzles.

Example 1.1.2. Squaring numbers that end in 5. Find an easy way for doing this in your head by recognizing a pattern. Make a conjecture and prove it. 15^2 = 225, 25^2 = 625, 35^2 = 1225, 45^2 = 2025, 55^2 = 3025, 65^2 = 4225.

Example 1.1.3. Palindromic sum of squares. Verify that 816^2 + 357^2 + 492^2 = 618^2 + 753^2 + 294^2. Can you find other such examples?

Example 1.1.4. Euler conjectured that a sum of three fourth powers could never be a fourth power. Elkies (1988) proved there are infinitely many counterexamples, such as 422481^4 = 95800^4 + 217519^4 + 414560^4.

Example 1.1.5. Collatz conjecture. Open problem today. Start with any positive integer. If it is even, divide it by 2. If it is odd, multiply it by 3 and add 1. Repeat. The conjecture is that after a finite number of steps one always ends up with 1.
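To experiment with the Collatz rule, a short sketch such as the following may help. The function name `collatz_steps` and the safety cap `max_steps` are choices made for this illustration only; the cap is there because nobody has proved that the loop always stops.

```python
def collatz_steps(n, max_steps=10_000):
    """Count applications of the rule (n -> n/2 if n is even, n -> 3n+1 if n is odd)
    needed to reach 1. Returns None if 1 is not reached within max_steps."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

# Starting values 1 through 10 and the number of steps each one takes.
print([(n, collatz_steps(n)) for n in range(1, 11)])
```

A computation like this can only confirm the conjecture for the starting values tried; it says nothing about all positive integers.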

Example 1.1.6. N. Elkies and I. Kaplansky. Every integer n can be expressed as a sum of a cube and two squares. Note that n may be negative, as also may be the cube. For example, if n is odd, say n = 2k + 1, then

n = 2k + 1 = (2k − k^2)^3 + (k^3 − 3k^2 + k)^2 + (k^2 − k − 1)^2
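The identity for odd n can be spot-checked numerically. This is a sanity check over a range of k, not a proof (the proof is a direct expansion of the right-hand side); the function name `cube_plus_two_squares` is made up for this sketch.

```python
def cube_plus_two_squares(k):
    """Return (c, s, t) taken from the displayed identity,
    so that c**3 + s**2 + t**2 should equal 2*k + 1."""
    c = 2 * k - k * k
    s = k**3 - 3 * k * k + k
    t = k * k - k - 1
    return c, s, t

# Check the identity for k = -50, ..., 50 (negative k gives negative odd n).
for k in range(-50, 51):
    c, s, t = cube_plus_two_squares(k)
    assert c**3 + s**2 + t**2 == 2 * k + 1
```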

Example 1.1.7. There are just five numbers which are the sums of the cubes of their digits. 1 = 1^3, 153 = 1^3 + 5^3 + 3^3, 370 = 3^3 + 7^3 + 0^3, 371 = 3^3 + 7^3 + 1^3, 407 = 4^3 + 0^3 + 7^3. This is an amusing fact, although challenging to prove (extra credit).
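A brute-force search makes the claim plausible. The sketch below relies on one observation that is not in the notes: a number with d digits has digit-cube sum at most 729d, and for d ≥ 5 this is smaller than any d-digit number, so it is enough to search below 10000. The function names are illustrative.

```python
def digit_cube_sum(n):
    """Sum of the cubes of the base-10 digits of n."""
    return sum(int(ch) ** 3 for ch in str(n))

def fixed_points(limit=10_000):
    """Positive integers below limit that equal the sum of the cubes of their digits."""
    return [n for n in range(1, limit) if n == digit_cube_sum(n)]

print(fixed_points())
```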

Example 1.1.8. The Magic Number. Start with any four digit number, not all of whose digits are equal. Rearrange the digits in decreasing order and in increasing order and subtract the smaller number from the larger. Repeat the process with the new value you obtained. Keep repeating this process. Start again with a new four digit number. What do you discover?

Example 1.1.9. Consider the six digit number x = 142857. Note that 2x = 285714, 3x = 428571, 4x = 571428, 5x = 714285, 6x = 857142. Is this just a coincidence? Are there any other six digit numbers with such a cyclical property? Have you ever seen the digits 142857 before? (There’s a little bit of theory going on in this problem. The result can be generalized.)

Note that Examples 1.1.2, 1.1.3, 1.1.7, 1.1.8 and 1.1.9 depend on the base 10 representation of natural numbers. The important properties of the natural numbers are those that are intrinsic, that is, properties that do not depend on the manner in which the number is represented.

Example 1.1.10. 1^3 + 2^3 = 9 = (1 + 2)^2. 1^3 + 2^3 + 3^3 = 36 = (1 + 2 + 3)^2. 1^3 + 2^3 + 3^3 + 4^3 = 100 = (1 + 2 + 3 + 4)^2. Maybe we’ve discovered a general formula. Let’s see, is it always true that x^3 + y^3 = (x + y)^2? No. But we suspect that 1^3 + · · · + n^3 = (1 + 2 + · · · + n)^2. We shall use induction to prove results of this nature.

Example 1.1.11. Prime Numbers. 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, . . . The prime numbers are the building blocks of the whole numbers, in the sense that every whole number greater than 1 is a product of primes. Do they just pop up at random? How many primes are there? Are there infinitely many twin primes? How many primes are there up to N? Gauss, at the age of 15, using a table of primes up to 100000, made a table comparing the number of primes up to N with the function li(x) = ∫_2^x dt / ln t.

N        π(N)         li(N)
10^3     168          178
10^4     1229         1246
10^5     9592         9630
10^6     78498        78628
10^7     664579       664918
10^8     5761455      5762209
10^9     50847534     50849235
10^10    455052511    455055614

The ratio of these two quantities approaches 1 as N goes to infinity. This can be proved (although Gauss wasn’t able to prove it, he was instrumental in the development of complex numbers, which are an essential tool for proving this result). It’s called the Prime Number Theorem, one of the jewels of mathematics. It was proven by J. Hadamard and C. de la Vallée Poussin (1896). Look briefly at the axiom sheet. In particular, the associative law.
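The first few values of π(N) in the table can be reproduced with a sieve of Eratosthenes. This is a minimal sketch with an illustrative function name; it makes no attempt to compute li(N), and it is practical only for modest N.

```python
from math import isqrt

def prime_count_table(limit):
    """Sieve of Eratosthenes. Returns a list pi with pi[n] = number of primes <= n,
    for 0 <= n <= limit (limit >= 1 assumed)."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"          # 0 and 1 are not prime
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out p*p, p*p + p, ... ; smaller multiples were already removed.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    pi, count = [0] * (limit + 1), 0
    for n in range(limit + 1):
        count += is_prime[n]
        pi[n] = count
    return pi

pi = prime_count_table(10**5)
for e in (3, 4, 5):
    print(f"10^{e}", pi[10**e])
```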

1.2. A brief look at the Axioms sheet

Look at the axiom sheet. The axioms on the first page are shared by the real number system. What distinguishes the integers is their discreteness property. There are three equivalent ways of expressing this property.

Well-Ordering Axiom of the Integers. Any nonempty subset S of positive integers contains a minimum element. That is, there is an element m in S having the property that m ≤ x for all x ∈ S.

Note 1.2.1. (i) The rationals and reals do not have such a property. Consider for example the set of real numbers in the interval (0, 1).

(ii) It is this property that assures us that there is no integer hiding somewhere between 0 and 1, in other words that 1 is the smallest positive integer. For if such an integer a with 0 < a < 1 existed, we could construct an infinite descending chain of positive integers a > a^2 > a^3 > . . . , with no minimal element.

Axiom of Induction. If S is a nonempty subset of N containing 1 and having the property that if n ∈ S then n + 1 ∈ S, then S = N.

Note 1.2.2. It is from this axiom that we obtain the Principle of Induction, which is the basis for induction proofs.

Note: We say that a set of integers S is bounded above if there is some number L, say, such that x < L for all x ∈ S.

Maximum Element Property of the integers. Any nonempty set S of integers bounded above contains a maximum element, that is, there is an element M ∈ S such that x ≤ M for all x ∈ S.

Note the use of the word “has”: If we say a set S has an upper bound, this does not mean the upper bound is in S. If we say a set S has a maximum element, this does mean the maximum element is in S.

1.3. Divisibility Properties of Z

Definition 1.3.1. Let a, b ∈ Z, with a ≠ 0. We say that a divides b, written a|b, if there is an integer x such that ax = b.

Equivalently: a|b iff b/a is an integer. This formulation assumes that we have already constructed the set of rational numbers. Our text book uses this as a definition. In this class I want you to be able to write proofs about integers just using the axioms for the integers (so avoid using the rationals).

Terminology: The definition above gives the mathematical meaning of “a divides b”. This is not a common usage of the word divides; it sounds like the number a is doing something to b. Note the difference between 3|6 and 3/6.

Other variations of 3|6: We can say 3 divides 6, 3 is a divisor of 6, 3 is a factor of 6, 6 is divisible by 3, 6 is a multiple of 3.

Example 1.3.1. 3|15, 5|15, but 7 ∤ 15. What is wrong with saying 7|15 because 7 × (15/7) = 15?

Example 1.3.2. List all divisors of 12. What numbers divide 0? What numbers are divisible by 0?

Example 1.3.3. Find all positive n such that 5|n and n|60. 5|n so n = 5k for some integer k. 5k|60 so 5kx = 60 for some integers k, x. Thus kx = 12 for some k, x. Thus k is a divisor of 12, so we can let k = 1, 2, 3, 4, 6, 12, giving n = 5, 10, 15, 20, 30, 60.

Theorem 1.3.1. Transitive property of divisibility. If a|b and b|c then a|c.

Proof. You should be able to write a rigorous proof starting from the definition of divisibility. Note the use of the associative law. □

Example 1.3.4. 7|42, 42|420 therefore, 7|420.

Theorem 1.3.2. Additive property of divisibility. Let a, b, c be integers such that c|a and c|b. Then

(i) c|(a + b),
(ii) c|(a − b), and
(iii) for any integers x, y, c|(ax + by).

Proof. Again, this is a basic proof. □

Example 1.3.5. Note that the additive property of divisibility can be reworded: If a and b are multiples of c then so is a + b. Thus, a sum of two evens is even, and a sum of two multiples of 5 is a multiple of 5.

Example 1.3.6. 3|21, 3|15. Therefore 3|(21− 15), i.e. 3|6, and 3|(2 · 21 + 15),i.e. 3|57.


1.4. Greatest common divisors and least common multiples

Definition 1.4.1. Let a, b be integers, not both 0.

1) An integer d is called the greatest common divisor (gcd) of a and b, denoted gcd(a, b) or (a, b), if (i) d is a divisor of both a and b, and (ii) d is the greatest common divisor, that is, if e|a and e|b then d ≥ e.

2) An integer m is called the least common multiple (lcm) of a, b, denoted lcm[a, b] or [a, b], if (i) m > 0, (ii) m is a common multiple of a and b, and (iii) m is the least such common multiple.

Note: (i) If a, b are not both 0, then (a, b) exists and is unique. Proof. Let S be the set of common divisors of a and b. Note, S is nonempty since 1 ∈ S. Also, S is bounded above by max(|a|, |b|): every x ∈ S divides whichever of a, b is nonzero, and so x is at most its absolute value. Thus by the Maximum Element Property S contains a maximum element, and a maximum element is unique.

(ii) For any a, b, not both zero, gcd(a, b) ≥ 1. Why? 1 is always a common divisor, and 1 is the smallest positive integer.

(iii) (0, 0) is not defined.

(iv) gcd(0, a) = |a| for any nonzero a.

(v) lcm[a, 0] does not exist.

Example 1.4.1. (6,−2) = 2, (0, 17) = 17, [6,−2] = 6, [6, 10] = 30.

There are three ways of computing GCD’s: (i) Brute force. (ii) Factoring method. (iii) Euclidean Algorithm. For large numbers, the Euclidean algorithm is much faster. A PC can handle GCD’s of numbers with thousands of digits using the Euclidean algorithm in “no time”. But the fastest known algorithms cannot, in general, factor a number with a few hundred digits in any practical amount of time.

Example 1.4.2. Factoring Method. Find gcd(240, 108) given the factorizations 240 = 2^4 · 3 · 5, 108 = 2^2 · 3^3. Find lcm[240, 108]. This method is a little out of order, because we have not proven the fundamental theorem of arithmetic yet, but this is a procedure you likely saw back in grade school.

Example 1.4.3. Find gcd(1127, 1129). Consider the three different methods listed above.

1.5. The Euclidean algorithm

The Euclidean algorithm is based on the following two lemmas.

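Lemma 1.5.1. gcd subtraction lemma. Let a, b be integers, not both 0. Then for any integer k, (a, b) = (a − kb, b).

Proof. Let S be the set of common divisors of a, b and T the set of common divisors of a − kb, b. Claim: S = T, and so S and T have the same maximum element. Indeed, if c|a and c|b then c|(a − kb) by the additive property of divisibility; conversely, if c|(a − kb) and c|b then c divides (a − kb) + kb = a. □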

Example 1.5.1. Apply subtraction lemma to find gcd(234, 182) = 26. We have

(234, 182) = (234− 182, 182) = (52, 182) = (52, 182− 3 · 52)

= (52, 26) = (52− 2 · 26, 26) = (0, 26) = 26.

This process is called the Euclidean algorithm.

In order to implement the Euclidean algorithm we use the Division algorithm.


Theorem 1.5.1. Division Algorithm. Let a, b be any integers with b > 0. Then there exist unique integers q, r such that a = qb + r and 0 ≤ r < b. q is called the quotient, and r the remainder. Equivalently, we can write a/b = q + r/b.

Proof. Existence: Let S = {x ∈ Z : xb ≤ a}. Then S is nonempty (it contains −|a|) and is bounded above by |a|, and so it contains a maximum element, say q. Define r = a − qb. Then a = qb + r. By maximality of q we have qb ≤ a < (q + 1)b, and so 0 ≤ r < b.

Uniqueness: Suppose that a = qb + r = q′b + r′, with 0 ≤ r, r′ < b. Then b|q − q′| = |r′ − r| < b. Since the left-hand side is a multiple of b, this is only possible if |q − q′| = 0, that is, q = q′. It then follows from the identity qb + r = q′b + r′ that r = r′. □

Example 1.5.2. Find q, r when -392 is divided by 15. We first observe that392/15 = 26 + 2/15, so that 392 = 15 · 26 + 2 and so −392 = (−27)15 + 13.
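Example 1.5.2 can be checked in Python, whose built-in `divmod` rounds the quotient toward negative infinity and therefore returns exactly the q and r of Theorem 1.5.1 when b > 0. (Many other languages truncate toward zero and would return a negative remainder for −392.) The wrapper name `division_algorithm` is made up for this sketch.

```python
def division_algorithm(a, b):
    """Return (q, r) with a == q*b + r and 0 <= r < b, assuming b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    q, r = divmod(a, b)  # floor division, so 0 <= r < b whenever b > 0
    return q, r

print(division_algorithm(392, 15))    # quotient and remainder for 392
print(division_algorithm(-392, 15))   # quotient and remainder for -392
```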

In your homework you will prove the following alternate version of the division algorithm, which will be used in the Fast Euclidean Algorithm.

Theorem 1.5.2. Division Algorithm with minimal remainder. Given integers a, b with b > 0, there exist integers q and r such that a = qb + r with |r| ≤ b/2.

1.6. Euclidean Algorithm

The Euclidean Algorithm is a procedure for calculating gcd’s by using successive applications of the division algorithm. There are two versions of it that we will look at.

I) Traditional Euclidean Algorithm: In this version a nonnegative remainder is always chosen. Let a ≥ b > 0 be positive integers. Then, by the division algorithm and gcd subtraction lemma, we have

a = q_1 b + r_1,  0 ≤ r_1 < b,  (a, b) = (r_1, b)   (1.1)

b = q_2 r_1 + r_2,  0 ≤ r_2 < r_1,  (a, b) = (r_1, r_2)   (1.2)

. . .   (1.3)

r_{k−3} = q_{k−1} r_{k−2} + r_{k−1},  (a, b) = (r_{k−1}, r_{k−2})   (1.4)

r_{k−2} = q_k r_{k−1},  (a, b) = r_{k−1}.   (1.5)

Since r_1 > r_2 > · · · > r_{k−1} we are guaranteed that this process will stop in a finite number of steps. A priori the number of steps is at most b, since r_1 < b and the remainders decrease by at least 1 at each step. In fact, one can prove that the remainder must be cut in half every two steps, that is, r_n ≤ (1/2) r_{n−2}, and so the remainders decay geometrically in size.

II) Fast Euclidean Algorithm: If we allow ourselves to work with positive or negative remainders as in Theorem 1.5.2, then the remainder in absolute value is cut by a factor of 2 at each step of the algorithm. Thus |r_1| ≤ b/2, |r_2| ≤ |r_1|/2 ≤ b/4, . . . , |r_i| ≤ b/2^i. The algorithm terminates when |r_i| < 1, for this would imply that r_i = 0. Thus it suffices to have b < 2^i, and so the algorithm terminates in at most [log_2 b] + 1 steps; here [x] denotes the greatest integer less than or equal to x, called the floor function or greatest integer function.

Example 1.6.1. Find gcd(150, 51) both ways.
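Both versions of the algorithm can be sketched in a few lines, which makes it easy to compare the number of divisions they use. The function names are illustrative, and the inputs are assumed to satisfy a ≥ b > 0 as in the text. In the second function the remainder is replaced by its absolute value before the next step; this does not change the gcd, since (b, r) = (b, |r|).

```python
def gcd_traditional(a, b):
    """Euclidean algorithm with remainders 0 <= r < b.
    Returns (gcd, number of divisions performed)."""
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return a, steps

def gcd_minimal_remainder(a, b):
    """Euclidean algorithm with remainders chosen so that |r| <= b/2 (Theorem 1.5.2).
    Returns (gcd, number of divisions performed)."""
    steps = 0
    while b != 0:
        r = a % b            # 0 <= r < b
        if 2 * r > b:        # switch to the negative remainder when it is smaller in size
            r -= b
        a, b = b, abs(r)
        steps += 1
    return a, steps

print(gcd_traditional(150, 51))
print(gcd_minimal_remainder(150, 51))
```

Working Example 1.6.1 by hand first and then comparing with the printed step counts is a useful check.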


1.7. Linear Combinations and the GCDLC theorem.

Definition 1.7.1. A linear combination of two integers a, b is an integer of the form ax + by, with x, y ∈ Z. Thus, we say that an integer d is a linear combination of a and b if there exist integers x, y such that d = ax + by.

Example 1.7.1. Find all linear combinations of 9 and 15. Try to get the smallest possible.

x     y     9x + 15y
1     0     9
0     1     15
1     1     24
2     −1    3

Note that every linear combination is a multiple of 3, the greatest common divisor of 9, 15.

Recall: We saw earlier that if d is a common divisor of a, b then d|(ax + by) for any x, y ∈ Z. In particular this holds for the greatest common divisor of a, b.

Claim: If d = gcd(a, b) then d can be expressed as a linear combination of a and b.

Example 1.7.2. gcd(20, 8) = 4. By trial and error, 4 = 1 · 20 + (−2) · 8. gcd(21, 15) = 3. By trial and error, 3 = 3 · 21 − 4 · 15.

To prove the claim in general we again use the Euclidean Algorithm, together with the method of back substitution.

Example 1.7.3. Find d = gcd(126, 49).

(1) 126 = 2 · 49 + 28, d = gcd(28, 49)

(2) 49 = 28 + 21, d = gcd(28, 21)

(3) 28 = 21 + 7, d = gcd(7, 21)

(4) 21 = 3 · 7, d = gcd(7, 0) = 7, STOP

Back Substitution: A method of solving the equation d = ax + by (with d = gcd(a, b)) by working backwards through the steps of the Euclidean algorithm.

Example 1.7.4. Use the example above for gcd(126, 49) to express 7 as a linear combination of 126 and 49 by the method of back substitution. Start with equation (3): 7 = 28 − 21. By (2) we have 21 = 49 − 28. Substituting this into the previous equation yields 7 = 28 − (49 − 28) = 2 · 28 − 49. By (1) we have 28 = 126 − 2 · 49. Substituting this into the previous equation yields 7 = 2 · (126 − 2 · 49) − 49 = 2 · 126 − 5 · 49, QED.

Theorem 1.7.1. GCDLC Theorem. The greatest common divisor of two integers a, b (not both zero) can be expressed as a linear combination of a, b.

Proof. We will give two proofs, the first a constructive proof, and the second an existence proof.

Constructive proof: Set d = (a, b). From the Euclidean algorithm as displayed above we obtain from (1.4) that

(1.6) d = r_{k−1} = r_{k−3} − q_{k−1} r_{k−2},

that is, d is a linear combination of r_{k−3} and r_{k−2}. From the previous step of the Euclidean algorithm, we can express r_{k−2} as a linear combination of r_{k−3} and r_{k−4}. Substituting this expression into (1.6) yields d as a linear combination of r_{k−3} and r_{k−4}. Continuing this process of back substitution k times finally yields d as a linear combination of a and b.

Existence Proof: Let S = {ax + by : x, y ∈ Z}, the set of all linear combinations of a and b. This set clearly contains positive integers, so let e be the smallest positive integer in the set (e exists by well ordering). Say e = ax0 + by0, for some x0, y0 ∈ Z. We claim that e = d. Since d|a and d|b, we know d|e, by a basic divisibility property. In particular, d ≤ e. Thus, it suffices to show that e is a common divisor of a and b, for this would imply that e ≤ d, the greatest common divisor of a and b.

Let’s show that e|a. To do this, we shall compute a ÷ e and show that the remainder is 0. By the division algorithm, a = qe + r, for some q, r ∈ Z with 0 ≤ r < e. Thus a = q(ax0 + by0) + r, so r = a(1 − qx0) − bqy0, a linear combination of a and b. Since r < e we must have r = 0 by the minimality of e in S. Therefore e|a. In the same manner we obtain e|b. QED □

Theorem 1.7.2. GCDLC Corollary. Let a, b be integers, not both zero, and d = (a, b).

(i) Every linear combination of a, b is a multiple of d and conversely every multiple of d is a linear combination of a, b.

(ii) In particular, d is the smallest positive linear combination of a and b.

Proof. The first part of (i) is just a special case of the additive property of divisibility. For the second part of (i), let d = (a, b). Then we can write d = ax + by for some integers x, y. Suppose that kd is an arbitrary multiple of d. Then kd = (kx)a + (ky)b and so kd is a linear combination of a, b. Part (ii) is obvious from (i) since every positive multiple of d is ≥ d. □

Example 1.7.5. Suppose I tell you that a, b are whole numbers such that 45a + 37b = 1. What is (a, b)?

Example 1.7.6. Describe all positive integers that can be expressed as a linear combination of 45 and 37.

Array Method. A more efficient method than back substitution for expressing the greatest common divisor of two integers as a linear combination of them.

Example 1.7.7. We shall redo a previous example using the array method. Find gcd(49, 126) and express it as a linear combination of 49 and 126. To begin, set up an array with the first three columns initialized as shown below. For a given choice of x and y the linear combination 126x + 49y is given in the first row. Now, perform the Euclidean Algorithm on the numbers in the top row, but do the corresponding column operations on the entire array. Let C1 be the column with top entry 126, C2 the column with top entry 49, etc. The first step in the Euclidean algorithm is to subtract 2 times 49 from 126, so we let the next column C3 be given by C3 = C1 − 2C2. Then C4 = C2 − C3, C5 = C3 − C4. The Euclidean algorithm stops on the next step, but there is no need to include an extra column with zero at the top.


126x + 49y    126    49    28    21     7
x               1     0     1    −1     2
y               0     1    −2     3    −5

Thus we have discovered that 7 = gcd(49, 126) and that 7 = 2 · 126− 5 · 49.

Example 1.7.8. Find gcd(83, 17) and express it as a LC of 83 and 17.

83x + 17y    83    17    15     2     1
x             1     0     1    −1     8
y             0     1    −4     5   −39

Thus gcd(83, 17) = 1 and 1 = 8 · 83− 39 · 17.
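The array method translates almost line for line into code: each column is a triple (top entry, x, y) with top entry = ax + by, and each step subtracts a multiple of the current column from the previous one. The sketch below assumes a, b > 0; the name `array_method` is chosen for this illustration.

```python
def array_method(a, b):
    """Return (d, x, y) with d = gcd(a, b) = a*x + b*y, for integers a, b > 0.
    Every column (top, x, y) built along the way satisfies top == a*x + b*y."""
    prev, curr = (a, 1, 0), (b, 0, 1)     # the two initial columns
    while curr[0] != 0:
        q = prev[0] // curr[0]
        # Next column = previous column - q * current column (a column operation).
        nxt = tuple(p - q * c for p, c in zip(prev, curr))
        prev, curr = curr, nxt
    return prev                            # last column with a nonzero top entry

for a, b in [(126, 49), (83, 17)]:
    d, x, y = array_method(a, b)
    print(f"gcd({a}, {b}) = {d} = {x}*{a} + {y}*{b}")
```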

Example 1.7.9. Solve the equation 15x + 21y + 35z = 1, that is, express 1 as a LC of 15, 21 and 35, using the array method. Just start subtracting multiples of one column from another any way you like, with the goal of producing a smaller entry in the top row. (There are many ways to produce a 1 in this manner. The first three columns are initialized as before, then I did C2 − C1, C3 − C2, C1 − C5.)

15x + 21y + 35z    15    21    35     6    14     1
x                   1     0     0    −1     0     1
y                   0     1     0     1    −1     1
z                   0     0     1     0     1    −1

Thus 15 + 21 − 35 = 1.

1.8. Euclid’s Lemma

Definition 1.8.1. We say two integers a, b are relatively prime if gcd(a, b) = 1, that is, a, b have no common factor other than ±1.

Theorem 1.8.1. Let a, b ∈ Z (not both zero) with (a, b) = d. Then (a/d, b/d) = 1.

(Note the two fractions are integers.)

Proof. Proof 1: (This proof doesn’t require GCDLC.) Suppose that k is a common positive divisor of a/d and b/d, so that kx = a/d, ky = b/d, for some integers x, y. Then (kx)d = a and (ky)d = b, that is, (kd)x = a and (kd)y = b. Thus kd is a common divisor of a, b. Since d is the greatest common divisor of a and b, 0 < kd ≤ d and so k = 1. This means (a/d, b/d) = 1.

Proof 2: By GCDLC we have ax + by = d for some integers x, y. Dividing by d gives (a/d)x + (b/d)y = 1. Thus (a/d, b/d) is a divisor of 1, and therefore must equal 1. □

Lemma 1.8.1. Euclid’s Lemma. If d|ab and gcd(d, a) = 1 then d|b.

Proof. Since d|ab we have dz = ab for some integer z. Since gcd(d, a) = 1, by the GCDLC Theorem, there exist integers x, y with dx + ay = 1. Multiplying by b we obtain

b = b(dx + ay) = d(bx) + (ab)y = d(bx) + (dz)y = d(bx + zy),

and so d|b since bx + zy is an integer. □

Note 1.8.1. In general, if a|bc can we conclude that a|b or a|c? No. For example, 6|(4 · 3) but 6 ∤ 4 and 6 ∤ 3.

Definition 1.8.2. A positive integer p > 1 is called a prime if the only positive divisors of p are 1 and p.

An important consequence of Euclid’s Lemma that we will need for proving unique factorization of integers is the following.

Lemma 1.8.2. If p is a prime and p|ab then p|a or p|b.

Proof. Suppose that p is a prime with p|ab. If p ∤ a then (p, a) = 1, since the only positive divisors of p are 1 and p. Thus, by Euclid’s Lemma, p|b. □

Note 1.8.2. Further applications of Euclid’s Lemma and the GCDLC Theorem, for homework.

i) Every common divisor of a and b is a divisor of gcd(a, b).

ii) If a|c, b|c and (a, b) = 1, then ab|c.

1.9. Linear Equations in two variables

For integers a, b, c consider solving the linear equations

ax+ by = c (NH)

and

ax+ by = 0 (H)

in integers x, y. (NH) is called a nonhomogeneous equation and (H) a homogeneous equation. Geometrically, we are looking for integer points on a line in the plane. The GCDLC theorem immediately yields a simple criterion for when (NH) has a solution.

Theorem 1.9.1. Solvability of a Linear Equation. Let a, b, c ∈ Z with d = (a, b). The linear equation ax + by = c has a solution in integers x, y if and only if d|c.

Proof. The equation is solvable if and only if c is a linear combination of a and b. But the corollary to the GCDLC theorem states that this is possible if and only if d|c. □

To find all solutions of (NH) we use a principle that you are familiar with from differential equations, namely that the general solution to (NH) is obtained by finding a particular solution of (NH) and adding to it any solution of (H). This works because the equation is linear.

Lemma 1.9.1. Linearity Lemma. If (x0, y0) is a particular integer solution of(NH), then every integer solution of (NH) is of the form

(x, y) = (x0, y0) + (xh, yh),

for some integer solution (xh, yh) of (H).

Proof. Suppose that (x0, y0) is a particular solution of (NH) and that (xh, yh) is a solution of (H). Inserting x = x0 + xh, y = y0 + yh into the left-hand side of (NH) yields

a(x0 + xh) + b(y0 + yh) = (ax0 + by0) + (axh + byh) = c+ 0 = c,

and so (x, y) satisfies (NH).


Conversely, suppose that (x1, y1) is any solution of (NH), so that ax1 + by1 = c. Then setting xh = x1 − x0, yh = y1 − y0, we have (x1, y1) = (x0, y0) + (xh, yh) and

axh + byh = a(x1 − x0) + b(y1 − y0) = (ax1 + by1) − (ax0 + by0) = c − c = 0,

and so (xh, yh) is a solution of (H). □

Next, let’s find the general solution of (H).

Lemma 1.9.2. Let a, b be integers, not both zero, and d = gcd(a, b). Then the general solution of (H) is given by

x = −(b/d)t,  y = (a/d)t,  t ∈ Z.

Proof. Let a, b ∈ Z, d = gcd(a, b). First note that for any t ∈ Z, the point (x, y) = (−(b/d)t, (a/d)t) satisfies (H) since

ax + by = a(−(b/d)t) + b((a/d)t) = (1/d)(−abt + bat) = 0.

Next, suppose that (x, y) is any solution of (H). Then

ax = −by ⇒ (a/d)x = −(b/d)y ⇒ (a/d) | y,

the last implication following from Euclid’s Lemma, since (a/d, b/d) = 1. Say (a/d)t = y, with t ∈ Z. Then, since x = −by/a (assuming a ≠ 0; the case a = 0 is handled by interchanging the roles of a and b), we also get x = −(b/d)t. □

As a consequence of the preceding two lemmas, we deduce the following theorem.

Theorem 1.9.2. Let a, b, c ∈ Z, d = gcd(a, b). Suppose that d|c, so that (NH) is solvable, and that (x0, y0) is a particular solution of (NH). Then the general solution of (NH) is given by

x = x0 − (b/d)t,   (1.7)

y = y0 + (a/d)t   (1.8)

with t ∈ Z.

In applications we may wish to restrict the variables to positive values. Setting x > 0, y > 0 imposes a constraint on the parameter t. If a and b are both positive the constraint is −y0 d/a < t < x0 d/b.

Example 1.9.1. A person has a collection of 17 cent and 25 cent stamps, but fewer than 30 of the 25 cent stamps. How can he mail a parcel costing $8.00? Let x be the number of 17 cent stamps, y the number of 25 cent stamps. Then we wish to solve 17x + 25y = 800 with x ≥ 0 and 0 ≤ y < 30. Using the array method we quickly see that (−100, 100) is a particular solution (we could also have noted by inspection that (0, 32) is a solution). Thus, since (17, 25) = 1, the general solution is x = −100 + 25t, y = 100 − 17t. Plainly t = 5, x = 25, y = 15 is the only viable solution. (If we started with (0, 32), the general solution is x = 25t, y = 32 − 17t. Again the only viable solution is t = 1, x = 25, y = 15.)
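The recipe of Theorem 1.9.2 (find a particular solution, then step along the homogeneous solutions) can be sketched as follows. The function names are made up for this illustration, a and b are assumed positive, and the parametrization follows (1.7) and (1.8), so the parameter t differs from the one used in the example by a sign and a shift.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) = a*x + b*y, for a, b > 0."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def nonnegative_solutions(a, b, c):
    """All (x, y) with a*x + b*y == c and x, y >= 0, for a, b > 0."""
    d, x, y = extended_gcd(a, b)
    if c % d != 0:
        return []                              # Theorem 1.9.1: no integer solutions
    x0, y0 = x * (c // d), y * (c // d)        # a particular solution of (NH)
    dx, dy = b // d, a // d                    # x = x0 - dx*t, y = y0 + dy*t
    t_lo = -(y0 // dy)                         # smallest t with y >= 0
    t_hi = x0 // dx                            # largest t with x >= 0
    return [(x0 - dx * t, y0 + dy * t) for t in range(t_lo, t_hi + 1)]

# Stamps: 17x + 25y = 800 with fewer than 30 of the 25 cent stamps.
print([(x, y) for x, y in nonnegative_solutions(17, 25, 800) if y < 30])
```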


Example 1.9.2. In baseball a few years ago the American League had 2 divisions with 7 teams each. Say that teams play x games against each team in their own division and y games against each team in the other division. Find possible solutions for x, y assuming there are 162 games in a season. Which solution do you think was used?

1.10. Introduction to Congruences

Let m be a fixed positive integer, referred to as the “modulus”.

Definition 1.10.1. We say that two integers a, b are congruent (mod m) and write

a ≡ b (mod m)

if m|(a − b). Equivalently, a ≡ b (mod m) iff a = b + km for some integer k.

Example 1.10.1. Clock Arithmetic. m = 12. The set of integers congruent to 3 (mod 12) is

{3 + 12k : k ∈ Z}.

Example 1.10.2. 23 ≡ 18 ≡ 13 ≡ 8 ≡ 3 ≡ −2 (mod 5). The values 18, 13, etc. are called residues of 23 (mod 5), and the number 3 is called the least residue of 23 (mod 5).

Definition 1.10.2. The least residue (lr) of a (mod m) is the smallest nonnegative integer that a is congruent to (mod m).

Lemma 1.10.1. Let a ∈ Z. The least residue of a (mod m) is the remainder in dividing a by m.

Proof. Applying the division algorithm to a ÷ m, we get

a = qm + r,

for some integers q, r with 0 ≤ r < m. Plainly a ≡ r (mod m), since m|(a − r). If y is any smaller value than r that a is congruent to (mod m), then y ≤ r − m < 0. Thus r is the minimal nonnegative value that a is congruent to (mod m). □

In particular we see that the least residue of a (mod m) is the unique value between 0 and m − 1 (inclusive) that a is congruent to (mod m).

Example 1.10.3. What is the least residue of 800 (mod 7)? By long division we get 800 = 114 · 7 + 2 and so 800 ≡ 2 (mod 7).

Theorem 1.10.1. Congruence is an Equivalence Relation, that is, it satisfies the following three properties for any integers a, b, c.

(i) Reflexive: a ≡ a (mod m).
(ii) Symmetric: If a ≡ b (mod m) then b ≡ a (mod m).
(iii) Transitive: If a ≡ b (mod m) and b ≡ c (mod m), then a ≡ c (mod m).

Thus, congruence (mod m) partitions Z into equivalence classes of the form

[a]_m = {x ∈ Z : x ≡ a (mod m)},

called congruence classes or residue classes. That is,

Z = [0]_m ∪ [1]_m ∪ · · · ∪ [m − 1]_m.


Theorem 1.10.2. Substitution Properties of Congruences. Let a, b, c, d be integers with a ≡ b (mod m) and c ≡ d (mod m). Then

(i) a + c ≡ b + d (mod m).
(ii) ac ≡ bd (mod m).
(iii) For any positive integer n, a^n ≡ b^n (mod m).

Example 1.10.4. Find 2004 · 123 · 77 (mod 20). What is the remainder on dividing 3298 by 7? Find 7994 (mod 8).
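The substitution properties say we may reduce each factor before multiplying. A small Python sketch of the idea follows; the helper product_mod is our own name, and the power 3^100 (mod 7) is an illustrative choice, not one of the exercises above.

```python
def product_mod(factors, m):
    """Multiply the factors modulo m, replacing each factor by its least residue first."""
    result = 1
    for a in factors:
        result = (result * (a % m)) % m
    return result

print(product_mod([2004, 123, 77], 20))  # 4, since 4 * 3 * 17 = 204
print((2004 * 123 * 77) % 20)            # 4, the same answer without reducing first
print(pow(3, 100, 7))                    # 4, built-in modular exponentiation
```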

Theorem 1.10.3. Standard Algebraic properties of congruences. For any integers a, b, c we have

(i) a + b ≡ b + a (mod m) (commutative law)
(ii) ab ≡ ba (mod m) (commutative law)
(iii) a + (b + c) ≡ (a + b) + c (mod m) (associative law)
(iv) (ab)c ≡ a(bc) (mod m) (associative law)
(v) a(b + c) ≡ ab + ac (mod m) (distributive law)

Example 1.10.5. What day of the week will it be 10 years from today?

Note: a is divisible by d iff a ≡ 0 (mod d).

Example 1.10.6. Prove that a number is divisible by 9 iff the sum of its digits (base 10) is divisible by 9. Similarly for 11, using the alternating sum of the digits.

Example 1.10.7. Can 2013 be expressed as a sum of two squares? Suppose that a ≡ 3 (mod 4). Can a be expressed as a sum of two squares of integers? Try 3, 7, 11, 15, etc.

1.11. Principle of Induction

Example 1.11.1. Notice the pattern for the sum of the first k odd numbers. Now prove a formula by induction.

Example 1.11.2. Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, 21, ... Find a formula for f_1 + f_2 + f_3 + f_4 + · · · + f_k.

To prove formulas that hold for positive integers, induction is a very powerful technique. Recall,

Axiom of Induction: Suppose that S is a subset of the natural numbers such that (i) 1 ∈ S and (ii) If n ∈ S then n + 1 ∈ S. Then S = N.

Principle of Induction: Let P(n) be a statement involving the natural number n. Suppose that (i) P(1) is true and (ii) If P(n) is true for a given n then P(n + 1) is true. Then P(n) is true for all natural numbers n.

The connection of course is just to let S be the set of all natural numbers for which the statement P(n) is true.

Example 1.11.3. On the first HW you conjectured: ∑_{k=1}^{n} k^3 = (1 + 2 + · · · + n)^2 = [n(n + 1)/2]^2. Prove.

Note the two ways to conclude an induction proof. 1) “Therefore, by the principle of induction, the statement is true for all natural numbers.” 2) “QED” = Quod Erat Demonstrandum. Thus we have established what we wished to demonstrate.


One might object to this method by saying that we are assuming what we wish to prove. Is this a valid objection?

Example 1.11.4. Prove that 16^n ≡ 1 − 10n (mod 25) for any n ∈ N.

Example 1.11.5. Show that 16^n | (6n)! for all n ∈ N.
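Before attempting an induction proof it can help to test a statement numerically. The sketch below assumes the two preceding examples read 16^n ≡ 1 − 10n (mod 25) and 16^n | (6n)!; it checks them for small n only and proves nothing. The function name and the bound 30 are arbitrary.

```python
from math import factorial

def check_examples(limit=30):
    """Return True if both statements hold for every n = 1..limit."""
    for n in range(1, limit + 1):
        # 16^n - (1 - 10n) should be a multiple of 25
        if (pow(16, n, 25) - (1 - 10 * n)) % 25 != 0:
            return False
        # 16^n should divide (6n)!
        if factorial(6 * n) % 16**n != 0:
            return False
    return True

print(check_examples())  # True
```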

Example 1.11.6. Prove that everyone has the same name. Let P(n) be the statement that in any set of n people, everyone has the same name. P(1) is trivially true.

Strong Form of Induction. Let P(n) be a statement involving n. Suppose (i) P(1) is true and (ii) If P(1), P(2), . . . , P(n) are all true for a given n, then so is P(n + 1). Then P(n) is true for all natural numbers n.

The induction assumption is stronger, and so this allows us to prove more.

CHAPTER 2

Primes and Unique Factorization

2.1. Fundamental Theorem of Arithmetic

There are three types of natural numbers:
1) 1, the multiplicative identity or unity element.
2) Primes. P = {2, 3, 5, 7, . . . }.
3) Composites.

Definition 2.1.1. A natural number n > 1 is called a prime if its only positive divisors are 1 and itself. Otherwise it is called a composite. Thus n is composite if n = ab for some natural numbers a, b with 1 < a < n, 1 < b < n.

Note 2.1.1. 1 is not called a prime for a couple of reasons. The main reason is that if 1 were a prime, then we would not have unique factorization; for example we could factor 6 as follows: 6 = 2 · 3 = 2 · 3 · 1 = 2 · 3 · 1 · 1, . . . . Each would be a different factorization of 6 into primes. A second reason is that 1 has a single positive divisor, whereas every prime has exactly 2 positive divisors.

Example 2.1.1. i) Everyone factor 120 using a factor tree. Compare. Note that everyone gets the same factorization.

ii) Do the same thing, but only use even numbers. What do you discover? Note the set of even numbers E = {2n : n ∈ Z} is closed under + and ·, and enjoys all the usual axioms of Z (with one exception).

Theorem 2.1.1. Fundamental Theorem of Arithmetic. Any natural number n > 1 can be expressed uniquely as a product of primes.

Note 2.1.2. It is understood that if n is a prime then it trivially is a product of primes.

Proof. Existence. Strong form of induction. For uniqueness we need the following lemma. □

Lemma 2.1.1. (i) If p is a prime and p | ab, then p | a or p | b.
(ii) More generally, if p | a_1 · · · a_k then p | a_i for some i.

Proof. Use Euclid to prove (i) and induction to prove (ii). □

Proof. Uniqueness of FTA. □

Example 2.1.2. Factor 60.

Note 2.1.3. Every positive integer n has a unique prime power factorization of the form n = p_1^{e_1} · · · p_k^{e_k}, with the p_i distinct primes and the e_i positive integers.

Definition 2.1.2. Let p be a prime and n ∈ Z. We write p^e ‖ n if p^e | n but p^{e+1} ∤ n; e is called the multiplicity of p dividing n.


Example 2.1.3. 2000 = 2^4 · 5^3, so 2^4 ‖ 2000 and 5^3 ‖ 2000.

Example 2.1.4. Find the multiplicity of 2 and 5 dividing 2^{15} 5^4 − 2^3 5^7. Answer: 3, 4 resp., since 2^{15} 5^4 − 2^3 5^7 = 2^3 5^4 (2^{12} − 5^3) and 2^{12} − 5^3 = 3971 is divisible by neither 2 nor 5.

Theorem 2.1.2. Let n > 1 have prime power factorization n = p_1^{e_1} · · · p_k^{e_k}, and let d ∈ N. Then d | n iff d = p_1^{f_1} · · · p_k^{f_k} for some nonnegative integers f_i ≤ e_i, 1 ≤ i ≤ k.

Note 2.1.4. Useful trick: We can use the same set of prime factors for factoring any two integers provided that we allow zero in the exponent position. We will use this trick for proving the preceding theorem. Example: Let a = 2^3 · 3^7, b = 2^5 · 5^3. Then we can write a = 2^3 · 3^7 · 5^0, b = 2^5 · 3^0 · 5^3.

Proof. Suppose that d is a positive integer dividing n, say dx = n for some integer x. Say d = ∏_{i=1}^{l} p_i^{f_i}, x = ∏_{i=1}^{l} p_i^{g_i}, n = ∏_{i=1}^{l} p_i^{e_i}, where for i > k, e_i = 0, and all exponents are nonnegative. Then dx = ∏_{i=1}^{l} p_i^{f_i+g_i}. By uniqueness of factorization we deduce that e_i = f_i + g_i, 1 ≤ i ≤ l. If i > k then f_i + g_i = e_i = 0, and so f_i = g_i = 0. If i ≤ k, then f_i = e_i − g_i ≤ e_i. Conversely, if d = p_1^{f_1} · · · p_k^{f_k} with 0 ≤ f_i ≤ e_i, then n = d · p_1^{e_1−f_1} · · · p_k^{e_k−f_k}, so d | n. □

2.1.1. The factoring method for finding GCDs and LCMs.

Theorem 2.1.3. Formula for GCD and LCM. Let a, b be positive integers with prime power factorizations a = p_1^{e_1} · · · p_k^{e_k}, b = p_1^{f_1} · · · p_k^{f_k}, where the e_i, f_i are nonnegative integers. Then

i) (a, b) = p_1^{min(e_1,f_1)} · · · p_k^{min(e_k,f_k)}.

ii) [a, b] = p_1^{max(e_1,f_1)} · · · p_k^{max(e_k,f_k)}.

Proof. i) Suppose that p is a prime divisor of (a, b). Since p | a we must have p = p_i for some i, 1 ≤ i ≤ k. Say p_i^{g_i} ‖ (a, b). Then g_i is the largest integer such that p_i^{g_i} | a and p_i^{g_i} | b. Now p_i^{g_i} | a means g_i ≤ e_i by the preceding theorem, while p_i^{g_i} | b means g_i ≤ f_i. Thus g_i ≤ min(e_i, f_i), and the largest such integer is just g_i = min(e_i, f_i).

The proof of part ii) is similar. □

Example 2.1.5. Let a = 2^5 3^7 7^{15}, b = 2^2 3^6 5^4. Find gcd and lcm.
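The factoring method is easy to mechanize once each number is stored as a dictionary {prime: exponent}. The Python sketch below is only an illustration: the function names are ours, and the exponents used for a and b (2^5 3^7 7^15 and 2^2 3^6 5^4) are one assumed reading of the example.

```python
from math import gcd

def gcd_lcm_from_factorizations(fa, fb):
    """Take min (for gcd) and max (for lcm) of the exponents, prime by prime."""
    primes = sorted(set(fa) | set(fb))
    g = {p: min(fa.get(p, 0), fb.get(p, 0)) for p in primes}
    l = {p: max(fa.get(p, 0), fb.get(p, 0)) for p in primes}
    return g, l

def value(f):
    """Multiply out a factorization given as {prime: exponent}."""
    n = 1
    for p, e in f.items():
        n *= p**e
    return n

a = {2: 5, 3: 7, 7: 15}  # assumed reading of a
b = {2: 2, 3: 6, 5: 4}   # assumed reading of b
g, l = gcd_lcm_from_factorizations(a, b)
print(g)  # {2: 2, 3: 6, 5: 0, 7: 0}
print(l)  # {2: 5, 3: 7, 5: 4, 7: 15}
```

Missing primes are treated as having exponent 0, which is exactly the "useful trick" of Note 2.1.4.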

Corollary 2.1.1. For any nonzero integers a, b we have (a, b)[a, b] = |ab|.

Proof. Just use the preceding theorem and one simple idea, the proof of which we leave to the reader. For any integers e, f,

max(e, f) + min(e, f) = e + f.

□

Note 2.1.5. An elementary proof of the corollary can be given based on just the definitions of gcd and lcm, but it is not as transparent as the preceding proof.

2.2. Gaussian Integers

Definition 2.2.1. The Gaussian integers are the set Z[i] = {a + bi : a, b ∈ Z}. Note that Z[i] satisfies the ring axioms.

Definition 2.2.2. The absolute value or modulus of a complex number z = a + bi is given by |z| = √(a^2 + b^2).


Recall the properties |zw| = |z||w|, |z/w| = |z|/|w|.

Definition 2.2.3. Let z, w be Gaussian integers, z ≠ 0. We say that z divides w, written z | w, if zu = w for some u ∈ Z[i].

Example 2.2.1. (1 + 2i) | 5 because (1 + 2i)(1 − 2i) = 5.

Definition 2.2.4. A Gaussian integer z = a + bi is called a unit if it has a multiplicative inverse in Z[i].

Note 2.2.1. i) The units in Z[i] are {±1, ±i}. Why? Suppose z is a unit, say zw = 1. Then |z||w| = 1, and since |z|^2 and |w|^2 are positive integers, |z| = 1. The only Gaussian integers on the unit circle are the four listed here.

ii) Units are divisors of every Gaussian integer.

Definition 2.2.5. i) A nonzero Gaussian integer z is called composite if z = uv for some non-unit Gaussian integers u, v.

ii) A nonzero Gaussian integer z is called a prime if z is not composite and not a unit.

Example 2.2.2. i) 2, 5 are composites, because 2 = (1 + i)(1 − i), 5 = (1 + 2i)(1 − 2i), and the factors 1 ± i, 1 ± 2i are not units. ii) 1 + i is a prime because if 1 + i = uv for some u, v ∈ Z[i], then 2 = |1 + i|^2 = |u|^2 |v|^2 and so, since |u|^2 and |v|^2 are positive integers, either |u| = 1 or |v| = 1, that is, u or v is a unit.

Definition 2.2.6. The gcd of two Gaussian integers z, w is the Gaussian integer u of largest modulus dividing both z and w. It is unique up to unit multiples. Our convention is to choose the representative in the first quadrant (including the positive real axis but not the imaginary axis).

Theorem 2.2.1. Division Algorithm. Let z, w ∈ Z[i], w ≠ 0. Then there exist Gaussian integers q, r such that

z = qw + r and 0 ≤ |r| < |w|.

Proof. We want z/w = q + r/w with |r/w| < 1. Define q to be the Gaussian integer closest to z/w, and put r = z − qw. The real and imaginary parts of z/w − q are each at most 1/2 in absolute value, so |z/w − q| ≤ √2/2 < 1. □

Example 2.2.3. a) Find the quotient and remainder for (12 + 5i) ÷ (1 + 2i). Here (12 + 5i)/(1 + 2i) = (22 − 19i)/5, so q = 4 − 4i, r = i.

b) Find gcd(12 + 5i, 1 + 2i) = (12 + 5i − q(1 + 2i), 1 + 2i) = (i, 1 + 2i) = 1, using the convention of choosing the representative in the 1st quadrant.
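Here is a rough Python sketch of the division algorithm in Z[i] and the resulting Euclidean algorithm. Gaussian integers are represented as pairs (a, b) meaning a + bi; this representation and the function names are our own choices. The gcd is returned only up to a unit multiple and is not normalized to the first quadrant.

```python
def gauss_divmod(z, w):
    """Return (q, r) with z = q*w + r and |r| < |w|, for Gaussian integers z, w != 0."""
    a, b = z
    c, d = w
    norm = c * c + d * d              # |w|^2
    re_num = a * c + b * d            # real part of z * conj(w)
    im_num = b * c - a * d            # imaginary part of z * conj(w)
    # nearest integers to re_num/norm and im_num/norm, in exact integer arithmetic
    q_re = (2 * re_num + norm) // (2 * norm)
    q_im = (2 * im_num + norm) // (2 * norm)
    r_re = a - (q_re * c - q_im * d)
    r_im = b - (q_re * d + q_im * c)
    return (q_re, q_im), (r_re, r_im)

def gauss_gcd(z, w):
    """Euclidean algorithm in Z[i]; the answer is determined up to a unit."""
    while w != (0, 0):
        _, r = gauss_divmod(z, w)
        z, w = w, r
    return z

print(gauss_divmod((12, 5), (1, 2)))  # ((4, -4), (0, 1)), i.e. q = 4 - 4i, r = i
print(gauss_gcd((12, 5), (1, 2)))     # (0, 1), i.e. i, a unit multiple of 1
```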

What are the primes in Z[i]? There are three types (we will prove this later):
(i) Odd integer primes p with p ≡ 3 (mod 4): 3, 7, 11, 19, ...
(ii) The factors of integer primes p with p ≡ 1 (mod 4): 5 = (1 + 2i)(1 − 2i), 13 = (2 + 3i)(2 − 3i), ...
(iii) 1 + i, 1 − i, the factors of 2.

Now that we have a division algorithm, we can proceed as we did with Z to prove uniqueness of factorization:

Division algorithm ⇒ Euclidean algorithm ⇒ GCDLC ⇒ Euclid’s Lemma ⇒ Unique Factorization.

Theorem 2.2.2. Every Gaussian integer other than 0 and the units can be uniquely expressed as a product of primes, up to the order of the primes and unit multiples.


Example 2.2.4. i) Factor 5. We have 5 = (1 + 2i)(1 − 2i), 5 = (2 + i)(2 − i). These are the same factorization, since (1 + 2i)(−i) = 2 − i, (1 − 2i)i = 2 + i.

ii) Factor 30. 30 = 2 · 3 · 5 = (1 + i)(1 − i) · 3 · (1 + 2i)(1 − 2i).

2.3. Distribution of primes

Theorem 2.3.1. There are infinitely many primes in N.

Proof. Euclid. Proof by contradiction. Suppose that there are finitely many primes, say p_1, . . . , p_k. Let N = p_1 · · · p_k + 1. Then, by FTA, N has a prime factor, say p_i. Then we have p_i | N and p_i | (p_1 · · · p_k). Thus p_i | (N − p_1 · · · p_k), that is, p_i | 1, a contradiction. Therefore, there must be infinitely many primes. □

Note 2.3.1. In your homework you will prove that there are infinitely many primes p ≡ 3 (mod 4) in a similar manner. Another way to say this is that there exist infinitely many primes in the arithmetic progression {3 + 4n}_{n=0}^{∞}. A theorem of Dirichlet generalizes this result. It says that given any integers a, m with (a, m) = 1 there exist infinitely many primes in the arithmetic progression {a + mn}_{n=1}^{∞}.

Theorem 2.3.2. There exist arbitrarily large gaps between consecutive primes.

Proof. Let n ∈ N. Consider the sequence of consecutive integers n! + 2, n! + 3, · · · , n! + n. For 2 ≤ k ≤ n we have k | n! and k | k and so k | (n! + k), and moreover it is a proper divisor. Thus n! + k is composite. Therefore we have a sequence of n − 1 consecutive composite numbers, and so if we let p be the largest prime less than n! + 2, the gap between p and the next prime must be at least n. □

Theorem 2.3.3. Basic primality test. If n > 1 is an integer having no prime divisor p ≤ √n, then n is a prime.

Proof. Proof by contradiction. Suppose that n is composite, say n = ab with 1 < a < n, 1 < b < n. We claim that either a ≤ √n or b ≤ √n, else ab > √n √n = n = ab, a contradiction. Say a ≤ √n. Let p be any prime divisor of a. Then p ≤ a ≤ √n, and, since p | a and a | n we have p | n. But this contradicts the assumption that n has no prime divisor p ≤ √n. Therefore n is a prime. □
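The basic primality test translates directly into trial division. The sketch below is a minimal Python version; for simplicity it tries every integer d with 2 ≤ d ≤ √n rather than only the primes, which gives the same answer at the cost of some extra work. It assumes Python 3.8 or later for math.isqrt.

```python
from math import isqrt

def is_prime(n):
    """Trial division: n > 1 is prime iff no d with 2 <= d <= sqrt(n) divides n."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

print(is_prime(211))  # True
print(is_prime(209))  # False, since 209 = 11 * 19
```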

2.3.1. Sieve of Eratosthenes. An elementary algorithm for finding the set of primes on an interval by sieving out multiples of small primes.

Example 2.3.1. Find all primes between 200 and 220. We start by making a list of all the numbers from 200 to 220. Then cross out all multiples of 2, 3, 5, 7, 11, and 13 in turn. Note that √220 is between 14 and 15, and so by the preceding theorem we need only test for prime divisors up to 13. Thus, the remaining numbers are all primes. In this case, the only remaining number is 211.
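The sieve on an interval can be sketched in Python as follows. The function name and the choice to store the interval in a dictionary are ours; the method is the one described above: find the primes up to √hi, then cross out their multiples in [lo, hi].

```python
from math import isqrt

def primes_in_interval(lo, hi):
    """Primes p with lo <= p <= hi, by crossing out multiples of primes up to sqrt(hi)."""
    limit = isqrt(hi)
    # ordinary sieve to get the small primes up to sqrt(hi)
    small = [True] * (limit + 1)
    small[0:2] = [False] * min(2, limit + 1)
    for p in range(2, isqrt(limit) + 1):
        if small[p]:
            for m in range(p * p, limit + 1, p):
                small[m] = False
    small_primes = [p for p in range(2, limit + 1) if small[p]]

    keep = {n: n >= 2 for n in range(lo, hi + 1)}
    for p in small_primes:
        # first multiple of p in the interval, but never p itself
        start = max(p * p, ((lo + p - 1) // p) * p)
        for m in range(start, hi + 1, p):
            keep[m] = False
    return [n for n in range(lo, hi + 1) if keep[n]]

print(primes_in_interval(200, 220))  # [211]
```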

Note 2.3.2. Some Open Problems:
1) Are there infinitely many twin primes?
2) Given any even number n, is there a pair of consecutive primes with gap n between them? Are there infinitely many pairs with gap n between them?
3) Goldbach: Given any even number n > 2, can we express n as a sum of two primes?


2.3.2. Estimating the number of primes up to x. How many primes are there up to a given value x? Let π(x) denote this number. Gauss observed a striking similarity between the value of π(x) and the value of the logarithmic integral li(x) := ∫_2^x dt/log t.

x         π(x)         li(x) = ∫_2^x dt/log t
10^3      168          178
10^4      1229         1246
10^5      9592         9630
10^6      78498        78628
10^7      664579       664918
10^8      5761455      5762209
10^9      50847534     50849235
10^10     455052511    455055614

Integrating by parts, we see that

li(x) = x/log x − 2/log 2 + ∫_2^x dt/log^2 t.

Since the latter integral is of smaller order of magnitude than x/log x, we see that li(x) ∼ x/log x as x → ∞, that is, li(x) is asymptotic to x/log x. Recall, we say two functions are asymptotic as x → ∞ if their ratio approaches 1 as x → ∞. Thus, Gauss conjectured that π(x) ∼ x/log x. This was finally proven in 1896 by de la Vallée Poussin and Hadamard, and is now called the Prime Number Theorem.

Theorem 2.3.4. The Prime Number Theorem. π(x) ∼ x/log x.
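To get a feel for the theorem, one can count primes with a sieve and compare with x/log x. The Python sketch below is only a numerical illustration (the function name and the default range of exponents are our choices); the ratio it reports drifts toward 1 quite slowly.

```python
from math import log

def prime_count_table(exponents=(3, 4, 5, 6)):
    """Rows (x, pi(x), pi(x) / (x / log x)) for x = 10^e, using a sieve of Eratosthenes."""
    top = 10 ** max(exponents)
    sieve = bytearray([1]) * (top + 1)
    sieve[0] = sieve[1] = 0
    p = 2
    while p * p <= top:
        if sieve[p]:
            # zero out p*p, p*p + p, ... in one slice assignment
            sieve[p * p :: p] = bytearray(len(range(p * p, top + 1, p)))
        p += 1
    rows = []
    for e in exponents:
        x = 10 ** e
        pi_x = sum(sieve[: x + 1])
        rows.append((x, pi_x, pi_x / (x / log(x))))
    return rows

for x, pi_x, ratio in prime_count_table():
    print(x, pi_x, round(ratio, 4))
```

The counts in the second column should agree with the π(x) column of the table above for x = 10^3, ..., 10^6.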

The proof of this theorem is beyond the scope of this class. The easiest approach to proving the theorem requires complex analysis.

One might ask why it was reasonable for Gauss to consider the logarithmic integral in the estimation of π(x). Consider the following probabilistic argument. Pick a positive integer n at random from √x to x. Let's estimate the probability P that n is a prime. Let p_1, . . . , p_k be the primes up to √x. For 1 ≤ i ≤ k, we let P_i denote the probability that n is not divisible by p_i. Since one out of every p_i integers is divisible by p_i we have P_i ≈ 1 − 1/p_i. Now, n is a prime if and only if n is not divisible by any of the p_i. Thus, assuming the events “divisible by p_i” are independent (which is not exactly the case), we have

P ≈ ∏_{i=1}^{k} (1 − 1/p_i) = ∏_{p<√x} (1 − 1/p).

Now for any n ∈ N we have

(2.1)   ∏_{p≤n, p prime} (1 − 1/p)^{−1} = ∏_{p≤n, p prime} (1 + 1/p + 1/p^2 + · · ·) ≥ ∑_{k=1}^{n} 1/k ≥ ∑_{k=1}^{n} ∫_k^{k+1} (1/x) dx = log(n + 1),


and so

P^{−1} = ∏_{p<√x} (1 − 1/p)^{−1} ≥ log(√x) = (1/2) log x.

It can also be shown in an elementary manner that P^{−1} ≤ 2 log x. Thus P ≈ 1/log x, that is, the probability that an integer chosen between √x and x is a prime is on the order of magnitude 1/log x. To be precise, the probability density function for the distribution of primes is of order 1/log x, and therefore the number of primes up to x is of order li(x).

CHAPTER 3

Arithmetic Functions

A function defined on the set of natural numbers is called an arithmetic function. Let's start with a couple of examples of arithmetic functions.

Definition 3.0.1. For any positive integer n we let τ(n) denote the number of positive divisors of n, and σ(n) denote the sum of the positive divisors of n.

Example 3.0.2. τ(1) = 1. τ(2) = 2. For prime p, τ(p) = 2, τ(p^k) = k + 1 since the divisors are {1, p, p^2, . . . , p^k}. Next consider τ(p^k q^l) where p, q are distinct primes. A typical divisor is of the form p^i q^j with 0 ≤ i ≤ k, 0 ≤ j ≤ l. Thus there are k + 1 choices for i and l + 1 choices for j and so τ(p^k q^l) = (k + 1)(l + 1). Note that τ(p^k q^l) = τ(p^k)τ(q^l). Functions having this property are called multiplicative. We give a formal definition after this example.

Next let's look at the sigma function. Plainly σ(1) = 1, σ(2) = 3, σ(p) = p + 1 for any prime p, and for any prime power,

σ(p^k) = 1 + p + · · · + p^k = (p^{k+1} − 1)/(p − 1).

Next, for a product of two prime powers we have

σ(p^k q^l) = ∑_{i=0}^{k} ∑_{j=0}^{l} p^i q^j = (∑_{i=0}^{k} p^i)(∑_{j=0}^{l} q^j) = σ(p^k)σ(q^l),

and again we see the multiplicative behavior.

3.1. Multiplicative Functions

Definition 3.1.1. 1) We say that an arithmetic function f is multiplicative if for any two natural numbers a, b with gcd(a, b) = 1 we have f(ab) = f(a)f(b).

2) We say that f is totally multiplicative if for any two natural numbers a, b, f(ab) = f(a)f(b).

Example 3.1.1. f(n) = n, f(n) = n^k, f(n) ≡ 1, are all multiplicative, in fact, they are totally multiplicative.

Note 3.1.1. If f is a multiplicative function that is not identically 0, then f(1) = 1. Indeed suppose that f(n) ≠ 0 for some n ∈ N. Then f(n) = f(n · 1) = f(n)f(1), and so by cancellation, f(1) = 1.

Theorem 3.1.1. a) If f is a multiplicative function and n is a positive integer with prime factorization n = p_1^{e_1} · · · p_k^{e_k} then

(3.1) f(n) = f(p_1^{e_1}) · · · f(p_k^{e_k}).

b) Conversely, if f is an arithmetic function satisfying (3.1), then f is multiplicative.


Proof. a) The proof is by induction on k. The case k = 1 is trivial. Suppose the statement is true for a given k, and now consider k + 1. Let n = p_1^{e_1} · · · p_{k+1}^{e_{k+1}}. Then

f(n) = f((p_1^{e_1} · · · p_k^{e_k}) p_{k+1}^{e_{k+1}}) = f(p_1^{e_1} · · · p_k^{e_k}) f(p_{k+1}^{e_{k+1}}),

since f is multiplicative. Then, by the induction assumption, we conclude that

f(n) = f(p_1^{e_1}) · · · f(p_k^{e_k}) f(p_{k+1}^{e_{k+1}}),

QED.

b) Let a, b be positive relatively prime integers, with factorizations a = p_1^{e_1} · · · p_k^{e_k}, b = q_1^{f_1} · · · q_l^{f_l} where the p_i, q_j are all distinct primes. Then

f(ab) = f(p_1^{e_1} · · · p_k^{e_k} q_1^{f_1} · · · q_l^{f_l}) = ∏_{i=1}^{k} f(p_i^{e_i}) ∏_{j=1}^{l} f(q_j^{f_j}) = f(a)f(b).

□

Example 3.1.2. Suppose that f is a multiplicative function with f(2) = 2, f(4) = 3, f(5) = 5. Find f(20) and f(8). Answer: f(20) = f(4)f(5) = 15; f(8) cannot be determined.

Example 3.1.3. Suppose f is a multiplicative function such that f(p) = 2p for odd prime p, f(p^j) = 3 for odd p and j > 1, f(2) = 4, f(4) = 5, f(8) = 6, f(2^k) = 0 for k > 3. Evaluate f(13), f(100), f(80). Answer: f(13) = 26; f(100) = f(2^2 · 5^2) = f(2^2)f(5^2) = 5 · 3 = 15; f(80) = f(2^4 · 5) = f(2^4)f(5) = 0.

Thus, multiplicative functions are determined by their values at prime powers.

Theorem 3.1.2. If f and g are multiplicative functions then so are fg, f/g and f^n for any n ∈ N.

Proof. Immediate from definition. For example, to show fg is multiplicative, let a, b be positive integers with (a, b) = 1. Then fg(ab) = f(ab)g(ab) = f(a)f(b)g(a)g(b) = f(a)g(a)f(b)g(b) = fg(a)fg(b). □

Theorem 3.1.3. Correspondence Theorem for divisors. Let a, b be relatively prime positive integers. Then every positive divisor of ab can be uniquely expressed in the form de, where d is a positive divisor of a and e is a positive divisor of b. Moreover, any number of the form de where d | a, d > 0, e | b, e > 0, is a divisor of ab.

Proof. Let a = p_1^{e_1} · · · p_k^{e_k}, b = q_1^{f_1} · · · q_l^{f_l} with the p_i, q_j all distinct primes. Then ab = p_1^{e_1} · · · p_k^{e_k} q_1^{f_1} · · · q_l^{f_l}. By an earlier theorem, any positive divisor s of ab is of the form s = p_1^{g_1} · · · p_k^{g_k} q_1^{h_1} · · · q_l^{h_l} for some integers g_i, h_j with 0 ≤ g_i ≤ e_i and 0 ≤ h_j ≤ f_j. Let d = p_1^{g_1} · · · p_k^{g_k}, e = q_1^{h_1} · · · q_l^{h_l}. Then s = de, d | a and e | b. Conversely, if we start with d and e as defined above, plainly de is a divisor of ab. The expression is unique by FTA. □

Equivalent Statement: Let D_a, D_b, D_{ab} denote the sets of positive divisors of a, b, ab respectively and let

D_a × D_b = {(d, e) : d ∈ D_a, e ∈ D_b},

the cartesian product of D_a and D_b. Then there is a 1-to-1 correspondence between D_{ab} and D_a × D_b given by the mapping φ : D_a × D_b → D_{ab} defined by φ(d, e) = de.


Proof. Here is another proof of the Correspondence Theorem for Divisors using this equivalent formulation. (i) Note the mapping goes into D_{ab}. (ii) The mapping is one-to-one: d_1 e_1 = d_2 e_2 ⇒ d_1 | d_2 e_2 ⇒ d_1 | d_2 since (d_1, e_2) = 1. Similarly d_2 | d_1 and so d_1 = d_2, e_1 = e_2. (iii) The mapping is onto. Let f | ab. Put d = (f, a), e = (f, b). Then f = (f, ab) = (f, a)(f, b) = de. □

Example 3.1.4. Make an array to illustrate the correspondence for a = 28, b = 15.

Theorem 3.1.4. τ(n) and σ(n) are multiplicative functions.

Proof. Suppose (a, b) = 1. Let D_a = {d_1, . . . , d_k}, D_b = {e_1, . . . , e_l}. Then, by the correspondence theorem, D_{ab} = {d_i e_j : 1 ≤ i ≤ k, 1 ≤ j ≤ l}. In particular τ(ab) = |D_{ab}| = kl = τ(a)τ(b). Also

σ(ab) = ∑_{i=1}^{k} ∑_{j=1}^{l} d_i e_j = (d_1 + d_2 + · · · + d_k)(e_1 + e_2 + · · · + e_l) = σ(a)σ(b),

by the distributive law. □

Now for prime powers we easily see:

τ(p^e) = e + 1,

σ(p^e) = 1 + p + p^2 + · · · + p^e = (p^{e+1} − 1)/(p − 1).

Thus, since τ(n) and σ(n) are multiplicative, we obtain

Theorem 3.1.5. Formulas for τ(n) and σ(n). Let n = p_1^{e_1} · · · p_k^{e_k}. Then

i) τ(n) = ∏_{i=1}^{k} (e_i + 1).

ii) σ(n) = ∏_{i=1}^{k} (p_i^{e_i+1} − 1)/(p_i − 1).
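The two formulas can be compared against a direct count of divisors. The Python sketch below is an illustration only; factorize uses simple trial division, and all function names are ours.

```python
def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent}, by trial division."""
    f = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            f[p] = f.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def tau(n):
    """Number of positive divisors: product of (e_i + 1)."""
    result = 1
    for e in factorize(n).values():
        result *= e + 1
    return result

def sigma(n):
    """Sum of positive divisors: product of (p^(e+1) - 1) / (p - 1)."""
    result = 1
    for p, e in factorize(n).items():
        result *= (p ** (e + 1) - 1) // (p - 1)
    return result

def divisors(n):
    """Brute-force list of positive divisors, used only as a check."""
    return [d for d in range(1, n + 1) if n % d == 0]

print(tau(120), sigma(120))                       # 16 360
print(len(divisors(120)), sum(divisors(120)))     # 16 360
```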

3.2. Perfect, Deficient and Abundant Numbers

Definition 3.2.1. We say that a positive integer n is
i) Deficient if σ(n) < 2n,
ii) Abundant if σ(n) > 2n, and
iii) Perfect if σ(n) = 2n.

Another way to think about it: a number is perfect if it equals the sum of its proper divisors.

Example 3.2.1. i) 6, 28 are perfect.
ii) Any prime power is deficient. Any product of two odd prime powers is deficient.
iii) Any multiple of a perfect number (greater than the perfect number) is abundant.


Example 3.2.2. The following numbers were known to be perfect to the ancients. Let's look at their factorizations to see if a pattern can be discerned.

6 = 2 · 3,
28 = 4 · 7 = 2^2 · 7,
496 = 16 · 31 = 2^4 · 31,
8128 = 64 · 127 = 2^6 · 127.

What about 8 · 15? This is abundant.

Conjecture: n is perfect if n = 2^k(2^{k+1} − 1) and 2^{k+1} − 1 is a prime.

Proof. We have σ(n) = σ(2^k(2^{k+1} − 1)) = σ(2^k)σ(2^{k+1} − 1). Now σ(2^k) = 2^{k+1} − 1 and since 2^{k+1} − 1 is a prime, σ(2^{k+1} − 1) = 2^{k+1}. Thus σ(n) = (2^{k+1} − 1)2^{k+1} = 2n. □

Questions: Are these the only perfect numbers? Are there any odd perfect numbers? When is 2^{k+1} − 1 a prime?

These are all open questions. It is known that if n is an odd perfect number, then n > 10^{300}, n must have a prime divisor > 100000 and it must have at least 11 distinct prime factors. However, for even perfect numbers we have

Theorem 3.2.1. Euler’s characterization of even perfect numbers. An even number is perfect if and only if it is of the form 2^k(2^{k+1} − 1) with 2^{k+1} − 1 a prime.

Proof. Already shown one way. Suppose now that n is an even perfect number, say n = 2^k a with a odd. Then σ(n) = 2n iff (2^{k+1} − 1)σ(a) = 2^{k+1} a. Let σ(a) = a + b. The above holds iff a = b(2^{k+1} − 1). In particular b | a and b < a. If b ≠ 1 then a, b, 1 are distinct divisors of a and so σ(a) ≥ a + b + 1, a contradiction. Thus b = 1 and a = 2^{k+1} − 1 and σ(a) = a + 1. It follows that a must be an odd prime of the desired form. □
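Euler's characterization can be checked by machine for small numbers. The Python sketch below (function names and search bounds are our own choices) builds the numbers 2^k(2^{k+1} − 1) with 2^{k+1} − 1 prime and compares them with a brute-force search for perfect numbers up to 10000, which happens to find no odd ones in that range.

```python
from math import isqrt

def is_prime(n):
    """Trial division up to sqrt(n)."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def sigma(n):
    """Sum of the positive divisors of n, pairing each d <= sqrt(n) with n // d."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total

def even_perfect_numbers(max_k):
    """Numbers 2^k (2^(k+1) - 1) with 2^(k+1) - 1 prime, for 1 <= k <= max_k."""
    return [2**k * (2 ** (k + 1) - 1)
            for k in range(1, max_k + 1)
            if is_prime(2 ** (k + 1) - 1)]

print(even_perfect_numbers(6))                            # [6, 28, 496, 8128]
print([n for n in range(1, 10001) if sigma(n) == 2 * n])  # [6, 28, 496, 8128]
```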

3.2.1. Mersenne Primes.

Definition 3.2.2. Any prime of the form M_k = 2^k − 1 is called a Mersenne prime. (In general, numbers of the form M_k are called Mersenne numbers.)

Theorem 3.2.2. (i) If d | k then (2^d − 1) | (2^k − 1).
(ii) Thus if 2^k − 1 is a prime then k must be a prime.

Proof. (i) We make use of the factoring formula

X^n − 1 = (X − 1)(X^{n−1} + · · · + 1).

Suppose d | k, say k = dn for some n ∈ N. Then

2^k − 1 = 2^{dn} − 1 = (2^d)^n − 1 = (2^d − 1)(2^{d(n−1)} + · · · + 1),

and so (2^d − 1) | (2^k − 1).
(ii) Immediate from part (i). □

Example 3.2.3. k = 2, 3, 5, 7 yield Mersenne primes 3, 7, 31, 127. However, k = 11 gives 2047 = 23 · 89, a composite. Thus we do not always get a Mersenne prime when k is prime.


Example 3.2.4. Factors of Mersenne numbers with composite k. If k = 9, then 2^3 − 1 = 7 is a factor and we see M_9 = 511 = 7 · 73. If k = 10, then 2^2 − 1 = 3 and 2^5 − 1 = 31 are factors and we see M_{10} = 1023 = 3 · 11 · 31.

3.2.2. GIMPS, Great Internet Mersenne Prime Search. It was popular for many years to test the speed of new computers by seeing if they can find the largest known prime number using standard algorithms. All of the largest known primes are Mersenne primes. In 1876 Lucas had the record k = 127. He discovered a clever algorithm in order to deal with numbers of this size by hand. In 1985 a Cray X-MP obtained k = 216091, a 65000 digit number, in 3 hours. In 2008 the 45th Mersenne prime was discovered at UCLA, 2^{43,112,609} − 1, a number with 12,978,189 digits, earning the finders $100000. The prize money comes from the EFF Cooperative Computing Awards. The remaining awards are $150000 for the first to find a prime with 100 million digits, and $250000 for the first to find a prime with a billion digits. The current (as of 3/13/2016) largest known prime is M_k with k = 74207281, a prime with 22338618 digits. It is the 49th known Mersenne prime. It was found at the University of Central Missouri. It took one month of computing on a PC with Intel I7-4790 CPU. Check GIMPS on the internet if you wish to participate in this search. You will also find there a list of all known Mersenne primes and when they were discovered.

Open Problem: Are there infinitely many Mersenne primes?

3.2.3. Fermat primes.

Definition 3.2.3. Any prime of the form F_k = 2^k + 1 is called a Fermat prime. (In general, numbers of the form F_k are called Fermat numbers.)

Theorem 3.2.3. (i) If d | k and k/d is odd, then (2^d + 1) | (2^k + 1).
(ii) Thus if 2^k + 1 is a Fermat prime, then k is a power of 2.

Proof. i) Say k = da for some a ∈ N. Since k/d is odd, a is odd. Then

2^k + 1 = 2^{da} + 1 = (2^d)^a + 1 = X^a + 1,

where X = 2^d. Now, since a is odd

X^a + 1 = (X + 1)(X^{a−1} − X^{a−2} + X^{a−3} − · · · + 1),

and thus substituting X = 2^d we obtain

2^k + 1 = (2^d + 1)(2^{d(a−1)} − 2^{d(a−2)} + · · · + 1),

which implies that (2^d + 1) | (2^k + 1).

ii) Suppose that 2^k + 1 is a prime. If k is not a power of 2, then k has an odd divisor a > 1, say k = da with d ∈ N. Then d | k and k/d is odd, so by part i), 2^d + 1 is a divisor of 2^k + 1. Since d < k this is a proper divisor of 2^k + 1 strictly larger than 1, contradicting the fact that 2^k + 1 is prime. Therefore, k has no odd divisor greater than 1, and must be a power of 2. □

Example 3.2.5. Find three proper factors of 2^{15} + 1. Letting d = 1, 3, 5 we get 2^1 + 1 = 3, 2^3 + 1 = 9 and 2^5 + 1 = 33 as factors. In particular, from the latter, we see that 11 is also a factor, and from the latter two we obtain 99 as a factor. Thus, 2^{15} + 1 = 99 · 331 = 3^2 · 11 · 331 is the prime factorization.


Fermat conjectured that every number of the form 2^{2^l} + 1 is a prime. This is true for l = 0, 1, 2, 3, 4, where we get 3, 5, 17, 257, and 65537, but false for 2^{2^5} + 1. Euler was able to factor the latter. Here’s one way.

2^{32} + 1 = (2^9 + 2^7 + 1)(2^{23} − 2^{21} + 2^{19} − 2^{17} + 2^{14} − 2^9 − 2^7 + 1),

4,294,967,297 = 641 · 6700417.

How might one discover that 641 is a divisor of 2^{2^5} + 1 without a lot of trial and error? Using Fermat’s Little Theorem, which we will prove later this semester, one can prove that if p is an odd prime divisor of a Fermat number 2^{2^l} + 1 with l ≥ 2, then p ≡ 1 (mod 2^{l+2}). Thus in the case l = 5 we must have p ≡ 1 (mod 128). The first few numbers of this form are 129, 257, 385, 513, 641. Plainly 129, 385 and 513 are not primes. Testing 257 and 641, one quickly sees that 641 is a factor.
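The search just described is short enough to automate. The Python sketch below assumes the stated congruence p ≡ 1 (mod 2^{l+2}) for prime divisors (valid for l ≥ 2) and simply tries every candidate of that form, composite or not; the function name is ours.

```python
def first_fermat_factor(l):
    """Smallest nontrivial divisor of 2^(2^l) + 1 among numbers congruent to 1 mod 2^(l+2).

    Returns None if no candidate up to the square root divides, in which case
    the Fermat number is prime (assuming l >= 2).
    """
    F = 2 ** (2 ** l) + 1
    step = 2 ** (l + 2)
    p = step + 1
    while p * p <= F:
        if F % p == 0:
            return p
        p += step
    return None

print(first_fermat_factor(5))  # 641
print(first_fermat_factor(4))  # None, since 65537 is prime
```

For l = 5 the loop tries exactly the candidates 129, 257, 385, 513, 641 listed above.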

Open problem: Are there any other Fermat primes besides the 5 listed above? In particular it is unknown whether there are infinitely many.

Gauss made a connection between Fermat primes and construction of regular n-gons.

Theorem 3.2.4. (i) A regular n-gon with n a prime can be constructed with straight-edge and compass if and only if n is a Fermat prime.

(ii) More generally, a regular n-gon can be constructed iff n is of the form n = 2^k p_1 p_2 · · · p_l for some k ≥ 0 and distinct Fermat primes p_1, . . . , p_l.

3.3. Properties of multiplicative functions.

Recall the definition of a multiplicative function.

Definition 3.3.1. 1) We say that an arithmetic function f is multiplicative if for any two natural numbers a, b with gcd(a, b) = 1 we have f(ab) = f(a)f(b).

2) We say that f is totally multiplicative if for any two natural numbers a, b, f(ab) = f(a)f(b).

Let f(n) be a given multiplicative function. Define

F(n) := ∑_{d|n} f(d),

the sum being over all positive divisors of n. We claim that F is multiplicative. First let's look at a few examples of such functions.

Example 3.3.1. Letting f(n) = 1 for all n, we see that F(n) = τ(n). Letting f(n) = n for all n, we get F(n) = σ(n). Letting f(n) = n^2 we get F(n) = σ_2(n), the sum of the squares of the positive divisors of n.

Theorem 3.3.1. Suppose that f is a multiplicative function. Then so is the function F defined by F(n) = ∑_{d|n} f(d).

Proof. Suppose that (a, b) = 1. By the correspondence theorem, any positive divisor d of ab can be expressed uniquely in the manner d = ce, where c | a and e | b, c, e positive. Note that for any such c, e, we have (c, e) = 1, that is c, e have no common prime factor, since (a, b) = 1. Hence f(ce) = f(c)f(e). Thus we have

F(ab) = ∑_{d|ab} f(d) = ∑_{c|a} ∑_{e|b} f(ce) = ∑_{c|a} ∑_{e|b} f(c)f(e) = ∑_{c|a} f(c) ∑_{e|b} f(e) = F(a)F(b).

□

Example 3.3.2. Let F(n) = ∑_{d|n} τ(d). Find a formula for F(n) and evaluate F(8000). First we evaluate F(p^e) for any prime power p^e.

F(p^e) = τ(1) + τ(p) + τ(p^2) + · · · + τ(p^e) = 1 + 2 + 3 + · · · + e + (e + 1) = (e + 1)(e + 2)/2.

Next we observe that since τ is multiplicative, so is F by the preceding theorem. Thus, if n = p_1^{e_1} · · · p_k^{e_k}, then

F(n) = ∏_{i=1}^{k} F(p_i^{e_i}) = ∏_{i=1}^{k} (e_i + 1)(e_i + 2)/2.

If n = 8000 = 2^3 · 1000 = 2^6 5^3, then F(n) = [(6 + 1)(6 + 2)/2] · [(3 + 1)(3 + 2)/2] = 28 · 10 = 280.
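A brute-force computation gives an independent check of the value 280. This Python sketch uses our own helper names; F_formula takes the list of exponents in the prime factorization, here [6, 3] for 8000 = 2^6 5^3.

```python
def tau(n):
    """Number of positive divisors of n, by direct count."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def F_brute(n):
    """Sum of tau(d) over all positive divisors d of n."""
    return sum(tau(d) for d in range(1, n + 1) if n % d == 0)

def F_formula(exponents):
    """Product of (e + 1)(e + 2)/2 over the exponents e in the factorization of n."""
    result = 1
    for e in exponents:
        result *= (e + 1) * (e + 2) // 2
    return result

print(F_brute(8000))      # 280
print(F_formula([6, 3]))  # 280
```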

Example 3.3.3. Let σ_2(n) = ∑_{d|n} d^2. Find a formula for σ_2(n). First note that σ_2 is multiplicative since the function f(n) = n^2 is multiplicative. Next, for any prime power p^e we have

σ_2(p^e) = 1 + p^2 + p^4 + · · · + p^{2e} = (p^{2(e+1)} − 1)/(p^2 − 1).

Thus for n = p_1^{e_1} · · · p_k^{e_k} we get

σ_2(n) = ∏_{i=1}^{k} (p_i^{2(e_i+1)} − 1)/(p_i^2 − 1).

3.3.1. The Euler Phi Function.

Definition 3.3.2. For any positive integer n we define φ(n) to be the number of positive integers less than or equal to n that are relatively prime to n.

Example 3.3.4. Find φ(10). The values relatively prime to 10 are 1, 3, 7 and 9, so φ(10) = 4.

Example 3.3.5. For any prime p, φ(p) = p − 1. For any prime power p^k, φ(p^k) = p^k − p^{k−1} since the only values not relatively prime to p^k are the multiples of p, and there are p^{k−1} such multiples less than or equal to p^k.

Suppose now that n is a positive integer with factorization p_1^{e_1} p_2^{e_2} · · · p_k^{e_k}. How do we find φ(n)? For any divisor d of n, let

S_d = {j : 1 ≤ j ≤ n, d | j}.

Note |S_d| = n/d. By the inclusion-exclusion principle, to find the number of values from 1 to n relatively prime to n, we need to count how many points are not in S_{p_1}, S_{p_2}, ..., or S_{p_k}:

φ(n) = n − |S_{p_1}| − |S_{p_2}| − · · · − |S_{p_k}| + |S_{p_1 p_2}| + |S_{p_1 p_3}| + · · · + (−1)^k |S_{p_1 · · · p_k}|

= n (1 − 1/p_1 − · · · − 1/p_k + 1/(p_1 p_2) + 1/(p_1 p_3) + · · · + (−1)^k 1/(p_1 · · · p_k))

= n ∏_{i=1}^{k} (1 − 1/p_i).

Another way to write the last product is

∏_{i=1}^{k} p_i^{e_i} ∏_{i=1}^{k} (1 − 1/p_i) = ∏_{i=1}^{k} (p_i^{e_i} − p_i^{e_i−1}) = ∏_{i=1}^{k} φ(p_i^{e_i}).

Thus we have established the following theorem.

Theorem 3.3.2. Let n have prime power factorization n = p_1^{e_1} · · · p_k^{e_k}. Then φ(n) = ∏_{i=1}^{k} φ(p_i^{e_i}) = ∏_{i=1}^{k} p_i^{e_i−1}(p_i − 1).
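The product formula can be compared with the definition of φ by direct counting. In the Python sketch below (names are ours), phi_formula computes n ∏(1 − 1/p) over the primes p dividing n using integer arithmetic only.

```python
from math import gcd

def phi_count(n):
    """phi(n) straight from the definition: count 1 <= k <= n with gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def phi_formula(n):
    """phi(n) = n * product over primes p | n of (1 - 1/p)."""
    result = n
    m = n
    p = 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if m > 1:                      # one prime factor larger than sqrt may remain
        result -= result // m
    return result

print(phi_count(10), phi_formula(10))  # 4 4
```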

Corollary 3.3.1. The Euler phi-function is multiplicative.

Proof. Follows from Theorem 3.1.1 (b). □

In Example 3.4.4 we use the Mobius function to show the Euler phi-function is multiplicative. In the next chapter, we use the Chinese Remainder Theorem to prove this fact.

Here is another interesting property of the Euler phi-function.

Theorem 3.3.3. For any natural number n we have $\sum_{d \mid n} \varphi(d) = n$.

Proof. Let $F(n) = \sum_{d \mid n} \varphi(d)$. First note that since φ is multiplicative, so is F. Next, for any prime power $p^e$ we have

\[ F(p^e) = \varphi(1) + \varphi(p) + \varphi(p^2) + \cdots + \varphi(p^e) = 1 + (p-1) + (p^2 - p) + \cdots + (p^e - p^{e-1}) = p^e, \]

since the last sum is telescoping. Thus if n is any natural number with prime factorization $n = p_1^{e_1}\cdots p_k^{e_k}$, then we have

\[ F(n) = F(p_1^{e_1})\cdots F(p_k^{e_k}) = p_1^{e_1}\cdots p_k^{e_k} = n. \]

□

A direct proof of this theorem that does not appeal to the multiplicative property of φ can be given as follows. A complex number w is called a primitive n-th root of unity if $w^n = 1$ but $w^d \ne 1$ for all $1 \le d < n$. There are φ(n) primitive n-th roots of unity. Now every n-th root of unity is a primitive d-th root of unity for some (unique) d | n. Thus, since there are n n-th roots of unity, and φ(d) primitive d-th roots of unity for each d | n, we see that $n = \sum_{d \mid n} \varphi(d)$.

3.4. The Mobius Function

Definition 3.4.1. The Mobius function µ is defined by

\[
\mu(n) =
\begin{cases}
1, & \text{if } n = 1;\\
(-1)^k, & \text{if } n = p_1p_2\cdots p_k, \text{ a product of distinct primes};\\
0, & \text{if } p^2 \mid n \text{ for some prime } p.
\end{cases}
\]


Example 3.4.1. Make a list of the values of µ(n) for 20 ≤ n ≤ 30: 0, 1, 1, −1, 0, 0, 1, 0, 0, −1, −1. Are the values statistically random in some sense? This is a deep and open problem.
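The list in Example 3.4.1 can be reproduced with a short computation. This is an illustrative sketch only; the function name `mobius` is our own, and it evaluates Definition 3.4.1 by trial division.

```python
def mobius(n):
    """Return mu(n): 0 if a square of a prime divides n, else (-1)**(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p*p divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:  # one remaining prime factor
        result = -result
    return result

values_20_to_30 = [mobius(n) for n in range(20, 31)]
```

Here `values_20_to_30` holds µ(20), ..., µ(30) in order.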

Theorem 3.4.1. µ(n) is a multiplicative function.

Proof. Let a, b be positive integers with (a, b) = 1. If a or b equals 1, say WLOG a = 1, then µ(ab) = µ(b) while µ(a)µ(b) = µ(1)µ(b) = 1 · µ(b) = µ(b), and so µ(ab) = µ(a)µ(b). Next, suppose that either a or b is divisible by $p^2$ for some prime p, say WLOG a. Then so is ab, and so µ(ab) = 0, while µ(a)µ(b) = 0 · µ(b) = 0, so again µ(ab) = µ(a)µ(b). Finally, suppose that a, b are products of distinct primes, say $a = p_1 \cdots p_k$, $b = q_1 \cdots q_l$. The $p_i$, $q_j$ must all be distinct since (a, b) = 1. Thus ab is a product of k + l distinct primes, and we have $\mu(ab) = (-1)^{k+l} = (-1)^k(-1)^l = \mu(a)\mu(b)$. □

So why is the Mobius function so important? It is because of the following theorem, and the Mobius inversion formula that follows it.

Theorem 3.4.2. For any natural number n,

\[
\sum_{d \mid n} \mu(d) =
\begin{cases}
1 & \text{if } n = 1,\\
0 & \text{if } n > 1.
\end{cases}
\]

Proof. Let $F(n) = \sum_{d \mid n} \mu(d)$. Since µ is multiplicative, so is F. For any prime power $p^e$ with e ≥ 1 we have

\[ F(p^e) = \mu(1) + \mu(p) + \mu(p^2) + \cdots + \mu(p^e) = 1 - 1 + 0 + \cdots + 0 = 0. \]

Thus if n > 1 with prime factorization $n = p_1^{e_1}\cdots p_k^{e_k}$ (k ≥ 1), then $F(n) = F(p_1^{e_1})\cdots F(p_k^{e_k}) = 0 \cdots 0 = 0$. Trivially, F(1) = 1. □

Example 3.4.2. Calculate $\sum_{d \mid 20} \mu(d) = \mu(1) + \mu(2) + \mu(4) + \mu(5) + \mu(10) + \mu(20) = 1 - 1 + 0 - 1 + 1 + 0 = 0$.

Theorem 3.4.3. Mobius inversion formula. Let f be any arithmetic function and $F(n) = \sum_{d \mid n} f(d)$. Then for any n ∈ N we have

\[ f(n) = \sum_{d \mid n} F(d)\,\mu(n/d) = \sum_{d \mid n} F\!\left(\frac{n}{d}\right)\mu(d). \]

Think of the sums in the theorem as being sums over the divisor pairs d, n/d of n.

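Proof. By the definition of F, we have

\[
\sum_{d \mid n} F\!\left(\frac{n}{d}\right)\mu(d)
= \sum_{d \mid n} \mu(d) \sum_{e \mid \frac{n}{d}} f(e)
= \sum_{e \mid n} f(e) \sum_{d \mid \frac{n}{e}} \mu(d)
= \sum_{e \mid n} f(e)\,\delta_1(n/e) = f(n),
\]

where, by Theorem 3.4.2, $\delta_1(n/e) = \sum_{d \mid n/e}\mu(d)$ equals 1 if e = n and 0 otherwise. □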

Example 3.4.3. $\sigma(n) = \sum_{d \mid n} d$ and so $n = \sum_{d \mid n} \sigma(d)\mu(n/d)$.
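The inversion formula can be tested numerically on this example. The sketch below is illustrative and not from the original notes; the helper names `divisors`, `mobius`, `sigma` and `mobius_invert` are assumptions chosen for readability, and everything is done by brute force.

```python
def divisors(n):
    """All positive divisors of n, by brute force."""
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(n):
    """mu(n) by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(divisors(n))

def mobius_invert(F, n):
    """Return the sum over d | n of F(d) * mu(n/d), which recovers f(n)."""
    return sum(F(d) * mobius(n // d) for d in divisors(n))
```

Applying `mobius_invert` to `sigma` should return n itself, and applying it to the identity function F(n) = n should return φ(n), in line with Theorem 3.3.3.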


Theorem 3.4.4. Let f, g be multiplicative functions and define $F(n) = \sum_{d \mid n} f(d)g(n/d)$. Then F is multiplicative.

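Proof. Follows from the correspondence theorem. Let (a, b) = 1. Then

\[
\begin{aligned}
F(ab) &= \sum_{l \mid ab} f(l)\,g\!\left(\frac{ab}{l}\right)
= \sum_{d \mid a}\sum_{e \mid b} f(de)\,g\!\left(\frac{a}{d}\cdot\frac{b}{e}\right) \\
&= \sum_{d \mid a}\sum_{e \mid b} f(d)f(e)\,g\!\left(\frac{a}{d}\right)g\!\left(\frac{b}{e}\right)
= \sum_{d \mid a} f(d)\,g\!\left(\frac{a}{d}\right)\sum_{e \mid b} f(e)\,g\!\left(\frac{b}{e}\right) = F(a)F(b).
\end{aligned}
\]

□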

Corollary 3.4.1. Let f be any arithmetic function and F be defined by $F(n) = \sum_{d \mid n} f(d)$. Then F is multiplicative if and only if f is multiplicative.

Proof. One direction is just Theorem 3.3.1. For the converse, suppose that F is multiplicative. Then by the Mobius inversion formula we have

\[ f(n) = \sum_{d \mid n} \mu(d)F(n/d), \]

which is multiplicative by the preceding theorem. □

Example 3.4.4. Suppose we start with the formula $n = \sum_{d \mid n} \varphi(d)$ in Theorem 3.3.3. Recall that after the theorem we gave a proof that did not rely on the fact that φ was multiplicative. By the preceding corollary we deduce that φ is multiplicative. In fact, by the Mobius inversion formula we obtain

\[ \varphi(n) = \sum_{d \mid n} \mu(d)\frac{n}{d} = n\sum_{d \mid n}\frac{\mu(d)}{d}. \]

3.4.1. Derivation of the Mobius Inversion Formula. Above we stated the Mobius Inversion Formula as a theorem without motivating where the formula came from in the first place. Let’s see how we might derive this formula from scratch using indicator functions.

Definition 3.4.2. The indicator function (or characteristic function) for a singleton point set {n} is defined by

\[
\delta_n(x) =
\begin{cases}
1 & \text{if } x = n,\\
0 & \text{if } x \ne n.
\end{cases}
\]

Thus Theorem 3.4.2 can be restated: $\sum_{d \mid n} \mu(d) = \delta_1(n)$.

Corollary 3.4.2. Let n ∈ N. For any divisor d of n we have $\delta_n(d) = \sum_{e \mid \frac{n}{d}} \mu(e)$.

Proof. Since d | n we have $\delta_n(d) = \delta_1(n/d) = \sum_{e \mid \frac{n}{d}} \mu(e)$. □


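Suppose that f is an arithmetic function and we define F by $F(n) = \sum_{d \mid n} f(d)$. How can we invert this equation, and solve for f(n) in terms of F(n)? Letting $\delta_n$ be the indicator function for the point set {n}, we have

\[
f(n) = \sum_{d \mid n} f(d)\,\delta_n(d)
= \sum_{d \mid n} f(d) \sum_{e \mid \frac{n}{d}} \mu(e)
= \sum_{e \mid n} \mu(e) \sum_{d \mid \frac{n}{e}} f(d)
= \sum_{e \mid n} \mu(e)\,F\!\left(\frac{n}{e}\right).
\]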

CHAPTER 4

More on Congruences

Definition 4.0.3. i) A complete residue system (mod m) is a set of m distinct integers (mod m), $\{x_1, \ldots, x_m\}$. Thus, every integer is congruent to exactly one of the $x_i$ (mod m).

ii) A reduced residue system (mod m) is a set of φ(m) distinct integers (mod m), each relatively prime to m, $\{x_1, \ldots, x_{\varphi(m)}\}$.

Example 4.0.5. For m = 5, the following are all examples of complete residue systems (mod 5): {0, 1, 2, 3, 4}, {5, 6, 7, 8, 9}, {5, 1, 22, −27, 94}.

For m = 10, {1, 3, 7, 9} is a reduced residue system (mod 10). So is {11, 3, −3, −1}.

4.1. Counting Solutions of Congruences

Let f(x) be a polynomial with integer coefficients and m be a positive integer. We wish to solve the congruence

(4.1) f(x) ≡ 0 (mod m).

Example 4.1.1. Solve $x^2 \equiv 1 \pmod 8$. By testing values from 0 to 7, we see that the solution set is all x with x ≡ 1, 3, 5 or 7 (mod 8). Thus {1, 3, 5, 7} is called a complete set of solutions of the congruence $x^2 \equiv 1 \pmod 8$, and this congruence has 4 distinct solutions (mod 8).

Definition 4.1.1. (i) A set of integers $\{x_1, \ldots, x_k\}$ is called a complete set of solutions of the congruence (4.1) if the values $x_1, \ldots, x_k$ are distinct residues (mod m), and every solution of (4.1) is congruent to one of these values (mod m).

(ii) A complete set of solutions is called the “least” complete set of solutions if the $x_i$ are least residues (that is, $0 \le x_i \le m - 1$).

(iii) If $\{x_1, \ldots, x_k\}$ is a complete set of solutions of (4.1), then we say (4.1) has k distinct solutions (mod m).

4.2. Linear Congruences

Consider the linear congruence

(4.2) ax ≡ b (mod m),

where a, b ∈ Z. Note, this is equivalent to solving the linear equation ax = b + my, that is, ax − my = b, and we did this earlier. Putting d = (a, m), we saw that this was solvable iff d | b, in which case the general solution was given by $x = x_0 + \frac{m}{d}t$, $y = y_0 + \frac{a}{d}t$, with t any integer and $(x_0, y_0)$ any particular solution.

Theorem 4.2.1. Let d = (a, m). The linear congruence (4.2) has a solution if and only if d | b, in which case a complete set of solutions is given by

\[ x = x_0 + t\frac{m}{d}, \qquad 0 \le t \le d - 1, \]

where $x_0$ is any particular solution of (4.2). Thus, if a solution exists, then there are d distinct solutions (mod m).

Note that we stop at t = d− 1 in order to avoid repetition of solutions.

Example 4.2.1. Solve 7x ≡ 2 (mod 11). d = (7, 11) = 1 and 1 | 2, so a unique solution exists. To solve the congruence, we solve the linear equation 7x − 11y = 2 using the array method, obtaining x = 5 + 11t. Thus, x ≡ 5 (mod 11).

Example 4.2.2. Solve 8x ≡ 4 (mod 12). d = (12, 8) = 4 and 4 | 4, so there exist 4 distinct solutions (mod 12). We solve the equation 8x − 12y = 4, or equivalently 2x − 3y = 1. A particular solution is (2, 1) and the general solution is x = 2 + 3t, y = 1 + 2t. Thus x ≡ 2, 5, 8, 11 (mod 12).
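Theorem 4.2.1 translates directly into a small routine. The following sketch is an illustration rather than the array method of the notes: it uses the extended Euclidean algorithm to find a particular solution, and the names `extended_gcd` and `solve_linear_congruence` are our own.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b), for a, b >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def solve_linear_congruence(a, b, m):
    """Return the least complete set of solutions of a*x = b (mod m)."""
    d, u, _ = extended_gcd(a % m, m)
    if b % d != 0:
        return []  # no solution unless d divides b
    step = m // d
    x0 = (u * (b // d)) % step  # a particular solution, reduced mod m/d
    return [x0 + t * step for t in range(d)]
```

With these definitions, `solve_linear_congruence(7, 2, 11)` and `solve_linear_congruence(8, 4, 12)` reproduce the answers of Examples 4.2.1 and 4.2.2.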

4.3. Multiplicative inverses (mod m)

Definition 4.3.1. Let a ∈ Z. An integer x is called a multiplicative inverse of a (mod m) if ax ≡ 1 (mod m). In this case we write $x \equiv a^{-1} \pmod m$.

Example 4.3.1. Find the multiplicative inverse of 3 (mod 10). Answer: $3^{-1} \equiv 7 \pmod{10}$, since 3 · 7 ≡ 1 (mod 10). Find the multiplicative inverse of 2 (mod 10). Answer: There is none, since (2, 10) > 1 and so the congruence 2x ≡ 1 (mod 10) is not solvable.

Theorem 4.3.1. An integer a has a multiplicative inverse (mod m) if and only if (a, m) = 1. In this case, the multiplicative inverse is unique (mod m).

Example 4.3.2. Use the multiplicative inverse of 3 (mod 10) to solve the congruence 3x ≡ 7 (mod 10).
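As a quick check of Theorem 4.3.1 and the two examples, the sketch below uses Python's built-in three-argument `pow` with exponent −1 (available from Python 3.8), which returns a multiplicative inverse and raises `ValueError` when none exists. The variable names are illustrative only.

```python
# Inverse of 3 modulo 10, then use it to solve 3x = 7 (mod 10).
inv3 = pow(3, -1, 10)
x = (inv3 * 7) % 10  # multiply both sides of 3x = 7 by the inverse of 3

# 2 has no inverse modulo 10 because gcd(2, 10) > 1.
try:
    pow(2, -1, 10)
    two_is_invertible = True
except ValueError:
    two_is_invertible = False
```

Multiplying both sides of 3x ≡ 7 (mod 10) by the inverse 7 gives x ≡ 49 ≡ 9 (mod 10).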

Note 4.3.1. The values in a reduced residue system (mod m) are the values that have multiplicative inverses (mod m).

Theorem 4.3.2. Cancellation Law. If (a, m) = 1 and ax ≡ ay (mod m), then x ≡ y (mod m).

Proof. Since (a, m) = 1, a has a multiplicative inverse (mod m) and so we can multiply both sides of the congruence ax ≡ ay (mod m) by $a^{-1}$ (mod m), to get x ≡ y (mod m). □

Theorem 4.3.3. Wilson’s Theorem For any prime p, (p−1)! ≡ −1 (mod p).

Proof. The statement is trivial for p = 2, so assume that p is odd. Note that the only solutions of the congruence $x^2 \equiv 1 \pmod p$ are x ≡ ±1 (mod p). Thus if $x \not\equiv \pm 1 \pmod p$ then $x^{-1} \not\equiv x \pmod p$, and so we can form pairs $(x, x^{-1})$, and obtain

\[ \{1, 2, \ldots, p-1\} \equiv \{1, -1, x_1, x_1^{-1}, x_2, x_2^{-1}, \ldots, x_k, x_k^{-1}\} \pmod p. \]

Thus taking the product of all of the elements in each set we see

\[ (p-1)! \equiv 1\cdot(-1)\cdot x_1x_1^{-1}\cdots x_kx_k^{-1} \equiv -1 \pmod p. \]

□


4.4. Chinese Remainder Theorem

Example 4.4.1. Find a whole number n such that the remainder is 3 when n is divided by 7, and 5 when n is divided by 11. This is equivalent to the system x ≡ 3 (mod 7), x ≡ 5 (mod 11). Set x = 3 + 7t. Then 3 + 7t ≡ 5 (mod 11), so t ≡ 5 (mod 11). Thus x ≡ 38 (mod 77).

Theorem 4.4.1. Chinese Remainder Theorem. Let a, b be positive integers with (a, b) = 1. Let h, k be any integers. Then the system

x ≡ h (mod a)

x ≡ k (mod b).

has a unique solution (mod ab).

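Proof. Set x = h + at and substitute into the second congruence to get at ≡ k − h (mod b). Since (a, b) = 1, by Theorem 4.2.1 this congruence has a unique solution $t \equiv t_0 \pmod b$, that is, $t = t_0 + bs$, s ∈ Z. Substituting gives $x = h + at_0 + abs$, so $x \equiv h + at_0 \pmod{ab}$ is the unique solution. □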

Example 4.4.2. Historical example used by the ancient Chinese. Suppose we wish to determine the exact number of people in a large crowd of about 500 people. Have the crowd break into groups of 7, 8 and 9 people, with 2, 4, 6 people left over in the three cases. Thus we must solve x ≡ 2 (mod 7), x ≡ 4 (mod 8), x ≡ 6 (mod 9). To solve, start with the biggest modulus, that is, set x = 6 + 9t, t ∈ Z. Substitute to get t ≡ 6 (mod 8) and consequently x ≡ 60 (mod 72), say x = 60 + 72s. Substitute again to get s ≡ 6 (mod 7) and x ≡ 492 (mod 504). Thus there are 492 people.

Definition 4.4.1. We say a set of integers $\{a_1, a_2, \ldots, a_k\}$ are pairwise relatively prime if $(a_i, a_j) = 1$ for all i, j with 1 ≤ i < j ≤ k.

Example 4.4.3. The integers 6, 11, 15 are not pairwise relatively prime, even though gcd(6, 11, 15) = 1.

Theorem 4.4.2. CRT with more than 2 congruences. Let $m_1, \ldots, m_n$ be pairwise relatively prime positive integers, and $h_1, \ldots, h_n$ be any integers. Then the system

\[ x \equiv h_i \pmod{m_i}, \qquad 1 \le i \le n, \]

has a unique solution (mod $m_1m_2\cdots m_n$).
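The substitution method used in the examples and in the proof of Theorem 4.4.1 can be sketched as a loop that absorbs one congruence at a time. This is an illustrative implementation, not part of the notes; the name `crt` is our own, the moduli are assumed to be pairwise relatively prime and at least 2, and modular inverses come from Python's built-in `pow(m, -1, n)` (Python 3.8 or later).

```python
def crt(residues, moduli):
    """Return (x, M) with x = h_i (mod m_i) for every i and 0 <= x < M = product of the m_i."""
    x, modulus = 0, 1
    for h, n in zip(residues, moduli):
        # Write the new solution as x + modulus*t and solve for t (mod n).
        t = ((h - x) * pow(modulus, -1, n)) % n
        x += modulus * t
        modulus *= n
    return x, modulus
```

For the crowd problem of Example 4.4.2, `crt([2, 4, 6], [7, 8, 9])` gives the residue 492 modulo 504.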

4.5. Fermat’s Little Theorem and Euler’s Theorem

Theorem 4.5.1. Fermat’s Little Theorem (FLT). Let p be a prime and a be an integer with p ∤ a. Then $a^{p-1} \equiv 1 \pmod p$.

Proof. Special case of Euler’s Theorem, coming next. Just set m = p and note φ(p) = p − 1. □

An equivalent version of Fermat’s Little Theorem is the following: For any prime p and integer a we have $a^p \equiv a \pmod p$. Note, if p | a then this statement is trivially true (both sides are 0), while if p ∤ a then we can divide both sides by a to obtain the original statement.

Theorem 4.5.2. Euler’s Theorem. Let m ∈ N and a ∈ Z with (a, m) = 1. Then $a^{\varphi(m)} \equiv 1 \pmod m$.

Note: Euler’s theorem fails if (a,m) > 1.


Example 4.5.1. Find the value of $17^{1802} \pmod{27}$. Note φ(27) = 18 and (17, 27) = 1, so $17^{18} \equiv 1 \pmod{27}$. Thus $17^{1802} = (17^{18})^{100}\,17^2 \equiv 289 \equiv 19 \pmod{27}$.

Lemma 4.5.1. Permutation Lemma. Let m ∈ N and a be an integer with (a, m) = 1 and k = φ(m). Let $\{x_1, x_2, \ldots, x_k\}$ be a reduced residue system (mod m). Then the set $\{ax_1, ax_2, \ldots, ax_k\}$ is also a reduced residue system (mod m).

Example 4.5.2. m = 10, {1, 3, 7, 9} is a reduced residue system. Let a = 3, 7, 9 to obtain new reduced residue systems, and note that they are just permutations of the original.

Proof of Permutation Lemma. By the cancellation law, the values $ax_1, \ldots, ax_k$ are all distinct (mod m). Each $ax_i$ is relatively prime to m, since both a and $x_i$ are. Since there are k = φ(m) distinct values, this must be a reduced residue system. □

Proof of Euler’s Theorem. Let a ∈ Z with (a, m) = 1 and $\{x_1, \ldots, x_k\}$ be a reduced residue system (mod m), where k = φ(m). By the permutation lemma, $\{[ax_1]_m, \ldots, [ax_k]_m\} = \{[x_1]_m, \ldots, [x_k]_m\}$. Thus the product of all of the elements in each of these sets must be equal (mod m), that is,

\[ (ax_1)(ax_2)\cdots(ax_k) \equiv x_1x_2\cdots x_k \pmod m. \]

By the cancellation law we obtain $a^k \equiv 1 \pmod m$, which is the statement of the theorem. □

Example 4.5.3. Find the last 3 digits of $17^{801}$. That is, find the least residue of $17^{801} \pmod{1000}$. Note φ(1000) = 400, so by Euler, $17^{400} \equiv 1 \pmod{1000}$. Thus $17^{801} = (17^{400})^2 \cdot 17 \equiv 17 \pmod{1000}$. So the last three digits are 017.
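The reduction of the exponent modulo φ(m) used in the power computations of this section can be sketched as follows. The function names `euler_phi` and `power_via_euler` are illustrative; φ is computed by brute force from Definition 3.3.2, so this is meant only for small moduli.

```python
from math import gcd

def euler_phi(m):
    """phi(m) by direct count."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def power_via_euler(a, n, m):
    """Compute a**n mod m for gcd(a, m) = 1 by first reducing n modulo phi(m)."""
    if gcd(a, m) != 1:
        raise ValueError("Euler's theorem requires gcd(a, m) = 1")
    reduced_exponent = n % euler_phi(m)
    return pow(a, reduced_exponent, m)
```

For example, `power_via_euler(17, 1802, 27)` reduces the exponent 1802 to 2 before computing the power.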

4.5.1. Applications of Euler’s Theorem and Fermat’s Little Theorem. We will see five applications. (i) Computing powers of integers (mod m). (Already done.) (ii) Finding orders of elements (mod m). (iii) Finding the length of the repeating pattern in the decimal expansion of a rational number. (iv) Primality testing. (v) RSA cryptography. We look at these applications in the next few sections.

4.6. Orders of elements (mod m)

Definition 4.6.1. Let m be a positive integer and a be any integer with (a, m) = 1. The order of a (mod m), written $\mathrm{ord}_m(a)$, is the smallest positive integer k such that $a^k \equiv 1 \pmod m$.

Example 4.6.1. $\mathrm{ord}_7(2) = 3$, $\mathrm{ord}_5(2) = 4$.

Note 4.6.1. i) If (a, m) = 1 then $\mathrm{ord}_m(a)$ exists. Why? Consider the values $a, a^2, \ldots$ (mod m). Eventually there must be repetition, that is, $a^i \equiv a^j \pmod m$ for some i > j. But then $a^{i-j} \equiv 1 \pmod m$. Thus there exists some k such that $a^k \equiv 1 \pmod m$, and therefore a minimal such k must exist by well ordering.

ii) If (a, m) > 1 then there is no k with $a^k \equiv 1 \pmod m$ and so $\mathrm{ord}_m(a)$ doesn’t exist.

iii) If $k = \mathrm{ord}_m(a)$ then $a^{-1} \equiv a^{k-1} \pmod m$.


Theorem 4.6.1. Powers of a (mod m). Let (a, m) = 1 and $k = \mathrm{ord}_m(a)$. Then

i) The values $1, a, a^2, \ldots, a^{k-1}$ are distinct (mod m).

ii) Every power of a is congruent to exactly one of these values. To be precise, if n ∈ Z and r is the remainder in dividing n by k then $a^n \equiv a^r \pmod m$.

iii) $a^n \equiv 1 \pmod m$ if and only if k | n.

Proof. i) Proof by contradiction. Suppose that $a^i \equiv a^j \pmod m$ for some 0 ≤ i < j < k. Then by the cancellation law $a^{j-i} \equiv 1 \pmod m$, but since 0 < j − i < k this contradicts the minimality of k.

ii) By the division algorithm, n = qk + r, with 0 ≤ r < k. Thus $a^n = (a^k)^q a^r \equiv a^r \pmod m$.

iii) $a^n \equiv 1 \pmod m$ iff $a^r \equiv 1 \pmod m$ iff r = 0 (by minimality of k) iff k | n. □

Theorem 4.6.2. Orders of elements. Let m ∈ N, a ∈ Z with (a, m) = 1. Then $\mathrm{ord}_m(a) \mid \varphi(m)$.

Proof. Let $k = \mathrm{ord}_m(a)$. By Theorem 4.6.1, $a^n \equiv 1 \pmod m$ iff k | n. Since $a^{\varphi(m)} \equiv 1 \pmod m$ (by Euler’s Theorem), we must have k | φ(m). □

Example 4.6.2. a) Find $k = \mathrm{ord}_{18}(7)$. φ(18) = 6. Thus k | 6, that is, k = 1, 2, 3 or 6. Plainly k ≠ 1 (1 is the only element of order 1 for any modulus). $7^2 \equiv 13 \pmod{18}$ and $7^3 \equiv 13 \cdot 7 = 91 \equiv 1 \pmod{18}$, so k = 3.

b) Next let’s find $k = \mathrm{ord}_{18}(5)$. Note $5^2 \equiv 7 \pmod{18}$, $5^3 \equiv -1 \pmod{18}$. Thus k = 6.

For composite moduli, the next theorem is convenient for calculating orders.

Theorem 4.6.3. Suppose $m = m_1m_2$ with $(m_1, m_2) = 1$ and (a, m) = 1. Then $\mathrm{ord}_{m_1m_2}(a) = [\mathrm{ord}_{m_1}(a), \mathrm{ord}_{m_2}(a)]$.

Proof. Just note that $a^k \equiv 1 \pmod{m_1m_2}$ is equivalent to the system $a^k \equiv 1 \pmod{m_1}$ and $a^k \equiv 1 \pmod{m_2}$. Any k satisfying the first is a multiple of $\mathrm{ord}_{m_1}(a)$ while any k satisfying the second is a multiple of $\mathrm{ord}_{m_2}(a)$. Thus the minimal such k is the least common multiple of these two orders. □

Example 4.6.3. Find $\mathrm{ord}_{21}(10)$. We must find the minimal k such that $10^k \equiv 1 \pmod{21}$, that is, $10^k \equiv 1 \pmod 3$ and $10^k \equiv 1 \pmod 7$. The first congruence is $1^k \equiv 1 \pmod 3$, which holds for any k, while the second, $3^k \equiv 1 \pmod 7$, requires 6 | k. Thus k = 6 is the minimal value.
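Orders for small moduli can be found by simply multiplying until the value 1 appears. The sketch below is illustrative; the function name `order` is our own, and it assumes m ≥ 2.

```python
from math import gcd

def order(a, m):
    """Smallest k >= 1 with a**k = 1 (mod m); requires gcd(a, m) = 1 and m >= 2."""
    if gcd(a, m) != 1:
        raise ValueError("the order exists only when gcd(a, m) = 1")
    k, power = 1, a % m
    while power != 1:
        power = (power * a) % m
        k += 1
    return k

def lcm(x, y):
    """Least common multiple of two positive integers."""
    return x * y // gcd(x, y)
```

With these, `order(10, 21)` can be compared against `lcm(order(10, 3), order(10, 7))` as in Theorem 4.6.3.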

4.7. Decimal Expansions

$\frac{1}{7} = .\overline{142857}$. In your first homework you discovered that the length of the repeating cycle in the decimal expansion of 1/p, where p is a prime, is a divisor of p − 1. Let’s see how we can predict the length of the cycle without ever finding the decimal expansion.

Theorem 4.7.1 (Decimal Expansions). Let $\frac{a}{b}$ be a fraction with 0 < a < b and (a, b) = 1. Say $b = 2^e5^fm$ with (m, 10) = 1, and that $\frac{a}{b}$ has a decimal expansion of the form

\[ \frac{a}{b} = .a_1a_2\ldots a_i\overline{c_1c_2\ldots c_k}, \]

with i, k minimal, that is, k is the (minimal) length of the repeating cycle, and the repeating cycle does not start earlier. Then i = max(e, f), and $k = \mathrm{ord}_m(10)$.


Note: The decimal expansion is called purely periodic if i = 0. By the theorem, this will occur iff (b, 10) = 1.

Corollary 4.7.1. If a/b is a fraction as given in Theorem 4.7.1 then k|φ(m).

Proof. Let $k = \mathrm{ord}_m(10)$. By Theorem 4.6.2 we have k | φ(m). □

Example 4.7.1. Consider $\frac{1}{7}$. Let $k = \mathrm{ord}_7(10) = \mathrm{ord}_7(3)$. k | 6, so k = 1, 2, 3 or 6, and one easily finds k = 6. Thus 1/7 is purely periodic with cycle of length 6.

Example 4.7.2. Consider $\frac{125}{336}$. Note $125 = 5^3$, $336 = 2^4 \cdot 3 \cdot 7$. Thus m = 21 and $k = \mathrm{ord}_{21}(10) = [\mathrm{ord}_3(10), \mathrm{ord}_7(10)] = [\mathrm{ord}_3(1), \mathrm{ord}_7(3)] = [1, 6] = 6$. Thus i = max(e, f) = 4, k = 6. One finds on a calculator $\frac{125}{336} = .3720\overline{238095}$. Having done the calculation of i and k ahead of time, we are assured that the answer on the calculator is exact.
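Theorem 4.7.1 can be checked against long division. The following sketch is illustrative only; the names `preperiod_and_period` and `decimal_digits` are our own, and the first function assumes the fraction a/b is already in lowest terms, so that only the denominator b matters.

```python
def preperiod_and_period(b):
    """Return (i, k) of Theorem 4.7.1 for a reduced fraction with denominator b."""
    e = f = 0
    m = b
    while m % 2 == 0:
        m //= 2
        e += 1
    while m % 5 == 0:
        m //= 5
        f += 1
    i = max(e, f)
    # k = ord_m(10); comparing with 1 % m also covers the case m = 1.
    k, power = 1, 10 % m
    while power != 1 % m:
        power = (power * 10) % m
        k += 1
    return i, k

def decimal_digits(a, b, count):
    """First `count` digits after the decimal point of a/b, by long division."""
    digits = []
    remainder = a % b
    for _ in range(count):
        remainder *= 10
        digits.append(str(remainder // b))
        remainder %= b
    return "".join(digits)
```

For 125/336 the first function gives i = 4 and k = 6, and the long-division digits show four non-repeating digits followed by a block of six.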

Proof of Theorem 4.7.1. Let $\frac{a}{b}$ have a decimal expansion as given in the theorem with i, k minimal. Then

\[ 10^i\,\frac{a}{b} = a_1\ldots a_i.\overline{c_1\ldots c_k} \]

and

\[ 10^{i+k}\,\frac{a}{b} = a_1\ldots a_ic_1\cdots c_k.\overline{c_1\ldots c_k}. \]

Subtracting, we get $10^i\frac{a}{b}(10^k - 1) \in \mathbb{Z}$. Thus $b \mid 10^ia(10^k - 1)$. Since (b, a) = 1 this is equivalent to $b \mid 10^i(10^k - 1)$, that is, $2^e5^fm \mid 10^i(10^k - 1)$. Thus, by Euclid’s lemma, $2^e5^f \mid 10^i$ and, since (m, 10) = 1, $m \mid (10^k - 1)$. Moreover, any i, k satisfying the last two conditions gives rise to such a decimal expansion. Thus, i is the minimal integer satisfying $2^e5^f \mid 10^i$ and k is the minimal positive integer satisfying $m \mid 10^k - 1$. Plainly, i = max(e, f) and $k = \mathrm{ord}_m(10)$. □

4.8. Primality Testing

How can we efficiently test whether a given 100 digit number is a prime? We need the following facts:

i) Fermat’s Little Theorem: If p is a prime and p ∤ a then $a^{p-1} \equiv 1 \pmod p$.

ii) If p is a prime and $x^2 \equiv 1 \pmod p$ then x ≡ ±1 (mod p).

iii) If p is an odd prime and p ∤ a then $a^{\frac{p-1}{2}} \equiv \pm 1 \pmod p$.

Proof. (i) Done. (ii) $p \mid (x^2 - 1)$ iff p | (x − 1)(x + 1) iff p | (x − 1) or p | (x + 1) iff x ≡ ±1 (mod p). (iii) Let $x \equiv a^{\frac{p-1}{2}} \pmod p$. Then by FLT $x^2 \equiv 1 \pmod p$ and so by (ii), x ≡ ±1 (mod p). □

Theorem 4.8.1 (Composite Number Test). Let m be a positive integer (we wish to test for primality) and b be any integer with (b, m) = 1. If $b^{m-1} \not\equiv 1 \pmod m$, then m is composite.

Proof. Proof by contradiction. Suppose that m is a prime. Since (b, m) = 1 we would then have by FLT that $b^{m-1} \equiv 1 \pmod m$, a contradiction. □

Definition 4.8.1. (i) A composite number m is called a pseudoprime to the base b if $b^{m-1} \equiv 1 \pmod m$ (that is, b satisfies the criterion in FLT).

(ii) A composite number m is called a Carmichael number if m is a pseudoprime to every base b relatively prime to m.


Note 4.8.1. (i) Carmichael numbers exist. You show in homework that 561 = 3 · 11 · 17 is a Carmichael number.

(ii) There exist infinitely many Carmichael numbers.

4.8.1. The Strong Pseudoprime Test for Primality. Let m be a given odd number we wish to test for primality. Start with base 2, and calculate $x \equiv 2^{\frac{m-1}{2}} \pmod m$. There are four options: If $x \not\equiv \pm 1 \pmod m$ then m is composite. If x ≡ −1 (mod m), pause and change base. If x ≡ 1 (mod m) and 4 ∤ (m − 1) then change base. If x ≡ 1 (mod m) and 4 | (m − 1) then calculate $y \equiv 2^{\frac{m-1}{4}} \pmod m$. Note $y^2 \equiv 1 \pmod m$, so if m is a prime we should have y ≡ ±1 (mod m), and we repeat the four options with y in place of x. The number of possible repetitions for a given base is at most the multiplicity of 2 dividing m − 1. The next base we test is 3, then 5, 7, 11, 13, ..., running through the primes.

By just using bases 2 and 3 we can test any number up to one million and arrive at a definitive conclusion as to whether it is a prime or not. Using bases 2, 3, 5, 7, 11 we can test any number up to $2 \cdot 10^{12}$. This algorithm runs extremely fast (microseconds) on a computer.
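The four options described above can be sketched in code as follows. This is an illustrative reading of the procedure, not the author's program: the names `passes_strong_test` and `is_probably_prime` are our own, the loop starts from the Fermat check at exponent m − 1 before halving, and small prime divisors are removed first so that every base tested is relatively prime to m.

```python
def passes_strong_test(m, b):
    """True if the odd number m > 2 is not exposed as composite by base b (gcd(b, m) = 1)."""
    exponent = m - 1
    if pow(b, exponent, m) != 1:
        return False  # fails Fermat's criterion, so m is composite
    while exponent % 2 == 0:
        exponent //= 2
        x = pow(b, exponent, m)
        if x == m - 1:
            return True   # reached -1: pause and change base
        if x != 1:
            return False  # a square root of 1 other than +-1: m is composite
    return True           # exponent became odd with x = 1: change base

def is_probably_prime(m, bases=(2, 3, 5, 7, 11)):
    """Run the strong test for each base; the bases are assumed to be primes."""
    if m < 2:
        return False
    for p in bases:
        if m % p == 0:
            return m == p
    return all(passes_strong_test(m, b) for b in bases)
```

Note that 561 satisfies Fermat's criterion for base 2 but is rejected by `passes_strong_test(561, 2)`.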

4.9. Public-Key Cryptography

RSA-Method: Rivest, Shamir, Adleman (1978).

1. First we need a way of changing words into numbers. This can be public and as simple as A = 01, B = 02, . . . , Z = 26, space = 00, etc. Thus the word “Hello” would become 0805121215, which we think of as the nine-digit number 805,121,215. Sentences are broken into pieces such that each piece becomes a number less than the modulus we are working with.

2. Each person selects two distinct primes p and q, each with say 200 digits, and multiplies them to create their public modulus m = pq (with 400 digits). Each person also selects an encoding exponent e relatively prime to the value L calculated in step 3. A public phone book is made listing each participant’s name, modulus m and encoding exponent e. The individual primes p and q are kept secret.

3. Each person P also calculates two secret values: (i) L := [p − 1, q − 1]. This value can be calculated since P knows the individual values p, q. (ii) The decoding exponent d is chosen so that de ≡ 1 (mod L), that is, $d \equiv e^{-1} \pmod L$. d exists since e was selected relatively prime to L.

4. Encoding the message: Suppose that person P wishes to send a message to person Q. Person P looks in the phone book for person Q’s m and e, and then chops his/her message into pieces smaller than m and relatively prime to m. Let M be one such piece. Person P then calculates the least residue $M_e$ of $M^e$ (mod m), that is,

\[ M_e \equiv M^e \pmod m, \qquad 0 < M_e < m. \]

$M_e$ is called the encoded message.

5. The encoded message is delivered to person Q in a public manner. Anyone is free to look at $M_e$, but it is undecipherable to anyone not having the decoding exponent d.

6. Person Q receives the message and calculates the least residue $M_d$ of $M_e^d$ (mod m), that is,

\[ M_d \equiv M_e^d \pmod m, \qquad 0 < M_d < m. \]


We claim that Md = M , that is, person Q has recovered the original message!

Proof. Since $M_d$ and M are less than m it suffices to show that $M_d \equiv M \pmod m$. Claim: $M^L \equiv 1 \pmod m$. This is equivalent to $M^L \equiv 1 \pmod p$ and $M^L \equiv 1 \pmod q$. Since L is a multiple of p − 1 we have, by FLT, $M^L \equiv 1 \pmod p$, and since L is a multiple of q − 1 we have $M^L \equiv 1 \pmod q$, completing the proof of the claim.

Now, since ed ≡ 1 (mod L), we have ed = 1 + kL for some integer k. Thus

\[ M_d \equiv M_e^d \equiv (M^e)^d = M^{ed} = M^{1+kL} = M\cdot(M^L)^k \equiv M \pmod m, \]

by the claim. □

Example 4.9.1. Let p = 31, q = 37, m = pq = 1147. L = [p − 1, q − 1] = [30, 36] = 180. Let e = 7. Note that (e, L) = 1. Select $d \equiv e^{-1} \pmod L$, so d = 103. Let’s send the message M = 805. $M_e \equiv 805^7 \equiv 650 \pmod m$. $M_d \equiv 650^{103} \equiv 805 \pmod m$.
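The numbers in this toy example can be reproduced in a few lines. This is a sketch for checking the arithmetic only (the variable names are ours, and such small primes offer no security); it relies on Python's built-in `pow` for modular powers and, from Python 3.8, modular inverses.

```python
from math import gcd

p, q = 31, 37
m = p * q                                  # public modulus
L = (p - 1) * (q - 1) // gcd(p - 1, q - 1) # L = lcm(p - 1, q - 1), kept secret
e = 7                                      # public encoding exponent, gcd(e, L) = 1
d = pow(e, -1, L)                          # secret decoding exponent

M = 805                                    # message piece, relatively prime to m
M_e = pow(M, e, m)                         # encoded message
M_d = pow(M_e, d, m)                       # decoded message
```

Running this gives the same values of d, $M_e$ and $M_d$ as in the example.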

4.9.1. Computing powers (mod m). An efficient way to compute powers (mod m) is to use the binary expansion of the power. Most computing software that handles modular arithmetic uses this method. Let’s illustrate the method with an example.

Example 4.9.2. Find $2^{149} \pmod m$. The binary expansion of 149 is given by

\[ 128 + 0\cdot 64 + 0\cdot 32 + 16 + 0\cdot 8 + 4 + 0\cdot 2 + 1 = 10010101_{\text{two}}. \]

By successive squaring we calculate $2^2 \pmod m$, $2^4 \pmod m$, $2^8 \pmod m$, $2^{16} \pmod m$, . . . , $2^{128} \pmod m$. Start with x = 1. If a digit 1 appears in the binary expansion then we replace x with x times the corresponding power of 2, as we go along. Thus, altogether we will have computed

\[ 1\cdot 2^1\cdot 2^4\cdot 2^{16}\cdot 2^{128} \equiv 2^{1+4+16+128} = 2^{149} \pmod m. \]
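The successive-squaring method can be sketched as follows. The function name `power_mod` is our own; it reads the binary digits of the exponent from the least significant end, which matches the description above, and its output can be compared with Python's built-in `pow(base, exponent, m)`.

```python
def power_mod(base, exponent, m):
    """Compute base**exponent mod m using the binary expansion of the exponent."""
    result = 1 % m
    square = base % m          # holds base**(2**j) mod m at step j
    while exponent > 0:
        if exponent & 1:       # current binary digit is 1
            result = (result * square) % m
        square = (square * square) % m
        exponent >>= 1         # move to the next binary digit
    return result
```

For the exponent 149 = 10010101 in binary, the loop multiplies `result` by the squares corresponding to 1, 4, 16 and 128.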

CHAPTER 5

Polynomial Congruences

We wish to solve the congruence

(5.1) f(x) ≡ 0 (mod m),

where f(x) is a polynomial with integer coefficients.

The three step process: Let $m = p_1^{e_1}\cdots p_k^{e_k}$.

(i) Solve the congruence $f(x) \equiv 0 \pmod{p_i}$ for each prime $p_i$.

(ii) Lift the solutions in (i) from (mod $p_i$) to solutions (mod $p_i^{e_i}$).

(iii) Use CRT to find all possible solutions (mod m) using the info from (ii).

Example 5.0.3. Solve $26x^3 + x^2 - 13x + 5 \equiv 0 \pmod{35}$. Note this is equivalent to solving the congruence (mod 7) and (mod 5).

(i) Solve (mod 5).

\[ 26x^3 + x^2 - 13x + 5 \equiv x^3 + x^2 + 2x = x(x^2 + x + 2) \pmod 5. \]

The quadratic has no zero (mod 5) (as seen by testing 0, 1, 2, 3, 4). Thus the only solution (mod 5) is x ≡ 0 (mod 5).

(ii) Next solve (mod 7). First note that

\[
\begin{aligned}
26x^3 + x^2 - 13x + 5 &\equiv 5x^3 + x^2 + x + 5 = 5(x^3 + 1) + x(x + 1) \\
&= (x + 1)\bigl(5(x^2 - x + 1) + x\bigr) = (x + 1)(5x^2 - 4x + 5) \pmod 7.
\end{aligned}
\]

Testing values shows that the only zero of the quadratic (mod 7) is again x ≡ −1 (indeed $5x^2 - 4x + 5 \equiv 5(x + 1)^2 \pmod 7$), and so the unique solution is x ≡ −1 (mod 7). By CRT we then find that the unique solution to the original congruence is x ≡ 20 (mod 35).
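For moduli this small, the answer can be confirmed by testing every residue. The sketch below is illustrative; the function name `solutions` and the convention of listing coefficients from the highest degree down are our own choices.

```python
def solutions(coeffs, m):
    """Least complete set of solutions of f(x) = 0 (mod m); coeffs are highest degree first."""
    found = []
    for x in range(m):
        value = 0
        for c in coeffs:       # Horner's rule, reducing mod m at each step
            value = (value * x + c) % m
        if value == 0:
            found.append(x)
    return found

f = [26, 1, -13, 5]            # 26x^3 + x^2 - 13x + 5
```

Calling `solutions(f, 5)`, `solutions(f, 7)` and `solutions(f, 35)` lists the solutions for each modulus in turn.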

5.1. Lifting solutions from (mod p) to (mod pe)

Let p be a prime and f(x) a polynomial with integer coefficients. Suppose that we wish to solve $f(x) \equiv 0 \pmod{p^2}$. Any solution must already be a solution (mod p). Let $x_1$ be an integer solution of the congruence

\[ f(x) \equiv 0 \pmod p. \]

We shall attempt to lift the solution $x_1$ to a solution (mod $p^2$), that is, find a point $x_2$ such that

(5.2) $x_2 \equiv x_1 \pmod p$ and $f(x_2) \equiv 0 \pmod{p^2}$.

Say $x_2 = x_1 + tp$ for some t ∈ Z. Can we choose t so that this is a solution (mod $p^2$)?

Recall from Calc II the Taylor expansion,

\[ f(a + y) = f(a) + f'(a)y + \frac{f''(a)}{2}y^2 + \cdots + \frac{f^{(n)}(a)}{n!}y^n, \]

for any a, y ∈ Z, where n is the degree of f. Note that since the coefficients on the left are all integers, so are the coefficients on the right. Inserting $a = x_1$, y = pt we obtain

\[ f(x_1 + tp) = f(x_1) + f'(x_1)tp + \frac{f''(x_1)}{2}(tp)^2 + \cdots \equiv f(x_1) + f'(x_1)tp \pmod{p^2}, \]

since all of the other terms are divisible by $p^2$. Thus we need to solve the congruence

(5.3) Lifting Congruence: $f'(x_1)t \equiv -\dfrac{f(x_1)}{p} \pmod p$.

The three possibilities:

(i) If $f'(x_1) \not\equiv 0 \pmod p$, then there is a unique solution t of (5.3) and hence a unique solution $x_2$ of (5.2) (mod $p^2$).

(ii) If $f'(x_1) \equiv 0 \pmod p$ and $f(x_1) \not\equiv 0 \pmod{p^2}$, then there is no solution of (5.3) and hence no solution of (5.2).

(iii) If $f'(x_1) \equiv 0 \pmod p$ and $f(x_1) \equiv 0 \pmod{p^2}$, then any value of t is a solution of (5.3), and hence there are p distinct solutions of (5.2) (mod $p^2$).

Suppose that we have constructed by induction a sequence of integers $x_1, x_2, \ldots, x_n$ such that

\[ f(x_i) \equiv 0 \pmod{p^i} \ \text{ for } i = 1, 2, \ldots, n, \quad\text{and}\quad x_{i+1} \equiv x_i \pmod{p^i} \ \text{ for } i = 1, 2, \ldots, n-1. \]

To continue we wish to find an $x_{n+1} = x_n + p^nt$ such that $f(x_n + p^nt) \equiv 0 \pmod{p^{n+1}}$. This amounts to solving

\[ f(x_n) + f'(x_n)p^nt \equiv 0 \pmod{p^{n+1}}, \]

or equivalently (noting that $f'(x_1) \equiv f'(x_n) \pmod p$)

\[ f'(x_1)t \equiv -\frac{f(x_n)}{p^n} \pmod p, \]

and so again we have three possibilities.

Definition 5.1.1. A solution $x_1$ of the congruence f(x) ≡ 0 (mod p) is called nonsingular if $f'(x_1) \not\equiv 0 \pmod p$ and singular if $f'(x_1) \equiv 0 \pmod p$.

Theorem 5.1.1. If $x_1$ is a nonsingular solution of the congruence f(x) ≡ 0 (mod p) then for any positive integer n there is a unique solution $x_n$ (mod $p^n$) of the congruence $f(x) \equiv 0 \pmod{p^n}$ such that $x_n \equiv x_1 \pmod p$.

Example 5.1.1. Solve the congruence $x^2 \equiv -1 \pmod{125}$. Start with $x^2 \equiv -1 \pmod 5$, which has solutions ±2. First let’s lift 2. Set x = 2 + 5t. $f(x) = x^2 + 1$, f(2) = 5, f′(2) = 4, and so the Lifting Congruence is 4t ≡ −1 (mod 5), which gives t ≡ 1 (mod 5), x ≡ 7 (mod 25). Next lift 7. Set x = 7 + 25t. f(7) = 50. The Lifting Congruence is 4t ≡ −50/25 (mod 5), so t ≡ 2 (mod 5) and x ≡ 57 (mod 125). Clearly, the second solution (obtained by lifting −2) is x ≡ −57 (mod 125).

Example 5.1.2. Solve $x^3 + x^2 + 23 \equiv 0 \pmod{5^3}$. Start with the same congruence (mod 5). By trial and error we see that x ≡ 1 or 2 (mod 5).

(i) Take $x_1 = 1$. Put x = 1 + 5t. Note that f′(1) = 5 ≡ 0 (mod 5), that is, 1 is a singular solution, while f(1)/5 = 5 ≡ 0 (mod 5). Thus we have option (iii), that is, the lifting congruence is 0t ≡ 0 (mod 5), so t is arbitrary and we get $x_2 = 1 + 5t = 1, 6, 11, 16, 21$. Now $f(1 + 5t)/25 = 5t^3 + 4t^2 + t + 1 \equiv 4t^2 + t + 1 \pmod 5$, and we see f(1 + 5t)/25 ≡ 0 (mod 5) iff t ≡ 3 (mod 5). Thus for $x_2 = 16$ we have option (iii) and get five liftings to solutions (mod 125), namely x ≡ 16, 41, 66, 91, 116 (mod 125).

If one continues this to (mod $5^4$) one discovers that all of the solutions (mod $5^3$) lift. Thus there are 25 solutions (mod 625) all living above $x_1 = 1$.

(ii) Since $x_1 = 2$ is a nonsingular solution, there is a unique lifting each time. We obtain $x_2 \equiv 17 \pmod{25}$ and $x_3 \equiv 42 \pmod{125}$, and (if we continue one more level) $x_4 \equiv 417 \pmod{625}$.

This information can be displayed in a tree graph with vertices 1 and 2 at the top and branches below for the (mod 25), (mod 125), (mod 625) liftings. Each of the five solutions (mod 125) above 1 branches out into five solutions (mod 625), while 42 lifts to 417 (mod 625).
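One step of the lifting procedure, including the three possibilities for the Lifting Congruence, can be sketched as below. The names `poly_eval`, `poly_derivative` and `lift` are illustrative, polynomials are given by their coefficients from the highest degree down, and p is assumed prime so that `pow(slope, -1, p)` (Python 3.8 or later) exists whenever the slope is nonzero.

```python
def poly_eval(coeffs, x):
    """Evaluate the polynomial with the given coefficients (highest degree first) at x."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def poly_derivative(coeffs):
    """Coefficients of the formal derivative, highest degree first."""
    degree = len(coeffs) - 1
    return [c * (degree - i) for i, c in enumerate(coeffs[:-1])]

def lift(coeffs, x, p, n):
    """Given f(x) = 0 (mod p**n), return all x + p**n * t, 0 <= t < p, solving f = 0 (mod p**(n+1))."""
    pn = p ** n
    fx = poly_eval(coeffs, x)
    if fx % pn != 0:
        raise ValueError("x is not a solution modulo p**n")
    slope = poly_eval(poly_derivative(coeffs), x) % p
    rhs = (-(fx // pn)) % p               # right side of the lifting congruence
    if slope != 0:                        # nonsingular: unique t
        t = (rhs * pow(slope, -1, p)) % p
        return [x + pn * t]
    if rhs == 0:                          # singular and 0*t = 0: every t works
        return [x + pn * t for t in range(p)]
    return []                             # singular and 0*t = nonzero: no lifting

f = [1, 1, 0, 23]                         # x^3 + x^2 + 23
```

Applying `lift` repeatedly, starting from 1 and from 2 with p = 5, traces out the branches of the tree described above.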

Example 5.1.3. Solve the congruence $f(x) = x^3 + 7x^2 + x \equiv x(x - 1)^2 \equiv 0 \pmod{3^2}$.

5.2. Counting Solutions of congruences

Example 5.2.1. Suppose that we wish to count the number of solutions of f(x) ≡ 0 (mod 35), where f(x) is a polynomial over Z. We start by solving the congruences f(x) ≡ 0 (mod 5) and f(x) ≡ 0 (mod 7). Say $a_1, a_2, \ldots, a_r$ are the solutions of the former, and $b_1, \ldots, b_s$ the solutions of the latter. By CRT, for any choice of i, j there is a unique x (mod 35) with

\[ x \equiv a_i \pmod 5, \]

\[ x \equiv b_j \pmod 7. \]

By the substitution principle, $f(x) \equiv f(a_i) \equiv 0 \pmod 5$ and $f(x) \equiv f(b_j) \equiv 0 \pmod 7$, and so f(x) ≡ 0 (mod 35). Thus, altogether, we obtain rs solutions (mod 35).

The content of this example is given in the following theorem.

Theorem 5.2.1. Let f(x) be a polynomial with integer coefficients and m a positive integer with factorization $m = p_1^{e_1}\cdots p_k^{e_k}$. Then

(i) x is a solution of the congruence

(5.4) f(x) ≡ 0 (mod m)

if and only if x satisfies the system of congruences

(5.5) $f(x) \equiv 0 \pmod{p_i^{e_i}}$, 1 ≤ i ≤ k.

(ii) Letting N(m) denote the number of solutions of (5.4) (mod m) and $N(p_i^{e_i})$ denote the number of solutions (mod $p_i^{e_i}$) of the i-th congruence in (5.5), we have $N(m) = \prod_{i=1}^{k} N(p_i^{e_i})$.

Proof. (i) $m \mid f(x) \Leftrightarrow p_i^{e_i} \mid f(x)$, 1 ≤ i ≤ k.

(ii) We claim that the CRT gives us a one-to-one correspondence between the k-tuples $(x_1, \ldots, x_k) \in \mathbb{Z}/(p_1^{e_1}) \times \cdots \times \mathbb{Z}/(p_k^{e_k})$ with $x_i$ a solution of (5.5) for 1 ≤ i ≤ k and the solutions x of (5.4). Indeed, suppose that $x_i$ is a solution of (5.5) for 1 ≤ i ≤ k, and let x (mod m) be the unique value with $x \equiv x_i \pmod{p_i^{e_i}}$, 1 ≤ i ≤ k. Such an x satisfies $f(x) \equiv f(x_i) \equiv 0 \pmod{p_i^{e_i}}$ for all i, and so f(x) ≡ 0 (mod m). Conversely, by (i), every solution x of (5.4) arises in this way from the tuple of its residues (mod $p_i^{e_i}$). □
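The product formula for N(m) is easy to check by brute force on small cases. The sketch below is illustrative; the function name `count_solutions` is our own, and coefficients are listed from the highest degree down.

```python
def count_solutions(coeffs, m):
    """Number of x in 0..m-1 with f(x) = 0 (mod m), by testing every residue."""
    count = 0
    for x in range(m):
        value = 0
        for c in coeffs:       # Horner's rule modulo m
            value = (value * x + c) % m
        if value == 0:
            count += 1
    return count

square_minus_one = [1, 0, -1]  # x^2 - 1
```

For $x^2 - 1$ and m = 105 = 3 · 5 · 7, the count modulo 105 can be compared with the product of the counts modulo 3, 5 and 7.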


5.3. Solving congruences (mod p)

As we saw above, in order to solve a polynomial congruence (mod m), one starts by solving congruences (mod p) where p is a prime. For small p this is generally done by trial and error. Another tool that can be useful is the factor theorem for congruences.

Theorem 5.3.1. Factor Theorem. Suppose that p is a prime, f(x) is a polynomial of degree d over Z, and that a is a solution of the polynomial congruence f(x) ≡ 0 (mod p). Then f(x) ≡ (x − a)g(x) (mod p), for some polynomial g(x) over Z of degree d − 1.

Note: To say two polynomials are congruent (mod p) means that all of the corresponding coefficients are congruent (mod p).

Proof. We are given that f(a) ≡ 0 (mod p). Thus f(a) = kp for some k ∈ Z. Let h(x) = f(x) − kp. Then a is a zero of h(x) and so by the factor theorem for Z, (x − a) is a factor of h(x), that is, h(x) = (x − a)g(x) for some polynomial g(x) over Z of degree d − 1. Thus f(x) = (x − a)g(x) + pk, that is, f(x) ≡ (x − a)g(x) (mod p). □

Definition 5.3.1. (i) We say that a is a zero of a polynomial f(x) (mod p) if f(a) ≡ 0 (mod p). In this case (x − a) is a factor of f(x) (mod p).

(ii) We say that a is a zero of f(x) (mod p) of multiplicity k if $(x - a)^k$ is a factor of f(x) (mod p), that is, $f(x) \equiv (x - a)^kg(x) \pmod p$ for some polynomial g(x), but $(x - a)^{k+1}$ is not a factor.

Example 5.3.1. (i) Let $f(x) = x^p - 1$. Since $p \mid \binom{p}{k}$ for 1 ≤ k ≤ p − 1 we have $x^p - 1 \equiv (x - 1)^p \pmod p$, and so 1 is a zero of f(x) (mod p) of multiplicity p.

(ii) Let $f(x) = x^{p-1} - 1$. By FLT, 1, 2, . . . , p − 1 are all zeros of f(x) (mod p), and so

\[ x^{p-1} - 1 \equiv (x - 1)(x - 2)\cdots(x - (p - 1)) \pmod p. \]

In particular, matching the constant terms on the RHS and LHS, we obtain Wilson’s Theorem, (p − 1)! ≡ −1 (mod p).

Theorem 5.3.2. Lagrange's Theorem. Let $p$ be a prime and let $f(x)$ be a polynomial of degree $d$ over $\mathbb{Z}$ whose leading coefficient is not divisible by $p$. Then the congruence $f(x) \equiv 0 \pmod{p}$ has at most $d$ distinct solutions (mod $p$).

Proof. The proof is by induction on $d$. When $d = 1$ the statement is trivial: a linear congruence $cx + b \equiv 0 \pmod{p}$ with $p \nmid c$ has exactly one solution (mod $p$). Suppose the statement is true for $d$ and now let $f$ be a polynomial of degree $d+1$ whose leading coefficient is not divisible by $p$. If $f$ has no zero (mod $p$) we are done. Otherwise, let $a$ be a zero of $f$ (mod $p$). Then, by the factor theorem, $f(x) \equiv (x-a)g(x) \pmod{p}$ for some polynomial $g(x)$ of degree $d$; comparing leading coefficients shows that the leading coefficient of $g$ is not divisible by $p$ either. By the induction assumption $g(x)$ has at most $d$ distinct zeros (mod $p$). Thus $f(x)$ has at most $d+1$ zeros, since if $f(x) \equiv 0 \pmod{p}$ then either $x - a \equiv 0 \pmod{p}$ or $g(x) \equiv 0 \pmod{p}$ (since $p$ is a prime). Thus either $x \equiv a \pmod{p}$, or $x$ is one of the zeros of $g(x)$ (mod $p$). ∎

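Note 5.3.1. Lagrange's Theorem fails for composite moduli. For example, if $f(x) = x^2 - 1$ and $m = p_1 \cdots p_k$ is a product of $k$ distinct odd primes, then $f(x) \equiv 0 \pmod{m}$ has $2^k$ distinct solutions (mod $m$), since by the Chinese Remainder Theorem the sign in $x \equiv \pm 1 \pmod{p_i}$ may be chosen independently for each $i$, even though $f(x)$ is just of degree 2.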
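The count in Note 5.3.1 is easy to confirm by brute force for small moduli. This is only a sketch: the function name `count_roots` is an illustrative choice, and coefficients are listed with the highest degree first.

```python
def count_roots(coeffs, m):
    """Count x in 0..m-1 with f(x) ≡ 0 (mod m); coeffs are highest degree first."""
    count = 0
    for x in range(m):
        value = 0
        for c in coeffs:              # Horner's rule, reducing mod m at each step
            value = (value * x + c) % m
        if value == 0:
            count += 1
    return count


f = [1, 0, -1]                        # x^2 - 1
print(count_roots(f, 7))              # prime modulus: 2
print(count_roots(f, 3 * 5))          # k = 2 odd primes: 4
print(count_roots(f, 3 * 5 * 7))      # k = 3 odd primes: 8
```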


Example 5.3.2. Solve $x^3 + x + 1 \equiv 0 \pmod{11}$. Plainly $x = 2$ is a solution, and so $(x-2)$ is a factor. By long division we obtain $x^3 + x + 1 \equiv (x-2)(x^2+2x+5) \pmod{11}$. By trial and error one can check that the quadratic has no solution. Thus $x \equiv 2 \pmod{11}$ is the only solution.
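The two steps of Example 5.3.2 (divide out a known root, then test the quotient by trial and error) can be sketched in Python as follows. The helper names are hypothetical, and synthetic division is used here as one convenient way to carry out the long division; coefficients are listed with the highest degree first.

```python
def evaluate(coeffs, x, p):
    """Evaluate a polynomial (coefficients highest degree first) at x modulo p."""
    value = 0
    for c in coeffs:
        value = (value * x + c) % p
    return value


def roots_mod_p(coeffs, p):
    """Trial and error: test every residue 0, 1, ..., p - 1."""
    return [x for x in range(p) if evaluate(coeffs, x, p) == 0]


def divide_by_linear(coeffs, a, p):
    """Synthetic division of f(x) by (x - a) modulo p.

    Returns (quotient coefficients, remainder); the remainder equals f(a) mod p.
    """
    carried = []
    value = 0
    for c in coeffs:
        value = (value * a + c) % p
        carried.append(value)
    return carried[:-1], carried[-1]


f = [1, 0, 1, 1]                          # x^3 + x + 1
quotient, remainder = divide_by_linear(f, 2, 11)
print(quotient, remainder)                # [1, 2, 5] 0, i.e. x^2 + 2x + 5
print(roots_mod_p(quotient, 11))          # []
print(roots_mod_p(f, 11))                 # [2]
```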

Example 5.3.3. Solve the congruence $x^3 + x + 1 \equiv 0 \pmod{31^2}$. Hint: Note that 3 is a solution (mod 31). Use the factor theorem and the quadratic formula to obtain others.

Example 5.3.4. Solve the congruence $x^{495} - x^{24} + 3 \equiv 0 \pmod{7}$. Hint: Use Fermat's Little Theorem to make life easier.
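The hint in Example 5.3.4 amounts to lowering exponents. Since $x^p \equiv x \pmod{p}$ for every integer $x$, any exponent $n \ge 1$ may be replaced by the exponent $r$ with $1 \le r \le p-1$ and $r \equiv n \pmod{p-1}$. The sketch below (function names are illustrative) performs that reduction and confirms that the reduced polynomial has the same roots as the original.

```python
def reduce_exponent(n, p):
    """Return r with x^n ≡ x^r (mod p) for every integer x.

    For n >= 1 the result satisfies 1 <= r <= p - 1; the exponent 0 is left alone.
    """
    if n == 0:
        return 0
    return (n - 1) % (p - 1) + 1


def roots(terms, p):
    """Roots mod p of a sparse polynomial given as (coefficient, exponent) pairs."""
    return [x for x in range(p)
            if sum(c * pow(x, e, p) for c, e in terms) % p == 0]


p = 7
original = [(1, 495), (-1, 24), (3, 0)]       # x^495 - x^24 + 3
reduced = [(c, reduce_exponent(e, p)) for c, e in original]
print(reduced)                                # [(1, 3), (-1, 6), (3, 0)]
print(roots(original, p) == roots(reduced, p))
```

Keeping the reduced exponent at least 1 matters only for $x \equiv 0$, where $x^n \equiv 0$ but $x^0 = 1$.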

5.4. Quadratic Residues and the Legendre Symbol

Definition 5.4.1. Let $p$ be a prime, $a \in \mathbb{Z}$, with $p \nmid a$. Then $a$ is called a quadratic residue (mod $p$) if $a \equiv x^2 \pmod{p}$ for some integer $x$. Otherwise $a$ is called a quadratic non-residue (mod $p$).

Example 5.4.1. 5 is a quadratic residue (mod 11) since $7^2 = 49 \equiv 5 \pmod{11}$.

Theorem 5.4.1. Let $p$ be an odd prime. Exactly $\frac{p-1}{2}$ of the nonzero values (mod $p$) are quadratic residues and $\frac{p-1}{2}$ are not.

Proof. The quadratic residues are $1^2, 2^2, \dots, (p-1)^2 \pmod{p}$. Note $x^2 \equiv y^2 \pmod{p}$ iff $x \equiv \pm y \pmod{p}$. Thus the values $1^2, 2^2, \dots, \left(\frac{p-1}{2}\right)^2 \pmod{p}$ are the distinct quadratic residues. ∎

Example 5.4.2. Find all quadratic residues (mod 11). We have $1^2 \equiv 1$, $2^2 \equiv 4$, $3^2 \equiv 9$, $4^2 \equiv 5$, $5^2 \equiv 3 \pmod{11}$, so the quadratic residues are 1, 3, 4, 5, 9.
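Following the proof of Theorem 5.4.1, it suffices to square $1, 2, \dots, \frac{p-1}{2}$. A short Python sketch (the function name is an illustrative choice) reproduces the list for $p = 11$:

```python
def quadratic_residues(p):
    """Distinct nonzero squares modulo an odd prime p, in increasing order."""
    half = (p - 1) // 2
    return sorted({(x * x) % p for x in range(1, half + 1)})


print(quadratic_residues(11))   # [1, 3, 4, 5, 9]
```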

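Definition 5.4.2. Let $p$ be a prime, $a \in \mathbb{Z}$, $p \nmid a$. The Legendre symbol is defined by

$$\left(\frac{a}{p}\right) = \begin{cases} 1 & \text{if } a \text{ is a quadratic residue (mod } p); \\ -1 & \text{if } a \text{ is a quadratic non-residue (mod } p). \end{cases}$$

Example 5.4.3. $\left(\frac{5}{11}\right) = 1$, $\left(\frac{2}{3}\right) = -1$.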

Theorem 5.4.2. Euler's Criterion. Let $p$ be an odd prime and $a \in \mathbb{Z}$ with $p \nmid a$. Then

$$\left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \pmod{p}. \tag{5.6}$$

Proof. We've already seen that the RHS is congruent to $\pm 1 \pmod{p}$ (by FLT). Suppose that $a$ is a quadratic residue, so that $a \equiv x^2 \pmod{p}$ for some integer $x$. Then $a^{\frac{p-1}{2}} \equiv x^{p-1} \equiv 1 \pmod{p}$, and so both sides of (5.6) are 1. Thus all $\frac{p-1}{2}$ quadratic residues are solutions of the congruence $x^{\frac{p-1}{2}} \equiv 1 \pmod{p}$. Since this is a polynomial congruence of degree $\frac{p-1}{2}$, it cannot have any other solutions by Lagrange's theorem. Thus for any quadratic non-residue (mod $p$) the RHS must be $-1$, agreeing with the LHS. ∎

Example 5.4.4. $\left(\frac{3}{13}\right) \equiv 3^6 \equiv (3^3)^2 \equiv 1 \pmod{13}$, so 3 is a quadratic residue. Indeed $4^2 \equiv 3 \pmod{13}$.
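Euler's criterion translates directly into a computation with modular exponentiation. The following is a minimal sketch using Python's built-in three-argument `pow`; the function name `legendre` is an illustrative choice, and the caller is assumed to supply an odd prime $p$.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p with p not dividing a,
    computed from Euler's criterion a^((p-1)/2) mod p."""
    if a % p == 0:
        raise ValueError("p must not divide a")
    r = pow(a, (p - 1) // 2, p)    # r is either 1 or p - 1
    return 1 if r == 1 else -1


print(legendre(3, 13))   # 1
print(legendre(5, 11))   # 1
print(legendre(2, 3))    # -1
```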


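Theorem 5.4.3. Multiplicative property of Legendre symbol. Suppose that $p$ is a prime and that $a, b \in \mathbb{Z}$ with $p \nmid ab$. Then

$$\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right).$$

Proof. Trivial for $p = 2$. Suppose $p$ is odd. By Euler's criterion we have

$$\left(\frac{ab}{p}\right) \equiv (ab)^{\frac{p-1}{2}} \equiv a^{\frac{p-1}{2}} b^{\frac{p-1}{2}} \equiv \left(\frac{a}{p}\right)\left(\frac{b}{p}\right) \pmod{p}.$$

Since the LHS and RHS are both $\pm 1$ and $p$ is odd, equality (as integers) follows. ∎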

Theorem 5.4.4. Trivial Properties of Legendre symbol. Suppose that $p$ is a prime.

(i) For any integer $a$ with $p \nmid a$, we have $\left(\frac{a^2}{p}\right) = 1$.

(ii) For any integers $a, b$ with $p \nmid a$ and $a \equiv b \pmod{p}$, we have $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$.

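Theorem 5.4.5. Legendre symbol for $-1$ and 2. a) For any odd prime $p$ we have

$$\left(\frac{-1}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \pmod{4}; \\ -1 & \text{if } p \equiv 3 \pmod{4}. \end{cases}$$

b) For any odd prime $p$ we have

$$\left(\frac{2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod{8}; \\ -1 & \text{if } p \equiv \pm 3 \pmod{8}. \end{cases}$$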

Proof. a) Suppose $p \equiv 1 \pmod{4}$, say $p = 1 + 4k$, $k \in \mathbb{N}$. Then $(-1)^{(p-1)/2} = (-1)^{2k} = 1$ and so by Euler's criterion, $-1$ is a quadratic residue (mod $p$). If $p \equiv 3 \pmod{4}$, say $p = 3 + 4k$, then $(-1)^{(p-1)/2} = (-1)^{2k+1} = -1$, so $-1$ is a quadratic non-residue.

b) Suppose that $p \equiv 1 \pmod{4}$, and so $p \equiv 1$ or $5 \pmod{8}$. We calculate $2^{(p-1)/2} \pmod{p}$ two different ways. Set

$$Q = 2 \cdot 4 \cdot 6 \cdots (p-1).$$

First note that $Q = 2^{(p-1)/2}\left((p-1)/2\right)!$. Also, noting that $\frac{p-1}{2}$ is even, we have

$$Q = \left(2 \cdot 4 \cdots \frac{p-1}{2}\right)\left(\frac{p+3}{2} \cdots (p-1)\right) \equiv \left(2 \cdot 4 \cdots \frac{p-1}{2}\right)\left(\frac{-(p-3)}{2} \cdots (-5)(-3)(-1)\right) \pmod{p}$$

$$\equiv (-1)^{\frac{p-1}{4}} \, 1 \cdot 2 \cdot 3 \cdot 4 \cdots \left(\frac{p-3}{2}\right)\left(\frac{p-1}{2}\right) \pmod{p}.$$

Equating the two expressions for $Q$ and canceling the $\left((p-1)/2\right)!$ we obtain

$$2^{\frac{p-1}{2}} \equiv (-1)^{\frac{p-1}{4}} \pmod{p}.$$

If $p \equiv 1 \pmod{8}$ then the RHS is 1, while if $p \equiv 5 \pmod{8}$ then the RHS is $-1$. The formula for $\left(\frac{2}{p}\right)$ then follows from Euler's criterion.


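Next, suppose that $p \equiv 3 \pmod{4}$, so that $\frac{p-3}{2}$ is even. Then

$$Q = \left(2 \cdot 4 \cdots \frac{p-3}{2}\right)\left(\frac{p+1}{2} \cdots (p-1)\right) \equiv \left(2 \cdot 4 \cdots \frac{p-3}{2}\right)\left(\frac{-(p-1)}{2} \cdots (-5)(-3)(-1)\right) \pmod{p}$$

$$\equiv (-1)^{\frac{p+1}{4}} \left(\frac{p-1}{2}\right)! \pmod{p}$$

and so

$$2^{\frac{p-1}{2}} \equiv (-1)^{\frac{p+1}{4}} \pmod{p}.$$

If $p \equiv 3 \pmod{8}$ then the RHS is $-1$, while if $p \equiv 7 \pmod{8}$ then the RHS is 1, completing the proof. ∎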
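Both formulas of Theorem 5.4.5 can be compared against Euler's criterion for a handful of small primes. This is only a numerical sanity check under the assumption that the listed values of $p$ are odd primes; the function names are illustrative.

```python
def euler_symbol(a, p):
    """(a/p) via Euler's criterion, for an odd prime p not dividing a."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def symbol_minus_one(p):
    """Part a): 1 if p ≡ 1 (mod 4), -1 if p ≡ 3 (mod 4)."""
    return 1 if p % 4 == 1 else -1


def symbol_two(p):
    """Part b): 1 if p ≡ ±1 (mod 8), -1 if p ≡ ±3 (mod 8)."""
    return 1 if p % 8 in (1, 7) else -1


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
for p in odd_primes:
    assert euler_symbol(-1, p) == symbol_minus_one(p)
    assert euler_symbol(2, p) == symbol_two(p)
print("formulas agree with Euler's criterion for", odd_primes)
```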

5.5. Quadratic Reciprocity

Consider the two congruences $x^2 \equiv 3 \pmod{1009}$ and $x^2 \equiv 1009 \pmod{3}$. Which one is easier to solve? Since $1009 \equiv 1 \pmod{3}$, the second congruence is just $x^2 \equiv 1 \pmod{3}$, which has solutions $x \equiv \pm 1 \pmod{3}$. Does knowledge of this give us any information about the first congruence, which cannot be simplified? Is there any relationship between these two congruences? To address the solvability of the first congruence we must calculate $\left(\frac{3}{1009}\right)$. We've already shown that $\left(\frac{1009}{3}\right) = 1$, but does this reveal any information about $\left(\frac{3}{1009}\right)$? Euler and Legendre observed a beautiful relationship between these two quantities, called the law of quadratic reciprocity. It says that if $p, q$ are distinct odd primes then $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ unless $p \equiv q \equiv 3 \pmod{4}$, in which case $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$. Thus, for our example above, since $1009 \equiv 1 \pmod{4}$, we conclude that $\left(\frac{3}{1009}\right) = 1$.

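Example 5.5.1. Let's investigate the relationship between $\left(\frac{p}{3}\right)$ and $\left(\frac{3}{p}\right)$ for various primes $p$. As noted above, calculating $\left(\frac{p}{3}\right)$ is easy. To calculate $\left(\frac{3}{p}\right)$ we use Euler's criterion (Theorem 5.4.2) and a calculator.

$$\begin{array}{c|rrrrrrrrrrrr}
p & 5 & 7 & 11 & 13 & 17 & 19 & 23 & 29 & 31 & 37 & 41 & 43 \\ \hline
\left(\frac{p}{3}\right) & -1 & 1 & -1 & 1 & -1 & 1 & -1 & -1 & 1 & 1 & -1 & 1 \\
\left(\frac{3}{p}\right) & -1 & -1 & 1 & 1 & -1 & -1 & 1 & -1 & -1 & 1 & -1 & -1
\end{array}$$

Note that if $p \equiv 1 \pmod{4}$ then the two values are equal, while if $p \equiv 3 \pmod{4}$ they have opposite signs.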

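Example 5.5.2. Let's do the same thing for $\left(\frac{p}{5}\right)$ and $\left(\frac{5}{p}\right)$ for various primes $p$.

$$\begin{array}{c|rrrrrrrrrrrr}
p & 3 & 7 & 11 & 13 & 17 & 19 & 23 & 29 & 31 & 37 & 41 & 43 \\ \hline
\left(\frac{p}{5}\right) & -1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 & 1 & -1 & 1 & -1 \\
\left(\frac{5}{p}\right) & -1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 & 1 & -1 & 1 & -1
\end{array}$$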

We see that the values are identical in this case!
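The rows of the two tables above can be regenerated with Euler's criterion in place of a calculator. This is a sketch under the assumption that all listed values are odd primes; `legendre` and `reciprocity_rows` are illustrative names.

```python
def legendre(a, p):
    """(a/p) via Euler's criterion, for an odd prime p not dividing a."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def reciprocity_rows(q, primes):
    """Return the rows [(p/q) for p in primes] and [(q/p) for p in primes]."""
    top = [legendre(p, q) for p in primes]
    bottom = [legendre(q, p) for p in primes]
    return top, bottom


primes_for_3 = [5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43]
primes_for_5 = [3, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43]

top3, bottom3 = reciprocity_rows(3, primes_for_3)
top5, bottom5 = reciprocity_rows(5, primes_for_5)

# For q = 3 the two rows agree exactly when p ≡ 1 (mod 4).
print([t == b for t, b in zip(top3, bottom3)] == [p % 4 == 1 for p in primes_for_3])
# For q = 5 the two rows are identical.
print(top5 == bottom5)
```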


By studying further examples of this type you will discover that whenever we start with a prime $q$ of the form $q \equiv 3 \pmod{4}$ we get the behavior of the first example (where $q = 3$), and whenever it is of the form $q \equiv 1 \pmod{4}$, we get identical values as in the second example (where $q = 5$). This leads us to formulate the Law of Quadratic Reciprocity.

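Theorem 5.5.1. Law of Quadratic Reciprocity. For any distinct odd primes $p, q$, $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ unless $p \equiv q \equiv 3 \pmod{4}$, in which case $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$. Equivalently,

$$\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)(-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.$$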

There are many proofs of quadratic reciprocity. Gauss was the first to prove it, and over his life he published (at least) six different proofs. I will let you read a proof of the theorem in the textbook if you are interested. It is considerably more involved than any proof we have done this semester. Our interest here is in using the law for evaluating Legendre symbols.

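Example 5.5.3. Find $\left(\frac{7}{1009}\right)$. Since $1009 \equiv 1 \pmod{4}$ and $1009 \equiv 1 \pmod{7}$, we have $\left(\frac{7}{1009}\right) = \left(\frac{1009}{7}\right) = \left(\frac{1}{7}\right) = 1$.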

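Example 5.5.4. Find $\left(\frac{227}{137}\right)$, noting that 137 is a prime and that $137 \equiv 1 \pmod{8}$.

$$\left(\frac{227}{137}\right) = \left(\frac{90}{137}\right) = \left(\frac{9}{137}\right)\left(\frac{10}{137}\right) = \left(\frac{10}{137}\right) = \left(\frac{2}{137}\right)\left(\frac{5}{137}\right) = \left(\frac{5}{137}\right) = \left(\frac{137}{5}\right) = \left(\frac{2}{5}\right) = -1.$$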
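The hand computation above uses four tools: reduction mod $p$, multiplicativity, the formula for $\left(\frac{2}{p}\right)$, and quadratic reciprocity. The sketch below mechanizes exactly those steps. It is an illustrative routine, not a method taken from the notes: the name `legendre_by_reciprocity` is hypothetical, the odd part of the numerator is factored by simple trial division (adequate only for small inputs), and $p$ is assumed to be an odd prime.

```python
def legendre_by_reciprocity(a, p):
    """Evaluate (a/p) for an odd prime p with p not dividing a."""
    a %= p                                   # (a/p) depends only on a mod p
    if a == 0:
        raise ValueError("p must not divide a")
    sign = 1
    # Multiplicativity: split off each factor of 2 and apply the formula for (2/p).
    while a % 2 == 0:
        a //= 2
        if p % 8 in (3, 5):
            sign = -sign
    # Multiplicativity again: factor the odd part into primes by trial division.
    q = 3
    while a > 1:
        if q * q > a:
            q = a                            # whatever is left is itself prime
        if a % q == 0:
            a //= q
            # Quadratic reciprocity: (q/p) = (p/q), negated if p ≡ q ≡ 3 (mod 4).
            flipped = legendre_by_reciprocity(p, q)
            if p % 4 == 3 and q % 4 == 3:
                flipped = -flipped
            sign *= flipped
        else:
            q += 2
    return sign


print(legendre_by_reciprocity(227, 137))     # -1
print(legendre_by_reciprocity(7, 1009))      # 1
```

Each recursive call has a strictly smaller prime in the denominator, so the recursion terminates.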

Corollary 5.5.1. The Legendre symbol for 3. For any prime $p > 3$ we have

$$\left(\frac{3}{p}\right) = \begin{cases} 1, & \text{if } p \equiv \pm 1 \pmod{12}; \\ -1, & \text{if } p \equiv \pm 5 \pmod{12}. \end{cases}$$

Proof. Suppose that $p \equiv 1 \pmod{4}$. Then by quadratic reciprocity $\left(\frac{3}{p}\right) = \left(\frac{p}{3}\right)$. Since 1 is the unique quadratic residue (mod 3), we see that if $p \equiv 1 \pmod{3}$ then $\left(\frac{3}{p}\right) = 1$, and if $p \equiv 2 \pmod{3}$ then $\left(\frac{3}{p}\right) = -1$. Now by CRT, if $p \equiv 1 \pmod{4}$ and $p \equiv 1 \pmod{3}$ then $p \equiv 1 \pmod{12}$, while if $p \equiv 1 \pmod{4}$ and $p \equiv 2 \pmod{3}$ then $p \equiv 5 \pmod{12}$.

Next, suppose that $p \equiv 3 \pmod{4}$. Then by quadratic reciprocity $\left(\frac{3}{p}\right) = -\left(\frac{p}{3}\right)$, which equals $-1$ if $p \equiv 1 \pmod{3}$, and 1 if $p \equiv 2 \pmod{3}$. Now, by CRT, if $p \equiv 3 \pmod{4}$ and $p \equiv 1 \pmod{3}$ then $p \equiv 7 \pmod{12}$, while if $p \equiv 3 \pmod{4}$ and $p \equiv 2 \pmod{3}$ then $p \equiv 11 \pmod{12}$. ∎

5.6. Representing primes as sums of two squares

When can a prime $p$ be expressed as a sum of two squares? It is easy to see that if $p$ is odd and $p = a^2 + b^2$, then since $a^2 \equiv 0$ or $1 \pmod{4}$ (and likewise for $b^2$), we must have $p \equiv 1 \pmod{4}$. Test such $p$: $5 = 1^2 + 2^2$, $13 = 2^2 + 3^2$, $17 = 1^2 + 4^2$, $29 = 5^2 + 2^2$, and so on. It is reasonable to conjecture the following result.

Theorem 5.6.1. Let $p$ be an odd prime. Then $p$ is a sum of two squares if and only if $p \equiv 1 \pmod{4}$.


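Proof. We saw above that an odd prime that is a sum of two squares must satisfy $p \equiv 1 \pmod{4}$. Conversely, suppose that $p \equiv 1 \pmod{4}$. Then $\left(\frac{-1}{p}\right) = 1$, so there exists a $u \in \mathbb{Z}$ with $u^2 \equiv -1 \pmod{p}$. Consider the set of integers of the form $x + uy$ with $x, y \in [0, \sqrt{p}] \cap \mathbb{Z}$. Since there are $([\sqrt{p}] + 1)^2 > p$ choices for $(x, y)$, there must exist, by the pigeonhole principle, distinct pairs $(x_1, y_1) \ne (x_2, y_2)$ with

$$x_1 + u y_1 \equiv x_2 + u y_2 \pmod{p}, \quad \text{that is,} \quad (x_1 - x_2) \equiv u(y_2 - y_1) \pmod{p}.$$

Set $a = x_1 - x_2$, $b = y_2 - y_1$. Then $|a| < \sqrt{p}$, $|b| < \sqrt{p}$ and $a \equiv ub \pmod{p}$. Therefore $a^2 + b^2 \equiv (1 + u^2)b^2 \equiv 0 \pmod{p}$ and $a^2 + b^2 < 2p$. Furthermore, since $(x_1, y_1) \ne (x_2, y_2)$, $a^2 + b^2 > 0$. Thus $a^2 + b^2 = p$. ∎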
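The proof is constructive enough to run. The sketch below follows it step by step: it finds $u$ by trial (an assumption made for simplicity; the proof only asserts that $u$ exists), then searches the pairs $(x, y)$ with $0 \le x, y \le [\sqrt{p}]$ for the collision guaranteed by the pigeonhole principle. Function names are illustrative, and $p$ is assumed to be a prime with $p \equiv 1 \pmod{4}$.

```python
from math import isqrt


def sqrt_of_minus_one(p):
    """Find u with u^2 ≡ -1 (mod p) by trial; assumes p is a prime ≡ 1 (mod 4)."""
    for u in range(2, p):
        if (u * u + 1) % p == 0:
            return u
    raise ValueError("no square root of -1 found; is p a prime ≡ 1 (mod 4)?")


def two_squares(p):
    """Follow the pigeonhole argument to write p = a^2 + b^2."""
    u = sqrt_of_minus_one(p)
    bound = isqrt(p)                      # the integer part [sqrt(p)]
    seen = {}                             # residue of x + u*y  ->  first pair (x, y)
    for x in range(bound + 1):
        for y in range(bound + 1):
            r = (x + u * y) % p
            if r in seen:                 # two distinct pairs with the same residue
                x1, y1 = seen[r]
                return abs(x - x1), abs(y - y1)
            seen[r] = (x, y)
    raise AssertionError("the pigeonhole principle guarantees a collision")


for p in (5, 13, 17, 29, 1009):
    a, b = two_squares(p)
    print(p, "=", a, "^2 +", b, "^2")
```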

APPENDIX A

Axioms for the set of Integers Z

We shall assume the following properties as axioms for the set of integers.

1] Addition Properties. There is a binary operation $+$, called addition, on $\mathbb{Z}$ satisfying

α) Addition is well defined, that is, given any two integers $a, b$, $a + b$ is uniquely defined. Thus, if $a = b$ and $c = d$ then $a + c = b + d$, $c + a = d + b$, $a + d = b + c$, $d + a = c + b$. In particular, if $a = b$ then $a + c = b + c$ and $c + a = c + b$ for any integer $c$. (We sometimes call this the substitution law.)

β) The set of integers is closed under addition. For any $a, b \in \mathbb{Z}$, $a + b \in \mathbb{Z}$.

a) Addition is commutative. For any $a, b \in \mathbb{Z}$, $a + b = b + a$.

b) Addition is associative. For any $a, b, c \in \mathbb{Z}$, $(a + b) + c = a + (b + c)$.

c) There is a zero element $0 \in \mathbb{Z}$, satisfying $0 + a = a = a + 0$ for any $a \in \mathbb{Z}$.

d) For any $a \in \mathbb{Z}$, there exists an additive inverse $-a \in \mathbb{Z}$ satisfying $a + (-a) = 0 = (-a) + a$.

Note: Axioms α and β are actually part of the definition of a binary operation and so they often are not included in the list of axioms. A binary operation on $\mathbb{Z}$ is just a function from the set of ordered pairs of integers into $\mathbb{Z}$.

Definition: Subtraction in $\mathbb{Z}$ is defined by $a - b = a + (-b)$ for $a, b \in \mathbb{Z}$.

2] Multiplication Properties. There is an operation $\cdot$ (or $\times$) on $\mathbb{Z}$ satisfying

α) Multiplication is well defined, that is, given any two integers $a, b$, $a \cdot b$ is uniquely defined. Thus, if $a = b$ then $ac = bc$ and $ca = cb$ for any integer $c$. (Also, if $a = b$ and $c = d$ then $ac = bd$, $ca = db$, $ad = bc$, $da = cb$. This is also sometimes called the substitution law.)

β) $\mathbb{Z}$ is closed under multiplication. For any $a, b \in \mathbb{Z}$, $a \cdot b \in \mathbb{Z}$.

a) Multiplication is commutative. For any $a, b \in \mathbb{Z}$, $ab = ba$.

b) Multiplication is associative. For any $a, b, c \in \mathbb{Z}$, $(ab)c = a(bc)$.

c) There is an identity element $1 \in \mathbb{Z}$ satisfying $1 \cdot a = a = a \cdot 1$ for any $a \in \mathbb{Z}$.

3] Distributive property. This is the one property that combines both addition and multiplication. For any $a, b, c \in \mathbb{Z}$, $a(b + c) = ab + ac$. One can deduce the additional distributive laws, $(a + b)c = ac + bc$, $a(b - c) = ab - ac$ and $(a - b)c = ac - bc$, from the other axioms.

4] Trichotomy Principle. The set of integers can be partitioned into three disjoint sets, $\mathbb{Z} = -\mathbb{N} \cup \{0\} \cup \mathbb{N}$, where


$\mathbb{N} = \{1, 2, 3, \dots\}$ = Natural Numbers = Positive Integers,

$-\mathbb{N} = \{-1, -2, -3, \dots\}$ = Negative Integers.

One then defines the inequalities $>$ and $<$ by saying $a > b$ if $a - b \in \mathbb{N}$ and $a < b$ if $a - b \in -\mathbb{N}$. Thus we get the Law of Trichotomy, which states that for any two integers $a, b$ exactly one of the following holds: $a < b$, $a = b$ or $a > b$ (that is, $a - b \in -\mathbb{N}$, $a - b = 0$ or $a - b \in \mathbb{N}$).

5] Positivity Axiom. The sum of two positive integers is positive. The product of two positive integers is positive.

6] Discreteness Properties.

a) Well Ordering Property of $\mathbb{N}$. Any nonempty subset of $\mathbb{N}$ has a smallest element. That is, given any nonempty subset $S$ of natural numbers, there exists an element $m \in S$ such that $m \le x$ for all $x \in S$.

b) Principle of Induction. Let $S$ be a subset of $\mathbb{N}$ such that (i) $1 \in S$ and (ii) $n \in S \Rightarrow n + 1 \in S$. Then $S = \mathbb{N}$.

c) Maximum Element Principle. If $S$ is a nonempty subset of integers bounded above, then $S$ has a maximum element. (Note, we say a set of integers $S$ is bounded above if there exists an integer $M$, not necessarily in $S$, such that $x \le M$ for all $x \in S$.)


Further Properties of $\mathbb{Z}$.

The properties below can all be deduced from the axioms above. You may assume them whenever needed.

7] Cancellation law for addition: If $a + x = a + y$ then $x = y$.
Cancellation law for multiplication: If $ax = ay$ and $a \ne 0$ then $x = y$.

8] Additive inverses are unique, that is, if $a, b, c$ are integers such that $a + b = 0$ and $a + c = 0$ then $b = c$.

9] Zero multiplication property: $a \cdot 0 = 0$ for any $a \in \mathbb{Z}$.

10] Zero divisor property (or integral domain property): If $ab = 0$ then $a = 0$ or $b = 0$.

11] Properties of negatives: (−a)b = −(ab) = a(−b), (−a)(−b) = ab.

12] The product of two negative integers is positive.

13] "FOIL" Law (and all similar distributive laws): For any integers $a, b, c, d$, $(a + b)(c + d) = ac + ad + bc + bd$.

14] Genassocomm Law: General Associative-Commutative Law:

a) Addition: When adding a collection of $n$ integers $a_1 + a_2 + \cdots + a_n$, the numbers may be grouped in any way and added in any order. In particular, the sum $a_1 + a_2 + \cdots + a_n$ is well defined, that is, no parentheses are necessary to specify the order of operations.

b) Multiplication: When multiplying a collection of $n$ integers $a_1 a_2 \cdots a_n$, the numbers may be grouped in any way and multiplied in any order. In particular, the product $a_1 a_2 \cdots a_n$ is well defined, that is, no parentheses are necessary to specify the order of operations.

15] Binomial Expansion: For any integers $a, b$ and positive integer $n$ we have

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + b^n,$$

where $\binom{n}{k}$ is the binomial coefficient, $\binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n(n-1)\cdots(n-k+1)}{k!}$. In particular,

$$(a + b)^2 = a^2 + 2ab + b^2,$$

$$(a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3.$$

16] Standard Factoring formulas, such as $a^2 - b^2 = (a - b)(a + b)$, $a^3 - b^3 = (a - b)(a^2 + ab + b^2)$, or

$$a^k - b^k = (a - b)(a^{k-1} + a^{k-2} b + \cdots + a b^{k-2} + b^{k-1}), \quad \text{for any } k \in \mathbb{N},\ k > 1,$$

$$a^k + b^k = (a + b)(a^{k-1} - a^{k-2} b + \cdots - a b^{k-2} + b^{k-1}), \quad \text{for any odd } k \in \mathbb{N}.$$

APPENDIX B

A Little Bit of Logic for Math 506

∃ there exists ∃! there exists a unique

∀ for all ⇒ implies

⇔ equivalent to iff if and only if

Let P, Q be statements and let ∼P, ∼Q denote their negations (called "not P" and "not Q"). For example, if P = "x is an integer" and Q = "x is a real number", then ∼P = "x is not an integer" and ∼Q = "x is not a real number".

Q ⇒ P is called the converse of P ⇒ Q.

∼Q ⇒ ∼P is called the contrapositive of P ⇒ Q.

The following are equivalent: (Consider the above example.)

P ⇒ Q (P is true implies Q is true.)

If P then Q (If P is true then Q is true.)

P only if Q (P is true only if Q is true.)

P is sufficient for Q

Q is necessary for P

∼ Q⇒ ∼ P (Q is false implies that P is false.)

If ∼ Q then ∼ P , (If Q is false then P is false.)
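The equivalence of P ⇒ Q with its contrapositive, and the failure of the equivalence with its converse, can be seen from a four-row truth table. The sketch below prints that table; the helper `implies` is an illustrative name encoding the usual truth-functional reading of ⇒.

```python
from itertools import product


def implies(p, q):
    """Material implication: P ⇒ Q is false only when P is true and Q is false."""
    return (not p) or q


print("P      Q      P⇒Q    ∼Q⇒∼P  Q⇒P")
for P, Q in product([True, False], repeat=2):
    statement = implies(P, Q)
    contrapositive = implies(not Q, not P)
    converse = implies(Q, P)
    print(f"{P!s:6} {Q!s:6} {statement!s:6} {contrapositive!s:6} {converse!s:6}")
```

The columns for P ⇒ Q and ∼Q ⇒ ∼P agree in every row, while the converse differs in the rows where exactly one of P, Q is true.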

Proof by Contrapositive (often loosely called proof by contradiction). In order to prove P ⇒ Q one proves the equivalent statement ∼Q ⇒ ∼P. Thus the proof starts by assuming Q is false, and ends with the conclusion that P is false.

The following are equivalent: (For example, let P = "$x^2 = 1$", Q = "$x = \pm 1$".)

P ⇔ Q

P if and only if Q

P iff Q

P is necessary and sufficient for Q.

∼ P ⇔∼ Q.
