Number Theory Almost Final Dra˘

Number TheoryAlmost Final Dra�

Shu Sasaki

23rd April 2021

Contents

1 Introduction 2

2 Samplers 4

3 Revision 43.1 Euclid’s algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53.2 Primes and factorisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73.3 Congruences and modular arithmetic . . . . . . . . . . . . . . . . . . . . . . . 83.4 Congruence equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103.5 The Chinese Remainder Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 113.6 Prime numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123.7 Appendix: Rings and �elds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123.8 Appendix 2: Euclidian algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 15

4 Euler’s totient function and primitive roots 154.1 Primitive roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

5 Quadratic residues and non-residues, Gauss reciprocity law 225.1 the Legendre symbol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245.2 Solving the equation G2 ≡ 0 mod ? . . . . . . . . . . . . . . . . . . . . . . . . . 265.3 Hensel’s lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

6 Finite continued fractions 30

7 In�nite continued fractions 377.1 Periodic continued fraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

8 Diophantine approximation 438.1 Periodic continue fraction II . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

9 Pell’s equation 489.1 Part I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499.2 Part II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519.3 Appendix: A proof of Theorem 48 (NON-EXAMINABLE) . . . . . . . . . . . . . 54

1

10 Sums of squares 5710.1 Hermite’s algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5910.2 More sums of squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

11 Algebraic number theory 6311.1 Quadratic number �elds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6511.2 The ring of integers in ℚ(

√3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

12 Supplementary notes 1 (revision notes) 7112.1 Relations and a few bits and bobs . . . . . . . . . . . . . . . . . . . . . . . . . 7112.2 Rings and �elds (from notes) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

13 Supplementary notes 2 (revision notes) 74




17 Supplementary notes 6 (revision notes) 8817.1 Pell’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8817.2 Sum of squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9017.3 Algebraic number theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

18 Supplementary notes 7 (mock exam paper) 92

1 Introduction

Number theory can be thought of as having its roots in the study of Diophantine equations. Dio-phantine equations are polynomial equations with rational coe�cients which we seek to solvewhile we insist the solutions should again be rational numbers.

Number theory has a long history. The Greeks knew, by about 400BC, that -2 − 2 = 0 hasno solution in rational numbers (the solutions de�nes a parabola).

Babylonians before 1600BC were already interested in rational solutions to the equation

-2+.2 = 1 (e.g. (-,. ) = (1, 0), (35,4

5), ( 5

13,12

13), . . . and the Greeks knew that there are in�nitely

many of them!), with a stone tablet to prove it.An Arab manuscript around 972 AD, more or less, asks, for which integer #, is there a right

angled triangle with area # whose sides have rational length? Algebraically, this amounts tosolving the simultaneous equations

-2 + .2 = /2

and-. = 2#

in positive rational numbers. In fact, the problem boils down to solving the equation

.2 = -3 − #2-

2

in non-zero rational numbers, and the equation de�nes an example of what we call ellipticcurves.

Number theory is full of surprises and one does not have to be an expert in number theoryto witness them. Let me give you another example. Let � denote the equation

.2 + . = -3 − -2

and #? denote the number of solutions to

.2 + . ≡ -3 − -2 mod ?.

Here is a table? 2 3 5 7 11 13 17 19 · · ·

? − #? −2 −1 1 −2 1 4 −2 0 · · ·

On the other hand, consider the following in�nite product in @

5 = @

∞∏==1

(1 − @=)2(1 − @11=)2

and it is an exercise in binomial expansions to �nd it equals

@−2@2−@3+2@4+@5+2@6−2@7−2@9−2@10+@11−2@12+4@13+4@14−@15−4@16−2@17+4@18+· · · .

Can you see a pattern? The number ? − #? is just the coe�cient of the ?-th power of @. Isthis a coincidence, or is there any explanation for it? The Shimura-Taniyama conjecture (it is atheorem of C. Breuil, B. Conrad, F. Diamond, R. Taylor and A. Wiles) asserts that any equationof the form

.2 +�. = -3 + �-2 + �- +�,

where�, �,�,� ∈ ℚ, corresponds to a power series like 5 in a similar manner.

Number Theory has become an extremely technical subject drawing on techniques fromall over mathematics. Nonetheless, it retains, at the heart of the subject, a particular beautyand elegance in its simplicity of messages it inspires in people (Gauss called number theorythe ‘queen of mathematics’): you might have heard the following:

• Fermat’s Last Theorem (P. Fermat, L. Euler, ..., A. Wiles): the equation -= + .= = /= hasno solutions in integers when = ≥ 3.

• the Twin prime conjecture (..., Y. Zhang, J. Maynard, T. Tao, B. Green, ... not completelyproved yet): there exist in�nitely many pairs of primes that di�er by 2 (e. g. {3, 5} and{17, 19}).

• the Goldbach conjecture (open): every integer (> 2) is the sum of two primes.

• the Riemann hypothesis (open): the “non-trivial” zeros of the Riemann zeta function

Z (B) =∞∑==1

1

=Bin B ∈ ℂ all have real part

1

2(this is one of the seven Millennium prob-

lems; if you solve one of these, you will receive one million dollars from the Clay MathsInstitute).

3

They are extremely hard to prove. For example, it took about 350 years for the FLT to beproved completely and this is the only one in the list that has been proved! David Hilbert, oneof the greatest mathematicians in the 19th and early 20th centuries, famously said “If I were toawaken a�er having slept for a thousand years, my �rst question would be: has the RiemannHypothesis been proven?”.

Topics such as theLanglandsprogram, theBirch-Swinnerton-Dyer conjecture, theFontaine-Mazur conjecture and the abc conjecture are right at the centre of very active research in num-ber theory that continues to inspire many researchers in mathematics.

There are a lot of number theorists in

https://www.bbc.co.uk/programmes/b00srz5b/episodes/player

In my youth, I spent a lot of time reading entries in

https://mathshistory.st-andrews.ac.uk

I occasionally stumble upon articles from

https://www.quantamagazine.org/tag/number-theory/

2 Samplers

So what exactly are we going to learn? Inevitably, they are all about numbers (by which I nor-mally mean integers/natural numbers). Let ? be an odd prime number. We will answer ques-tions such as

• (quadratic reciprocity) Given a number 0, is it congruent to the square of an integer mod? (e.g. −1 ≡ 52 mod 13 but no square is congruent to −1 mod 19)? Is there an easy way(i.e., easier than checking all ? mod ? residues) to �gure this out?

• (Continued fraction and Diophantine approximation) How closely can √? be approxim-ated by a rational number (e.g

√2 is approximately

141421

100000but

1393

985is an even better

approximation with much smaller numerator and denominator)? What do we mean ex-actly by ‘good’ approximation?

• (Pell equations) Does the equation G2 − ?H2 = 1 have a solution? What about G2 − ?H2 = −1(e.g. 182 − 13 · 52 = −1 but there is no solution to G2 − 19H2 = −1)?

• (Representations of primes as sums of squares) Can we express ? in the form G2 + H2 forsome natural numbers G and H (e.g. 13 = 32 + 22 but 19 cannot be)?

3 Revision

Let ℕ = {1, 2, . . . } be the set of natural numbers. Let ℤ = {· · · ,−2,−1, 0, 1, 2, . . . } be the set ofintegers.

4

3.1 Euclid’s algorithm

If 0 ∈ ℕ ∪ {0} and 1 ∈ ℕ, there exists unique numbers @, A ∈ ℕ ∪ {0} such that

0 = 1@ + A

with0 ≤ A < 1.

Remark. The assertion holds for pair of integers 0, 1 with 1 > 0. See Appendix 2.

The @ (resp. A) is o�en referred to as the quotient (resp. remainder) when we divide 0 by 1.When 1 divides 0 (the residue A is 0), we o�en write

1 |0.

Note that 0 |1means something else. Also do not confuse this with 1/0 which is a rational num-ber with numerator 1 and denominator 0 (I will typically write

0

1though).

The highest common factor, or the greatest common divisor in this course, 3 = gcd(0, 1)of two integers (not necessarily positive!) 0 and 1 is 3 ∈ ℕ ∪ {0} characterised by the followingproperties:

• 3 |0 and 3 |1,

• if 4 is a natural number satisfying 4 |0 and 4 |1, then 4 |3.

When gcd(0, 1) = 1, we o�en say that 0 and 1 are relatively prime, or coprime.

Example. gcd(4, 6) = 2. The only integers that divide 4 and 6 are ±1 and ±2. They all divide2.

Remark. Note that, by de�nition, gcd is a non-negative integer. It is certainly possible tode�ne it to bemerely an integer satisfying the properties above (in which case 2 and −2 are bothgcd(4, 6) ) but it would be more convenient for everyone to talk about the gcd.

Example. gcd(0, 0) = 0. Indeed, for any = ∈ ℤ, gcd(=, 0) = =.

Remark. Do you know gcd(0, 1) = gcd(−0, 1) = gcd(0,−1) = gcd(−0,−1)?

Euclid’s algorithm is a procedure to �nd the gcd systematically, given 0, 1 ∈ ℕ. The al-gorithm is based on the observation that

gcd(0, 1) = gcd(1, A)

where A is the unique integer satisfying 0 = 1@ + A with 0 ≤ A < 1. See Appendix 2.

Exercise. Find gcd(225, 157).

225 = 157 · 1 + 68157 = 68 · 2 + 2168 = 21 · 3 + 521 = 5 · 4 + 15 = 1 · 5 + 0

5

Repeatedly applying the ‘key observation’, one sees gcd(225, 157) = gcd(157, 68) = gcd(68, 21) =gcd(21, 5) = gcd(5, 1) = 1.

Exercise. What is gcd(123, 456)? What is gcd(123,−456)?

456 = 123 · 3 + 87123 = 87 · 1 + 3687 = 36 · 2 + 1536 = 15 · 2 + 615 = 6 · 2 + 36 = 3 · 2 + 0

On the other hand,

−456 = 123 · (−4) + 36123 = 36 · 3 + 1536 = 15 · 2 + 615 = 6 · 2 + 36 = 3 · 2 + 0

Euclid’s algorithm also �nds a pair of integers A and B such that

0A + 1B = gcd(0, 1).

In the �rst example above, we work back up the chain:

1 = 21 − 5 · 4= 21 − (68 − 21 · 3) = 21 · 13 − 68 · 4= (157 − 68 · 2) − 68 · 4 = 157 · 13 − 68 · 30= 157 · 13 − (225 − 157) · 30 = 157 · 43 − 225 · 30

So B = −30 and C = 43 work. In fact:

Proposition 1 For any 3 ∈ ℕ and 0, 1 ∈ ℤ, the following are equivalent:

• the equation 0G + 1H = 3 is soluble in G and H.

• gcd(0, 1) divides 3.

Proof. For brevity, let 6 denote gcd(0, 1). Firstly, suppose that there exist A, B in ℤ such that0A + 1B = 3. Since any common divisor of 0 and 1 divides the LHS, it divides 3 on the RHS. Inparticular, 6, the greatest common divisor of 0 and 1, divides 3.

Conversely, suppose that 6 divides 3. Let 3 = I6 for some I ∈ ℤ. By Euclid’a algorithm, onecan �nd A, B in ℤ such that 0A + 1B = 6. Multiplying the equation by I, we have

I6 = I(0A + 1B) = 0(IA) + 1(IB).

Since 3 = I6, the pair (G, H) = (IA, IB) de�nes a solution for the equation 0G + 1H = 3.�

One can extend these concepts to more than two numbers: if 01, . . . , 0# ∈ ℕ, then we havea gcd 3 = gcd(01, . . . , 0# ) such that 01G1 + · · · 0# G# = 3.

6

3.2 Primes and factorisation

A natural number ? ∈ ℕ ∪ {0} is said to be prime if

• ? > 1,

• if ? = 01 holds for some 0, 1 ∈ ℤ, then we either have (0, 1) = (?, 1), (−?,−1), (1, ?) or(−1,−?).

We are going to show that every positive integer greater than 1 can be factored into primes,and the factorisation is unique up to the possibility of writing the factors in a di�erent order(e.g. 12 = 2 · 2 · 3 = 2 · 3 · 2 = 3 · 2 · 2). This innocuous ‘fact’ is in fact known as the FundamentalTheorem of Arithmetic. We will prove this rather carefully.

Proposition 2 (Bezout’s identity) Given 0, 1 ∈ ℤ, there exist integers A, B such that

0A + 1B = gcd(0, 1).

Proof. See Example sheet 1. �

Lemma 3 Let ? be a prime. If ? |01, then either ? |0 or ? |1 holds.

Proof. Suppose that ? does not divide 0. It su�ces to prove, assuming ? divides 01, that ?divides 1. Since ? does not divide 0, gcd(0, ?) = 1. It therefore follows that there exists A, B in ℤ

such that 0A + ?B = 1. Multiplying both sides by 1, we obtain

1 = 1(0A + ?B) = 01A + ?1B.

Since ? divides 01, it divides the term 01A. Also ? certainly divides ?1B. It therefore followsthat ? divides 1. �

Lemma 4 Let ? be a prime. If ? |01 · · · 0# , then ? |0= for some 1 ≤ = ≤ #.

Proof. By the lemma, ? divides 01 or the product 02 · · · 0# . If ? divides 01, we are done.Otherwise ? divides 02 · · · 0# . Using the lemma again, ? therefore divides either 02 or 03 · · · 0# .Repeat the argument. �

Theorem 5 (The Fundamental Theorem of Arithmetic) Any natural number greater than 1 can bewritten as a product of prime numbers, and this product expression is unique apart from re-orderingof the factors.

Proof. We prove the existence of prime factorisation by induction. Let # be a natural num-ber. Suppose that the statement of the theorem holds for any natural number ≤ # − 1. If #itself is a prime, then there is nothing to prove. If # is not a prime, it is a product of two integerseach of which is < #. For these integers, we know from the inductive hypothesis that thesetwo numbers are indeed product of prime numbers. Putting them together, # is a product ofprime numbers.

To prove the uniqueness of prime factorisation, let # = ?1 · · · ?A = @1 · · · @B be prime factor-isations of #. Since ?1 divides @1 . . . @B, it follows from the lemma above that ?1 divides @= forsome 1 ≤ = ≤ B. By re-ordering @’s if necessary, we may assume that ?1 divides @1. Since theyare both positive integers, ?1 = @1. Repeat the argument, starting with ?2 · · · ?A = @2 · · · @B. �

7

3.3 Congruences and modular arithmetic

Let = be a positive natural number. We say that 0, 1 ∈ ℤ are congruent mod = if =| (0 − 1) andwrite

0 ≡ 1 (mod =).

Fact. Congruence mod = is an equivalence relation; the equivalence classes (there are = ofthem, corresponding to the = possible remainders, 0, 1, . . . , =− 1, when we divide a number by=) are called congruence classes modulo =. We denote by [0]= the congruence class modulo =that contains 0. As a set

[0]= = {. . . , 0 − 2=, 0 − =, 0, 0 + =, 0 + 2=, . . . }.

By de�ntion,

· · · = [0 − 2=]= = [0 − 1=]= = [0]= = [0 + =]= = [0 + 2=]= = · · · .

We let ℤ= denote the set of all congruence classes mod = (I would write it ℤ/=ℤ but I willfollow others to be consistent); it is a ring (see Section 3.7) with addition

[0]= + [1]= = [0 + 1]=

and multiplication:[0]= · [1]= = [01]=.

Example. ℤ4. For example,

[1]4 = {. . . ,−11,−7,−3, 1, 5, 9, . . . , }

and+ 0 1 2 3

0 0 1 2 31 1 2 3 02 2 3 0 13 3 0 1 2

× 0 1 2 3

0 0 0 0 01 0 1 2 32 0 2 0 23 0 3 2 1

Proposition 6 If ? is a prime, then ℤ? is a �eld.

Proof. The hardest part is to show that any non-zero element of ℤ? has multiplicative in-verse. Let [0] be a non-zero element; this amounts to assuming that ? does not divide 0. Sincegcd(0, ?) = 1, there exists A, B in ℤ such that 0A + ?B = 1 (Bezout). This means that 0A ≡ 1mod?. Phrased in terms of the corresponding congruence classes, it follows that [0] [A] = [1], i.e.[0] has an inverse [A]. �

Remark. Observe that this proof is constructive, and the key input is Bezout’s identity/Euclid’salgorithm.

8

Example. If ℤ? is a �eld, any non-zero element has an inverse. Let ? = 157. Then the con-gruence class [225]157 de�nes a non-zero element inℤ157. What is the inverse? In other words,what is G ∈ ℤ such that [225]157 [G]157 = [1]157? Recall from earlier that Euclid’s algorithm gaveus 157 ·43−225 ·30 = 1. Reducingmod 157, we have [−30]157 [225]157 = [1]157, so G = −30works.

One of my favourite theorems in number theory:

Theorem 7 (Fermat’s Little Theorem) Let ? be a prime number. Then I? ≡ I mod ? for any I ∈ ℕ.

Remark. Do not confuse this with Fermat’s Last Theorem! In theory, both can be abbrevi-ated as ‘FLT’.

Proof. If ? divides I, or equivalently I is congruent to 0mod ?, the assertion clearly holds.We therefore suppose that I is not congruent to 0mod ?. Consider the set

{I, 2I, . . . , (? − 1)I}

and the set{A1, . . . , A?−1}

of ‘residues’ where 1 ≤ A 9 ≤ ?−1 is de�ned to be the residue (in the range [1, ?−1]) of 9 I whendivided by ?.

{A1, . . . , A?−1} = {1, . . . , ? − 1} By de�nition, we only know that {A1, . . . , A?−1} is a subset of{1, . . . , (?−1)} but they are indeed equal. To see this, it su�ces to establish if A8 = A 9 (i.e. 8I ≡ 9 Imod ?), then 8 ≡ 9 mod ? (so that 8 = 9 since 1 ≤ 8, 9 ≤ ? − 1). But this follows immediatelysince I is coprime to ? and I has inverse mod ?.

The upshot of this observation is that, for every 1 ≤ 9 ≤ ? − 1, there exists a unique1 ≤ 8 ≤ ? − 1 such that 9 I ≡ A8.

In particular, the product of all elements in {I, 2I, . . . , (?−1)I} and {A1, . . . , A?−1} = {1, . . . , ?−1}must coincide mod ?. This therefore results in

?−1∏9=1

9 ≡?−1∏9=1

I 9 = I?−1?−1∏9=1

9 .

Every 9 (1 ≤ 9 ≤ ? − 1) has inverse mod ?; therefore, by multiplying both sides of the congru-ence identity by these inverses, we obtain 1 ≡ I?−1 mod ?. Multiplying I on both sides, I ≡ I?mod ?. �

Here’s another proof.

Proof. We may suppose that I is not congruent to 0 mod ?. The congruence class [I] is anon-zero element of ℤ?. It then follows from the Lagrange’s theorem (see the Introduction toAlgebra notes) that the order of [I] divides the order, ?−1, of the multiplicative group ofℤ?. Ittherefore follows that [1] = [I] ?−1 = [I?−1]. In other words, I?−1 ≡ 1mod ?. Multiplying bothsides by I, we have I? ≡ I mod ?. �

9

Example. Let ? = 7. If 0 ≡ 1mod ?, i.e., 0 and 1 belong to the same congruence class, then0? ≡ 1?. So it su�ces to check on representatives 0 ≤ A ≤ ? − 1.

I 0 1 2 3 4 5 6

I7 0 1 128 2187 16384 78125 279936

I7 = 7@ + A 7 · 0 + 0 7 · 0 + 1 7 · 18 + 2 7 · 312 + 3 7 · 234 + 4 7 · 11160 + 5 7 · 39990 + 6

One possibly surprising application of Fermat’s Little Theorem is that, given a number #,there is a chance that we will know # is composite (i.e. not a prime). All one has to do is spota natural number I such that I# is not congruent to I mod #. For if # were a prime, then I#would be congruent to I mod # for any I.

Be careful that it is certainly possible that I# ≡ I mod # holds for some I even if # is not aprime number. For example, 36 = 729 ≡ 3mod 6 (see Example Sheet 1). What if I# ≡ I mod #holds for any I? Does it mean that # is a prime number?

Example. 32047 ≡ 992 (mod 2047), hence 2047 is not a prime number.

3.4 Congruence equations

Proposition 8 Let 0, = be natural numbers and let 1 be an integer. The congruence equation 0G ≡ 1mod = is soluble if and only if gcd(0, =) divides 1.

Proof. Suppose that 0G ≡ 1 mod = is soluble in G, i.e., there exists an integer A such that0A ≡ 1 mod =. In other words, 0A + I= = 1 for some I ∈ ℤ. A common divisor of 0 and =, inparticular, gcd(0, =), divides the LHS and therefore the RHS.

Conversely, suppose that 6 = gcd(0, =) divides 1. We may then let 0 = 60′, 1 = 61′ and= = 6=′. It su�ces to see that the congruence equation 0′G ≡ 1′ mod =′ is soluble. Sincegcd(0′, =′) = 1, it follows from Bezout that there exists A, B ∈ ℤ such that 0′A + =′B = 1. As aresult 0′A ≡ 1mod =′. Multiplying the both sides by 1′, we �nd that the congruence equation0′G ≡ 1′mod =′ has a solution G = 1A mod =′. �

Example. Let 0 = 2, = = 3, 1 = 5. Since gcd(2, 3) = 1 and this divides 5, the theorem assertsthat the congruence equation 2G ≡ 5 mod 3 is soluble (of course, it is possible to rewrite thecongruence equation as 2G ≡ 2mod 3 but it is more instructive to keep ‘1’ in its original form).

A dogged approach (e�ective when = is very small):

G mod 3 0 1 2

2G mod 3 0 2 4

2G − 5mod 3 1 0 2

Hence G ≡ 1mod 3 is a solution.

A slick approach (e�ective when = is large). While the theorem itself does not say ‘how tosolve the congruence equation’, the proof is constructive and one can follow it to �nd a solu-tion. Since gcd(2, 3) = 1, 0′ = 0 = 2, =′ = = = 3 and 1′ = 1 = 5. Apply Euclid’s algorithm to �nd1 = (−1) · 2 + 1 · 3. Hence (−1) · 5mod 3, i.e., 1mod 3 is the solution for 2G ≡ 5mod 3.

Example. Let 0 = 2, = = 4, 1 = 5. Since gcd(2, 4) = 2, the theorem says the congruenceequation 2G ≡ 5 mod 4 is not soluble. For any G mod 4, 2G is always even mod 4 and cannotpossibly be congruent to 5.

10

3.5 The Chinese Remainder Theorem

The CRT is about solving simultaneous congruences to di�erent moduli. We say that < and =are coprime if gcd(<, =) = 1.

Theorem 9 Let <, = be coprime natural numbers. Then there is a solution to the simultaneous con-gruence equations:

G ≡ 0 (mod <)G ≡ 1 (mod =).

Indeed the solution is unique modulo<= in the sense that if G and H are both solutions, then G ≡ Hmod<= holds.

Proof. We �rstly show the existence. Since gcd(<, =) = 1, there are integers A, B such that<A + =B = 1. We therefore have

<A ≡ 0mod <,<A ≡ 1mod =,=B ≡ 1mod <,=B ≡ 0mod =.

Let G = <A1+=B0. This is what we are looking for. Indeed, G ≡ =B0 ≡ 0mod< while G ≡ <A1 ≡ 1mod =.

To prove the uniqueness, suppose that G and H are solutions. On one hand, it follows fromG ≡ 0 ≡ Hmod < that < | (G− H). On the other hand, G ≡ 1 ≡ Hmod = implies that =| (G− H). Since< and = are coprime, we may then conclude <=| (G − H), in other words, G ≡ H mod <=. �

More generally,

Theorem 10 Let =1, . . . , =A be pairwise coprime natural numbers. Then there is a solution, uniquemodulo =1 · · · =A , to the congruences

G ≡ 08 (mod =8).

Example. Find G satisfying the following simultaneous equations:

G ≡ 2 (mod 3),G ≡ 1 (mod 4),G ≡ 3 (mod 5).

The CRT theorem says that there is a unique solution mod 60which can be found either bytrial-and-error, or following the systematic argument in the proof.

Firstly, we solve the �st two equations– we apply the argument in the proof of CRT with< = 3, = = 4, 0 = 2 and 1 = 1. By Euclid’s algorithm, we �nd 3 · (−1) + 4 · 1 = 1 = gcd(3, 4) and

G = 3 · (−1) · 1 + 4 · 1 · 2 = 5

de�ne a solution mod 12. We need to solve the following simultaneous equations

H ≡ 5 (mod 12),H ≡ 3 (mod 5).

Since gcd(12, 5) = 1, we may apply CRT with < = 12, = = 5, 0 = 5 and 1 = 3. By Euclid’salgorithm, we �nd 12 · (−2) + 5 · 5 = 1 = gcd(12, 5) and

H = 12 · (−12) · 3 + 5 · 5 · 5 = 53

de�nes a solution mod 60.

11

3.6 Prime numbers

Not that, apart from 2, every prime is congruent to either 1 or −1 ≡ 3mod 4. Indeed,

Theorem 11 There are in�nitely many primes congruent to −1mod 4.

Proof. Suppose that there are only �nitely many such primes, say @1, . . . , @A . Consider# = 4@1 . . . @A − 1. It is congruent to −1mod 4.

# is not a prime Indeed if it were a prime, it would be one of the @’s but # is clearly biggerthan any one of them.

If # is not a prime, then it is composite. However,

2 is not a factor of # If it were, # would be even, but it is not (since # is congruent to −1mod 4, it is congruent to −1mod 2).

Similarly,

@ ∈ {@1, . . . , @A } is not a factor of # either If it were, # ≡ 0 mod @, but by de�nition # =

4@1 . . . @A − 1 ≡ −1mod @.

We may then conclude

any prime factor of # is congruent to 1mod 4 We have established that any prime factorof # is not congruent to −1mod 4. On the other hand, a prime factor cannot be congruent to0 nor 2 mod 4 (if it were, 2 would be a prime factor of # which, we know, is not true).

However, the product of integers (we only need to observe it for primes numbers though)that are congruent to 1mod 4 again is congruent to 1mod 4 and this contradicts # ≡ −1mod4. Therefore, there are in�nitely many primes congruent to −1mod 4. �

Remark. It is true that there are in�nitely many primes congruent to 1 mod 4, but whatgoes wrong with the argument if we run it for primes congruent to 1mod 4?

3.7 Appendix: Rings and �elds

De�nition. A ring is

• a set ',

• elements 0, 1 ∈ ',

• maps + : ' × ' → ' and × : ' × ' → ',

subject to 3 axioms

1. (', +) is an abelian group with identity 0,

2. (',×) is a commutative semigroup, i.e. 0 × (1 × 2) = (0 × 1) × 2, 0 × 1 = 1 × 0 = 0 and0 × 1 = 1 × 0 for all 0, 1, 2 ∈ '

12

3. (distributivity) 0 × (1 + 2) = 0 × 1 + 0 × 2 for all 0, 1, 2 ∈ '.

Examples. ℤ,ℚ,ℝ,ℂ.

De�nition. A �eld is a ring ' that satis�es the following extra properties

• 0 ≠ 1,

• every non-zero element of R has a multiplicative inverse: if A ∈ ' and it is non-zero, thenthere exists B ∈ ' such that AB = 1; in other words, '−{0} is a group under ×with identity1.

Non-example. The ring ℤ of integers is not a �eld. Indeed, 3 ∈ ℤ and 7 ∈ ℤ, but there is nointeger I such that 3G = 7, so 3/7 does not lie in ℤ.

Examples. ℚ,ℝ,ℂ.

Example. Let 3 be a square free integer (positive or negative, but not 1). Then the setℚ(√3) = {A + B

√3 | A ∈ ℚ, B ∈ ℚ} is a �eld. If 3 is congruent to 2mod 4, or 3mod 4, let

ℤ[√3] =

{A + B√3 | B ∈ ℤ, C ∈ ℤ

}⊂ ℚ(

√3)

while if 3 ≡ 1mod 4, let

ℤ[1 +√3

2] =

{A + B1 +

√3

2| B ∈ ℤ, C ∈ ℤ

}⊂ ℚ(

√3).

Then ℤ[√3] and ℤ[1 +

√3

2] are rings (See Example Sheet 1).

Why do we have to have the 3 ≡ 1 mod 4 case separately? It will be discussed more formallylater on, so you can safely skip what follows. By de�nition, any element U of ℚ(

√3) is of the form

A +√3B with A, B ∈ ℚ. Then one can eliminate √ to see that U satis�es

U2 − 2AU + (A2 − 3B2) = 0,

in other words, U is a root of themonic polynomial G2−2AG+(A2−3B2) in G with coe�cients inℚ.We may then ask the following question: amongst the elements in ℚ(

√3), which U’s are roots

of the monic polynomial in ℤ-coe�cients? The question boils down to the following question:

�nd the set �ℚ(√3) of all pairs (A, B) ∈ ℚ ×ℚ satisfying 2A ∈ ℤ and A2 − 3B2 ∈ ℤ simultaneously.

It turns out that

• if 3 ≡ 2 or 3 ≡ 3mod 4, then

�ℚ(√3) = {(A, B) ∈ ℚ ×ℚ | A ∈ ℤ, B ∈ ℤ} = ℤ[

√3]

• if 3 ≡ 1mod 4, then

�ℚ(√3) = {(A, B) = (U +

V

2,V

2) ∈ ℚ ×ℚ | U ∈ ℤ, V ∈ ℤ} = ℤ[1 +

√3

2] .

13

These will be proved later on. The following is for those who cannot wait.

[Suppose 3 ≡ 2mod 4 or 3 ≡ 3mod 4 . The inclusion ⊃ is clear. To prove ⊂, we argue asfollows. Let 2A = C ∈ ℤ (we would like to prove 2|C). As

A2 − 3B2 = C2 − 43B2

4∈ ℤ,

it follows that C2 − 43B2 ∈ 4ℤ. We deduce from this that, while we do not know if B ∈ ℤ yet, wedo know that B =

D

2for some D ∈ ℤ. Substitute this back into the equation above, we have

C2 − 3D2 ∈ 4ℤ.

Recall thatI mod 4 I2 mod 4

1 12 03 14 0

We prove D2 ≡ 0mod 4 . Suppose D2 ≡ 1mod 4. Then 3D2 ≡ 2 (resp. 3D2 ≡ 3) mod 4 if 3 ≡ 2(resp. 3 ≡ 3). It follows from C2 − 3D2 ∈ 4ℤ that C2 ≡ 2 (resp. C2 ≡ 3) mod 4. According to thetable, this is not possible.

C2 ≡ 0mod 4 This follows immediately from C2 − 3D2 ∈ 4ℤ.

According to the table, C2 ≡ 1mod 4 implies C ≡ 0 or ≡ 2mod 4. In either case, 2|C. Similarlyfor D.

The case for 3 ≡ 1 is similar but slightly harder. Suppose 3 ≡ 1mod 4 . We show thefollowing two sets are equal:

�ℚ(√3) = {(A, B) ∈ ℚ ×ℚ | A + B ∈ ℤ, A − B ∈ ℤ}.

If this equality holds, then we may, and will, let A − B = U and A + B = U + V (or equivalently2B = V). It is straightforward to check A = U + V

2and B =

V

2. The converse is easy, and what we

claim above holds.

The inclusion ⊃ is easy. To prove the inclusion ⊂, we argue as in the �rst case. We let2A = C ∈ ℤ,

C2 − 3B2 ∈ 4ℤ

forces B to be of the form B =D

2for some D ∈ ℤ. We then have

C2 − 3D2 ∈ 4ℤ.

As 3 ≡ 1mod 4 this time, C2 ≡ 1mod 4 if and only if D2 ≡ 1mod 4. According to the table,this implies that 2| (C−D) and 2| (C+D). We then observe that A−B = C − D

2∈ ℤ and A+B = C + D

2∈ ℤ

as desired. ]

14

3.8 Appendix 2: Euclidian algorithm

Proposition 12 Let 0, 1 ∈ ℤ and suppose 1 > 0. There exist unique @, A ∈ ℤ such that 0 ≤ A < 1.

Proof. Existence Let ( = {0 + I1 | I ∈ ℤ, 0 + I1 ≥ 0}. Since 0 ∈ (, it follows that ( isnon-empty, and let A be the smallest element of (. Necessarily, A is of the form

A = 0 + (−@)1 ≥ 0

for some @ ∈ ℤ.A < 1 If A ≥ 1, then

0 ≤ A − 1 = 0 − (@ + 1)1 < 0 − @1 = A

contradicting the minimality A. Hence A < 1.

Uniqueness Suppose 0 = @1 + A with 0 ≤ A < 1; and 0 = @′1 + A ′ with 0 ≤ A ′ < 1. Observethat A ′ = 0 − 1@′ ∈ (, hence A ′ ≥ A (by the minimality of A). It follows from

A ′ = 0 − 1@′ ≥ 0 − 1@ = A

that @ ≥ @′. We may let @′ = @ − B for some B ≥ 0. It follows that

A ′ = 0 = 1@′ = 0 − 1(@ − B) = 0 − 1@ + 1B = A + 1B.

It then follows that B = 0, therefore @ = @′ and A = A ′. �

Corollary 13 If 0 = @1 + A as above, then

gcd(0, 1) = gcd(1, A).

Proof. Let 6 = gcd(0, 1) and ℎ = gcd(1, A).

6 ≤ ℎ Since 6 |0 and 6 |1, 6 | (0 − 1@), i.e., 6 |A. By the minimality of ℎ, we then conclude that6 ≤ ℎ.

On the other hand,

ℎ ≤ 6 Since ℎ|1 and ℎ |A, ℎ| (@1+A), i.e., ℎ|0. by theminimality of 6, we then conclude ℎ ≤ 6.�

4 Euler’s totient function and primitive roots

De�niton. Euler’s totient function, or Euler’s q-function, is the function q : ℕ→ ℕ that sends= in ℕ to the number of natural numbers 1 ≤ I ≤ = coprime to = (i.e. gcd(I, =) = 1).

Example If ? is a prime, q(?) = ? − 1, since 1, 2, . . . , ? − 1 are all coprime to ?.

Example. q(8) = 4; the odd numbers 1, 3, 5, 7 are coprime to 8, while the even numbers2, 4, 6, 8 are not.

De�nition If ' is a commutative ring with identity, then an element A in ' is said to be aunit if there exists B in ' such that AB = 1. The units in ' form a group under multiplication.

15

Proposition 14 The number |ℤ×= | of elements in the group ℤ×= units of ℤ= is q(=).

Proof. It su�ces to establish that [I] is a unit in ℤ= if and only if I is relatively prime to =.Suppose that I is relatively prime to =, i.e., gcd(I, =) = 1. It then follows that there exists

A, B in ℤ such that IA + =B = 1. Hence IA ≡ 1mod # and this is nothing other than saying that[I] [A] = [1] and [I] is a unit.

Conversely, suppose that [I] is a unit. Then there exists a congruence class [A] ∈ ℤ suchthat [I] [A] = [IA] = [1]. It follows that IE ≡ 1 mod = and we may write IA + =B = 1 for some Ain ℤ. Let 3 = gcd(I, =). By de�nition, 3 divides I and divides =, and therefore it divides IA + =B.Therefore 3 = 1. �

From this, we can deduce

Theorem 15 Let = be a positive integer and I be an integer such that gcd(I, =) = 1. Then Iq (=) ≡ 1mod =.

Proof. Compare this proof with the proof of Fermat’s Little Theorem. For brevity, let # =

q(=) and let I1, . . . , I# be the integers in {0, 1, . . . , =} that are relative prime to =. Consider theset {A1, . . . , A# } where A 9 is de�ned to be the residue 1 ≤ A 9 ≤ = − 1 of II 9 when divided by =.Indeed, {A1, . . . , A# } = {I1, . . . , I# }. To see this, it su�ces to establish that if A8 = A 9 mod = for1 ≤ 8, 9 ≤ #, i.e., II8 ≡ II 9 mod =, then 8 ≡ 9 mod = (hence 8 = 9). But this follows bymultiplyingboth sides by the inverse of I (it exists because I is coprime to =). It then follows that

#∏9=1

I 9 ≡#∏9=1

A 9 =

#∏9=1

II 9 = I#

#∏9=1

I 9 .

Since I 9 for ecery 1 ≤ 9 ≤ # is invertible, we have 1 ≡ I# mod =. �

Corollary 16 Let ? be a prime. Then I? ≡ I (mod ?) for any integer I.

Proof. Let = = ? in the theorem. Then I?−1 ≡ 1mod ? for I not divisible by ?. Multiplyingthe both sides by I, we have I? ≡ I mod ?. On the other hand, if ? divides I, then I ≡ 0mod ?and I? ≡ 0 ≡ I mod ?. �

Remarks (NON-EXAMINABLE) Let = be a positive integer.

If there exists an integer I coprime to =, i.e., gcd(I, =) = 1, such that I=−1 is not congruentto 1mod =, then = is a composite number (this assertion follows from Fermat’s Little Theorem).

If, for any integer I coprime to =, the congruence I=−1 is congruent to 1mod =, then either= is a prime number in which case it is consistent with Fermat’s Little Theorem, or = is a com-posite number in which case a such = is called a Carmichael number/pseudo-primes.

It is not straightforward to �nd Carmichael numbers. The smallest Carmichael number is561 = 3 · 11 · 17. It is a theorem of Alford-Granville-Pomerance (1994) that there are in�nitelymany Carmichael numbers.

Note that, when = is a prime number, the assertion I=−1 ≡ 1 mod = for gcd(I, =) = 1 isequivalent to the assertion that I= ≡ I mod = (Fermat’s Little Theorem). This holds when =is a Carmichael number (surprisingly). More precisely, = is a Carmichael number, i.e., = is a

16

composite number and I=−1 ≡ 1 mod = for gcd(I, =) = 1 if and only if I= ≡ I mod = for everyinteger I. To prove this, we need

Theorem (Korselt, 1899) A compositive integer = is a Carmichael number if and only = issquare free and, for every prime ? dividing =, ? − 1 divides = − 1.

[Can you prove this?]

Let I be an integer such that gcd(I, =) = 1. In this case, I has multiplicative inverse mod =.Multiplying the congruence I= ≡ I mod = by the inverse, we have I=−1 ≡ 1mod =, hence = is aCarmichael number. Conversely, suppose that = is a Carmichael number. By Korselt’s theorem,= is a square free, hence it su�ces to show that, for any prime divisor ? of =, the congruenceI= ≡ I mod ? holds for any integer I. Fix I. If ? divides I, then I ≡ 0mod ?, hence I= ≡ 0 ≡ Imod ?. Suppose that ? does not divide I. In this case, it follows from Fermat’s Little Theoremthat I?−1 ≡ 1 mod ?. By Korselt’s theorem, ? − 1 divides = − 1 and therefore I=−1 ≡ 1 mod ?.Multiplying I on both sides, we see that I= ≡ I mod ? as desired. [non-examinable remarksend here]

Theorem 17 1. If ? is a prime and A > 0, then q(?A ) = ?A−1(? − 1).

2. If gcd(:, ℓ) = 1, then q(:ℓ) = q(:)q(ℓ).

3. If = = ?A11 · · · ?ABB =

B∏9=1

?A 9

9, where ?1, . . . , ?B are distinct primes and A1, . . . , AB > 0, then

q(=) =B∏9=1

?A 9−19(? 9 − 1) = =

B∏9=1

(1 − 1/? 9)

Proof. (a) Let # = ?A where A > 0. The natural numbers less than # which are coprime to?A (equivalently to ?) are those not divisible by ?. There are ?A−1 natural numbers less than ?Awhich are divisible by ?. Hence q(?A ) = q(#) = ?A − ?A−1.

(b) We use the Chinese Remainder Theorem. For brevity, for a positive integer #, we let

(# = {I ∈ ℕ | 1 ≤ I ≤ #, gcd(I, #) = 1}.

By de�nition, |(# | = q(#).We de�ne a map

5 : (: × (ℓ → (:ℓ

and show that it is bijective. This su�ces.Given (A, B) ∈ (: ×(ℓ , the CRT asserts that there exists an integer I, uniquemodulo :ℓ, such

that

I ≡ A mod :

andI ≡ B mod ℓ.

We may replace I by element in the congruence class [I] to assume that 1 ≤ I ≤ :ℓ. Weclaim gcd(I, :ℓ) = 1 . Granted, I ∈ (:ℓ and we may de�ne 5 to send (A, B) to I.

17

To prove that gcd(I, :ℓ) = 1, we argue as follows. Suppose that 3 is an integer that dividesgcd(I, :ℓ). By de�nition, 3 divides :ℓ. Since : and ℓ are coprime, 3 divides either : or ℓ.

When 3 divides : Suppose that 3 divides :. Since 3 divides I, it also divides gcd(I, :).Observing that gcd(I, :) = gcd(A, :) (this follows from I ≡ A mod :), we deduce that 3 dividesgcd(A, :). In conclusion, gcd(I, :ℓ) divides gcd(A, :); in particular,

gcd(I, :ℓ) ≤ gcd(A, :).

When 3 divides ℓ The case is similar to the one above with B (resp. ℓ) in place of A (resp.:). We have

gcd(I, :ℓ) ≤ gcd(B, ℓ).

By de�nition, gcd(A, :) = 1 and gcd(B, ℓ) = 1, therefore, in both cases, it follows thatgcd(I, :ℓ) = 1.

5 is surjective Let I ∈ (:ℓ . By de�nition, gcd(I, :ℓ) = 1. Let 1 ≤ A ≤ : and 1 ≤ B ≤ ℓ beintegers satisfying

I ≡ A mod :

andI ≡ B mod ℓ.

A ∈ (: To prove A ∈ (: , it is enough to check that gcd(A, :) = 1. Let 3 be an integer thatdivides gcd(A, :). By de�nition, 3 divides both A and :. It follows from I ≡ Amod : that 3 dividesI. On the other hand, 3 clearly divides :ℓ. As a result, we may deduce 3 divides gcd(I, :ℓ) andhence gcd(A, :) divides gcd(I, :ℓ); in particular,

gcd(A, :) ≤ gcd(I, :ℓ).

Since gcd(I, :ℓ) = 1, gcd(A, :) = 1.

B ∈ (ℓ The case is similar to the one above with B (resp. ℓ) in place of A (resp. :). We deduce

gcd(B, ℓ) ≤ gcd(I, :ℓ)

and, since gcd(I, :ℓ) = 1, gcd(B, ℓ) = 1.

5 is injective The uniqueness of solution mod :ℓ to I ≡ A mod : and I ≡ B mod ℓ in CRTproves exactly the injectivity.

(c) It follows from (b) that q(=) = ∏8 q(?A8 ). By (a), for each 8, q(?A88 ) = ?

A8−18(?8 − 1). �

Example. 720 = 24 · 32 · 5

q(720) = 23(2 − 1)31(3 − 1)50(5 − 1) = 8 · 6 · 4 = 192.

Proposition 18 Let 3 be a divisor of =. Then the number of integers Iwith 1 ≤ I ≤ = and gcd(I, =) = 3is q

( =3

).

18

Proof. Let = = 3ℓ. The multiplication by 3 de�ne a map

(ℓ = {1 ≤ I′ ≤ ℓ | gcd(I′, ℓ) = 1} → {1 ≤ I ≤ = | gcd(I, =) = 3}.

It su�ces to establish that this map is bijective (since |(ℓ | = q(ℓ)). Let I′ be a element in(ℓ . It is clear that 3I′ de�ne an element of (ℓ . To prove the surjectivity, let I be an elementon the RHS. Since gcd(I, =) = 3, we may divide I by 3. Call it I′. By de�nition, 1 ≤ I ≤ ℓ

and gcd(I′, ℓ) = 1, hence I′ is an element of (ℓ . The injectivity follows immediately because3I′ = 3I′′ for I′, I′′ ∈ (ℓ immediately implies I′ = I′′. �

Example. Let = = 60. According to the proposition, the number of integers 1 ≤ I ≤ 60 such

that gcd(I, 60) = 4 is q(604) = q(15) = q(3 · 5) = (3 − 1) (5 − 1) = 8. They are

{4, 8, 16, 28, 32, 44, 52, 56}.

The number of integers 1 ≤ I ≤ 60 such that gcd(I, 60) = 6 is q(606) = q(10) = q(2 · 5) =

(2 − 1) (5 − 1) = 4. They are{6, 18, 42, 54}.

De�nition. Let = ∈ ℕ. If there exists a positive integer 3 such that I3 ≡ 1 (mod =), then theorder of I mod = is the smallest of all such integer 3.

Alternatively, the order of Imight be de�ned as the smallest integer 3 such that [I]3= = [1]=in the set ℤ= of congruence classes mod =.

Example. Let = = 7.

I mod 7 order mod 7

1 12 33 64 35 66 2

For example, the following table shows that the order of 3 is indeed 6:

3= 3= (mod 7)30 131 332 233 634 435 536 1

Lemma 19 Suppose I has order 3. If I4 ≡ 1mod =, then 3 |4.

Proof. Write 4 = 3@ + A with 0 ≤ A ≤ 3 − 1. It su�ces to show that A = 0. It then follows that

1 ≡ I4 = I3@+A = I3@IA ≡ IA

19

mod =. But by de�nition, 3 is the smallest power for which I3 ≡ 1. Since A ≤ 3 − 1, the onlypossibility is that A = 0. �

The following proposition explains ‘when’ it makes sense for us to talk about the order ofan integer mod =:

Proposition 20 For an integer I, there exists 3 ∈ ℕ such that I3 ≡ 1(mod =) if and only if gcd(I, =) =1. If so, the order of I divides q(=).

Proof. If I3 ≡ 1mod = holds, then gcd(I3 , =) = 1 (if not, i.e., if gcd(I3 , =) > 1, then it wouldhave to divide 1, which is absurd). Evidently, gcd(I, =) = 1.

Conversely, if gcd(I, =) = 1, then Iq (=) ≡ 1 mod = (Theorem 15), hence there do exist suchintegers. The order 3 is the smallest among them.

The last assertion follows from the lemma. �

Example. Let = = 12. In this case, q(12) = 4 with the integers between 1 and 12 coprimeto 12 are 1, 5, 7, 11. We have 11 ≡ 1, 52 ≡ 1, 72 ≡ 1 and 112 ≡ 1 modulo 12. They have orders1, 2, 2, 2 respectively and they divide 4. Note that not every divisor of q(=) necessarily occurs as theorder of an element.

4.1 Primitive roots

De�nition. Let ? be a prime number. An integer I is said to be a primitive root mod ? if I hasorder ? − 1(mod ?). Note that, since q(?) = ? − 1, a primitive roos has the maximum possiblyorder.

Example. What are the primitive rootsmod ? = 7? Looking at the table above, every integerthat is congruent to 3 or 5mod 7 is a primitive root mod 7.

Example. What is a primitive root mod ? = 17? Since 28 ≡ 1mod 17, so 2 is not a primitiveroot. If fact 3 is a primitive root mod 17. It seems rather laborious to check all 3A for 1 ≤ A ≤ 15is not congruent to 1mod 17 and only 316 is.

Lemma 21 Let ? be a prime and 3 be a divisor of ?−1. Then the number of elements in {1, . . . , ?−1}of order 3 mod ? is either 0 or q(3).

Proof. Suppose that the number of such elements is non-zero. So there is at least one ele-ment I of order 3 mod ?.

the numbers 1, I, I2, . . . , I3−1 are all distinct mod ? For if I8 = I 9 mod ? with 0 ≤ 8 < 9 ≤3 − 1, then I 9−8 ≡ 1mod ?. But 9 − 8 < A and this contradicts the minimality of 3.

these 1, I, . . . , I3−1 are exactly the elements of order at most 3 Indeed, any element of or-der at most 3 should be a root in ℤ? of the equation G3 = 1 in G. Evidently, the I 9s are solutionsof the equation and they are all distinct mod ?, they are the solutions of the equation. In otherwords, the elements of order at most 3 are {1, I, . . . , I3−1}.

for 0 ≤ 9 ≤ 3 − 1, that I 9 has order 3 if and only if gcd( 9 , 3) = 1

20

Firstly, suppose gcd( 9 , 3) > 1. The goal is to show that I 9 does not order 3; since we knowthat I 9 has order at most 3, this is equivalent to asserting that the order of I 9 is < 3.

In this case, there exists 6 > 1 that divides gcd(=, 3). Let 9 = 68 and 3 = 64. ObserveI 94 = I38 ≡ 1, hence I 9 has order at most 4 < 3

Conversely, suppose that gcd( 9 , 3) = 1. The goal is to show that I 9 has order exactly 3. IfI 9= ≡ 1 for some = > 0, then it follows from emma 19(combined with the assumption that theorder of I is 3) that 3 | 9=. Since gcd( 9 , 3) = 1, it follows that 3 |=; in particular, 3 ≤ =. This meansthat the order of I 9 is at least 3. On the other hand, we already know that the order of I 9 is atmost 3. Hence the order of I 9 must be 3. �

The proof of the lemma actually explains how to �nd the elements in {1, . . . , ?−1} of order3 mod ? (as long as it is possible for us to �nd ‘I’). Let us work out examples.

Example. Let ? = 17. To �nd the elements of order 3 = 16, i.e., the primitive roots in{1, . . . , ? − 1}, �rstly we �nd ‘I’. For example, 3 is a primitive root mod 17. According to theproof, the elements of order 3 = 16 in {1, . . . , 16} therefore are

{3 9 | 1 ≤ 9 ≤ 16 and gcd( 9 , 16) = 1} = {3, 32, 34, 37, 38, 311, 313, 315} ≡ {3, 9, 13, 11, 16, 7, 12, 6}.

The right-most is worked out by �nding A 9 which is congruent to 3 9 mod 17 satisfying 0 ≤ A 9 ≤16.

To �nd the elements of order 3 = 8, we use 2which has order 8mod 17. Then the elementsof order 8 in {1, . . . , 16} are

{2 9 | 1 ≤ 9 ≤ 7 and gcd( 9 , 8) = 1} = {21, 23, 25, 27} ≡ {2, 8, 15, 9}.

As is more or less clear from the proof of the lemma and the examples that we do not knowyet if there is indeed an element in {1, . . . , ? − 1} of order 3 mod ? exists at all or not (to startthe process). The following proves the ‘existence’.

Theorem 22 Let ? be a prime. For every number 3 dividing ? − 1, let (3 denote the set of elements in{1, . . . , ? − 1} of order 3 mod ?. Then

|(3 | = q(3).

In particular, there are q(? − 1) primitive roots mod ?.

Proof. Let k(3) denote the number of elements in {1, . . . , ? − 1} of order 3 mod ?. We show

(a)∑

3 | (?−1)q(3) = ? − 1,

(b)∑

3 | (?−1)k(3) = ? − 1

(c) For any 3, we have k(3) ≤ q(3).

It follows immediately from these that k(3) = q(3) for 3 | (? − 1).

(a) By Proposition 18, the number of integers 1 ≤ I ≤ ?−1 such that gcd(I, ?−1) = (?−1)/3

is indeed q((? − 1)/ (? − 1)

3

)= q(3). But every 1 ≤ I ≤ ?−1 satis�es that gcd(I, ?−1) = (?−1)/3

21

for some 3. Hence these ? − 1 integers are all counted on the LHS of the equality.

(b) By Fermat’s Little Theorem, every integer 1 ≤ I ≤ ? − 1 has some order (mod ?) thatdivides ? − 1. The LHS counts these ? − 1 integers according to their orders.

(c) Lemma 21 shows the stronger result that, for every 3 | (? − 1), we have either k(3) = 0 ork(3) = q(3). �

Example. ? = 7.

3 1 2 3 6

(3 {1} {6} {2, 4} {3, 5}k(3) 1 1 2 2

q(3) 1 1 2 2

Theorem 23 (NON-EXMANINABLE) Let I be a primitive root mod ?. Then the order mod ? of I= isequal to (? − 1)/gcd(=, ? − 1).

Proof. It is possible to tinker the proof of Lemma 21 to prove this, but we will provide adirect proof. Let gcd(=, ? − 1) = A and write = = A: and (? − 1) = Aℓ. We let 3 denote the orderof I= mod ?. The goal is to show that 3 = ℓ .

3 ≤ ℓ By de�nition, gcd(:, ℓ) = 1. Since (? − 1)/gcd(=, ? − 1) = (? − 1)/A = ℓ, we have

I=ℓ = IA :ℓ = I (?−1): ≡ 1: = 1

mod ? (the last congruence follows because the order of I is ? − 1). Hence 3 ≤ ℓ (in fact, itfollows from Lemma 19 that 3 divides ℓ, and 3 |ℓ follows from this).

ℓ ≤ 3 Since I has order ? − 1, it follows from Lemma 19 that ? − 1 divides =3 (as we knowthat the order of I= is 3, hence I=3 ≡ 1mod ?). In other words, Aℓ = (? − 1) divides A:3 = =3,hence ℓ divides :3. On the other hand, ℓ is coprime to :, hence ℓ divides 3. This, in particular,implies that ℓ ≤ 3.

Combining 3 ≤ ℓ and ℓ ≤ 3, we obtain 3 = ℓ as desired. �

5 Quadratic residues and non-residues, Gauss reciprocity law

The goal of this section is to decide, when ? is an odd prime, whether the congruence equation

G2 ≡ 0 (mod ?)

has integer solutions or not, for any integer 0 not divisible by ?.

De�nition. An integer 0 is a quadratic residue (mod ?) if there exists an integer I withI2 ≡ 0(mod ?); and 0 is a quadratic non-residue if no such I exists.

Remark. If 0 ≡ 1 (mod ?), then 0 is a quadratic residue if and only if 1 is. Therefore itsu�ces to consider 0 ∈ {1, . . . , ? − 1}.

22

Example. Let ? = 7.

I mod 7 I2 mod 7

1 12 43 24 25 46 1

So, the quadratic residues are (the integers congruent mod 7 to)

1, 2, 4,

while the quadratic non-residues are (the integers congruent mod 7 to)

3, 5, 6.

Proposition 24 Let ? be an odd prime. Let I be a primitive root mod ?. If 0 is an integer not divisibleby ?, then 0 is a quadratic residue mod ? if and only if 0 is an even power of I; and is a quadraticnon-residue if and only if it is an odd power of I.

Proof. Let I be a primitive root mod ? (to recall, this means that I has order ? − 1 mod ?);this exists (indeed there are q(? − 1) of them!) by Theorem 22.

{1, I, . . . , I?−2} ≡ {1, . . . , ? − 1} The ? − 1 numbers

1, I, I2, I?−2

are all distinct (otherwise I would have order < ? − 1), so they must be congruent to

1, 2, . . . , ? − 1

in some order. We show that I 9 is a quadratic residue if and only if 9 is even.Suppose, �rstly, that 9 is even and let 9 = 28. Then I 9 = (I8)2, hence I 9 is clearly a quadratic

residue. Conversely, suppose that 0 = I 9 is a quadratic residue. We may and will let 0 ≡ 12mod ? for some 1 ≤ 1 ≤ ? − 1 (if 1 ≡ 2 mod ?, then 0 ≡ 12 holds if and only if 0 ≡ 22 holds;therefore we may choose 1 ‘up to integer multiple of ?’ and in particular choose it so that itsatis�es 1 ≤ 1 ≤ ? − 1). As observed earlier, there must exist 0 ≤ 8 ≤ ? − 2 such that 1 ≡ I8 mod?. Substituting, we have I 9 = 0 ≡ 12 ≡ I28. Since I is primitive, we may then deuce that 9 ≡ 28mod ? − 1. Since 28 and ? − 1 are both even, 9 is even, as desired. �

Example. Let ? = 7. We know that I = 3 is a primitive root mod 7.

3 9 3 9 mod 7

30 131 332 233 634 435 536 1

So any integer congruent to 1, 2 or 4 mod 7 is a quadratic residue, while any integer con-gruent to 3, 5 or 6 is a quadratic non-residue.

23

5.1 the Legendre symbol

De�nition. The Legendre symbol is de�ned by(0

?

)=

0 if ? |0+1 if ? does not divide 0 and 0 is quadratic residue mod ?−1 if ? does not divide 0 and 0 is quadratic non-residue mod ?

Theorem 25

(Rule 0) If 0 ≡ 1 mod ?, then(0

?

)=

(1

?

).

(Rule 1) If ? is an odd prime and 0, 1 ∈ ℤ, then(01

?

)=

(0

?

) (1

?

)(Rule 2) If ? is an odd prime, then(

−1?

)= (−1) (?−1)/2 =

{+1 if ? ≡ 1mod 4−1 if ? ≡ 3mod 4

(Rule 3) If ? is an odd prime, then(2

?

)= (−1) (?2−1)/8 =

{+1 if ? ≡ 1 or 7mod 8−1 if ? ≡ 3 or 5mod 8

(Rule 4) (Quadratic Reciprocity) For any pair of distinct odd primes ? and @,(?

@

) (@

?

)= (−1) (?−1) (@−1)/4 =

{−1 if ? ≡ @ ≡ 3mod 4+1 otherwise

Before proving these assertion,

Example. 38 is a quadratic residue mod 43. One way of checking this, of course, is to solve

the congruence equation G2 ≡ 38mod 43. We will use Theorem 25 to prove(38

43

)= 1:(

38

43

)=

(2

43

) (19

43

)(Rule 1)

= −(19

43

)(Rule 3)

=

(43

19

)(Rule 4)

=

(5

19

)(Rule 0)

=

(19

5

)(Rule 4)

=

(4

5

)(Rule 0)

= +1 (4 ≡ 22 mod 5)

24

Of course, this not the only way to get to(38

43

)= 1. For example,

−(19

43

)= −

(24

43

)(Rule 0)

= −(−143

) (2

43

)2 (6

43

)(Rule 1)

=

(6

43

)(Rule 2)

=

(49

43

)(Rule 0)

=

(7

43

)2(Rule 1)

= +1

Corollary 26 If ? is an odd prime, then(3

?

)=

{+1 if ? ≡ 1 or 11mod 12−1 if ? ≡ 5 or 7mod 12

Proof. By Rule 4, (3

?

)=

+

( ?3

)if ? ≡ 1mod 4

−( ?3

)if ? ≡ 3mod 4

Also ( ?3

)=

{+1 if ? ≡ 1mod 3−1 if ? ≡ 2mod 3

since 1 is squaremod 3while 2 is not. Combining these, the Chinese Remainder Theorem givescongruences mod 12. �

Let us prove Rule 1 and Rule 2.

Proof of Rule 1. If either 0 or 1 is divisible by ?, the assertion follows immediately. Wetherefore assume that both 0 and 1 are not divisible by ?.

Let I be a primitive root mod ? (it exists by Theorem 22). As we saw in the proof of Propos-ition 24, 0 (being not divisible by ?) is congruent to I 9 mod ? for some 0 ≤ 9 ≤ ? − 2 and it

follows that(0

?

)= (−1) 9 . On the other hand, suppose that 1 ≡ I8 for some 0 ≤ 8 ≤ ? − 2. Then

01 = (−1)8+ 9 and therefore(01

?

)= (−1)8+ 9 = (−1) 9 (−1)8 =

(0

?

) (1

?

). �

Proof of Rule 2. Let I be aprimitive rootmod ? and let Z = I (?−1)/2. We thenhave Z2 = I?−1 ≡ 1mod ?, but Z is not congruent to 1 mod ? (if it were, I would have order (? − 1)/2 < (? − 1),contradicting theminimality of the order ?−1), so it has to be congruent to−1mod ?. It followsthat −1 is a quadratic residue (resp. non-residue) if (? − 1)/2 is even (resp. odd), i.e., if ? ≡ 1(resp. ? ≡ 3) mod 4. �

Proposition 27 (Euler’s Criterion) Let 0 be an integer not divisible by ?. Then(0

?

)≡ 0 (?−1)/2 (mod ?).

25

Proof. Let I be a primitive root mod ? and let 0 ≡ I 9 mod ? for some 0 ≤ 9 ≤ ? − 2. SinceI?−1 ≡ 1mod ?, we have

0 (?−1)/2 ≡ I 9 (?−1)/2 ≡{

1 if 9 is evenI (?−1)/2 if 9 is odd

On the other hand, we know from the proof of Rule 2 that I (?−1)/2 ≡ −1mod ?, i.e.

0 (?−1)/2 ≡{

1 if 9 is even−1 if 9 is odd

By Proposition 24, the RHS is exactly the Legendre symbol(0

?

). �

Remark. One can indeed use this proposition to prove Rule 1.

5.2 Solving the equation G2 ≡ 0mod ?

UsingLegendre symbol and the quadratic reciprocity, it is possible to quickly determinewhetheror not the equation

G2 ≡ 0 (mod ?)

has a solution, when ? is an odd prime and 0 is not divisible by ?. It only tells us the existence,or non-existence of a solution, and does not tell us how to �nd it.

Before we delve into the subject, let us ask ourselves: how many solutions are we expectedto �nd (mod ?)? It is the quadratic equation, so there should be max two solutions (mod ?).Can we have all the solutions? Suppose G = I is a solution for the equation above. Then −I willautomatically be the other solution. To see this, �rstly observe that (−I)2 = I2 ≡ 0mod ?. Notethat −I is distinct from I mod ?; for if it were, then 2I ≡ 0mod ? and therefore I ≡ 0 (since ?and 2 are coprime); and 0 ≡ I2 ≡ 0 would be a contradiction to the assumption that ? does notdivide 0.

Proposition 28 Let ? be a prime congruent to 3mod 4. Suppose that(0

?

)= 1. Then I = 0 (?+1)/4 is a

solution to the equation G2 ≡ 0 mod ?.

Proof. By Euler’s criterion,

I2 = 0 (?+1)/2 = 0 (?−1)/2+1 ≡(0

?

)0 = 0.

�

Example. Find all solutions I to each of the following equations with 1 ≤ I ≤ ?:

1. G2 ≡ 2mod 131,

2. G2 ≡ 3mod 131.

Proof. 1) Firstly, we need to know if there are solutions. To this end, we compute the Le-gendre symbol: (

2

131

)= −1

26

since 131 ≡ 3mod 8. Hence there are no solutions.

2) Since (3

131

)'4= −1

(131

3

)'0= −1

(2

3

)= 1,

the equation does have (two) solutions. Using Proposition 28, the solutions are

G ≡ ±3(131+1)/4 ≡ ±333

mod 131. We compute 333 mod 131. To see this, 34 = 81, hence 38 = 812 ≡ 11 mod 131. As316 ≡ 112 ≡ −10, 332 ≡ (−10)2 = 100mod 131. Finally,

333 ≡ 100 · 3 ≡ 38

mod 131. On the other hand,−3333 ≡ −300 ≡ −38 ≡ 93

mod 131. The solutions for G2 ≡ 3mod 131 are 38 and 93mod 131.

Proposition 29 Let ? be a prime congruent to 1mod 4. Suppose that(0

?

)= −1. Then I = 0 (?−1)/4 is

a solution to the equation G2 ≡ −1mod ?.

Proof. By Euler’s criterion,

I2 = 0 (?−1)/2 ≡(0

?

)= −1.

�

Example. Find all solutions I to the equation

G2 ≡ −1 (mod 229)

with 1 ≤ I ≤ 229. The �rst step is to �nd a quadratic non-residue ‘0’. This is done by trial anderror. Note that 1 is always a quadratic residue. So let’s try 2:(

2

229

)'3= −1

since 229 ≡ −3mod 8. Hence, using Proposition 29,

I = 2(229−1)/4 = 257

mod 229 is a solution to the equation. To compute 257, we observe

28 ≡ 27,

216 ≡ 272 ≡ 42

and232 ≡ 422 ≡ 161.

It therefore follows that

257 = 232+16+8+1 ≡ 161 · 42 · 27 · 2 ≡ 122

mod 229. So 122mod 229 is a solution. Since −122 ≡ 107mod 229, 107 is also a solution.

27

5.3 Hensel’s lemma

If %(G) is a polynomial in ℤ[G], let %′(G) ∈ ℤ[G] denote its �rst derivative.

Theorem 30 Let ? be a prime and # ≥ 1 be an integer. Suppose that there exists I ∈ ℤ such that%(I) ≡ 0mod ?# . If %′(I) is not congruent to 0mod ?, then there exists an integer A, unique mod ?,such that I + A ?# de�nes a solution to the equation %(G) ≡ 0mod ?#+1.

Remark. If I satis�es %(I) ≡ 0 mod ?#+1, it certainly satis�es %(I) ≡ 0 mod ?# . The the-orem proves, under certain conditions, that one can prove the converse, i.e., one can ‘li�’ amod ?# solution of the polynomial % to a mod ?#+1 solution.

Proof. Suppose that % has degree 3. Let I be an integer such that %(I) ≡ 0mod ?# . Considerthe Taylor expansion:

%(I + A ?# ) =3∑9=0

% ( 9) (I)9 !(A ?# ) 9 ;

note that % ( 9) (I)/ 9 ! is an integer because % has coe�cients in ℤ. Hence

%(I + A ?# ) ≡ %(I) + %′(I)A ?#

mod ?2# . It follows that

%(I + A ?# ) ≡ 0mod ?#+1 if and only if %(I) ≡ −A ?# %′(I) mod ?#+1

(because terms divisible by ?2# are certainly divisible by ?#+1). Since %(I) ≡ 0 mod ?# byassumption, we may cancel a factor of ?# from this equation so that

%(I + A ?# ) ≡ 0mod ?#+1 if and only if%(I)?#≡ −A%′(I) mod ?.

By assumption, gcd(%′(I), ?) = 1 and %′(I) therefore has an inverse mod ?; we shall call it& ′(I). It follows that

%(I + A ?# ) ≡ 0mod ?#+1 if and only if A ≡ −%(I)?#

& ′(I) mod ?.

This proves the existence and uniqueness of # mod ?. �

Remark. The proof explicitly constructs a li� of the mod ?# solution %(I# ) ≡ 0 to a mod?#+1 solution %(I#+1) by de�ning I#+1 to be

I#+1 ≡ I# − %(I# )& ′(I# )

mod ?#+1 (where & ′(I# ), as in the proof, is the inverse of %′(I# ) mod ?). Since ?# divides%(I# ) by assumption,

I#+1 ≡ I#

mod ?# . It is in this sense that I#+1 is a li� of I# .

Example. Find all solutions to G2 + 1 ≡ 0mod 53.

28

(Step 1) Find a solution mod 5. By trial and error, ±2mod 5 are the solutions to G2 + 1mod5.

(Step 2) Let I = 2 and see if it is possible to li� thismod 5 solution to amod 52 solution, usingthe ‘algorithm’ in the proof. Firstly, since %′(G) = 2G, we have %′(I) = 4 which is manifestly notcongruent to 0mod 5. As a result, %′(I) has an inverse& ′(I); indeed& ′(I) ≡ 4mod 5. We know

I1 ≡ I − %(I)& ′(I) ≡ 2 − 5 · 4 = −18 ≡ 7mod 52

is a mod 52 solution.

(Step 3) See if we can li� the mod 52 solution I1 to a mod 53 solution. We observe %(I1) =%(7) = 72 + 1 = 50 and the inverse & ′(I1) of %′(I1) = %′(7) = 2 · 7 ≡ 4 is, for example, 4mod 5.We then know that

I2 ≡ I1 − %(I1)& ′(I1) ≡ 7 − 50 · 4 ≡ 57mod 53

is a mod 53 solution.

To �nd the other solution, we start with I = −2 and we would get I2 ≡ −57 ≡ 68 mod53.(Exercise to �ll in the argument).

To sum up, 57 and 68mod 53 are the (two) solutions to G2 + 1 ≡ 0mod 53, since the Henselli� is unique and any solution to G2 + 1 ≡ 0mod 53 is a solution to G2 + 1 ≡ 0mod 5.

Remark. This process can be iterated to �nd roots of %(G) mod 5A for any A.

Example. Find all solutions to G3 + 10G2 + G + 3 ≡ 0mod 33.

(Step 1) Find a solution mod 3. Since G3 + 10G2 + G + 3 ≡ G3 + G2 + G mod 3, it is easy to see 0and 1mod 3 are the solutions (mod 3).

(Step 2) Let I = 0 and see if it is possible to li� the mod 3 solution I to a 32 solution I1. As%′(I) ≡ 1mod 3, we can li� the mod 3 solution I = 0 to a mod 32 solution

I1 = I − %(I)& ′(I) ≡ 0 − 3 · 1 ≡ 6mod 32.

(Step 3) See ifmod 32 solution I1 = 6 li�s to a 33 solution. As %(I1) = 63+10·62+6+3 = 585 ≡ 18mod 33 and %′(I1) = 3 · 62 + 20 · 6 + 1 ≡ 1mod 3,

I2 ≡ I1 − %(I1)& ′(I1) ≡ 6 − 18 · 1 ≡ 15mod 33

does the trick.

We would be tempted to carry out the same process starting with I ≡ 1mod 3, but %′(I) ≡ 0in this case, and we need to argue di�erently. If I1 were a mod 32 li� of I ≡ 1 mod 3, then I1would have to be 1mod 3. Therefore I1 would be either 1, 4, 7mod 32. However, none of theseis a solution to %mod 32:

%(1) ≡ %(4) ≡ %(7) ≡ 6mod 32

We therefore conclude that there is no mod 32 li�. There is no mod 33 li� either, for if therewere, it would de�ne a mod 32 li�, which, we know, does not exist. In summary, the equation

29

G3 + 10G2 + G + 3 ≡ 0mod 33 has only one solution 15mod 33.

Remark. There is no general behaviour for when %′(G) ≡ 0mod ?.

6 Finite continued fractions

Recall the calculation of gcd(225, 157) = 1 in terms of Euclid’s algorithm:

225 = 157 · 1 + 68157 = 68 · 2 + 2168 = 21 · 3 + 521 = 5 · 4 + 15 = 1 · 5 + 0

Wemay interpret these steps into the following:

215 = 4 + 1

56821 = 3 + 1

4+ 1515768 = 2 + 1

3+ 1

4+ 15225157 = 1 + 1

2+ 1

3+ 1

4+ 15

These expressions are called continued fractions.

De�nition. For # ≥ 1, U, U1, . . . , U#−1 ∈ ℤ and U# ∈ ℝ, we will write

[U;U1, . . . , U#−1, U# ]

to meanU + 1

U1 +1

. . . + 1

U#−1 +1

U#

.

When ‘# = 0’, we only allow [U; ] to mean U for U ∈ ℤ.

For example,

21

5= [4; 5]

68

21= [3; 4, 5]

157

68= [2; 3, 4, 5]

225

157= [1; 2, 3, 4, 5]

30

Remark. Note you cannot ‘add’ continued fractions: for example, [1; 1] = 1 + 1

1= 2, so

[1 : 1] + [1 : 1] is 4 (or [4; ]). On the other hand [1 + 1; 1 + 1] = [2; 2] = 2 + 1

2=5

2.

Since it will be useful, we introduce here the �oor function:

Proposition 31 Let A = B/C be a rational number A > 1 in its lowest terms, in the sense that gcd(B, C) =1. Then A can be written as a continued fraction [U;U1, U2, . . . , U# ] for some U, U1, . . . , U# ∈ ℕ withU# > 1. If C = 1 (i.e. A = B is an integer), then the continued faction is just [B; ].

Conversely, any sequence of positive integers, the last greater than 1, de�nes a unique ra-tional number > 1:

[U; ] = U

[U;U1, . . . , U# ] = U +1

[U1;U2, . . . , U# ]for # > 0. This allows us to de�ne the symbol [U;U1, . . . , U# ] inductively.

We give two proofs; they are essentially the same but the �rst is recursive and the secondis more explicit. Both involve extracting the continued fraction from Euclid’s algorithm as inthe example above.

Proof I. We prove by induction on C. If C = 1, then A = [B; ]. We may therefore assume thatC > 1. If C > 1, since the fraction is in its lowest term, A is not an integer (if it were, C wouldnecessarily divide B, contradicting gcd(B, C) = 1). If B = UC + C1, then C1 satis�es

0 < C1 < C

andB

C= U + C1

C= U + 1

A1

with A1 =C

C1> 1 a rational number with its lowest term (if not, gcd(B, C) > 1). By inductive

hypothesis,C

C1= [U1;U2, . . . , U# ]

for some U# > 1. Then

A =B

C= U + 1

[U1;U2, . . . , U# ]= [U;U1, . . . , U# ] .

�

De�nition. For d in ℝ, we let bdc denote the largest integer # satisfying # ≤ d.

We may express the proof by a recurrence:

U = bAc

and1

A − U = [U1;U2 . . . , U# ] .

31

Proof II. We run the Euclid’s algorithm:

B = UC + C1 B

C= U + C1

C

C = U1C1 + C2 C

C1= U1 +

C2

C1

C1 = U2C2 + C3 C1

C2= U2 +

C3

C2· · ·

C#−2 = U#−1C#−1 + 1 C#−2C#−1

= U#−1 +1

C#−1C#−1 = U# · 1 C#−1 = U#

We have1 < C#−1 < · · · < C1 < C < B

andA =

B

C= U + 1

U1 +1

. . . + 1

U#−1 +1

U#

,

in other words, A = [U;U1, . . . , U# ]. �

Wemay axiomatise Proof II as follows:Let

A =B

C, A1 =

C

C1, . . . , A#−1 =

C#−2C#−1

, A# = C#−1.

ThenU = bAc ⇒ A1 =

1

A − Uw

U1 = bA1c ⇒ A2 =1

A1 − U1w...

w

U#−1 = bA#−1c ⇒ A# =1

A#−1 − U#−1∈ ℕ

w

U# = bA# c = A#We may turn this into an algorithm to compute the continued fraction of a (positive) ra-

tional number.

Example. A =87

38.

32

U = b8738c = 2 ⇒ A1 =

18738 − 2

=38

11

w

U1 = b38

11c = 3 ⇒ A2 =

13811 − 2

=11

5

w

U2 = b11

5c = 2 ⇒ A3 =

1115 − 2

=5

1∈ ℕ

w

U3 = b5

1c = 5 = A3

Hence A = [U;U1, U2, U3] = [2; 3, 2, 5].

Remark. This algorithm allows us to compute the continued fraction for a non-positive rationalnumber– the only di�erence from ones for positive rational number is that we need to allow

non-positive U’s as a result): let us compute A = −35following the algorithm:

U = b−35c = −1 ⇒ A1 =

135 − (−1)

=5

2

w

U1 = b5

2c = 2 ⇒ A2 =

152 − 2

=2

1∈ ℕ

w

U2 = b2

1c = 2 = A2

Hence −35= [−1; 2, 2]. On the other hand, A = 3

5is computed by

U = b35c = 0 ⇒ A1 =

135 − 0

=5

3

w

U1 = b5

3c = 1 ⇒ A2 =

153 − 1

=3

2

w

U2 = b3

2c = 1 ⇒ A3 =

132 − 1

=2

1∈ ℕ

w

U3 = b2

1c = 2 = A3

Hence3

5= [0; 1, 1, 2].

Remark (NON-EXAMINABLE). It is possible to follows Proof I to compute the continued

fraction, but we need to be careful. For example, for A = −35, we see it as

(−3)5

rather than3

(−5)to see ‘correct residue’ (Recall Euclid’s algorithm Proposition 12):

−3 = 5 · (−1) + 25 = 2 · 2 + 12 = 2 · 1

33

[how do you de�ne the residue of 3 when we divide it by −5? In what range of integers is itallowed to vary?]. HenceIt follows that

−35= −1 + 2

5and

5

2= 2 + 1

2.

−35= −1 + 2

5= −1 + 1

5

2

= −1 + 1

2 + 12

= [−1; 2, 2] .

When the last term of the continued fraction expression satis�es U# > 1, it is possible toprove the uniqueness of the expression:

Theorem 32 If A is a rational number with A = [U;U1, . . . , U:] = [V; V1, . . . , Vℓ] where U: > 1 andVℓ > 1, then : = ℓ and U 9 = V 9 for every 9 .

Proof. We prove this by induction on =.

Suppose : = 0. Then A = U is an integer. If ℓ > 0, then A = V + 1

[V1; V2, . . . , Vℓ]with its

fraction 0 <1

[V1; V2, . . . , Vℓ]< 1 (this is where Vℓ > 1 is used); this is impossible. It therefore

follows that : = ℓ and A = U = V.Suppose that the assertion holds, with : − 1 in place of :. By assumption, we know

U + 1

[U1;U2, . . . , U:]= V + 1

[V1; V2, . . . , Vℓ].

Since U: > 1 and Vℓ > 1, the fractions are both less than 1 and we may deduce U = bAc = V. Itthen follows that

[U1;U2, . . . , U:] = [V1; V2, . . . , Vℓ]

and it follows from the inductive hypothesis that : − 1 = ℓ − 1 and U 9 = V 9 for every 1 ≤ 9 ≤ :.�

Following Proposition 31, we de�ne:

De�nition. Let U, U1, . . . , U# ∈ ℕ. For 0 ≤ = ≤ #,

A= = [U;U1, . . . , U=] .

is a rational number and {A0, . . . , A# } are called the convergents of the continued fractionA = [U;U1, . . . , U# ].

Remark. This seems like a misnomer to call them ‘convergents’, but we will see in the nextsection that the convergents do converge when A is an irrational number.

De�nition. Given U, U1, . . . , U# ∈ ℕ, we de�ne: B−2 = 0, B−1 = 1, B0 = U and, for = = 1, . . . , #,

B= = U=B=−1 + B=−2

and de�ne: C−2 = 1, C−1 = 0, C0 = 1 and, for = = 1, . . . , #,

C= = U=C=−1 + C=−2.

Remark. The B= and C= are both strictly increasing sequences of positive integers.

Proposition 33 A= =B=

C=for every 0 ≤ = ≤ #.

34

Proof. We prove this by induction on =. By de�nition, A0 =B0

C0= U. Suppose that A= =

B=

C=

holds (we would like to establish that A=+1 =B=+1C=+1

). By de�nition,

A= = [U;U1, . . . , U=] =U=B=−1 + B=−2U=C=−1 + C=−2

.

It then follows that

A=+1 = [U;U1, . . . , U=+1

U=+1] =

(U= +

1

U=+1

)B=−1 + B=−2(

U= +1

U=+1

)C=−1 + C=−2

=U=+1(U=B=−1 + B=−2) + B=−1U=+1(U=C=−1 + C=−2) + C=−1

=U=+1B= + B=−1U=+1C= + C=−1

=B=+1C=+1

.

�Example. Compute the convergents of [3; 7, 15, 1] = 3.1415929203 . . . (this is actually a

‘truncated’ in�nite continued fraction [3; 7, 15, 1, 292, 1, . . . ] of c = 3.141592653589793 . . . ).

On one hand,

B−1 = 1B0 = 3B1 = U1B0 + B−1 = 7 · 3 + 1 = 22B2 = U2B1 + B0 = 15 · 22 + 3 = 333B3 = U3B2 + B1 = 1 · 333 + 22 = 355.

On the other hand,

C−1 = 0C0 = 1C1 = U1C0 + C−1 = 7 · 1 + 0 = 7C2 = U2C1 + C0 = 15 · 7 + 1 = 106C3 = U3C2 + C1 = 1 · 106 + 7 = 113.

Hence A1 =22

7= 3.14285714 . . . , A2 =

333

106= 3.1415094 . . . , A3 =

355

113= 3.141592...

Example. What are the convergents of [1; 1, . . . , 1]?

By de�nition,

B−1 = 1B0 = 1B1 = 2B= = B=−1 + B=−2 for = ≥ 2

and

C−1 = 0C0 = 1C1 = 1C= = C=−1 + C=−2 for = ≥ 2.

35

The recursive relations should remind youof Fibonacci numbers. The convergents A1, A2, A3, A4, A5, A6, . . .are

1

1,2

1,3

2,5

3,8

5,13

8, . . . ,

Do you see any pattern?

Theorem 34 Following the notation above,

• B=C=−1 − C=B=−1 = (−1)=−1 for = ≥ 1,

• A= − A=−1 =(−1)=−1C=−1C=

for = ≥ 1,

• gcd(B=, C=) = 1.

Proof. To prove the �rst assertion, we use induction on =. To recall,

B0 = U,

B1 = UU1 + 1,C0 = 1,C1 = U1.

Hence B1C0 − C1B0 = (UU1 + 1) − U1U = 1 = (−1)0.Suppose that B=−1C=−2 − C=−1B=−2 = (−1)=−2 holds (we would like to prove B=C=−1 − C=B=−1 =

(−1)=−1). Since B= = U=B=−1 + B=−2 and C= = U=C=−1 + C=−2, we have

B=C=−1 − B=C=−1 = (U=B=−1 + B=−2)C=−1 − (U=C=−1 + C=−2)B=−1= B=−2C=−1 − C=−2B=−1= −(−1)=−2= (−1)=−1.

The second assertion follows immediately from A= =B=

C=, A=−1 =

B=−1C=−1

and the �rst assertion.

The third assertion follows immediately from the �rst. �

Corollary 35 The convergents A=’s satisfy

A0 < A2 < A4 < · · · < A5 < A3 < A1.

Proof. Applying Theorem 34 twice, we get

A=+2 − A= = (A=+2 − A=+1) + (A=+1 − A=) =(−1)=+1C=+2C=+1

+ (−1)=

C=+1C==(−1)=

C=+2C=+1C=(C=+2 − C=) =

(−1)=U=+2C=+2C=

since, by de�nition, U=+2 =C=+2 − C=C=+1

at the last equality.

If = is even (resp. odd), the RHS is positive (resp. negative), so A=+2 > A= (resp. A= > A=+2).It remains to show that

A28 < A2 9−1

for every 8 ≥ 0 and 9 ≥ 1. Since the ‘even’ convergents increase, while ‘odd’ ones decrease, itfollows from Theorem 34 to get A# − A#−1 = (−1)#−1/C#−1C# < 0 when # = 28 + 2 9 is evidentlyeven, which yields

A28 < A28+2 9 < A28+2 9−1 < A2 9−1,

as desired. �

36

7 In�nite continued fractions

As promised:

Theorem 36 Let U, U1, . . . be a sequence of integers such that U= > 0 if = ≥ 1. De�ne, for every = ≥ 0,

A= = [U;U1, . . . , U=] .

Then the sequence A0, A1, A2, . . . , of rational numbers converges to a limit (not necessarily a rationalnumber).

Proof. Since the A0, A1, . . . , A= are the convergents to the�nite continued fraction [U;U1, . . . , U=]for any �xed = ≥ 0, all the results in the preceding section apply results of the preceding sec-tion:

• A= =B=

C=,

• (Corollary 35) A0 < A2 < A4 < · · · < A5 < A3 < A1,

• (Theorem 34) A= − A=−1 =(−1)=−1C=−1C=

.

Since the even termsA0, A2, A4, . . . ,

form an increasing sequence bounded above by A1, it tends to a limit d. On the other hand, theodd terms

A1, A3, A5, . . . ,

form a decreasing sequence bounded below by A0 and it tends to a limit d′. Since A2 9 < A2 9+1(Theorem 34),

d = lim9→∞

A2 9 ≤ lim9→∞

A2 9+1 = d′.

[Note that ≤ is not a typo; even if A2 9 < A2 9+1 is maintained for every 9 , there is no way ofknowing a priori this continues to hold in the limit] On the other hand,

|A# − A#−1 | = |(−1)#−1C#−1C#

| → 0

as # = 2 9 tends to∞. It therefore follows

|d − d′ | = | (d − A# ) + (A# − A#−1) + (A#−1 − d′) | ≤ |d − A# | + |A# − A#−1 | + |d′ − d#−1 | → 0

and one can deduce d = d′. �

De�nition. We de�ne the limit of the sequence of convergents to be the value of the in�nitecontinued fraction [U;U1, . . . ].

We show that every real number (not just a rational number) has a continued fraction ex-pansion:

Theorem 37 For every irrational number A, there exists a sequence of integers U0, U1, . . . with U= > 0if = ≥ 1 such that the value of [U;U1, . . . ] is A.

37

Proof. Let d0 = A and U = bd0c ∈ ℤ so that 0 < d0 − U < 1. The number d1 =1

d0 − U> 1 is

irrational (if not, A would be rational). Wemay continue this process: startingwith an irrational

number d= > 1, we let U= = bd=c ∈ ℕ and let d=+1 denote the irrational number1

d= − U=> 1.

Then

U1, U2, . . . ,

are positive integers andd1, d2, . . . ,

are irrational numbers > 1. The process continues without an end and [U;U1, . . . , ] de�ne anin�nite continued fraction (This ‘algorithm’ is called the continued fraction algorithm). It re-mains to check that the value of this continued fraction is indeed A (that we started with).

We prove by induction on = ≥ 1 that

[U;U1, . . . , U=−1, d=] = A.

When = = 1, A = U + d0 − U = U +1

d1= [U; d1].

Suppose that [U;U1, . . . , U=−1, d=] = A holds for = ≥ 1 (wewould like to show [U;U1, . . . , U=, d=+1] =A holds). Since

1

d=+1= d= − U=,

[U;U1, . . . , U=, d=+1]= U + 1

U1 +1

. . . U=−1 +1

U= +1

d=+1

= U + 1

U1 +1

. . . U=−1 +1

U= + d= − U== U + 1

U1 +1

. . . U=−1 +1

d== [U;U1, . . . , U=−1, d=]= A,

the claim thus follows.

Let{A= = [U;U1, . . . , U=]}

be the convergents of [U;U1, . . . , ]. We know that the convergents tend to a limit d. We need toshow that d = A, i.e., the continued fraction algorithm correctly de�nes the continued fractionof A. We know from the proof of Theorem 36 that

d ≥ A2 9

38

andA2 9+1 ≥ d,

it su�ces to establish the same set of inequalities with d replaced by A holds. Indeed, if this isthe case, letting # = 2 9 for example,

|A − d | ≤ |A# − A#−1 | → 0

as 9 →∞.Suppose = is even. From the algorithm, we see that bd=c = U= so U= < d=. It therefore

follows from the lemma below that

A= = [U;U1, . . . , U=−1, U=] < [U;U1, . . . , U=−1, d=] = A.

Then case when = is odd follows similarly. �

Lemma 38 Suppose that γ < γ′ for positive real numbers γ, γ′. Then, for = ≥ 1,

• [U;U1, . . . , U=−1, γ] < [U;U1, . . . , U=−1, γ′] when = is even,

• [U;U1, . . . , U=−1, γ] > [U;U1, . . . , U=−1, γ′] when = is odd.

Proof. Prove by induction on =. When = = 1,

U + 1

γ= [U; γ] > [U; γ′] = U + 1

γ′

holds because of the assumption γ < γ′. Suppose that the assertion holds with = − 1 in place of=.

Suppose �rstly that = is even (i.e. = − 1 is odd). It then follows that

[U;U1, . . . , U=−1, γ] = U +1

[U1;U2, . . . , U=−1, γ]< U + 1

[U1;U2, . . . , U=−1, γ′]= [U;U1, . . . , U=−1, γ′]

because by the inductive hypothesis we know

[U1;U2, . . . , U=−1, γ] = [V; V1, . . . , V=−2, γ] > [V; V1, . . . , V=−2, γ′] = [U1;U2, . . . , U=−1, γ′] .

The case = is odd is similar. �

Remark. The proof is constructive. The algorithm is exactly the same as the one for �nitecontinued fractions (for positive/negative rational numbers).

Example. What is the continued fraction of c = 3.141592653589793? Applying the contin-ued fraction algorithm to get

39

U = b3.141592653589793c = 3 ⇒ d1 =1

0.141592653589793= 7.062513305931046

w

U1 = b7.062513305931046c = 7 ⇒ d2 =1

0.06251330593104577= 15.99659440668572

w

U2 = b15.99659440668572c = 15 ⇒ d3 =1

0.99659440668572= 1.003417231013372

w

U3 = b1.003417231013372c = 1 ⇒ d4 =1

0.003417231013372= 292.6345910144503

w

U4 = b292.6345910144503c = 292 ⇒ d5 =1

0.6345910144503= 1.575818089492172

w...

so the continued fraction looks like [3; 7, 15, 1, 292, 1, . . . ]. The convergents are

3,22

7,333

106,355

113,103993

33102, . . .

The numbers are22

7and

355

113are well-known approximations to c!

Example A = 1 + 1

2

√2.

U = b1 + 1

2

√2c = 1 ⇒ d1 =

1

(1 + 1

2

√2) − 1

=√2

w

U1 = b√2c = 1 ⇒ d2 =

1√2 − 1

= 1 +√2

w

U2 = b1 +√2c = 2 ⇒ d3 =

1

(1 +√2) − 2

= 1 +√2 = d2

w

U3 = U2 ⇒ d4 = d3 = d2w...

Hence 1 + 1 + 1

2

√2 is the value of [1; 1, 2, 2, . . . ].

Example. A =√15 − 3.

40

U = b√15 − 3c = 0 ⇒ d1 =

1√15 − 3

=

√15 + 316

w

U1 = b√15 + 316

c = 1 ⇒ d2 =1

(√15+316 ) − 1

=√15 + 3

w

U2 = b√15 + 3c = 6 ⇒ d3 =

1

(√15 + 3) − 6

=1

√15 − 3

= d1

w

U3 = U1 ⇒ d4 = d2w

U4 = U2 ⇒ d5 = d3 = d1w...

Hence√15 − 3 is the value of [0; 1, 6, 1, 6, . . . ].

Example. The golden ratio A =1 +√5

2. This is the value of [1; 1, 1, . . . ].

Finally, we show that the continued fraction expression is unique:

Theorem 39 Every irrational number is the value of a unique in�nite continued fraction.

Proof. Suppose that A is irrational and A = [U;U1, . . . , ]. Then A = U + 1

d1with d1 > 1 is

irrational; so U = bAc and d1 =1

A − U are uniquely determined by A. Similarly, U1 = bd1c and

d2 =1

d1 − U1are uniquely determined, and so on. �

To sum up, there is a bijection between

• the set of irrational numbers

• the set of in�nite continue fractions [U;U1, . . . ], where U ∈ ℤ and U1, · · · ∈ ℕ.

Given an irrational number, the continued fraction algorithm (See the proof of Theorem 37)de�nes an in�nite continued fraction. Conversely, given a continued fraction [U;U1, . . . , ], wemay calculate the convergents A= = [U;U1, . . . ] and their limit gives the corresponding irra-tional number.

7.1 Periodic continued fraction

De�nition. An in�nite continued fraction [U;U1, . . . ] is periodic if there exists ;, # with ; > 0such that the sequence stabilizes at the #-th repeats itself with cycle of length ;:

U# = U#+; = U#+2; = · · · ,U#+1 = U#+;+1 = U#+2;+1 = · · · ,...

...... · · ·

U#+;−1 = U#+2;−1 = U#+3;−1 = · · · ,

41

or more succinctlyU=+; = U=

for all = ≥ #; and we write the continued fraction as

[U;U1, . . . , U#−1, U# , U#+1, . . . , U#+;−1]

It is said to be purely periodic if # = 0, i.e.,

U=+; = U=

for all = ≥ 0.

Example. [2; 1, 2, 1, 2, 1, 2, . . . ] = [2; 1]. What (irrational) real number A does this continuedfraction represent? By de�nition, it su�ces to solve the equation

G = [2; 1, G] = 2 + 1

1 + 1G

= 2 + G

G + 2 =3G + 2G + 1 ,

which is G2−2G−2 = 0. The solution (inℝ) of the equation is 1±√3 but the expression suggests

A > 2. So A = 1 +√3.

Example. What about B = [3; 5, 2, 1, 2, 1, 2, 1, 2, . . . ] = [3; 5, 2, 1]? Let A denote the 2, 1-bitwhich we know, from the �rst example, to be 1 +

√3, so that

B = [3; 5, A]= 3 + 1

5 + 1

A

= 3 + A

5A + 1=

16A + 15A + 1

=126 −

√3

39

These two real numbers are irrational while satisfying quadratic equations (B satis�es (39B−126)2 = 3).

It does not have be concrete numbers:

Example. Let = be a positive integer. Then√=2 + 1 = [=; 2=] .

To see this, we simply run the algorithm:

U = b√=2 + 1c = = ⇒ d1 =

1√=2 + 1 − =

=√=2 + 1 + =

w

U1 = b√=2 + 1 + =c = 2= ⇒ d2 =

1

(√=2 + 1 + =) − 2=

=1

√=2 + 1 − =

=√=2 + 1 + = = d1

w

U2 = U1 ⇒ d3 = d2 = d1w...

42

hence√=2 + 1 = [=, 2=, 2=, . . . ] = [=, 2=].

Conversely, if we are given [=, 2=], can we work out the value A of the continued fraction?Let B = [2=] = [2=, 2=, . . . ]. By de�nition,

B = 2= + 1

B,

i.e., B2 − 2=B − 1 = 0. Solving this equation in A, we have B =2= ±

√4=2 + 42

= = ±√=2 + 1. Since

B > 0, it follows that B = = +√=2 + 1. To compute A, we compute

A = = + 1

B= = + 1

= +√=2 + 1

==2 + =

√=2 + 1 + 1

= +√=2 + 1

=

√=2 + 1

(= +√=2 + 1

)= +√=2 + 1

=√=2 + 1.

8 Diophantine approximation

We have seen that if A is the value of [U;U1, . . . ] and A= = [U;U1, . . . , U=] is the =-th convergentto A, then the numbers A= are rational numbers that tend to the limit A. In this section, weestablish that they give best possible approximations to A.

What should be a good rational approximationB

Cto A?

• it should be close to A (of course),

• the denominator C is relatively small; there should be no rational number, with smallerdenominator, that is close to A.

The goal of this section is to establish that the convergents to the continued fraction for Aindeed satisfy these properties.

To this end, recall:

•A= =

B=

C=,

whereB= = U=B=−1 + B=−2

for = ≥ 1, B0 = U, B−1 = 1, B−2 = 0,

C= = U=C=−1 + C=−2,

for = ≥ 1, C0 = 1, C−1 = 0, C−2 = 1;

• (Corollary 35) A0 < A2 < A4 < · · · < A < · · · < A5 < A3 < A1;

• |A − A= | < |A=+1 − A= | for every =.

Since Theorem 34 asserts |A=+1 − A= | =1

C=C=+1, it follows that

|A − A= | < |A=+1 − A= | =1

C=C=+1.

43

Example. We saw that A =√15 − 3 is the value of [0; 1, 6, 1, 6, . . . ]. The convergents for√

15 − 3 are

A0 =0

1, A1 =

B1

C1=1

1, A2 =

B2

C2=6

7, A3 =

7

8, A4 =

B4

C4=48

55, A5 =

B5

C5=55

63, A6 =

B6

C6=378

433, . . . .

How good is A6, as a (rational) approximation to A?

|A − A6 | ≤1

C6C7=

1

433 · 496 =1

214768<

1

104,

hence accurate to the four decimal places.

We know that A= is always a better approximation to A than A=−2 is (recall A=−2 < A= < A if =is even and A < A= < A=−2 if = is odd). What about A= vs A=−1? We will answer the question byshowing that, a�er the �rst step, the approximation always gets better as = increases.

When A is the value of the continued fraction, [U;U1, . . . ], we let A= denote the =-th conver-gent [U;U1, . . . , U=]; let d= ∈ ℝ be the output of the continued fraction algorithm a�er =-steps:

U=−1 = bd=−1c ⇒ d= =1

d=−1 − U=−1and

A = [U;U1, . . . , U=−1, d=]

(see the proof of Theorem 37) and we may see d= as

d= = [U=;U=+1, . . . ] .

The following will be useful (cf Proposition 33):

Lemma 40 Let A be an irrational number.

A = [U;U1, . . . , U=, d=+1] =d=+1B= + B=−1d=+1C= + C=−1

.

Proof. The �rst equality is established in the proof of Theorem 37. We prove the secondequality by induction on =.

By de�nition, A = U + A − U = U + 1

d1=Ud1 + 1d1

.

Suppose that [U;U1, . . . , U=−1, d=] =d=B=−1 + B=−2d=C=−1 + C=−2

holds. Since [U;U1, . . . , U=, d=+1] = [U;U1, . . . , U=−1, U=+1

d=+1] and

(U= +1

d=+1)B=−1 + B=−2

(U= +1

d=+1)C=−1 + C=−2

=d=+1(U=B=−1 + B=−2) + B=−1d=+1(U=C=−1 + C=−2) + C=−1

=d=+1B= + B=−1d=+1C= + C=−1

.

�It follows

A − B=C== [U;U1, . . . , U=, d=+1] −

B=

C==d=+1B= + B=−1d=+1C= + C=−1

− B=C==C=B=−1 − B=C=−1C= (d=+1C= + C=−1)

=(−1) (−1)=−1

C= (d=+1C= + C=−1),

44

by Theorem 34 and therefore

|A − B=C=| = | (−1)= ||C= (d=+1C= + C=−1) |

=1

C= (d=+1C= + C=−1)<

1

C=C=+1

asd=+1C= + C=−1 > U=+1C= + C=−1 = C=+1

(recall that U=+1 = bd=+1c, hence d=+1 > U=+1).

Proposition 41 For every = ≥ 2, we have

• |C=A − B= | < |C=−1A − B=−1 |,

• |A − A= | < |A − A=−1 |.

Proof. Using the argument above, we have

|C=A − B= | = C= |A −B=

C=| = 1

d=+1C= + C=−1

and|C=−1A − B=−1 | = C=−1 |A −

B=−1C=−1| = 1

d=C=−1 + C=−2.

To prove the �rst assertion, it therefore su�ces to establish

d=+1C= + C=−1 > d=C=−1 + C=−2.

By de�nition, U= = bd=c, thus d= < U= + 1. It follows that

d=C=−1 + C=−2 < (U= + 1)C=−1 + C=−2 = U=C=−1 + C=−2 + C=−1 = C= + C=−1 < d=+1C= + C=−1,

since d=+1 > 1 by de�nition– recall that d=+1 =1

d= − U==

1

d= − bd=cwhere 0 ≤ d= − bd=c < 1.

To prove the second assertion, we �rstly observe from the �rst assertion, combined withC= > C=−1 ≥ 1, that

|C=A − B= |C=

<|C=−1A − B=−1 |

C=<|C=−1A − B=−1 |

C=−1.

The LHS is |A − B=C=| = |A − A= | and |A −

B=−1C=−1| = |A − A=−1 |. �

De�nition. we say that a rational numberB

Cis a good approximation to A if

|A − BC| < |A − B

′

C ′|

for any rational numberB′

C ′with C ′ < C.

Theorem 42 Let A be an irrational number, [U;U1, . . . ] be the continued fraction for A, and A= =[U;U1, . . . , U=] =

B=

C=be the =-th convergent. Let d =

B

Cbe a rational number in its lowest terms. If

C < C= whenever = > 1, then|A − B

C| > |A − B=

C=|

holds.

45

Remark. The theorem asserts that the convergents A=, for = ≥ 2, are good approximationsto A.

We need a lemma.

Lemma 43 Let A be an irrational number, [U;U1, . . . ] be its continued fraction, A= = [U;U1, . . . , U=] =B=

C=be the =-th convergent. If B and C are integers satisfying gcd(B, C) = 1 and C < C=, then

|CA − B | ≥ |C=−1A − B=−1 |

holds, with equality if and only ifB

C=B=−1C=−1

.

Proof of the lemma. This is due to Lagrange. Consider the following system of equations

B=−1G + B=H = B

C=−1G + C=H = C.

Since B=−1C= − B=C=−1 = (−1)= (Theorem 34), we see that it has a unique integer solution

(G, H) = ((−1)= (BC= − CB=), (−1)= (CB=−1 − BC=−1)).

G is non-zero Suppose G = 0. This is equivalent to BC= − CB= = 0, i.e.B

C=B=

C=. While

B=

C=is in

its lowest terms (Theorem 34), the equality express the same rational number with two distinctdenominators (recall C < C= by assumption). This is a contradiction.

H is non-zero or (B, C) = (B=−1, C=−1) Similar to the argument above. If H = 0, then CB=−1 −BC=−1 = 0, hence

B

C=B=−1C=−1

; note that as ‘C < C=−1’ is NOT assumed, it is not possible to eliminate

this case. To sum up, H is either non-zero, or zero in which caseB

C=B=−1C=−1

and therefore B = B=−1and C = C=−1 (again, because ‘C < C=−1’ is NOT assumed, this is allowed!) and consequently

|CA − B | = |C=−1A − B=−1 |.

Suppose that H is non-zero. Our goal then is to establish that

|CA − B | > |C=−1A − B=−1 |.

G and H have opposite signs Suppose that H < 0. If = is odd, then H = −(CB=−1 − BC=−1) < 0,

i.e., CB=−1−BC=−1 > 0, i.e.,B

C<B=−1C=−1

= A=−1. On the other hand, Theorem 34 proves that A=−A=−1 >

0 (since = − 1 is even), hence BC<B=−1C=−1

<B=

C=, i.e., BC= − CB= < 0. As a result, G = (−1)= (BC= − CB=)

(since = is odd). Similarly, if = is even, CB=−1 − BC=−1 < 0, i.e.,B

C>B=−1C=−1

= A=−1. As Theorem 34

proves that A= − A=−1 < 0 (since = − 1 is odd), it follows that BC>B=−1C=−1

>B=

C=, i.e., BC= − CB= > 0. As

a result G = (−1)= (BC= − CB=) > 0 (since = is even).One can similarly show that if H > 0, then G < 0 (whether = is odd or even).

C=−1A − B=−1 and C=A − B= have opposite signs Observe that A lies between A=−1 =B=−1C=−1

and

A= =B=

C=, hence C=−1A − B=−1 and C=A − B= have opposite signs.

46

Combining the last two items, we may then deduce that G(C=−1A − B=−1) and H(C=A − B=) havethe same signs. It hence follows that

|CA − B | = | (C=−1G + C=H)A − (B=−1G + B=H) |= |G(C=−1A − B=−1) + H(C=A − B=) |= |G(C=−1A − B=−1) | + |H(C=A − B=) |= |G | |C=−1A − B=−1 | + |H | |C=A − B= |> |C=−1A − B=−1 |.

The last (strict) inequality follows since |G | > 1, |C=−1A − B=−1 | > 0 and |C=A − B= | > 0. �

Proof of Theorem 42. Suppose C < C=. Then it follows from Lemma 43, Proposition 41

|A − BC| = 1

C|AC − B |>1

C|AC=−1 − B=−1 | >

1

C|AC= − B= | >

1

C=|AC= − B= | = |A −

B=

C=|.

�

Theorem 44 Let A be an irrational number, and let B, C ∈ ℤ with C > 0 and gcd(B, C) = 1. Suppose

that |A − BC| < 1

2C2. Then

B

Cis a convergent to A.

Proof. Let [U;U1, . . . ] be the continued fraction for A (Theorem 37), and de�ne A= =B=

C=as

before. Choose = such thatC=−1 ≤ C < C=.

Lemma 43 states |CA − B | ≥ |C=−1A − B=−1 |, and it therefore follows that

|C=−1A − B=−1 | ≤ |CA − B | = C |A −B

C| < C

2C2=

1

2C

implies

|A − B=−1C=−1| ≤ 1

2C=−1C.

On one hand,

| BC− B=−1C=−1| = | B

C− A + A − B=−1

C=−1| ≤ |A − B

C| + |A − B=

C=| < 1

2C2+ 1

2C=−1C≤ 2

2C=−1C=

1

C=−1C.

On the other hand,| BC− B=−1C=−1| = BC=−1 − CB=−1

CC=−1.

It therefore follows that the numerator BC=−1 − CB=−1 has to be zero, i.e.B

C=B=−1C=−1

. �

8.1 Periodic continue fraction II

A quadratic irrational A is a real number of the form B + C√3 where B ∈ ℚ, C ∈ ℚ − {0} and 3 > 1

is a square-free integer.

Quadratic irrationals are precisely the roots of irreducible degree 2 polynomials with ℚ-coe�cients.

Theorem 45 If a number has a periodic continued fraction, then it is a quadratic number.

47

Proof. We �rstly show that a (real) number with a purely periodic continued fraction is aquadratic number. To this end, let

A = [U;U1, . . . , U;−1] .

The cases when ; = 1 and ; = 2 are le� as exercises. We henceforth assume ; ≥ 3. If A wasrational, the continued fraction would have �nite length; hence A is irrational. By assumption,

A = [U;U1, . . . , U;−1, A]

and it follows from Lemma 40 that the RHS equals

AB;−1 + B;−2AC;−1 + C;−2

,

whereB=

C=is the =-th convergent of [U;U1, . . . , U;−1]. We then see that

C;−1A2 + (C;−2 − B;−1)A − B;−2 = 0

and therefore that A is a quadratic irrational.We now show that any number with periodic continued fraction is a quadratic irrational.Let d be the value of [U;U1, . . . , U# , U#+1, . . . , U#+;] for �xed # and ;, and let A be the value

of [U#+1;U#+2, . . . , U#+;]. By the �rst part, A is a quadratic irrational of the form, say, B + C√3

with B ∈ ℚ, C ∈ ℚ − {0}. We have

d = [U;U1, . . . , U# , A] =AB# + B#−1AC# + C#−1

by Lemma 40, whereB=

C=is the =-th convergent of d. Substituting A = B + C

√3, we have

d =B# B + B#−1 + B# C

√3

C# B + C#−1 + C# C√3= · · · + B#−1C (B# − C# )

(C# B + C#−1)2 − (C# C)23√3 ∈ ℚ +ℚ

√3

and d is quadratic irrational. �

The converse also holds, but it will not be discussed any further:

Theorem 46 A real number has periodic continued fraction if and only if it is quadratic irrational.

9 Pell’s equation

In this section, we will study Pell’s equation:

G2 − 3H2 = ±1.

We will see that continued fractions give us constructive methods of �nding solutions tothese equations.

48

9.1 Part I

If we have a solution (G, H) to Pell’s equation G2 − 3H2 = ±1, then

G

H=

√3 ± 1

H2

which is almost√3. In some sense Pell’s equation gives us good rational approximations to

√3.

Recall that the best rational approximations to A ∈ ℝ arises from convergents of the contin-ued fraction:

Theorem 47 Suppose that 3 ∈ ℕ is not a square and suppose that there is a pair of positive integers Band C satisfying B2 − 3C2 = ±1. Then B

Cis a convergent to

√3 (in the sense that it is of the form A= =

B=C=

for some =).

Proof. If B2 − 3C2 = ±1, then (B +√3C) (B −

√3C) = ±1, hence

|√3 − B

C| = 1

C (B + C√3)

holds. Consequently,B

Cis a good approximation to

√3. The denominator

C (B + C√3) = C2( B

C+√3) = C2(

√3 ± 1

C2+√3) ≥ C2(

√3 − 1 +

√3) > 2C2

since 3 ≥ 2 (as 3 is positive and not a square). It then follows from Theorem 44 thatB

Cis a

convergent to√3. �

Remark. The theorem asserts that positive integer solutions to Pell’s equation G2 − 3H2 = ±1are necessarily convergents to the continued fraction of

√3:

{The positive integer solutions to G2 − 3H2 = ±1

}⊂

{the convergents A= =

B=

C=to√3

}(recall that B= and C= are both positive if = ≥ 1) But not all convergents are solutions to the Pellequation. Do we know which convergents?

Example. G2 − 2H2 = ±1.

The continued fraction of√2 is [1; 2]. The convergents are:

A0 =1

1, A1 =

3

2, A2 =

7

5, A3 =

17

12, A4 =

41

29, . . .

These, as it turns out, de�ne solutions to G2 − 2H2 = ±1:

12 − 2 · 11 = −1, 32 − 2 · 22 = 1, 72 − 2 · 52 = −1, 172 − 2 · 122 = 1, 412 − 2 · 292 = −1, . . .

It might be reasonable to expect that the convergent A=, when = is even (resp. odd), de�ne solu-tions to G2 − 2H2 = −1 (resp. G2 − 2H2 = 1).

49

Example. G2 − 3H2 = ±1.

The continued fraction of√3 is [1; 1, 2]. The convergents are:

A0 =1

1, A1 =

2

1, A2 =

5

3, A3 =

7

4, A4 =

19

11, . . .

This time, not all of them are solutions to the Pell equation:

12 − 3 · 12 = −2, 22 − 312 = 1, 52 − 3 · 32 = −2, 72 − 3 · 42 = 1, 192 − 3 · 112 = −2, . . .

Again, it might not be so far-fetched to conjecture that the A=, where = is odd, de�ne solu-tions to G2 − 3H2 = 1, while it is likely that G2 − 3H2 = −1 does not have any solutions. This canbe checked by passing to ℤ3. If there was a solution, say (B, C), then

B2 ≡ −1 ≡ 2

mod 3, however the Legendre symbol(2

3

)= −1, a contradiction!

Theorem 48 Suppose that 3 ∈ ℕ is not a square. Suppose that√3 = [U;U1, . . . , U;]. Let

B=

C=be the

=-th convergent of the continued fraction of√3. Then

B2= − 3C2= = ±1

if and only if = = #; − 1 for some # = 1, 2, 3, . . . .Moreover,

B2#;−1 − 3C2#;−1 = (−1)

#; .

Proof. NON-EXAMINABLE. �

Example. G2 − 2H2 = ±1. In this case, ; = 1 and every A= =B=

C=is a solution for = = 0, 1, 2, . . . ;

and B2= − 2C2= = (−1)=+1.

Example. G2 − 3H2 = ±1. In this case, ; = 2 and every A=, when = = 2# − 1, i.e. when = is odd,de�nes a solution to G2 − 3H2 = ±1. In fact B22#−1 − 3C22#−1 = (−1)2# = 1 and G2 − 3H2 = −1 doesnot have any solutions (as expected).

As we have seen in the second example, when ; is even, (−1)#; = 1, and the followingfollows immediately from the theorem:

Corollary 49 Suppose that 3 ∈ ℕ is not a square. Suppose that√3 = [U;U1, . . . , U;]. If ; is even, the

equationG2 − 3H2 = −1

has no solutions.

50

9.2 Part II

De�nition. We de�ne a partial order on the set of solutions to equation G2 − 3H2 = ±1: if (B, C)and (B′, C ′) are two distinct solutions, we then de�ne

(B, C) < (B′, C ′)

if G + H√3 < B′ + C ′

√3 in ℝ (SL says this is equivalent to B < B′ and C < C ′). The fundamental

solution is the minimum positive solution in this sense.

By Theorem 48, we know that the fundamental solution to G2−3H2 = ±1 is (G, H) = (B;−1, C;−1)where

B;−1C;−1

is the (; − 1)-st convergent.

We will see that the fundamental solution generates all positive integer solutions (G, H).

Example. G2 − 2H2 = ±1. As we saw already, (B0, C0) = (1, 1), (B2, C2) = (7, 5), . . . de�nesolutions to

G2 − 2H2 = −1,

while (B1, C1) = (3, 2), (B3, C3) = (17, 12), . . . de�ne solutions to

G2 − 2H2 = 1.

An eagle-eyed reader might notice a pattern– if (E=, F=) is the =-th solution (we know that it is(B=, C=)), then (E=+1, F=+1) = (E= + 2F=, E= + F=) is the (= + 1)-st, and

E=+1 + F=+1√2 = (E= + F=

√2) (1 +

√2)

in other words,(E= + F=

√2) = (E1 + F1

√2)= = (1 +

√2)=.

This is not a coincidence, as we shall see shortly.

Example. G2 − 3H2 = ±1. The �rst few solutions to G2 − 3H2 = 1 are (E1, F1) = (B1, C1) =(2, 1), (E2, F2) = (B3, C3) = (7, 4), (E3, F3) = (B5, C5) = (26, 15), . . . and solutions necessarily sat-isfy

E= + F=√3 = (E1 + F1

√3)= = (2 +

√3)=.

Lemma 50 Let (B, C) = (E1, F1) be the fundamental solution to the Pell equation

G2 − 3H2 = ±1.

For = = 1, 2, . . . , de�ne (E=, F=) ∈ ℕ × ℕ by the equation

E= + F=√3 = (B + C

√3)=.

ThenE= =

1

2

((B + C

√3)= + (B − C

√3)=

)and

F= =1

2√3

((B + C

√3)= − (B − C

√3)=

).

51

Remark. Note that (E=, F=) is di�erent from (B=, C=) that de�nes the =-th convergent A= anylonger.

Proof. Induction on =. �

Theorem 51 Let (B, C) = (E1, F1) be the fundamental solution to the equation

G2 − 3H2 = ±1

and letn = B2 − 3C2 ∈ {±1}.

As before, for = = 1, 2, . . . , de�ne (E=, F=) ∈ ℕ × ℕ by the equation

E= + F=√3 = (B + C

√3)=.

ThenE2= − 3F2

= = n=.

Remark. If√3 = [U, U1, . . . , U;], then we know from Theorem 48 that

• The fundamental solution (B, C) is (E1, F1);

• (E=, F=) = (B=;−1, C=;−1) and E2= − 3F2= = (−1)=;;

and as a resultn = B2 − 3C2 = E21 − 3F2

1 = (−1); .

If ; is odd, then

• (E=, F=), for = even, are solutions to the Pell equation

G2 − 3H2 = +1

• (E=, F=), for = odd, are solutions to the Pell equation

G2 − 3H2 = −1.

Proof. Applying Lemma 50, we obtain

E= − F=√3 =

1

2

((B + C

√3)= + (B − C

√3)=

)−√3

2√3

((B + C

√3)= − (B − C

√3)=

)= (B − C

√3)=.

Hence

E2= − 3F2= = (E= − F=

√3) (E= + F=

√3) = (B − C

√3)= (B + C

√3)= = (B2 − 3C2)=.

�

The main theorem in this section is:

Theorem 52 Suppose that 3 ∈ ℕ is not a square. Suppose that (E, F) is a solution to the Pell equation

G2 − 3H2 = ±1.

Then there exists = ≥ 1 such that (E, F) = (E=, F=) as above.

52

Proof. Let (B, C) be the fundamental solution to G2 − 3H2 = ±1. Suppose there exists a pair(E, F) of integers such that

• E ≥ 1 and F ≥ 1,

• E2 − 3F2 = ±1,

• (E, F) is not (E=, F=) for any = ≥ 1, where (E=, F=) is a pair of integers satisfying

E= + F=√3 =

(B + C√3

)=and

E2= − 3F2= = ±1

The assertion follows if this set of assumption leads to a contradiction.

There exists a unique # such that

(B + C√3)# < D + E

√3 < (B + C

√3)#+1,

because the interval (B+ C√3)#+1− (B+ C

√3)# = (B+ C

√3)# (B+ C

√3−1) gets bigger as # increases

[the point is that D + E√3 is bounded strictly by powers of B + C

√3; this only occurs if (E, F) is

not (E=, F=)!].

Multiplying all by1

(B + C√3)#

, we have

1 < + +,√3 < B + C

√3

where

+ +,√3 =

1

(B + C√3)#(E + F

√3) = 1

E# + F#√3(E + F

√3) = E# − F#

√3

E2#− 3F2

#

(E + F√3).

We check

+2 − 3,2 = ±1 Simply substituting above,

+2 − 3,2 =1

E2#− 3F2

#

(E2 − 3F2) (E2# − 3F2# ) = E2 − 3F2 = ±1.

−1 < + −,√3 < 1 Since |+ +,

√3 | |+ −,

√3 | = | ± 1| = 1 and + +,

√3 > 1, it follows that

|+ −,√3 | < 1.

Consequently,2+ = (+ +,

√3) + (+ −,

√3) > 1 − 1 = 0,

hence + > 0 and2,√3 = (+ +,

√3) − (+ −,

√3) > 1 − 1 = 0,

hence, > 0. This contradicts the minimality of the fundamental solution B + C√3. �

53

Example. G2 − 3H2 = 1. The fundamental solution is (B, C) = (2, 1). Hence

E2 + F2

√3 = (2 +

√3)2 = 7 + 4

√3,

E3 + F3

√3 = (2 +

√3)3 = 26 + 15

√3,

E4 + F4

√3 = (2 +

√3)2 = 97 + 56

√3,

......

...

Example. The continued fraction of√13 is [3; 1, 1, 1, 1, 6] with ; = 5. Hence the positive

integer solutions to G2−13H2 = ±1 are the convergent (B5#−1, C5#−1) where A5#−1 =B5#−1C5#−1

is the

5# − 1-st convergent. The ‘smallest’ solution is (B4, C4) = (18, 5) and 182 − 13 · 52 = −1.Example. G2−41H2 = −1. The fundamental solution to G2−41H2 = −1 is (B1, C1) = (32, 5) and

the fundamental solution to G2 − 41H2 = 1 is (B2, C2) is computed by

E2 + F2

√41 = (32 + 5

√41)2 = 2049 + 320

√41.

Example. The continued fraction of√61 is [7; 1, 4, 3, 1, 2, 2, 1, 3, 4, 1, 14] with period ; = 11.

It follows from Theorem 48 that the solutions to G2 −√61H2 = ±1 are

(B10, C10) = (29718, 3805), (B21, C21), (B32, C32), . . .

(satisfying B211#−1 − 3C211#−1 = (−1)11# ), and the fundamental solution to G2 − 3H2 = −1 is

(B10, C10) = (29718, 3805)

while the fundamental solution to G2 − 3H2 = 1 is

(B21, C21) = (1766319049, 226153980).

Theorem 51 ascertains that (29718 + 3805√61)2 = 1766319049 + 226153980

√61.

9.3 Appendix: A proof of Theorem 48 (NON-EXAMINABLE)

Let 3 be a square-free integer. Let {U=} (resp. {d=}) be positive integers (resp. real numbers)appearing in the continued fraction algorithm for

√3, i.e.,

√3 = [U;U1, . . . , U=−1, d=]

(by de�nition, U=−1 = bd=−1c and d= =1

d=−1 − U=−1, hence d=−1 = U=−1 +

1

d=; furthermore, since

√3 is irrational, d= is non-zero for any =).

De�nition. De�ne integers '= and (= inductively as follows:

• '0 = 1 and (0 = 0,

• (=+1 = U=−1'= − (=,

• '=+1 =3 − (2

=+1'=

.

The following is the key lemma:

LemmaWe have

54

• '= and (= are both integers,

• '= divides 3 − (2=,

• d=−1 =(= +√3

'=

Proof of the lemma. We prove this by induction. The case when = = 0 holds trivially. Supposethat the assertion holds for =.

(=+1 ∈ ℤ By de�nition, (=+1 = U=−1'= − (= where U=−1 is an integer by de�nition and '=and (= are integers by the inductive hypothesis.

'=+1 ∈ ℤ By de�nition,

'=+1 =3 − (2

=+1'=

=3 − (U=−1'= − (=)2

'==3 − (2='=

+ 2U=−1(= − U=−1'=.

The assertion follows since3 − (2='=

is an integer by the inductive hypothesis.

'=+1 divides 3 − (2=+1 This follows immediately from '=+1'= = 3 − (2=+1.

d= =(=+1 +

√3

'=+1Since d=−1 = U=−1 +

1

d=, it follows from the inductive hypothesis that

U=−1 +1

d==(= +√3

'=.

It follows that

d= ='=

(= +√3 − U=−1'=

='=√

3 − (=+1='= (√3 + (=+1)

3 − (2=+1

='= (√3 + (=+1)

'='=+1=

√3 + (=+1'=+1

,

as desired. �

Proposition A-1 Let 3 be a square-free integer. LetB=

C=be the =-th convergent to

√3. We then

haveB2= − 3C2= = (−1)=+1'=+1

and '=+1 > 0.

Proof. Since√3 = [U;U1, . . . , U=−1, d=] =

d=B= + B=−1d=C= + C=−1

,

√3 (d=C= + C=−1) = d=B= + B=−1.

Substituting d= =(=+1 +

√3

'=+1from the lemma, we have

√3

((=+1 +

√3

'=+1C= + C=−1

)=(=+1 +

√3

'=+1B= + B=−1.

55

Multiplying '=+1 and rearranging, we have√3 ((=+1C= + '=+1C=−1 − B=) = (=+1B= + '=+1B=−1 − 3C=.

Since√3 is irrational, it follows that

(=+1C= + '=+1C=−1 − B= = 0⇔ B= = (=+1C= + '=+1C=−1,

and(=+1B= + '=+1B=−1 − 3C= = 0⇔ 3C= = (=+1B= + '=+1B=−1.


B2= − 3C2= = B= ((=+1C= + '=+1C=−1) − C= ((=+1B= + '=+1B=−1) = '=+1(B=C=−1 − C=B=−1) = '=+1(−1)=−1

by Theorem 34. �

Proposition A-2 Suppose that√3 = [U;U1, . . . , U;] for some ; ≥ 1. Let {'=} be as above.

Then '= = 1 if and only if ; divides =.

Corollary. Theorem 48 follows.

Proof of Corollary (Theorem 48).

B2= − 3C2= = (−1)= ⇔ '=+1 = 1⇔ ; | (= + 1) ⇔ = = ; − 1, 2; − 1, . . . .

�

Proof of Proposition A-2. Suppose that ; divides =. We show that for any multiple :; of ;,':; = 1. Firstly, since

√3 = [U; d1] by de�nition, we have d1 = [U1;U2, . . . , U;]. Similarly, since√

3 = [U;U1, . . . , U;, d;+1], we also have d;+1 = [U1;U2, . . . , U;]; indeed, for any integer : ≥ 1, wehave

d:;+1 = [U1;U2, . . . , U;] = d1(easy to check by induction). By the lemma above, it therefore follows that

(:;+1 +√3

':;+1=(1 +√3

'1,

and, as a result, that ':;+1 = '1 and (:;+1 = (1. By de�nition,

'1 = 3 − (21 = 3 − (2:;+1 = ':;':;+1 = ':;'1.

Since '1 > 0, we have ':; = 1 as desired.

Conversely, suppose that '= = 1. It follows from the lemma that

U= +1

d== d=−1 =

(= +√3

'== (= +

√3 = (= + U +

1

d1.

While U= (on the le�most), (= and U (on the rightmost) are all integers, both1

d=and

1

d1are

fractions < 1 and therefore the equality d= = d1 needs to hold. This implies that ; divides = (as; is the period length). �

56

10 Sums of squares

Let ? be a prime. The basic question we want to understand in this section is whether

G2 + H2 = ?

has a solution in (G, H) ∈ ℕ × ℕ. For example,

22 + 12 = 532 + 22 = 1342 + 12 = 1752 + 22 = 29.

On the other hand,G2 + H2 = 7

is not soluble in ℤ × ℤ; for if it were, there would be (<, =) ∈ ℕ × ℕ such that <2 + =2 = 7, butthe table

I (mod 4) I2 (mod 4)0 01 12 03 1

shows <2 + =2 would never be 7 ≡ 3mod 4.

In fact,

Proposition 53 Let ? be a prime congruent to 3mod 4. Then

G2 + H2 = ?

has no solutions in (G, H) ∈ ℕ × ℕ.

Proof. The table above shows that, for any pair of integers A, B, the sum A2 + B2 is congruentto 0, 1, or 2mod 4. If (A, B) are a solution for G2 + H2 = ?, then ? would be congruent to 0, 1 or 2,but this contradicts the assumption that ? is congruent to 3. �

On the other hand, the following theorem gives us a good handle on primes representableas sums of squares:

Theorem 54 If the equation I2 ≡ −1mod ? is soluble, then the equation

G2 + H2 = ?

is soluble.

To prove the theorem, we �rstly prove

Lemma 55 Let A ∈ ℝ and # ∈ ℕ. Then there exists B/C ∈ ℚ with gcd(B, C) = 1 and 1 ≤ C ≤ # suchthat ��A − B

C

�� ≤ 1

C (# + 1) .

57

Proof. Let A = [U;U1, . . . ] be the continued fraction of A and A= =B=

C=be the =-th convergent

(recall that C= is an increasing sequence). It follows from Theorem 34 that

|A − A= | = |A −B=

C=| ≤ 1

C=C=+1.

There are two cases to proceed:

Suppose that C= ≤ #. In this case, the increasing sequence C= is bounded from above, i.e., C=stabilises for su�ciently large =, i.e., the continued sequence is indeed �nite and A is a rationalnumber A = [U;U1, . . . , U=] for some =. Letting

B

C=B=

C=, we have

|A − BC| = 0 ≤ 1

C (# + 1) .

Suppose that there exists = such that

C= ≤ # < C=+1.

Since # and C=+1 are both integers, it follows that # + 1 ≤ C=+1. LettingB

C=B=

C=,

|A − BC| = |A − B=

C=| ≤ 1

C=C=+1≤ 1

C (# + 1) .

�

Proof of Theorem 54. Applying the lemma with A =I

?and # = b√?c, we �nd B

C∈ ℚ such that

1 ≤ C ≤ b√?c < √? and|A − B

C| = | I

?− BC| ≤ 1

C (# + 1) <1

C√?

since # = b√?c < √? < # + 1.Let D = ?B − IC. Then, since C > 0 and ? > 0,

|D | = C ?�� I? − BC �� < C?

C√?=√?.

(D, C) is a solution we are looking for It follows from the last inequality that

0 < D2 + C2 < ? + ? = 2?

as we know C ≤ # <√?. On the other hand,

D2 + C2 = (?B − IC)2 + C2 ≡ I2C2 + C2 ≡ (I2 + 1)C2

mod ?. However, by assumption, I2 ≡ −1mod ?, hence D2 + C2 ≡ 0mod ?. The only possibilityfor the ‘real’ value of D2 + C2 therefore is ?, i.e., D2 + C2 = ?. �

Corollary 56 Let ? be a prime congruent to 1mod 4. Then

G2 + H2 = ?has an integer solution in G and H.

Proof. It follows from the theorem that if(−1?

)= 1, then G2 + H2 = ? is soluble. By as-

sumption, ? is odd and Rule 2 in Theorem 25 asserts that(−1?

)= 1 if and only if ? ≡ 1mod 4.

�

58

10.1 Hermite’s algorithm

The proof of Theorem 54 can be made into an algorithm for �nding G, H such that G2 + H2 = ?.

Step 1: �nd I such that I2 ≡ −1mod ?.

Step 2 (inductive): compute the =-th A= =B=

C=and the (= + 1)-th convergents A=+1 =

B=+1C=+1

of

the continued fractionI

?. If

C= <√? < C=+1,

then the algorithm stops and (G, H) = (C=, ?B= − IC=) de�nes a solution for G2 + H2 = ?. If thisdoes not hold,

Step 3: see if C=+1 <√? < C=+2 works. If it does, the algorithm stops. If not...

Example. Let ? = 13.

Step 1: we simply spot that 52 ≡ −1mod 13.

Step 2: To �nd the convergents of5

13, we see

U = b 513c = 0 ⇒ A1 =

1513 − 0

=13

5

w

U1 = b13

5c = 2 ⇒ A2 =

1135 − 2

=5

3

w

U2 = b5

3c = 1 ⇒ A3 =

153 − 1

=3

2

w

U3 = b3

2c = 1 ⇒ A4 =

132 − 1

= 2 ∈ ℕ

w

U4 = bA4c = A4Hence the convergents are

B0

C0= 0,

B1

C1= [0; 2] = 1

2,B2

C2= [0; 2, 1] = 1

3,B3

C3= [0; 2, 1, 1] = 2

5.

The algorithm stops at = = 2 asC2 = 3 <

√13 < C3 = 5

and (G, H) = (3, 13 · 1 − 5 · 3) = (3,−2), or (3, 2) de�nes a solution. Indeed, 32 + 22 = 13.

Example. ? = 2017.

Step 1: 2292 ≡ −1mod 2017.

Step 2: to �nd the convergents of229

2017, we see that

59

U = b 2292017

c = 0 ⇒ A1 =1

2292017 − 0

=2017

229

w

U1 = b2017

229c = 8 ⇒ A2 =

12017229 − 8

=229

185

w

U2 = b229

185c = 1 ⇒ A3 =

1229185 − 1

=185

44

w

U3 = b185

44c = 4 ⇒ A4 =

118544 − 4

=44

9

w

U4 = b 449 c = 4 ⇒ A5 =1

449 −4

=9

8w...

Hence the convergents are

B0

C0= 0,

B1

C1=1

8,B2

C2=1

9,B3

C3=

5

44,B4

C4=

21

185, . . .

The algorithm stops at = = 3 as

C3 = 44 <√2017 < C4 = 185.

and (G, H) = (44, 2017 · 5 − 229 · 44) = (44, 9) is a solution. Indeed, 92 + 442 = 2017.

10.2 More sums of squares

Let = ∈ ℕ. We can write it as = = 021 where 0, 1 ∈ ℕ and 1 is square free, in the sense that if ? isa prime that divides 1, then ?2 does not divide 1). For example,

1440 = 25325 = 122 · 10.

We shall refer to 1 as the square-free part of =.

Theorem 57 (Fermat & Euler) A positive integer = is the sum of two squares (of integers) if and onlyif the square-free part d of = has no prime factors congruent to 3mod 4.

Proof. Suppose �rstly that d has no prime factors congruent to 3mod 4.

If d = ±1, then = is a square and it is evidently a sum of squares (since 02 is a sqaure). Wemay then suppose that d > 1.Since the product of sums of squares is, again, a sum of squares[if U = A2 + B2 and V = C2 + D2, then UV = (A2 + B2) (C2 + D2) = (AC − BD)2 + (AD + BC)2], it su�ces toestablish that any prime factor ? of d is a sum of squares. In theory, there are 4 cases mod 4 todeal with:

? ≡ 0 This can occur since ? is a prime.

? ≡ 1 This follows from Corollary 56.

60

? ≡ 2 The only possibility for ? (prime and congruent to 2mod 4) is 2. Clearly 2 is a sumof squares 2 = 12 + 12!

? ≡ 3 This is excluded by assumption.

Conversely, suppose that = = A2+B2 for some integers A, B. It su�ces to prove that if ? dividesthe square-free part of =, then ? is not congruent to 3 mod 4. It is equivalent to establishingthat if ? ≡ 3mod 4 and ?# is the maximal ?-power divisor of =, then # is even. We shall provethe latter by induction on 1.

If = = 1, then the assertion holds (as = = 1 has no non-trivial divisors).

Suppose that the assertion holds for a positive integer < =. Suppose that ? ≡ 3mod 4 and? |=. The goal is to show that ? divides = an even number of times.

? |A and ? |B Suppose WLOG that ? does not divide A. Therefore there exists C such thatAC ≡ 1mod ?. On the other hand, since ? |=, it follows from = = A2 + B2 that

A2 + B2 ≡ 0

mod ?. Multiplying the congruence by C2, we obtain

0 ≡ C2(A2 + B2) = (AC)2 + (BC)2 = 1 + (BC)2,

i.e. (BC)2 ≡ −1mod ?. In other words,(−1?

)= 1 but this contradicts Rule 2 in Theorem 25 (this

is where we use ? ≡ 3mod 4).

Let A = ?D and B = ?E. Substitute those back into the equation, we have

= = A2 + B2 = ?2(D2 + E2).

Since (D2 + E2) < =, it follows from the inductive hypothesis that ? divides (D2 + E2) an evennumber of times. The same remains true for ?2(D2 + E2). �

Example. 585 = 32 · 5 · 13. The square-free part is 5 · 13 and both prime factors 5 and 13 arecongruent to 1 mod 4, in particular NOT congruent to 3mod 4. According to the theorem, weshould be able to express 585 as a sum of four integer squares. In fact,

5 = 22 + 12 = A2 + B2

and13 = 22 + 32 = C2 + D2

And it follows from the formula in the proof of the theorem that

5 · 13 = (A2 + B2) (C2 + D2) = (AC − BD)2 + (AD + BC)2 = (2 · 3 − 1 · 2)2 + (2 · 2 + 3 · 1)2 = 42 + 72.


585 = 32(42 + 72) = (3 · 4)2 + (3 · 7)2 = 122 + 212.

61

Remark. We were ‘lucky’ that we could easily spot that squares for 5 and 13 respectively inthe example. What should we do if numbers are much bigger? Note that a positive integer =will NOT be a sum of squares if there is a prime congruent to 3mod 4 that divides the square-free part. So if we know that no prime factor of the square-free part of = is congruent to 3mod4 (the theorem ascertains that = is a sum of squares), what we need to do is to write all primefactors congruent to 1mod 4 (the only prime congruent to 2mod 4 is 2 and it is 12 + 12, whilethere is no prime congruent to 0mod 4) as sums of two squares, for whichwemaymake appealto Hermite’s algorithm, and use the product formula in the proof of the theorem.

Theorem 58 (Legendre & Gauss) Every positive integer can be written as a sum of three squares (ofintegers) except for those of the form 4A (8I + 1) for A, I ≥ 0.

Theorem 59 (Lagrange) Every positive integer can be written as a sum of four squares (of integers).

Proof (NON-EXAMINABLE). I learned the proof from A. Baker; it illustrates Fermat’s ‘in�n-ite descent argument’.

Firstly, by the formal identity

(G2 + H2 + I2 + F2) (B2 + C2 + D2 + E2) = (GB + HC + ID + FE)2 + (GC − HB + FD − IE)2+ (GD − IB + HE − FC)2 + (GE − FB + IC − HD)2

that the product of two sums of four squares is again a sum of four squares. Therefore the the-orem follows if we can prove that every prime number is a sum of four squares. In fact, since2 = 12 + 12 + 02 + 02, it su�ce to prove it for an odd prime number.

Consider the set- = {G2 | 0 ≤ G ≤ ? − 1

2}.

The elements in the set are NOT congruent to each othermod ?. Indeed, if B2 ≡ C2mod ? where0 ≤ B, C ≤ ? − 1

2, then ? would divide B2 − C2 = (B + C) (B − C); however, since 0 ≤ B + C ≤ ? − 1 and

− ? − 12≤ B − C ≤ ? − 1

2, it is evident that ? does not divide B + C nor B − C.

Similarly, the elements of the set

. = {−1 − H2 | 0 ≤ H ≤ ? − 12}

are NOT congruent to each other neither (this can be proved similarly). Each of these two sets

contain 1 + ? − 12

elements and therefore |- | + |. | = ? + 1 elements in total. Therefore if weconsider their residues mod ?, there exists at least one pair of elements (G, H) ∈ (-,. ) whoseresides mod ? coincide (the pigenhole principle), i.e.,

G2 ≡ −1 − H2

mod ?. Furthermore, since G <?

2and H <

?

2,

0 < G2 + H2 + 1 < 2( ?2

)2+ 1 < ?2.

62

Wemay therefore let G2 + H2 + 1 = : ? for some 0 < : < ?.

We now de�ne ℓ to be the least positive integer such that ?ℓ = B2 + C2 + D2 + E2 for someB, C, D, E ∈ ℤ– we may and will �nd the smallest positive integer of the form B2 + C2 + D2 + E2 thatis divisible by ?, and ℓ is simply its quotient by ?.

Let B, C, D, E be a set of integers satisfying ?ℓ = B2 + C2 + D2 + E2.

ℓ ≤ : < ? This follows by de�nition.

ℓ is odd Suppose that ℓ is even. Then B2+C2+D2+E2 is even, hence either 0, 2, or 4 of {B, C, D, E}are even. If at least two of them are even, we may relabel them if necessary to assume that Band C are even. In this case, B + C, B − C, D + E, D − E are indeed all even! In fact, even if B, C, D, Eare all odd, B + C, B − C, D + E and D − E are all even. Granted,( B + C

2

)2+

( B − C2

)2+

(D + E2

)2+

(D − E2

)2=B2 + C2 + D2 + E2

2= ?

ℓ

2∈ ℕ

contradicting the minimality of ℓ. Therefore ℓ is odd.

It remains to establish that ℓ = 1 . To this end, suppose that ℓ > 1. Let B denote the residueof B by ℓ, i.e., the unique integer 0 ≤ B ≤ ℓ − 1 congruent to B; in fact, it is possible to choose Bsuch that 0 ≤ B < ℓ

2(since ℓ is odd, B =

ℓ

2cannot hold). Similarly de�ne C, D, E, and let

# = B2 + C2 + D2 + E2.

ℓ divides # Since ℓ divides B2+C2+D2+E2, it follows that B2+C2+D2+E2 is congruent to 0mod ℓ.

# > 0 If # = 0, then B = C = D = E = 0, i.e., ℓ divides B, C, D and E. It would then follow thatℓ2 divides B2, C2, D2 and E2 and consequently it divides B2 + C2 + D2 + E2. As the latter is ℓ?, thiswould mean that ℓ divides ? but by de�nition ℓ < ? and this cannot possibly happen.

It follows that

# < 4

(ℓ

2

)2= ℓ2

and therefore # = Aℓ for some integer 0 < A < ℓ. By the formal identity, the product (Aℓ) (?ℓ) ofAℓ = B2 + C2 + D2 + E2 and ?ℓ = B2 + C2 + D2 + E2 is again a sum of four squares. As the product isdivisible by ℓ2, it is easy to see that each of the four squares is in fact divisible by ℓ2. Dividingthrough by ℓ2, we then see that A ? is a sum of four squares, but this contradicts the minimalityof ℓ. �

11 Algebraic number theory

De�nition. Let U be a complex number.

• U is an algebraic number if there is a non-zero polynomial 5 (G) ∈ ℚ[G] such that 5 (U) = 0;

• U is an algebraic integer if there is a non-zero monic polynomial 5 (G) ∈ ℤ[G] such that5 (U) = 0.

63

• U is a transcendental number if it is NOT an algebraic number.

By a monic polynomial 5 (G), it means that the coe�cient of the highest power (=degree of5 ) of G is exactly 1.

Remark. By de�nition, an algebraic integer is an algebraic number.

Example. A rational number is an algebraic number. A rational number A ∈ ℚ is a root ofthe monic polynomial G − A ∈ ℚ[G]. Similarly, an integer is an algebraic integer.

Example. U =√2 is an algebraic integer. It is a root of the polynomial G2 −2which is monic

and has coe�cients in ℤ.

Example. U =1√2is an algebraic number. Is this an algebraic integer? If U =

1√2, then

U2 =1

2, hence U is a root of the polynomial G2 − 1

2∈ ℚ[G] monic but not not all coe�cients are

inℤ; alternatively, wemay think of U as a root of the polynomial 2G2−1with integer coe�cientsbut it is not monic. It seems likely U is not an algebraic integer (the argument above is not goodenough to conclude U is not an algebraic integer– we have not eliminated the possibility thattheremight be a strange monic polynomial with integer coe�cients with U its root.

Example. c is a transcendental number, i.e., not an algebraic number. This is a theorem ofLindermann about 150 years ago. ‘Transcendental number theory’ is what A. Baker got a Fieldsmedal (1970) for.

Proposition 60 A rational number is an algebraic integer if and only if it is an integer.

Proof. It su�ces to prove that if a rational number A =B

C, with gcd(B, C) = 1, is an algebraic

number, then A ∈ ℤ, i.e., C = 1.By de�nition, A satis�es

A= + 2=−1A=−1 + · · · + 21A + 2 = 0.

Substituting A =B

Cand subsequently multiplying by C=, we obtain

B= + 2=−1B=−1C + · · · + 21BC=−1 + 2C= = 0.

If wewrite B= = −(2=−1B=−1C+· · ·+21BC=−1+2C=), one sees that C divides the RHS and thereforealso divides the LHS, B=.

Suppose that C > 1 (the goal is to deduce a contradiction). Let ? be a prime factor of C (whichexists because C > 1). Since C divides B=, the prime factor ? divides B= and it follows fromLemma4 that ? divides B. However, since ? divides C, it follows that ? |gcd(B, C). But gcd(B, C) = 1 andthis is a contradiction. �

De�nition. Let U be an algebraic number. The minimal polynomial of U is the non-zero,monic polynomial 5 (G) ∈ ℚ[G] of smallest possible degree, such that 5 (U) = 0.

Remark. Existence Given U, there is a monic polynomial (with rational coe�cients) U sat-is�es. Since U is algebraic, there is a polynomial 5 (G) = 2=G= + 2=−1G=−1 + · · · + 21G + 20 ∈ ℚ[G]

64

such that 2= is non-zero and 5 (U) = 0. Then1

2=5 ∈ ℚ[G] is a such monic polynomial. We then

choose one of the smallest degree– �nd one that is irreducible, i.e., that can not be factorisedas a product of polynomials inℚ[G] of smaller degrees. In fact, the minimal polynomial 5 of Uis irreducible (if reducible, it would contradicts the minimality of deg 5 ). Furthermore, if 6(G)is a polynomial in ℚ[G] such that 6(U) = 0, then 5 necessarily divides 6; indeed by ‘divisionalgorithm’, there exist @ and A in ℚ[G] such that 6 = @ 5 + A with deg A < deg 5 , and it followsfrom 6(U) = 0 = 5 (U) that A (U) = 0, contradicting theminimality of degree of 5 ! So theminimalpolynomial is an irreducible ‘factor’ of a polynomial of which U is a root, but it is not alwayseasy to spot one!

(NON-EXAMINABLE) A slick way of saying that, there is aminimal polynomial for an algeb-raic number, is that the ring ℚ[G] of polynomials with rational coe�cients is a UFD (UniqueFactorisationDomain), hence aPID (Principal IntegralDomain))–we can runEuclid’s algorithm

with polynomials with rational coe�cients. For example, a 11-th root of unity cos2c

11+ 8 sin 2c

11is, by de�nition, a root of the polynomial G11 − 1 but

G11 − 1 = (G − 1) (G10 + G9 + G8 + G7 + G6 + G5 + G4 + G3 + G2 + G1 + 1)

suggests that G10 + G9 + G8 + G7 + G6 + G5 + G4 + G3 + G2 + G1 + 1 is a good candidate for a minimalpolynomial. How do we know that it is rreducible?

Uniqueness theminimal polynomial is unique. Indeed, if 5 and 6were two distinctmonicpolynomials ofU, thenUwould be a root of ℎ = 5 −6with deg ℎ < deg 5 = deg 6. This contradictsthe minimality of the degree of 5 (and 6)– the key point is that the minimal polynomial ismonic!

Theorem 61 (Gauss’s lemma) The algebraic number U is an algebraic integer if and only if its minimalpolynomial has integer coe�cients.

Example. Indeed, we can make appeal to Gauss’s lemma to establish that U =1√2is NOT

an algebraic integer. As we saw earlier, U is a root of the polynomial G2 − 1

2. By Gauss’ lemma,

we are home if we show that this is the minimal polynomial of U. In fact, since G2− 12is monic,

it su�ces to show that there is no monic polynomial of degree < deg(G2 − 1

2) = 2 of which U is

a root. But if U is a root of a degree 1 polynomial with rational coe�cient, then U should be arational number.

11.1 Quadratic number �elds

De�nition. An algebraic number is a quadratic number if its minimal polynomial is of degree2. It is a quadratic integer if its minimal polynomial is of degree 2 and has integer coe�cients.

An algebraic number is a quadratic integer if and only if it is a quadratic number and analgebraic integer.

Remark. A quadratic number is not a rational number.

65

Example. The golden ratio (1 +√5)/2 is a quadratic integer. It is a root of the monic poly-

nomial G2 − G − 1.

Example. U = (−1+√−3)/2 is a quadratic integer. It is a cube root of unity, i.e., a root of the

polynomial G3 − 1 = (G − 1) (G2 + G + 1). The minimal polynomial of U is G2 + G + 1.

De�nition. An integer 3 is a square-free if ? is a prime that divides 3, then ?2 does notdivide 3.

Proposition 62 A complex number U is a quadratic number if and only if U is of the form B + C√3,

where B, C ∈ ℚ, B is non-zero and 3 is a square-free integer not equal to 1.

Proof. (⇐) Let U = B + C√3 be as above. If we let

5 (G) = (G − B)2 − C23,

then 5 is a monic polynomial with rational coe�cients and 5 (U) = 0.

(⇒) Let 5 (G) = G2 + 1G + 2 ∈ ℚ[G] satisfying 5 (U) = 0. Then U = −12±

√12

4− 2. Since it is

rational we may write12

4− 2 = C

D=

1

D2CD

for some C, D ∈ ℤ, D non-zero. Furthermore, since CD is an integer, wemaywrite it as the productof a square of integers and a square-free integer:

CD = d23

for some integer d > 0 and a square free integer 3 not equal to 1. It then follows that

U = −12±

√1

D2d23 = −1

2± dD

√3.

hence (B, C) = (−12,± dD) works. �

Remark. When U is a quadratic number, the expression U = B + C√3 is unique. Indeed, if

B + C√3 = B′ + C ′

√3, then B − B′ = (C ′ − C)

√3 where the LHS is a rational number while the RHS

would be irrational if C ′ is distinct from C. Therefore C ′ = C. It follows that B = B′ as a result.

We know from Proposition 62 that

{quadratic numbers} ⊃ {B + C√3 | B ∈ ℚ, C ∈ ℚ − {0}} ⊂ ℚ(

√3) = {B + C

√3 | B, C ∈ ℚ}

where ℚ(√3) is in fact a �eld; if B + C

√3 is non-zero (i.e. B and C are not simultaneously 0, then

B

B2 − 3C2 −C

B2 − C23√3 is its inverse in ℚ[

√3]. When 3 = 1, it is simply ℚ. What is an analogue

of ℤ ⊂ ℚ for a general ℚ(√3).

De�nition. Let 3 be a square-free integer. The set of elements U in ℚ(√3) which are algeb-

raic integers de�nes a ring. The ring is called the ring of integers of ℚ(√3).

66

Proposition 63 Let 3 be a square-free integer. Then the ring of integers ofℚ(√3) is

• ℤ[√3] = {B + C

√3 | B, C ∈ ℤ} if 3 ≡ 2, 3mod 4,

• ℤ

[1 +√3

2

]= {B′ + C ′1 +

√3

2| B′, C ′ ∈ ℤ} if 3 ≡ 1mod 4.

Proof. This has been proved in Section 3.7. To summarise the argument, let U be an elementB + C√3 of ℚ[

√3]. It is a root of the polynomial

G2 − 2BG + (B2 − 3C2).

For U to be an algebraic integer in ℚ[√3], the coe�cients 2B and B2 − 3C2 both need to be

integers. These conditions boil down to both B and C being integers if 3 ≡ 2 or 3mod 4, or (B, C)being of the form (B′ + C

′

2,C ′

2) for some integers B′, C ′ if 3 ≡ 1mod 4. In the latter case,

B + C√3 = (B′ + C

′

2) + C

′

2

√3 = B′ + C ′1 +

√3

2.

�

Remark. Note that

ℤ[√3] = {(B + C) + C

√3} = {B + 2C 1 +

√3

2} ⊂ {B′ + C ′1 +

√3

2} = ℤ[1 +

√3

2]

with the complement

ℤ[1 +√3

2] − ℤ[

√3] = {B + (2C + 1) 1 +

√3

2} = {(B + C + 1

2) + (C + 1

2)√3} =

(ℤ + 1

2

)+

(ℤ + 1

2

) √3

Example. When 3 = −1, the ring ℤ[√−1] of integers of ℚ[

√−1] is referred to as the Gaus-

sian integers.

Proposition 64 A quadratic number U is a quadratic integer if and only if it is of the form U = B+ C√3

for a square-free integer 3 > 1 and

• either (B, C) ∈ ℤ × (ℤ − {0})

• or (B, C) ∈(ℤ + 1

2

)+

(ℤ + 1

2

)when 3 ≡ 1mod 4.

Proof. Following the notation in the proof of Proposition 63, we observe that for U = B + C√3

be a quadratic integer, we not only demand 2B, B2−3C2 ∈ ℤ (so U is in particular an algebraic in-teger) butwe also demand that the polynomial G2−2BG+B2−3C2 is irreducible inℚ[G] (so that thepolynomial isminimal for U). The latter is equivalent to demanding that the polynomial is NOTa product of two degree one polynomials with rational coe�cients, i.e. the solutions B ± C

√3

are NOT rationals, i.e., C is non-zero. When 3 ≡ 1 mod 4, this corresponds to {B′ + C ′1 +√3

2}

with non-zero C ′ and this corresponds to the union of ( = {B′ + 2C 1 +√3

2| B′ ∈ ℤ, C ∈ ℤ − {0}} =

67

{B + C√3 | B ∈ ℤ, C ∈ ℤ − {0}} and ) = {B′ + (2C + 1) 1 +

√3

2| B′ ∈ ℤ, C ∈ ℤ}. �

To sum up,

{‘quadratic’ algebraic numbers} ⊃ {quadratic numbers}

ℚ(√3) ⊃ {B + C

√3 | B ∈ ℚ, C ∈ ℚ − {0}}

{‘quadratic’ algebraic integers} ⊃ {quadratic integers}

3 ≡ 2 or 3mod 4 ℤ[√3] ⊃ {B + C

√3 | B ∈ ℤ, C ∈ ℤ − {0}}

3 ≡ 1mod 4 ℤ

[1 +√3

2

]⊃ {B′ + C ′1 +

√3

2| B′ ∈ ℤ, C ′ ∈ ℤ − {0}}

| |( ∪ )

11.2 The ring of integers inℚ(√3)

De�nition. Let ' be a ring. An element A in ' is said to be a unit if there exists B in ' such thatAB = 1.

Example. The units inℤ are±1. If A and B are integers such that AB = 1, the only possibilitiesfor (A, B) are (1, 1) or (−1,−1).

Example. ±1,±√−1 are units in ℤ[

√−1]. This is because 1 · 1 = 1, (−1) · (−1) = 1,

√−1 ·

(−√−1) = 1. Indeed, they are the units in ℤ[

√−1] (to be explained shortly).

Remark. It might be useful for us to understand what units in ' = ℤ[√3] look like. Let

A = B + C√3 and ' = ( +)

√3 where B, C, (, ) ∈ ℤ. The condition A' = 1 would then imply that (1)

B( + C)3 = 1 and (2) C( + B) = 0. It follows from (1) × B − (2) × C3 that ((B2 − 3C2) = B and from(1) × C − (2) × B that ) (B2 − 3C2) = C. To sum up,

( + )√3 =

B

B2 − 3C2 +C

B2 − 3C2√3

and therefore bothB

B2 − 3C2 andC

B2 − 3C2 should be integers. In fact, B2 − 3C2 should be 1 or −1

because of this. See the forthcoming proposition [it is, of course, possible to prove this directly].

De�nition. Given U = B + C√3 ∈ ℚ(

√3), let

U = B − C√3 ∈ ℚ(

√3)

and we call it the conjugate of U.

Lemma 65 Let 3 be a square-free integer and U, V ∈ ℚ(√3).

• U = V if and only if U = V.

68

• UU ∈ ℤ if U = B + C√3 ∈ ℤ[

√3].

• UV = U V.

Proof. This is straightforward. �

Proposition 66 Suppose that 3 is a square-free integer and 3 ≡ 2, 3mod 4 (hence the ring of integersinℚ(

√3) isℤ[

√3]). An integer U = A + B

√3 ∈ ℤ[

√3] is a unit if and only if |UU | = 1, or equivalently,

B2 − 3C2 = ±1,

i.e., (B, C) is a solution of Pell’s equation G2 − 3H2 = ±1.

Remark(NON-EXAMINABLE).When 3 ≡ 1mod 4, the ring of integers inℚ(√3) isℤ[1 +

√3

2].

An element inℤ[1 +√3

2] is of the of form A = B+ C 1 +

√3

2= B′+ C ′

√3 where B′ = B+ C

2=2B + C2∈ ℚ

and C ′ =C

2∈ ℚ. Analogous to the argument in the previous remark, it follows from A' = 1 for

' = ( + ) 1 +√3

2= (′ + ) ′

√3 ∈ ℤ[1 +

√3

2] that

(′ + ) ′√3 =

B′

B′2 − 3C ′2 +C

B′2 − 3C ′2√3 =

B

D+ CD

1 +√3

2,

where D =1

4((2B + C)2 − 3C2) = B2 + BC + 1 − 3

4C2 ∈ ℤ (since 3 ≡ 1 implies

1 − 34∈ ℤ). Demanding

B

D∈ ℤ and

C

D∈ ℤ simultaneously is equivalent to D = ±1 [if not, there would be a prime ? > 1

that divides D. And any power of ? would then divide B and C, which is evidently a contradic-tion] which is markedly di�erent from what we get when 3 ≡ 2 or 3mod 4. Note that D = ±1 isequivalent to (2B + C)2 − 3C2 = ±4.

Proof. Suppose that U = B + C√3 ∈ ℤ[

√3] is a unit. Then there exists V ∈ ℤ[

√3] such that

UV = 1. By the �rst part of Lemma 65, UV = 1 = 1. It follows from the third part of Lemma 65then that

1 = (UV) (UV) = UUVV.

From the second part of Lemma 65, VV ∈ ℤ and UU = B2−3C2 ∈ ℤ and the equality in fact claimsthat UU = A2−3B2 is a unit inℤ. Since the units inℤ are ±1, we then conclude that A2−3B2 = ±1.

Conversely, suppose that A2 − 3B2 = ±1. Then (UU)2 = (A2 − 3B2)2 = 1. In other words,

U(UU U) = 1,

which says that U is a unit in ℤ[√3]. �

Example. The units in ℤ[√−1] are ±1,±

√−1. By Proposition 66, the units in ℤ[

√−1] are

B + C√−1 such that B2 + C2 = 1 for integers B and C. The only possible pairs (B, C) are (±1, 0) and

(0,±1).

Example. ℤ[√3] has in�nitely many units. Indeed, the units in ℤ[

√3] are of the form

(2+√3)= for = inℕ. Sincewe know that the fundamental solution to the Pell’s equation G2−3H2 =

±1 is (B, C) = (2, 1) and Theorem 51 asserts that every solution (E=, F=) is given by E= + F=√3 =

69

(B + C√3)=.

Example(NON-EXAMINABLE). Then units inℤ[1 +√3

2] are {B+C 1 +

√−3

2| B2+BC+C2 = 1}. To

solve the equation B2+ BC + C2 = 1 in B, C ∈ ℤ, we �rstly make appeal to the quadratic formula and

see that B =−C ±

√C2 − 4(C2 − 1)

2=−C ±

√−3C3 + 42

. For B to be an integer, there are two cases tofollow:

C is even In this case, −3C2 + 4 should be of the form (2U)2 for some U ∈ ℤ. It then followsfrom the equation (2U)2 + 3C2 = 4 that C2 = 0 (as C is meant to be an even integer) and therefore

that −3C3 + 4 = 4 and B =±√4

2= ±1 as a result.

C is odd In this case, −3C2 + 4 is should be of the form (2U + 1)2 for some U ∈ ℤ. It thenfollows form the equation (2U + 1)2 + 3C2 = 4 that C2 = 1 (as C is meant to be an odd integer), i.e.

C = ±1. If C = 1, then −3C3 + 4 = 1 and B =−1 ±

√1

2= 0 or 1; if C = −1, thenm −3C2 + 4 = 1 and

B =1 ±√1

2= 1 or 0.

In conclusion, the units in ℤ[1 +√3

2] are elements of the form B + C 1 +

√−3

2where (B, C) is

(1, 0), (−1, 0), (0, 1), (−1, 1), (1,−1) or (0,−1).

The following theorem proves the structure of solutions of Pell’s equation G2 − 3H2 = ±1.

Theorem 67 (Dirichlet’s unit theorem for a real quadratic �eld; NON-EXAMINABLE) Let 3 be asquare-free positive integer congruent to 2 or 3mod 4. The group of units in the ringℤ[

√3] of integers

of ℤ[√3] is isomorphic to

{±1} × ℤ.

This is a distilled form of what Dirichlet actually proved for � in 1846 (apparently, P. G. L.Dirichlet came up with a proof during a concert in the Sistine Chapel in Rome).

Theorem 68 (Dirichlet’s unit theorem for a number �eld; NON-EXAMINABLE) Let � be a number�eld. Let Aℝ (resp. 2Aℂ) be the number |Homℚ(�,ℝ) | (resp. |Homℚ(�,ℂ) |) of real embeddings (ofpairs of complex conjugate embeddings). The group of units in � is �nitely generated by A = Aℝ + Aℂ −1generators of in�nite order, i.e., the group of units in � is isomorphic to

` × ℤA

where ` is the �nite cyclic group of roots of unity.

Remark. If Aℝ > 0, then ` = {±1} as ±1 are the only roots of unity in ℝ. Even if Aℝ = 0, westill have ` = {±1}; for example ℤ[

√−1] has units {±1,±

√−1}.

Remark. The unit group is �nite if and only if A = 0, i.e., (Aℝ, Aℂ) = (1, 0) or (0, 1), i.e, � = ℚ

orℚ(√3) with 3 < 0. When 3 is a negative square-free integer, the group of units in the ring of

integers of ℚ(√3) is {±1} except when 3 = −1 in which case it is {±1,±

√−1} and when 3 = −3

and there are 6 units in ℤ[1 +√−3

2].

70

12 Supplementary notes 1 (revision notes)

12.1 Relations and a few bits and bobs

De�nition. A relation on a set ( relates two elements of (. When elements 0 and 1 of ( arerelated, write 0 ∼ 1.

For example, we may de�ne a relation on ( = ℤ by decreeing that 0 ∼ 1 when 0 and 1 havethe same parity, i.e, (0, 1) = (odd, odd) or (0, 1) = (even, even). Equivalently, 0 ∼ 1 if 0 − 1 isdivisible by 2.

We may de�ne 0 ∼ 1 when 0 and 1 has the same �nal digit. Or we may de�ne 0 ∼ 1 when0 is distinct from 1.

We simply specify ( and de�ne a relation on it.

De�nition. A relation ∼ on ( is an equivalence relation if

• (re�exive) 0 ∼ 0 holds for every 0 in (.

• (symmetric) If 0 ∼ 1, then 1 ∼ 0 holds for every 0, 1 in (.

• (transitive) If 0 ∼ 1 and 1 ∼ 2, then 0 ∼ 2 holds for every 0, 1, 2 in (.

The �rst two example above are equivalence relations. On the other hand, the third (i.e.last) example fails to be re�exive and transitive!

De�nition. If ∼ is an equivalence relation on (, the equivalence class [0] of 0 in ( is the setof all elements in ( that are related by ∼ (i.e., equivalent because ∼ is not just a relation but it isan equivalent relation) to 0.

The key example is the congruence relation. Fix an integer =. We de�ne a relation ∼ (typic-ally written ≡ for the congruence relation) on ( = ℤ by decreeing 0 ≡ 1 (mod =) when =| (0 − 1),i.e., = divides 0 − 1.

Proposition 69 The congruence relation is an equivalence relation on ℤ

This is because

• (re�exive) 0 − 0 is 0, and = divides 0.

• (symmetric) If 0 ≡ 1 mod =, then, by de�nition, = divides 0 − 1, i.e., 0 − 1 = A= for someinteger A. Multiplying −1 on both sides, −(0 − 1) = −A=, i.e., 1 − 0 = (−A)=. The last saysthat = divides 1 − 0, i.e., 1 ≡ 0 mod =.

• (transitive) Since =| (0 − 1), there exists an integer A such that 0 − 1 = A=. On the otherhand, since =(1 − 2), there exists an integer B such that 1 − 2 = B=. Adding up these twoequalities, we have

A= + B= = (0 − 1) + (1 − 2),

i.e.,(A + B)= = 0 − 2.

In other words, =| (0 − 2).

71

Example. Let = = 10 and consider the equivalence relation ≡ mod 10 on ( = ℤ. The equi-valence class [1] (o�en written as [1]10) of 1 is the set of all integers that are related to 1 by ≡mod 10, i.e., all integers 0 such that 0 ≡ 1mod 10, i.e., all integers 0 such that 10 divides 0 − 1.So, as a set,

[1] = {. . . ,−19,−9, 1, 11, 21, . . . } = {1 + 10: | : ∈ ℤ}.

Similarly,[2] = {. . . ,−18,−8, 2, 12, 22, . . . } = {2 + 10: | : ∈ ℤ}.

How about [11]? One can argue as above with 11 in place of 1 to see that [11] is the set ofall integers 0 such that 10 divides 0 − 11. This is nothing other than

{11 + 10: | : ∈ ℤ} = {. . . ,−19,−9, 1, 11, 21, . . . } = {1 + 10; | ; ∈ ℤ},

in other words, [1] = [11]. In fact, any integer in [1] can represent the equivalence class of [1]:

· · · = [−19] = [−9] = [1] = [11] = [21] = · · · .

So it appears that one can identify a lot of equivalence/congruence classes onℤ. Howmanydistinct equivalence classes are there? The equivalence classes in the case of the mod 10 ex-ample are:

[0], [1], [2], [3], [4], [5], [6], [7], [8], [9] .

As mentioned above, it does not have to be [1]; it can be [11], or any [1 + 10:] for that matter,but it is o�en useful/convenient to choose representatives (the numbers inside [ ]) according to a�xed rule in mind. The representatives A above are chosen according to the rule that 0 ≤ A ≤ 9so that [A] can be seen as the set of integers with residue A when divided by 10.

The following proposition, or rather its proof, helps you to �nd all the equivalence classes.

De�nition. A partition of a set ( is a collection of subsets such that every element of ( liesin exactly one of them.

Example. If ( = {1, 2, 3}, then {1}, {2}, {3} de�ne a partition, {1, 2}, {3} also de�ne a parti-tion, but {1, 2}, {2, 3} do not de�ne a partition because 2 appears in both subsets.

Proposition 70 If ((,∼) is an equivalence relation, then the equivalence classes form a partition on (.

Proof. By re�exivity, 0 lies in [0]. So any integer lies in at least one equivalence class. It re-mains to show that if 0, 1 ∈ ℤ, then either [0] = [1] or [0]∩[1] holds. Suppose that [0]∩[1] ≠ ∅.The goal, therefore, is to show that [0] = [1]. Since [0] ∩ [1] is assumed to be non-empty, thereexists an element 2 in ( that lies in [0] ∩ [1]. By de�nition, this means that 2 ∼ 0 and 2 ∼ 1.By symmetry, 0 ∼ 2. It therefore follows from the transitivity that 0 ∼ 1. This implies [0] = [1](an exercise to check why this holds– one needs o check [0] ⊆ [1] and [1] ⊆ [0] using 0 ∼ 1and transitivity of the equivalent relation ∼). �

Fix an integer = and ≡ denote the mod = congruence relation on ℤ. We know that ≡ is anequivalence relation. We typically write ℤ= for the set of all equivalence classes– there are in-deed = equivalence classes corresponding to the number of possible ‘residues’ when dividedby =.

72

We can ‘add’ and ‘multiply’ equivalence classes in this case:

[0] + [1] = [0 + 1]

and[0] [1] = [01] .

Exercise. Check, in the case of = = 10, why this is possible, i.e, unravel what both sidesmean and observe that they demand the same set of integers.

In fact,

Proposition 71 ℤ= is a ring with respect to the addition and multiplication as de�ned above.

12.2 Rings and �elds (from notes)

De�nition. A ring is

• a set ',

• elements 0, 1 ∈ ',

• maps + : ' × ' → ' and × : ' × ' → ',

subject to 3 axioms

1. (', +) is an abelian group with identity 0,

2. (',×) is a commutative semigroup, i.e. 0 × (1 × 2) = (0 × 1) × 2, 0 × 1 = 1 × 0 = 0 and0 × 1 = 1 × 0 for all 0, 1, 2 ∈ '

3. (distributivity) 0 × (1 + 2) = 0 × 1 + 0 × 2 for all 0, 1, 2 ∈ '.

Examples. ℤ,ℚ,ℝ,ℂ.

De�nition. A �eld is a ring ' that satis�es the following extra properties

• 0 ≠ 1,

• every non-zero element of R has a multiplicative inverse: if A ∈ ' and it is non-zero, thenthere exists B ∈ ' such that AB = 1; in other words, '−{0} is a group under ×with identity1.

Non-example. The ring ℤ of integers is not a �eld. Indeed, 3 ∈ ℤ and 7 ∈ ℤ, but there is nointeger I such that 3G = 7, so 3/7 does not lie in ℤ.

Examples. ℚ,ℝ,ℂ.

73


De�nition. 0 ≡ 1 mod = if = divides 0 − 1.

De�nition. ℤ= is the set of congruence classes mod = with addition [0]= + [1]= = [0 + 1]=and multiplication [0]= [1]= = [01]=, with respect to which ℤ= de�nes a ring.

Be comfortable with the equivalent formulation

• 0 ≡ 1 mod =,

• [0]= = [1]= in ℤ=.

Proposition 8 proves for 0, = ∈ ℕ and 1 ∈ ℤ that

0G ≡ 1

mod = is soluble (i.e. can ‘get rid of 0’ and �nd G ≡ 2 mod = for some 2) if and only if gcd(0, =)divides 1. In particular, if gcd(0, =) = 1, we can always solve the congruence equation 0G ≡ 1mod =. This can be proved as follows (it contains the essence of what this proposition isabout): since gcd(0, =) = 1, it follows from Bezout/Euclid that there exist A, B ∈ ℤ such that0A + B= = gcd(0, =) = 1. Hence 0A ≡ 1 mod =. Multiplying A on the both side of the equation,G ≡ 1G ≡ 0AG ≡ A1 mod =.

Chinese Remainder Theorem.

Euler’s totient function q : ℕ → ℕ which sends = ∈ ℕ to the number of integers 1 ≤ I ≤ =such that gcd(I, =) = 1.

Theorem 17 proves formulas for q.

Theorem 15 proves for = ∈ ℕ and I ∈ ℤ such that gcd(I, =) = 1 that

Iq (=) ≡ 1

mod =. This generalises (the proof of) Fermat’s Little Theorem (Theorem 7).

De�nition. The order of an integer I is the smallest positive integer 3 such that I3 ≡ 1mod=. If [I] ∈ ℤ= denotes the congruence class mod = represented by I, i.e., the set of integerscongruent to I mod =, then the de�nition can be paraphrased as [I]3 = [I3] = [1].

Theorem15, Lemma19 andProposition 20 in combinationprove for I ∈ ℤ such that gcd(I, =) =1 that if 3 is the order of I mod =, then 3 divides q(=).

We specialise to ‘congruence mod ?’.

De�nition. An integer I is a primitive root mod ? if I has order ? − 1. From the statementabove, it means that I has max possible order.

We saw lots of examples of primitive roots mod ?.

74

Is there, really, a primitive root mod ? for any ?? If so, how many? What about those in-tegers of order 3 < ? − 1mod ?? How many?

Theorem 22 answers these questions. The number of integers 1 ≤ I ≤ ? − 1 of order1 ≤ 3 ≤ ? − 1 is exactly q(3).

A key observation: if I is a primitive root mod ? (it exists and there are indeed q(?) = ? − 1of them according to Theorem 22), then

{1, I, . . . , I?−2} ≡ {1, . . . , ? − 1}

mod ?.

Quadratic residues and non-residues. Suppose ? > 2.

De�nition. An integer 0 is a quadratic residue mod ? if there exists an integer I such thatI2 ≡ 0 mod ?, or equivalently the congruence equation G2 ≡ 0 mod ? is soluble.

If 0 ≡ 1 mod ?, 0 is a quadratic residue mod ? if and only if 1 is.

De�nition.(0

?

)=

0 if ? |0+1 if ? does not divide 0 and 0 is a quadratic residue mod ?−1 if ? does not divide 0 and 0 is not a quadratic residue mod ?

Theorems 25 asserts if ? is an odd prime, then:

(Rule 0) If 0 ≡ 1 mod ?, then(0

?

)=

(1

?

).

(Rule 1)(01

?

)=

(0

?

) (1

?

)(Rule 2)

(−1?

)(Rule 3)

(2

?

)(Rule 4)

(?

@

) (@

?

)= (−1) (?−1) (@−1)/4 for any pair of distinct odd primes ? and @.

Proposition 27 (Euler’s criterion) shows that if ? does not divide 0, then(0

?

)≡ 0 (?−1)/2 mod

?. The key observation is used crucially in proving this.

We use Euler’s criterion to solve congruence equations of the form G2 ≡ 0 mod ? (the Le-gendre symbol only tells you the solubility).

75

Proposition 28 asserts if ? ≡ 3mod 4 and(0

?

)= 1, then 0 (?+1)/4 is a solution to the equation

G2 ≡ 0 mod ?.

Proposition 29 asserts if ? ≡ 1mod 4 and(0

?

)= −1, then 0 (?−1)/4 is a solution to the equa-

tion G2 ≡ −1mod ?.

Admittedly, Theorem 30, Hensel’s lemma, was an avalanche. Given a polynomial %(G) withinteger coe�cients, we ask ourselves the following question:

Q. If we �nd a solution I= to %(G) ≡ 0mod ?# , can we �nd a solution I#+1 to %(G) ≡ 0mod?#+1 that satis�es (‘li�s’) I#+1 ≡ I# mod ?# ?

A. Yes, if %′(I# ) =3%

3G|G=I# is not divisible by ?. Indeed,

I#+1 = I# − %(I# )& ′(I# )

where & ′(I# ) is the inverse of %′(I# ) mod ? [Recall that, given an integer 0, e.g. %′(I# ), theinverse of 0 is an integer 1 such that 01 ≡ 1mod ?. How do we �nd a such 1 in practice? When0 is not divisible by ?, e.g. when 0 = %′(I# ), gcd(0, ?) = 1. It then follows from Bezout/Euclidthat there exist A, B ∈ ℤ such that 0A + ?B = 1; hence 0A ≡ 1mod ?. The 1 we are looking for isA.]

We saw a bunch of examples for Hensel.


Finite continued fractions

De�nition. Given U, U1, . . . , U#−1 ∈ ℤ and U# ∈ ℝ (where # ≥ 1), we write

[U;U1, . . . , U# ]

to meanU + 1

U1 +1

. . . + 1

U#−1 +1

U#

By de�nition, this is a rational number.

Proposition 31 shows conversely that, given a rational number in its lowest terms, one canwrite it as [U;U1, . . . , U# ] for some #, and Theorem 32 in fact shows that the continued fractionexpression is indeed unique.

Proposition 31 is proved in two di�erent ways, namely by induction and by an argumentthat involves Euclid’s algorithm. We turn the latter into an algorithm:

76

U = bAc ⇒ d1 =1

A − Uw

U1 = bd1c ⇒ d2 =1

A1 − U1w...

w

U#−1 = bd#−1c ⇒ d# =1

d#−1 − U#−1∈ ℕ

w

U# = bd# c = d#When the input A is a rational number, the algorithm stops as soon as we see a positive

integer d# .

Example. A =87

38.

U = b8738c = 2 ⇒ A1 =

18738 − 2

=38

11

w

U1 = b38

11c = 3 ⇒ A2 =

13811 − 2

=11

5

w

U2 = b11

5c = 2 ⇒ A3 =

1115 − 2

=5

1∈ ℕ

w

U3 = b5

1c = 5 = A3

The algorithmworks for negative rationals! The only di�erence is that the �rst term ‘U’ willinevitably a negative integer.

Let A = [U, U1, . . . , U# ]. For 0 ≤ = ≤ #, de�ne

A0 = [U; ]A1 = [U;U1]

...

A= = [U;U1, . . . , U=]...

A# = [U;U1, . . . , U# ]

and call them the convergents to A.

It turns out that it is possible to rewrite this in terms of pairs B=, C= for 0 ≤ = ≤ #, which arede�ned as

• B−1 = 0, B0 = U,B= = U=B=−1 + B=−2,

77

• C−1 = 0, C0 = 1,C= = U=C= + C=−2.

Proposition 33 asserts that, for 0 ≤ = ≤ #, we may write A= = [U;U1, . . . , U=] as

A= =B=

C=.

Theorem 34 proves how the B=’ and the C=’s are related.

• B=C=−1 − C=B=−1 = (−1)=−1 for every = ≥ 1,

• A= − A=−1 =(−1)=−1C=C=−1

for every = ≥ 1,

• gcd(B=, C=) = 1.

As a corollary, we see that ‘even (resp. odd) indexed’ convergents de�ne an increasing (resp.decreasing) sequence of positive integers:

A0 < A2 < A4 < · · · < A28 < A2 9−1 < · · · < A5 < A3 < A1.

What if we know more than �nitely many

U, U1, . . . , U#

and know that in�nitely manyU, U1, . . .

positive integers (U may be negative)? Does it make sense to think about [U;U1, . . . ]? Would itmake sense to consider the limit of [U;U1, . . . , U# ] as # tends to∞?

Theorem 36 answers ‘Yes’! More precisely, the convergents A= = [U;U1, . . . , U=] =B=

C=do con-

verge to a limit in ℝ (no longer in ℚ as in the �nite case) as =→∞.

Theorem 37 characterises what these in�nite length continued fractions look like. It provesthat every irrational number can be written as [U;U1, . . . ], and Theorem 39 says that the ex-pression is unique.

Amazingly, just as in the case of �nite continued fractions, the algorithm we use work forirrationals!

Example. A =√2.

U = b√2c = 1 ⇒ d1 =

1√2 − 1

= 1 +√2

w

U1 = b1 +√2c = 2 ⇒ d2 =

1

(1 +√2) − 2

=1

√2 − 1

= d1

w

U2 = U1 ⇒ d3 = d2 = d1w...

78

Hence√2 = [U;U1, U2, . . . ] = [1; 2, 2, . . . ].

Example. A =√3.

U = b√3c = 1 ⇒ d1 =

1√3 − 1

=1 +√3

2w

U1 = b1 +√3

2c = 1 ⇒ d2 =

1

1+√3

2 − 1=

2√3 − 1

= 1 +√3

w

U2 = b1 +√3c = 2 ⇒ d3 =

1

(√3 + 1) − 2

=1

√3 − 1

= d1

w

U3 = U1 ⇒ d4 = d2w...

Hence√3 = [U;U1, U2, . . . ] = [1; 1, 2, 1, 2, . . . ].

We o�en see ‘periodic’ continued fractions (e.g.√2,√3). Write

[U;U1, . . . , U#−1, U# , . . . , U#+;−1]

if[U;U1, . . . , U#−1, U# , U#+1, . . . , U#+;−1,

| | | | | |U#+;, U#+;+1, . . . U#+2;−1,| | | | | |

U#+2; . . . . . . ]

Examples.√2 = [1; 2],

√3 = [1; 1, 2].

On the other hand, if we are given a periodic continued fraction, we can work out what itlooks like as a real number.

Example. [1; 2].

Let A = [1; 2]. ThenA = 1 + 1

2 + 1

[1; 2]

=1

2 + 1

A

=3A + 12A + 1 .

It therefore follows that A (2A + 1) = 3A + 1, i.e., 2A2 − 2A − 1 = 0. By the quadratic formula,the solutions for the quadratic equation is

1 ±√3

2.

Since A > 0, A =1 ±√3

2.

79

This simple example contains the essence of Theorem 46 which asserts that a real numberhas a periodic continued fraction if and only if it is a quadratic irrational, i.e., it is of the formB + C√3 where B, C ∈ ℚ, C is non-zero and 3 > 1 is square-free.

Given an irrational number A, we are interested in �nding a ‘good’ approximation inℚ to A.What should we mean by ‘good’?

De�nition. Let A be an irrational number. A rational numberB

Cis a good approximation to

A if|A − B

C| < |A − B

′

C ′|

for anyB′

C ′with C ′ < C.

Remark. It means that ‘there is no rational number closer to A thanB

Cis to A with smaller

denominator. IfB′

C ′has a smaller denominator than that of

B

C, it has to be further from A than

B

Cis to A. This is exactly what it says in the the de�nition.

In fact, Theorem 43 proves that if A is irrational, [U;U1, . . . ] is its continued fraction and A=is the =-th convergent [U;U1, . . . , U=], then A= is a good approximation to A for = ≥ 2, i.e.,

|A − B=C=| < |A − B

′

C ′|

for anyB′

C ′with C ′ < C=.

Theorem 44 asserts that a rational. number su�ciently close to an irrational number A usinevitably a convergent to A.


In Lecture 9-4, I provided a narrative and explainedwhy (NON-EXAMINABLE)√3 (for a square-

free positive integer 3) has continued fraction expression√3 = [U;U1, U2, . . . , U2, U1, 2U] .

Recall:

De�nition. A real number A ∈ ℝ is a quadratic irrational if if it of the form B + C√3 where

B, C ∈ ℚ, C is no-zero and 3 > 0 is square-free.

By de�nition,

A = B + C√3 ⇒ (A − B)2 = C23 ⇒ A2 − 2BA + (B2 − C23) = 0,

hence A is a solution to the quadratic equation G2 − 2BG + (B2 − C23) with rational coe�cients.

De�nition. Let A be the other solution of the quadratic equation; we call A the conjugate of A.

The key idea I would like to promote in this document is

80

the conjugate A helps us understand A a lot better than A on its own

This will be particularly evident in the last section of the course (algebraic number theory).

De�nition. A quadratic irrational A is reduced if

• A > 1

• the conjugate A satis�es −1 < A < 0.

Example. A = 1 +√2 is a reduced quadratic irrational. Evidently A > 1, while the conjugate

(check!) is A = 1 −√2 and it satis�es −1 < A < 0.

On the other hand,√2 is a quadratic irrational but NOT reduced. In fact,

√3 is NOT reduced

for any 3.Example. A =

√3 + b

√3c is a reduced quadratic irrational, for any 3. The conjugate A is

−√3 + b√3c.

We will see that Theorem 46 (Theorem 45 for⇒){periodic continued fractions

}⇐⇒

{quadratic irrationals

}specialises to{

purelyperiodic continued fractions

}⇐⇒

{reduced

quadratic irrationals

}In fact, we will see⇐= (and =⇒) in the special case to give you an idea of how⇐= in The-

orem 46 should work.

Let A be a reduced quadratic irrational. We may let

A = B + C√3 =

% +√�

&

for some integers %,&, � where � is not necessarily square-free. More precisely, as A satis�esthe quadratic equation

0G2 + 1G + 2 = 0

where 0, 1 and 2 are integers, the solutions of the equation is computed by the quadratic for-

mula and they are−1 ±

√12 − 40220

. By de�nition, A > A, hence A =−1 +

√12 − 40220

while

A =−1 −

√12 − 40220

. As a result,% = −1,

& = 20,

� = 12 − 402.

Using these, we may deduce the following.

& divides %2 − � This follows from%2 − �&

=12 − (12 − 402)

20=402

20= 22 ∈ ℤ.

81

Since A is reduced, A > 1 and −1 < A < 0. It follows from these inequalities that

& > 0 As A − A > 0, we have2√�

&> 0, hence & > 0.

% > 0 As A + A > 0, we have%

&> 0 and it follows from & > 0 then that % > 0.

% <√� Since A =

% −√�

&< 0 and & > 0, we have % −

√� < 0.

& < % +√� < 2

√� Since A =

% +√�

&> 1 and & > 0, we have & < % +

√�. The other

inequality follows from % <√� above.

To sum up, if A is a reduced quadratic irrational, it is of the form% +√�

&for some integers

%,&, � satisfying%,& > 0, % <

√�, & < 2

√�, & |%2 − �.

Let us now compute the continued fraction of (reduced) A. The �rst step of the algorithmis:

U = bAc ⇒ d1 =1

A − U .

Alternatively, A = U + 1

d1.

We claim that d1 is (again) a reduced quadratic irrational . It is le� as an exercise to checkthat d1 is a quadratic irrational. To see that it is reduced:

d1 > 1 Since 0 ≤ A − U = A − bAc < 1, we have d1 =1

A − U > 1.

−1 < d1 < 0 It is le� as an exercise to check

A = U + 1

d1,

or equavalently d1 =1

A − U = − 1

U − A . Since A < 0 and U ≥ 1 (because A > 1), we see U − A > 1

and therefore −1 < d1 < 0.

We now claim that d1 is of the form%1 +

√�

&1for some integers %1 and &1. Since

1

d1= A − U = % +

√�

&− U = % −&U +

√�

&,

if we let %1 = −% +&U, we get1

d1=−%1 +

√�

&. It then follows that

d1 =&

−%1 +√�=&(%1 +

√�)

� − %21

=%1 +

√�

&1

82

if we let&1 =� − %2

1

&; note that because& |%2−�, we know that& |�−%2

1 (check!) and therefore

&1 is an integer as desired.

Since d1 is (proved to be) a reduced quadratic irrational of the form%1 +

√�

&1for some in-

tegers %1, &1, we may repeat the argument before to deduce

%1 > 0, &1 > 0, %1 <√�, &1 < 2

√�, &1 |%2

1 − �.

By induction, we may indeed deduce that, for every = ≥ 1, d= is a reduced quadratic irra-

tional of the form%= +

√�

&=for some integers %=, &=, which satisfy

%= > 0, &= > 0, %= <√�, &= < 2

√�, &= |%2

= − �.

Observe that the inequalities actually suggest that there are only �nitely many possibilitiesfor %=’s and &=’s (there are only �nitely many integers in the intervals (0,

√�) and (0, 2

√�)

to choose from). This means that it is not possible to keep �nding distinct pairs (%=, &=) ofintegers %= and &=, i.e., there has to exist # such that (%# , &# ) = (%#+;, &#+;) for some ; ≥ 1.It follows that d# = d#+; and therefore that U# = U#+;, i.e, A has periodic continued fraction.

Interlude: %=+1 and &=+1 in terms of %= and &= Note that if d= =%= +

√�

&=, then

d=+1 =1

d= − U==

1

%= +√�

&=− U=

=&=

%= +√� − U=&=

=&= (%=+1 +

√�)

� − %2=+1

=%=+1 +

√�

� − %2=+1

&=

=%=+1 −

√�

&=+1

if we let %=+1 = −%= + U=&= and &=+1 =� − %2

=+1&=

.

We still haven’t shown that (reduced) A has purely periodic continued fraction. To provethis, it su�ces to show that, for any # ≥ 1,

d# = d#+;

impliesd#−1 = d#+;−1.

To this end, recall that the =-th stage of the continued fraction algorithm says:

U= = bd=c ⇒ d=+1 =1

d= − U=so

d= = U= +1

d=+1.

One of the exercises suggested earlier in fact shows similarly that

d= = U= +1

d=+1;

83

to prove this, if d= =%= +

√�

&=, then d= =

%= −√�

&=and therefore

1

d= − U==

1

%= −√�

&=− U=

=&=

−%=+1 −√�=&= (−

√� + %=+1)

� − %2=+1

=%=+1 −

√�

&=+1= d=+1

by the formulae (%=+1 = −%= + U=&= and &=+1 =� − %2

=+1&=

) above.

Let γ= = −1

d=for brevity. Since d= is reduced and therefore −1 < d= < 0, we have

γ= > 1

and d= = U= +1

d=+1may be rewritten as − 1

γ== U= − γ=+1, i.e.,

γ=+1 = U= +1

γ=.

In particular,bγ=+1c = U=.

Notice its similarity to d= = U= +1

d=+1. The point is that the index for γ ‘goes the other way’.

This is at the heart of the argument to follow and why considering conjugates is useful.

d# = d#+; ⇒ d#−1 = d#+;−1 Let us now prove the assertion that d# = d#+; implies d#−1 =d#+;−1 using γ’s. Suppose that d# = d#+;.

U#−1 = U#+;−1 It follows from the assumption that d# = d#+;, hence γ# = γ#+;. It there-fore follows that

U#−1 = bγ# c = bγ#+;c = U#+;−1.

d#−1 = d#+;−1 Since d#−1 = U#−1 +1

d#and d#+;−1 = U#+;−1 +

1

d#+;, the assertion follows

from U#−1 = U#+;−1 (above) and d# = d#+; (assumption).

To sum up, we have proved that a reduced quadratic irrational has its purely periodic con-tinued fraction of the form

[U, U1, . . . , U;−1] .

Conversely, a real number with a purely periodic continued fraction is a reduced quadraticirrational. This follows from

Lemma. If A has a purely periodic continued fraction of the form

[U;U1, . . . , U;−2, U;−1],

then the ‘reverse continued fraction’

B = [U;−1;U;−2, . . . , U1, U]

84

represents γ = −1A.

Proof of Lemma. It is possible to prove this by making appeal to ‘Euler’s rule’ (Section IV-3of Davenport’s ‘The Higher Arithmetic’) but here is a direct proof I have come up with. Thecontinued fraction algorithm for A is

U = bAc ⇒ d1 =1

A − Uw

U1 = bd1c ⇒ d2 =1

A1 − U1w...

w

U;−1 = bd;−1c ⇒ d; =1

d;−1 − U;−1= A

w

U; = bAc = U ⇒ d;+1 = d1w...

For 1 ≤ = ≤ ;, let γ= = −1

d=> 1. We then have

d= =1

d=−1 − U=−1⇔ d= =

1

d=−1 − U=−1⇔ d=−1 = U=−1 +

1

d=

⇔ γ= = U=−1 +1

γ=−1.

Using this, we may rewrite above as:

U;−1 = bγ;c ⇒ γ;−1 =1

γ; − U;−1w

U;−2 = bγ;−1c ⇒ γ;−2 =1

γ;−1 − U;−2w...

w

U = bγ1c ⇒ 1

γ1 − U= −1

A=−1d;= γ;

w

bγ;c = U;−1 ⇒ γ;−1w...

where we use d1 =1

A − U to see that1

γ1 − U= −1

A, and γ; = −

1

d;=

1

A(as we know from above

85

that d; = A). In other words, the continued fraction for γ = −1

Ais [U;−1, . . . , U1, U]. �

A with purely periodic continued fraction is reduced Since it is purely periodic and U ≥ 1

‘by de�nition’, A > 1. On the other hand, it follows from B > 1 (because U;−1 > 1) that

0 < −1B= A < −1, i.e., A is reduced.

We are �nally (!) in a position to prove that the continued fraction for√3 is of the form

√3 = [U;U1, U2, . . . , U2, U1, 2U] .

As mentioned earlier, while√3 is not reduced, A =

√3 + b

√3c is reduced. It then follows

from {purely

periodic continued fractions

}⇐⇒

{reduced

quadratic irrationals

}that the continued fraction for A =

√3 + b

√3c therefore is purely periodic. If we let U = b

√3c,

the �rst term for the continued fraction for A will be 2U, because the very �rst term in thecontinued fraction algorithm for A is

bAc = b√3 + Uc = b

√3c + U = U + U = 2U.

Taking these into account, we may therefore write

A =√3 + U = [2U, U1, . . . , U;−1]

for some U1, . . . , U;−1 in ℕ. In fact, subtracting U, we see that√3 = [U, U1, . . . , U;−1, 2U, U1, . . . , U;−1, 2U, . . . ]

[note that subtracting U breaks the symmetry in the �rst ; terms, but not all the way through].The continued fraction expression for A tells us even more. As the �rst few steps of the contin-ued fraction algorithm for A looks like:

bAc = 2U ⇒ d1 =1

A − bAc =1

A − 2Uw

bd1c = U1 ⇒ d2w...

we see that d1 =1

A − 2U may be expressed as

[U1, U2, . . . , U;−1, 2U, U1, . . . , U;−1, 2U, U1, . . . ] = [U1, . . . , U;−1, 2U, U1, . . . , U;−1]

as appeared ‘from the second term onwards’ in the continued fraction for A.

On the other hand, it follows from the lemma above that the reserve continued fraction

[U;−1, U;−2, . . . , U1, 2U]

computes −1A= − 1

U −√3=

1√3 − U

=1

A − 2U .

Equating the two continued fraction expressions for d1 =1

A − 2U , we see thatU;−1 = U1, U;−2 =U2, . . . . Plugging these back into the continued fraction expression for

√3 above, we see that

√3 = [U, U1, U2, . . . , U2, U1, 2U, U1, U2, . . . , U2, U1, 2U, . . . ] = [U, U1, U2, . . . , U2, U1, 2U] .

86


Mahdiyah Ali asked me how one can, in practice, achieve Step 1 of Hermite’s algorithm (Sec-tion 10.1)– �nd an integer I such that I2 ≡ −1 mod ?–when ? is relatively big. Here is what Ithink doable.

I’ll explain how to do this with the example ? = 2017– this is the example in Section 10.1,where the notes simply claim I = 229 but how?

Since 2017 ≡ 1mod 4, we can make appeal to Proposition 29. To this end, we need to �nd

an integer 0 such that(0

?

)= −1, i.e., a non-quadratic residue mod ? = 2017. There is no clever

way of �nding 0. Simply try 0 = 2, 3, .... and pick the smallest that works.Indeed, (

2

2017

)= 1,

(3

2017

)= 1,

(4

2017

)=

(2

2017

)2= 1,

(5

2017

)= −1, . . . ,

hence 0 = 5works. Proposition 29 then shows that I = 52017−1

4 = 5504 satis�es I2 ≡ −1mod 2017.However, this is too large a number to run Hermite’s algorithm and we need to simply it mod2017. How? (This question is what inspired me to write this note)

Step 1: Write 504 as a sum of powers of 2 (with no multiplicity, e.g. 3 · 24 is not allowed).Indeed

504 = 28 + 27 + 26 + 25 + 24 + 23.How do we �nd this in practice? Firstly, �nd the highest power of 2 less than 504. It is 28 = 256.Then calculate 504 − 28 = 248. Secondly �nd the highest power of 2 less than 248. In this case27 = 128. Calculate 248 − 27 = 120. Thirdly, �nd the highest... (repeat). We get

505 = 28 + 248 = 28 + 27 + 120 = · · ·

You might wonder whether this ‘algorithm’ would ever stop, i.e. can any integer be ex-pressed as a sum of powers of 2 without multiplicity? The answer is ‘Yes’. It is because aninteger is a 2-adic integer (the idea commonly used in computers and calculators), but it isbeyond the scope of this note! It su�ces to know that it is possible.

Step 2: Calculate 52# mod 2017 recursively and multiply the residues of 52# with # =

3, 4, 5, 6, 7, 8 together in the end.

Observe that if 52A is congruent to an integer I mod 2017, then the square (52A )2 = 52A

52A

=

52A+2A = 52·2

A

= 52A+1 is congruent to I2 mod 2017. Applying this argument repeatedly, we get

52 ≡ 25

(52)2 = 522 ≡ 252 = 625

(522)2 = 523 ≡ 6252 = 390625 ≡ 1344

(523)2 = 524 ≡ 13442 = 1806336 ≡ 1121

(524)2 = 525 ≡ 11212 = 1256641 ≡ 50

(525)2 = 526 ≡ 502 = 2500 ≡ 483

(526)2 = 527 ≡ 4832 = 233289 ≡ 1334

(527)2 = 528 ≡ 13442 = 1779556 ≡ 562.

87


5504 = 528+27+26+25+24+23 ≡ 562 · 1334 · 483 · 50 · 1121 · 1344

≡ 1401 · 483 · 50 · 1121 · 1344≡ 988 · 50 · 1121 · 1344≡ 992 · 1121 · 1344≡ 665 · 1344≡ 893760≡ 229.

What makes this approach quite attractive is that it is algorithmic and in particular that, ateach stage, we can minimise computations by ‘reducing modulo 2017’.


17.1 Pell’s equation

Given an integer 3 that is not a square, we are interested in solving the equation

G2 − 3H2 = ±1;

more precisely, �nding a pair (B, C) of positive integers satisfying B2 − 3C2 = 1 or −1.

RemarkWhy 3 should not be a square? If 3 = 0, the equation is G2 = ±1 and that is boring.If 3 is a square 3 = �

2 > 1 say, then the equation is G2 − 3H2 = G2 − �2H2 = (G + �H) (G − �H) = ±1.Therefore (G+�H, G−�H) is either (1, 1), (−1, 1), (1,−1) or (−1,−1). Each one of the possibilitieswould then allows us to determine what G, H and 3 should be.

Remark. Why B > 0 and C > 0? If (B, C) is a solution, so is any of (−B, C), (B,−C), (−B,−C).

Remark. Why 3 > 0? If 3 < 0, then−3 > 0. In this case, G2+(−3)H2 = −1hasno solutions. Onthe other hand, if 3 < −1, then −3 > 1 and the solutions to G2 + (−3)H2 = 1 are (G, H) = (1, 0) and(−1, 0); if 3 = −1, then the solutions to G2 + (−3)H2 = G2 + H2 = 1 are (1, 0), (−1, 0), (0, 1), (0,−1).

It makes sense to consider only a non-square positive integer 3 and look for positive integersolutions to G2 − 3H2 = ±1.

Theorem 47 asserts that{The (positive) integer solutions to G2 − 3H2 = ±1

}⊂ {(B=, C=)} ,

where B=, C= is de�ned by the =-th convergent A= =B=

C=to√3.

We did two examples: G2 − 3H2 = ±1 where 3 = 2 or 3– we computed the convergents

A0 =B0

C0, A1 =

B1

C1, A2 =

B2

C2, . . .

to√3 and checked which ones indeed de�nes solutions to G2 − 3H2 = ±1. The point was that

NOT all of them did! It is therefore natural for us to ask if we can single out exactly whichconvergents A= =

B=

C=satisfy B2= − 3C2= = ±1.

88

Supplementary notes 4 explain that, for a square-free positive integer 3,√3 = [U;U1, U2, . . . , U2, U1, 2U] .

Theorem 48 (I’ve written the proof in the notes; the proof is NON-EXAMINABLE) assertsthat if

√3 = [U;U1, . . . , U;] (a form a lot less precise than above, but this still does the job), then{

The (positive) integer solutions to G2 − 3H2 = ±1}

= {(B#;−1, C#;−1) | # = 1, 2, . . . }= {(B;−1, C;−1), (B2;−1, C2;−1), (B3;−1, C3;−1), . . . }.

Moreover,B2#;−1 − 3C

2#;−1 = (−1)

#; .

In other words, not only we can single out solutions to G2 − 3H2 = ±1, we can also single outexactly which one are solutions to G2 − 3H2 = +1 and which ones are to G2 − 3H2 = −1!

Wepushed forward and introduced the concept of the fundamental solution to G2−3H2 = ±1.

De�nition. If (B, C) and (B′, C ′) are solutions to G2 − 3H2 = ±1, then de�ne

(B, C) < (B′, C ′)

if B + C√3 < B′ + C ′

√3 in ℝ. It then, and only then, makes sense to de�ne the smallest solution

(with respect to < de�ned above) to G2−3H2 = ±1 to be the fundamental solution. By de�nition,the fundamental solution does exist.

Remark. Theorem48proves that, if√3 = [U;U1, . . . , U;], the solutions are (B;−1, C;−1), (B2;−1, C2;−1), . . . .

Since we know by de�nition,0 < B;−1 < B2;−1 < . . .

and0 < C;−1 < C2;−1 < . . . ,

the fundamental solution is (B;−1, C;−1)!

Furthermore, the fundamental solution generates all the solutions to G2 − 3H2 = ±1 . The-orem 51 and Theorem 52 made precise what this means.

Theorem 51 proves that if (B, C) = (B;−1, C;−1) is the fundamental solution to G2 − 3H2 = ±1and if we de�ne (E=, F=) by

E= + F=√3 = (B + C

√3)=,

then E2= − 3F2= = ±1, i.e., (E=, F=) de�nes a solution to G2 − 3H2 = ±1. Furthermore, we know

for which = de�nes E2= − 3F2= = +1 and which does E2= − 3F2

= = −1: if n = B2 − 3C2 ∈ {±1},thenE2= − 3F2

= = n=. To sum up,

{(E=, F=)} ⊂{The (positive) integer solutions to G2 − 3H2 = ±1

}This does not quite prove what the fundamental.. says. But Theorem 52 concludes it; it

proves that if (E, F) is a solution to G2 − 3H2 = ±1, then (E, F) = (E=, F=) for some =. In otherwords,

{(E=, F=)} ⊃{The (positive) integer solutions to G2 − 3H2 = ±1

}.

89

Combining these two, we have another way of completely describing the solutions to thePell equation:

{(E=, F=)} ={The (positive) integer solutions to G2 − 3H2 = ±1

}.

Example. G2 − 61H2 = ±1. As√61 = [7; 1, 4, 3, 1, 2, 2, 1, 3, 4, 1, 14],

the length ; = 11 and therefore the fundamental solution is (B, C) = (B11−1, C11−1) = (B10, C10).Unfortunately, there is no easyway to compute the �rst/smallest solution. Startingwith (B0, C0),we recursively compute convergents. It turns out that (B10, C10) = (29718, 3805) and

B210 − 61C210 = 297182 − 61 · 38052 = −1.

According to Theorem 51,E1 + F1

√61 = (B + C

√61)1

so (E1, F1) = (B, C) = (B10, C10), but

E2 + F2

√3 = (B + C

√61)2 = (29718 + 3805

√61)2 = ...

andE2 − 61F2

2 = (−1)2 = 1.

To sum up, (E1, F1) is the smallest solution (the fundamental solution) to G2 − 61H2 = ±1;(E2, F2) is the second smallest solution to G2 − 61H2 = ±1 while it is also the smallest solutionto G2 − 61H2 = 1 (as E21 − 61F2

1 = −1 and this is the only smaller solution to G2 − 3H2 = ±1).

Furthermore, we know from Theorem 48 that (E2, F2) = (B2·11−1, C2·11−1) = (B21, C21). It iseasy to compute (E2, F2) using Theorem 51, but it is far more laborious to compute (B21, C21) byhand, simply following the de�nition (as you might have notice in Assessed coursework 4)! Itis possible to extrapolate the trick and compute (B=, C=) when = is very big (computing such athing is useful in approximating numbers), especially when ; is very big.

17.2 Sum of squares

If ? is a prime number, can we express ? as a sum of two integer-squares?, i.e., can we solveG2 + H2 = ? in (G, H) ∈ ℕ × ℕ?

Proposition 53 asserts that if ? ≡ 3mod 4, then G2 + H2 = ? has no solutions in (G, H) ∈ ℕ×ℕ.

Theorem 54/Corollary 56 asserts that if ? ≡ 1 mod 4, then G2 + H2 = ? has a solution in(G, H) ∈ ℕ × ℕ.

We can axiomatise the proof of Theorem 54 to solve the equation when ? ≡ 1 mod 3(Hermite’s algorithm).

Step 1: �nd I ∈ ℤ such that I2 ≡ −1 mod ?. In Supplementary note 5, I explained how wecan actually do this (NON-EXAMINABLE).

90

Step 2: compute convergentsB=

C=toI

?and �nd = satisfying

C= <√? < C=+1.

Then (G, H) = (C=, ?B= − IC=) or (C=, IC= − ?B=) de�nes a solution.

More sums of squares.

Theorem 57 asserts that a positive integer = is the sum of two squares if the square free partof = has no prime factors congruent to 3mod 4.

We can turn the proof into an algorithm that solves G2 + H2 = = for G, H ∈ ℕ × ℕ when thesquare-free part of = has no prime factors congruent to 3mod 4. Indeed, the identity

(A2 + B2) (C2 + D2) = (AC − BD)2 + (AD + BC)2

allows us to reduce the computation to solving G2+H2 = ? for every prime factor ? in the square-free part of =. By de�nition, ? is either 2 (in which case 12 + 12 = 2) or is congruent to 1mod 4and we may make appeal to Hermite’s algorithm to solve the equation G2 + H2 = ?.

Theorem 59 asserts that every positive integer is a sum of four squares. I wrote a proof butit is NON-EXAMINABLE.

17.3 Algebraic number theory

De�nition. A complex number U is an algebraic number (resp. algebraic integer) if there ex-ists 5 (G) ∈ ℚ[G] (resp. 5 (G) ∈ ℤ[G]) such that 5 (U) = 0 (resp. such that 5 ismonic and 5 (U) = 0).

Proposition 60 asserts that, in ℚ, the algebraic numbers are the integers.

De�nition. If U is an algebraic number, the minimal polynomial is a non-zero monic poly-nomial 5 (G) in ℚ[G] of smallest possible degree such that 5 (U) = 0.

Theminimal polynomial exists and it does so uniquely: given an algebraic number U, thereexists a polynomial with rational coe�cients of which U is a root; the minimal polynomial isthe irreducible polynomial (with rational coe�cients) of smallest possible degree that dividesof the polynomial. We need to check a candidate polynomial is really minimal/irreducible!

Gauss’ lemma (Theorem61) allows us to characterise algebraic integers in terms ofminimalpolynomials: an algebraic number is an algebraic integer if and only if its minimal polynomialhas integer coe�cients.

Gauss’s lemma is o�en useful in proving a given algebraic number is NOT an algebraic in-teger (to prove a given algebraic number is an algebraic integer, it is only necessary to spot amonic polynomial with integer coe�cients and a redundant to check if it is minimal).

De�nition A quadratic number (resp. integer) is an algebraic number (resp. integer) whoseminimal polynomial is of degree 2.

Let 3 be a square-free integer. Within ℚ(√3) = {B + C

√3 | B, C ∈ ℚ}, it is possible to describe

exactly

91

• the quadratic numbers (Proposition 62)

{B + C√3 | B ∈ ℚ, C ∈ ℚ − {0}}

• the ring of algebraic integers (Proposition 63)

ℤ[√3]

if 3 ≡ 2 or 3mod 4 and

ℤ[1 +√3

2]

if 3 ≡ 1mod 4

• the quadratic integers (Proposition 64)

{B + C√3 | B ∈ ℤ, C ∈ ℤ − {0}}

and in addition {B + C√3 | B ∈

(1

2+ ℤ

), C ∈

(1

2+ ℤ

)}when 3 ≡ 1mod 4.

De�nition. Let ' be a ring. An element A in ' is a unit if there exists B in ' such that AB = 1.

We are interested in understanding the units in the ring of integers of ℚ(√3).

Proposition 66 asserts that if 3 is a square-free integer congruent to 2 or 3mod 4, an algeb-raic integer U = B + C

√3 ∈ ℤ[

√3] is a unit if and only if B2 − 3C2 = ±1, i.e., (B, C) is a solution to

Pell’s equation G2 − 3H2 = ±1.

This proposition says that one can use what we learned in the section about Pell’s equationto understand the units.

18 Supplementary notes 7 (mock exam paper)

This is a collection of questions that, I believe, represents the MTH5130 material.

Pieces of advice.

• As in Assessed Coursework, you will not get a lot of marks by simply writing down an-swers (even if they are correct). Justi�cation really matters (more than ever).

• Wherever possible, check your answers, e.g., by substituting your answer back into theequation you are meant to solve etc.

• There will be no questions in the �nal exam that ask you to prove something. You will beasked to calculate/solve equations. That does NOT mean that you should simply ignoreall the proofs though. There are theorems/propositions whose proofs give us far moreinsight into what is going on than statement themselves (e.g. Proposition 8, Theorem 9,..., Theorem 57, etc.; see below).

• Everything is examinable unless otherwise speci�ed (‘NON-EXAMINABLE’).

92

• In terms of width and depth of questions, this mock exam is very much comparable.

• Good luck.

Q1 [Q1-c (2019)] Find all integer solutions to the following simultaneous congruences:

(1)

G ≡ 1mod 7,G ≡ 7mod 30.

(2)

3G ≡ 3mod 7,G ≡ 7mod 30.

(3)

2G ≡ 2mod 14,G ≡ 7mod 30.

A1 (1) The proof of Theorem 9 (the Chinese Remainder Theorem) in the notes with 0 =

1, < = 7, 1 = 2, = = 30 explains how to do this kind of problem. Firstly we �nd A, B ∈ ℤ such that

7A + 30B = gcd(7, 30) = 1.

To �nd them, we use Euclid’a algorithm:

30 = 7 · 4 + 27 = 2 · 3 + 1

Working backwards,

1 = 7 − 2 · 3 = 7 − (30 − 7 · 4) · 3 = 13 · 7 − 3 · 30.

In other words, (A, B) = (13,−3) does the job. According to the proof of Theorem 9,

G = <A1 + =B0 = 7 · 13 · 7 + 30 · (−3) · 1 = 547 ≡ 127

modulo 3 · 70 = 210 works [Check that 127 indeed satis�es the congruences!]. So the solutionsare all integers congruent to 127 mod 210 [Don’t forget about the ‘modulo 210 bit!]. The con-gruence class [127]210 is alternative way of describing the solution.

(2) If we can get rig of 3 in front of G in 3G ≡ 3mod 7, we can use the CRT. How can we dothis? Note that ‘division’ does not work in general when it comes to congruences. For example,

10 is congruent to 4 mod 6, but10

2= 5 is NOT congruent to

4

2= 2 mod 6 [5 IS congruent to 2

mod6

2= 3, and this touches upon the subject highlighted in Proposition 8, in particular in its

proof]! The way to resolve the situation is to ‘multiply’, i.e. �nd A such that 3A ≡ 1 mod 7, sothat

3AG ≡ 3A mod 7

gives rise toG = 1 · G ≡ 3AG ≡ 3A mod 7.

93

To �nd a such A, we run Euclid’s algorithm (or otherwise) with 3 and 7 to �nd

1 = 1 · 7 − 2 · 3 mod 7

Reducing modulo 7, we get1 ≡ (−2) · 3 mod 7.

Therefore A = −2 works! Feeding this back into the discussion above, we get

G ≡ 3 · (−2) = −6 ≡ 1 mod 7.

So (2) is nothing other than (1). Note that G ≡ 1 is NOT deduced by dividing 3G ≡ 3mod 7 by 3,but by multiplying (−2); it looks like ‘dividing’, but it will in general give you a false answer asthe 10 ≡ 4mod 6 example demonstrates.

(3) The proof in (2) is extrapolates the proof of Proposition 8. Making appeal to Proposition8, as gcd(0, =) = gcd(2, 14) = 2 and this divides 1 = 2, the congruence equation 2G ≡ 2mod 14 issoluble (it is le� as an exercise to look at the proof and deduce that G ≡ 1mod 7). Alternatively,we may argue that 2G ≡ 2 mod 14 if and only if 2G − 2 is divisible by 14 if and only if G − 1 isdivisible by 7. �

Q2 [Q4-b (2018)] Find a primitive root mod 13. How many primitive roots mod 13 are therebetween 1 and 12?

A2 Let ? be a prime number. The order of an integer Imod ? is the smallest integer� suchthat I� ≡ 1mod ?.

On the other hand, (the proof of) Fermat’s Little Theorem (Theorem 7; it s generalisationin Theorem 15) shows that, for I not divisible by ?, I?−1 ≡ 1mod ?, so

� ≤ ? − 1.

Indeed, the minimality of� implies that

� | (? − 1)

(see Lemma 19). It therefore follows that, in working out the order of I, all we need to do istrial and error with divisors of ? − 1 (including ? − 1)! The table so obtained is as follows (thedivisors of 12 are 2, 3, 4, 6 or 12):

I 1 2 3 4 5 6 7 8 9 10 11 12

� 1 12 3 6 4 12 12 4 3 6 12 2

There are 4 integers between 1 and 12 that has order 12. They are primitive.

Alternatively, we can male appeal to Theorem 22: if (� is the set of integers between 1 and? − 1 of order � mod ?, then |(� | = q(�). The number of primitive roots mod ? = 13 istherefore computed by

|(?−1 | = q(? − 1) = q(12) = q(22)q(3) = 22−1(2 − 1) (3 − 1) = 4

(see Theorem 17 for various properties of Euler’s totient function q I am invoking implicitlyabove).

94

Remark. Fermat’s Little Theorem is my favourite (as I mentioned before). It is useful formany things. For example, since I? ≡ I mod ? for any integer I, we can use it to prove that agiven (positive) integer is a prime number or not– given =, if there is a (positive) integer I suchthat I= is NOT congruent to I mod =, then = can NOT be a prime number.

Q3 [Q4-b (2017)] Calculate the Legendre symbol(21

67

).

A3 By R1,(21

67

)=

(3

67

) (7

67

).(

3

67

)= −1 This simply follows from Corollary 26 as 67 ≡ 7mod 12.(

7

67

)= −1 To compute

(7

67

), we use R4

(7

67

) (67

7

)= (−1) 67−12 · 7−12 = (−1)33·3 = (−1).

Therefore(7

67

)= −

(67

7

). On the other hand, it follows from R0 (and R1) that

−(67

7

)= −

(4

7

)= −

(2

7

) (2

7

)= −1.

Q4 Find an integer I such that I2 ≡ −1mod 73.

A4 This follows Supplementary note 5 (which elaborate on Example just a�er Proposition29; this topic comes up again in Hermite’s algorithm in Section 10.1 about sums of squares!).To �nd I, we make appeal to Proposition 29.

Step 1. Find an integer 0 such that( 073

)= −1. Trial and error.

(5

73

)= −1, so 0 = 5 works.

By Proposition 29,I = 5

73−14 = 518

works. It would be amiss not to simplify modulo 73. To this end, we write 18 ‘2-adically’, i.e., asum of powers of 2 without multiplicity:

18 = 1 · 24 + 0 · 23 + 0 · 22 + 1 · 21 + 0 · 20.

We then compute, modulo 73,

52 ≡ 25

522= (52)2 = 252 = 625 ≡ 41

523= (522)2 ≡ 412 = 1681 ≡ 2

524= (523)2 ≡ 22 = 4

ThereforeI = 518 = 52

4+21 ≡ 4 · 25 = 100 ≡ 27.

So I = 27 is what we are looking for. �

95

Q5 [Q3-b (2016)] Given√75 = [8; 1, 1, 1, 16], �nd all integers solutions to G2 − 75H2 = ±1.

A5 Note that I never assume 3 to be square-free when it comes to Pell’s equation G2 − 3H2 =±1; only that 3 is NOT a square. Even if

√75 = 5

√3, we can apply the machine directly to√

75. Given the continued fraction expression, ; = 4 and therefore the fundamental solution is(B, C) = (B3, C3). Compute

A0 =B0

C0= [8; ] = 8

1, A1 =

B1

C1= [8; 1] = 9

1, A2 =

B2

C2= [8; 1, 1] = 17

2, A3 =

B3

C3= [8; 1, 1, 1] = 26

3,

directly or recursively (i.e. ‘by de�nition’). Therefore (B, C) = (26, 3) is the fundamental solution.As

262 − 75 · 32 = 1,

the solutions are(E=, F=), (−E=, F=), (E=,−F=), (−E=,−F=)

where E= + F=√75 = (26 + 3

√75)=. Note that they all satisfy E2= − 75F2

= = 1= = 1, not −1. �

Q6 [Q3-b (2018)] Given that 372 ≡ −1mod 137, use Hermite’s algorithm to solve G2 + H2 = 137in G, H ∈ ℤ.

A6 Compute the convergents to37

137by the continued fraction algorithm:

A0 =0

1, A1 =

1

3, A2 =

1

4, A3 =

3

11, A4 =

7

26, A5 =

10

37, A6 =

37

137.

SinceC3 = 11 <

√137 < 26 = C4,

Hermites’ algorithm shows that (11, 137 · 3 − 37 · 11) = (11, 4) de�nes a solution [Check itindeed does!]. �

Q7 [Q6-a (2020)] For each of the equations in G and H below, determine whether there existsa solution in positive integers.

(1) G2 + H2 = 5850.(2) G2 + H2 = 9450.

A7 (1) We use Theorem 57.5850 = 52 · 32 · 2 · 13;

hence the square-free part is 2 · 13. Neither is congruent to 3 mod 4, so 5850 is a sum of twointeger squares, i.e. G2 + H2 = 5850 is soluble! Note that

2 = 12 + 12

and13 = 22 + 32.

To spot a solution to G2 + H2 = ? for ? congruent to 1 mod 4 (the only other case we need toworry about is 2), we can use of course make appeal to Hermite’s algorithm/Corollary 56.

96

By the formula (A2 + B2) (C2 + D2) = (AC − BD)2 + (AD + BC)2, we have

2 · 13 = (12 + 12) (22 + 32) = (1 · 2 − 1 · 3)2 + (1 · 3 + 1 · 2)2 = 12 + 52.


5850 = 52 · 32 · (12 + 52) = 152 + (15 · 5)2 = 152 + 752.

(2) 9450 = 52 · 32 · 2 · 3 · 7, hence the square-free part is 2 · 3 · 7. As 3 is congruent to, well, 3mod 4, it follows from Theorem 57 that G2 + H2 = 9450 is insoluble. �

Q8 [Q6-b (2020)] Use Hensel’s Lemma to �nd all integer solutions to the equation

G2 ≡ 3 (mod 121).

A8 If we are asked to solve a congruence equation of the form G2 ≡ �modulo a prime num-ber ?, we can use Section 5.2 (Theorem 28 or Proposition 29). However, 121 = 112 and we doneed to use Hermite. Having said that, Section 5.2 is not completely irrelevant to Hermite.

Let %(G) = G2−3 and %′(G) = 2G. Firstly we solve %(G) ≡ 0mod 11. We �nd, by trial and error,that I1 = 5 works; if it is hard to spot one, we can make appeal to Theorem 28 or Proposiiton29 (hence the relevancy). We need to �nd the multiplicative inverse & ′(I1) = & ′(5) mod 11 to%′(I1) = %′(5) = 10, i.e. we need to �nd an integer & ′(5) such that & ′(5)%′(5) ≡ 1 mod 11.Going back to Proposition 8 or otherwise, as Euclid’a algorithm for 10 and 11 gives us

1 = 1 · 11 − 1 · 10

which yields1 ≡ (−1) · 10

mod 11. In other words, & ′(5) = −1 ≡ 10 works. By Hensel, modulo 112 = 121,

I2 = I1 − %(I1)& ′(I1) = 5 − %(5)& ′(5) = 5 − 22 · 10 = −215 ≡ 27

de�nes a solution to G2 ≡ 3mod 112. The other solution is −27mod 121, i.e., 94mod 121.

Q9 [Q1-d (2015)] Which of the following numbers are algebraic integers?

(1)3 +√5

2+ 1.

(2)1

2

√41 − 3

2.

A9 (1)3 +√5

2+ 1 = 5 +

√5

2. We know from Proposition 64 (note that 5 ≡ 1mod 4) that it is

an algebraic integer. Alternatively, let U =5 +√5

2. It then follows that 2U − 5 =

√5; squaring it,

we get 4U2 − 20U + 20, i.e. U2 − 5U + 5 = 0. In other words, U is a root of the monic polynomialG2 − 5G + 5 (it is the minimal polynomial of U since U is irrational).

(2) Similarly, it follow from Proposition 64 that1

2

√41 − 3

2is an algebraic integer (note that

41 ≡ 1mod 5). Letting U =1

2

√41− 3

2, eliminating

√41 as in (2) for

√5, we see that U2+3U−8 = 0.

97

The polynomial G2 + 3G − 8 is monic (it is the minimal polynomial of U since U is irrational).

Q10 [Q1-d (2019)] Determine the minimal polynomial of√7

2− 9

2.

A10 Let U =−9 +

√7

2. Eliminating

√7, we get 4U2 + 36U + 74 = 0, i.e., U2 + 9U + 74

4. The

polynomial G2 + 9G + 74

4is a monic polynomial in ℚ and it is irreducible (if it is reducible in

ℚ[G], then U would be rational, which is a contradiction).

Q11 Describe the units in the ring of integers in ℚ(√75).

A11While 75 ≡ 3mod 4, we can not use Proposition 63 to describe the ring of integers norProposition 66 to describe its units. Since

√75 = 5

√3, it follows by de�nition that ℚ(

√75) =

ℚ(√3). It now follows from Proposition 63 that its ring of integers is ℤ[

√3] and from Propos-

ition 66 that the units in ℤ[√3] are of the form B + C

√3 such that A2 − 3C2 = ±1. We know how

to solve Pell’s equation G2 − 3H2 = ±1. The continued fraction of√3 is [1; 1, 2] with ; = 2, hence

the fundamental solution is (B, C) = (B1, C1) = (2, 1). De�ning E= +F=√3 = (B + C

√3)= = (2+

√3)=,

the pairs (E=, F=) de�ne all the positive integer solutions to Pell’s equation G2−3H2 = ±1, henceunits. As the questions asks to describe all the units,

E= + F=√3,−E= + F=

√3, E= − F=

√3,−E= − F=

√3

de�ne the units. �

98

Documents

Number Theory Almost Final Dra˘