DISCRETE MATHEMATICS AND ITS APPLICATIONS · 2019-10-24

DISCRETE MATHEMATICS AND ITS APPLICATIONS
Series Editor KENNETH H. ROSEN

Advanced Number Theory

with Applications

Richard A. Mollin
University of Calgary
Alberta, Canada

DISCRETE MATHEMATICS AND ITS APPLICATIONS

Series Editor

Kenneth H. Rosen, Ph.D.

Juergen Bierbrauer, Introduction to Coding Theory

Francine Blanchet-Sadri, Algorithmic Combinatorics on Partial Words

Richard A. Brualdi and Dragos Cvetkovic, A Combinatorial Approach to Matrix Theory and Its Applications

Kun-Mao Chao and Bang Ye Wu, Spanning Trees and Optimization Problems

Charalambos A. Charalambides, Enumerative Combinatorics

Gary Chartrand and Ping Zhang, Chromatic Graph Theory

Henri Cohen, Gerhard Frey, et al., Handbook of Elliptic and Hyperelliptic Curve Cryptography

Charles J. Colbourn and Jeffrey H. Dinitz, Handbook of Combinatorial Designs, Second Edition

Martin Erickson and Anthony Vazzana, Introduction to Number Theory

Steven Furino, Ying Miao, and Jianxing Yin, Frames and Resolvable Designs: Uses, Constructions, and Existence

Randy Goldberg and Lance Riek, A Practical Handbook of Speech Coders

Jacob E. Goodman and Joseph O’Rourke, Handbook of Discrete and Computational Geometry, Second Edition

Jonathan L. Gross, Combinatorial Methods with Computer Applications

Jonathan L. Gross and Jay Yellen, Graph Theory and Its Applications, Second Edition

Jonathan L. Gross and Jay Yellen, Handbook of Graph Theory

Darrel R. Hankerson, Greg A. Harris, and Peter D. Johnson, Introduction to Information Theory and Data Compression, Second Edition

Darel W. Hardy, Fred Richman, and Carol L. Walker, Applied Algebra: Codes, Ciphers, and Discrete Algorithms, Second Edition

Daryl D. Harms, Miroslav Kraetzl, Charles J. Colbourn, and John S. Devitt, Network Reliability: Experiments with a Symbolic Algebra Environment

Silvia Heubach and Toufik Mansour, Combinatorics of Compositions and Words

Leslie Hogben, Handbook of Linear Algebra

Derek F. Holt with Bettina Eick and Eamonn A. O’Brien, Handbook of Computational Group Theory

David M. Jackson and Terry I. Visentin, An Atlas of Smaller Maps in Orientable and Nonorientable Surfaces


Richard E. Klima, Neil P. Sigmon, and Ernest L. Stitzinger, Applications of Abstract Algebra with Maple™ and MATLAB®, Second Edition

Patrick Knupp and Kambiz Salari, Verification of Computer Codes in Computational Science and Engineering

William Kocay and Donald L. Kreher, Graphs, Algorithms, and Optimization

Donald L. Kreher and Douglas R. Stinson, Combinatorial Algorithms: Generation Enumeration and Search

C. C. Lindner and C. A. Rodger, Design Theory, Second Edition

Hang T. Lau, A Java Library of Graph Algorithms and Optimization

Elliott Mendelson, Introduction to Mathematical Logic, Fifth Edition

Alfred J. Menezes, Paul C. van Oorschot, and Scott A. Vanstone, Handbook of Applied Cryptography

Richard A. Mollin, Advanced Number Theory with Applications

Richard A. Mollin, Algebraic Number Theory

Richard A. Mollin, Codes: The Guide to Secrecy from Ancient to Modern Times

Richard A. Mollin, Fundamental Number Theory with Applications, Second Edition

Richard A. Mollin, An Introduction to Cryptography, Second Edition

Richard A. Mollin, Quadratics

Richard A. Mollin, RSA and Public-Key Cryptography

Carlos J. Moreno and Samuel S. Wagstaff, Jr., Sums of Squares of Integers

Dingyi Pei, Authentication Codes and Combinatorial Designs

Kenneth H. Rosen, Handbook of Discrete and Combinatorial Mathematics

Douglas R. Shier and K.T. Wallenius, Applied Mathematical Modeling: A Multidisciplinary Approach

Jörn Steuding, Diophantine Analysis

Douglas R. Stinson, Cryptography: Theory and Practice, Third Edition

Roberto Togneri and Christopher J. deSilva, Fundamentals of Information Theory and Coding Design

W. D. Wallis, Introduction to Combinatorial Designs, Second Edition

Lawrence C. Washington, Elliptic Curves: Number Theory and Cryptography, Second Edition

Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2010 by Taylor and Francis Group, LLC
Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1

International Standard Book Number: 978-1-4200-8328-6 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging‑in‑Publication Data

Mollin, Richard A., 1947-
Advanced number theory with applications / Richard A. Mollin.

p. cm. -- (Discrete mathematics and its applications)
Includes bibliographical references and index.
ISBN 978-1-4200-8328-6 (hardcover : alk. paper)
1. Number theory. I. Title.

QA241.M597 2009
512.7--dc22 2009026636

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com

and the CRC Press Web site at http://www.crcpress.com


For Kate Mollin


About the Cover

The surface on the cover was created using the equation for the lemniscate of Bernoulli in three dimensions, namely

$$f(x, y) = (x^2 + y^2)^2 - 2a^2(x^2 - y^2).$$

In two dimensions, the equation $(x^2 + y^2)^2 = 2a^2(x^2 - y^2)$ leads to the usual $\infty$ sign; see Biography 5.4 on page 207. The polar form is $r^2 = 2a^2\cos(2\theta)$.

Contents

Preface

About the Author

1 Algebraic Number Theory and Quadratic Fields
1.1 Algebraic Number Fields
1.2 The Gaussian Field
1.3 Euclidean Quadratic Fields
1.4 Applications of Unique Factorization

2 Ideals
2.1 The Arithmetic of Ideals in Quadratic Fields
2.2 Dedekind Domains
2.3 Application to Factoring

3 Binary Quadratic Forms
3.1 Basics
3.2 Composition and the Form Class Group
3.3 Applications via Ambiguity
3.4 Genus
3.5 Representation
3.6 Equivalence Modulo p

4 Diophantine Approximation
4.1 Algebraic and Transcendental Numbers
4.2 Transcendence
4.3 Minkowski’s Convex Body Theorem

5 Arithmetic Functions
5.1 The Euler–Maclaurin Summation Formula
5.2 Average Orders
5.3 The Riemann $\zeta$-function

6 Introduction to p-Adic Analysis
6.1 Solving Modulo $p^n$
6.2 Introduction to Valuations
6.3 Non-Archimedean vs. Archimedean Valuations
6.4 Representation of p-Adic Numbers

7 Dirichlet: Characters, Density, and Primes in Progression
7.1 Dirichlet Characters
7.2 Dirichlet’s L-Function and Theorem
7.3 Dirichlet Density

8 Applications to Diophantine Equations
8.1 Lucas–Lehmer Theory
8.2 Generalized Ramanujan–Nagell Equations
8.3 Bachet’s Equation
8.4 The Fermat Equation
8.5 Catalan and the ABC Conjecture

9 Elliptic Curves
9.1 The Basics
9.2 Mazur, Siegel, and Reduction
9.3 Applications: Factoring & Primality Testing
9.4 Elliptic Curve Cryptography (ECC)

10 Modular Forms
10.1 The Modular Group
10.2 Modular Forms and Functions
10.3 Applications to Elliptic Curves
10.4 Shimura–Taniyama–Weil & FLT

Appendix: Sieve Methods

Bibliography

Solutions to Odd-Numbered Exercises

Index: List of Symbols

Index: Subject

Preface

This book is designed as a second course in number theory at the senior undergraduate/junior graduate level to follow a course in elementary methods, such as that given in [68], the contents of which the reader is assumed to have knowledge. The material covered in the ten chapters of this book constitutes a course outline for one semester.

Chapter 1 begins with algebraic techniques including specialization to quadratic fields with applications to solutions of the Ramanujan–Nagell equations, factorization of Gaussian integers, Euclidean quadratic fields, and Gauss’ proof of Fermat’s Last Theorem (FLT) for p = 3. Applications of unique factorization are given in terms of both Euler’s and Fermat’s solution to Bachet’s equation, concluding with a look at norm-Euclidean quadratic fields.

In Chapter 2 ideal theory is covered beginning with quadratic fields, and decomposition into prime ideals therein. Dedekind domains make up the second section, leading into Noetherian domains, and the unique factorization theorem for Dedekind domains. Principal Ideal Domains and Unique Factorization Domains are compared and contrasted. The section ends with the Chinese Remainder Theorem for ideals. The chapter concludes with an application to factoring using Pollard’s cubic integer method, which serves as a preamble for the introduction of the number field sieve presented in the Appendix. Pollard’s method is illustrated via factoring of the seventh Fermat number.

Chapter 3 is devoted to binary quadratic forms, starting with the basics on equivalence, discriminants, reduction, and class number. In the next section, composition is covered and linked to ideal theory. The form and ideal class groups are compared and contrasted, including an explicit formula for the relationship between the form class number and both the narrow and wide ideal class numbers. A proof of the finiteness of the ideal class number is achieved via the form class number, rather than the usual method of using Minkowski’s Convex Body Theorem, which we cover in §4.3. Section 3.3 investigates the notion of ambiguous forms and ideals and the relationship between their classes. We show how this applies to representations of integers as a sum of two squares and to Markov triples. In Section 3.4, genus is introduced and the assigned values of generic characters are developed via Jacobi symbols. This is then applied to the principal genus, via a coset interpretation, using Dirichlet’s Theorem on Primes in Arithmetic Progression, the proof of which is given in Chapter 7. This is a valuable vehicle for demonstrating the fact that two forms are in the same genus exactly when their cosets are equal. We tie the above together with the fact that the genus group is essentially the group of ambiguous forms. Section 3.5 uses the above to investigate representation problems. We begin with the algebraic interpretation of prime power representation as binary quadratic forms using the ideal class number. Numerous applications to representations of primes in the form $p = a^2 + Db^2$ are provided. The chapter ends with representations modulo a prime.

Chapter 4 develops Diophantine approximation techniques, starting with Roth’s celebrated result. We prove Liouville’s Theorem, leading into an analysis of enumerable sets, including a proof that the set of all algebraic numbers is enumerable, followed by the countability of the rational numbers and the uncountability of the reals. Indeed, it follows from this that almost all reals are transcendental. The first section is completed with a proof of the fact that the n-th root of a rational integer is an algebraic integer of degree n, when that integer is not a certain power. Transcendence is covered in the second section with proofs that Liouville numbers, $e$, and $\pi$ are all transcendental. Next the Lindemann–Weierstrass Theorem is established, allowing the statement of the more general Schanuel conjecture. The discussion is rounded out by a look at some renowned constants including those of Gel’fond, Gel’fond–Schneider, Prouhet–Thue–Morse, Euler, Apéry, and Catalan. Section 4.3 introduces the geometry of numbers and its techniques with a goal of proving Minkowski’s Convex Body Theorem that ends the chapter.

In Chapter 5, we extend the knowledge of arithmetic functions gained in a first course, by proving the Euler–Maclaurin summation formula, for which we introduce Bernoulli numbers, Bernoulli polynomials, and Fourier series. With this we are able to apply the formula to obtain Wallis’ formula, Stirling’s constant, Stirling’s formula, and perhaps the slickest of applications, namely the accurate approximation of the Euler–Mascheroni constant. Average orders are the topic of the second section starting with a proof of Hermite’s formula. This puts us into a position where we can derive the average order of the number of divisors function, the sum of divisors function, and Euler’s totient $\phi(m)$. The third section concentrates upon the Riemann $\zeta$-function. We apply the Euler–Maclaurin summation formula to obtain a formula for $\zeta(s)$. Then we discuss the Prime Number Theorem (PNT), Mertens’ Theorem, and various arithmetic function equivalences to the PNT. Then the Riemann hypothesis (RH) and its equivalent formulations are considered, after which we develop techniques to provide a rather straightforward proof of the functional equation for $\zeta(s)$ as a closing feature of the chapter.

In Chapter 6, we introduce p-adic analysis, commencing with solving modulo $p^n$ for successively higher powers of a prime p. Hensel’s Lemma is the featured result of the first section. The second section introduces valuations, including the p-adic versions. Then Cauchy sequences come into play giving rise to p-adic fields and domains. We have tools to prove that equivalent powers are valuations, which ends the section. We compare Archimedean and non-Archimedean valuations in the third section, featuring a proof of Ostrowski’s Theorem. In the last section, we apply what we have learned to representation of p-adic numbers. This involves the proof that every rational number has a representation as a periodic power series in a given prime p to close the chapter.

Chapter 7 delves into Dirichlet, his characters, L-functions, and their zeros related to the RH. We see the implications of his theorem for primes in arithmetic progression, proved in the second section. In the third section we introduce Dirichlet density and applications such as Beatty’s theorem. The chapter ends with Dirichlet density on primes in arithmetic progression modulo m which have density $1/\phi(m)$.


Chapter 8 comprises applications of the first seven chapters to Diophantine equations. We begin with an overview of Lucas–Lehmer theory, proving results promised earlier in the text such as solutions of the generalized Ramanujan–Nagell equations in the second section and Bachet’s equation in the third section. The Fermat equation is the topic of the fourth section with Kummer’s proof of FLT for regular primes. The chapter is rounded out with the ABC conjecture and Catalan’s conjecture. We discuss the recent proof of the latter and its generalization, the still open Fermat–Catalan conjecture. More than a half-dozen consequences of the ABC conjecture are displayed and discussed, including the Thue–Siegel–Roth Theorem, Hall’s conjecture, the Erdős–Mollin–Walsh conjecture, and the Granville–Langevin conjecture. We demonstrate how these follow from ABC.

Chapter 9 studies elliptic curves, launched by an introduction of the basics, illustrated and presented as a foundation. The second section defines torsion points, the Nagell–Lutz Theorem, Mazur’s Theorem, Siegel’s Theorem, and the notion of reduction. This sets the stage for Lenstra’s elliptic curve factoring method and his primality testing method. We also look at the Goldwasser–Kilian primality proving algorithm. The chapter closes with a description of the Menezes–Vanstone Elliptic Curve Cryptosystem as an application.

The last chapter is on modular forms. The modular group and modular forms are introduced as vehicles for much deeper considerations later in the chapter. Spaces and levels of modular forms are used as applications to elliptic curves including j-invariants and the Weierstrass $\wp$-function. The main text ends with Section 10.4 that looks, in detail, at the Shimura–Taniyama–Weil conjecture both in terms of L-functions and modular parametrizations. Modular elliptic curves are introduced as the steppingstone to the proof of FLT. Chapter 10 ends with Ribet’s Theorem and a one-paragraph proof of FLT emanating from it, called the Frey–Serre–Ribet approach, a fitting conclusion and demonstration of the power of the theory.

An overview, without proofs, of sieve theory is relegated to the Appendix. We begin with a description of the goals of sieve theory and the effects its study has had on such open problems as the twin prime conjecture, the Goldbach conjecture, and Artin’s conjecture, among others. We provide a description of the Eratosthenes sieve from the perspective of the Möbius function in order to lay the foundation for modern-day sieves. We begin with Brun’s Theorem and his constant, including a discussion of how computation of Brun’s constant led to the discovery of a flaw in the Pentium computer chip. Then we set the groundwork for presentation of Selberg’s sieve by painting the picture of the basic sieve problem in terms of upper and lower limits on certain related functions. Selberg’s sieve has many applications including the Brun–Titchmarsh Theorem, bounds for the twin prime conjecture, and the Goldbach conjecture. Then Linnik’s large sieve is developed as a generalization of Brun’s results and illustrated via applications to Artin’s conjecture. Next is the Bombieri–Vinogradov Theorem and its applications to the Titchmarsh divisor problem. Then the classic result, Bombieri’s asymptotic sieve, is presented via a hypothesis involving the generalized Mangoldt function. The most striking of the applications of the asymptotic sieve is the Friedlander–Iwaniec Theorem that there are infinitely many primes of the form $a^2 + b^4$. The aforementioned hypothesis involves the Elliott–Halberstam conjecture (EHC), so we are naturally led to the recent results by Goldston, Pintz, and Yildirim on gaps between primes. In particular, their result based upon the validity of the EHC is the satisfying conclusion that $\liminf_{n\to\infty}(p_{n+1} - p_n) \leq 16$, where $p_n$ is the n-th prime. With these results as an illustration of the power of sieve theory, we turn our attention to the use of sieves in factoring by bringing out the big gun, the number field sieve, and illustrate in detail its use in factoring of the ninth Fermat number.

The Bibliography has been set up in such a way that maximum information is imparted. This includes a page reference for each and every citing of a given item, so that no guesswork is involved as to where this reference is used. The index has more than 1,500 entries presented for maximum cross-referencing. Similarly, any reference, in text, to a theorem, definition, etc. is coupled with the page number on which it sits. These conventions ensure that the reader will find data with ease. There are nearly 50 mini-biographies of the mathematicians who helped to develop the results presented, in order to give a human face to the number theory and its applications. There are nearly 340 exercises with solutions of the odd-numbered exercises included at the end of the text, and a solutions manual for the even-numbered exercises available to instructors who adopt the text for a course. The website below is designed for the reader to access any updates and the e-mail address below is available for any comments.

Acknowledgments. First of all, I am deeply grateful to the Killam Foundation for providing the award allowing the completion of this project in a timely fashion. Also, I am grateful for the proofreading done by the following people. Thanks go to John Burke (U.S.A.) who took the time to effectively comment. Moreover, Keith Matthews (Australia) made valuable comments that helped polish the book. Also, thanks to John Robertson (U.S.A.) with whom I had lengthy electronic conversations over development of several sections of the book, especially Chapter 3 on binary quadratic forms. These interchanges had beneficial effects both for the book and our respective research programs. His insightful comments were most welcome. With Anitha Srinivasan (India), I similarly had lengthy electronic exchanges that led to creative, and even perspective-changing results. Her input was extremely valuable. My former student, Thomas Zaplachinski (Canada) who is now a working cryptographer in the field, gave the non-academic approach that was needed to round out the input received, and was deeply appreciated. Overall, this was an inspiring project, and one that is intended to be a service to students studying the most dynamic area of mathematics: number theory.

July 15, 2009

website: http://www.math.ucalgary.ca/~ramollin/

e-mail: [email protected]


About the Author

Richard Anthony Mollin is a professor in the Mathematics Department at the University of Calgary. Over the past twenty-three years, he has been awarded 6 Killam Resident Fellowships, a record number of these awards; see http://www.killamtrusts.ca/. His 2009 Killam award provided the opportunity to complete this book, Advanced Number Theory with Applications. He has written over 190 publications including 11 books in algebra, number theory, and computational mathematics. He is a past member of the Canadian and American Mathematical Societies, the Mathematical Association of America and is a member of various editorial boards. He has been invited to lecture at numerous universities, conferences and scientific society meetings and has held several research grants from universities and governmental agencies. He is the founder of the Canadian Number Theory Association and hosted its first conference and a NATO Advanced Study Institute in Banff in 1988; see [60]–[61].

On a personal note, in the 1970s he owned a professional photography business, Touch Me with Your Eyes, and photographed many stars such as Paul Anka, David Bowie, Cher, Bob Dylan, Peter O’Toole, the Rolling Stones, and Donald Sutherland. His photographs were published in The Toronto Globe and Mail newspaper as well as New Music Magazine and elsewhere. Samples of his work can be viewed online at http://math.ucalgary.ca/~ramollin/pixstars.html.

His passion for mathematics is portrayed in his writings, enjoyed by mathematicians and the general public. He has interests in the arts, classical literature, computers, movies, and politics. He is a patron and a benefactor of The Alberta Ballet Company, Alberta Theatre Projects, The Calgary Opera, The Calgary Philharmonic Orchestra, and Decidedly Jazz Danceworks. His love for life comprises cooking, entertaining, fitness, health, photography, and travel, with no plans to slow down or retire in the foreseeable future.


Chapter 1

Algebraic Number Theory and Quadratic Fields

I used to love mathematics for its own sake, and I still do, because it allows for no hypocrisy and no vagueness, my two bêtes noires.

Henri Beyle Stendhal (1783–1842), French novelist

In this introductory chapter, we introduce algebraic number theory with a concentration on quadratic fields. We begin with a general look at number fields. The reader should be familiar with the concepts in a course in number theory contained in [68] to which we will refer when needed.

1.1 Algebraic Number Fields

Algebraic number theory generalizes the notion of the ordinary or rational integers

$$\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}.$$

To see how this is done, we consider the elements of $\mathbb{Z}$ as roots of linear monic polynomials, namely if $a \in \mathbb{Z}$, then $a$ is a root of $f(x) = x - a$. Then we generalize as follows.

Definition 1.1 Algebraic Integers

If $\alpha \in \mathbb{C}$ is a root of a monic, integral polynomial of degree $d$, namely a root of a polynomial of the form

$$f(x) = \sum_{j=0}^{d} a_j x^j = a_0 + a_1 x + \cdots + a_{d-1}x^{d-1} + x^d \in \mathbb{Z}[x],$$

which is irreducible over $\mathbb{Q}$, then $\alpha$ is called an algebraic integer of degree $d$.

Example 1.1 $a + b\sqrt{-1} = a + bi$, where $a, b \in \mathbb{Z}$, with $b \neq 0$, is an algebraic integer of degree 2 since it is a root of $x^2 - 2ax + a^2 + b^2$, but not a root of a linear, integral, monic polynomial since $b \neq 0$.
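The claim in Example 1.1 can be checked numerically. The following is a minimal sketch, not part of the text: the function names `gaussian_polynomial` and `evaluate` are our own, and the check relies on Python's built-in complex arithmetic, which is exact for small integer inputs such as these.

```python
def gaussian_polynomial(a: int, b: int) -> list[int]:
    """Coefficients (highest degree first) of x^2 - 2ax + (a^2 + b^2)."""
    return [1, -2 * a, a * a + b * b]


def evaluate(coefficients, z):
    """Evaluate a polynomial at z by Horner's rule."""
    value = 0
    for coefficient in coefficients:
        value = value * z + coefficient
    return value


# 3 + 4i is a root of x^2 - 6x + 25, and so is its conjugate 3 - 4i.
polynomial = gaussian_polynomial(3, 4)
print(polynomial)                            # [1, -6, 25]
print(evaluate(polynomial, complex(3, 4)))   # 0j
print(evaluate(polynomial, complex(3, -4)))  # 0j
```

The conjugate appears as the second root because the polynomial has real coefficients.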

The following notion allows us to look at some distinguished types of algebraic integers.

Definition 1.2 Primitive Roots of Unity

For $n \in \mathbb{N} = \{1, 2, 3, \ldots\}$ (the natural numbers), $\zeta_n$ denotes a primitive $n$th root of unity, which is a root of $x^n - 1$, but not a root of $x^d - 1$ for any natural number $d < n$.

Example 1.2 $\zeta_3 = (-1 + \sqrt{-3})/2$ is a primitive cube root of unity since it is a root of $x^3 - 1$, but clearly not a root of $x^2 - 1$ or $x - 1$.
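Example 1.2 can be verified with exact arithmetic. The sketch below is illustrative only: it stores $x + y\sqrt{D}$ as a pair of `Fraction` values, and the helper names `multiply` and `power` are assumptions of ours rather than notation from the text.

```python
from fractions import Fraction


def multiply(u, v, D):
    """Multiply (x1 + y1*sqrt(D)) by (x2 + y2*sqrt(D)), each stored as a pair (x, y)."""
    x1, y1 = u
    x2, y2 = v
    return (x1 * x2 + D * y1 * y2, x1 * y2 + y1 * x2)


def power(u, n, D):
    """Compute u^n by repeated multiplication, for n >= 0."""
    result = (Fraction(1), Fraction(0))
    for _ in range(n):
        result = multiply(result, u, D)
    return result


ONE = (Fraction(1), Fraction(0))
zeta3 = (Fraction(-1, 2), Fraction(1, 2))  # (-1 + sqrt(-3)) / 2

# zeta3 is a root of x^3 - 1 but of neither x - 1 nor x^2 - 1.
print(power(zeta3, 1, -3) == ONE)  # False
print(power(zeta3, 2, -3) == ONE)  # False
print(power(zeta3, 3, -3) == ONE)  # True
```

Working with fractions rather than floating-point complex numbers avoids any rounding question in the equality tests.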

A special kind of algebraic integer is given in the following.

Example 1.3 Numbers of the form

$$z_0 + z_1\zeta_n + z_2\zeta_n^2 + \cdots + z_{n-1}\zeta_n^{n-1}, \quad \text{for } z_j \in \mathbb{Z},$$

are called cyclotomic integers of order n.
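To illustrate how such numbers multiply, here is a small sketch under our own conventions: a cyclotomic integer of order $n$ is stored as the coefficient list $[z_0, z_1, \ldots, z_{n-1}]$, and exponents are reduced using $\zeta_n^n = 1$. The name `cyclotomic_multiply` is hypothetical. Note that this representation is not unique, since $1 + \zeta_n + \cdots + \zeta_n^{n-1} = 0$ for $n > 1$, so two different lists can denote the same number.

```python
def cyclotomic_multiply(u, v):
    """Multiply two cyclotomic integers of the same order n.

    Each argument is a list [z_0, ..., z_{n-1}] standing for
    z_0 + z_1*zeta + ... + z_{n-1}*zeta^(n-1), with zeta^n = 1.
    """
    n = len(u)
    if len(v) != n:
        raise ValueError("both arguments must have the same order")
    product = [0] * n
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            # zeta^i * zeta^j = zeta^((i + j) mod n)
            product[(i + j) % n] += ui * vj
    return product


# Order 3: zeta * zeta^2 = zeta^3 = 1.
print(cyclotomic_multiply([0, 1, 0], [0, 0, 1]))  # [1, 0, 0]
# (1 + zeta)(1 + zeta^2) = 2 + zeta + zeta^2, which equals 1 since 1 + zeta + zeta^2 = 0.
print(cyclotomic_multiply([1, 1, 0], [1, 0, 1]))  # [2, 1, 1]
```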

Definition 1.2, in turn, is a special case of the following.

Definition 1.3 Units

An element $\alpha$ in a commutative ring $R$ with identity $1_R$ is called a unit in $R$ when there is a $\beta \in R$ such that $\alpha\beta = 1_R$. The multiplicative group of units in $R$ is denoted by $U_R$.

Example 1.4 In $\mathbb{Z}[\sqrt{2}] = R$, $1 + \sqrt{2}$ is a unit since

$$(1 + \sqrt{2})(-1 + \sqrt{2}) = 1_R = 1.$$
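The computation in Example 1.4 can be reproduced with integer pairs. This is a sketch with hypothetical helper names: `multiply` stores $x + y\sqrt{2}$ as `(x, y)`, and `norm` computes $x^2 - 2y^2$. The norm is not introduced in this passage; we use the standard fact that an element of $\mathbb{Z}[\sqrt{2}]$ is a unit exactly when its norm is $\pm 1$ only as a cross-check.

```python
def multiply(u, v):
    """Multiply x1 + y1*sqrt(2) by x2 + y2*sqrt(2), each stored as an integer pair (x, y)."""
    x1, y1 = u
    x2, y2 = v
    return (x1 * x2 + 2 * y1 * y2, x1 * y2 + y1 * x2)


def norm(u):
    """Norm x^2 - 2y^2 of x + y*sqrt(2) (an assumption: not defined in this passage)."""
    x, y = u
    return x * x - 2 * y * y


unit = (1, 1)      # 1 + sqrt(2)
inverse = (-1, 1)  # -1 + sqrt(2)

print(multiply(unit, inverse))     # (1, 0), that is, 1
print(norm(unit))                  # -1
print(norm(multiply(unit, unit)))  # 1, so (1 + sqrt(2))^2 = 3 + 2*sqrt(2) is also a unit
print(norm((2, 0)))                # 4, so 2 is not a unit
```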

Definition 1.4 Algebraic Numbers and Number Fields

An algebraic number, $\alpha$, of degree $d \in \mathbb{N}$ is a root of a monic polynomial in $\mathbb{Q}[x]$ of degree $d$ and not the root of any polynomial in $\mathbb{Q}[x]$ of degree less than $d$. In other words, an algebraic number is the root of an irreducible polynomial of degree $d$ over $\mathbb{Q}$. An algebraic number field, or simply number field, is of the form $F = \mathbb{Q}(\alpha_1, \alpha_2, \ldots, \alpha_n) \subseteq \mathbb{C}$ for $n \in \mathbb{N}$ where $\alpha_j$ for $j = 1, 2, \ldots, n$ are algebraic numbers. Denote the subfield of $\mathbb{C}$ consisting of all algebraic numbers by $\overline{\mathbb{Q}}$, and the set of all algebraic integers in $\overline{\mathbb{Q}}$ by $\mathbb{A}$. An algebraic number of degree $d \in \mathbb{N}$ over a number field $F$ is the root of an irreducible polynomial of degree $d$ over $F$.

Remark 1.1 If $F$ is a simple extension, namely of the form $\mathbb{Q}(\alpha)$, for an algebraic number $\alpha$, then we may consider this as a vector space over $\mathbb{Q}$, in which case we may say that $\mathbb{Q}(\alpha)$ has dimension $d$ over $\mathbb{Q}$ having basis $\{1, \alpha, \ldots, \alpha^{d-1}\}$. (See [68, §2.1] and [68, Appendix A], where the background on these algebraic structures is presented. Also, see Exercise 1.4 on page 16 to see that all number fields are indeed simple.)

By Definition 1.4, $\mathbb{Q}$ is the smallest algebraic number field since it is of dimension 1 over itself, and the simple field extension $\mathbb{Q}(\alpha)$ is the smallest subfield of $\mathbb{C}$ containing both $\mathbb{Q}$ and $\alpha$.

We now demonstrate that $\mathbb{A}$, as one would expect, has the proper structure in $\overline{\mathbb{Q}}$, which will lead us to a canonical subring of algebraic number fields.

Theorem 1.1 The Ring of All Algebraic Integers

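$\mathbb{A}$ is a subring of $\overline{\mathbb{Q}}$.

Proof. It suffices to prove that if $\alpha, \beta \in \mathbb{A}$, then both $\alpha + \beta \in \mathbb{A}$ and $\alpha\beta \in \mathbb{A}$. To this end we need the following.

Claim 1.1 If $\alpha \in \mathbb{A}$, then $\mathbb{Z}[\alpha] = \{f(\alpha) : f(x) \in \mathbb{Z}[x]\}$ is a finitely generated $\mathbb{Z}$-module.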

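Since $\alpha \in \mathbb{A}$, then there exist $a_j \in \mathbb{Z}$ for $j = 0, 1, \ldots, d-1$ for some $d \geq 1$ such that

$$\alpha^d - a_{d-1}\alpha^{d-1} - \cdots - a_1\alpha - a_0 = 0.$$

Therefore,

$$\alpha^d = a_{d-1}\alpha^{d-1} + a_{d-2}\alpha^{d-2} + \cdots + a_1\alpha + a_0 \in \mathbb{Z}\alpha^{d-1} + \cdots + \mathbb{Z}\alpha + \mathbb{Z},$$

and

$$\alpha^{d+1} = a_{d-1}\alpha^d + a_{d-2}\alpha^{d-1} + \cdots + a_1\alpha^2 + a_0\alpha \in \mathbb{Z}\alpha^d + \mathbb{Z}\alpha^{d-1} + \cdots + \mathbb{Z}\alpha^2 + \mathbb{Z}\alpha \subseteq \mathbb{Z}\alpha^{d-1} + \mathbb{Z}\alpha^{d-2} + \cdots + \mathbb{Z}\alpha + \mathbb{Z}.$$

Continuing in this fashion we conclude, inductively, that

$$\alpha^c \in \mathbb{Z}\alpha^{d-1} + \mathbb{Z}\alpha^{d-2} + \cdots + \mathbb{Z}\alpha + \mathbb{Z},$$

for any $c \geq d$. However, clearly,

$$\alpha^c \in \mathbb{Z}\alpha^{d-1} + \mathbb{Z}\alpha^{d-2} + \cdots + \mathbb{Z}\alpha + \mathbb{Z},$$

for $c = 0, 1, \ldots, d-1$, so

$$\alpha^c \in \mathbb{Z}\alpha^{d-1} + \mathbb{Z}\alpha^{d-2} + \cdots + \mathbb{Z}\alpha + \mathbb{Z},$$

for any $c \geq 0$. Hence, $\mathbb{Z}[\alpha]$ is a finitely generated $\mathbb{Z}$-module. This completes Claim 1.1.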
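The induction in Claim 1.1 is effectively an algorithm for writing any power of $\alpha$ as an integer combination of $1, \alpha, \ldots, \alpha^{d-1}$. The sketch below makes this concrete under our own conventions: `relation` holds $[a_0, a_1, \ldots, a_{d-1}]$ with $\alpha^d = a_{d-1}\alpha^{d-1} + \cdots + a_1\alpha + a_0$, matching the sign convention of the proof, and the function name `power_coordinates` is hypothetical.

```python
def power_coordinates(relation, c):
    """Return [k_0, ..., k_{d-1}] with alpha^c = k_0 + k_1*alpha + ... + k_{d-1}*alpha^(d-1).

    `relation` is [a_0, ..., a_{d-1}], encoding
    alpha^d = a_{d-1}*alpha^(d-1) + ... + a_1*alpha + a_0.
    """
    d = len(relation)
    coordinates = [1] + [0] * (d - 1)  # alpha^0 = 1
    for _ in range(c):
        # Multiplying by alpha shifts every coordinate up by one degree ...
        overflow = coordinates[-1]
        shifted = [0] + coordinates[:-1]
        # ... and the alpha^d term that appears is rewritten using the relation.
        coordinates = [s + overflow * a for s, a in zip(shifted, relation)]
    return coordinates


# alpha = sqrt(2): alpha^2 = 2, so alpha^5 = 4*sqrt(2).
print(power_coordinates([2, 0], 5))  # [0, 4]
# alpha^2 = alpha + 1 (the golden ratio): alpha^5 = 5*alpha + 3.
print(power_coordinates([1, 1], 5))  # [3, 5]
```

Every intermediate coordinate stays an integer, which is the content of the claim: the $d$ elements $1, \alpha, \ldots, \alpha^{d-1}$ generate $\mathbb{Z}[\alpha]$ as a $\mathbb{Z}$-module.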

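By Claim 1.1, both $\mathbb{Z}[\alpha]$ and $\mathbb{Z}[\beta]$ are finitely generated. Suppose that $a_1, a_2, \ldots, a_k$ are generators of $\mathbb{Z}[\alpha]$ and $b_1, b_2, \ldots, b_\ell$ are generators of $\mathbb{Z}[\beta]$. Then $\mathbb{Z}[\alpha, \beta]$ is the additive group generated by the $a_i b_j$ for $1 \leq i \leq k$ and $1 \leq j \leq \ell$. Thus, $\mathbb{Z}[\alpha, \beta]$ is finitely generated. Since $\alpha + \beta, \alpha\beta \in \mathbb{Z}[\alpha, \beta] \subseteq \mathbb{A}$, then we have secured the theorem. □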
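As a concrete instance of this argument, take $\alpha = \sqrt{2}$ and $\beta = \sqrt{3}$ (our choice of example, not one from the text). The generators $\{1, \sqrt{2}\}$ and $\{1, \sqrt{3}\}$ have products $\{1, \sqrt{2}, \sqrt{3}, \sqrt{6}\}$, and the sketch below uses them to confirm that $\alpha + \beta$ is a root of the monic integral polynomial $x^4 - 10x^2 + 1$. The helper name `multiply` and the 4-tuple representation are assumptions made for illustration.

```python
def multiply(u, v):
    """Multiply two elements of Z[sqrt(2), sqrt(3)].

    A tuple (p, q, r, s) stands for p + q*sqrt(2) + r*sqrt(3) + s*sqrt(6).
    """
    p1, q1, r1, s1 = u
    p2, q2, r2, s2 = v
    return (
        p1 * p2 + 2 * q1 * q2 + 3 * r1 * r2 + 6 * s1 * s2,  # rational part
        p1 * q2 + q1 * p2 + 3 * (r1 * s2 + s1 * r2),        # sqrt(3)*sqrt(6) = 3*sqrt(2)
        p1 * r2 + r1 * p2 + 2 * (q1 * s2 + s1 * q2),        # sqrt(2)*sqrt(6) = 2*sqrt(3)
        p1 * s2 + s1 * p2 + q1 * r2 + r1 * q2,              # sqrt(2)*sqrt(3) = sqrt(6)
    )


gamma = (0, 1, 1, 0)  # sqrt(2) + sqrt(3)
gamma2 = multiply(gamma, gamma)
gamma4 = multiply(gamma2, gamma2)

# Evaluate gamma^4 - 10*gamma^2 + 1 coordinate by coordinate.
one = (1, 0, 0, 0)
value = tuple(g4 - 10 * g2 + c for g4, g2, c in zip(gamma4, gamma2, one))

print(gamma2)  # (5, 0, 0, 2), that is, 5 + 2*sqrt(6)
print(gamma4)  # (49, 0, 0, 20)
print(value)   # (0, 0, 0, 0)
```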

Given an algebraic number field $F$, $F \cap \mathbb{A}$ is a ring in $F$, by Exercise 1.2 on page 16. This leads to the following.

Definition 1.5 Rings of Integers

If $F$ is an algebraic number field, then $F \cap \mathbb{A}$ is called the ring of (algebraic) integers of $F$, denoted by $\mathcal{O}_F$.

With Definition 1.5 in hand, we may now establish a simple consequence of Theorem 1.1.

Corollary 1.1 The ring of integers of $\mathbb{Q}$ is $\mathbb{Z}$, namely $\mathcal{O}_{\mathbb{Q}} = \mathbb{Q} \cap \mathbb{A} = \mathbb{Z}$.

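Proof. If $\alpha \in \mathbb{A} \cap \mathbb{Q}$, then $\alpha = a/b$ where $a, b \in \mathbb{Z}$ and $\gcd(a, b) = 1$, with $b \neq 0$. Since $\alpha \in \mathbb{A}$, there exists an $f(x) = a_0 + \sum_{j=1}^{d} a_j x^j \in \mathbb{Z}[x]$, with $a_d = 1$, such that $f(\alpha) = 0$. If $d = 1$, then we are done since $a_0 + \alpha \in \mathbb{Z}$ and $a_0 \in \mathbb{Z}$. If $d > 1$, then $a_0 + \sum_{j=1}^{d} a_j \alpha^j \in \mathbb{Z}$, so

$$\sum_{j=1}^{d} a_j \alpha^j = \frac{\sum_{j=1}^{d} a_j a^j b^{d-j}}{b^d} \in \mathbb{Z}.$$

Therefore, $b^d \mid \sum_{j=1}^{d} a_j a^j b^{d-j}$. Since $d > 1$, $b \mid \sum_{j=1}^{d-1} a_j a^j b^{d-j}$, so $b \mid a^d$. But $\gcd(a, b) = 1$, so $b = 1$ and $\alpha \in \mathbb{Z}$. □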

Corollary 1.2 If F is an algebraic number field, then Q *OF = Z.

Proof. Since OF ( A, then by Corollary 1.1, Q * OF ( Z. But clearly Z (Q *OF , so we have equality. !

Remark 1.2 Now we establish the rings of integers for quadratic fields. First, we show that a given quadratic field is determined by a unique squarefree integer. We note that if

\[ f(x) = x^2 + ax + b \in \mathbb{Q}[x] \]

is irreducible, and $\alpha \in \mathbb{C}$ is a root of $f(x)$, then the smallest subfield of $\mathbb{C}$ containing both $\mathbb{Q}$ and $\alpha$ is given by adjoining $\alpha$ to $\mathbb{Q}$, denoted by $\mathbb{Q}(\alpha)$, so

\[ \mathbb{Q}(\alpha) = \{x + y\alpha : x, y \in \mathbb{Q}\}, \]

which is what we call a quadratic field.


Quadratic polynomials with the same squarefree part of the discriminant give rise to the same quadratic field. To see this, suppose that $f(x) = x^2 + bx + c$ and $g(x) = x^2 + b_1 x + c_1 \in \mathbb{Q}[x]$ are irreducible, $\Delta = b^2 - 4c = m^2 D$, and $\Delta_1 = b_1^2 - 4c_1 = m_1^2 D$, where $m, m_1 \in \mathbb{Z}$ and $D$ is squarefree. Then

\[ \mathbb{Q}(\sqrt{\Delta}) = \mathbb{Q}(\sqrt{m^2 D}) = \mathbb{Q}(m\sqrt{D}) = \mathbb{Q}(\sqrt{D}) = \mathbb{Q}(m_1\sqrt{D}) = \mathbb{Q}\left(\sqrt{m_1^2 D}\right) = \mathbb{Q}(\sqrt{\Delta_1}). \]

Thus, we need the following to clarify the situation on uniqueness of quadratic fields.

Theorem 1.2 Quadratic Fields Uniquely Determined

If $F$ is a quadratic field, there exists a unique squarefree integer $D$ such that $F = \mathbb{Q}(\sqrt{D})$.

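Proof. Suppose that $F = \mathbb{Q}(\alpha)$, where $\alpha$ is a root of the irreducible polynomial $x^2 + bx + c$. By the quadratic formula,

\[ \alpha \in \left\{ \alpha_1 = \frac{-b + \sqrt{b^2 - 4c}}{2},\ \alpha_2 = \frac{-b - \sqrt{b^2 - 4c}}{2} \right\}. \]

Since $\alpha_1 = -\alpha_2 - b$ with $b \in \mathbb{Q}$, then $\mathbb{Q}(\alpha_1) = \mathbb{Q}(\alpha_2) = \mathbb{Q}(\alpha)$. However,

\[ \mathbb{Q}(\alpha_1) = \mathbb{Q}\left( \frac{-b + \sqrt{b^2 - 4c}}{2} \right) = \mathbb{Q}\left(\sqrt{b^2 - 4c}\right). \]

Let $a = b^2 - 4c = e/f \in \mathbb{Q}$. Then $a \neq d^2$ for any $d \in \mathbb{Q}$ since $x^2 + bx + c$ is irreducible in $\mathbb{Q}[x]$. Without loss of generality we may assume that $\gcd(e, f) = 1$ and $f$ is positive. Let $ef = n^2 D$, where $D$ is the squarefree part of $ef$. Hence, $D \neq 1$, and arguing as in Remark 1.2, $\mathbb{Q}(\sqrt{D}) = \mathbb{Q}(\sqrt{a})$, observing that $\mathbb{Q}(\sqrt{e/f}) = \mathbb{Q}(\sqrt{ef})$. This shows existence. It remains to prove uniqueness.

If $D_1$ is a squarefree integer such that $\mathbb{Q}(\sqrt{D}) = \mathbb{Q}(\sqrt{D_1})$, then

\[ \sqrt{D} = u + v\sqrt{D_1} \]

with $u, v \in \mathbb{Q}$. By squaring, rearranging, and assuming that $uv \neq 0$, we get

\[ \sqrt{D_1} = \frac{D - u^2 - D_1 v^2}{2uv} \in \mathbb{Q}, \]

which contradicts that $D_1$ is squarefree. Thus, $uv = 0$. If $v = 0$, then $\sqrt{D} \in \mathbb{Q}$, contradicting the squarefreeness of $D$. Therefore, $u = 0$ and $D = v^2 D_1$, but again, $D$ is squarefree, so $v^2 = 1$, which yields that $D = D_1$. $\Box$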
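The existence half of the proof of Theorem 1.2 is effectively a recipe: form $a = b^2 - 4c = e/f$ and take the squarefree part of $ef$. The following Python sketch follows that recipe using trial division; the function names `squarefree_part` and `quadratic_field_radicand` are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def squarefree_part(n: int) -> int:
    """Return the squarefree D with n = k^2 * D (the sign of n is kept)."""
    if n == 0:
        raise ValueError("0 has no squarefree part")
    sign = -1 if n < 0 else 1
    n = abs(n)
    d = 1
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:      # odd exponent: p survives in D
            d *= p
        p += 1
    return sign * d * n            # leftover n is 1 or a single prime


def quadratic_field_radicand(b, c) -> int:
    """Squarefree D with Q(alpha) = Q(sqrt(D)) for a root alpha of x^2 + bx + c."""
    disc = Fraction(b) ** 2 - 4 * Fraction(c)   # a = b^2 - 4c = e/f
    e, f = disc.numerator, disc.denominator     # gcd(e, f) = 1 and f > 0
    D = squarefree_part(e * f)                  # ef = n^2 * D
    if D == 1:
        raise ValueError("x^2 + bx + c is reducible over Q")
    return D
```

For instance, $x^2 + x + 1$ has discriminant $-3$ and yields $D = -3$, while $x^2 - 8$ has discriminant $32 = 4^2 \cdot 2$ and yields $D = 2$.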

Now we are in a position to determine the ring of integers of an arbitraryquadratic field.


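Theorem 1.3 Rings of Integers in Quadratic Fields

Let $F$ be a quadratic field and let $D$ be the unique squarefree integer such that $F = \mathbb{Q}(\sqrt{D})$. Then

\[ \mathcal{O}_F = \begin{cases} \mathbb{Z}\left[\dfrac{1+\sqrt{D}}{2}\right] & \text{if } D \equiv 1 \pmod{4}, \\[2mm] \mathbb{Z}[\sqrt{D}] & \text{if } D \not\equiv 1 \pmod{4}. \end{cases} \]

Proof. Let

\[ \sigma = \begin{cases} 2 & \text{if } D \equiv 1 \pmod{4}, \\ 1 & \text{if } D \not\equiv 1 \pmod{4}. \end{cases} \]

Then since $(1 + \sqrt{D})/\sigma$ is a root of

\[ x^2 - \frac{2x}{\sigma} + \frac{1 - D}{\sigma^2}, \]

then

\[ \mathbb{Z} + \mathbb{Z}\left[ \frac{\sigma - 1 + \sqrt{D}}{\sigma} \right] \subseteq \mathcal{O}_F. \]

It remains to prove the reverse inclusion.

Let $\alpha \in \mathcal{O}_F \subseteq F$. Then $\alpha = a + b\sqrt{D}$ where $a, b \in \mathbb{Q}$. We may assume that $b \neq 0$ since otherwise we are done, given that $\mathbb{Z} \subseteq \mathbb{Z} + \mathbb{Z}\left[\frac{\sigma - 1 + \sqrt{D}}{\sigma}\right]$. Since $\mathcal{O}_F$ is a ring, then $\alpha' = a - b\sqrt{D}$, $\alpha + \alpha' = 2a$, and $\alpha\alpha' = a^2 - Db^2$ are all in $\mathcal{O}_F$. However, the latter two elements are also in $\mathbb{Q}$, and by Corollary 1.2, $\mathcal{O}_F \cap \mathbb{Q} = \mathbb{Z}$, so

\[ 2a,\ a^2 - Db^2 \in \mathbb{Z}. \tag{1.1} \]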

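Case 1.1 $a \notin \mathbb{Z}$.

We must have $a = (2c + 1)/2$ for some $c \in \mathbb{Z}$. Therefore, by (1.1), $4(a^2 - Db^2) \in \mathbb{Z}$, which implies $4Db^2 \in \mathbb{Z}$. However, since $D$ is squarefree, then $2b \in \mathbb{Z}$. (To see this, observe that if $2b = g/f$ where $g, f \in \mathbb{Z}$ with $\gcd(f, g) = 1$, and $f > 1$ is odd, then $4Dg^2 = f^2 h$ for some $h \in \mathbb{Z}$. Thus, since $\gcd(4g, f) = 1$, $f^2 \mid D$, contradicting its squarefreeness.) If $b \in \mathbb{Z}$ then, by (1.1), $a \in \mathbb{Z}$, contradicting that $a = (2c + 1)/2$. Therefore, $b = (2k + 1)/2$ for some $k \in \mathbb{Z}$. Thus,

\[ a^2 - Db^2 = \frac{(2c+1)^2}{4} - \frac{D(2k+1)^2}{4} = c^2 + c - (k^2 + k)D + \frac{1 - D}{4}, \]

which implies

\[ \frac{D - 1}{4} = c^2 + c - (k^2 + k)D - a^2 + Db^2 \in \mathbb{Z}, \]

hence, $D \equiv 1 \pmod{4}$ and

\[ \alpha = \frac{2c+1}{2} + \frac{(2k+1)\sqrt{D}}{2} = (c - k) + \frac{(2k+1)(1 + \sqrt{D})}{2} \in \mathbb{Z} + \mathbb{Z}\left[ \frac{1 + \sqrt{D}}{2} \right] = \mathbb{Z} + \mathbb{Z}\left[ \frac{\sigma - 1 + \sqrt{D}}{\sigma} \right]. \]

Case 1.2 $a \in \mathbb{Z}$.

In this instance, by (1.1), $Db^2 \in \mathbb{Z}$, and arguing as above, since $D$ is squarefree, $b \in \mathbb{Z}$. Hence,

\[ \alpha = a + b\sqrt{D} \in \mathbb{Z} + \mathbb{Z}\sqrt{D} \subseteq \mathbb{Z} + \mathbb{Z}\left[ \frac{\sigma - 1 + \sqrt{D}}{\sigma} \right], \]

which completes the reverse inclusion that secures the theorem. $\Box$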
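The criterion (1.1) and the conclusion of Theorem 1.3 can be compared numerically. The sketch below, with illustrative function names, tests whether $a + b\sqrt{D}$ has integral trace and norm, and separately computes its coordinates $(x, y)$ with respect to $1$ and $(\sigma - 1 + \sqrt{D})/\sigma$. It assumes $D \neq 1$ is squarefree, and by the theorem the two tests should agree.

```python
from fractions import Fraction


def is_algebraic_integer(a, b, D: int) -> bool:
    """Check condition (1.1): 2a and a^2 - D*b^2 are both rational integers."""
    a, b = Fraction(a), Fraction(b)
    trace = 2 * a
    norm = a * a - D * b * b
    return trace.denominator == 1 and norm.denominator == 1


def integral_coordinates(a, b, D: int):
    """Return (x, y) with a + b*sqrt(D) = x + y*(sigma - 1 + sqrt(D))/sigma."""
    sigma = 2 if D % 4 == 1 else 1      # Python's % is nonnegative, so -3 % 4 == 1
    a, b = Fraction(a), Fraction(b)
    y = sigma * b
    x = a - (sigma - 1) * b
    return x, y
```

For example, with $D = 5$ the element $(1 + \sqrt{5})/2$ has coordinates $(0, 1)$ and passes the test, whereas with $D = 3$ the element $(1 + \sqrt{3})/2$ has norm $-1/2$ and fails it.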

Definition 1.6 Field Discriminants

If $D$ is the unique squarefree integer such that $F = \mathbb{Q}(\sqrt{D})$ is a quadratic field, then the discriminant of $F$ is given by

\[ \Delta_F = \begin{cases} D & \text{if } D \equiv 1 \pmod{4}, \\ 4D & \text{if } D \not\equiv 1 \pmod{4}. \end{cases} \]

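Remark 1.3 Definition 1.6 follows from the fact that the minimal polynomial of the generator of $\mathcal{O}_F$ given in Theorem 1.3 is $x^2 - x + (1 - D)/4$ if $D \equiv 1 \pmod{4}$ and $x^2 - D$ if $D \not\equiv 1 \pmod{4}$; the discriminants of these polynomials are $D$ and $4D$, respectively.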
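Definition 1.6 translates directly into a short function. The sketch below is illustrative only (the name `field_discriminant` is not from the text); it rejects inputs that are not squarefree or that equal $0$ or $1$, since those do not define a quadratic field in the sense of Theorem 1.2.

```python
from math import isqrt


def field_discriminant(D: int) -> int:
    """Discriminant of Q(sqrt(D)) for a squarefree integer D other than 0 and 1."""
    if D in (0, 1):
        raise ValueError("D must not be 0 or 1")
    for p in range(2, isqrt(abs(D)) + 1):
        if D % (p * p) == 0:
            raise ValueError("D must be squarefree")
    # D % 4 is nonnegative in Python, so this also handles negative D.
    return D if D % 4 == 1 else 4 * D
```

So, for example, $\mathbb{Q}(\sqrt{5})$ has discriminant $5$, $\mathbb{Q}(i)$ has discriminant $-4$, and $\mathbb{Q}(\sqrt{10})$ has discriminant $40$.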

Example 1.5 Suppose we have an irreducible quadratic polynomial

\[ f(x) = ax^2 + bx + c \in \mathbb{Q}[x]. \]

Then $\Delta = b^2 - 4ac$ is the discriminant of not only $f(x)$, but also the quadratic field $\mathbb{Q}(\sqrt{\Delta})$. By the quadratic formula, the roots of $f(x)$ are given by

\[ \alpha = \frac{-b + \sqrt{\Delta}}{2a} \quad \text{and} \quad \alpha' = \frac{-b - \sqrt{\Delta}}{2a}, \]

where $\alpha'$ is called the algebraic conjugate of $\alpha$. By Exercise 1.1 on page 16, $\mathbb{Q}(\alpha) = \mathbb{Q}(\sqrt{\Delta})$. This is the simplest nontrivial number field, a quadratic field over $\mathbb{Q}$ (see Remark 1.2 on page 4).

The reader will note that some easily verified properties of conjugates are given as follows.

(a) $(\alpha\beta)' = \alpha'\beta'$.

(b) $(\alpha \pm \beta)' = \alpha' \pm \beta'$.

(c) $(\alpha/\beta)' = \alpha'/\beta'$, where $\alpha/\beta = \gamma \in \mathbb{Q}(\sqrt{\Delta})$.

Remark 1.4 If, in Theorem 1.3, $D < 0$, $F$ is called a complex (or imaginary) quadratic field, and if $D > 0$, $F$ is called a real quadratic field. Also, the group of units in a quadratic field forms an abelian group. For real quadratic fields we will learn about this group in Chapter 7 since it is more complicated than the complex case, which we tackle now. The reader will recall the notion of groups and the notation for a cyclic group, $\langle g \rangle$, generated by an element $g$ (see [68, p. 300], for instance), and recall Definition 1.2 on page 2.

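Theorem 1.4 Units in Complex Quadratic Fields

If $F = \mathbb{Q}(\sqrt{D})$ is a complex quadratic field, then

\[ U_F = U_{\mathcal{O}_F} = \begin{cases} \langle \zeta_6 \rangle = \left\langle \dfrac{1 + \sqrt{-3}}{2} \right\rangle & \text{if } D = -3, \\[2mm] \langle \zeta_4 \rangle = \langle \sqrt{-1} \rangle & \text{if } D = -1, \\[1mm] \langle \zeta_2 \rangle = \langle -1 \rangle & \text{otherwise.} \end{cases} \]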

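Proof. By Theorem 1.3 on page 6 we may write $u = a + b\sqrt{D} \in U_{\mathcal{O}_F}$, with $2a, 2b \in \mathbb{Z}$. Hence, if $D \not\equiv 1 \pmod{4}$, then $a^2 - b^2 D = 1$ for some $a, b \in \mathbb{Z}$ since $D < 0$. If $D < -1$, then $a^2 - b^2 D > 1$ for $b \neq 0$. Thus, $b = 0$ for $D \not\equiv 1 \pmod{4}$ with $D < -1$. In other words, $U_{\mathcal{O}_F} = \langle -1 \rangle = \langle \zeta_2 \rangle$ if $D \equiv 2, 3 \pmod{4}$ and $D < -1$.

Now we assume that $D \equiv 1 \pmod{4}$ and write $u = (a + b\sqrt{D})/2$, so $a^2 - Db^2 = 4$ for $a, b \in \mathbb{Z}$. If $D < -4$, then for $b \neq 0$, $a^2 - Db^2 > 4$, a contradiction. Hence, for $D \equiv 1 \pmod{4}$ and $D < -4$,

\[ U_{\mathcal{O}_F} = \langle \zeta_2 \rangle. \]

It remains to consider the cases $D = -1, -3$. If $D = -1$, then by Theorem 1.3 on page 6, $\mathcal{O}_F = \mathbb{Z}[i] = \mathbb{Z} + \mathbb{Z}i$, and $a + bi$ is a unit in $\mathcal{O}_F$ if and only if $a^2 + b^2 = 1$. The solutions are $(a, b) \in \{(0, \pm 1), (\pm 1, 0)\}$. In other words, $U_{\mathbb{Z}[i]} = \{\pm 1, \pm i\}$.

If $D = -3$, then $a^2 + 3b^2 = 4$, so either $|a| = |b| = 1$, or $b = 0$ and $a = \pm 2$. Hence, the units are $\pm 1$, $(1 \pm \sqrt{-3})/2$, and $(-1 \pm \sqrt{-3})/2$. However, $1 = \zeta_6^6$, $-1 = \zeta_6^3$, $(1 + \sqrt{-3})/2 = \zeta_6$, $(1 - \sqrt{-3})/2 = \zeta_6^5$, $(-1 + \sqrt{-3})/2 = \zeta_6^2$, and $(-1 - \sqrt{-3})/2 = \zeta_6^4$. Hence, $U_{\mathcal{O}_{\mathbb{Q}(\sqrt{-3})}} = \langle \zeta_6 \rangle$, as required. $\Box$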
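The norm equations used in the proof of Theorem 1.4 can be solved by exhaustive search, since a negative $D$ bounds both $|a|$ and $|b|$. The sketch below counts units this way; `unit_count` is a hypothetical helper name, and $D$ is assumed to be a negative squarefree integer.

```python
from math import isqrt


def unit_count(D: int) -> int:
    """Count the units of O_F for F = Q(sqrt(D)) with D < 0 squarefree.

    A unit is written (a + b*sqrt(D))/s with integers a, b, where s = 2 if
    D = 1 (mod 4) and s = 1 otherwise; it is a unit iff a^2 - D*b^2 = s^2.
    """
    if D >= 0:
        raise ValueError("D must be negative")
    s = 2 if D % 4 == 1 else 1
    target = s * s
    bound = isqrt(target)          # since -D >= 1, |a| and |b| are at most s
    count = 0
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if a * a - D * b * b == target:
                count += 1
    return count
```

When $D \equiv 1 \pmod{4}$, any solution of $a^2 - Db^2 = 4$ automatically has $a \equiv b \pmod{2}$, so every solution counted does lie in $\mathcal{O}_F$. The counts $6$, $4$, and $2$ match the orders of $\langle \zeta_6 \rangle$, $\langle \zeta_4 \rangle$, and $\langle \zeta_2 \rangle$.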

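The above development leads to the following notions and allows us to discuss divisibility in $\mathcal{O}_F$, which is not closed under division.

Definition 1.7 Division in $\mathcal{O}_F$

If $F$ is a number field and $\alpha, \beta \in \mathcal{O}_F$, $\alpha$ is said to divide $\beta$ if there exists a $\gamma \in \mathcal{O}_F$ such that $\beta = \alpha\gamma$, denoted by $\alpha \mid \beta$ in $\mathcal{O}_F$. If no such $\gamma$ exists, we say that $\alpha$ does not divide $\beta$, denoted by $\alpha \nmid \beta$ in $\mathcal{O}_F$. If $\alpha \mid \beta_1$ and $\alpha \mid \beta_2$ for $\beta_1, \beta_2 \in \mathcal{O}_F$, $\alpha$ is said to be a common divisor of $\beta_1$ and $\beta_2$ in $\mathcal{O}_F$.

Example 1.6 In $\mathbb{Z}[\sqrt{10}] = \mathcal{O}_F$, where $F = \mathbb{Q}(\sqrt{10})$ by Theorem 1.3 on page 6, we have $(4 + \sqrt{10})(4 - \sqrt{10}) = 6 = 6 + 0\sqrt{10}$, so $\alpha = (4 + \sqrt{10}) \mid 6 = \beta$ in $\mathbb{Z}[\sqrt{10}]$.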

Now we look at a new perspective, namely elements over an integral domain (see [68, Remark 2.6, p. 81] for the basics on integral domains).

Definition 1.8 Elements Algebraic and Integral Over a Domain

If $R \subseteq S$ where $R$ and $S$ are integral domains, then $\alpha \in S$ is said to be integral over $R$ if there exists an

\[ f(x) = x^d + r_{d-1}x^{d-1} + \cdots + r_1 x + r_0 \in R[x] \]

such that $f(\alpha) = 0$. If $R$ is a field and $\alpha$ is integral over $R$, then $\alpha$ is said to be algebraic over $R$. Also, if every nonconstant polynomial $f(x) \in R[x]$ has a root in $R$, then $R$ is said to be algebraically closed. Moreover, any extension field that is algebraic over $R$ and is algebraically closed is called an algebraic closure of $R$, and it may be shown that an algebraic closure is unique up to isomorphism.

Remark 1.5 It is trivially true that every element of $R$ is integral over $R$ since $\alpha \in R$ satisfies $f(\alpha) = 0$ for $f(x) = x - \alpha \in R[x]$. Note, as well, that in view of Definition 1.4 on page 2, and Definition 1.8, we may now restate the notion of an algebraic number as a complex number that is algebraic over $\mathbb{Q}$. Moreover, in view of Definition 1.1 on page 1 and Definition 1.8, we see that an algebraic integer is a complex number that is integral over $\mathbb{Z}$.

Given an element $\alpha$ that is algebraic over a number field $F$, Definition 1.8 tells us that there is a monic polynomial $f(x) \in F[x]$ with $f(\alpha) = 0$. We may assume that $f$ has minimal degree. Hence, $f$ must be irreducible, since otherwise, $\alpha$ would be the root of a polynomial of lower degree. Thus chosen, $f$ is called the minimal polynomial of $\alpha$ over $F$. It turns out this polynomial is also unique (see Theorem 1.6 on the next page).

We now want to demonstrate that algebraic integers are sufficient to characterize algebraic number fields. First we need the following crucial result.

Lemma 1.1 Algebraic Numbers as Quotients of Integers

Every algebraic number is of the form $\alpha/\ell$ where $\alpha$ is an algebraic integer and $\ell \in \mathbb{Z}$ is nonzero.

Proof. By Definition 1.4 on page 2, if $\delta$ is an algebraic number, there exist $a_j \in \mathbb{Q}$ for $j = 0, 1, 2, \ldots, d-1$ such that $\delta$ is a root of

\[ f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{d-1}x^{d-1} + x^d. \]

Since $a_0 + a_1\delta + a_2\delta^2 + \cdots + a_{d-1}\delta^{d-1} + \delta^d = 0$, we may form the least common multiple, $\ell$, of the denominators of the $a_j$ for $j = 0, 1, \ldots, d-1$. Then

\[ (\ell\delta)^d + (\ell a_{d-1})(\ell\delta)^{d-1} + \cdots + (\ell^{d-1}a_1)(\ell\delta) + \ell^d a_0 = 0. \]

Thus $\ell\delta$ is the root of a monic integral polynomial, so $\ell\delta$ is an algebraic integer, say, $\alpha$. Hence, $\delta = \alpha/\ell$, with $\alpha \in \mathbb{A}$ and $\ell \in \mathbb{Z}$. $\Box$
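The denominator-clearing step in the proof of Lemma 1.1 can be carried out mechanically. The sketch below takes the non-leading coefficients of a monic rational polynomial and returns $\ell$ together with the integer coefficients of the monic polynomial satisfied by $\ell$ times any root. The name `integral_multiple` is illustrative, and `math.lcm` assumes Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm


def integral_multiple(coeffs):
    """Clear denominators as in Lemma 1.1.

    coeffs = [a_0, ..., a_{d-1}] are the rational coefficients of the monic
    polynomial x^d + a_{d-1} x^{d-1} + ... + a_0.  Returns (l, [c_0, ..., c_{d-1}])
    where y^d + c_{d-1} y^{d-1} + ... + c_0 vanishes at y = l * root.
    """
    coeffs = [Fraction(c) for c in coeffs]
    d = len(coeffs)
    l = lcm(*(c.denominator for c in coeffs))
    # The coefficient of (l*root)^j is l^(d-j) * a_j, which is an integer.
    cleared = [int(c * l ** (d - j)) for j, c in enumerate(coeffs)]
    return l, cleared
```

For example, $1/\sqrt{2}$ is a root of $x^2 - 1/2$; here $\ell = 2$ and $2 \cdot (1/\sqrt{2}) = \sqrt{2}$ is a root of $y^2 - 2$.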

Theorem 1.5 Number Fields: Algebraic Integer Extensions

If $F$ is an algebraic number field, then there is an algebraic integer $\alpha$ such that $F = \mathbb{Q}(\alpha)$. Also, $\beta \in F$ if and only if there are unique $q_j \in \mathbb{Q}$ for $j = 0, 1, \ldots, n-1$, such that

\[ \beta = q_0 + q_1\alpha + \cdots + q_{n-1}\alpha^{n-1}, \]

where $n = |F : \mathbb{Q}|$.

Proof. By Exercise 1.4 on page 16, $F = \mathbb{Q}(\delta)$ for some algebraic number $\delta$, and by Lemma 1.1, $\mathbb{Q}(\delta) = \mathbb{Q}(\alpha/\ell) = \mathbb{Q}(\alpha)$ for some $\alpha \in \mathbb{A}$. The second statement follows from the first statement in conjunction with Claim 1.1 on page 3 and Definition 1.8 on the previous page. $\Box$

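Example 1.7 Let $E = \mathbb{Q}(\sqrt{2}, i)$, where $i = \zeta_4 = \sqrt{-1}$ is a primitive fourth root of unity. Then by Exercise 1.6 on page 17,

\[ \mathbb{Q}(i, \sqrt{2}) = \mathbb{Q}\left( \frac{\sqrt{2}}{2}(1 + i) \right), \]

and

\[ \zeta_8 = \frac{\sqrt{2}}{2}(1 + i), \]

where $\zeta_8$ is a primitive eighth root of unity.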

Theorem 1.6 Minimal Polynomials are Unique

A number $\alpha \in \mathbb{C}$ is an algebraic number of degree $d \in \mathbb{N}$ over a number field $F$ if and only if $\alpha$ is the root of a unique irreducible monic polynomial, denoted by $m_{\alpha,F}(x) \in F[x]$.

Any $h(x) \in F[x]$ such that $h(\alpha) = 0$ must be divisible by $m_{\alpha,F}(x)$ in $F[x]$.

Proof. If $\alpha$ is an algebraic number of degree $d$ over $F$, then by Definition 1.4 on page 2, we may let $f(x) \in F[x]$ be a monic polynomial of minimal degree with $f(\alpha) = 0$, and let $h(x) \in F[x]$ be any other monic polynomial of minimal degree with $h(\alpha) = 0$. Then by the Euclidean algorithm for polynomials (see [68, Theorem A.11, p. 302]), there exist $q(x), r(x) \in F[x]$ such that

\[ h(x) = q(x)f(x) + r(x), \]

where $0 \le \deg(r) < \deg(f)$ or $r(x) = 0$, the zero polynomial.

However, $f(\alpha) = 0 = h(\alpha)$, so $r(\alpha) = 0$, contradicting the minimality of $f$ unless $r(x) = 0$ for all $x$. Hence, $f(x) \mid h(x)$. The same argument can be used to show that $h(x) \mid f(x)$. Hence, $h(x) = cf(x)$ for some $c \in F$. However, $f$ and $h$ are monic, so $c = 1$ and $h = f$. This proves that $f(x) = m_{\alpha,F}(x)$ is the unique monic polynomial of minimal degree with root $\alpha$ over $F$. The converse of the first statement follows a fortiori.

To prove the second statement, assume that $h(x) \in F[x]$ satisfies $h(\alpha) = 0$ and use the Euclidean algorithm for polynomials as above to conclude that $m_{\alpha,F}(x) \mid h(x)$ by letting $m_{\alpha,F}(x) = f(x)$ in the above argument. $\Box$

Corollary 1.3 An irreducible polynomial over an algebraic number field has no repeated roots in $\mathbb{C}$. In particular, all the roots of $m_{\alpha,F}(x)$ are distinct.

Proof. If $F$ is a number field and $f(x) \in F[x]$ is irreducible with a repeated root $\alpha$, then

\[ f(x) = c(x - \alpha)^2 g(x), \]

for some $c \in F$ and $g(x) \in \mathbb{C}[x]$. By Theorem 1.6, $m_{\alpha,F}(x) \mid f(x)$, so $f(x) = a\,m_{\alpha,F}(x)$ for some $a \in F$, since $f$ is irreducible. However,

\[ f'(x) = 2c(x - \alpha)g(x) + c(x - \alpha)^2 g'(x), \]

where $f'$ is the derivative of $f$. Hence, $f'(\alpha) = 0$, so by Theorem 1.6 again,

\[ m_{\alpha,F}(x) \mid f'(x), \]

contradicting the minimality of $m_{\alpha,F}(x)$ since $\deg(f') < \deg(f)$. $\Box$

Corollary 1.4 If $\alpha \in \mathbb{A}$, then $m_{\alpha,\mathbb{Q}}(x) \in \mathbb{Z}[x]$.

Proof. This follows from Definition 1.1 on page 1 and Theorem 1.6. $\Box$

Example 1.8 Returning to Example 1.7 on the preceding page, we see that if $F = \mathbb{Q}(i)$ and $\alpha = \zeta_8$, then

\[ m_{\alpha,F}(x) = x^2 - i \]

is the minimal polynomial of $\alpha$ over $F$. Moreover, the minimal polynomial of $\alpha$ over $\mathbb{Q}$ is given by

\[ m_{\alpha,\mathbb{Q}}(x) = \frac{x^8 - 1}{x^4 - 1} = x^4 + 1, \]

which is an example of the following type of distinguished polynomial.

Definition 1.9 Cyclotomic Polynomials

If $n \in \mathbb{N}$, then the $n$th cyclotomic polynomial is given by

\[ \Phi_n(x) = \prod_{\substack{\gcd(n,j)=1 \\ 1 \le j \le n}} (x - \zeta_n^j), \]

where $\zeta_n$ is given by Definition 1.2 on page 2. The degree of $\Phi_n(x)$ is $\phi(n)$ where $\phi(n)$ is the Euler totient (see [68]).
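Cyclotomic polynomials can be computed without any complex arithmetic by using the identity $x^n - 1 = \prod_{d \mid n} \Phi_d(x)$ (the statement of Exercise 1.7): divide $x^n - 1$ by $\Phi_d(x)$ for every proper divisor $d$ of $n$. The following sketch does this with integer coefficient lists, lowest degree first; the function names are illustrative.

```python
def poly_divide_exact(num, den):
    """Divide integer polynomials (coefficients from degree 0 upward).

    den must be monic and must divide num exactly.
    """
    num = num[:]                                   # work on a copy
    quotient = [0] * (len(num) - len(den) + 1)
    for k in range(len(quotient) - 1, -1, -1):
        q = num[k + len(den) - 1]                  # current leading coefficient
        quotient[k] = q
        for i, c in enumerate(den):
            num[k + i] -= q * c
    assert not any(num), "division was not exact"
    return quotient


def cyclotomic(n: int):
    """Coefficients of the n-th cyclotomic polynomial, lowest degree first."""
    poly = [-1] + [0] * (n - 1) + [1]              # x^n - 1
    for d in range(1, n):
        if n % d == 0:                             # strip Phi_d for proper divisors d
            poly = poly_divide_exact(poly, cyclotomic(d))
    return poly
```

For instance, `cyclotomic(8)` returns the coefficients of $x^4 + 1$, in agreement with Example 1.8.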

Remark 1.6 The reader may think of the term cyclotomic as "circle dividing," since the $n$th roots of unity divide the unit circle into $n$ equal arcs. The cyclotomic polynomial also played a role in Gauss's theory of constructible regular polygons.

Note that since the roots of the $n$th cyclotomic polynomial are precisely the primitive $n$th roots of unity, then the degree of $\Phi_n(x)$ is necessarily $\phi(n)$. We now demonstrate the irreducibility of the cyclotomic polynomial.

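Theorem 1.7 Irreducibility of the Cyclotomic Polynomial

For $n \in \mathbb{N}$,

\[ \Phi_n(x) = m_{\zeta_n,\mathbb{Q}}(x), \]

so $\Phi_n(x)$ is irreducible in $\mathbb{Z}[x]$.

Proof. We may let $\Phi_n(x) = m_{\zeta_n,\mathbb{Q}}(x)g(x)$ for some $g(x) \in \mathbb{Z}[x]$ by Theorem 1.6 on page 10.

Claim 1.2 $m_{\zeta_n,\mathbb{Q}}(\zeta_n^p) = 0$ for any prime $p \nmid n$.

If $m_{\zeta_n,\mathbb{Q}}(\zeta_n^p) \neq 0$, then $g(\zeta_n^p) = 0$, so $\zeta_n$ is a root of $g(x^p)$. By Theorem 1.6 again,

\[ g(x^p) = m_{\zeta_n,\mathbb{Q}}(x)h(x) \]

for some $h(x) \in \mathbb{Z}[x]$. Let

\[ f(x) = \sum_j a_j x^j \in \mathbb{Z}[x] \]

have image

\[ \overline{f}(x) = \sum_j \overline{a_j} x^j \]

under the natural map $\mathbb{Z}[x] \mapsto (\mathbb{Z}/p\mathbb{Z})[x]$. Thus,

\[ \overline{g}(x^p) = \overline{m}_{\zeta_n,\mathbb{Q}}(x)\,\overline{h}(x). \]

However, $\overline{g}(x^p) = \overline{g}^{\,p}(x)$ since $\operatorname{char}(\mathbb{Z}/p\mathbb{Z}) = p$. Therefore,

\[ 0 = \overline{g}(\zeta_n^p) = (\overline{g}(\zeta_n))^p, \quad \text{so } \overline{g}(\zeta_n) = 0. \]

Since $\Phi_n(x) \mid (x^n - 1)$, then

\[ x^n - 1 = \Phi_n(x)k(x) = m_{\zeta_n,\mathbb{Q}}(x)g(x)k(x), \]

for some $k(x) \in \mathbb{Z}[x]$. Therefore, in $(\mathbb{Z}/p\mathbb{Z})[x]$,

\[ x^n - \overline{1} = \overline{x^n - 1} = \overline{m}_{\zeta_n,\mathbb{Q}}(x)\,\overline{g}(x)\,\overline{k}(x). \]

Since $\overline{g}$ and $\overline{m}_{\zeta_n,\mathbb{Q}}$ have a common root $\zeta_n$, then $x^n - \overline{1}$ has a repeated root. However, this is impossible by irreducibility criteria for polynomials over finite fields, since $p \nmid n$ (see [68, Corollary A.2, p. 301], for instance, where we see that $x^n - 1$ is irreducible if and only if $\gcd(x^n - 1, x^{p^i} - x) = 1$ for all natural numbers $i \le \lfloor n/2 \rfloor$). We have established Claim 1.2, namely that $\zeta_n^p$ is a root of $m_{\zeta_n,\mathbb{Q}}(x)$ for any prime $p \nmid n$.

Repeated application of the above argument shows that $y^p$ is a root of $m_{\zeta_n,\mathbb{Q}}(x)$ whenever $y$ is a root. Hence, $\zeta_n^j$ is a root of $m_{\zeta_n,\mathbb{Q}}(x)$ for all $j$ relatively prime to $n$ such that $1 \le j < n$. Thus, $\deg(m_{\zeta_n,\mathbb{Q}}) \ge \phi(n)$. However, $m_{\zeta_n,\mathbb{Q}}(x) \mid \Phi_n(x)$, so $m_{\zeta_n,\mathbb{Q}}(x) = \Phi_n(x)$, as required. $\Box$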

At this juncture, we look at general properties of units in rings of integers, in keeping with one of the themes of this section.

Proposition 1.1 Let $\alpha \in \mathbb{A}$. Then the following are equivalent.

(a) $\alpha$ is a unit.

(b) $\alpha \mid 1$ in $\mathbb{A}$.

(c) If $F = \mathbb{Q}(\alpha)$, then $m_{\alpha,F}(0) = \pm 1$.

Proof. The equivalence of (a) and (b) comes from Definition 1.3 on page 2. Now assume that $\alpha$ is a unit. Then, by Exercise 1.5 on page 17, $m_{\alpha,F}(0) = (-1)^d \prod_{j=1}^{d} \alpha_j = \pm 1$ if and only if $\alpha \in U_F$, so (a) and (c) are equivalent. $\Box$

We have now developed sufficient algebraic number theory in quadratic fields to provide a solution to a Diophantine problem that we did not have the tools to do in a first course (see [68, closing paragraph, p. 272]).

Definition 1.10 Generalized Ramanujan–Nagell Equations

The Diophantine equation

\[ x^2 - D = p^n, \quad \text{for } D < 0,\ n \in \mathbb{N}, \text{ and } p \text{ prime} \tag{1.2} \]

is called the generalized Ramanujan–Nagell equation. This is a generalization of the equation $x^2 + 7 = 2^n$ studied by Ramanujan (see [68, Biography 7.1, p. 273]).


Theorem 1.8 Solutions of the Ramanujan–Nagell Equations

The only solutions of

\[ x^2 + 7 = 2^n \tag{1.3} \]

with $x > 0$ are $(x, n) \in \{(1, 3), (3, 4), (5, 5), (11, 7), (181, 15)\}$.
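Before reading the proof, the statement of Theorem 1.8 can be checked numerically for small exponents. The sketch below tests whether $2^n - 7$ is a perfect square for each $n$ up to a chosen bound. This is only a finite check under an arbitrary cutoff, not a substitute for the proof, and the function name is illustrative.

```python
from math import isqrt


def ramanujan_nagell_solutions(max_n: int):
    """Return all (x, n) with x > 0, 3 <= n <= max_n and x^2 + 7 = 2^n."""
    solutions = []
    for n in range(3, max_n + 1):      # 2^n - 7 is positive only for n >= 3
        t = 2 ** n - 7
        x = isqrt(t)                   # exact integer square root
        if x * x == t:
            solutions.append((x, n))
    return solutions
```

Calling `ramanujan_nagell_solutions(200)` recovers exactly the five pairs listed in the theorem.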

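Proof. If $n$ is even, then

\[ (2^{n/2})^2 - x^2 = (2^{n/2} - x)(2^{n/2} + x) = 7, \]

which implies that $2^{n/2} + x = 7$ and $2^{n/2} - x = 1$, for which only $n = 4$ and $x = 3$ provide a solution. Now assume that $n - 2 = m$ for odd $m \in \mathbb{N}$, and since clearly $(x, n) = (1, 3)$ is a solution, we may assume that $m > 1$.

By Theorem 1.3 on page 6, since $-7 \equiv 1 \pmod{4}$, then

\[ \mathcal{O}_{\mathbb{Q}(\sqrt{-7})} = \mathbb{Z}[(1 + \sqrt{-7})/2], \]

so since $x^2 + 7 = 2^n$, then

\[ \left( \frac{x + \sqrt{-7}}{2} \right) \left( \frac{x - \sqrt{-7}}{2} \right) = 2^m. \]

Therefore, there exist $a, b \in \mathbb{Z}$ such that $(a^2 + 7b^2)/4 = 2$, or $a^2 + 7b^2 = 8$, where we may assume, without loss of generality, that $a > 0$. Thus, only $a = 1$ and $b = \pm 1$ work. Hence,

\[ \left( \frac{x + \sqrt{-7}}{2} \right) \left( \frac{x - \sqrt{-7}}{2} \right) = \left( \frac{1 + \sqrt{-7}}{2} \right)^m \left( \frac{1 - \sqrt{-7}}{2} \right)^m. \tag{1.4} \]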

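Now let

\[ \alpha = \frac{1 + \sqrt{-7}}{2} \quad \text{and} \quad \beta = \frac{1 - \sqrt{-7}}{2}, \]

so $\alpha + \beta = 1$ and $\alpha\beta = 2$. Since there are no other factorizations of the right-hand side of (1.4), up to units, we must have

\[ \left( \frac{x \pm \sqrt{-7}}{2} \right) = \pm\alpha^m \text{ or } \pm\beta^m. \]

Using (1.4) we see that no matter which of the four possible selections is made for $(x \pm \sqrt{-7})/2$, we have

\[ \pm\sqrt{-7} = \alpha^m - \beta^m. \]

We show that the plus sign cannot occur. If the plus sign occurs, then

\[ \alpha - \beta = \left( \frac{1 + \sqrt{-7}}{2} \right) - \left( \frac{1 - \sqrt{-7}}{2} \right) = \sqrt{-7} = \alpha^m - \beta^m. \tag{1.5} \]

Therefore, since $\alpha\beta = 2$, then $\alpha^2 = (1 - \beta)^2 \equiv 1 \pmod{\beta^2}$, where the congruence, here and in what follows, takes place in $\mathbb{Z}[(1 + \sqrt{-7})/2]$. Thus, $\alpha^m \equiv \alpha(\alpha^2)^{(m-1)/2} \equiv \alpha \pmod{\beta^2}$. Therefore, by (1.5),

\[ \alpha \equiv \alpha^m - \beta^m + \beta \equiv \alpha + \beta \pmod{\beta^2}, \]

so $\beta \equiv 0 \pmod{\beta^2}$, namely, $\beta \mid 1$ in $\mathcal{O}_{\mathbb{Q}(\sqrt{-7})}$, a contradiction, since $\beta$ is not a unit.

We have shown that $-\sqrt{-7} = \alpha^m - \beta^m$. Hence,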

\[ -1 = \frac{\alpha^m - \beta^m}{\sqrt{-7}} = \frac{\left( \frac{1 + \sqrt{-7}}{2} \right)^m - \left( \frac{1 - \sqrt{-7}}{2} \right)^m}{\sqrt{-7}}. \tag{1.6} \]

Now we expand (1.6) by using the Binomial Theorem (see [68, Theorem 1.6, p. 9]) and once done, (1.6) equals

\[ \frac{\sum_{j=0}^{m} \binom{m}{j} (\sqrt{-7})^{j-1} - \sum_{j=0}^{m} \binom{m}{j} (-1)^j (\sqrt{-7})^{j-1}}{2^m} = \frac{\sum_{j=0}^{m} \binom{m}{j} (\sqrt{-7})^{j-1} [1 - (-1)^j]}{2^m} = \frac{\sum_{j=1}^{(m+1)/2} \binom{m}{2j-1} (-7)^{j-1}}{2^{m-1}}. \]

Hence,

\[ -2^{m-1} = \sum_{k=1}^{(m+1)/2} \binom{m}{2k-1} (-7)^{k-1}. \tag{1.7} \]

From (1.7), we glean that $-2^{m-1} \equiv m \pmod{7}$, and this has solutions if and only if $m \equiv 3, 5, 13 \pmod{42}$. In other words, this occurs if and only if $n \equiv 5, 7, 15 \pmod{42}$, which are exactly the values for which we are searching. However, we must ensure that none of these distinct solutions are congruent modulo 42, our last remaining task.

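If we have two distinct solutions $m_1$ and $m_2$ with $m_1 \equiv m_2 \pmod{42}$ and $7^\ell$ for $\ell \in \mathbb{N}$ is the largest power of 7 dividing $m_1 - m_2$, then

\[ \alpha^{m_1} = \alpha^{m_2}\alpha^{m_1 - m_2} = \alpha^{m_2} \left( \frac{1}{2} \right)^{m_1 - m_2} \left( 1 + \sqrt{-7} \right)^{m_1 - m_2}, \tag{1.8} \]

where

\[ \left( \frac{1}{2} \right)^{m_1 - m_2} = \left[ \left( \frac{1}{2} \right)^{6} \right]^{(m_1 - m_2)/6} \equiv 1 \pmod{7^{\ell+1}}. \]

Now by an easy iterative argument, this leads to the congruence

\[ \alpha^{m_1 - m_2} \equiv 1 + (m_1 - m_2)\sqrt{-7} \pmod{7^{\ell+1}}. \tag{1.9} \]

However, using the Binomial Theorem as above, we have

\[ \alpha^{m_2} \equiv \frac{1 + m_2\sqrt{-7}}{2^{m_2}} \pmod{7}. \tag{1.10} \]

Substituting (1.10) and (1.9) into (1.8) yields the congruence

\[ \alpha^{m_1} \equiv \alpha^{m_2} + \frac{m_1 - m_2}{2^{m_2}}\sqrt{-7} \pmod{7^{\ell+1}}. \]

By a similar argument,

\[ \beta^{m_1} \equiv \beta^{m_2} - \frac{m_1 - m_2}{2^{m_2}}\sqrt{-7} \pmod{7^{\ell+1}}. \]

Hence,

\[ \alpha^{m_1} - \beta^{m_1} \equiv \alpha^{m_2} - \beta^{m_2} + \frac{m_1 - m_2}{2^{m_2 - 1}}\sqrt{-7} \pmod{7^{\ell+1}}. \]

We also know, by the same argument as that used on $m$ above, that

\[ \alpha^{m_1} - \beta^{m_1} = \alpha^{m_2} - \beta^{m_2}, \]

so $(m_1 - m_2)\sqrt{-7} \equiv 0 \pmod{7^{\ell+1}}$. Since $m_1, m_2 \in \mathbb{Z}$, then $m_1 \equiv m_2 \pmod{7^{\ell+1}}$, which contradicts the fact that $7^\ell$ is the largest power of 7 dividing such a difference. Hence, $\ell$ cannot exist, so $m_1 = m_2$. $\Box$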

Later, when we have developed more algebraic number theory such as ideal theory, we will be able to prove results for the generalized Ramanujan–Nagell equation (see §8.2). For now we have exploited the most out of our development thus far, so this is a suitable juncture to end this section.

In the following section, we will concentrate upon a special type of quadratic field called Gaussian, and we will look at it in detail as a mechanism for developing more general concepts.

Exercises

1.1. Let $\mathbb{Q}(\alpha)$ be an algebraic number field. Prove that $\mathbb{Q}(\alpha) = \mathbb{Q}(a\alpha + b)$ for any $a, b \in \mathbb{Q}$ with $a \neq 0$.

1.2. Let $R$ be a ring and let $\{R_j : j \in I\}$ for some indexing set $I$ be any set of subrings of $R$. Prove that $\cap_{j \in I} R_j$ is a subring of $R$. Also, show that if $R_1 \subseteq R_2 \subseteq \cdots \subseteq R_j \subseteq \cdots$, then $\cup_{j \in I} R_j$ is a subring of $R$.

1.3. Let $p$ be a prime and let $\zeta_p$ be a primitive $p$th root of unity. Prove that $m_{\zeta_p,\mathbb{Q}}(x) = x^{p-1} + x^{p-2} + \cdots + x + 1$.

1.4. Prove that if an algebraic number field $F$ is of the form

\[ F = \mathbb{Q}(\alpha_1, \alpha_2, \ldots, \alpha_n) \]

for $n \in \mathbb{N}$ where $\alpha_j$ for $j = 1, 2, \ldots, n$ are algebraic numbers, then there is an algebraic number $\delta$ such that $F = \mathbb{Q}(\delta)$. (Hence, all algebraic number fields are simple extensions of $\mathbb{Q}$.)


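(Hint: It suffices to prove this for $n = 2$ with $\alpha_1 = \alpha$ and $\alpha_2 = \beta$. Let

\[ m_{\alpha,\mathbb{Q}}(x) = \prod_{j=1}^{d_\alpha} (x - \alpha_j), \]

where the $\alpha_j$ are the conjugates of $\alpha$ over $\mathbb{Q}$, and let

\[ m_{\beta,\mathbb{Q}}(x) = \prod_{j=1}^{d_\beta} (x - \beta_j), \]

where the $\beta_j$ are the conjugates of $\beta_1 = \beta$ over $\mathbb{Q}$. Select $q \in \mathbb{Q}$ with $q \neq (\alpha - \alpha_k)/(\beta_j - \beta)$ for any $k = 1, 2, \ldots, d_\alpha$ and any $j = 1, 2, \ldots, d_\beta$, and let

\[ \delta = \alpha + q\beta \]

and

\[ f(x) = m_{\alpha,\mathbb{Q}}(\delta - qx). \]

Prove that $\beta$ is the only common root of $f(x)$ and $m_{\beta,\mathbb{Q}}(x)$. Show that this implies $\mathbb{Q}(\alpha, \beta) \subseteq \mathbb{Q}(\delta)$. The reverse inclusion is clear.)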

1.5. Let $F$ be an algebraic number field. Prove that if $\alpha \in U_F$, then $\alpha_j \in U_F$ for all $j = 1, 2, \ldots, d$ where

\[ m_{\alpha,F}(x) = x^d + a_{d-1}x^{d-1} + \cdots + a_1 x + a_0, \]

for some $d \in \mathbb{N}$, is the minimal polynomial of $\alpha$ over $F$, and the $\alpha_j$ are the roots of $m_{\alpha,F}(x)$. Conclude that if $F$ is an algebraic number field, then

\[ \alpha \in U_F \text{ if and only if } \prod_{j=1}^{d} \alpha_j = \pm 1. \]

1.6. Referring to Example 1.7 on page 10, prove that

\[ \mathbb{Q}(i, \sqrt{2}) = \mathbb{Q}\left( \frac{\sqrt{2}}{2}(1 + i) \right), \]

and that if $\zeta_8$ is a primitive eighth root of unity, then it is an odd power of $\sqrt{2}(1 + i)/2$.

1.7. Prove that

\[ x^n - 1 = \prod_{d \mid n} \Phi_d(x), \]

where $\Phi_d(x)$ is the cyclotomic polynomial given in Definition 1.9 on page 11.


1.2 The Gaussian Field

One may say that mathematics talks about things which are of no concern to man. Mathematics has the inhuman quality of starlight, brilliant and sharp, but cold. But it seems an irony of creation that man's mind knows how to handle things the better the farther removed they are from the center of his existence. Thus, we are cleverest where knowledge matters least: in mathematics, especially number theory (see [102]).

Hermann Weyl, German mathematician (see Biography 1.1 on page 31)

The ring of Gaussian integers $\mathbb{Z}[i]$ in the Gaussian number field, $\mathbb{Q}(\sqrt{-1}) = \mathbb{Q}(i)$, exhibits properties of the algebraic integers such as the greatest common divisor, prime elements, relative primality, and unique factorization, which allow us a pedagogical means of introducing such concepts with minimal abstraction for later elucidation. (Note that by Theorem 1.3 on page 6, we know that $\mathbb{Z}[i]$ must be the ring of integers of $\mathbb{Q}(i)$.) Indeed, the study in this section may be viewed as a link to the general theory of algebraic numbers to which we were introduced in §1.1. For the following we need to recall Example 1.5 on page 7.

Definition 1.11 Quadratic Conjugates and Norms

If $F = \mathbb{Q}(\sqrt{D})$ is a quadratic number field and $\alpha \in F$, then

\[ N_F(\alpha) = \alpha\alpha' \]

is called the norm of $\alpha$ from $F$ to $\mathbb{Q}$, where $\alpha'$ is the conjugate of $\alpha$.

Remark 1.7 Definition 1.11 is a precursor to the more general notion of a "norm" that we will study later in the text. We will see that the norm is the product of all the "conjugates" of $\alpha$ from a given number field. Exercise 1.8 on page 28 tells us that the norm is multiplicative, that it is equal to zero if and only if the preimage is zero, that $N_F(\alpha) \in \mathbb{Q}$ for any algebraic number $\alpha$, and that $N_F(\alpha) \in \mathbb{Z}$ for any algebraic integer $\alpha$. In particular, if $\alpha = a + bi \in \mathbb{Z}[i]$, then

\[ N_F(\alpha) = a^2 + b^2 \ge 0. \]

Furthermore, for elements $\alpha, \beta \in \mathcal{O}_F$, if $\alpha \mid \beta$ in $\mathcal{O}_F$, then $N(\alpha) \mid N(\beta)$ in $\mathbb{Z}$.

Now we illustrate Definition 1.7 on page 8 for Gaussian integers, which displays the divisibility within $\mathbb{Z}[i]$.

Example 1.9 Since $5 = (2 + i)(2 - i)$, we see that $\alpha = (2 + i) \mid \beta = 5 = 5 + 0i$. Also, $\pm 1 \mid \beta$ and $\pm i \mid \beta$ for any $\beta \in \mathcal{O}_F = \mathbb{Z}[i]$ by Theorem 1.4 on page 8, since $\pm 1, \pm i$ are the units of $\mathbb{Z}[i]$.


Definition 1.12 Associates

If $F$ is an algebraic number field, $\alpha \in \mathcal{O}_F$, and $u \in U_{\mathcal{O}_F}$, then $u\alpha$ is called an associate of $\alpha$. If $\alpha$ and $\beta$ are associates, we denote this fact by $\alpha \sim \beta$, where the underlying $\mathcal{O}_F$ will be assumed in context.

In order to introduce another concept that mimics one notion of "prime" number encountered in $\mathbb{Z}$, we introduce the following based on units. The other notion of "prime" is given in Exercise 1.28 on page 29. We will see the distinction between the two come into focus in §1.3 (in particular, see Remark 1.16 on page 40).

Definition 1.13 Gaussian Primes

If $\alpha \neq 0$ is not a unit, and $\alpha$ is divisible only by units and associates of $\alpha$ in $\mathcal{O}_F$, then $\alpha$ is called a Gaussian prime in $\mathcal{O}_F$.

Example 1.10 In Example 1.9 on the preceding page, $2 \pm i$ are Gaussian primes since any divisor $a + bi$ of $2 \pm i$ must satisfy $N_F(a + bi) = (a^2 + b^2) \mid 5$, by part (e) of Exercise 1.8 on page 28. Therefore, $a + bi$ is a unit or an associate of $2 \pm i$, given that the only solutions to $a^2 + b^2 = 5$ for $1 \le |a| < |b|$ are $a = \pm 1$ and $b = \pm 2$.

Example 1.11 If $F$ is a quadratic number field, then $\beta \mid \alpha$ in $\mathcal{O}_F$ if and only if $\beta' \mid \alpha'$, where $\alpha'$ is the conjugate of $\alpha$ (see properties (a)–(c) given in Example 1.5 on page 7). Thus, $\alpha \in \mathbb{Z}[i]$ is a Gaussian prime if and only if $\alpha'$ is a Gaussian prime.

In [68, Section 1.2], we studied the greatest common divisor for rational integers. We now elevate this to the Gaussian integers. As with the rational integers, to do this we need the notion of a Euclidean algorithm, albeit in the case of $\mathbb{Z}[i]$, employing norms as follows. As with the rational integers, we first develop a division algorithm that is then repeatedly applied to yield the Euclidean algorithm. In the ensuing proof, we also use the floor function as studied in [68, Section 2.5], to define the nearest integer function, $\mathrm{Ne}(x) = \lfloor x + 1/2 \rfloor$, which is the integer closest to $x \in \mathbb{R}$.

Theorem 1.9 Division Algorithm for Gaussian Integers

Let $\alpha, \beta \in \mathbb{Z}[i]$ with $\beta \neq 0$. Then there exist $\sigma, \gamma \in \mathbb{Z}[i]$ such that

\[ \alpha = \beta\sigma + \gamma, \]

where $0 \le N_F(\gamma) < N_F(\beta)$.

Proof. Let $\alpha/\beta = c + di \in \mathbb{C}$. Set $f = \lfloor c + 1/2 \rfloor = \mathrm{Ne}(c)$, and $g = \lfloor d + 1/2 \rfloor = \mathrm{Ne}(d)$. Hence, there are $k, \ell \in \mathbb{R}$ such that

\[ |k| \le 1/2 \text{ and } |\ell| \le 1/2 \tag{1.11} \]

with

\[ c + di = (f + k) + (g + \ell)i. \tag{1.12} \]

Set

\[ \sigma = f + gi \text{ and } \gamma = \alpha - \beta\sigma. \tag{1.13} \]

Then it remains to show that $0 \le N_F(\gamma) < N_F(\beta)$. From Remark 1.7 on page 18, we know that $N_F(\gamma) \ge 0$. Now we show that $N_F(\gamma) < N_F(\beta)$.

By part (b) of Exercise 1.8 on page 28 (the multiplicativity of the norm), we have that

\[ N_F(\gamma) = N_F(\alpha - \beta\sigma) = N_F((\alpha/\beta - \sigma)\beta) = N_F(\alpha/\beta - \sigma)N_F(\beta) = N_F(c + di - \sigma)N_F(\beta). \]

However, from (1.12)–(1.13), we get

\[ c + di - \sigma = c + di - (f + gi) = (c - f) + (d - g)i = k + \ell i. \]

Therefore, by (1.11),

\[ N_F(\gamma) = N_F(k + \ell i)N_F(\beta) = (k^2 + \ell^2)N_F(\beta) \le ((1/2)^2 + (1/2)^2)N_F(\beta) = N_F(\beta)/2 < N_F(\beta), \]

as required. $\Box$
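The proof of Theorem 1.9 is constructive, and it can be carried out in exact integer arithmetic by writing $\alpha/\beta = \alpha\beta'/N_F(\beta)$. In the sketch below a Gaussian integer $a + bi$ is represented as the pair `(a, b)`; this representation and the function names are choices made for this illustration.

```python
def nearest_int(p: int, q: int) -> int:
    """Ne(p/q) = floor(p/q + 1/2) for integers p and q with q > 0."""
    return (2 * p + q) // (2 * q)


def gauss_divmod(alpha, beta):
    """Return (quotient, remainder) with alpha = beta*quotient + remainder
    and N(remainder) < N(beta), following the proof of Theorem 1.9."""
    a, b = alpha
    c, d = beta
    n = c * c + d * d                      # N(beta)
    if n == 0:
        raise ZeroDivisionError("beta must be nonzero")
    # alpha/beta = alpha * conj(beta) / N(beta) = (re + im*i) / n
    re, im = a * c + b * d, b * c - a * d
    f, g = nearest_int(re, n), nearest_int(im, n)
    # remainder = alpha - beta * (f + g*i)
    remainder = (a - (c * f - d * g), b - (c * g + d * f))
    return (f, g), remainder
```

For $\alpha = 10 + i$ and $\beta = 2 + 5i$ this returns the quotient $1 - 2i$ and the remainder $-2$.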

Remark 1.8 The $\sigma$ in Theorem 1.9 is called a quotient and the $\gamma$ is called a remainder of the division. This follows the notions set up for the division algorithm in $\mathbb{Z}$.

Remark 1.9 Although Theorem 1.9 gives us a criterion for the existence of an algorithm for division in $\mathbb{Z}[i]$, there is no uniqueness attached to it. In other words, we may have many such representations, as the following illustration demonstrates.

Example 1.12 Let $\alpha = 10 + i$ and $\beta = 2 + 5i$. Then we may find $\sigma, \gamma \in \mathbb{Z}[i]$ using the techniques established in the proof of Theorem 1.9. We have

\[ c + di = \frac{\alpha}{\beta} = \frac{10 + i}{2 + 5i} = \frac{(10 + i)(2 - 5i)}{(2 + 5i)(2 - 5i)} = \frac{25}{29} - \frac{48}{29}i, \]

so

\[ f = \left\lfloor c + \frac{1}{2} \right\rfloor = \left\lfloor \frac{25}{29} + \frac{1}{2} \right\rfloor = 1 \quad \text{and} \quad g = \left\lfloor d + \frac{1}{2} \right\rfloor = \left\lfloor -\frac{48}{29} + \frac{1}{2} \right\rfloor = -2. \]

Therefore,

\[ \sigma = 1 - 2i \text{ and } \gamma = \alpha - \beta\sigma = 10 + i - (2 + 5i)(1 - 2i) = -2. \]

Moreover, we verify that $N_F(\gamma) = N_F(-2) = 4 < N_F(\beta) = N_F(2 + 5i) = 29$ with

\[ \alpha = 10 + i = (2 + 5i)(1 - 2i) - 2 = \beta\sigma + \gamma. \tag{1.14} \]

However, these choices are not unique since we need not follow the techniques of Theorem 1.9. For instance, if we choose $\sigma = 1 - i$ and $\gamma = 3 - 2i$, then

\[ \alpha = 10 + i = (2 + 5i)(1 - i) + 3 - 2i = \beta\sigma + \gamma, \tag{1.15} \]

where $N_F(\gamma) = 13 < 29 = N_F(\beta)$. Thus, by (1.14)–(1.15), we see that, when employing the division algorithm for Gaussian integers, the quotient and remainder are not unique. See Exercises 1.12–1.15 on page 28.

We are now in a position to exhibit the notion of greatest common divisor that we studied for the rational integers in [68, Section 1.2]. (Also, see Remark 1.13 on page 33.)

Definition 1.14 GCD for Algebraic Integers

If $F$ is a number field, and $\alpha, \beta \in \mathcal{O}_F$, not both zero, then a greatest common divisor (gcd) of $\alpha$ and $\beta$ is a $\delta \in \mathcal{O}_F$ such that both of the following are satisfied.

(a) $\delta \mid \alpha$ and $\delta \mid \beta$, namely $\delta$ is a common divisor of $\alpha$ and $\beta$.

(b) Suppose that $\gamma \in \mathcal{O}_F$ where $\gamma \mid \alpha$ and $\gamma \mid \beta$. Then $\gamma \mid \delta$, namely any common divisor of $\alpha$ and $\beta$ divides $\delta$.

The first thing we need to know is that every pair of Gaussian integers indeed has a gcd.

Theorem 1.10 Gaussian GCDs Always Exist

If $\alpha, \beta \in \mathbb{Z}[i] = \mathcal{O}_F$, where at least one of $\alpha$ or $\beta$ is not zero, then there exists a gcd $\delta \in \mathbb{Z}[i]$ of $\alpha$ and $\beta$ which is unique.

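Proof. Given fixed $\alpha, \beta \in \mathbb{Z}[i]$, not both zero, set

\[ S = \{N_F(\sigma\alpha + \tau\beta) > 0 : \sigma, \tau \in \mathbb{Z}[i]\}, \]

with $S \neq \emptyset$ since

\[ N_F(\alpha) = N_F(1 \cdot \alpha + 0 \cdot \beta), \text{ and } N_F(\beta) = N_F(0 \cdot \alpha + 1 \cdot \beta) \tag{1.16} \]

are both in $S$, at least one of which is not zero, and by Remark 1.7 on page 18, nonnegative. Thus, we may employ the well-ordering principle studied in [68, Section 1.1, p. 11] to get the existence of an element $\delta_0 = \sigma_0\alpha + \tau_0\beta$, for which its norm is the least value in $S$, namely

\[ N_F(\delta_0) \le N_F(\sigma\alpha + \tau\beta) \text{ for all } \sigma, \tau \in \mathbb{Z}[i]. \]

Claim 1.3 $\delta_0$ is a greatest common divisor of $\alpha$ and $\beta$.

Let $\theta \in \mathbb{Z}[i]$ with $\theta \mid \alpha$ and $\theta \mid \beta$. Thus, there exist $\gamma_1, \gamma_2 \in \mathbb{Z}[i]$ such that $\alpha = \theta\gamma_1$ and $\beta = \theta\gamma_2$. Hence,

\[ \delta_0 = \sigma_0\alpha + \tau_0\beta = \sigma_0\theta\gamma_1 + \tau_0\theta\gamma_2 = \theta(\sigma_0\gamma_1 + \tau_0\gamma_2), \tag{1.17} \]

so $\theta \mid \delta_0$. It remains to show that $\delta_0$ divides both $\alpha$ and $\beta$.

Let $\omega = \rho_1\alpha + \rho_2\beta$ such that $N_F(\omega) \in S$. Thus, by Theorem 1.9 on page 19, there exist $\mu, \nu \in \mathbb{Z}[i]$ such that

\[ \omega = \delta_0\mu + \nu \tag{1.18} \]

with

\[ 0 \le N_F(\nu) < N_F(\delta_0). \tag{1.19} \]

Also, by (1.17)–(1.18),

\[ \nu = \omega - \delta_0\mu = \rho_1\alpha + \rho_2\beta - (\sigma_0\alpha + \tau_0\beta)\mu = (\rho_1 - \sigma_0\mu)\alpha + (\rho_2 - \tau_0\mu)\beta, \]

so $N_F(\nu) \in S$. However, by (1.19), this contradicts the minimality of $N_F(\delta_0)$ in $S$, unless $\nu = 0$, by part (c) of Exercise 1.8 on page 28. We have shown that $\delta_0$ divides every element whose norm is in $S$ so, in particular, by (1.16), it divides $\alpha$ and $\beta$, which secures Claim 1.3 via Definition 1.14. Hence, we have the result. $\Box$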

Remark 1.10 By Exercise 1.17 on page 29, we know that $\delta$ is a gcd of $\alpha$ and $\beta$ in $\mathbb{Z}[i]$ if and only if all of its associates are also gcds. Therefore, we may ascribe "uniqueness" to the gcd of two elements by saying that we do not distinguish between associates when discussing their gcd. Another way of saying this is that gcds are "unique up to associates." In other words, the gcd, $\delta$, of any two elements in $\mathbb{Z}[i]$ is unique in the sense that $\delta \sim \gamma$ for any greatest common divisor $\gamma$. In this sense, they are in the same "class." Essentially this is what we do in the ordinary integers $\mathbb{Z}$, since we allow only for a gcd to be positive given that the only units in $\mathbb{Z}$ are $\pm 1$; this eliminates $-1$ as a choice, the only possible associate of a positive gcd in $\mathbb{Z}$. See Definition 1.20 on page 37.

Now we are in a position to state a generalization of another concept from the ordinary integers to the Gaussian integers.

Definition 1.15 Relatively Prime Algebraic Integers

Two algebraic integers $\alpha$ and $\beta$ are said to be relatively prime if 1 is a gcd of $\alpha$ and $\beta$. Equivalently, $\alpha$ and $\beta$ are relatively prime if the only gcd of $\alpha$ and $\beta$ is 1 up to associates, namely $\delta$ is a gcd of $\alpha$ and $\beta$ if and only if $\delta \sim 1$.

By Remark 1.10, 1 is a gcd of two Gaussian integers if and only if $\pm 1, \pm i$ are gcds of them. Now we are in a position to present a Euclidean algorithm as promised earlier.


Theorem 1.11 A Euclidean Algorithm for Gaussian Integers

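Let $\alpha = \alpha_0$, $\beta = \beta_0 \in \mathbb{Z}[i] = \mathfrak{O}_F$ be nonzero where $\beta \nmid \alpha$. By applying Theorem 1.9 on page 19 successively, the following sequence is obtained:

\[ \alpha_j = \beta_j\delta_j + \gamma_j \text{ with } N_F(\gamma_j) < N_F(\beta_j) \text{ for } j = 0, 1, \ldots, n, \]

where $\alpha_j = \beta_{j-1}$ and $\beta_j = \gamma_{j-1}$ for $j \geq 1$, and $n \in \mathbb{N}$ is the least value such that $\gamma_n = 0$. The value $\delta_j$ is the quotient of the division of $\alpha_j$ by $\beta_j$; $\gamma_j$ is its remainder; and $\gamma_{n-1}$ is a greatest common divisor of $\alpha$ and $\beta$.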

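Proof. Applying Theorem 1.9 to $\alpha_0$ and $\beta_0$ we get

\[ \alpha_0 = \beta_0\delta_0 + \gamma_0 \text{ with } N_F(\gamma_0) < N_F(\beta_0). \tag{1.20} \]

Then by repeated application, we get for $j \in \mathbb{N}$,

\[ \alpha_j = \beta_j\delta_j + \gamma_j \text{ with } 0 \leq N_F(\gamma_j) < N_F(\beta_j), \tag{1.21} \]

where $\alpha_j = \beta_{j-1}$ and $\beta_j = \gamma_{j-1}$. Thus, for a given $j \in \mathbb{N}$,

\[ 0 \leq N_F(\gamma_{j-1}) < N_F(\beta_{j-1}) = N_F(\gamma_{j-2}) < N_F(\beta_{j-2}) < \cdots < N_F(\beta_0), \]

so by induction,

\[ 0 \leq N_F(\gamma_{j-1}) \leq N_F(\beta_0) - j, \]

which tells us that $N_F(\gamma_n) = 0$ for some $0 < n < N_F(\beta_0)$. Note that $n > 0$ since we assumed that $\alpha$ is not divisible by $\beta$.

Since $\alpha_n = \beta_n\delta_n + \gamma_n$ with $\gamma_n = 0$, then

\[ \gamma_{n-1} = \beta_n \mid \alpha_n = \beta_{n-1} = \gamma_{n-2}, \]

and similarly, when $n \geq 3$, since $\gamma_{n-3} = \alpha_{n-1} = \gamma_{n-2}\delta_{n-1} + \gamma_{n-1}$, we get $\gamma_{n-1} \mid \gamma_{n-3}$. Continuing in this fashion, we see that $\gamma_{n-1} \mid \gamma_{n-j-1}$ for each natural number $j < n$, so

\[ \gamma_{n-1} \mid \gamma_1 \quad \text{and} \quad \gamma_{n-1} \mid \gamma_0 = \beta_1. \tag{1.22} \]

Thus, by Equation (1.21) with $j = 1$,

\[ \gamma_{n-1} \mid \alpha_1 = \beta_0. \tag{1.23} \]

Therefore, by Equations (1.20), (1.22)–(1.23), $\gamma_{n-1} \mid \alpha_0$. Thus, $\gamma_{n-1}$ is a common divisor of $\alpha$ and $\beta$. If $\sigma$ is a common divisor of $\alpha$ and $\beta$, then by (1.20), $\sigma \mid \gamma_0$. However, by Equation (1.21) with $j = 1$,

\[ \beta = \beta_0 = \alpha_1 = \beta_1\delta_1 + \gamma_1 = \gamma_0\delta_1 + \gamma_1, \]

so $\sigma \mid \gamma_1$. Continuing in this fashion, we see that $\sigma \mid \gamma_j$ for all nonnegative $j < n$. By Definition 1.14 on page 21, $\gamma_{n-1}$ is a gcd of $\alpha$ and $\beta$. □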
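The procedure in Theorem 1.11 can be sketched in code. The following is a minimal Python illustration and is not part of the original text: Gaussian integers are represented as integer pairs (a, b) standing for a + bi, and the helper names `nearest`, `gauss_divmod`, `gauss_gcd`, and `norm` are hypothetical. It assumes that each quotient is chosen by rounding both coordinates of the exact quotient to a nearest integer, which is one valid choice among several, so its intermediate quotients may differ from those found in a hand computation.

```python
def nearest(num, den):
    """Nearest integer to num/den for den > 0, using exact integer arithmetic."""
    return (2 * num + den) // (2 * den)


def norm(z):
    """Norm a^2 + b^2 of the Gaussian integer z = (a, b)."""
    return z[0] ** 2 + z[1] ** 2


def gauss_divmod(alpha, beta):
    """Return (quotient, remainder) with norm(remainder) < norm(beta).

    alpha and beta are pairs (a, b) standing for a + bi; beta must be nonzero.
    """
    (a, b), (c, d) = alpha, beta
    n = c * c + d * d                      # norm of beta
    # alpha/beta = ((ac + bd) + (bc - ad) i) / n, rounded coordinatewise
    x = nearest(a * c + b * d, n)
    y = nearest(b * c - a * d, n)
    # remainder = alpha - beta * (x + yi)
    r = (a - (c * x - d * y), b - (c * y + d * x))
    return (x, y), r


def gauss_gcd(alpha, beta):
    """Last nonzero remainder of the Euclidean algorithm, a gcd of alpha and beta."""
    while beta != (0, 0):
        _, r = gauss_divmod(alpha, beta)
        alpha, beta = beta, r
    return alpha
```

Since gcds are only determined up to associates, a call such as `gauss_gcd((211, 99), (12, 69))` should be compared with a hand computation through its norm rather than through its exact value.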


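Example 1.13 If $\alpha = 211 + 99i$ and $\beta = 12 + 69i$, then we may follow the steps of the Euclidean algorithm to find a gcd of $\alpha$ and $\beta$.

\[ \alpha_0 = 211 + 99i = (12 + 69i)(1 - 3i) - 8 + 66i = \beta_0\delta_0 + \gamma_0 \tag{1.24} \]

\[ \alpha_1 = \beta_0 = 12 + 69i = (-8 + 66i) \cdot 1 + 20 + 3i = \gamma_0\delta_1 + \gamma_1 = \beta_1\delta_1 + \gamma_1 \tag{1.25} \]

\[ \alpha_2 = \beta_1 = -8 + 66i = (20 + 3i)(3i) + 1 + 6i = \gamma_1\delta_2 + \gamma_2 = \beta_2\delta_2 + \gamma_2 \tag{1.26} \]

\[ \alpha_3 = \beta_2 = 20 + 3i = (1 + 6i)(1 - 3i) + 1 = \gamma_2\delta_3 + \gamma_3 = \beta_3\delta_3 + \gamma_3 \tag{1.27} \]

\[ \alpha_4 = \beta_3 = 1 + 6i = 1 \cdot 6i + 1 = \gamma_3\delta_4 + \gamma_4 = \beta_4\delta_4 + \gamma_4 \tag{1.28} \]

\[ \alpha_5 = \beta_4 = 1 = \gamma_4 \cdot 1 + 0 = \beta_5\delta_5 + \gamma_5. \]

Hence, $\gamma_n = \gamma_5 = 0$ and $\gamma_4 = 1$ is a gcd of $\alpha$ and $\beta$, so $\alpha$ and $\beta$ are relatively prime. (Note that $\gamma_3 = 1$ is already a unit, so the algorithm could stop at (1.27); step (1.28), in which $N_F(\gamma_4) = N_F(\beta_4) = 1$, merely rewrites 1 in a form used in the back-substitution that follows.)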

Now we may illustrate Theorem 1.10 on page 21 by working backward in the above steps to get the gcd as a linear combination of $\alpha$ and $\beta$ as follows. We begin with $\gamma_{n-1} = \gamma_4 = 1$ in terms of $\alpha_4$ and $\alpha_5$. Then successively work back to get $\gamma_4$ in terms of $\alpha_j$ and $\alpha_{j-1}$ for $j = 5, 4, 3, 2$, thereby getting it as a linear combination of $\alpha$ and $\beta$. From (1.28),

\[ \gamma_{n-1} = \gamma_4 = 1 = (1 + 6i) \cdot 1 - 6i \cdot 1, \]

but by (1.27),

\[ \gamma_3 = 1 = (20 + 3i) - (1 + 6i)(1 - 3i), \]

so

\[ 1 = (1 + 6i) \cdot 1 - 6i[(20 + 3i) - (1 + 6i)(1 - 3i)] = (1 + 6i)(19 + 6i) - 6i(20 + 3i). \]

From (1.26),

\[ 1 + 6i = -8 + 66i - (20 + 3i)(3i), \]

so

\[ 1 = [-8 + 66i - (20 + 3i)(3i)](19 + 6i) - 6i(20 + 3i) = (-8 + 66i)(19 + 6i) + (20 + 3i)(18 - 63i). \]

From (1.25),

\[ 20 + 3i = 12 + 69i - (-8 + 66i), \]

so

\[ 1 = (-8 + 66i)(19 + 6i) + [12 + 69i - (-8 + 66i)](18 - 63i) = (12 + 69i)(18 - 63i) + (-8 + 66i)(1 + 69i). \]

From (1.24),

\[ -8 + 66i = 211 + 99i - (12 + 69i)(1 - 3i), \]

so

\[ 1 = (12 + 69i)(18 - 63i) + [211 + 99i - (12 + 69i)(1 - 3i)](1 + 69i) = (12 + 69i)(-190 - 129i) + (211 + 99i)(1 + 69i). \]

Hence,

\[ \gamma_{n-1} = \gamma_4 = 1 = (1 + 69i)\alpha - (190 + 129i)\beta, \]

an expression of our gcd as a linear combination of $\alpha$ and $\beta$.
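The back-substitution of Example 1.13 can also be carried out mechanically by tracking coefficients during the forward pass. The following Python sketch is an illustration under stated assumptions, not the book's method: Gaussian integers are integer pairs (a, b) for a + bi, the names `mul`, `add`, `sub`, `nearest_quotient`, and `gauss_xgcd` are hypothetical, and quotients are chosen by coordinatewise rounding, so the coefficients it returns may differ from those of the example by the usual non-uniqueness of such representations.

```python
def mul(z, w):
    """Product of Gaussian integers given as pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def add(z, w):
    return (z[0] + w[0], z[1] + w[1])


def sub(z, w):
    return (z[0] - w[0], z[1] - w[1])


def nearest_quotient(alpha, beta):
    """Gaussian integer nearest to alpha/beta (beta nonzero)."""
    c, d = beta
    n = c * c + d * d                      # norm of beta
    p, q = mul(alpha, (c, -d))             # alpha times the conjugate of beta
    return ((2 * p + n) // (2 * n), (2 * q + n) // (2 * n))


def gauss_xgcd(alpha, beta):
    """Return (g, s, t) with g = s*alpha + t*beta and g a gcd of alpha, beta."""
    # Invariants: r0 = s0*alpha + t0*beta and r1 = s1*alpha + t1*beta.
    r0, r1 = alpha, beta
    s0, s1 = (1, 0), (0, 0)
    t0, t1 = (0, 0), (1, 0)
    while r1 != (0, 0):
        q = nearest_quotient(r0, r1)
        r0, r1 = r1, sub(r0, mul(q, r1))
        s0, s1 = s1, sub(s0, mul(q, s1))
        t0, t1 = t1, sub(t0, mul(q, t1))
    return r0, s0, t0
```

The same helpers can be used to confirm the identity obtained by hand, namely that `sub(mul((1, 69), (211, 99)), mul((190, 129), (12, 69)))` equals `(1, 0)`.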

Now we describe a means of ascribing parity to Gaussian integers.

Definition 1.16 Odd and Even Gaussian Integers

If $\alpha \in \mathbb{Z}[i]$, then $\alpha$ is said to be odd if $(1 + i) \nmid \alpha$, and $\alpha$ is said to be even if $(1 + i) \mid \alpha$.

Remark 1.11 The notion of parity for Gaussian integers is based upon the fact that if $(1 + i) \mid \alpha$, then $N_F(1 + i) = 2 \mid N_F(\alpha)$; see part (e) of Exercise 1.8 on page 28.
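Definition 1.16 can be tested directly, since $(a + bi)/(1 + i) = \bigl((a + b) + (b - a)i\bigr)/2$ is a Gaussian integer exactly when $a + b$ is even. The short Python sketch below illustrates this; the function names `divide_by_one_plus_i` and `is_even` are hypothetical and chosen only for this example.

```python
def divide_by_one_plus_i(a, b):
    """Return (a + bi)/(1 + i) as an integer pair, or None if 1 + i does not divide a + bi."""
    # (a + bi)/(1 + i) = (a + bi)(1 - i)/2 = ((a + b) + (b - a)i)/2
    if (a + b) % 2 != 0:
        return None
    return ((a + b) // 2, (b - a) // 2)


def is_even(a, b):
    """True when a + bi is even in the sense of Definition 1.16."""
    return divide_by_one_plus_i(a, b) is not None
```

On any sample of pairs, `is_even(a, b)` agrees with the parity of the norm $a^2 + b^2$, because $a^2 + b^2 \equiv a + b \pmod 2$.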

Now we show how factorizations unfold in the Gaussian integers. There is a methodology to ensure uniqueness of factorizations in a stricter sense than the following, which is developed in Exercise 1.34 on page 30.

Theorem 1.12 Unique Factorization for Gaussian Integers

Let $\alpha$ be a nonunit, nonzero Gaussian integer. Then

(a) $\alpha$ may be written as a product of Gaussian primes, and

(b) The factorization is unique in the following sense. If for $m, n \in \mathbb{N}$,

\[ \alpha = \prod_{j=1}^{m} \alpha_j = \prod_{j=1}^{n} \beta_j, \text{ where the } \alpha_j, \beta_j \text{ are Gaussian primes,} \]

then $m = n$ and, after possibly renumbering, the $\alpha_j$ and $\beta_j$ are associates for $j = 1, 2, \ldots, n$.

Proof. For the proof of both parts, we use induction on $N_F(\alpha)$. For part (a), since $\alpha$ is a nonzero nonunit, $N_F(\alpha) \geq 2$. If $\alpha$ is a Gaussian prime, then by Definition 1.12 on page 19, $\alpha = \beta \cdot u$ is the only factorization of $\alpha$ into a product of primes, where $u \in \mathfrak{U}_F = \mathfrak{U}_{\mathbb{Z}[i]}$ and $\beta$ is an associate of $\alpha$. Assume now the induction hypothesis, namely that any Gaussian integer, $\delta$, with $2 \leq N_F(\delta) < N_F(\alpha)$, may be factored into a product of Gaussian primes. By the above we may assume that $\alpha$ is not a prime, since otherwise we are done. Thus, $\alpha = \sigma_1\sigma_2$ for $\sigma_j \in \mathbb{Z}[i]$, and $2 \leq N_F(\sigma_j) < N_F(\alpha)$ for $j = 1, 2$. By the induction hypothesis, $\sigma_j$ may be factored into a product of primes for each of $j = 1, 2$. This is part (a).

For part (b), if $\alpha$ is a Gaussian prime, then by Definition 1.12 on page 19, $\alpha = \beta \cdot u$ is the only factorization of $\alpha$ into a product of primes, where $u \in \mathfrak{U}_{\mathbb{Z}[i]}$ and $\beta$ is an associate of $\alpha$, a unique factorization in the sense of (b), namely up to associates. This establishes the base case. Assume now the induction hypothesis, namely that any Gaussian integer, $\delta$, with $2 \leq N_F(\delta) < N_F(\alpha)$, may be uniquely factored into a product of Gaussian primes, up to associates. Suppose that $\alpha$ is not prime and

\[ \alpha = \prod_{j=1}^{m} \alpha_j = \prod_{j=1}^{n} \beta_j, \text{ where } m, n \in \mathbb{N} \text{ and the } \alpha_j, \beta_j \text{ are Gaussian primes.} \]

Therefore, $\alpha_1 \mid \prod_{j=1}^{n} \beta_j$, which by Exercise 1.28 on page 29, tells us that $\alpha_1 \mid \beta_j$ for some $j = 1, 2, \ldots, n$. Without loss of generality, we may assume that $j = 1$ since we may reorder the $\beta_1, \beta_2, \ldots, \beta_n$, if necessary, to ensure $\alpha_1 \mid \beta_1$. However, since $\beta_1$ is a Gaussian prime, then $\alpha_1$ must be an associate of $\beta_1$, namely, $\beta_1 = u\alpha_1$ for some Gaussian unit $u$. Thus,

\[ \alpha_1\alpha_2 \cdots \alpha_m = \beta_1\beta_2 \cdots \beta_n = u\alpha_1\beta_2 \cdots \beta_n, \]

so dividing both sides by $\alpha_1$, we get

\[ \alpha_2\alpha_3 \cdots \alpha_m = u\beta_2\beta_3 \cdots \beta_n. \]

Since $N_F(\alpha_1) \geq 2$, then $1 \leq N_F(\alpha_2\alpha_3 \cdots \alpha_m) < N_F(\alpha)$, so by the induction hypothesis, we infer that $m - 1 = n - 1$ and, after possibly reordering the terms, $\alpha_j$ is an associate of $\beta_j$ for $j = 2, 3, \ldots, n$. This proves part (b) by induction. □

Example 1.14 The factorization, up to associates, of the Gaussian integer $-91 + 117i$ is given by

\[ -91 + 117i = (1 + i)(2 + 3i)^2(1 - 2i)(3 + 2i), \]

where $(1 + i)$, $(2 + 3i)$, $(1 - 2i)$, $(3 + 2i)$ are all Gaussian primes by Exercise 1.38 on page 31, since

\[ N_F(1 + i) = 2, \quad N_F(2 + 3i) = 13 = N_F(3 + 2i), \quad N_F(1 - 2i) = 5. \]

See Exercises 1.35–1.36.
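The factorization in Example 1.14 can be checked by multiplying out the factors and comparing norms. The Python sketch below does this with integer pairs (a, b) for a + bi; the names `mul`, `norm`, `product`, and `FACTORS` are hypothetical and introduced only for this illustration.

```python
from functools import reduce


def mul(z, w):
    """Product of Gaussian integers given as pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def norm(z):
    """Norm a^2 + b^2 of z = (a, b)."""
    return z[0] ** 2 + z[1] ** 2


def product(factors):
    """Product of a list of Gaussian integers, starting from 1."""
    return reduce(mul, factors, (1, 0))


# The factors (1 + i)(2 + 3i)^2 (1 - 2i)(3 + 2i) of Example 1.14.
FACTORS = [(1, 1), (2, 3), (2, 3), (1, -2), (3, 2)]
```

Each factor has norm equal to a rational prime (2, 13, or 5), and the product of these norms equals the norm of $-91 + 117i$, as multiplicativity of the norm requires.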

In Chapter 3, we will be looking at sums of squares as representations of natural numbers, which will be an extension of the elementary presentation we gave in [68, Chapter 6]. However, the Gaussian integers provide a segue to such representations and thus a desirable topic with which to close this section. As noted in Remark 1.7 on page 18, the norms of Gaussian integers naturally represent the corresponding rational integers as sums of two squares. Now we look at which natural numbers are so represented. The following was proved in [68, Theorem 6.1, p. 244] via fundamental techniques. The result presented here uses the Gaussian integers as a vehicle.

Theorem 1.13 Primes as Sums of Two Squares

If $p \equiv 1 \pmod 4$ is prime in $\mathbb{Z}$, then there exist unique $a, b \in \mathbb{N}$ with $1 \leq b < a$ such that $p = a^2 + b^2$.


Proof. By Exercise 1.38 on page 31, $p$ is not a Gaussian prime. Therefore, there exist $\alpha, \beta \in \mathfrak{O}_F = \mathbb{Z}[i]$, neither of which is a unit, such that $p = u\alpha\beta$, where $u$ is a unit. By taking norms we get $p^2 = N_F(u\alpha\beta) = N_F(u)N_F(\alpha)N_F(\beta)$, but $N_F(u) = 1$, $N_F(\alpha) > 1$, and $N_F(\beta) > 1$, so $N_F(\alpha) = N_F(\beta) = p$ is the only possibility. Suppose that $\alpha = a \pm bi$ and $\beta = c \pm di$. Since we may absorb any multiplication of a unit times $u$ into the representation for $\alpha$ and $\beta$, we may assume without loss of generality that $a, b, c, d \in \mathbb{N}$,

\[ 1 \leq b < a, \tag{1.29} \]

and

\[ 1 \leq d < c, \tag{1.30} \]

then

\[ p = a^2 + b^2 \tag{1.31} \]

and

\[ p = c^2 + d^2. \tag{1.32} \]

It remains to show uniqueness. Multiplying (1.31) by $d^2$ and subtracting $b^2$ times (1.32) we get

\[ a^2d^2 - b^2c^2 = (ad - bc)(ad + bc) = p(d^2 - b^2). \]

Since $p$ is prime, $p \mid (ad - bc)$ or $p \mid (ad + bc)$. Since $a, b, c, d < \sqrt{p}$, we have $|ad - bc| \leq p - 1$ and $0 < ad + bc < 2p$, so either

\[ ad - bc = 0, \tag{1.33} \]

or

\[ ad + bc = p. \tag{1.34} \]

If (1.33) holds, then $ad = bc$, and since $p$ is prime, $\gcd(a, b) = \gcd(c, d) = 1$. Since $a \mid bc$, then $a \mid c$. Thus, for some $f \in \mathbb{N}$, $c = af$, so $ad = bc = baf$, which means that $d = bf$. Hence, $p = c^2 + d^2 = a^2f^2 + b^2f^2 = f^2(a^2 + b^2) = f^2p$, forcing $f = 1$. Therefore, $c = a$ and $d = b$.

If (1.34) holds, then

\[ p^2 = (a^2 + b^2)(c^2 + d^2) = (ad + bc)^2 + (ac - bd)^2 = p^2 + (ac - bd)^2, \tag{1.35} \]

so $ac - bd = 0$. Thus, $ac = bd$, and a similar argument to the above yields that $a = d$ and $b = c$. However, by (1.29)–(1.30), $c = b < a = d < c$, a contradiction. This is uniqueness so we have the entire result. □
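The statement of Theorem 1.13 is easy to explore numerically. The sketch below is a brute-force Python search, not the Gaussian-integer argument of the proof; the function names `two_square_reps` and `is_prime` are hypothetical, and trial division is assumed to be adequate for the small inputs used here.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


def two_square_reps(n):
    """All pairs (a, b) with 1 <= b < a and a*a + b*b == n."""
    reps = []
    # b < a forces 2*b*b < n, so b need only run up to isqrt(n // 2).
    for b in range(1, isqrt(n // 2) + 1):
        a_squared = n - b * b
        a = isqrt(a_squared)
        if a * a == a_squared and b < a:
            reps.append((a, b))
    return reps
```

For a prime $p \equiv 1 \pmod 4$ the search returns exactly one pair, in line with the theorem, while a composite such as $65 = 5 \cdot 13$ has two representations, so primality is needed for uniqueness.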

Remark 1.12 The prime-squared representation given in (1.35) is a special case of the more general result given in [68, Remark 1.6, p. 46], namely for $x, y, u, v, D \in \mathbb{Z}$,

\[ (x^2 + Dy^2)(u^2 + Dv^2) = (xu \pm Dyv)^2 + D(xv \mp yu)^2. \]
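The identity in Remark 1.12 can be spot-checked numerically for both choices of sign. The Python sketch below is only an illustration; the function name `compose` is hypothetical.

```python
def compose(x, y, u, v, D):
    """Return the two pairs (s, t) with s*s + D*t*t == (x*x + D*y*y) * (u*u + D*v*v).

    The first pair takes the upper signs in Remark 1.12, the second the lower signs.
    """
    return (x * u + D * y * v, x * v - y * u), (x * u - D * y * v, x * v + y * u)
```

With $D = 1$, composing $5 = 2^2 + 1^2$ and $13 = 3^2 + 2^2$ gives the pairs $(8, 1)$ and $(4, 7)$, that is, $65 = 8^2 + 1^2 = 4^2 + 7^2$.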


Example 1.15 The representation $13 = 3^2 + 2^2$ is unique up to the order of the summands. Notice that

\[ 13 = (2 + 3i)(2 - 3i) = (3 + 2i)(3 - 2i), \]

so the representation as a product of primes is unique up to associates and the order of the factors, since $3 - 2i = (2 + 3i)(-i)$. Thus, in the notation of Theorem 1.13, $\beta = c + di = a - bi$, so that $c + di$ is the algebraic conjugate of $\alpha = a + bi$.

However, 3 has no representation as a sum of two integer squares. In fact, as we proved in [68, Theorem 6.2, p. 245], $N = m^2n \in \mathbb{N}$, where $n$ is squarefree, is representable as a sum of two integer squares if and only if $n$ is not divisible by any prime $p \equiv 3 \pmod 4$. In [68, Theorem 6.3, p. 247], we also found the total number of primitive representations of a given $N = a^2 + b^2$, namely where $\gcd(a, b) = 1$. Furthermore, in [68, Chapter 6, Sections 6.2–6.4], we dealt with sums of three and four squares as well as sums of cubes.

Exercises

1.8. Given a quadratic number field $F$, and $\alpha, \beta \in F$, prove that

(a) $N_F(\alpha) \in \mathbb{Q}$.
(b) $N_F(\alpha\beta) = N_F(\alpha)N_F(\beta)$.
(c) $N_F(\alpha) = 0$ if and only if $\alpha = 0$.
(d) If $\alpha \in \mathfrak{O}_F$, then $N_F(\alpha) \in \mathbb{Z}$.
(e) If $\alpha \mid \beta$ in $\mathfrak{O}_F$, then $N_F(\alpha) \mid N_F(\beta)$ in $\mathbb{Z}$.

1.9. Let $F$ be an algebraic number field and let $\alpha$ be algebraic over $F$ with minimal polynomial

\[ m_{\alpha,F}(x) = x^d + a_{d-1}x^{d-1} + \cdots + a_1x + a_0, \text{ where } d \in \mathbb{N}, \]

and $\alpha_j$ for $j = 1, 2, \ldots, d$ are all the roots of $m_{\alpha,\mathbb{Q}}(x)$. Prove that $\alpha_j \neq \alpha_k$ for any $j \neq k$.

1.10. Prove that if $\alpha \in \mathbb{A}$, then $\alpha_j \in \mathbb{A}$ for all $j = 1, 2, \ldots, d$, where $\alpha_j$ are the roots of $m_{\alpha,\mathbb{Q}}(x)$.

1.11. If $\alpha \in \mathbb{Q}$, prove that $\alpha \in \mathbb{A}$ if and only if $m_{\alpha,\mathbb{Q}}(x) \in \mathbb{Z}[x]$.

1.12. For each of the following find a quotient and remainder for $\alpha/\beta$ using the division algorithm for Gaussian integers given on page 19.
(a) $\alpha = 3 + i$, $\beta = 4 - 3i$. (b) $\alpha = 3$, $\beta = 3 + 5i$.
(c) $\alpha = 11 - i$, $\beta = 4$. (d) $\alpha = 4 - 3i$, $\beta = 3$.

1.13. For each of the following find a quotient and remainder for $\alpha/\beta$ using the division algorithm for Gaussian integers.
(a) $\alpha = 7$, $\beta = 3 - 3i$. (b) $\alpha = 2 - i$, $\beta = 1 + 5i$.
(c) $\alpha = 1 - i$, $\beta = 3 - i$. (d) $\alpha = -3i$, $\beta = 3 + 6i$.

1.14. If $\beta = 2 + i$ find all $\alpha \in \mathbb{Z}[i]$ such that $\beta \mid \alpha$ in $\mathbb{Z}[i]$.

1.15. If $\beta = 4 + 5i$ find all $\alpha \in \mathbb{Z}[i]$ such that $\beta \mid \alpha$ in $\mathbb{Z}[i]$.

1.16. Let $F$ be a number field and let $\gamma_1, \gamma_2 \in \mathfrak{O}_F$. Prove that $\gamma_1$ and $\gamma_2$ are associates of one another if and only if $\gamma_1 \mid \gamma_2$ and $\gamma_2 \mid \gamma_1$.

1.17. Suppose that $F$ is a number field with $\alpha, \beta \in \mathfrak{O}_F$ not both zero. Prove that $\gamma$ is a greatest common divisor of $\alpha$ and $\beta$ if and only if all associates of $\gamma$ are greatest common divisors thereof. Conclude that any two gcds, $\gamma_1, \gamma_2$, of $\alpha$ and $\beta$ must be associates.

1.18. Given a number field $F$ with $\alpha \in \mathfrak{O}_F$ and $u \in \mathfrak{U}_F$, prove that 1 is a gcd of $\alpha$ and $u$.

1.19. Let $F$ be a quadratic number field. Prove that if $\alpha \in \mathfrak{O}_F$, then $N_F(\alpha) = \pm 1$ if and only if $\alpha \in \mathfrak{U}_F$.

1.20. Prove that if $\alpha, \beta \in \mathbb{Z}[i]$ and $\gcd(N_F(\alpha), N_F(\beta)) = 1$, then $\alpha$ and $\beta$ are relatively prime as Gaussian integers.

1.21. Let $F$ be a quadratic number field. If $\alpha, \beta \in F$ with $\alpha \sim \beta$, prove that $|N_F(\alpha)| = |N_F(\beta)|$.

1.22. Is the converse of Exercise 1.21 true? If so, prove it. If not, provide a counterexample.

1.23. Is the converse to Exercise 1.20 true? If so, prove that it is, and if not, provide a counterexample.

In each of Exercises 1.24–1.27, use the Euclidean algorithm given in Theorem 1.11 on page 23 to find a gcd in $\mathbb{Z}[i]$ for each pair.

1.24. (a) $(1 + 5i, 7 + 9i)$ (b) $(111 + 7i, 71 + 9i)$

1.25. (a) $(12 + 9i, 2 + 69i)$ (b) $(2 + 8i, 21 + 9i)$

1.26. (a) $(111 + 7i, 7 + 9i)$ (b) $(1 + 7i, 7 + 4i)$

1.27. (a) $(17 + 7i, 71 + 4i)$ (b) $(1 + 77i, 55 + 4i)$

1.28. Let $\rho \in \mathbb{Z}[i]$ be a prime, and suppose that $\alpha_j \in \mathbb{Z}[i]$ for $j = 1, 2, \ldots, n \in \mathbb{N}$. Prove that if $\rho \mid \prod_{j=1}^{n} \alpha_j$, then $\rho \mid \alpha_j$ for some $j = 1, 2, \ldots, n$.

1.29. Prove that $\alpha, \beta \in \mathbb{Z}[i]$ are relatively prime if and only if their conjugates, $\alpha'$ and $\beta'$, are relatively prime.


In Exercises 1.30–1.34, a primary Gaussian integer is an element

\[ \alpha = a + bi \in \mathbb{Z}[i] \text{ such that } a \text{ is odd, } b \text{ is even, and } a + b \equiv 1 \pmod 4. \]

These are often used in establishing properties of what are called higher reciprocity laws. See [64, pp. 290 ff.], for instance. In the following exercises, we employ the topic to establish properties of primary integers that are of interest in their own right.

1.30. Prove that the only primary Gaussian unit is 1.

1.31. Prove that $a + bi$ is a primary Gaussian integer if and only if

\[ a + bi \equiv 1 \pmod{2 + 2i} \text{ in } \mathbb{Z}[i]. \]

1.32. Prove that any primary Gaussian integer must be odd.

1.33. Prove that, given any odd Gaussian integer, exactly one of its four associates is primary.

1.34. Suppose that $\alpha$ is a primary non-unit Gaussian integer. Prove that $\alpha$ can be uniquely factored into a product of primary Gaussian primes $\alpha_j$ with

\[ \alpha = \prod_{j=1}^{n} \alpha_j \text{ where } N_F(\alpha_j) \leq N_F(\alpha_{j+1}) \text{ for } n \in \mathbb{N} \text{ with } j = 1, 2, \ldots, n - 1. \]

(Note that this is in contrast to the general case, given in Theorem 1.12 on page 25, where an arbitrary, non-unit, Gaussian integer can be factored into a product of Gaussian primes “up to associates,” since there exists more than one associate for a given Gaussian integer but only one primary associate for a given primary integer by Exercise 1.33.)

In Exercises 1.35–1.36, find a factorization of the Gaussian integer into Gaussian primes with positive real part and units equal to 1.

1.35. (a) $323 + 1895i$. (b) $420 - 65i$. (c) $9497 + 4112i$. (d) $-355 + 533i$.

1.36. (a) $-64 + 83i$. (b) $-271 - 178i$. (c) $561 - 62i$. (d) $212 - 137i$.

1.37. Prove that any prime $p \in \mathbb{Z}$ with $p \equiv 3 \pmod 4$ is a Gaussian prime.

1.38. Prove that if $\alpha \in \mathbb{Z}[i] = \mathfrak{O}_F$ and $N_F(\alpha) = p$ where $p$ is prime in $\mathbb{Z}$, then $\alpha$ is a Gaussian prime but $p$ is not a Gaussian prime and $p \equiv 1 \pmod 4$ or $p = 2$.

Biography 1.1 Hermann Klaus Hugo Weyl was born on November 9, 1885 in Elmshorn, Schleswig–Holstein, Germany. He began his advanced education at the University of Munich, studying mathematics and physics. Later he continued these studies at the University of Göttingen. His supervisor there was David Hilbert, under whose direction he received his doctorate in 1908; see Biography 3.5 on page 127. His thesis was on singular integral equations that invited deep study of Fourier integral theory. At Göttingen, he took up his first teaching position, which he held until 1913. There he wrote his habilitation thesis, which involved the spectral theory of singular Sturm–Liouville problems. He also published his first book in 1913, entitled Idee der Riemannschen Fläche, which gave a rigorous foundation to the geometric function theory previously developed by Riemann. This was accomplished by essentially bringing together analysis, geometry, and topology. The fact that the original text was reprinted in 1997 shows its impact on the progress of mathematics. Eventually he took up a chair in Zürich, Switzerland, where he gave lectures that formed the foundation for his second book Raum–Zeit–Materie, published in 1919. Later editions developed his gauge field theory. During this time he also made contributions to the theory of uniform distribution modulo 1, an important area of analytic number theory. In 1927–28, he taught a course on group theory and quantum mechanics. This led to his third book Gruppentheorie und Quantenmechanik, which was published in 1928. Essentially Weyl had laid the foundation for the first unified field theory, in which the Maxwell electromagnetic field and gravitational field appear as geometrical properties of space-time. From 1930 to 1933, he held the chair of mathematics at Göttingen to fill the vacancy created by Hilbert’s retirement. However, the Nazi rise to power convinced him to accept a position at the newly created Institute for Advanced Study at Princeton in the U.S.A., where he remained until his retirement in 1951. During his years at Princeton, he published other influential books, perhaps the most important of which was Symmetry, published in 1952. On December 8, 1955, while on a visit to Zürich, he collapsed and died after mailing thank you letters to those who had wished him a happy seventieth birthday. During his life he contributed to the geometric foundations of manifolds and physics, topological groups, Lie groups, representation theory, harmonic analysis, analytic number theory, and the foundations of mathematics itself. In regard to the latter he said: “The question for the ultimate foundations and the ultimate meaning of mathematics remains open; we do not know in which direction it will find its final solution nor even whether a final objective answer can be expected at all. “Mathematizing” may well be a creative activity of man, like language or music, of primary originality, whose historical decisions defy complete objective rationalization.”


1.3 Euclidean Quadratic Fields

Keeping an open mind is a virtue – but as the space engineer James Oberg once said, not so open that your brains fall out.

From The Demon-Haunted World (1995)
Carl Sagan (1934–1996), American astronomer and astrochemist

In Theorem 1.9 on page 19, we proved that for $\alpha, \beta \in \mathfrak{O}_F = \mathbb{Z}[i]$ with $\beta \neq 0$, there are Gaussian integers $\sigma, \delta$ such that

\[ \alpha = \beta\sigma + \delta, \text{ where } 0 \leq N_F(\delta) < N_F(\beta), \tag{1.36} \]

where $\sigma$ is called a quotient and $\delta$ is called a remainder. Condition (1.36) is a special instance of the following notion that is the topic of this section. The title of this section speaks to Euclidean “fields,” but this slight abuse of language is a succinct way of saying the “ring of integers of a given quadratic number field.”

Definition 1.17 Euclidean Functions and Domains

Let $R$ be an integral domain. If there exists a function,

\[ f : R \to \mathbb{N} \cup \{0\}, \]

which satisfies the following conditions,

(a) If $\alpha, \beta \in R$ with $\alpha\beta \neq 0$, then $f(\alpha) \leq f(\alpha\beta)$, and

(b) If $\alpha, \beta \in R$ with $\beta \neq 0$, there exist $\sigma, \delta \in R$, such that

\[ \alpha = \beta\sigma + \delta, \text{ where } f(\delta) < f(\beta), \]

then $f$ is called a Euclidean function, and $R$ is called a Euclidean domain with respect to $f$.

Example 1.16 We show that the Gaussian integers provide an illustration of a Euclidean domain. Let $\alpha = a + bi \in R = \mathbb{Z}[i]$ and define

\[ f(\alpha) = a^2 + b^2 = |\alpha|^2. \]

Then by Exercise 1.41 on page 45, $f(\alpha) \leq f(\alpha\beta)$ for any $\alpha\beta \neq 0$. This is condition (a) of Definition 1.17. To verify condition (b), let

\[ \beta = c + di \in R \text{ with } \beta \neq 0. \]

Then

\[ \alpha/\beta = \frac{a + bi}{c + di} = \frac{(a + bi)(c - di)}{c^2 + d^2} = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i = u + vi \in \mathbb{C}. \]

Let $x, y \in \mathbb{Z}$ such that

\[ |u - x| \leq 1/2, \text{ and } |v - y| \leq 1/2. \]

Then,

\[ |\alpha/\beta - (x + yi)|^2 = |(u - x) + (v - y)i|^2 = (u - x)^2 + (v - y)^2 \leq 1/4 + 1/4 < 1. \tag{1.37} \]

Hence, if we let

\[ \sigma = x + yi, \text{ and } \delta = \alpha - \beta\sigma, \]

then

\[ f(\delta) = f(\alpha - \beta\sigma) = |\alpha - \beta\sigma|^2 = |\beta|^2\,|\alpha/\beta - \sigma|^2 < |\beta|^2 = f(\beta), \]

where the inequality follows from (1.37). Hence, condition (b) is satisfied as well. Therefore, $R$ is Euclidean with respect to the norm function $f(\alpha) = N_F(\alpha)$; see Definition 1.18 on the following page.

Remark 1.13 In Theorem 1.11 on page 23, we provided a Euclidean algorithm for Gaussian integers. Now we generalize this, in light of Example 1.16, to an arbitrary Euclidean domain, and the proof follows along the lines of Theorem 1.11. Note that the following also extends the notion of a gcd from algebraic integers given in Definition 1.14 on page 21 to elements in a Euclidean domain, and so extends the notion of relative primality given in Definition 1.15 on page 22 to Euclidean domains as well. See Exercise 1.39 on page 45.

Theorem 1.14 Euclidean Algorithm in Euclidean Domains

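Let $R$ be a Euclidean domain with respect to $f$, and let $\alpha = \alpha_0$, $\beta = \beta_0 \in R$ with $\alpha_0\beta_0 \neq 0$ and $\beta_0 \nmid \alpha_0$. We can define $\alpha_j \in R$ and $\beta_j \in R$ for $j = 1, 2, \ldots, n$ recursively by

\[ \alpha_j = \beta_j\delta_j + \gamma_j, \text{ with } f(\gamma_j) < f(\beta_j), \tag{1.38} \]

where $\alpha_j = \beta_{j-1}$ and $\beta_j = \gamma_{j-1}$. Also, if $n \in \mathbb{N}$ is the least value such that $\gamma_n = 0$, then $\gamma_{n-1}$ is a gcd of $\alpha$ and $\beta$.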

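Proof. For each $j = 1, 2, \ldots, n$, (1.38) follows from condition (b) of Definition 1.17, given that we begin with $\alpha = \alpha_0$ and $\beta = \beta_0$, where

\[ f(\gamma_j) < f(\beta_j) = f(\gamma_{j-1}) \]

for each $j = 1, 2, \ldots, n$. Since the values $f(\gamma_j)$ form a strictly decreasing sequence of nonnegative integers, the process cannot continue indefinitely. Thus, there must exist a value $n \in \mathbb{N}$ such that $\gamma_n = 0$, observing that $n \neq 0$ since $\beta \nmid \alpha$. Since $\alpha_n = \beta_n\delta_n + \gamma_n$ with $\gamma_n = 0$, then

\[ \gamma_{n-1} = \beta_n \mid \alpha_n = \beta_{n-1} = \gamma_{n-2}. \]

Similarly, when $n \geq 3$, since $\gamma_{n-3} = \alpha_{n-1} = \gamma_{n-2}\delta_{n-1} + \gamma_{n-1}$, we get $\gamma_{n-1} \mid \gamma_{n-3}$. Continuing in this way, we see that

\[ \gamma_{n-1} \mid \gamma_{n-j-1} \]

for all natural numbers $j < n$. Hence, $\gamma_{n-1} \mid \gamma_1$ and $\gamma_{n-1} \mid \gamma_0 = \beta_1$, so since

\[ \alpha_1 = \beta_1\delta_1 + \gamma_1, \]

then $\gamma_{n-1} \mid \alpha_1 = \beta_0$. Also, since

\[ \alpha_0 = \beta_0\delta_0 + \gamma_0, \tag{1.39} \]

then $\gamma_{n-1} \mid \alpha_0$. We have shown that $\gamma_{n-1}$ is a common divisor of $\alpha$ and $\beta$. It remains to show that it is divisible by any common divisor of the two. If $\sigma$ is a common divisor of $\alpha$ and $\beta$, then $\sigma \mid \gamma_0$ by (1.39). Thus,

\[ \beta = \beta_0 = \alpha_1 = \beta_1\delta_1 + \gamma_1 = \gamma_0\delta_1 + \gamma_1, \]

so $\sigma \mid \gamma_1$. Continuing in this way, $\sigma \mid \gamma_j$ for all natural numbers $j < n$. In particular, $\sigma \mid \gamma_{n-1}$. We have shown that any pair of elements in a Euclidean domain possesses a gcd and that such a gcd may be found by the Euclidean algorithm described above. □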

Example 1.16 on page 32 is a motivator for another aspect of Euclidean quadratic fields that is worthy of exploring, namely those that satisfy the following property.

Definition 1.18 Norm-Euclidean Quadratic Fields

A quadratic number field $F = \mathbb{Q}(\sqrt{D})$ is said to be norm-Euclidean if given $\alpha, \beta \in \mathfrak{O}_F$ with $\beta \neq 0$, there exist $\sigma, \delta \in \mathfrak{O}_F$ such that

\[ \alpha = \beta\sigma + \delta \text{ where } |N_F(\delta)| < |N_F(\beta)|. \]

Remark 1.14 Now we look to determine which quadratic fields are Euclidean. The reader should first solve Exercise 1.46 on page 45. Note that condition (c) in that exercise was established by G.R. Veldkamp in [97], who essentially wanted to show that condition (a) of Definition 1.17 on page 32 is redundant.

Theorem 1.15 Euclidean Complex Quadratic Fields

Let $\mathfrak{O}_F$ be the ring of integers of the quadratic number field $F = \mathbb{Q}(\sqrt{D})$ with $D < 0$. Then the following are equivalent.

(a) $\mathfrak{O}_F$ is Euclidean.

(b) $\mathfrak{O}_F$ is norm-Euclidean.

(c) $D \in \{-1, -2, -3, -7, -11\}$.


Proof. To show that (a) and (b) are equivalent, we first show that the absolute value of the norm function for quadratic fields, as used in Definition 1.18, is indeed a Euclidean function according to Definition 1.17 on page 32. Part (a) of Definition 1.17 is satisfied since if $\alpha\beta \neq 0$, then

\[ |N(\alpha\beta)| = |N(\alpha)||N(\beta)| \]

by part (b) of Exercise 1.8 on page 28, and

\[ |N(\alpha)||N(\beta)| \geq |N(\alpha)|, \]

since $|N(\beta)| \geq 1$ for nonzero $\beta \in \mathfrak{O}_F$. Part (b) of Definition 1.17 is part of Definition 1.18.

Now we show that Euclidean complex quadratic fields are norm-Euclidean. Assume that $|D| > 11$ and $\mathfrak{O}_F$ is Euclidean with respect to $f$. Select $\beta \in \mathfrak{O}_F$ with $\beta \neq 0, \pm 1$ such that

\[ f(\beta) = \min\{f(\alpha) : \alpha \in \mathfrak{O}_F,\ \alpha \neq 0, \pm 1\}. \tag{1.40} \]

Thus, by property (b) of Definition 1.17, for every $\alpha \in \mathfrak{O}_F$, there exists a $\sigma \in \mathfrak{O}_F$ such that $\alpha - \sigma\beta = 0, \pm 1$. In particular, if $\alpha = 2$, then $|\beta| \leq 3$, since

\[ \sigma\beta = \alpha \text{ or } \sigma\beta = \alpha \pm 1. \tag{1.41} \]

However, if $|\beta| = 3$, this contradicts (1.40) since

\[ f(\sigma\beta) = 3 > f(\alpha) = f(2), \]

using part (a) of Definition 1.17. Thus, $|\beta| \leq 2$ since either $\beta \mid \alpha$ or $\beta \mid (\alpha \pm 1)$. Hence, $N_F(\beta) \leq 4$. If $D \not\equiv 1 \pmod 4$, there exist $a, b \in \mathbb{Z}$ such that $\beta = a + b\sqrt{D}$ by Theorem 1.3 on page 6. So since

\[ 4 \geq N_F(\beta) = a^2 - Db^2 > a^2 + 11b^2, \]

we must have $b = 0$ and $|a| \leq 1$, namely $\beta = 0$ or $|\beta| = 1$, both of which contradict the choice of $\beta$. If $D \equiv 1 \pmod 4$, then by Theorem 1.3 again, there exist integers $a, b$ of the same parity such that $\beta = (a + b\sqrt{D})/2$. Hence,

\[ 16 \geq a^2 - Db^2 \geq a^2 + 15b^2, \]

so $|b| = 0, 1$, respectively $|a| = 0, 1$. In the former case, $\beta = 0$, contradicting the choice of $\beta$, and in the latter case,

\[ \beta = (1 + \sqrt{-15})/2. \]

However, by (1.41), we must have $\alpha = 2 = \sigma\beta$ in this case, so there exist $x, y \in \mathbb{Z}$ of the same parity such that

\[ 2 = \left(\frac{x + y\sqrt{-15}}{2}\right)\left(\frac{1 + \sqrt{-15}}{2}\right) = \left(\frac{x - 15y + (x + y)\sqrt{-15}}{4}\right), \]

so $x = -y$ and $x - 15y = 8$. This implies $-16y = 8$, a contradiction. Hence (a) is equivalent to (b).

To show that (b) is equivalent to (c), we employ condition (c) of Exercise 1.46 on page 45. Assume that $\mathfrak{O}_F$ is Euclidean for $D < 0$. First we look at the case where $D \not\equiv 1 \pmod 4$. Then by Theorem 1.3, for a given $\rho = q + r\sqrt{D} \in F$, we must find

\[ \sigma = a + b\sqrt{D} \in \mathbb{Z}[\sqrt{D}] \]

with

\[ |(q - a)^2 - D(r - b)^2| < 1. \tag{1.42} \]

Let $\rho = \sqrt{D}/2$; then we must have, from (1.42), that

\[ \left|a^2 - D\left(\frac{1}{2} - b\right)^2\right| < 1, \]

which means that

\[ \left(b - \frac{1}{2}\right)^2 |D| + a^2 < 1. \]

However, for any $b \in \mathbb{Z}$, $(b - 1/2)^2 \geq 1/4$, so

\[ \frac{|D|}{4} < 1 - a^2 \leq 1, \]

namely $|D| < 4$, for which only the values $D = -1, -2$ hold.

Now assume that $D \equiv 1 \pmod 4$ and let $\rho = (1 + \sqrt{D})/4$. Then by (1.42), with $\sigma = (a + b\sqrt{D})/2$,

\[ \left|\left(\frac{1}{4} - \frac{a}{2}\right)^2 - D\left(\frac{1}{4} - \frac{b}{2}\right)^2\right| < 1, \]

namely

\[ \left(\frac{1}{4} - \frac{a}{2}\right)^2 + |D|\left(\frac{1}{4} - \frac{b}{2}\right)^2 < 1. \]

However, for any $x \in \mathbb{Z}$, $|1/4 - x/2| \geq 1/4$, so $1 + |D| < 16$, from which we infer that $D = -3, -7, -11$. We have shown that (b) implies (c).

It remains to verify that the values on our list actually are Euclidean, in order to prove that (c) implies (b). To do this, we recall the nearest integer function, Ne, described on page 19.

If $D = -1, -2$, then by taking $a = \mathrm{Ne}(q)$, $b = \mathrm{Ne}(r)$, (1.42) holds since

\[ |(q - a)^2 - D(r - b)^2| \leq \left|\left(\frac{1}{2}\right)^2 + 2\left(\frac{1}{2}\right)^2\right| < 1. \]

It remains to consider $D \equiv 1 \pmod 4$. We let

\[ b = \mathrm{Ne}(2r), \text{ for which } |2r - b| \leq 1/2, \]

and we select $a \in \mathbb{Z}$ such that $|q - a - b/2| \leq 1/2$. Then, for $D = -3, -7, -11$,

\[ \left|\left(q - a - \frac{b}{2}\right)^2 - D\left(r - \frac{b}{2}\right)^2\right| \leq \left|\frac{1}{4} + \frac{11}{16}\right| = \frac{15}{16} < 1, \]

so (1.42) holds, as required. □
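The choice of $a$ and $b$ in the last step of the proof can be sketched with exact rational arithmetic. The following Python illustration assumes $D \equiv 1 \pmod 4$ and represents $\rho = q + r\sqrt{D}$ by a pair of `Fraction` values; the names `nearest_integer_element` and `norm_of_difference` are hypothetical, and Python's built-in `round` (which rounds halves to the nearest even integer) is used as a stand-in for the nearest integer function Ne, since any rounding with error at most 1/2 suffices for the bound.

```python
from fractions import Fraction


def nearest_integer_element(q, r):
    """Given rho = q + r*sqrt(D) with D = 1 (mod 4), return integers (a, b).

    The element sigma = a + b*(1 + sqrt(D))/2 of the ring of integers is the
    candidate quotient: b = Ne(2r), then a is chosen with |q - a - b/2| <= 1/2.
    """
    b = round(2 * r)
    a = round(q - Fraction(b, 2))
    return a, b


def norm_of_difference(q, r, a, b, D):
    """Norm of rho - sigma, namely (q - a - b/2)^2 - D*(r - b/2)^2."""
    x = q - a - Fraction(b, 2)
    y = r - Fraction(b, 2)
    return x * x - D * y * y
```

For $D = -3, -7, -11$ the returned norm never exceeds $15/16$, while for $D = -15$ and $\rho = (1 + \sqrt{-15})/4$ the same recipe yields norm exactly 1, matching the obstruction used in the proof that (b) implies (c).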

Remark 1.15 The case for real quadratic fields is more complicated. We’ll also address some of these fields in §1.4; see Theorem 1.21 on page 50.

We now look at factorization in rings of integers of number fields. To do so we need to introduce some notions related to that of primes. Note that this more general definition refines the definition given for Gaussian integers in Definition 1.13 on page 19, which we will shortly show to be equivalent in the case of domains having a certain property shared with $\mathbb{Z}[i]$; see Definition 1.20.

Definition 1.19 Irreducible and Prime Elements

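A nonzero, nonunit element $\alpha$ in an integral domain $R$ is called irreducible if whenever there exist $\beta, \gamma \in R$ with $\alpha = \beta\gamma$, then one of $\beta$ or $\gamma$ is a unit. If this property fails to hold for $\alpha$ then it is called reducible.

If $\alpha \in R$ is a nonzero nonunit, then $\alpha$ is called prime if whenever $\alpha \mid \beta\gamma$ for $\beta, \gamma \in R$, then $\alpha \mid \beta$ or $\alpha \mid \gamma$.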

Example 1.17 In the Gaussian integers

\[ 5 = (2 + i)(2 - i) \]

where $2 + i$ and $2 - i$ are irreducibles, and shortly we will see that they are also primes in the sense of Definition 1.19. Also, in $\mathbb{Z}[\sqrt{10}]$,

\[ 6 = 2 \cdot 3 = (4 + \sqrt{10})(4 - \sqrt{10}), \]

where each of the four factors is irreducible. In the latter case the two factorizations are distinct since 2 and 3 are not associates of $4 + \sqrt{10}$ or $4 - \sqrt{10}$. See Exercises 1.47–1.48 for proofs of the above facts. This nonuniqueness of factorization is at the core of fundamental aspects of algebraic number theory and motivates the following notion.
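The arithmetic behind Example 1.17 can be spot-checked in a few lines. The Python sketch below represents $a + b\sqrt{10}$ as an integer pair (a, b); the names `mul10`, `norm10`, and `possible_norm_residues` are hypothetical. It relies on the standard observation, assumed here rather than taken from the text, that a proper factor of 2, 3, or $4 \pm \sqrt{10}$ would need norm $\pm 2$ or $\pm 3$, while every norm $a^2 - 10b^2$ is congruent to a square modulo 10.

```python
def mul10(z, w):
    """Product in Z[sqrt(10)], with (a, b) standing for a + b*sqrt(10)."""
    (a, b), (c, d) = z, w
    return (a * c + 10 * b * d, a * d + b * c)


def norm10(z):
    """Norm a^2 - 10*b^2 of z = (a, b)."""
    a, b = z
    return a * a - 10 * b * b


def possible_norm_residues():
    """Residues modulo 10 that a norm a^2 - 10*b^2 can take (the squares mod 10)."""
    return {(a * a) % 10 for a in range(10)}
```

Since none of $\pm 2, \pm 3$ reduces to a square modulo 10, no element of $\mathbb{Z}[\sqrt{10}]$ has such a norm, which is consistent with all four factors of 6 being irreducible.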

Definition 1.20 Unique Factorization

If $R$ is an integral domain in which every nonzero, nonunit element of $R$ can be expressed as a finite product of irreducible elements of $R$, then $R$ is called a factorization domain. A factorization domain $R$ is called a unique factorization domain (UFD) if the following property holds:

Suppose that $\alpha \in R$ such that

\[ \alpha = u\beta_1^{b_1}\beta_2^{b_2} \cdots \beta_n^{b_n}, \]

where $b_j \in \mathbb{N}$, and the $\beta_j$ are nonassociated irreducible elements of $R$ for $1 \leq j \leq n$, and $u$ is a unit of $R$. Suppose further that we have another factorization given by

\[ \alpha = v\gamma_1^{a_1}\gamma_2^{a_2} \cdots \gamma_m^{a_m}, \]

where $a_j \in \mathbb{N}$, the $\gamma_j$ are nonassociated irreducible elements of $R$, and $v$ is a unit of $R$.

Then $m = n$ and (after possibly rearranging the $\beta_j$), $\beta_j \sim \gamma_j$ with $a_j = b_j$ for $j = 1, 2, \ldots, n$.

The following links Definition 1.13 on page 19 and Definition 1.20.

Lemma 1.2 Primes are Irreducible

If $\alpha$ is prime in an integral domain $R$, then $\alpha$ is irreducible.

Proof. Let $\alpha$ be a prime element in $R$. If $\alpha = \beta\gamma$ where $\beta, \gamma \in R$, then $\alpha \mid \beta$ or $\alpha \mid \gamma$. Without loss of generality, assume that $\alpha \mid \beta$. Therefore, there exists a $\delta \in R$ such that $\beta = \alpha\delta$, so $\alpha = \beta\gamma = \alpha\delta\gamma$. Since $R$ is an integral domain, we may cancel the $\alpha$ from both sides to get that $1 = \delta\gamma$, so $\gamma \sim 1$. We have shown that $\alpha$ is irreducible. □

Theorem 1.16 Criterion for Unique Factorization Domains

An integral domain $R$ is a unique factorization domain if and only if every irreducible element in $R$ is prime.

Proof. Assume that $R$ is a unique factorization domain. Let $\alpha \in R$ be irreducible, and assume that $\alpha \mid \beta\gamma$. It remains to show that $\alpha \mid \beta$ or $\alpha \mid \gamma$. Since $\beta$ and $\gamma$ may be uniquely represented as

\[ \beta = u\sigma_1^{a_1}\sigma_2^{a_2} \cdots \sigma_m^{a_m} \]

and

\[ \gamma = v\delta_1^{b_1}\delta_2^{b_2} \cdots \delta_n^{b_n}, \]

where $a_j, b_k \in \mathbb{N}$, $u, v$ are units in $R$, and $\sigma_j$ for $j = 1, 2, \ldots, m$, respectively $\delta_k$ for $k = 1, 2, \ldots, n$, are nonassociated irreducibles, there exists a $\rho \in R$ such that

\[ \rho\alpha = \beta\gamma = uv\prod_{j=1}^{m} \sigma_j^{a_j} \prod_{k=1}^{n} \delta_k^{b_k}. \]

Since $\alpha$ is irreducible, then $\alpha \sim \sigma_j$ for some $j = 1, 2, \ldots, m$, or $\alpha \sim \delta_k$ for some $k = 1, 2, \ldots, n$. In other words, $\alpha \mid \beta$ or $\alpha \mid \gamma$.

Conversely, suppose that every irreducible in $R$ is prime. Let

\[ u\prod_{j=1}^{m} \sigma_j^{a_j} = v\prod_{k=1}^{n} \delta_k^{b_k}, \tag{1.43} \]

where $u, v$ are units in $R$ and $\sigma_j, \delta_k$ are nonassociated irreducibles (primes) for $j = 1, 2, \ldots, m$, respectively $k = 1, 2, \ldots, n$. We will use induction on $m$ to prove that $m = n$ and $\sigma_j \sim \delta_k$ for some $j, k$. If $m = 0$, then the result vacuously holds. Assume that $m \in \mathbb{N}$ and that the induction hypothesis is that unique factorization holds for all factorizations of nonassociated irreducibles of length less than $m$. Then if (1.43) holds,

\[ \sigma_m \mid v\prod_{k=1}^{n} \delta_k^{b_k}, \]

so $\sigma_m \mid \delta_k$ for some $k$, since $\sigma_m$ is prime, so $\sigma_m \nmid v$. By renumbering the $\delta_k$ if necessary, we may conclude that $\sigma_m \mid \delta_n$. But since both $\sigma_m$ and $\delta_n$ are primes, then $\sigma_m = w\delta_n$ for some unit $w$ in $R$. Thus,

\[ uw^{a_m}\sigma_1^{a_1}\sigma_2^{a_2} \cdots \sigma_{m-1}^{a_{m-1}}\delta_n^{a_m} = v\delta_1^{b_1}\delta_2^{b_2} \cdots \delta_{n-1}^{b_{n-1}}\delta_n^{b_n}. \]

Without loss of generality assume that $a_m \geq b_n$. Then

\[ uw^{a_m}\sigma_1^{a_1}\sigma_2^{a_2} \cdots \sigma_{m-1}^{a_{m-1}}\delta_n^{a_m - b_n} = v\delta_1^{b_1}\delta_2^{b_2} \cdots \delta_{n-1}^{b_{n-1}}, \]

so if $a_m > b_n$, then

\[ \delta_n \mid \prod_{k=1}^{n-1} \delta_k^{b_k}, \]

and since $\delta_n$ is prime it must be an associate of $\delta_k$ for some $1 \leq k \leq n - 1$, contradicting the fact that the $\delta_k$ are nonassociated for distinct $k$. Hence, $a_m = b_n$, and

\[ uw^{a_m}\sigma_1^{a_1}\sigma_2^{a_2} \cdots \sigma_{m-1}^{a_{m-1}} = v\delta_1^{b_1}\delta_2^{b_2} \cdots \delta_{n-1}^{b_{n-1}}, \]

so by the induction hypothesis $m - 1 = n - 1$ and the $\sigma_j$ are associates of the $\delta_k$ in some order. This completes the induction and we have unique factorization. □

Theorem 1.17 Euclidean Domains are UFDs

Euclidean domains are unique factorization domains.

Proof. Let $R$ be a Euclidean domain with respect to $f$, and let $\alpha \in R$ be nonzero. First, we establish the existence of factorizations into irreducible elements. By Exercise 1.43 on page 45, $f(\alpha) = f(1)$ if and only if $\alpha \in U_R$. In this case $\alpha$ is vacuously a product of irreducible elements. Hence, we may use induction on $f(\alpha)$. By Exercise 1.42, $f(1) \le f(\alpha)$. Assume that $\alpha \notin U_R$, and that any $\beta \in R$ with $f(\beta) < f(\alpha)$ has a factorization into irreducible elements. If $\alpha$ is irreducible, we are done. Assume otherwise. Then

$$\alpha = \beta\gamma \quad\text{for } \beta, \gamma \in R \text{ and } \beta, \gamma \notin U_R.$$

Thus, by property (a) of Euclidean domains given in Definition 1.17 on page 32, $f(\beta) \le f(\alpha)$ and $f(\gamma) \le f(\alpha)$. By part (b) of Exercise 1.44, $f(\gamma) \ne f(\alpha)$ and $f(\beta) \ne f(\alpha)$, so by the induction hypothesis, both $\beta$ and $\gamma$ have factorizations into irreducibles. Thus, so does $\alpha$, and existence is established. It remains to establish uniqueness.

Let $\alpha \mid \beta\gamma$ where $\alpha$ is irreducible. If $\alpha \nmid \beta$, then $\alpha$ and $\beta$ are relatively prime (see Remark 1.13 on page 33). Therefore, by Exercise 1.39 on page 45, there are $\sigma, \delta \in R$ such that

$$1 = \alpha\sigma + \beta\delta.$$

Therefore,

$$\gamma = \alpha\sigma\gamma + \beta\delta\gamma.$$

Since $\alpha \mid \beta\gamma$, the latter implies that $\alpha \mid \gamma$. In other words, $\alpha$ is prime. Hence, all irreducibles are primes. By Theorem 1.16, we have secured the result. □
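The Bézout step in the proof rests on a division algorithm. As a minimal sketch for the Gaussian integers, the standard example of a Euclidean domain with $f$ the norm, the following represents $a + bi$ as a pair `(a, b)`; the function names are illustrative and not taken from the text.

```python
def gauss_divmod(a, b):
    """Return (q, r) with a = q*b + r and N(r) < N(b), for b != (0, 0)."""
    n = b[0] * b[0] + b[1] * b[1]            # N(b)
    re = a[0] * b[0] + a[1] * b[1]           # real part of a * conj(b)
    im = a[1] * b[0] - a[0] * b[1]           # imaginary part of a * conj(b)
    # Round each coordinate of a/b to a nearest integer.
    q = ((2 * re + n) // (2 * n), (2 * im + n) // (2 * n))
    r = (a[0] - (q[0] * b[0] - q[1] * b[1]),
         a[1] - (q[0] * b[1] + q[1] * b[0]))
    return q, r


def gauss_gcd(a, b):
    """Euclidean algorithm in Z[i]; the result is a gcd up to a unit."""
    while b != (0, 0):
        _, r = gauss_divmod(a, b)
        a, b = b, r
    return a


def norm(z):
    return z[0] * z[0] + z[1] * z[1]
```

For instance, `gauss_gcd((2, 1), (2, -1))` returns a unit, which reflects the fact that $2+i$ and $2-i$ are relatively prime.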

Remark 1.16 Note that in Theorem 1.10 on page 21, we proved that gcds always exist for the Gaussian integers. This is clear from Theorem 1.17 and Example 1.16 on page 32 since the Gaussian integers form a Euclidean domain. We cannot ensure the existence of gcds without unique factorization, which is guaranteed in Euclidean domains by Theorem 1.17. Indeed, the definition of a Gaussian prime given in Definition 1.13 on page 19 uses the fact that all irreducibles in the Gaussian integers are primes, a fact we now know from Example 1.16 on page 32 and Theorem 1.17 on the previous page.

We may speak about factorizations in domains that are not UFDs. However, we cannot speak about factorizations of elements in this regard; rather we must move to the level of ideals, and this is to come later when we study ideal theory. This is part of the history of the development of algebraic number theory, where Dedekind looked at factorizations in non-UFDs using ideal theory that we will study in Chapter 2.

Example 1.18 The Gaussian integers $2 \pm i$ are primes, which is equivalent to being irreducible in any UFD, as noted in Remark 1.16. However, the converse of Lemma 1.2 does not hold. For instance, by Example 1.17 on page 37, $2$ is irreducible in $\mathbb{Z}[\sqrt{10}]$, but $2$ is not prime, since $2 \mid (4+\sqrt{10})(4-\sqrt{10})$ without dividing either factor.

As shown in Example 1.17, $\mathbb{Z}[\sqrt{10}]$ is not a unique factorization domain. In general, the obstruction to unique factorization in a factorization domain is the failure of irreducibles to be primes, as Theorem 1.16 on page 38 essentially validates.
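The arithmetic behind this example can be checked directly. The sketch below assumes elements $x + y\sqrt{10}$ are stored as integer pairs `(x, y)` with norm $x^2 - 10y^2$; the helper names are illustrative. A proper factor of 2 would need norm $\pm 2$, and no square is congruent to $2$ or $8$ modulo $10$, so no such element exists.

```python
def norm(x, y, d=10):
    """Norm of x + y*sqrt(d) in Z[sqrt(d)]."""
    return x * x - d * y * y


def divisible_by_two(x, y):
    """2 divides x + y*sqrt(10) in Z[sqrt(10)] iff 2 divides both coordinates."""
    return x % 2 == 0 and y % 2 == 0


# x^2 - 10*y^2 = +/-2 would force x^2 = 2 or 8 (mod 10).
square_residues = {(x * x) % 10 for x in range(10)}
no_element_of_norm_two = 2 not in square_residues and 8 not in square_residues

# (4 + sqrt(10)) * (4 - sqrt(10)) equals the norm of 4 + sqrt(10).
product = norm(4, 1)
```

Here `product` is 6, which 2 divides, while `divisible_by_two(4, 1)` and `divisible_by_two(4, -1)` are both false.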

The next topic is the introduction of another property, which cannot be guaranteed to exist unless we are in a UFD. This mimics the notion studied for the rational integers in [68, Section 1.2].

Definition 1.21 Least Common Multiple in UFDs

Let $R$ be a UFD. A least common multiple of $\alpha, \beta \in R$ is an element $\delta \in R$ satisfying the two properties:

(a) $\alpha \mid \delta$ and $\beta \mid \delta$.

(b) If $\alpha \mid \sigma \in R$ and $\beta \mid \sigma$, then $\delta \mid \sigma$.

Example 1.19 By Exercise 1.49 on page 45, any two least common multiples of a given pair of elements in a UFD are associates. Thus, as with greatest common divisors, least common multiples are unique up to associates.

For instance, in $R = \mathbb{Z}[i]$,

$$\alpha = (2+i) \mid 5 = \delta \quad\text{and}\quad \beta = (2-i) \mid 5.$$

Moreover, if $(2+i) \mid \sigma \in R$ and $(2-i) \mid \sigma$, then

$$\sigma = (2+i)\delta_1 = (2-i)\delta_2$$

for $\delta_1, \delta_2 \in R$. In particular, $(2+i) \mid \delta_2$, so $5 \mid \sigma$. Thus, $\delta = 5$ is a least common multiple of $2+i$ and $2-i$. Hence, $\pm 5i$ and $\pm 5$ are all of their least common multiples.
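As a small numerical check of this example, the sketch below tests divisibility in $\mathbb{Z}[i]$ by multiplying with the conjugate and dividing by the norm. Gaussian integers are stored as pairs `(a, b)`; the function name and the search box are assumptions made for illustration.

```python
def divides(b, a):
    """True if the Gaussian integer b divides a (b nonzero)."""
    n = b[0] ** 2 + b[1] ** 2                # N(b)
    re = a[0] * b[0] + a[1] * b[1]           # real part of a * conj(b)
    im = a[1] * b[0] - a[0] * b[1]           # imaginary part of a * conj(b)
    return re % n == 0 and im % n == 0


# All common multiples of 2+i and 2-i in a small box around the origin.
common_multiples = [
    (x, y)
    for x in range(-20, 21)
    for y in range(-20, 21)
    if divides((2, 1), (x, y)) and divides((2, -1), (x, y))
]

# Property (b) of the definition, restricted to the box.
five_divides_all = all(divides((5, 0), m) for m in common_multiples)
```

Within the box, every common multiple of $2+i$ and $2-i$ is a multiple of 5, as the example asserts.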

We conclude this section with an application to a famous result due to Fermat.

Remark 1.17 In what follows, we use the symbol $\gcd(x, y)$ for

$$x, y \in \mathbb{Z}[\zeta_3] = \mathbb{Z}[(-1+\sqrt{-3})/2]$$

to mean the unique gcd of elements up to associates as dictated by Exercise 1.17 on page 29. Moreover, the congruences in the following proof all take place in $\mathbb{Z}[\zeta_3]$. See [68, Biography 1.7, p. 33].

Note that Fermat’s Last Theorem (FLT) is the assertion that

$$x^n + y^n = z^n \qquad (1.44)$$

has no solutions in positive integers $x, y, z$ for $n \in \mathbb{N}$ with $n > 2$. For an overview and background, see [68, Biography 1.10, p. 38]. Also, see Biography 5.5 of Wiles on page 225 for a synopsis of its solution.

Theorem 1.18 Gauss’ Proof of FLT for p = 3

There are no solutions of

$$\alpha^3 + \beta^3 + \gamma^3 = 0$$

for nonzero $\alpha, \beta, \gamma \in \mathfrak{O}_F = \mathbb{Z}[\zeta_3]$, where $F = \mathbb{Q}(\zeta_3)$. In particular, there are no solutions to

$$x^3 + y^3 = z^3$$

in nonzero rational integers $x, y, z$.


Proof. We assume that there are nonzero $\alpha, \beta, \gamma \in \mathfrak{O}_F$ such that

$$\alpha^3 + \beta^3 + \gamma^3 = 0,$$

and achieve a contradiction. Without loss of generality, we may assume that

$$\gcd(\alpha, \beta) = \gcd(\alpha, \gamma) = \gcd(\beta, \gamma) = 1.$$

Let

$$\lambda = 1 - \zeta_3.$$

Then since

$$N_F(\lambda) = \lambda\lambda' = 3,$$

we must have $\lambda \mid 3$. Also, by Theorem 1.17 on page 39 and Exercise 1.52 on page 46, $\lambda$ is prime in $\mathfrak{O}_F$. We will achieve the desired contradiction by an infinite descent argument. This is not done directly, but rather we get a contradiction to the equation

$$\alpha^3 + \beta^3 + \lambda^{3n}\rho^3 = 0,$$

for any $n \in \mathbb{N}$ and $\rho \in \mathfrak{O}_F$. Thus, we first show that the latter equation holds. We require three claims.

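Claim 1.4 If $\lambda \nmid \delta \in \mathfrak{O}_F$, then $\delta \equiv \pm 1 \pmod{\lambda}$.

Let

$$\delta = a + b\zeta_3, \quad\text{where } a, b \in \mathbb{Z}.$$

Then $\delta = u + v\lambda$, where $u, v \in \mathbb{Z}$. If $\lambda \mid u$, then $\lambda \mid \delta$, a contradiction, so $\lambda \nmid u$. Since $\lambda \mid 3$, then $3 \nmid u$, so $u \equiv \pm 1 \pmod{3}$ in $\mathbb{Z}$. Thus, there is a $t \in \mathbb{Z}$ such that

$$\delta = \pm 1 + 3t + v\lambda.$$

But $\lambda \mid 3$, so there exists a $\sigma \in \mathfrak{O}_F$ such that

$$\delta = \pm 1 + t\sigma\lambda + v\lambda = \pm 1 + \lambda(t\sigma + v).$$

In other words, $\delta \equiv \pm 1 \pmod{\lambda}$ as required.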

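Claim 1.5 If $\lambda \nmid \delta \in \mathfrak{O}_F$, then $\delta^3 \equiv \pm 1 \pmod{\lambda^4}$.

Since $\lambda \nmid \delta$, then by Claim 1.4, $\delta \equiv \pm 1 \pmod{\lambda}$. We may assume that

$$\delta \equiv 1 \pmod{\lambda}$$

since the other case is similar. Therefore, $\delta = 1 + \lambda\sigma$ for some $\sigma \in \mathfrak{O}_F$. Thus,

$$\delta^3 - 1 = (\delta - 1)(\delta - \zeta_3)(\delta - \zeta_3^2) = \lambda\sigma(\lambda\sigma + 1 - \zeta_3)(\lambda\sigma + 1 - \zeta_3^2)$$
$$= \lambda\sigma(\lambda\sigma + \lambda)(\lambda\sigma + \lambda(1 + \zeta_3)) = \lambda^3\sigma(\sigma + 1)(\sigma - \zeta_3^2), \qquad (1.45)$$

where the last equality follows via Exercise 1.54 on page 46, from the fact that

$$1 + \zeta_3 + \zeta_3^2 = 0. \qquad (1.46)$$

Since $\zeta_3^2 - 1 = (\zeta_3 + 1)(\zeta_3 - 1) = -(\zeta_3 + 1)\lambda$, then $\zeta_3^2 \equiv 1 \pmod{\lambda}$, so by (1.45) and Claim 1.4,

$$(\delta^3 - 1)\lambda^{-3} = \sigma(\sigma + 1)(\sigma - \zeta_3^2) \equiv \sigma(\sigma + 1)(\sigma - 1) \equiv \sigma(\sigma^2 - 1) \equiv 0 \pmod{\lambda},$$

where the final congruence holds because Claim 1.4 gives $\sigma \equiv 0$ or $\pm 1 \pmod{\lambda}$. Hence,

$$\delta^3 \equiv 1 \pmod{\lambda^4},$$

and we have Claim 1.5.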
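Claim 1.5 lends itself to a brute-force sanity check. The sketch below writes $\lambda = 1 - \zeta_3$ with $\zeta_3$ a primitive cube root of unity, and stores $a + b\zeta_3$ as the pair `(a, b)`, using $\zeta_3^2 = -1 - \zeta_3$. Two facts are used, both easy to verify by hand: $a + b\zeta_3 = (a+b) - b\lambda$, so $\lambda$ divides it exactly when $3 \mid a+b$; and $\lambda^2 = -3\zeta_3$, so $\lambda^4 = 9\zeta_3^2$ and divisibility by $\lambda^4$ is divisibility of both coordinates by 9. The function names are illustrative.

```python
def mul(x, y):
    """Multiply a + b*zeta and c + d*zeta, where zeta^2 = -1 - zeta."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c - b * d)


def cube(x):
    return mul(mul(x, x), x)


def lam_divides(x):
    """lambda = 1 - zeta divides a + b*zeta iff 3 divides a + b."""
    return (x[0] + x[1]) % 3 == 0


def cube_is_pm1_mod_lam4(x):
    """Check x^3 = +1 or -1 modulo lambda^4, i.e. modulo 9."""
    c0, c1 = cube(x)
    return c1 % 9 == 0 and ((c0 - 1) % 9 == 0 or (c0 + 1) % 9 == 0)


# Every element of a small box that is prime to lambda satisfies the claim.
claim_holds = all(
    cube_is_pm1_mod_lam4((a, b))
    for a in range(-12, 13)
    for b in range(-12, 13)
    if not lam_divides((a, b))
)
```

A finite check of this kind is no substitute for the proof; it only guards against slips in the algebra.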

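Claim 1.6 $\lambda \mid \alpha\beta\gamma$.

Suppose that $\lambda \nmid \alpha\beta\gamma$. Then by Claim 1.5,

$$0 = \alpha^3 + \beta^3 + \gamma^3 \equiv \pm 1 \pm 1 \pm 1 \pmod{\lambda^4},$$

from which it follows that $\lambda^4 \mid 1$ or $\lambda^4 \mid 3$. The former is impossible since $\lambda$ is prime, and the second is impossible since

$$3 = (1 - \zeta_3)(1 - \zeta_3^2) = (1 - \zeta_3)^2(1 + \zeta_3) = \lambda^2(1 + \zeta_3),$$

and $1 + \zeta_3$ is a unit, so not divisible by $\lambda^2$. This contradiction establishes Claim 1.6.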

By Claim 1.6, we may assume without loss of generality that $\lambda \mid \gamma$. However, by the gcd condition assumed at the outset of the proof, $\lambda \nmid \alpha$ and $\lambda \nmid \beta$. Let $n \in \mathbb{N}$ be the highest power of $\lambda$ dividing $\gamma$. In other words, assume that $\gamma = \lambda^n\rho$ for some $\rho \in \mathfrak{O}_F$ with $\lambda \nmid \rho$. Thus, we have

$$\alpha^3 + \beta^3 + \lambda^{3n}\rho^3 = 0. \qquad (1.47)$$

We now use Fermat’s method of infinite descent (which we studied in [68, §7.4, p. 281]) to complete the proof. First we establish that $n > 1$. If $n = 1$, then by Claim 1.5,

$$-\lambda^3\rho^3 = \alpha^3 + \beta^3 \equiv \pm 1 \pm 1 \pmod{\lambda^4}.$$

The signs on the right cannot be the same since $\lambda \nmid 2$. Therefore,

$$-\lambda^3\rho^3 \equiv 0 \pmod{\lambda^4},$$

forcing $\lambda \mid \rho$, a contradiction that shows $n > 1$. Given the above, the following claim, once proved, will yield the full result by descent.

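Claim 1.7 If Equation (1.47) holds for $n > 1$, then it holds for $n - 1$.

Let

$$X = \frac{\beta + \alpha\zeta_3}{\lambda}, \quad Y = \frac{\beta\zeta_3 + \alpha}{\lambda}, \quad\text{and}\quad Z = \frac{(\beta + \alpha)\zeta_3^2}{\lambda}.$$

Observe that

$$X, Y, Z \in \mathfrak{O}_F$$

by Claim 1.5, Equation (1.47), and the fact that $\zeta_3 \equiv 1 \pmod{\lambda}$. Also, by Exercise 1.54 again,

$$X + Y + Z = 0,$$

and

$$XYZ = \frac{\beta^3 + \alpha^3}{\lambda^3} = \left(\frac{-\lambda^n\rho}{\lambda}\right)^3 = \lambda^{3n-3}(-\rho)^3,$$

so

$$\lambda^{3n-3} \mid XYZ, \quad\text{but}\quad \lambda^{3n} \nmid XYZ,$$

since $\lambda \nmid \rho$. Also, since

$$\beta = -\zeta_3 X + \zeta_3^2 Y, \quad\text{and}\quad \alpha = \zeta_3 Z - X,$$

then by the gcd condition assumed at the outset of the proof, we have

$$\gcd(X, Y) = \gcd(X, Z) = \gcd(Y, Z) = 1.$$

Hence, each of $X$, $Y$, and $Z$ is an associate of a cube in $\mathfrak{O}_F$. Also, we may assume without loss of generality that $\lambda^{3n-3} \mid Z$. By unique factorization in $\mathfrak{O}_F$, we may let

$$X = u_1\mu^3, \quad Y = u_2\nu^3, \quad\text{and}\quad Z = u_3\lambda^{3n-3}\tau^3$$

for some $\mu, \nu, \tau \in \mathfrak{O}_F$, and $u_j \in U_F$ for $j = 1, 2, 3$. Therefore, we have

$$\mu^3 + u_4\nu^3 + u_5\lambda^{3n-3}\tau^3 = 0, \qquad (1.48)$$

where $u_j = u_1^{-1}u_{j-2}$ for $j = 4, 5$. Therefore, $\mu^3 + u_4\nu^3 \equiv 0 \pmod{\lambda^3}$. By Claim 1.5,

$$\mu^3 \equiv \pm 1 \pmod{\lambda^4}, \quad\text{and}\quad \nu^3 \equiv \pm 1 \pmod{\lambda^4}.$$

Hence, $\pm 1 \pm u_4 \equiv 0 \pmod{\lambda^3}$. Since the only choices for $u_4$ are $\pm 1$, $\pm\zeta_3$, and $\pm\zeta_3^2$, the only values that satisfy the last congruence are $u_4 = \pm 1$, since

$$\lambda^3 \nmid (\pm 1 \pm \zeta_3), \quad\text{and}\quad \lambda^3 \nmid (\pm 1 \pm \zeta_3^2).$$

If $u_4 = 1$, then Equation (1.48) provides a validation of Claim 1.7. If $u_4 = -1$, then replacing $\nu$ by $-\nu$ provides a validation of the claim. This completes the proof. □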

Remark 1.18 In §8.3 we will generalize the above proof, also due to Kummer, to prove that (1.44) fails to hold for any $xyz \ne 0$ when $n = p \ge 3$ is any so-called “regular” prime $p \nmid xyz$ (see Remark 8.6 on page 291). However, this requires deeper tools.


Exercises

1.39. Let $R$ be a Euclidean domain. Theorem 1.14 on page 33 shows that any two nonzero elements $\alpha, \beta \in R$ have a greatest common divisor. Prove that any such gcd may be written in the form

$$\gamma = \rho\alpha + \nu\beta$$

for some $\rho, \nu \in R$.
(Hint: Mimic the proof of Theorem 1.10 on page 21.)

1.40. Give an example of a ring in which there exist elements with no greatest common divisor.

1.41. Prove that in Definition 1.17 on page 32, condition (a) is equivalent to the following condition.
If $\alpha \mid \beta$ for any $\alpha, \beta \in R$ with $\alpha\beta \ne 0$, then $f(\alpha) \le f(\beta)$.

1.42. Let $R$ be a Euclidean domain with respect to $f$ and multiplicative identity $1_R$. Prove that $f(1_R) \le f(\alpha)$ for all nonzero $\alpha \in R$.

1.43. Prove that in a Euclidean domain $R$ with respect to $f$, and multiplicative identity $1_R$, $f(\alpha) = f(1_R)$ for $\alpha \in R$ if and only if $\alpha$ is a unit in $R$.

1.44. Prove that if $R$ is a Euclidean domain with respect to $f$, then for $\alpha, \beta \in R$, each of the following holds.

(a) If $\alpha \sim \beta$, then $f(\alpha) = f(\beta)$.
(b) If $\alpha \mid \beta$ and $f(\alpha) = f(\beta)$, then $\alpha \sim \beta$.
(c) $\alpha \in U_R$ if and only if $f(\alpha) = f(1_R)$.
(d) If $\alpha \ne 0$, then $f(\alpha) > f(0)$.

1.45. Prove that any real quadratic field has infinitely many units.
(You may use the fact, established in [68, Theorem 5.15, pp. 234–235], that the Pell equation $x^2 - Dy^2 = 1$ has infinitely many solutions.)

1.46. Let $F = \mathbb{Q}(\sqrt{D})$ be a quadratic number field. Prove that the condition in Definition 1.18 on page 34 is equivalent to the following statement.
(c) For any $\rho \in F$ there exists a $\sigma \in \mathfrak{O}_F$ such that $|N_F(\rho - \sigma)| < 1$.

1.47. In Example 1.17 on page 37, show that $2 + i$ and $2 - i$ are irreducible in $\mathbb{Z}[i]$.

1.48. In Example 1.17, show that $2, 3, 4 + \sqrt{10}, 4 - \sqrt{10}$ are irreducible in $\mathbb{Z}[\sqrt{10}]$, and that $2, 3$ are not associates of $4 + \sqrt{10}, 4 - \sqrt{10}$.

1.49. Prove that if $\delta_1$ and $\delta_2$ are least common multiples of $\alpha, \beta \in R$ where $R$ is a UFD, then $\delta_1 \sim \delta_2$.

1.50. Let $\alpha \in \mathfrak{O}_F$ where $F = \mathbb{Q}(\sqrt{D})$ is a quadratic number field. Prove that if $N_F(\alpha) = \pm p$, where $p$ is prime in $\mathbb{Z}$, then $\alpha$ is irreducible in $\mathfrak{O}_F$.

1.51. Is the converse of Exercise 1.50 true? If so prove it, and if not provide a counterexample.

1.52. Let $F$ be a quadratic number field that is a UFD. Prove that if $\alpha \in \mathfrak{O}_F$ with $N_F(\alpha) = \pm p$, a prime in $\mathbb{Z}$, then $\alpha$ is a prime in $\mathfrak{O}_F$.

1.53. Is the converse of Exercise 1.52 true? If so prove it, and if not provide a counterexample.

1.54. If $\zeta_p$ is a primitive $p$-th root of unity for a prime $p$, prove that

$$\sum_{j=0}^{p-1}\zeta_p^j = 0.$$

Biography 1.2 Julius Wilhelm Richard Dedekind (1831–1916) was born in Brunswick, Germany on October 6, 1831. There he attended school from the time he was seven. In 1848, he entered the Collegium Carolinum, an educational bridge between high school and university. He entered Göttingen at the age of 19, where he became Gauss’ last student, and achieved his doctorate in 1852, the topic being Eulerian integrals. Although he taught in Göttingen and in Zürich, he moved to Brunswick in 1862 to teach at the Technische Hochschule, a technical high school. In that year he also was elected to the Göttingen Academy, one of many honours bestowed on him in his lifetime. He maintained this position until he retired in 1894. Dedekind’s creation of ideals was published in 1879 under the title Über die Theorie der ganzen algebraischen Zahlen. Hilbert extended Dedekind’s ideal theory, which was later advanced further by Emmy Noether. Ultimately this led to the general notion of unique factorization of ideals into prime powers in what we now call Dedekind domains.

Another of his major contributions was a definition of irrational numbers in terms of what we now call Dedekind cuts. He published this work in Stetigkeit und Irrationale Zahlen in 1872. He never married, and lived with his sister Julie until she died in 1914. He died in Brunswick on February 12, 1916.


1.4 Applications of Unique Factorization

...a mathematical proof, like a chess problem, to be aesthetically satisfying, must possess three qualities: inevitability, unexpectedness, and economy; that it should ‘resemble a simple and clear-cut constellation, not a scattered cluster in the milky way.’

From page 447 of Enigma (2001) by Robert Harris; see [38]

In §1.3 we looked at some instances of unique factorization in quadratic fields. In particular, we applied unique factorization in

$$\mathbb{Z}[\zeta_3] = \mathbb{Z}[(-1+\sqrt{-3})/2]$$

to present Gauss’ proof of the Fermat result, Theorem 1.18 on page 41. In fact, earlier, we tacitly used unique factorization in $\mathbb{Z}[\sqrt{-7}]$, in Theorem 1.8 on page 14, to provide solutions of the Ramanujan-Nagell equation. In this section we look at unique factorization in other quadratic rings of integers. We begin with $\mathbb{Z}[\sqrt{-2}]$ to find solutions of certain Bachet equations, those of the form

$$y^2 = x^3 + k \qquad (1.49)$$

where $k \in \mathbb{Z}$ (see [68, Biography 7.2, p. 279]).

Let us begin with a solution of (1.49) for $k = -2$ by Euler (see [68, Biography 1.17, p. 56]). As with the proof of Theorem 1.8, we use the notation $\gcd(x, y)$ in this section for $x, y$ in the ring of integers of a given quadratic field, to mean the unique gcd up to associates.

Theorem 1.19 Euler’s Solution of Bachet’s Equation

The only integer solutions of (1.49) for $k = -2$ are $x = 3$ and $y = \pm 5$.

Proof. First, we rule out the possibility that $x$ is even or $y$ is even. If $x$ is even, then $y^2 \equiv -2 \pmod{4}$, and if $y$ is even, then $x^3 \equiv 2 \pmod{4}$, both of which are impossible. Hence, both $x$ and $y$ are odd. We may factor in

$$\mathfrak{O}_F = \mathbb{Z}[\sqrt{-2}],$$

where $F = \mathbb{Q}(\sqrt{-2})$, as follows:

$$(y + \sqrt{-2})(y - \sqrt{-2}) = x^3.$$

First we show that $\gcd(y + \sqrt{-2}, y - \sqrt{-2}) = 1$. Suppose that

$$(a + b\sqrt{-2}) \mid \gcd(y + \sqrt{-2}, y - \sqrt{-2}) \quad\text{for } a, b \in \mathbb{Z}.$$

Then in particular,

$$N_F(a + b\sqrt{-2}) = (a^2 + 2b^2) \mid N_F(y + \sqrt{-2} - (y - \sqrt{-2})) = N_F(2\sqrt{-2}) = 8,$$

$$(a^2 + 2b^2) \mid N_F(y + \sqrt{-2} + y - \sqrt{-2}) = 4y^2,$$

and

$$(a^2 + 2b^2) \mid N_F(x^3) = x^6.$$

The first two equations show that $(a^2 + 2b^2) \mid 4$, since $y$ is odd. Coupled with the third equation, and the fact that $x$ is odd, this shows $(a^2 + 2b^2) \mid 1$, so $a + b\sqrt{-2}$ is a unit. Thus,

$$\gcd(y + \sqrt{-2}, y - \sqrt{-2}) = 1.$$

We may now invoke unique factorization to conclude that

$$y + \sqrt{-2} = \pm(c + d\sqrt{-2})^3, \quad\text{for some } c, d \in \mathbb{Z},$$

since $\pm 1$ are the only units in $\mathbb{Z}[\sqrt{-2}]$ by Theorem 1.4 on page 8. By multiplying out the right-hand side and comparing coefficients, we get

$$y = \pm(c^3 - 6cd^2), \quad\text{and}\quad 1 = \pm d(3c^2 - 2d^2).$$

The latter equation implies that $d = \pm 1$, so $1 = \pm(3c^2 - 2)$, the only possible solutions of which are $c = \pm 1$. Hence, putting these back into the equation for $y$, we get that $y = \pm(\pm 1 \pm 6)$, the only possible solutions for which are $y \in \{\pm 7, \pm 5\}$. However, $y = \pm 7$ implies that $51 = x^3$, which is impossible. Hence, $y = \pm 5$, and $x = 3$. □
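A finite search cannot replace the proof, but it is a cheap way to confirm the statement over a range. The sketch below enumerates integer solutions of $y^2 = x^3 + k$ with $|x|$ up to a bound; the function name and the bound are illustrative choices.

```python
from math import isqrt


def bachet_solutions(k, x_max):
    """Integer solutions (x, y) of y^2 = x^3 + k with |x| <= x_max."""
    solutions = []
    for x in range(-x_max, x_max + 1):
        rhs = x ** 3 + k
        if rhs < 0:
            continue                      # no real y
        y = isqrt(rhs)
        if y * y == rhs:
            solutions.append((x, y))
            if y != 0:
                solutions.append((x, -y))
    return sorted(solutions)


euler_case = bachet_solutions(-2, 1000)   # [(3, -5), (3, 5)]
```

The same function can be pointed at other values of $k$ in (1.49).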

We will extend the above solution to much more general instances of (1.49) in Theorem 8.4 on page 282. However, we will need to develop deeper tools before we get there. For now, we exploit the unique factorization in the Gaussian integers to solve another instance of Bachet’s equation purportedly solved by Fermat.

Theorem 1.20 Fermat’s Solution of Bachet’s Equation

The only integer solutions of (1.49) for $k = -4$ are

$$(x, y) \in \{(5, \pm 11), (2, \pm 2)\}.$$

Proof. We work in the ring of Gaussian integers $F = \mathbb{Z}[i]$, which has unique factorization by Theorem 1.15 on page 34. In this proof the roles of $x$ and $y$ in (1.49) are interchanged: we solve $x^2 + 4 = y^3$, so a solution $(x, y)$ found below corresponds to the solution $(y, x)$ in the statement of the theorem. We have the factorization

$$(2 + xi)(2 - xi) = y^3$$

in $F$. We first show that $\gcd(2 + xi, 2 - xi) = 1$ in the case where $x$ is odd. Any common divisor $a + bi$ in $F$ must satisfy the property that

$$(a^2 + b^2) \mid N_F(4) = 16 = N_F(2 + ix + 2 - ix),$$

and

$$(a^2 + b^2) \mid 4x^2 = N_F(2 + xi - (2 - xi)),$$

so $(a^2 + b^2) \mid 4$. Thus, $a, b \in \{\pm 2, \pm 1, 0\}$. By part (e) of Exercise 1.8 on page 28,

$$(a^2 + b^2) \mid (x^2 + 4),$$

which is odd, so one of $a$ or $b$ must be $0$. In other words, the only common divisors are units, so $2 + ix$ and $2 - ix$ are relatively prime. Thus, by unique factorization in $\mathbb{Z}[i]$ ensured by Theorem 1.15 and Theorem 1.17 on page 39,

$$2 + ix = (a + bi)^3 \qquad (1.50)$$

for some $a, b \in \mathbb{Z}$. Therefore,

$$2 - ix = (a - ib)^3. \qquad (1.51)$$

(Note that although uniqueness is up to units and associates, we may assume without loss of generality that (1.50)–(1.51) hold since $u(a + bi)^3$, where $u \in \{\pm 1, \pm i\}$, may be written as a cube in $\mathbb{Z}[i]$ because all units are cubes therein.) Adding (1.50)–(1.51) yields

$$4 = 2a(a^2 - 3b^2),$$

so $a \mid 2$, forcing $a = \pm 1, \pm 2$. Of these, only

$$(a, b) \in \{(-1, \pm 1), (2, \pm 1)\}$$

are possible. Hence,

$$y^3 = ((a + bi)(a - bi))^3 = (a^2 + b^2)^3,$$

where $y = a^2 + b^2 \in \{2, 5\}$. Therefore, since $x$ is odd, $x^2 + 4 = 125$, with $x = \pm 11$. Thus, the solution $(x, y) = (\pm 11, 5)$ is achieved. Now we assume that $x$ is even. Set $x = 2X$ and $y = 2Y$. Then

$$X^2 + 1 = 2Y^3,$$

where $X$ must be odd. In other words, for odd $X$,

$$(1 - Xi)(1 + Xi) = (1 + i)(1 - i)Y^3.$$

Since $\gcd(1 + iX, 1 - iX) = 1 - i$, then by unique factorization,

$$1 + iX = (1 + i)(a + bi)^3,$$

for some $a + bi \in \mathbb{Z}[i]$. By comparing the constant terms,

$$1 = a^3 - 3a^2b - 3ab^2 + b^3 = (a + b)(a^2 - 4ab + b^2),$$

from which it follows that $a + b = \pm 1$, and $a^2 - 4ab + b^2 = \pm 1$. Therefore, one of $a$ or $b$ is zero and the other is $\pm 1$. Hence, $X = \pm 1$, and $x = \pm 2$, so $y = 2$. □
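The key step of the odd case, finding all Gaussian integers whose cube has real part 2, can be replayed numerically. The sketch below follows the proof's own convention, in which the equation solved is $x^2 + 4 = y^3$ (the roles of $x$ and $y$ in (1.49) are swapped there). The search box and the names are illustrative assumptions.

```python
def gauss_cube(a, b):
    """Real and imaginary parts of (a + b*i)**3."""
    return (a ** 3 - 3 * a * b ** 2, 3 * a ** 2 * b - b ** 3)


# Map each (a, b) with Re((a + bi)^3) = 2 to the pair (x, y) it produces,
# where x = Im((a + bi)^3) and y = a^2 + b^2.
hits = {}
for a in range(-10, 11):
    for b in range(-10, 11):
        re, im = gauss_cube(a, b)
        if re == 2:
            hits[(a, b)] = (im, a * a + b * b)
```

Only $(a, b) = (-1, \pm 1)$ and $(2, \pm 1)$ appear, giving $(x, y) = (\pm 2, 2)$ and $(\pm 11, 5)$; each satisfies $x^2 + 4 = y^3$, and the odd values $x = \pm 11$ are the ones the first half of the proof retains.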

We will look at Bachet’s equation again in §8.3 once we have even more tools to tackle more general solutions to (1.49).


Remark 1.19 Now we explore some of the real quadratic fields which are norm-Euclidean, and so UFDs (see Definition 1.18 on page 34 and Theorem 1.17 on page 39). Unlike the case with complex quadratic fields, there are real quadratic fields that are Euclidean but not norm-Euclidean. For instance, see [15] where it is shown that $\mathfrak{O}_F$ for $F = \mathbb{Q}(\sqrt{69})$ is Euclidean but not norm-Euclidean.

The resolution of the complete list of real quadratic norm-Euclidean fields is due to the efforts of many researchers. In 1938, H. Heilbronn proved in [41] that there are only finitely many such fields (see Biography 1.3). That list was eventually determined to be

$$D \in \{2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, 73\}.$$

This was due to the efforts of O. Perron [77], R. Remak [80], and N. Hofreiter [43], among others (see Biographies 1.4 and 1.5 on page 53 for instance). The final step was accomplished by H. Chatland and H. Davenport [14] in 1950 (see Biography 1.6 on page 54). We will not give the full result here since it involves the geometry of numbers. The following partial result was proved by A. Oppenheim in 1934; see [76].

Biography 1.3 Hans Arnold Heilbronn (1908–1975) was born in Berlin, Germany on October 8, 1908. He entered the University of Berlin in 1926, but eventually moved to Göttingen, where he began to study number theory under the direction of Edmund Landau. He obtained his degree in 1933, when Hitler came to power. Heilbronn, who was Jewish, fled to England after being dismissed from his position at Göttingen. Eventually he was offered a position at the University of Bristol where he published, arguably, his most famous result, coauthored with Linfoot, on a conjecture of Gauss concerning complex quadratic fields of class number equal to 1, showing that there are at most ten of them. A short while thereafter he was offered the Brevan Fellowship in Trinity College, Cambridge, in May 1935. There he began his most long-standing collaboration with Davenport that lasted until Davenport died in 1969. For his lifetime achievements, he was elected as a Fellow of the Royal Society in 1951. In 1964, he moved to North America, and after a brief stay in the U.S.A., he moved to the University of Toronto in Canada, becoming a Canadian citizen in 1970, and a member of the Royal Society of Canada in 1971. A heart attack in November of 1973 eventually led to complications and he died while undergoing an operation to fit a pacemaker on April 28, 1975.

Theorem 1.21 Some Norm-Euclidean Real Quadratic Fields

If $F = \mathbb{Q}(\sqrt{D})$ is a quadratic number field and

$$D \in \{2, 3, 5, 6, 7, 13, 17, 21, 29\},$$

then $\mathfrak{O}_F$ is norm-Euclidean.


Proof. First set

$$\kappa = \begin{cases} 2 & \text{if } D \equiv 1 \pmod{4}, \\ 1 & \text{if } D \equiv 2, 3 \pmod{4}, \end{cases}$$

and observe that any

$$\sigma = r_1 + s\sqrt{D} \in F, \quad r_1, s \in \mathbb{Q},$$

may be assumed without loss of generality to be in the form

$$\sigma = r_1 + (r_2/\kappa)\sqrt{D},$$

since we may write $s = \kappa s/\kappa = r_2/\kappa$ when $D \equiv 1 \pmod{4}$.

Based upon Exercise 1.46 on page 45, we need to establish that for any

$$\rho = r_1 + (r_2/\kappa)\sqrt{D} \in F, \quad\text{for } r_1, r_2 \in \mathbb{Q},$$

there exists a

$$\sigma = (x + y\sqrt{D})/\kappa \in \mathfrak{O}_F, \quad\text{where } x, y \in \mathbb{Z}$$

(by Theorem 1.3 on page 6) such that

$$|N_F(\rho - \sigma)| = |(r_1 - x/\kappa)^2 - (r_2 - y)^2 D/\kappa^2| < 1. \qquad (1.52)$$

We assume that Equation (1.52) fails for some $r_1, r_2 \in \mathbb{Q}$ and all $x, y \in \mathbb{Z}$. Then we show that for $D \le 8\kappa^2$, the only values for which (1.52) does not fail are the ones on our list.

Claim 1.8 We may assume without loss of generality that $0 \le r_j \le 1/2$, for $j = 1, 2$.

First, for $j = 1, 2$ we set

$$z_j = \begin{cases} \lfloor r_j \rfloor & \text{if } 0 \le r_j - \lfloor r_j \rfloor \le 1/2, \\ \lfloor r_j \rfloor + 1 & \text{if } 1 > r_j - \lfloor r_j \rfloor > 1/2, \end{cases}$$

where $\lfloor r_j \rfloor$ is the floor of $r_j$, or greatest integer less than or equal to $r_j$. Since we are assuming that Equation (1.52) fails for all $x, y \in \mathbb{Z}$, then in particular it will fail for

$$x = \kappa z_1 + \delta_1 x_1, \quad\text{and}\quad y = z_2 + \delta_2 y_1,$$

for any integers $x_1, y_1$, where $\delta_j = 1$ if $z_j = \lfloor r_j \rfloor$ and $\delta_j = -1$ otherwise for $j = 1, 2$. Thus,

$$|(r_1 - x/\kappa)^2 - (r_2 - y)^2 D/\kappa^2|$$

becomes

$$|(s_1 - x_1/\kappa)^2 - (s_2 - y_1)^2 D/\kappa^2|,$$

for any $x_1, y_1 \in \mathbb{Z}$, where $0 \le s_j = |r_j - z_j| \le 1/2$, $j = 1, 2$. This completes the proof of Claim 1.8.


By Claim 1.8, for all $x, y \in \mathbb{Z}$, one of the following inequalities must hold for some $0 \le r_j \le 1/2$, $j = 1, 2$:

$$(r_1 - x/\kappa)^2 \ge 1 + (r_2 - y)^2 D/\kappa^2, \qquad (1.53)$$

or

$$(r_2 - y)^2 D/\kappa^2 \ge 1 + (r_1 - x/\kappa)^2. \qquad (1.54)$$

If $x = y = 0$, and (1.53) holds, then

$$\frac{1}{4} \ge r_1^2 \ge 1 + \frac{r_2^2 D}{\kappa^2} \ge 1,$$

a contradiction, so (1.54) must hold, namely

$$\frac{r_2^2 D}{\kappa^2} \ge 1 + r_1^2. \qquad (1.55)$$

If $x = \kappa$, $y = 0$, and (1.53) holds, then $1 \ge (r_1 - 1)^2 \ge 1 + r_2^2 D/\kappa^2 \ge 1$, a contradiction, unless $r_1 = r_2 = 0$, which contradicts (1.55), so (1.54) must hold, namely

$$\frac{r_2^2 D}{\kappa^2} \ge 1 + (r_1 - 1)^2. \qquad (1.56)$$

If $x = -\kappa$, $y = 0$, and (1.53) holds, then

$$(r_1 + 1)^2 \ge 1 + \frac{r_2^2 D}{\kappa^2} \ge 2 + (r_1 - 1)^2, \qquad (1.57)$$

which implies that $r_1 \ge 1/2$, which in turn forces $r_1 = 1/2$ by Claim 1.8. Plugging this into (1.57), we get

$$\frac{9}{4} = (r_1 + 1)^2 \ge 1 + \frac{r_2^2 D}{\kappa^2} \ge 2 + (r_1 - 1)^2 = \frac{9}{4},$$

which forces

$$1 + \frac{r_2^2 D}{\kappa^2} = \frac{9}{4}.$$

Thus, $4r_2^2 D = 5\kappa^2$, so $\kappa = 2$, which forces $r_2 = 1$, contradicting Claim 1.8. We have shown that if $x = -\kappa$, $y = 0$, then (1.54) must hold, namely,

$$\frac{r_2^2 D}{\kappa^2} \ge 1 + (r_1 + 1)^2,$$

and by Claim 1.8, this implies that

$$\frac{D}{4\kappa^2} \ge \frac{r_2^2 D}{\kappa^2} \ge 2,$$

whence,

$$D \ge 8\kappa^2. \qquad (1.58)$$

In view of (1.58), if $D < 8\kappa^2$, then $\mathfrak{O}_F$ is norm-Euclidean. If $\kappa = 2$, then $D < 32$, for which we get the values $D = 5, 13, 17, 21, 29$, and if $\kappa = 1$, then $D < 8$, for which we get $D = 2, 3, 6, 7$. This yields the values of $D$ listed in the statement of the theorem. □
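The final bookkeeping step can be reproduced mechanically. The sketch below lists the squarefree $D > 1$ lying below the bound from (1.58), where the proof's parameter (called `kappa` here, an illustrative name) equals 2 when $D \equiv 1 \pmod 4$ and 1 otherwise.

```python
def is_squarefree(n):
    """True if no square d*d with d >= 2 divides n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def radicands_below_bound():
    """Squarefree D > 1 with D < 8 * kappa**2."""
    found = []
    for D in range(2, 33):
        kappa = 2 if D % 4 == 1 else 1
        if is_squarefree(D) and D < 8 * kappa ** 2:
            found.append(D)
    return found
```

Calling `radicands_below_bound()` gives `[2, 3, 5, 6, 7, 13, 17, 21, 29]`, matching the list in the statement of the theorem.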

Biography 1.4 Oskar Perron (1880–1975) was born in Frankenthal, Pfalz, Germany on May 7, 1880. He studied at several universities including Göttingen. One of his best-known texts is on continued fractions, entitled Die Lehre von den Kettenbrüchen, published in 1913 with revisions in 1950 and 1954. In 1914 he was appointed as ordinary professor in Heidelberg. After World War I, he was appointed a chair at Munich where he continued to teach, even beyond retirement, until 1960. He not only contributed to number theory, but also to differential equations, matrices, and geometry. He published over 200 papers and books including his text on non-Euclidean geometry published when he was 82. He died on February 22, 1975 in Munich.

Biography 1.5 Robert Remak (1888–1942) studied for his doctorate at the University of Berlin under Frobenius. He received his degree in 1911 and this important work, which has his name attached to it, along with Wedderburn, Schmidt, and Krull, was on the decomposition of finite groups into products of irreducible factors. He taught at the University of Berlin until 1933 when Hitler’s new laws got him dismissed. He was arrested in 1938 and sent to a concentration camp near Berlin. After eight weeks there he was released and his wife arranged for him to go to Amsterdam. There in 1942, he was arrested and sent to Auschwitz, Poland where he died in that year. He made important contributions to algebraic number theory and the geometry of numbers during his life.

Exercises

1.55. Show that the only rational integer solutions of

$$y^2 = x^3 - 1$$

are $x = 1$ and $y = 0$ using unique factorization in $\mathbb{Z}[i]$.

In Exercises 1.56–1.63, assume that $F = \mathbb{Q}(\sqrt{D})$ is a quadratic number field where $\mathfrak{O}_F$ is a UFD.

1.56. Prove that any rational prime $p$ is either a prime in $\mathfrak{O}_F$ or a product of two primes therein.
(Hint: See Exercises 1.37–1.38 on page 31.)

1.57. Prove that if $\alpha$ is a prime in $\mathfrak{O}_F$, then there is exactly one rational prime $p$ such that $\alpha \mid p$.

1.58. Establish each of the following where $p$ is an odd rational prime.

(a) $p \nmid D$ is a product of two primes $\alpha, \beta$ in $\mathfrak{O}_F$ if and only if the Legendre symbol $\left(\frac{D}{p}\right) = 1$. (See [68, §4.1, pp. 177–188].)
(b) If $p = \alpha\beta$, the product of two primes, then $\alpha \not\sim \beta$, but $\alpha \sim \beta'$, the latter being equivalent to $\alpha' \sim \beta$.

1.59. Prove that if $\alpha$ is a prime in $\mathfrak{O}_F$, but $\alpha$ is not a rational prime, then $|N_F(\alpha)| = p$ for some rational prime $p$.

1.60. Prove that if $D \equiv 3 \pmod{4}$, then $2 \sim \alpha^2$ where $\alpha$ is a prime in $\mathfrak{O}_F$.

1.61. If $D \equiv 5 \pmod{8}$, prove that $2$ is a prime in $\mathfrak{O}_F$.

1.62. If $D \equiv 1 \pmod{8}$, show that $2$ is the product of two nonassociated primes in $\mathfrak{O}_F$.

1.63. Prove that if $p \mid D$, then $p \sim \alpha^2$, where $\alpha$ is a prime in $\mathfrak{O}_F$.

1.64. By Theorem 1.21 on page 50,

$$\mathfrak{O}_F = \mathbb{Z}[(1 + \sqrt{21})/2]$$

is a UFD. With reference to Exercise 1.63, find a prime $\alpha \in \mathfrak{O}_F$ such that $3 \mid \alpha^2$, and find a $u \in U_F$ such that $3 = \alpha^2 u$.

Biography 1.6 Harold Davenport (1907–1969) was born in Huncoat, Lancashire, England on October 30, 1907. He entered Manchester University in 1924. He graduated in 1927, then went to Trinity College, Cambridge. There he wrote his doctorate under the direction of Littlewood. His thesis topic was the distribution of quadratic residues by employing new methodology using character and exponential sums. In 1930, he won the Rayleigh prize and two years later was elected to a Trinity fellowship. Shortly thereafter he visited Hasse in Marburg, Germany where he also met Heilbronn, with whom he began a lengthy collaboration. In 1937, he accepted an offer from Mordell at the University of Manchester, where he interacted with Mahler, Erdős, and Segre. In 1940, he was elected as a member of the Royal Society, and won the Adams prize from the University of Cambridge. In the following year, he was appointed as chair of mathematics at the University College of North Wales in Bangor. In 1945, he moved to London, to succeed Jeffrey as Astor professor of mathematics in University College there. From 1957 to 1959, he was President of the London Mathematical Society and in the middle of this, in 1958, he returned to Cambridge as Rouse Ball professor of mathematics after Besicovitch retired. During his life he contributed to number theory including his work on Waring’s problem where he showed that every sufficiently large natural number is the sum of sixteen fourth powers. He also wrote several texts which have become classics such as The Higher Arithmetic published in 1952 and many subsequent editions. Indeed his distinguished contribution to the theory of numbers was perhaps best honoured by his being awarded the Sylvester Medal in 1967. He was a heavy smoker and finally succumbed to lung cancer on June 9, 1969 in Cambridge.

Chapter 2

Ideals

It remains an old maxim of mine that when you have excluded the impossible, whatever remains, however improbable, must be the truth.

spoken by Sherlock Holmes in The Adventure of Beryl Coronet. Sir Arthur Conan Doyle (1859–1930), Scottish-born writer of detective fiction

2.1 The Arithmetic of Ideals in Quadratic Fields

We first mentioned the notion of an ideal on page 16 in reference to how we would need such a theory to delve deeper into some Diophantine analysis problems such as the generalized Ramanujan–Nagell equation. We also have some background in [68, Appendix A, pp. 303–305]. Now we have sufficient tools to introduce the concepts involved here.

Definition 2.1 Ideals

An $R$-ideal is a nonempty subset $I$ of a commutative ring $R$ with identity having the following properties.

(a) If $\alpha, \beta \in I$, then $\alpha + \beta \in I$.

(b) If $\alpha \in I$ and $r \in R$, then $r\alpha \in I$.

Remark 2.1 It is inductively clear that Definition 2.1 implies the following. If $\alpha_1, \alpha_2, \ldots, \alpha_n \in I$ for any $n \in \mathbb{N}$, then $r_1\alpha_1 + r_2\alpha_2 + \cdots + r_n\alpha_n \in I$ for any $r_1, r_2, \ldots, r_n \in R$. Moreover, if $1 \in I$, then $I = R$. Also, if we are given a set of elements $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ in an integral domain $R$, then the set of all linear combinations of the $\alpha_j$ for $j = 1, 2, \ldots, n$

$$\left\{ \sum_{j=1}^{n} r_j\alpha_j : r_j \in R \text{ for } j = 1, 2, \ldots, n \right\}$$

is an ideal of $R$ denoted by $(\alpha_1, \alpha_2, \ldots, \alpha_n)$. In particular, when $n = 1$, we have the following.

Definition 2.2 Principal and Proper Ideals

If $R$ is an integral domain and $I$ is an $R$-ideal, then $I$ is called a principal $R$-ideal if there exists an element $\alpha \in I$ such that $I = (\alpha)$, where $\alpha$ is called a generator of $I$. If $I \ne R$, then $I$ is called a proper ideal.

Example 2.1 Let $n \in \mathbb{Z}$ and set

$$n\mathbb{Z} = \{nk : k \in \mathbb{Z}\},$$

which is an ideal in $\mathbb{Z}$, and $n\mathbb{Z} = (n) = (-n)$ is indeed a principal ideal. Moreover, it is a proper ideal for all $n \ne \pm 1$.

Example 2.1 is a segue to the question about how rings of integers behave in terms of intersection with $\mathbb{Z}$. This is answered in the next result, which will also be valuable in §2.2 (see Theorem 2.9 on page 73) but is also of interest in its own right since it employs minimal polynomials characterized in Theorem 1.6 on page 10.

Theorem 2.1 OF-Ideals Intersecting Z

If $F$ is a number field and $I$ is a nonzero $\mathfrak{O}_F$-ideal, then $I \cap \mathbb{Z}$ contains a nonzero element of $\mathbb{Z}$.

Proof. Let $\alpha \in I$ where $\alpha \ne 0$ and consider

$$m_{\alpha,\mathbb{Q}}(x) = a_0 + a_1 x + \cdots + a_{d-1}x^{d-1} + x^d,$$

where $a_j \in \mathbb{Z}$ for all $j = 0, 1, \ldots, d-1$ by Corollary 1.4 on page 11. If $d = 1$, then $a_0 = -\alpha \ne 0$, and if $d > 1$, then $a_0 \ne 0$ since $m_{\alpha,\mathbb{Q}}(x)$ is irreducible in $\mathbb{Q}[x]$ by Corollary 1.4. Hence,

$$a_0 = -a_1\alpha - \cdots - a_{d-1}\alpha^{d-1} - \alpha^d \in I,$$

as required. □

For the following illustration and what follows, the reader has to be familiar with basic module theory. For those not so well versed or needing a reminder, see Exercise 2.1 on page 65.

Example 2.2 In $R = \mathbb{Z}[i]$, $(2)$ and $(3)$ are proper principal ideals. Moreover, the latter is an example of a special type of ideal that we now define and about which we will prove this assertion (see Example 2.3 on the following page).

Definition 2.3 Prime Ideals

If $R$ is an integral domain, then a proper $R$-ideal $P$ is called a prime $R$-ideal if it satisfies the property that whenever $\alpha\beta \in P$, for $\alpha, \beta \in R$, then either $\alpha \in P$ or $\beta \in P$.

In order to discuss any more features of ideal theory, we need to understandhow multiplication of ideals comes into play.

Definition 2.4 Products of ideals

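If $R$ is an integral domain and $I, J$ are $R$-ideals, then the product of $I$ and $J$, denoted by $IJ$, is given by

$$IJ = \Big\{ r \in R : r = \sum_{j=1}^{n}\alpha_j\beta_j \text{ where } n \in \mathbb{N}, \text{ and } \alpha_j \in I,\ \beta_j \in J \text{ for } 1 \le j \le n \Big\}.$$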

Theorem 2.2 Criterion for Prime Ideals

If $R$ is an integral domain and $I$ is a proper $R$-ideal, then $I$ is a prime $R$-ideal if and only if the following property is satisfied:

for any two $R$-ideals $J, K$ with $JK \subseteq I$, either $J \subseteq I$ or $K \subseteq I$. (2.1)

Proof. Suppose that (2.1) holds. If $\alpha, \beta \in R$ are such that $\alpha\beta \in I$, then certainly

$$(\alpha\beta) = (\alpha)(\beta) \subseteq I,$$

taking $J = (\alpha)$ and $K = (\beta)$ in (2.1), which therefore implies that $(\alpha) \subseteq I$ or $(\beta) \subseteq I$. Hence, $\alpha \in I$ or $\beta \in I$. We have shown that (2.1) implies $I$ is prime.

Conversely, suppose that $I$ is a prime $R$-ideal. If (2.1) fails to hold, then there exist $R$-ideals $J, K$ such that $JK \subseteq I$ but $K \not\subseteq I$ and $J \not\subseteq I$. Let $\alpha \in J$ with $\alpha \notin I$ and $\beta \in K$ with $\beta \notin I$. Then $\alpha\beta \in JK \subseteq I$ with neither $\alpha$ nor $\beta$ in $I$, which contradicts Definition 2.3 on the previous page. Hence, (2.1) holds and the result is secured. □

Now we prove a result that links the notion of prime element and prime ideal in the principal ideal case.

Theorem 2.3 Principal Prime Ideals and Prime Elements

If $R$ is an integral domain and $\alpha \in R$ is a nonzero, nonunit element, then $(\alpha)$ is a prime $R$-ideal if and only if $\alpha$ is a prime in $R$.

Proof. Suppose first that $(\alpha)$ is a prime $R$-ideal. Then for any $\beta, \gamma \in R$ such that $\alpha \mid \beta\gamma$,

$$\beta\gamma \in (\beta\gamma) \subseteq (\alpha).$$

Since $(\alpha)$ is a prime $R$-ideal, then $\beta \in (\alpha)$ or $\gamma \in (\alpha)$ by Definition 2.3 on the preceding page. In other words, $\alpha \mid \beta$ or $\alpha \mid \gamma$, namely $\alpha$ is a prime in $R$.

Conversely, suppose that $\alpha$ is prime in $R$. If $\beta, \gamma \in R$ are such that $\beta\gamma \in (\alpha)$, then there exists an $r \in R$ with $\beta\gamma = \alpha r$. Since $\alpha$ is prime, then $\alpha \mid \beta$ or $\alpha \mid \gamma$. Suppose, without loss of generality, that $\alpha \mid \beta$. Thus, there is an $s \in R$ such that $\beta = \alpha s$, so $\beta \in (\alpha)$. We have shown that $(\alpha)$ is a prime $R$-ideal by Definition 2.3, which completes the proof. □

Example 2.3 In Example 2.2 on the previous page, $(2)$ and $(3)$ were considered as principal ideals in the Gaussian integers. By Exercises 1.37–1.38 on page 31, 3 is a Gaussian prime, but 2 is not. Therefore, by Theorem 2.3, $(3)$ is a prime ideal in the Gaussian integers but $(2)$ is not.
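The claim in Example 2.3 can be checked numerically. The sketch below relies on a standard fact that is assumed here rather than proved in this section: since the norm of $a + bi$ is $a^2 + b^2$, a rational prime $p$ factors nontrivially in $\mathbb{Z}[i]$ exactly when $p = a^2 + b^2$ for some integers $a$, $b$. The function name `sum_of_two_squares` is purely illustrative.

```python
from math import isqrt

def sum_of_two_squares(p):
    """Return (a, b) with a*a + b*b == p and 0 <= a <= b, or None if no such pair exists."""
    for a in range(isqrt(p) + 1):
        b = isqrt(p - a * a)          # largest candidate b for this a
        if a <= b and a * a + b * b == p:
            return (a, b)
    return None

# 2 = 1^2 + 1^2 = (1 + i)(1 - i), so 2 is not a Gaussian prime.
print(sum_of_two_squares(2))
# 3 has no such representation, consistent with 3 being a Gaussian prime.
print(sum_of_two_squares(3))
```

Under the stated assumption, a result of `None` corresponds to $(p)$ being a prime ideal of $\mathbb{Z}[i]$ via Theorem 2.3.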

Now that we may look at products of ideals, we may look at the notion of division in ideals in order to link this with the element level and primes. Moreover, it will provide a segue for us to talk about explicit representation of ideal products in $\mathcal{O}_F$ for quadratic fields $F$.

Definition 2.5 Division of Ideals

If $R$ is an integral domain, then a nonzero $R$-ideal $I$ is said to divide an $R$-ideal $J$ if there is another $R$-ideal $H$ such that $J = HI$.

The following shows that division of ideals implies containment.

Lemma 2.1 To Divide is to Contain

If $R$ is an integral domain and $I$, $J$ are $R$-ideals, with $I \mid J$, then $J \subseteq I$.

Proof. Since $I \mid J$, then by Definition 2.5, there is an $R$-ideal $H$ such that $J = IH$. However, by Definition 2.1, $J = IH \subseteq I$. □

Corollary 2.1 Suppose that $R$ is an integral domain and $I$ is an $R$-ideal satisfying the property that whenever $I \mid JK$ for $R$-ideals $J$, $K$, we have $I \mid J$ or $I \mid K$. Then $I$ is a prime $R$-ideal.

Proof. Suppose that $I \mid JK$, then by Lemma 2.1, $JK \subseteq I$, and the property implies that either $J \subseteq I$ or $K \subseteq I$, so by Theorem 2.2 on page 57, $I$ is a prime $R$-ideal. □

Now we look at multiplication of ideals in quadratic fields. If the reader is in need of a reminder about the basics involved in modules and their transition to ideals in the rings of integers in quadratic fields, then see Exercises 2.1–2.4. In any case, see Exercise 2.8 on page 66.

Multiplication Formulas for Ideals in Quadratic Fields.

Suppose that $F = \mathbb{Q}(\sqrt{D})$ is a quadratic number field, and $\mathcal{O}_F$ is its ring of integers; see Theorem 1.3 on page 6. Let $\Delta_F$ be the field discriminant given in Definition 1.6 on page 7, and let

\[ I_j = \big(a_j, (b_j + \sqrt{\Delta_F})/2\big), \]

for $j = 1, 2$ be $\mathcal{O}_F$-ideals, then

\[ I_1 I_2 = (g)\left(a_3, \frac{b_3 + \sqrt{\Delta_F}}{2}\right), \]

where

\[ a_3 = \frac{a_1 a_2}{g^2} \quad\text{with}\quad g = \gcd\left(a_1, a_2, \frac{b_1 + b_2}{2}\right), \]

and

\[ b_3 \equiv \frac{1}{g}\left(\delta a_2 b_1 + \mu a_1 b_2 + \frac{\nu}{2}(b_1 b_2 + \Delta_F)\right) \pmod{2a_3}, \]

where $\delta$, $\mu$, and $\nu$ are determined by

\[ \delta a_2 + \mu a_1 + \frac{\nu}{2}(b_1 + b_2) = g. \]

Note the above formulas are intended for our context, namely the ring of integers of a quadratic field $\mathcal{O}_F$, called the maximal order. In an order contained in $\mathcal{O}_F$ that is not maximal, the above does not work unless we restrict to invertible ideals. For the details on, and background for, orders in general, see either [62, §1.5] or [65, §3.5]. Also, see Definition 2.14 on page 76 and Exercise 2.18 on page 86.

Example 2.4 Consider $\Delta_F = 40$, with

\[ I_1 = (3, 1 + \sqrt{10}) \quad\text{and}\quad I_2 = (3, -1 + \sqrt{10}), \]

so in the notation of the above description of formulas for multiplication, we have

\[ a_1 = a_2 = 3,\ b_1 = 2 = -b_2,\ g = 3,\ \delta = 0 = \nu,\ \mu = 1,\ b_3 = 1, \text{ and } a_3 = 1, \]

so

\[ I_1 I_2 = (3, 1 + \sqrt{10})(3, -1 + \sqrt{10}) = (3). \tag{2.2} \]

Hence, the product of $I_1$ and $I_2$ is the principal ideal $(3)$ in $\mathbb{Z}[\sqrt{10}] = \mathcal{O}_F$, and by Theorem 2.3 on page 58, $(3)$ is not a prime ideal in $\mathcal{O}_F$ since $(3)$ divides $I_1 I_2$ but does not divide either factor. To see this, note that if

\[ (3) \mid (3, \pm 1 + \sqrt{10}), \]

then by Lemma 2.1 on page 58,

\[ (3, \pm 1 + \sqrt{10}) \subseteq (3), \]

which is impossible since it is easy to show that $\pm 1 + \sqrt{10} \notin (3)$. Moreover, by Exercise 2.7 on page 66, $I_1$ and $I_2$ are prime $\mathcal{O}_F$-ideals.
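Equation (2.2) can be confirmed independently of the multiplication formulas. The following is a minimal sketch, assuming (as in Exercises 2.2–2.4) that each ideal is the $\mathbb{Z}$-module spanned by its two listed generators, so that the product is spanned over $\mathbb{Z}$ by the four pairwise products. Elements $x + y\sqrt{D}$ are stored as pairs `(x, y)`; the helper names `mul`, `module_basis`, and `ideal_product` are illustrative only.

```python
from math import gcd

def mul(u, v, D):
    """(x1 + y1*sqrt(D)) * (x2 + y2*sqrt(D)) as a coordinate pair."""
    return (u[0] * v[0] + u[1] * v[1] * D, u[0] * v[1] + u[1] * v[0])

def module_basis(vectors):
    """Canonical (a, b, c) with Z-span equal to [a, b + c*sqrt(D)]; assumes a rank-2 module."""
    rows = [tuple(v) for v in vectors]
    while True:
        # Euclid on the sqrt(D)-coordinates until at most one row has y != 0.
        nonzero = sorted((r for r in rows if r[1] != 0), key=lambda r: abs(r[1]))
        if len(nonzero) <= 1:
            break
        p = nonzero[0]
        reduced = [(r[0] - (r[1] // p[1]) * p[0], r[1] % p[1]) for r in nonzero[1:]]
        rows = [r for r in rows if r[1] == 0] + [p] + reduced
    a = 0
    for r in rows:
        if r[1] == 0:
            a = gcd(a, r[0])          # generator of the rational integers in the module
    b, c = nonzero[0] if nonzero else (0, 0)
    if c < 0:
        b, c = -b, -c
    return (a, b % a if a else b, c)

def ideal_product(gens1, gens2, D):
    """Z-module basis of the product of two ideals given by Z-module generators."""
    return module_basis([mul(u, v, D) for u in gens1 for v in gens2])

I1 = [(3, 0), (1, 1)]     # (3, 1 + sqrt(10))
I2 = [(3, 0), (-1, 1)]    # (3, -1 + sqrt(10))
print(ideal_product(I1, I2, 10))           # basis of I1*I2
print(module_basis([(3, 0), (0, 3)]))      # basis of (3) = [3, 3*sqrt(10)]
```

Both calls return the same triple, which is what (2.2) asserts.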

Example 2.4 motivates a study of prime decomposition of ideals in quadratic fields. For instance, (2.2) is the decomposition of the ideal $(3)$ in $\mathbb{Z}[\sqrt{10}] = \mathcal{O}_F$ for $F = \mathbb{Q}(\sqrt{10})$ into the product of the two prime ideals $I_1$ and $I_2$. In what follows, we have a complete description. The notation $(*/p)$ in the following denotes the Legendre symbol; see [68, §4.1].

Theorem 2.4 Prime Decomposition in Quadratic Fields

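If $\mathcal{O}_F$ is the ring of integers of a quadratic field $F = \mathbb{Q}(\sqrt{D})$, and $p \in \mathbb{Z}$ is prime, then the following holds, where $P_1$, $P_2$, and $P$ are distinct prime $\mathcal{O}_F$-ideals with norm $p$; see Exercise 2.7.

\[ (p) = p\mathcal{O}_F = \begin{cases} P_1 P_2 & \text{if } p > 2,\ (D/p) = 1, \text{ or } p = 2,\ D \equiv 1 \pmod 8,\\ P & \text{if } p > 2,\ (D/p) = -1, \text{ or } p = 2,\ D \equiv 5 \pmod 8,\\ P^2 & \text{if } p > 2,\ p \mid D, \text{ or } p = 2,\ D \equiv 2, 3 \pmod 4. \end{cases} \]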


Proof. For the sake of simplicity of elucidation in the following Cases 2.1–2.3, we present only the instance where $\mathcal{O}_F = \mathbb{Z}[\sqrt{D}]$ since the proof for $\mathcal{O}_F = \mathbb{Z}[(1 + \sqrt{D})/2]$ is similar.

Case 2.1 $(D/p) = 1$ for $p > 2$.

The Legendre symbol equality tells us that there exists a $b \in \mathbb{Z}$ such that $b^2 \equiv D \pmod p$. Also, since $p \nmid D$, then $p \nmid b$. Let

\[ P_1 = (p, b + \sqrt{D}) \quad\text{and}\quad P_2 = (p, -b + \sqrt{D}). \]

If $P_1 = P_2$, then

\[ 2b = b + \sqrt{D} - (-b + \sqrt{D}) \in P_1, \]

so $p \mid 2b$ by the minimality of $p$ as demonstrated in Exercises 2.2–2.4, namely

\[ 2b \in P_1 \cap \mathbb{Z} = (p). \]

This is impossible since $p > 2$ and $p \nmid b$. Thus, $P_1$ and $P_2$ are distinct $\mathcal{O}_F$-ideals. By the multiplication formulas given on page 59, we have, in the notation of those formulas, $a_3 = 1$ and $g = p$, so

\[ P_1 P_2 = (p). \]

Case 2.2 $(D/p) = -1$ for $p > 2$.

Let $\alpha\beta \in (p)$, where

\[ \alpha = a_1 + b_1\sqrt{D},\quad \beta = a_2 + b_2\sqrt{D} \in \mathbb{Z}[\sqrt{D}]. \]

Suppose that $\beta \notin (p)$. We have

\[ \alpha\beta = a_1 a_2 + b_1 b_2 D + (a_2 b_1 + a_1 b_2)\sqrt{D} = p(x + y\sqrt{D}), \]

for some $x, y \in \mathbb{Z}$. Therefore,

\[ a_1 a_2 + b_1 b_2 D = px, \tag{2.3} \]

and

\[ a_2 b_1 + a_1 b_2 = py. \tag{2.4} \]

If $b_1 = 0$, then by (2.3), $p \mid a_1 a_2$. If $p \mid a_1$, then $\alpha = a_1 \in (p)$, so by Definition 2.3 on page 57, $(p)$ is an $\mathcal{O}_F$-prime ideal. If $p \mid a_2$, then $p \nmid b_2$ since $\beta \notin (p)$, so by (2.4) $p \mid a_1$ and we again have that $\alpha \in (p)$. Hence, we may assume that $b_1 \neq 0$. Similarly, we may assume that $a_1 \neq 0$.

Multiplying (2.4) by $a_1$ and subtracting $b_1$ times (2.3), we get

\[ b_2(a_1^2 - b_1^2 D) = p(a_1 y - b_1 x). \]

If $p \mid (a_1^2 - b_1^2 D)$, then there exists a $z \in \mathbb{Z}$ such that $a_1^2 - b_1^2 D = pz$. Therefore,

\[ -1 = \left(\frac{D}{p}\right) = \left(\frac{b_1^2 D}{p}\right) = \left(\frac{a_1^2 - pz}{p}\right) = \left(\frac{a_1^2}{p}\right) = 1, \]

a contradiction. Hence, $p \mid b_2$. By (2.4), this means that $p \mid a_2 b_1$. If $p \mid a_2$, then $p \mid (a_2 + b_2\sqrt{D})$, so $\beta \in (p)$, a contradiction to our initial assumption. Thus, $p \mid b_1$, so $p \mid (a_1 + b_1\sqrt{D})$, which means that $\alpha \in (p)$.


Case 2.3 $p > 2$ and $p \mid D$.

Let $P = (p, \sqrt{D})$. Then by the multiplication formulas on page 59, with $a_3 = 1$ and $g = p$ in the notation there, $P^2 = (p)$. This completes Case 2.3.

It remains to consider the three cases for p = 2.

Case 2.4 $p = 2$ and $D \equiv 1 \pmod 8$.

Let

\[ P_1 = \big(2, (1 + \sqrt{D})/2\big) \quad\text{and}\quad P_2 = \big(2, (-1 + \sqrt{D})/2\big). \]

Then by the multiplication formulas as used above with $a_3 = 1$ and $g = 2$, we have

\[ P_1 P_2 = (2). \]

If $P_1 = P_2$, then $(1 + \sqrt{D})/2 + (-1 + \sqrt{D})/2 = \sqrt{D} \in P_1$, which is not possible. Thus, $P_1$ and $P_2$ are distinct. This is Case 2.4.

Case 2.5 $p = 2$ and $D \equiv 5 \pmod 8$.

Let $\alpha\beta \in (2)$, where

\[ \alpha = (a_1 + b_1\sqrt{D})/2,\quad \beta = (a_2 + b_2\sqrt{D})/2 \in \mathbb{Z}[(1 + \sqrt{D})/2], \]

with $a_j$ and $b_j$ of the same parity for $j = 1, 2$. Suppose that $\beta \notin (2)$. We have

\[ \alpha\beta = \frac{a_1 a_2 + b_1 b_2 D + (a_2 b_1 + a_1 b_2)\sqrt{D}}{4} = 2\left(\frac{x + y\sqrt{D}}{2}\right) = x + y\sqrt{D}, \]

where $x, y \in \mathbb{Z}$ are of the same parity. Thus,

\[ a_1 a_2 + b_1 b_2 D = 4x, \tag{2.5} \]

and

\[ a_2 b_1 + a_1 b_2 = 4y. \tag{2.6} \]

Multiplying (2.6) by $a_1$ and subtracting $b_1$ times (2.5), we get

\[ b_2(a_1^2 - b_1^2 D) = 4(y a_1 - x b_1). \]

If $a_1^2 - b_1^2 D$ is even then either $a_1$ and $b_1$ are both odd or both even. In the former case, $1 \equiv a_1^2 \equiv b_1^2 D \equiv 5 \pmod 8$, a contradiction, so they are both even. Hence,

\[ \alpha = 2\left(\frac{a_1/2 + (b_1/2)\sqrt{D}}{2}\right) \in (2), \]

so $(2)$ is a prime $\mathcal{O}_F$-ideal by Definition 2.3. If $b_2$ is even, then by (2.6), $2 \mid a_2 b_1$. If $2 \mid a_2$, then

\[ \beta = 2\left(\frac{a_2/2 + (b_2/2)\sqrt{D}}{2}\right) \in (2), \]

contradicting our initial assumption. Hence, $b_1$ is even and so $a_1$ is even since they must be of the same parity. As above, this implies that $\alpha \in (2)$. Thus, $(2)$ is prime. This completes Case 2.5.

Case 2.6 $p = 2$ and $D \equiv 2 \pmod 4$.

Let $P = (2, \sqrt{D})$, which is an $\mathcal{O}_F$-ideal by Exercise 2.7 on page 66. Moreover, $P^2 = (2)$, by the multiplication formulas on page 59 with $a_3 = 1$ and $g = 2$.

Case 2.7 $p = 2$ and $D \equiv 3 \pmod 4$.

Let $P = (2, 1 + \sqrt{D})$, which is an $\mathcal{O}_F$-ideal by Exercise 2.7. Moreover, as in Case 2.6, $P^2 = (2)$.

This completes all cases. □

Remark 2.2 Although we have not developed the full theory for ideals in general number fields, we will be able to talk about decomposition of ideals in quadratic fields. The following terminology will be suited to the more general case so we introduce it here; see [64]. Suppose that $F = \mathbb{Q}(\sqrt{D})$ is a quadratic number field, $\Delta_F$ is given as in Definition 1.6 on page 7, and $(\Delta_F/p)$ denotes the Kronecker symbol; see [68, pp. 199–200]. If $p \in \mathbb{Z}$ is a prime, then

$(p)$ is said to split in $F$ if and only if $\left(\frac{\Delta_F}{p}\right) = 1$,

$(p)$ is said to ramify in $F$ if and only if $\left(\frac{\Delta_F}{p}\right) = 0$,

and $(p)$ is said to be inert in $F$ if and only if $\left(\frac{\Delta_F}{p}\right) = -1$.

Note, as well, that from the proof of Theorem 2.4, when $(p) = P_1 P_2$, namely when $(p)$ splits, then $P_2$ is the conjugate of $P_1$. This means that if

\[ P_1 = (p, b + \sqrt{D}), \]

then

\[ P_2 = (p, -b + \sqrt{D}). \]

Example 2.5 In Example 2.4 on page 60, with $\Delta_F = 40$, we saw that

\[ (3) = I_1 I_2 = (3, 1 + \sqrt{10})(3, -1 + \sqrt{10}), \]

where

\[ \left(\frac{\Delta_F}{3}\right) = \left(\frac{40}{3}\right) = 1, \]

so $(3)$ splits in $\mathbb{Q}(\sqrt{10})$ into the two prime $\mathbb{Z}[\sqrt{10}]$-ideals $I_1$ and $I_2$.

In Examples 2.2 on page 57 and 2.3 on page 58, we saw that $(2)$ is not a prime ideal in $\mathbb{Z}[i]$ and that $(3)$ is a prime $\mathbb{Z}[i]$-ideal. Since

\[ (2) = (1 + i)^2, \]

where

\[ P = (2, 1 + i) = (1 + i) = (2, 1 - i) = (1 - i) \]

is a prime $\mathbb{Z}[i]$-ideal, then $(2)$ is ramified in $F = \mathbb{Q}(i)$, where

\[ \left(\frac{\Delta_F}{2}\right) = \left(\frac{-4}{2}\right) = 0. \]

Also, $(3)$ is a prime ideal and we see that

\[ \left(\frac{\Delta_F}{3}\right) = \left(\frac{-4}{3}\right) = -1, \]

so $(3)$ is inert in $F$.

The following illustration shows that the converse of Lemma 2.1 on page 58 does not hold in general and that the multiplication formulas, on page 59, do not necessarily hold if we do not have the ring of integers of a quadratic field in which to work.
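The case distinctions of Theorem 2.4 translate directly into a small classifier. The sketch below is illustrative only: it assumes $D$ is a squarefree integer other than 0 and 1 and that $p$ is prime (neither is checked), and it evaluates the Legendre symbol by Euler's criterion, a standard fact not stated in this section. The function name `decomposition_type` is hypothetical.

```python
def decomposition_type(D, p):
    """Classify (p) in Q(sqrt(D)) as 'split', 'inert', or 'ramified' per Theorem 2.4.

    Assumes D is squarefree (D != 0, 1) and p is a rational prime.
    """
    if p == 2:
        if D % 4 in (2, 3):
            return "ramified"                       # (2) = P^2
        return "split" if D % 8 == 1 else "inert"   # D = 1 or 5 (mod 8)
    if D % p == 0:
        return "ramified"                           # p divides D
    legendre = pow(D % p, (p - 1) // 2, p)          # Euler's criterion: 1 or p - 1
    return "split" if legendre == 1 else "inert"

print(decomposition_type(10, 3))   # Example 2.4: (3) in Q(sqrt(10))
print(decomposition_type(-1, 2))   # (2) in Q(i)
print(decomposition_type(-1, 3))   # (3) in Q(i)
```

The three calls reproduce the split, ramified, and inert behaviour described in Example 2.5.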

Example 2.6 If $R = \mathbb{Z}[\sqrt{5}]$, then

\[ I = (2, 1 + \sqrt{5}) \]

is an $R$-ideal by Exercise 2.3 on page 66, and clearly $(2) = (2, 2\sqrt{5}) \subseteq I$. If $I \mid (2)$, then there exists an $R$-ideal $J$ such that $(2) = IJ$. Thus, $J$ has a representation

\[ J = (a, b + c\sqrt{D}) \]

with $a, c \in \mathbb{N}$, $b \in \mathbb{Z}$, $0 \le b < a$, such that $c \mid a$, $c \mid b$ and $ac \mid (b^2 - c^2 D)$. Moreover, $J \mid (2)$, so by Lemma 2.1, $(2) \subseteq J$, so there exist $x, y \in \mathbb{Z}$ such that

\[ 2 = ax + (b + c\sqrt{D})y. \]

Therefore, $y = 0$ and $a \mid 2$. If $a = 1$, then $I = (2)$, which means that

\[ 1 + \sqrt{5} \in (2), \]

a contradiction, so $a = 2$. If $b = 1$, then $c = 1$, so

\[ I^2 = (2). \tag{2.7} \]

However, by considering the multiplication of basis elements for $I$ we see that

\[ I^2 = (4, 2(1 + \sqrt{5}), 6 + 2\sqrt{5}) = (4, 2(1 + \sqrt{5})), \]

where the last equality follows since $6 + 2\sqrt{5}$ is a linear combination of the other basis elements so is redundant. Thus,

\[ I^2 = (4, 2(1 + \sqrt{5})) = (2)(2, 1 + \sqrt{5}) = (2)I, \]

and combining this with (2.7), we get $(2) = (2)I$, which implies

\[ 2(1 + \sqrt{5}) \in (2), \]

again a contradiction. We have shown both that although $(2) \subseteq I$, $I$ does not divide $(2)$, and that the multiplication formulas for ideals in $R$ fail to hold. Note that $R$ is not the ring of integers of a quadratic field by Theorem 1.3 on page 6. ($R$ is what is known as an order in $\mathcal{O}_F = \mathbb{Z}[(1 + \sqrt{5})/2]$ for $F = \mathbb{Q}(\sqrt{5})$, and $I$ is an example of an ideal in $R$ which is not invertible in $R$; see [62, Chapter 1, pp. 23–30]. In an integral domain $R$, an invertible $R$-ideal is one for which there is an $R$-ideal $J$ such that $IJ = R$. It can be shown that all ideals in the ring of integers of a quadratic field are invertible, which is why the multiplication formulas work there since they fail only for ideals that are not invertible.)
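The computation of $I^2$ in Example 2.6 can be reproduced mechanically. This is a sketch under the same assumption used in Exercises 2.2–2.4, namely that $I$ is the $\mathbb{Z}$-module $[2, 1 + \sqrt{5}]$, so $I^2$ is spanned over $\mathbb{Z}$ by the pairwise products of those generators. Elements $x + y\sqrt{5}$ are stored as pairs `(x, y)`, and the helper names are illustrative, not part of the text.

```python
from math import gcd

def mul(u, v, D):
    """(x1 + y1*sqrt(D)) * (x2 + y2*sqrt(D)) as a coordinate pair."""
    return (u[0] * v[0] + u[1] * v[1] * D, u[0] * v[1] + u[1] * v[0])

def module_basis(vectors):
    """Canonical (a, b, c) with Z-span equal to [a, b + c*sqrt(D)]; assumes a rank-2 module."""
    rows = [tuple(v) for v in vectors]
    while True:
        # Euclid on the sqrt(D)-coordinates until at most one row has y != 0.
        nonzero = sorted((r for r in rows if r[1] != 0), key=lambda r: abs(r[1]))
        if len(nonzero) <= 1:
            break
        p = nonzero[0]
        reduced = [(r[0] - (r[1] // p[1]) * p[0], r[1] % p[1]) for r in nonzero[1:]]
        rows = [r for r in rows if r[1] == 0] + [p] + reduced
    a = 0
    for r in rows:
        if r[1] == 0:
            a = gcd(a, r[0])
    b, c = nonzero[0] if nonzero else (0, 0)
    if c < 0:
        b, c = -b, -c
    return (a, b % a if a else b, c)

D = 5
I = [(2, 0), (1, 1)]                                   # [2, 1 + sqrt(5)]
I_squared = module_basis([mul(u, v, D) for u in I for v in I])
two_times_I = module_basis([mul((2, 0), u, D) for u in I])
principal_two = module_basis([(2, 0), (0, 2)])         # (2) = [2, 2*sqrt(5)]
print(I_squared, two_times_I, principal_two)
```

The first two triples agree and differ from the third, matching the conclusion that $I^2 = (2)I$ while $I^2 \neq (2)$ in $\mathbb{Z}[\sqrt{5}]$.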

There are rings of integers for which the converse of Lemma 2.1 holds, calledDedekind domains, the topic of §2.2.

Exercises

2.1. Suppose that $G$ is an additive abelian group, and that $R$ is a commutative ring with identity $1_R$ which satisfy each of the following axioms:

(a) For each $r \in R$ and $g, h \in G$, $r(g + h) = (rg) + (rh)$.
(b) For each $r, s \in R$ and $g \in G$, $(r + s)g = (rg) + (sg)$.
(c) For each $r, s \in R$ and $g \in G$, $r(sg) = (rs)g$.
(d) For each $g \in G$, $1_R \cdot g = g$.

Then $G$ is a (two-sided) module over $R$, or for our purposes, simply an $R$-module. Prove that (in general) being a $\mathbb{Z}$-module is equivalent to being an additive abelian group.

2.2. Let $R = \mathbb{Z}[\omega_D]$, $D \in \mathbb{Z}$ not a perfect square, and $\omega_D = (\sigma - 1 + \sqrt{D})/\sigma$, with $\sigma = 1$ if $D \not\equiv 1 \pmod 4$ and $\sigma = 2$ otherwise. Then every $\mathbb{Z}$-submodule of $R$ has a representation in the form $I = [a, b + c\omega_D]$ where $a, c \in \mathbb{N}$ and $b \in \mathbb{Z}$ with $0 \le b < a$. Moreover, $a$ is the smallest natural number in $I$ and $c$ is the smallest natural number such that $b + c\omega_D \in I$ for any $b \in \mathbb{Z}$. (Note that when $c = 1$, $I$ is called primitive.)


2.3. With reference to Exercise 2.2, prove that $I = (a, b + c\omega_D)$ is an $R$-ideal if and only if $c \mid a$, $c \mid b$, and $(\sigma b + c(\sigma - 1))^2 \equiv c^2 D \pmod{\sigma^2 ac}$. (Note that we use the square brackets for $\mathbb{Z}$-modules and the round brackets for ideals.)

2.4. With reference to Exercise 2.2, prove that the $\mathbb{Z}$-module $[a, b + c\omega_D]$ for $a, c \in \mathbb{N}$, $b \in \mathbb{Z}$, is an $R$-ideal $(a, b + c\omega_D)$ if and only if $c \mid a$, $c \mid b$, and $(\sigma b + c(\sigma - 1))^2 \equiv c^2 D \pmod{\sigma^2 ac}$. (Here $a$ is the smallest natural number in $I$, called the norm of $I$.)

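2.5. Let $[\alpha, \beta] = \alpha\mathbb{Z} + \beta\mathbb{Z}$ and $[\gamma, \delta] = \gamma\mathbb{Z} + \delta\mathbb{Z}$ be two $\mathbb{Z}$-modules, where $\alpha, \beta, \gamma, \delta \in R$, where $R$ is given in Exercise 2.2. Prove that $[\alpha, \beta] = [\gamma, \delta]$ if and only if

\[ \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = X \begin{pmatrix} \gamma \\ \delta \end{pmatrix}, \]

where $X \in \mathrm{GL}(2, \mathbb{Z})$, which is the general linear group of $2 \times 2$ matrices with entries from $\mathbb{Z}$, namely, those $2 \times 2$ matrices $A$ such that $\det(A) = \pm 1$, also called unimodular matrices. (Note that, in general, $\mathrm{GL}(n, \mathbb{Z})$ is the general linear group of $n \times n$ matrices with entries from $\mathbb{Z}$.)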

2.6. With reference to Exercise 2.2, prove that if $\alpha \in R$, and $I = (a, \alpha)$ is an $R$-ideal, then $I = (a, na \pm \alpha)$ for any $n \in \mathbb{Z}$.

2.7. Let $F$ be a quadratic number field and let $P = (p, (b + \sqrt{\Delta_F})/2)$ be an $\mathcal{O}_F$-ideal where $p \in \mathbb{N}$ is prime. Prove that $P$ is a prime $\mathcal{O}_F$-ideal.

2.8. Verify the multiplication formulas on page 59.


2.2 Dedekind Domains

I can’t cut this steak, he confided
To the waiter who simply recited,
Your prime cut of course
Is as tough as a horse
Since you can’t take a prime and divide it.

From Mathematical Conversation Starters (2002); see [22, p. 221]

John dePillis, American mathematician at U.C. Riverside

In §1.3 we discussed unique factorization of elements in integral domains and looked at applications thereof in §1.4. In §2.1 we introduced the notion of prime ideals, and so the question of unique factorization of ideals in integral domains naturally arises. In particular, at the end of §2.1, we talked about the validity of the converse of Lemma 2.1 on page 58 in certain domains, which is the topic of this section. In order to discuss this topic, we must prepare the stage with some essential topics. First of all there are types of ideals which are core to the theory, and to which we were introduced in [68, Definition A.21, p. 305].

Definition 2.6 Maximal Ideals

In an integral domain $R$, an ideal $M$ is called maximal if it satisfies the property that whenever $M \subseteq I \subseteq R$, for any $R$-ideal $I$, then either $I = R$ or $I = M$.

The next concept is necessary to prove our first result about maximal ideals. Note first that if $I$, $J$ are $R$-ideals, then $I + J$ is necessarily an $R$-ideal since for any $r \in R$, $r(\alpha + \beta) \in I + J$ by Definition 2.1 on page 55. We formalize this in the following.

Definition 2.7 Sums of Ideals are Ideals

If $R$ is a commutative ring with identity, and $I$, $J$ are $R$-ideals, then

\[ I + J = \{\alpha + \beta : \alpha \in I,\ \beta \in J\} \]

is an ideal in $R$.

We use the above to prove our first result that we need to link maximality with primality.

Theorem 2.5 Quotients of Prime Ideals are Integral Domains

If $R$ is an integral domain, then an $R$-ideal $P$ is prime if and only if $R/P$ is an integral domain.

Proof. We note that $R/P$ is a commutative ring with identity $1_R + P$ and additive identity $0 + P$. If $0 + P = 1_R + P$, then $P = R$, contradicting that $P$ is prime. If

\[ (\alpha + P)(\beta + P) = P, \]

then $\alpha\beta + P = P$, so $\alpha\beta \in P$. Since $P$ is prime, then either $\alpha \in P$ or $\beta \in P$. In other words, either $\alpha + P = P$ or $\beta + P = P$. We have shown that $R/P$ has no zero divisors, namely that it is an integral domain.

Conversely, if $R/P$ is an integral domain, then $1_R + P \neq P$, so $1_R \notin P$, consequently $P \neq R$. Since $R/P$ has no zero divisors, then $\alpha\beta \in P$ implies that

\[ \alpha\beta + P = P, \]

namely

\[ (\alpha + P)(\beta + P) = P. \]

Thus, either $\alpha + P = P$ or $\beta + P = P$. In other words, either $\alpha \in P$ or $\beta \in P$, so $P$ is a prime $R$-ideal. □

Now we link prime ideals with maximal ones.

Theorem 2.6 Maximal ideals are Prime

If R is an integral domain, then every nonzero maximal R-ideal is prime.

Proof. If $M \neq (0)$ is a maximal $R$-ideal, and $M \mid (\alpha)(\beta)$ for some $\alpha, \beta \in R$, with $M$ dividing neither factor, then by Definition 2.7 on the preceding page, $M + (\alpha)$ and $M + (\beta)$ are $R$-ideals, both of which properly contain $M$, so $M \neq R$. Hence, by the maximality of $M$, we have

\[ M + (\alpha) = R = M + (\beta). \]

Therefore,

\[ M \subset R = R^2 = (M + (\alpha))(M + (\beta)) \subseteq M^2 + (\alpha)M + (\beta)M + (\alpha)(\beta) \subseteq M, \]

a contradiction. We have shown that either $M \mid (\alpha)$ or $M \mid (\beta)$. Therefore, by Corollary 2.1 on page 58, $M$ is prime. □

The next result tells us when an ideal is maximal with respect to quotients in general.

Theorem 2.7 Fields and Maximal ideals

If $R$ is an integral domain, then $M$ is a maximal $R$-ideal if and only if $R/M$ is a field.

Proof. First we need the following fact.

Claim 2.1 $R$ is a field if and only if the only ideals in $R$ are $(0)$ and $R$.

If $R$ is a field and $I \neq (0)$ is an $R$-ideal, then there exists a nonzero element $\alpha \in I$. However, since $R$ is a field, then there exists an inverse $\alpha^{-1} \in R$ of $\alpha$. By Definition 2.1 on page 55, $\alpha\alpha^{-1} = 1_R \in I$, so $I = R$.

Conversely, suppose that the only $R$-ideals are $(0)$ and $R$. If $\alpha \in R$ is nonzero, let $(\alpha) = \alpha R = I$. By hypothesis, $I = R$. Thus, there exists a $\beta \in R$ such that $\beta\alpha = 1_R$, so $\alpha$ is a unit. However, $\alpha$ was chosen as an arbitrary nonzero element in $R$, so $R$ is a field. This is Claim 2.1.

Suppose that $R/M$ is a field for a given $R$-ideal $M$. If $M \subseteq I \subseteq R$ for an $R$-ideal $I$, then $I/M$ is an ideal of $R/M$, so by Claim 2.1, $I/M = (0)$ or $I/M = R/M$. In other words, either $I = M$ or $I = R$, namely $M$ is maximal.

Conversely, if $M$ is maximal, then by Theorem 2.6, either $M = (0)$ or $M$ is prime. If $M = (0)$, then $R/(0) \cong R$ is a field by Claim 2.1 given that $(0)$ is maximal so $R$ has no proper ideals. If $M$ is prime, then by Theorem 2.5 on page 67, $R/M$ is an integral domain. Thus, it remains to show that all nonzero elements of $R/M$ have multiplicative inverses, namely that if $\alpha + M \neq M$, then $\alpha + M$ has a multiplicative inverse in $R/M$. Given $\alpha + M \neq M$, then $\alpha \notin M$. Thus, $M$ is properly contained in the ideal $(\alpha) + M$. Hence, $(\alpha) + M = R$. In other words,

\[ 1_R = m + r\alpha \]

for some $m \in M$ and $r \in R$. Therefore, $1_R - r\alpha = m \in M$, so

\[ 1_R + M = r\alpha + M = (r + M)(\alpha + M), \]

namely $r + M$ is a multiplicative inverse of $\alpha + M$ in $R/M$, so $R/M$ is a field. □

Example 2.7 If $R = \mathbb{Z}/n\mathbb{Z}$, where $n \in \mathbb{N}$, a ring we studied in [68, pp. 79 ff.], then $\mathbb{Z}/n\mathbb{Z}$ is a field if and only if $n$ is prime. Hence, $n\mathbb{Z}$ is a maximal ideal in $\mathbb{Z}$ if and only if $n$ is prime; see [68, Theorem 2.2, p. 81].
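For small moduli, the statement of Example 2.7 can be checked by brute force. The sketch below simply tests whether every nonzero residue class has a multiplicative inverse; the function name `is_field` is illustrative, and the search is only practical for small $n$.

```python
def is_field(n):
    """True when every nonzero class in Z/nZ has a multiplicative inverse."""
    if n < 2:
        return False
    return all(any(a * b % n == 1 for b in range(1, n)) for a in range(1, n))

# Moduli below 20 for which Z/nZ is a field; by Theorem 2.7 these are
# exactly the n for which nZ is a maximal ideal of Z.
fields = [n for n in range(2, 20) if is_field(n)]
print(fields)
```

The list produced consists of the primes below 20, in line with the example.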

Example 2.8 Let $F$ be a field, let $r \in F$ be a fixed nonzero element, and let

\[ I = \{f(x) \in F[x] : f(r) = 0\}. \]

We now demonstrate that $I$ is a maximal ideal in $F[x]$. First, we show that $I$ is indeed an ideal in $F[x]$. If $g(x) \in F[x]$, then for any $f(x) \in I$, $g(r)f(r) = 0$, so $g(x)f(x) \in I$, and clearly $f(r) + h(r) = 0$ whenever $f(x), h(x) \in I$, which shows that $I$ is an $F[x]$-ideal. In fact, $I = \ker(\phi)$, where $\phi$ is the natural map

\[ \phi : F[x] \mapsto F[x]/I, \]

so $I$ is maximal and

\[ F \cong F[x]/I; \]

see [68, Example A.8, p. 305].


Remark 2.3 A few comments on the notion of finite generation are in order before we proceed. By Definition 1.4 on page 2 and Claim 1.1 on page 3 in the proof of Theorem 1.1, we know that for any number field $F$, $\mathcal{O}_F$ is finitely generated as a $\mathbb{Z}$-module. Thus, any $\mathcal{O}_F$-ideal $I$ will have a representation as

\[ I = (\alpha_1, \alpha_2, \ldots, \alpha_d) \text{ with } \alpha_j \in \mathcal{O}_F \text{ for } j = 1, 2, \ldots, d, \]

and we say that $I$ is finitely generated. In the instance where $d = 1$, we are in the case of Definition 2.2 on page 56, namely a principal ideal.

We also need the following notion in order to complement Definition 1.8 on page 9.

Definition 2.8 Integral Closure

If $R \subseteq S$ where $R$ and $S$ are integral domains, then $R$ is said to be integrally closed in $S$ if each element of $S$ that is integral over $R$ is actually in $R$.

Example 2.9 The integral domain $\mathbb{Z}$ is integrally closed in $\mathbb{Q}$, but not in $\mathbb{C}$ since $\sqrt{-1} \in \mathbb{C}$ is integral over $\mathbb{Z}$. However, $\mathbb{Q}$ is an instance of the following notion that is also of interest to us here.

Definition 2.9 Field of Quotients

If $D$ is an integral domain, then the field $F$ consisting of all elements of the form $\alpha\beta^{-1}$ for $\alpha, \beta \in D$ with $\beta \neq 0$ is called the field of quotients or simply the quotient field of $D$.

Example 2.10 If $F$ is any field, then the quotient field of the polynomial domain $F[x]$ is the field $F(x)$ of rational functions in $x$. Moreover, the quotient field of $\mathbb{Z}$ is $\mathbb{Q}$. Indeed, the following result shows that the quotient field of $\mathcal{O}_F$ for any number field $F$ is $F$.

Theorem 2.8 Quotient Fields of Number Rings

If F is a number field, then the quotient field of OF is F .

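Proof. Let

\[ K = \{\alpha\beta^{-1} : \alpha, \beta \in \mathcal{O}_F,\ \beta \neq 0\}, \]

which is the quotient field of $\mathcal{O}_F$. Suppose that $\gamma = \alpha\beta^{-1} \in K$. Since $\mathcal{O}_F \subseteq F$, then $\gamma \in F$, so $K \subseteq F$. Now if $\gamma \in F$, then by Lemma 1.1 on page 9, $\gamma = \alpha/\ell$ where $\alpha \in \mathbb{A}$ and $\ell \in \mathbb{Z}$. However, since

\[ \alpha = \gamma\ell \in F \cap \mathbb{A} = \mathcal{O}_F \]

by Definition 1.5 on page 4, then $\alpha \in \mathcal{O}_F$ and $\gamma = \alpha\ell^{-1} \in K$, so $F \subseteq K$. Hence, $K = F$. □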


Remark 2.4 It can easily be shown that if $D$ is an integral domain and $F$ is its field of quotients, then there is an isomorphic copy of $D$ in $F$: just consider $D_1 = \{\alpha \cdot 1^{-1} = \alpha \cdot 1 : \alpha \in D\} \subseteq F$. We merely identify $D_1$ with $D$ and consider $D$ as a subdomain of $F$.

Now we are in a position to define the main topic of this section; see Biography 1.2 on page 46.

Definition 2.10 Dedekind Domains

A Dedekind Domain is an integral domain R satisfying the following properties.

(A) Every ideal of R is finitely generated.

(B) Every nonzero prime R-ideal is maximal.

(C) R is integrally closed in its quotient field F .

Remark 2.5 Condition (C) says that if $\alpha/\beta \in F$ is the root of some monic polynomial over $R$, then $\alpha/\beta \in R$, namely $\beta \mid \alpha$ in $R$.

The following is crucial in the sequel.

Definition 2.11 Ascending Chain Condition (ACC)

An integral domain $R$ is said to satisfy the ascending chain condition (ACC) if every chain of $R$-ideals

\[ I_1 \subseteq I_2 \subseteq \cdots \subseteq I_n \subseteq \cdots \]

terminates, meaning that there is an $n_0 \in \mathbb{N}$ such that $I_n = I_{n_0}$ for all $n \ge n_0$.

Remark 2.6 An equivalent way of stating the ACC is to say that $R$ does not possess an infinite strictly ascending chain of ideals.

The above is a segue to the following important notion that will carry us forward towards our goals; see Biography 2.1 on page 73.

Definition 2.12 Noetherian Domains

An integral domain R possessing the ACC is called a Noetherian Domain.

Lemma 2.2 Finite Generation and Noetherian Domains

If $R$ is an integral domain, then $R$ is a Noetherian Domain if and only if every $R$-ideal is finitely generated.


Proof. Suppose that every $R$-ideal is finitely generated. Let

\[ I_1 \subseteq I_2 \subseteq \cdots \subseteq I_n \subseteq \cdots \]

be an ascending chain of ideals. It follows from Exercise 1.2 on page 16 that

\[ I = \bigcup_{j=1}^{\infty} I_j \]

is an $R$-ideal, and since any $R$-ideal is finitely generated, then there exist $\alpha_j \in R$ for $j = 1, 2, \ldots, d \in \mathbb{N}$ such that

\[ I = (\alpha_1, \alpha_2, \ldots, \alpha_d). \]

Therefore, for each $j = 1, 2, \ldots, d$, there exists a $k_j$ such that $\alpha_j \in I_{k_j}$. Let

\[ n = \max\{k_1, k_2, \ldots, k_d\}. \]

Then since $I_n \subseteq I$ and $I_{k_j} \subseteq I_n$, given that $k_j \le n$ for each such $j$, we have

\[ (\alpha_1, \alpha_2, \ldots, \alpha_d) \subseteq I_n, \]

which implies that $I \subseteq I_n$. Hence,

\[ I_n = \bigcup_{j=1}^{\infty} I_j \]

and so $I_n = I_j$ for each $j \ge n$. Since the chain terminates, $R$ satisfies the ACC so is a Noetherian domain.

Conversely, suppose that $R$ is a Noetherian domain. If $I$ is an $R$-ideal that is not finitely generated, then $I \neq (0)$, so there exists $\alpha_1 \in I$ with $\alpha_1 \neq 0$, and $(\alpha_1) \subset I$. Since $I \neq (\alpha_1)$, given that the former is not finitely generated, then there exists $\alpha_2 \in I$ with $\alpha_2 \notin (\alpha_1)$, so we have

\[ (\alpha_1) \subset (\alpha_1, \alpha_2) \subset I. \]

Continuing inductively in this fashion, we get the strictly ascending chain of ideals,

\[ (\alpha_1) \subset (\alpha_1, \alpha_2) \subset \cdots \subset (\alpha_1, \alpha_2, \ldots, \alpha_n) \subset \cdots \subset I, \]

which contradicts that $R$ is a Noetherian domain. Hence, every $R$-ideal is finitely generated. □
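The chain construction in the proof of Lemma 2.2 can be watched in the familiar ring $\mathbb{Z}$. The sketch below assumes the standard fact that in $\mathbb{Z}$ the ideal $(a_1, \ldots, a_k)$ equals $(\gcd(a_1, \ldots, a_k))$, so each ideal in the chain is recorded by a single nonnegative generator. The function name `generator_chain` and the sample integers are illustrative.

```python
from itertools import accumulate
from math import gcd

def generator_chain(elements):
    """Generators g_k of the ideals (a_1, ..., a_k) in Z, using (a, b) = (gcd(a, b))."""
    return list(accumulate(elements, gcd))

# Adjoin generators one at a time: (360) <= (360, 84) <= (360, 84, 30) <= ...
chain = generator_chain([360, 84, 30, 45, 7, 12])
print(chain)   # [360, 12, 6, 3, 1, 1]
```

Each strict inclusion replaces the generator by a proper divisor, which can happen only finitely often, so the chain stabilizes (here at $(1) = \mathbb{Z}$), as the ACC requires.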

Corollary 2.2 If F is a number field, then OF is a Noetherian domain.

Proof. This follows from Remark 2.3 on page 70 and Lemma 2.2. □

Corollary 2.3 Let $R$ be a Noetherian domain. Then every nonempty subset of $R$-ideals contains a maximal element.

Proof. Let $T$ be the set of ideals with the property that for every ideal $I$ of $T$, there exists an ideal $J$ of $T$ with $I \subset J$. If $T \neq \emptyset$, then by its definition we may construct an infinite strictly ascending chain of ideals in $T$, contradicting Lemma 2.2 on the preceding page. This is the result. □

Immediate from Corollary 2.3 is the following result.


Corollary 2.4 In a Noetherian domain $R$, every proper $R$-ideal is contained in a maximal $R$-ideal.

Remark 2.7 Given Lemma 2.2, Condition (A) of Definition 2.10 may be replaced by the condition that $R$ is a Noetherian domain.

Biography 2.1 Emmy Amalie Noether (1882–1935) was born in Erlangen, Bavaria, Germany on March 23, 1882. She studied there in her early years and, in 1900, received certification to teach English and French in Bavarian girls’ schools. However, she chose a more difficult route, for a woman of that time, namely to study mathematics at university. Women were required to get permission to attend a given course by the professor teaching it. She did this at the University of Erlangen from 1900 to 1902, and passed her matriculation examination in Nürnberg in 1903, after which she attended courses at the University of Göttingen from 1903 to 1904. By 1907, she was granted a doctorate from the University of Erlangen. By 1909, her published works gained her enough notoriety to receive an invitation to become a member of the Deutsche Mathematiker-Vereinigung, and in 1915, she was invited back to Göttingen by Hilbert and Klein. However, it took until 1919 for the university, grudgingly, to grant her habilitation and permit her to be on the faculty. In that year she proved a result in theoretical physics, now known as Noether’s Theorem, praised by Albert Einstein as a penetrating result, which laid the foundations for many aspects of his general theory of relativity. After this, she worked in ideal theory, developing ring theory to be of core value in modern algebra. Her work Idealtheorie in Ringbereichen, published in 1921, helped cement this value. In 1924, B.L. van der Waerden published his work Moderne Algebra, the second volume of which largely consists of Noether’s results. Her most successful collaboration was in 1927 with Helmut Hasse and Richard Brauer on noncommutative algebra. She was recognized for her mathematical achievements through invitations to address the International Mathematical Congress, the last at Zurich in 1932. Despite this, she was dismissed from her position at the University of Göttingen in 1933 due to the Nazi rise to power given that she was Jewish. She fled Germany in that year and joined the faculty at Bryn Mawr College in the U.S.A. She died at Bryn Mawr on April 14, 1935. She was buried in the Cloisters of the Thomas Great Hall on the Bryn Mawr campus.

One of our main goals is the following result that leads us toward a unique factorization theory for ideals in rings of algebraic integers.

Theorem 2.9 Rings of Integers are Dedekind Domains

If F is an algebraic number field, then OF is a Dedekind domain.

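Proof. By Corollary 2.2 (in view of Remark 2.7), condition (A) of Definition 2.10 is satisfied. In order to verify condition (B), we require some results as follows.

Assume that there is a prime $\mathcal{O}_F$-ideal $P \neq (0)$ that is not maximal. Therefore, the set $S \neq \emptyset$, where $S$ is the set of all proper $\mathcal{O}_F$-ideals that strictly contain $P$. By Corollary 2.3, there is a maximal ideal $M \in S$ such that $P \subset M \subset \mathcal{O}_F$. By Theorem 2.6 on page 68, $M$ is a prime $\mathcal{O}_F$-ideal. By Theorem 2.1 on page 56, there exists a nonzero $a \in P \cap \mathbb{Z}$. By Exercise 1.2 on page 16, $P \cap \mathbb{Z}$ is a $\mathbb{Z}$-ideal. Suppose that $ab \in P \cap \mathbb{Z}$, where $a, b \in \mathbb{Z}$. Since $P$ is a prime $\mathcal{O}_F$-ideal, then $a \in P$ or $b \in P$ so $a \in P \cap \mathbb{Z}$ or $b \in P \cap \mathbb{Z}$, which means that $P \cap \mathbb{Z}$ is a prime $\mathbb{Z}$-ideal. If $p \in P \cap \mathbb{Z}$ is a rational prime, then

\[ (p) \subseteq P \cap \mathbb{Z} \]

and $(p)$ is a maximal $\mathbb{Z}$-ideal by Theorem 2.7 on page 68 since $\mathbb{Z}/(p)$ is a field by Example 2.7 on page 69. Hence, since $P \cap \mathbb{Z} \neq \mathbb{Z}$, then $(p) = P \cap \mathbb{Z}$. However,

\[ (p) = P \cap \mathbb{Z} \subseteq M \cap \mathbb{Z} \subset \mathbb{Z}, \]

where $1 \notin M$, so

\[ (p) = P \cap \mathbb{Z} = M \cap \mathbb{Z}. \]

Since $M \in S$, then $P \neq M$, so there exists an $\alpha \in M$ such that $\alpha \notin P$. Consider

\[ m_{\alpha,\mathbb{Q}}(x) = x^d + a_{d-1}x^{d-1} + \cdots + a_1 x + a_0 \in \mathbb{Z}[x] \text{ for some } d \in \mathbb{N}. \]

Then $m_{\alpha,\mathbb{Q}}(\alpha) \in P$. Now define $\ell \in \mathbb{N}$ to be the least value for which there exist integers $b_j$ such that

\[ \alpha^{\ell} + b_{\ell-1}\alpha^{\ell-1} + \cdots + b_1\alpha + b_0 \in P, \tag{2.8} \]

for $j = 0, 1, \ldots, \ell - 1$. Since $\alpha \in M$, then by properties of ideals,

\[ \alpha(\alpha^{\ell-1} + b_{\ell-1}\alpha^{\ell-2} + \cdots + b_1) \in M. \]

Also, since $m_{\alpha,\mathbb{Q}}(\alpha) \in P \subset M$, then, again by properties of ideals,

\[ m_{\alpha,\mathbb{Q}}(\alpha) - \sum_{j=1}^{\ell-1} \alpha^j b_j - \alpha^{\ell} = b_0 \in M, \tag{2.9} \]

so $b_0 \in M \cap \mathbb{Z} = P \cap \mathbb{Z}$. If $\ell = 1$, then $\alpha \in P$, a contradiction, so $\ell > 1$. Thus, by (2.8)–(2.9),

\[ \alpha^{\ell} + b_{\ell-1}\alpha^{\ell-1} + \cdots + b_1\alpha + b_0 - b_0 = \alpha(\alpha^{\ell-1} + b_{\ell-1}\alpha^{\ell-2} + \cdots + b_1) \in P. \]

However, since $P$ is prime and $\alpha \notin P$, then

\[ \alpha^{\ell-1} + b_{\ell-1}\alpha^{\ell-2} + \cdots + b_1 \in P, \]

contradicting the minimality of $\ell > 1$. We have shown $S = \emptyset$, which establishes that condition (B) of Definition 2.10 holds.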


It remains to show that condition (C) holds. By Theorem 2.8 on page 70, $\mathcal{O}_F$ has quotient field $F$. Let $\alpha \in F$ be integral over $\mathcal{O}_F$. Also, $\mathcal{O}_F$ is integral over $\mathbb{Z}$ (see Remark 1.5 on page 9), so $\alpha$ is an algebraic integer in $F$. However, by Definition 1.5 on page 4, $F \cap \mathbb{A} = \mathcal{O}_F$, so $\alpha \in \mathcal{O}_F$, which means that $\mathcal{O}_F$ is integrally closed and we have condition (C) that establishes the entire result. □

Now we aim at the main goal of this section, which is a unique factorization theorem for rings of integers. To this end, we first settle conditions for which the converse of Lemma 2.1 on page 58 holds. First, we require a more general notion of “ideal” in order to proceed.

Definition 2.13 Fractional Ideals

Suppose that $R$ is an integral domain with quotient field $F$. Then a nonempty subset $I$ of $F$ is called a fractional $R$-ideal if it satisfies the following three properties.

1. For any $\alpha, \beta \in I$, $\alpha + \beta \in I$.

2. For any $\alpha \in I$ and $r \in R$, $r\alpha \in I$.

3. There exists a nonzero $\gamma \in R$ such that $\gamma I \subseteq R$.

When $I \subseteq R$, we call $I$ an integral $R$-ideal (which is the content of Definition 2.1 on page 55) to distinguish it from the more general fractional ideal.

Remark 2.8 It is immediate from Definition 2.13 that if $I$ is a fractional $R$-ideal, then there exists a nonzero $\gamma \in R$ such that $\gamma I = J$ where $J$ is an integral $R$-ideal. Hence, if $R$ is a Noetherian domain, then by Lemma 2.2 on page 71, there exist $\alpha_1, \alpha_2, \ldots, \alpha_d$ for some $d \in \mathbb{N}$ such that $J = (\alpha_1, \ldots, \alpha_d)$. Hence,

\[ I = \frac{1}{\gamma} J = \left(\frac{\alpha_1}{\gamma}, \frac{\alpha_2}{\gamma}, \ldots, \frac{\alpha_d}{\gamma}\right) \]

is also finitely generated. Indeed, in a Noetherian domain, a fractional $R$-ideal is the same as a finitely-generated $R$-submodule of the quotient field of $R$.

Example 2.11 Let $R = \mathbb{Z}$, and $F = \mathbb{Q}$. Then the fractional $R$-ideals are the sets

\[ I_q = q\mathbb{Z} \text{ for } q \in \mathbb{Q}^+. \]

Since $q\mathbb{Z} = (-q)\mathbb{Z}$, we may restrict attention to the positive rationals $\mathbb{Q}^+$ without loss of generality. Also,

\[ I_{q_1} I_{q_2} = q_1 q_2 \mathbb{Z} = I_{q_1 q_2}. \]

We have the isomorphism

\[ S = \{I_q : q \in \mathbb{Q}^+\} \cong \mathbb{Q}^+, \]

as multiplicative groups. The unit element of $S$ is $\mathbb{Z}$ and the inverse element of $I_q \in S$ is $(I_q)^{-1} = q^{-1}\mathbb{Z}$. (See Exercise 2.18 on page 86.)
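The group structure in Example 2.11 is easy to experiment with. In the sketch below, a fractional ideal $I_q = q\mathbb{Z}$ is represented by its positive rational generator $q$, using Python's `fractions.Fraction`; the function names `product`, `inverse`, and `is_integral` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def product(q1, q2):
    """Generator of I_{q1} I_{q2} = q1*q2*Z, normalized to be positive."""
    return abs(q1 * q2)

def inverse(q):
    """Generator of (I_q)^{-1} = q^{-1} Z."""
    return 1 / q

def is_integral(q):
    """I_q is an integral ideal (contained in Z) exactly when q is an integer."""
    return q.denominator == 1

q = Fraction(6, 35)
print(product(q, inverse(q)))          # 1, i.e. I_q (I_q)^{-1} = Z
print(is_integral(q))                  # False: (6/35)Z is not contained in Z
print(is_integral(q.denominator * q))  # True: 35 * I_q = 6Z lies in Z
```

The last line illustrates condition 3 of Definition 2.13, with the denominator of $q$ playing the role of the nonzero element $\gamma$.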


Example 2.11 motivates the following.

Theorem 2.10 Inverse Fractional Ideals

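If $R$ is an integral domain with quotient field $F$, and $I$ is a fractional $R$-ideal, then the set

\[ I^{-1} = \{\alpha \in F : \alpha I \subseteq R\} \]

is a nonzero fractional $R$-ideal.

Proof. If $\alpha, \beta \in I^{-1}$, then $\alpha I \subseteq R$ and $\beta I \subseteq R$, so

\[ (\alpha + \beta)I \subseteq \alpha I + \beta I \subseteq R, \]

so $\alpha + \beta \in I^{-1}$. If $\alpha \in I^{-1}$ and $r \in R$, then $\alpha I \subseteq R$ so $r\alpha I \subseteq R$, which implies $r\alpha \in I^{-1}$. Lastly, let $\gamma$ be a nonzero element of $I$. Then for any $\alpha \in I^{-1}$, $\alpha I \subseteq R$, so in particular, $\gamma\alpha \in R$. Hence, $\gamma I^{-1} \subseteq R$. This satisfies all three conditions in Definition 2.13. □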

Definition 2.14 Invertible Fractional Ideals

In an integral domain R, a fractional R-ideal I is called invertible if

$$I I^{-1} = R,$$

where $I^{-1}$, given in Theorem 2.10, is called the inverse of I.

Now we may return to Dedekind domains and the pertinence of the above to them.

Theorem 2.11 Invertibility in Dedekind Domains

If R is a Dedekind domain, then every nonzero integral R-ideal is invertible.

Proof. Since R is a Dedekind domain, every R-ideal I is finitely generated, so for $I \neq (0)$, there are $\alpha_j \in R$ for $1 \leq j \leq d$ such that $I = (\alpha_1, \alpha_2, \ldots, \alpha_d)$. If $d = 1$, then $I^{-1} = (\alpha_1^{-1})$ and $I I^{-1} = R$. Now the result may be extrapolated by induction, and the result is established. □

Via the above, we are in a position to provide the promised converse of Lemma 2.1 on page 58.

Corollary 2.5 To Divide is the Same as to Contain

If R is a Dedekind domain, and I, J are R-ideals, then

$$I \mid J \text{ if and only if } J \subseteq I.$$

Proof. In view of Lemma 2.1, we need only prove one direction. Suppose that

$$J \subseteq I. \qquad (2.10)$$

Now let $H = I^{-1} J$, in which case $J = IH$ where H is an R-ideal since by (2.10),

$$I^{-1} J \subseteq I I^{-1} = R,$$

where the equality follows from Theorem 2.11. Thus, $I \mid J$, and we have secured the result. □

As a consequence of Corollary 2.5, we see that a prime R-ideal P in a Dedekind domain R satisfies the same property as prime elements in $\mathbb{Z}$.

Corollary 2.6 Suppose that R is a Dedekind domain. Then P is a prime R-ideal if it satisfies the property that for any R-ideals I, J,

$$P \mid IJ \text{ if and only if } P \mid I \text{ or } P \mid J.$$

Proof. By Corollary 2.5, $P \mid IJ$ if and only if $IJ \subseteq P$, and the latter holds, by (2.1), if and only if $I \subseteq P$ or $J \subseteq P$, so applying Corollary 2.5 to the latter we get the result. □

Also, we have the following result that mimics the same law for nonzero elements of $\mathbb{Z}$.

Corollary 2.7 Cancellation Law for Ideals in Dedekind Domains

Let R be a Dedekind domain. If I, J, L are R-ideals with $I \neq (0)$, and $IJ \subseteq IL$, then $J \subseteq L$.

Proof. If $IJ \subseteq IL$, then by Theorem 2.11,

$$J = RJ = I^{-1} I J \subseteq I^{-1} I L = RL = L,$$

as required. □

Now we are ready for the promised unique factorization result.

Theorem 2.12 Unique Factorization of Ideals

Every proper nonzero ideal in a Dedekind domain R is uniquely representable as a product of prime ideals. In other words, any such R-ideal I has a unique expression (up to order of the factors) of the form

$$I = P_1^{a_1} P_2^{a_2} \cdots P_n^{a_n},$$

where the $P_j$ are the distinct prime R-ideals containing I, and $a_j \in \mathbb{N}$ for $j = 1, 2, \ldots, n$.


Proof. First we must show existence. In other words, we must show that every ideal is indeed representable as a product of primes. Let S be the set of all nonzero proper ideals that are not so representable. If $S \neq \emptyset$, then by Corollary 2.3 on page 72, S has a maximal member M. Thus, M is not a prime R-ideal, but by Corollary 2.4, $M \subseteq P$ where P is maximal, and so prime by Theorem 2.6 on page 68. Hence, $R \subseteq P^{-1} \subseteq M^{-1}$, which implies that

$$M \subseteq M P^{-1} \subseteq M M^{-1} = R,$$

where the equality follows from Theorem 2.11 on page 76. We have shown that $M P^{-1}$ is an integral R-ideal. If $P^{-1} M = M$, then

$$P P^{-1} M = P M \subseteq P,$$

where the latter inclusion comes from the fact that P is an ideal. Hence, $M = P$ by the maximality of P, a contradiction to $M \in S$. Thus, $M \subsetneq P^{-1} M$, so $P^{-1} M$ is an integral ideal not in S, which means there are prime ideals $P_j$ for $j = 1, 2, \ldots, d \in \mathbb{N}$ such that

$$P^{-1} M = P_1 P_2 \cdots P_d,$$

which implies

$$M = RM = P P^{-1} M = P P_1 P_2 \cdots P_d,$$

contradicting that $M \in S$. We have shown $S = \emptyset$, thereby establishing existence. It remains to show uniqueness of representation.

Let $P_j$ and $Q_k$ be (not necessarily distinct) prime R-ideals such that

$$P_1 \cdots P_r = Q_1 \cdots Q_s. \qquad (2.11)$$

Hence,

$$P_1 \supseteq Q_1 \cdots Q_s,$$

so $Q_j \subseteq P_1$ for some $j = 1, 2, \ldots, s$. Without loss of generality, we may assume that $j = 1$, by rearranging the $Q_j$ if necessary. However, by condition B of Definition 2.10, $P_1 = Q_1$. Multiplying both sides of (2.11) by $P_1^{-1}$, we get

$$P_2 \cdots P_r = Q_2 \cdots Q_s.$$

Continuing in this fashion, we see that by induction, $r = s$ and $P_j = Q_j$ for $1 \leq j \leq s = r$. □

In view of Theorem 2.12, we have an immediate consequence that is the primary goal sought in this section.

Corollary 2.8 If F is a number field, then every proper, nonzero $\mathcal{O}_F$-ideal is uniquely representable as a product of prime ideals.

Proof. By Theorem 2.9 on page 73, $\mathcal{O}_F$ is a Dedekind domain, so the result is a special case of Theorem 2.12. □


Example 2.12 In $R = \mathbb{Z}[\sqrt{10}]$ let us look at the unique factorization of the R-ideal (6) as a product of prime ideals. Note that

$$P = (2, \sqrt{10}), \quad Q = (3, 1 + \sqrt{10}), \quad \text{and} \quad Q' = (3, 1 - \sqrt{10})$$

are prime ideals in $\mathbb{Z}[\sqrt{10}] = \mathcal{O}_F$ for $F = \mathbb{Q}(\sqrt{10})$. The unique factorization of the principal ideal (6) is now apparent, and is left as an exercise for the reader, employing the multiplication formulas on page 59:

$$(6) = P^2 Q Q'.$$

We note that the element 6 in R does not have unique factorization since

$$6 = 2 \cdot 3 = (4 + \sqrt{10})(4 - \sqrt{10}),$$

where each factor is irreducible. Hence, unique factorization is restored at the ideal level by Dedekind’s contribution of the theory of ideals.
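The irreducibility claim in Example 2.12 can be checked with a short norm computation. The following is a minimal sketch, assuming the standard norm $N(x + y\sqrt{10}) = x^2 - 10y^2$ on $\mathbb{Z}[\sqrt{10}]$; the helper names `norm` and `mul` are illustrative and do not come from the text. The factors 2, 3, and $4 \pm \sqrt{10}$ have norms 4, 9, and 6, so a proper non-unit divisor of any of them would need norm $\pm 2$ or $\pm 3$. Since $x^2 - 10y^2 \equiv x^2 \pmod 5$, and neither 2 nor 3 is a square modulo 5, no such element exists.

```python
def norm(x, y):
    """Norm of x + y*sqrt(10) in Z[sqrt(10)]: x^2 - 10*y^2."""
    return x * x - 10 * y * y


def mul(u, v):
    """Multiply two elements of Z[sqrt(10)], each given as a pair (x, y)."""
    (x1, y1), (x2, y2) = u, v
    return (x1 * x2 + 10 * y1 * y2, x1 * y2 + x2 * y1)


# The two factorizations of 6 from the example.
product_a = mul((2, 0), (3, 0))
product_b = mul((4, 1), (4, -1))

# Norms of the four factors 2, 3, 4 + sqrt(10), 4 - sqrt(10).
norms = [norm(2, 0), norm(3, 0), norm(4, 1), norm(4, -1)]

# x^2 - 10*y^2 is congruent to x^2 modulo 5, so every norm is a square mod 5.
squares_mod_5 = {(x * x) % 5 for x in range(5)}
# A proper divisor of any of the four factors would need norm 2, -2, 3, or -3.
forbidden = {t % 5 for t in (2, -2, 3, -3)}
no_small_norms = squares_mod_5.isdisjoint(forbidden)

print(product_a, product_b, norms, no_small_norms)
```

The check is only an element-level argument; it says nothing about the ideal factorization $(6) = P^2 Q Q'$, which still requires the multiplication formulas cited in the example.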

The developments in this section allow us to now define gcd and lcm concepts for ideals that mimic those for rational integers.

Definition 2.15 A gcd and lcm for Ideals

If R is a Dedekind domain, and I, J are R-ideals, then

$$\gcd(I, J) = I + J,$$

and

$$\operatorname{lcm}(I, J) = I \cap J.$$

If $\gcd(I, J) = R$, then I and J are said to be relatively prime.

Remark 2.9 The notion of relative primality given in Definition 2.15 is the direct analogue of that for rational integers since $R = (1_R)$ is a principal ideal. This is of course what we mean in $\mathbb{Z}$ since the pair of integers can have no common divisors. Let us look at this directly.

If I, J are relatively prime, then

$$\gcd(I, J) = I + J = R.$$

If an R-ideal H divides both I and J, then by Corollary 2.5 on page 76, $I \subseteq H$ and $J \subseteq H$, so $I + J = R \subseteq H$, which means that $H = R$. Hence, the only R-ideal that can divide both I and J is $R = (1)$.

The next result is the exact analogue of the one for rational integers that we proved in [68, Theorem 1.13 (b), p.26].


Lemma 2.3 Product of the Ideal-Theoretic gcd and lcm

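If R is a Dedekind domain and I, J are R-ideals, then

$$\gcd(I, J) \cdot \operatorname{lcm}(I, J) = (I + J)(I \cap J) = IJ.$$

Proof. Since $I \cap J \subseteq J$ and $I \cap J \subseteq I$, we have $I(I \cap J) \subseteq IJ$ and $J(I \cap J) \subseteq IJ$. Thus,

$$(I \cap J)(I + J) = I(I \cap J) + J(I \cap J) \subseteq IJ.$$

Conversely, we may assume that I and J are nonzero, since otherwise both sides are (0). Set $G = I + J$. Since $J \subseteq G$, Theorem 2.11 gives $J G^{-1} \subseteq G G^{-1} = R$, so $IJG^{-1} \subseteq I$, and similarly $IJG^{-1} \subseteq J$. Hence, $IJG^{-1} \subseteq I \cap J$, and multiplying by G yields

$$IJ \subseteq (I \cap J)(I + J),$$

from which the desired equality follows. □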
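In the principal ideal domain $\mathbb{Z}$, Lemma 2.3 reduces to the familiar identity $\gcd(a, b) \cdot \operatorname{lcm}(a, b) = |ab|$, since $(a) + (b) = (\gcd(a, b))$ and $(a) \cap (b) = (\operatorname{lcm}(a, b))$. The sketch below illustrates that special case only; the function names are hypothetical labels for the ideal operations, with each ideal represented by its nonnegative generator.

```python
from math import gcd, lcm  # math.lcm requires Python 3.9 or later


def ideal_sum(a, b):
    """Nonnegative generator of (a) + (b) in Z, i.e. gcd((a), (b))."""
    return gcd(a, b)


def ideal_intersection(a, b):
    """Nonnegative generator of (a) ∩ (b) in Z, i.e. lcm((a), (b))."""
    return lcm(a, b)


def ideal_product(a, b):
    """Nonnegative generator of (a)(b) in Z."""
    return abs(a * b)


pairs = [(12, 18), (7, 30), (100, 75), (-8, 20)]
checks = [
    ideal_sum(a, b) * ideal_intersection(a, b) == ideal_product(a, b)
    for a, b in pairs
]
print(checks)
```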

The following exploits our unique factorization result to provide an analogue of the same result for rational integers that we proved in [68, Theorem 1.17, p.34].

Theorem 2.13 Prime Factorizations of gcd and lcm of Ideals

Suppose that R is a Dedekind domain and I, J are R-ideals with prime factorizations given via Theorem 2.12 by

$$I = \prod_{j=1}^{r} P_j^{a_j} \quad \text{and} \quad J = \prod_{j=1}^{r} P_j^{b_j},$$

where the $P_j$ are prime R-ideals with integers $a_j, b_j \geq 0$. Then

$$\gcd(I, J) = \prod_{j=1}^{r} P_j^{m_j} \quad \text{and} \quad \operatorname{lcm}(I, J) = \prod_{j=1}^{r} P_j^{M_j},$$

where $m_j = \min(a_j, b_j)$ and $M_j = \max(a_j, b_j)$, for each $j = 1, \ldots, r$.

Proof. Since $\gcd(I, J) = I + J$, then

$$\gcd(I, J) = \prod_{j=1}^{r} P_j^{a_j} + \prod_{j=1}^{r} P_j^{b_j} = \prod_{j=1}^{r} P_j^{m_j} \left( \prod_{j=1}^{r} P_j^{a_j - m_j} + \prod_{j=1}^{r} P_j^{b_j - m_j} \right).$$

However, for each j, one of $a_j - m_j$ or $b_j - m_j$ is zero, so the right hand sum is R since the two summands are relatively prime. In other words,

$$\gcd(I, J) = \prod_{j=1}^{r} P_j^{m_j},$$

as required. Now, by Lemma 2.3 on the preceding page, $(I \cap J)(I + J) = IJ$, so

$$IJ = \prod_{j=1}^{r} P_j^{a_j + b_j} = \prod_{j=1}^{r} P_j^{m_j} (I \cap J) = (I + J)(I \cap J),$$

so

$$\operatorname{lcm}(I, J) = I \cap J = \prod_{j=1}^{r} P_j^{a_j + b_j - m_j} = \prod_{j=1}^{r} P_j^{M_j},$$

and we have the complete result. □
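For $R = \mathbb{Z}$, Theorem 2.13 is the usual rule that the gcd and lcm are obtained by taking the minimum and maximum of the exponents prime by prime. A minimal sketch of that special case follows, assuming positive integer generators and a naive trial-division factorization; `factor`, `from_exponents`, and `gcd_lcm_by_exponents` are illustrative names, not anything defined in the text.

```python
from math import gcd, lcm  # math.lcm requires Python 3.9 or later


def factor(n):
    """Prime factorization of n >= 1 as {prime: exponent}, by trial division."""
    exps, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            exps[p] = exps.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        exps[n] = exps.get(n, 0) + 1
    return exps


def from_exponents(exps):
    """Rebuild the integer whose factorization is {prime: exponent}."""
    result = 1
    for p, e in exps.items():
        result *= p ** e
    return result


def gcd_lcm_by_exponents(a, b):
    """Compute (gcd, lcm) of positive integers via min/max of exponents."""
    fa, fb = factor(a), factor(b)
    primes = set(fa) | set(fb)
    mins = {p: min(fa.get(p, 0), fb.get(p, 0)) for p in primes}
    maxs = {p: max(fa.get(p, 0), fb.get(p, 0)) for p in primes}
    return from_exponents(mins), from_exponents(maxs)


print(gcd_lcm_by_exponents(360, 84))
```

Here $360 = 2^3 \cdot 3^2 \cdot 5$ and $84 = 2^2 \cdot 3 \cdot 7$, so the exponent rule gives $2^2 \cdot 3$ and $2^3 \cdot 3^2 \cdot 5 \cdot 7$.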

Remark 2.10 Theorem 2.13 tells us that, when R is a Dedekind domain, lcm(I, J) is actually the largest ideal contained in both I and J and gcd(I, J) is the smallest ideal containing both I and J.

The following allows us to compare unique factorization of elements with that of ideals and show where Dedekind’s contribution comes into play.

Definition 2.16 Irreducible Ideals, gcds and lcms

If R is an integral domain, then an R-ideal I is called irreducible if it satisfies the property that whenever an R-ideal $J \mid I$, then $J = I$ or $J = R$.

Theorem 2.14 Irreducible = Prime in Dedekind Domains

If R is a Dedekind domain, and I is an R-ideal, then I is irreducible if and only if I is a prime R-ideal.

Proof. Let I be irreducible and let J, K be R-ideals such that $I \mid JK$. Since $\gcd(I, J) \mid I$, then $\gcd(I, J) = I$ or $\gcd(I, J) = R$. If $\gcd(I, J) = I$, then $I + J = I$, which means that $J \subseteq I$, so $I \mid J$ by Corollary 2.5 on page 76.

Now suppose that $I \nmid J$. Then $\gcd(I, J) = R$, so there exist $\alpha \in I$ and $\beta \in J$ such that $\alpha + \beta = 1_R$. Therefore, given an arbitrary $\gamma \in K$,

$$\gamma = \gamma\alpha + \gamma\beta.$$

Since $I \mid JK$, then by Corollary 2.5 on page 76, $JK \subseteq I$, so $\beta\gamma \in I$ since $\beta\gamma \in JK$. However, $\alpha\gamma \in I$ so $\gamma \in I$. This shows that $K \subseteq I$, so by Corollary 2.5, we have that $I \mid K$. Hence, by Theorem 2.2 on page 57, I is prime.

Conversely, suppose that I is prime. If $I = HJ$ for some nontrivial R-ideals H and J, then either $I \mid H$ or $I \mid J$. If $I \mid H$, there is an R-ideal L such that $H = IL$. Therefore,

$$I = HJ = ILJ.$$

By Corollary 2.7 on page 77, $(1) = R = LJ$. Hence, $J = (1) = R$, so I is irreducible. □

The following is immediate from Theorem 2.14, and is the analogue of the definition of a rational prime.

Corollary 2.9 If R is a Dedekind domain, then I is a prime R-ideal if and only if it satisfies the property that

whenever $J \mid I$ for a proper R-ideal J then $I = J$.

Remark 2.11 It follows from Lemma 1.2 and Theorem 1.16 on page 38 that the failure of unique factorization in an integral domain R is the failure of irreducible elements to be prime in R. However, since Theorem 2.14 tells us that irreducible ideals are the same as prime ideals in a Dedekind domain, we have unique factorization restored at the ideal level via Theorem 2.12 on page 77. In particular, rings of integers $\mathcal{O}_F$ of number fields F have unique factorization of ideals since Theorem 2.9 on page 73 tells us that $\mathcal{O}_F$ is a Dedekind domain. Thus, the magnitude of Dedekind’s contribution is brought to light by this fact.

We need the following concept that is intimately linked to the notion of a UFD, especially when we are dealing with Dedekind domains (see Definition 1.20 on page 37).

Definition 2.17 Principal Ideal Domain (PID)

An integral domain R in which all ideals are principal is called a principal ideal domain, or PID for convenience.

Theorem 2.15 PIDs and Noetherian Domains

If R is a PID, then R is a Noetherian domain.

Proof. If we have a nested sequence of R-ideals

$$(\alpha_1) \subseteq (\alpha_2) \subseteq \cdots \subseteq (\alpha_j) \subseteq \cdots,$$

then it follows from Exercise 1.2 on page 16 that $\bigcup_{j=1}^{\infty} (\alpha_j)$ is an R-ideal. Thus, since R is a PID, there exists an $\alpha \in R$ such that

$$\bigcup_{j=1}^{\infty} (\alpha_j) = (\alpha),$$

so there exists an $n \in \mathbb{N}$ such that $\alpha \in (\alpha_n)$. Therefore,

$$(\alpha_j) = (\alpha_n) = (\alpha)$$

for all $j \geq n$. Thus, the ACC condition of Definition 2.11 on page 71 is satisfied and R is a Noetherian domain. □

Theorem 2.16 PIDs and UFDs

If R is a PID then R is a UFD.


Proof. Let S be the set of all ideals $(\alpha)$, with $\alpha \in R$ nonzero, such that $\alpha$ is not a product of irreducible elements. If $S \neq \emptyset$, then by Corollary 2.3 on page 72, via Theorem 2.15, S has a maximal element (m). Thus, (m) is a proper ideal (since a unit is vacuously a product of irreducible elements by Definition 1.19 on page 37). Therefore, (m) is contained in a maximal R-ideal (M) for some $M \in R$ by Corollary 2.4 on page 73, again via Theorem 2.15. Thus, $M \mid m$ and $(M) \neq (m)$ by Theorem 2.6 on page 68. Since M is a product of irreducible elements, there exists an $\alpha \mid m$ such that $\alpha$ is irreducible. Therefore, $m = \alpha\beta$ for some $\beta \in R$. If $\beta$ is a unit, then m is irreducible since associates of irreducibles are also irreducible, a contradiction. Hence, $\beta$ is not a unit. If $(\beta) \notin S$, then $\beta$ is a product of irreducibles, and so is m, a contradiction. Thus, $(\beta) \in S$. However, $\beta \mid m$, so $(m) \subseteq (\beta)$, by Corollary 2.5 on page 76. Also, $(m) \neq (\beta)$ since $\alpha$ is not a unit, given that it is irreducible. Hence, (m) is properly contained in $(\beta) \in S$, a contradiction to the maximality of (m) in S, so $S = \emptyset$. This establishes that all nonzero elements are expressible as a product of irreducible elements.

We may complete the proof by showing that all irreducible elements are prime and invoke Theorem 1.16 on page 38. Suppose that $r \in R$ is an irreducible element and $r \mid \alpha\beta$, $\alpha, \beta \in R$, with r not dividing $\alpha$. Then by the irreducibility of r, we must have that r and $\alpha$ are relatively prime, namely

$$R = (r) + (\alpha),$$

so there exist $s_1, s_2 \in R$ such that $1_R = r s_1 + \alpha s_2$. Therefore,

$$(\beta) = (r s_1 \beta + \alpha s_2 \beta) \subseteq (r),$$

since $r \mid \alpha\beta$ implies that $(r) \supseteq (\alpha\beta)$, so both $r s_1 \beta \in (r)$ and $\alpha s_2 \beta \in (r)$. In other words, $r \mid \beta$, so r is prime as required. □

Now we look at PIDs and UFDs in the case of Dedekind domains, which will be of value when we study binary quadratic forms in §3.2.

Theorem 2.17 UFDs are PIDs for Dedekind domains

If R is a Dedekind domain, then R is a UFD if and only if R is a PID.

Proof. In view of Theorem 2.16 on the preceding page, we need only prove that R is a PID when it is a UFD. Let R be a UFD. If there exists an R-ideal that is not principal, then by Theorem 2.12 on page 77, there exists a prime R-ideal P that is not principal. Let S consist of the set of all R-ideals I such that PI is principal. By Exercise 2.11 on page 85, $S \neq \emptyset$. By Remark 2.7 and Corollary 2.3 on page 72, S has a maximal element M. Let

$$PM = (\alpha).$$

If $\alpha = \beta\gamma$ where $\beta \in P$ is irreducible, then $(\beta) = PJ$ where J is an R-ideal such that $J \mid M$, so $J \supseteq M$. By the maximality of M, we have $J = M$, so $\gamma$ is a unit and $\alpha$ is irreducible. Since P is not principal, there is a nonzero $\delta \in P \setminus (\alpha)$, and since $M = (\alpha)$ would imply that $P = R$, there is a nonzero $\sigma \in M \setminus (\alpha)$. Thus,

$$\delta\sigma \in PM \subseteq (\alpha),$$

so $\alpha \mid \delta\sigma$. However, $\alpha$ divides neither $\delta$ nor $\sigma$, so $\alpha$ is not prime. This contradicts Theorem 1.16 on page 38. □

In view of Theorem 1.17 on page 39 and Theorem 2.17, it is now apparent why we introduced Euclidean domains in §1.3, where we were concerned with introducing the importance of the notion of unique factorization of algebraic integers.

We conclude this section with a result that is the analogue of [68, Theorem 1.22, p. 40]. The reader should be familiar with the basics on ring actions such as that covered in [68, pp. 303–305].

Theorem 2.18 Chinese Remainder Theorem for Ideals

Let R be a commutative ring with identity and let $I_1, \ldots, I_r$ be pairwise relatively prime ideals in R. Then the natural map

$$\psi : R \Big/ \bigcap_{j=1}^{r} I_j \to R/I_1 \times \cdots \times R/I_r$$

is an isomorphism.

The above statement is equivalent to saying that if $\beta_1, \beta_2, \ldots, \beta_r \in R$, there exists a $\beta \in R$ such that $\beta - \beta_j \in I_j$ for each $j = 1, 2, \ldots, r$, where $\beta$ is uniquely determined modulo $\bigcap_{j=1}^{r} I_j$. The latter means that

any $\gamma$ satisfying $\gamma - \beta_j \in I_j$ for each such j implies $\beta - \gamma \in \bigcap_{j=1}^{r} I_j$. (2.12)

Proof. Since $\psi(s) = 0$ if and only if $s \in \bigcap_{j=1}^{r} I_j$, then $\ker(\psi) = (0)$, since the $I_j$ are pairwise relatively prime. It remains to show that $\psi$ is a surjection. Let

$$\beta_1, \beta_2, \ldots, \beta_r \in R.$$

We must show that there is a $\beta \in R$ such that $\psi(\beta) = (\beta_1, \ldots, \beta_r)$. This is tantamount to saying: there is a $\beta \in R$ such that $\beta - \beta_k \in I_k$ for each k. Since $I_i + I_j = R$ for all $i \neq j$, then by induction

$$I_k + \bigcap_{j \neq k} I_j = R.$$

Thus, for each such k, there exist $\alpha_k \in I_k$ and $r_k \in \bigcap_{j \neq k} I_j$ such that

$$\beta_k = \alpha_k + r_k \text{ with } \beta_k - r_k \in I_k \text{ and } r_k \in I_j \text{ for all } j \neq k.$$

Set

$$\beta = \sum_{j=1}^{r} r_j.$$

Then

$$\beta - \beta_k = \sum_{j \neq k} r_j + (r_k - \beta_k) \in I_k,$$

as required. □
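The surjectivity argument is constructive, and it can be traced in the special case $R = \mathbb{Z}$ with $I_j = (n_j)$ for pairwise coprime integers $n_j$. The sketch below follows the proof under that assumption: for each k it picks $r_k$ in $\bigcap_{j \neq k} I_j$ with $r_k \equiv \beta_k \pmod{n_k}$ and sums the $r_k$. The function name `crt_from_proof` is hypothetical, and the use of a modular inverse to find $r_k$ is one convenient choice rather than the text's method.

```python
from math import gcd, prod  # math.prod requires Python 3.8 or later


def crt_from_proof(residues, moduli):
    """Return beta with beta ≡ residues[k] (mod moduli[k]) for every k.

    The moduli must be pairwise relatively prime positive integers.
    """
    for i in range(len(moduli)):
        for j in range(i + 1, len(moduli)):
            if gcd(moduli[i], moduli[j]) != 1:
                raise ValueError("moduli must be pairwise relatively prime")
    total = prod(moduli)  # generator of the intersection of all the ideals
    beta = 0
    for beta_k, n_k in zip(residues, moduli):
        # cofactor generates the intersection of the ideals (n_j), j != k.
        cofactor = total // n_k
        # r_k is a multiple of cofactor with r_k ≡ beta_k (mod n_k);
        # pow(x, -1, n) is the modular inverse (Python 3.8 or later).
        r_k = beta_k * cofactor * pow(cofactor, -1, n_k)
        beta += r_k
    return beta % total  # beta is unique modulo the intersection


print(crt_from_proof([2, 3, 2], [3, 5, 7]))
```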

Remark 2.12 In Theorem 2.18, we may use the notation

$$\gamma \equiv \beta_j \pmod{I_j}$$

to denote $\gamma - \beta_j \in I_j$. Then (2.12) becomes:

any $\gamma$ satisfying $\gamma \equiv \beta_j \pmod{I_j}$ for $1 \leq j \leq r$ implies $\beta \equiv \gamma \pmod{\bigcap_{j=1}^{r} I_j}$.

For more on this concept see Exercises 8.32–8.39 on pages 292–293.

Exercises

2.9. Let R be a Dedekind domain. If I, J are R-ideals, prove that there exists an $\alpha \in I$ such that $\gcd((\alpha), IJ) = I$.

2.10. Let R be a Dedekind domain, and let I, J, H be R-ideals. Prove that $I(J + H) = IJ + IH$.

2.11. Let R be a Dedekind domain and I, J nonzero R-ideals. Prove that there is an R-ideal H, relatively prime to J, such that HI is principal.

2.12. Prove that, in a Noetherian domain, every ideal can be represented as the intersection of a finite number of irreducible ideals.

2.13. A commutative ring R with identity is said to satisfy the descending chain condition, denoted by DCC for convenience, on ideals if every sequence $I_1 \supseteq I_2 \supseteq \cdots \supseteq I_j \supseteq \cdots$ of R-ideals terminates. In other words, there exists an $n \in \mathbb{N}$ such that $I_j = I_n$ for all $j \geq n$. Prove that R satisfies the DCC if and only if every nonempty collection of ideals contains a minimal element. (Rings of the above type are called Artinian rings; see Biography 2.2 on page 87.)

2.14. Let R be an integral domain with quotient field F. Prove that every invertible fractional R-ideal is a finitely generated R-module.

2.15. Let R, S be commutative rings with identity such that $R \subseteq S$, and $s \in S$. Prove that if s is integral over R, then R[s] is a finitely-generated R-module.

2.16. Let R be an integral domain with quotient field F. Prove that every nonzero finitely-generated submodule I of F is a fractional R-ideal.

2.17. Prove that in an integral domain R, the following are equivalent.

(a) Every nonzero fractional R-ideal is invertible.
(b) The set G of all fractional R-ideals forms a multiplicative group.

2.18. Prove that in an integral domain R, the following are equivalent.

(i) R is a Dedekind domain.
(ii) Every proper R-ideal is a unique product of a finite number of prime ideals (up to order of the factors), and each is invertible.
(iii) Every nonzero R-ideal is invertible.
(iv) Every fractional R-ideal is invertible.
(v) The set G of all fractional R-ideals forms a multiplicative abelian group.
(vi) R is an integrally closed, Noetherian domain, and every nonzero prime ideal is maximal.

(Hint: Use Exercises 2.14–2.17.)

2.19. Suppose that R is a Dedekind domain with quotient field F and I is an R-ideal. Also, we define $\operatorname{ord}_P(I) = a$ where $a \geq 0$ is the largest power of the prime ideal P dividing I, namely $P^a \mid I$ but $P^{a+1}$ does not divide I. The value $\operatorname{ord}_P(I)$ is called the order of I with respect to P. Prove the following.

(a) For R-ideals I, J, $\operatorname{ord}_P(IJ) = \operatorname{ord}_P(I) + \operatorname{ord}_P(J)$.
(b) For R-ideals I, J, $\operatorname{ord}_P(I + J) = \min(\operatorname{ord}_P(I), \operatorname{ord}_P(J))$.
(c) For any R-ideal I, there exists an $\alpha \in F$ such that $\operatorname{ord}_P((\alpha)) = \operatorname{ord}_P(I)$ for any prime R-ideal $P \mid I$.

2.20. Prove that every R-ideal in a Dedekind domain R can be generated by at most two elements. (Hint: Use Exercise 2.19.)


Biography 2.2 Emil Artin (1898–1962) was born on March 3, 1898, in Vienna, Austria. He served in the Austrian army in World War I, after which he entered the University of Leipzig. In 1921 he obtained his doctorate, the thesis of which was on quadratic extensions of rational function fields over finite fields. In 1923, he had his Habilitation, allowing him to become Privatdozent at the University of Hamburg. In 1925, he was promoted to extraordinary professor at Hamburg. In that same year, he introduced the theory of braids, which is studied today by algebraists and topologists. In 1928, he worked on rings with minimum condition, the topic of Exercise 2.13, which are now called Artinian rings. In 1937, Hitler enacted the New Official’s Law, which enabled a mechanism for removing not only Jewish teachers from university positions but also those related by marriage. Since Artin’s wife was Jewish, although he was not, he was dismissed. In 1937, he emigrated to the U.S.A. and taught at several universities there, including eight years at Indiana University in Bloomington during 1938–1946, as well as Princeton from 1946 to 1958. During this time, in 1955, he produced what was, arguably, the catalyst for the later classification of finite simple groups, by proving that the only (then-known) coincidences in orders of finite simple groups were those given by Dickson in his Linear Groups. In 1958, he returned to Germany where he was appointed again to the University of Hamburg. Artin’s name is attached not only to the aforementioned rings, but also to the reciprocity law that he discovered as a generalization of Gauss’s quadratic reciprocity law. One of the tools that he developed to do this is what we now call Artin L-functions. He also has the distinction of solving one of Hilbert’s famous list of twenty-three problems posed in 1900.

He was an outstanding and respected teacher. In fact, many of his Ph.D. students such as Serge Lang, John Tate, and Max Zorn went on to major accomplishments. He also had an interest in astronomy, biology, chemistry, and music. He was indeed an accomplished musician in his own right, playing the flute, harpsichord, and clavichord. He died in Hamburg on December 20, 1962.


2.3 Application to Factoring

If you want a helping hand, you’ll find one at the end of your arm.

Audrey Hepburn (originally Edda Van Heemstra), (1929–1993)

Belgian Actress

In [68, §4.3, pp. 201–208], we saw the importance of factoring methods, especially in terms of the security of certain cryptosystems.

In this section, we will look at factoring using certain cubic integers, namely the integers from

$$\mathcal{O}_F = \mathbb{Z}[\sqrt[3]{-2}] = \mathbb{Z}[\sqrt[3]{2}]$$

(since $\sqrt[3]{-2} = -\sqrt[3]{2}$), which is the ring of integers of

$$F = \mathbb{Q}(\sqrt[3]{-2}) = \mathbb{Q}(\sqrt[3]{2})$$

(by Exercise 2.21 on page 96). We will show how we may employ these cubic integers in $\mathbb{Z}[\sqrt[3]{-2}]$ to factor integers in $\mathbb{Z}$. In order to do this we need to introduce some more general aspects of number fields upon which we have only touched. In Definition 1.11 on page 18, we introduced the notion of the norm of an element in a quadratic field. We need to generalize this in order to apply the notion to cubic integers, and to other number fields later on. To do so, we first introduce another important concept related to a number field, which is motivated by our quadratic case. For instance, if $F = \mathbb{Q}(i)$ is the Gaussian field, then there are exactly two monomorphisms

$$\theta_1(x + yi) = x + yi \quad \text{and} \quad \theta_2(x + yi) = x - yi \quad (x, y \in \mathbb{Q})$$

from F into $\mathbb{C}$, the complex field. Since the degree of the Gaussian field over $\mathbb{Q}$ is $|F : \mathbb{Q}| = 2$, one might expect that the number of such monomorphisms is $|F : \mathbb{Q}|$ for a general number field F, and this is indeed the case. The reader should be familiar with the aforementioned notation for field degree, as well as polynomial degree and background material that is, for instance, contained in [68, Appendix A, pp. 298–306].

Theorem 2.19 Monomorphisms of a Number Field

If F is a number field with degree $|F : \mathbb{Q}| = n$, then there exist exactly n monomorphisms

$$\theta_j : F \to \mathbb{C},$$

for $j = 1, 2, \ldots, n$.

Proof. By Theorem 1.5 on page 10, there is an algebraic integer $\alpha$ such that $F = \mathbb{Q}(\alpha)$. Let $m_{\alpha,\mathbb{Q}}(x)$ be the minimal polynomial of $\alpha$ over $\mathbb{Q}$. It follows from Corollary 1.3 on page 11 that

$$\deg(m_{\alpha,\mathbb{Q}}) = |\mathbb{Q}(\alpha) : \mathbb{Q}| = n.$$

Since $m_{\alpha,\mathbb{Q}}(x)$ has n distinct roots, say $\alpha = \alpha_1, \alpha_2, \ldots, \alpha_n$,

$$m_{\alpha,\mathbb{Q}}(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n).$$

By Theorem 1.5 on page 10, each element $\beta \in F$ can be expressed uniquely in the form

$$\beta = q_0 + q_1\alpha + \cdots + q_{n-1}\alpha^{n-1}$$

where $q_0, q_1, \ldots, q_{n-1} \in \mathbb{Q}$, so for $j = 1, 2, \ldots, n$ we define

$$\theta_j : F \to \mathbb{C},$$

by

$$\theta_j(\beta) = \theta_j(q_0 + q_1\alpha + \cdots + q_{n-1}\alpha^{n-1}) = q_0 + q_1\alpha_j + \cdots + q_{n-1}\alpha_j^{n-1}.$$

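Claim 2.2 For $j = 1, 2, \ldots, n$, $\theta_j$ is a field homomorphism.

Let $\beta, \gamma \in F$. Then for $q_i, r_i \in \mathbb{Q}$ $(0 \leq i \leq n - 1)$,

$$\beta = q_0 + q_1\alpha + \cdots + q_{n-1}\alpha^{n-1} \quad \text{and} \quad \gamma = r_0 + r_1\alpha + \cdots + r_{n-1}\alpha^{n-1}. \qquad (2.13)$$

Therefore,

$$\beta + \gamma = (q_0 + r_0) + (q_1 + r_1)\alpha + \cdots + (q_{n-1} + r_{n-1})\alpha^{n-1},$$

from which we get, for $1 \leq j \leq n$,

$$\begin{aligned}
\theta_j(\beta + \gamma) &= (q_0 + r_0) + (q_1 + r_1)\alpha_j + \cdots + (q_{n-1} + r_{n-1})\alpha_j^{n-1} \\
&= (q_0 + q_1\alpha_j + \cdots + q_{n-1}\alpha_j^{n-1}) + (r_0 + r_1\alpha_j + \cdots + r_{n-1}\alpha_j^{n-1}) \\
&= \theta_j(\beta) + \theta_j(\gamma),
\end{aligned}$$

so $\theta_j$ is additive. It remains to show the $\theta_j$ are multiplicative.

In view of (2.13), let

$$f(x) = q_0 + q_1 x + \cdots + q_{n-1}x^{n-1} \quad \text{and} \quad g(x) = r_0 + r_1 x + \cdots + r_{n-1}x^{n-1},$$

and use the Euclidean algorithm for polynomials (which we had occasion to use in the proof of Theorem 1.6 on page 10), to establish that there exist $q(x), r(x) \in \mathbb{Q}[x]$ such that

$$f(x)g(x) = m_{\alpha,\mathbb{Q}}(x)q(x) + r(x),$$

where $\deg(r(x)) < \deg(m_{\alpha,\mathbb{Q}}(x)) = n$. Since $f(\alpha) = \beta$ and $g(\alpha) = \gamma$ while $m_{\alpha,\mathbb{Q}}(\alpha) = 0$, then

$$\beta\gamma = f(\alpha)g(\alpha) = m_{\alpha,\mathbb{Q}}(\alpha)q(\alpha) + r(\alpha) = r(\alpha).$$

Thus,

$$\theta_j(\beta\gamma) = \theta_j(r(\alpha)) = r(\alpha_j) = m_{\alpha,\mathbb{Q}}(\alpha_j)q(\alpha_j) + r(\alpha_j) = f(\alpha_j)g(\alpha_j) = \theta_j(\beta)\theta_j(\gamma),$$

so $\theta_j$ is multiplicative and we have established Claim 2.2.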


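Claim 2.3 For $j = 1, 2, \ldots, n$, $\theta_j$ is a monomorphism.

Suppose that

$$\beta = q_0 + q_1\alpha + \cdots + q_{n-1}\alpha^{n-1} \in F, \quad \gamma = r_0 + r_1\alpha + \cdots + r_{n-1}\alpha^{n-1} \in F,$$

and

$$\theta_j(\beta) = \theta_j(\gamma),$$

so

$$q_0 + q_1\alpha_j + \cdots + q_{n-1}\alpha_j^{n-1} = r_0 + r_1\alpha_j + \cdots + r_{n-1}\alpha_j^{n-1},$$

which means that $\alpha_j$ is a root of

$$h(x) = (q_0 - r_0) + (q_1 - r_1)x + \cdots + (q_{n-1} - r_{n-1})x^{n-1},$$

where $\deg(h(x)) < n$. Since a nonzero h(x) with root $\alpha_j$ would contradict that

$$m_{\alpha,\mathbb{Q}}(x) = m_{\alpha_j,\mathbb{Q}}(x)$$

is the minimal polynomial of $\alpha_j$, h(x) must be the zero polynomial, so $q_i - r_i = 0$ for $i = 0, 1, \ldots, n - 1$. This means that $q_i = r_i$ for each such i, so

$$\beta = q_0 + q_1\alpha + \cdots + q_{n-1}\alpha^{n-1} = r_0 + r_1\alpha + \cdots + r_{n-1}\alpha^{n-1} = \gamma,$$

which secures Claim 2.3.

It remains to show that there are no other monomorphisms of F into $\mathbb{C}$. Let

$$\sigma : F \to \mathbb{C}$$

be a monomorphism. Then

$$m_{\alpha,\mathbb{Q}}(\sigma(\alpha)) = \sigma(m_{\alpha,\mathbb{Q}}(\alpha)) = \sigma(0) = 0,$$

which implies that

$$\sigma(\alpha) = \alpha_j$$

for some $j = 1, 2, \ldots, n$, since these are the only roots of the minimal polynomial. Hence,

$$\sigma(\alpha) = \theta_j(\alpha),$$

so

$$\sigma(q_0 + q_1\alpha + \cdots + q_{n-1}\alpha^{n-1}) = q_0 + q_1\alpha_j + \cdots + q_{n-1}\alpha_j^{n-1} = \theta_j(q_0 + q_1\alpha + \cdots + q_{n-1}\alpha^{n-1}),$$

for all $q_i \in \mathbb{Q}$. We have shown that $\sigma = \theta_j$ for some $j = 1, 2, \ldots, n$, which secures the result. □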

Theorem 2.19 motivates the following. The reader should solve Exercise 2.22 on page 96 in preparation.

Definition 2.18 Conjugates of an Element and a Field

If $\alpha \in \mathbb{C}$ and F is a number field such that $\alpha$ is algebraic over F, then the conjugates of $\alpha$ over F, also called the F-conjugates of $\alpha$, are the roots of $m_{\alpha,F}(x)$ in $\mathbb{C}$. If $F = \mathbb{Q}(\alpha)$, then the fields $\mathbb{Q}(\alpha_j)$ are called the conjugate fields of F.

We are now in a position to provide the promised generalization of the notion of norm and related notions.

Definition 2.19 Norm and Trace of Elements

If F is an algebraic number field, $|F : \mathbb{Q}| = n$, and $\alpha \in F$, let $\alpha = \alpha_1, \alpha_2, \ldots, \alpha_n$ be the F-conjugates of $\alpha$. Then the norm of $\alpha$ is

$$N_F(\alpha) = \alpha_1\alpha_2 \cdots \alpha_n,$$

and the trace of $\alpha$ is

$$T_F(\alpha) = \alpha_1 + \alpha_2 + \cdots + \alpha_n.$$

Remark 2.13 From Exercise 2.25 on page 96, we see that $N_F(\alpha), T_F(\alpha) \in \mathbb{Q}$ and

$$\prod_{j=1}^{n} (x - \alpha_j) \in \mathbb{Q}[x].$$

This polynomial is distinguished as follows.

Definition 2.20 Field Polynomials over F

If $\alpha \in F$ where F is a number field, and $\alpha = \alpha_1, \alpha_2, \ldots, \alpha_n$ are the F-conjugates of $\alpha$, then the field polynomial of $\alpha$ over F is given by

$$f_F(\alpha) = (x - \alpha)(x - \alpha_2) \cdots (x - \alpha_n).$$
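To make the norm and trace concrete, the following sketch evaluates them numerically for the cubic field $F = \mathbb{Q}(\sqrt[3]{-2})$ used in this section. It assumes the three monomorphisms of F send $\alpha = \sqrt[3]{-2}$ to the three complex roots of $x^3 + 2$, and it compares the product of the images against the closed form $a^3 - 2b^3 + 4c^3 + 6abc$. The function names are illustrative, and the results are floating-point approximations rather than exact values.

```python
import cmath

# The three complex roots of x^3 + 2: the images of alpha under the
# three monomorphisms of Q(alpha) into C.
real_root = -(2 ** (1 / 3))
omega = cmath.exp(2j * cmath.pi / 3)  # a primitive cube root of unity
conjugate_roots = [real_root * omega ** k for k in range(3)]


def images(a, b, c):
    """Images of a + b*alpha + c*alpha^2 under the three monomorphisms."""
    return [a + b * r + c * r * r for r in conjugate_roots]


def norm_and_trace(a, b, c):
    """Approximate norm (product) and trace (sum) of a + b*alpha + c*alpha^2."""
    z1, z2, z3 = images(a, b, c)
    return z1 * z2 * z3, z1 + z2 + z3


def norm_formula(a, b, c):
    """Closed form for the norm when alpha^3 = -2."""
    return a**3 - 2 * b**3 + 4 * c**3 + 6 * a * b * c


approx_norm, approx_trace = norm_and_trace(-9, 2, -1)
print(approx_norm, approx_trace, norm_formula(-9, 2, -1))
```

The trace of $a + b\alpha + c\alpha^2$ comes out as $3a$ because the roots of $x^3 + 2$, and their squares, each sum to zero.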

We now look at a motivating example.

Example 2.13 We look at how to factor the fifth Fermat number

$$F_5 = 2^{32} + 1.$$

For convenience, set $\alpha = \sqrt[3]{-2}$. First, notice that

$$2F_5 = x^3 + 2, \quad \text{where } x = 2^{11},$$

and that

$$N_F(x - \alpha) = x^3 + 2, \quad \text{with } x - \alpha \in \mathbb{Z}[\alpha].$$

In fact, by Exercise 2.27, any $\beta = a + b\alpha + c\alpha^2$ has norm

$$N_F(\beta) = a^3 - 2b^3 + 4c^3 + 6abc. \qquad (2.14)$$

By Exercise 2.26, there is a prime $\beta \in \mathbb{Z}[\alpha]$ such that $\beta \mid (x - \alpha)$, so by Exercise 2.29,

$$N_F(\beta) \mid N_F(x - \alpha) = x^3 + 2.$$

Hence, we may be able to find a nontrivial factorization of $F_5$ via norms of certain elements of $\mathbb{Z}[\alpha]$. We do this as follows.

Consider elements of the form $a + b\alpha \in \mathbb{Z}[\alpha]$, for convenience, and sieve over values of a and b, testing for

$$\gcd(N_F(a + b\alpha), F_5) = \gcd(a^3 - 2b^3, F_5) > 1.$$

For convenience, we let a run over the values $1, 2, \ldots, 100$, and b run over the values $1, 2, \ldots, 20$. Formal reasons for this approach will be given later. We fix each value of a, and let b run over its range of values. The runs for $1 \leq a \leq 15$ and $1 \leq b \leq 20$ yield

$$\gcd(a^3 - 2b^3, F_5) = 1.$$

However, at $a = 16$, $b = 5$, we get

$$\gcd(16^3 - 2 \cdot 5^3, F_5) = 641.$$

In fact,

$$F_5 = 641 \cdot 6700417.$$

We may factor $16 + 5\alpha$ as follows.

$$16 + 5\alpha = (1 + \alpha)(-1 + \alpha)(\alpha)(-9 + 2\alpha - \alpha^2),$$

where $1 + \alpha$ is a unit with norm $-1$; $-1 + \alpha$ has norm $-3$; $\alpha$ has norm $-2$; and $\beta = -9 + 2\alpha - \alpha^2$ has norm $-641$. This accounts for

$$16^3 - 2 \cdot 5^3 = 2 \cdot 3 \cdot 641,$$

and shows that $\beta$ is the predicted prime divisor of $x - \alpha$, which gives us the nontrivial factor of $F_5$.
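The search in Example 2.13 is small enough to replay directly. The sketch below is one possible rendering, not the author's program: it represents $a + b\alpha + c\alpha^2$ as a triple `(a, b, c)`, multiplies using $\alpha^3 = -2$, applies the norm formula (2.14), sieves over the stated ranges of a and b, and then multiplies out the four factors of $16 + 5\alpha$. All function and variable names are illustrative.

```python
from math import gcd

F5 = 2 ** 32 + 1


def mul(u, v):
    """Multiply two elements of Z[alpha], alpha^3 = -2, given as (a, b, c)."""
    a1, b1, c1 = u
    a2, b2, c2 = v
    # Coefficients of 1, alpha, ..., alpha^4 before reduction.
    t0 = a1 * a2
    t1 = a1 * b2 + b1 * a2
    t2 = a1 * c2 + b1 * b2 + c1 * a2
    t3 = b1 * c2 + c1 * b2
    t4 = c1 * c2
    # Reduce with alpha^3 = -2 and alpha^4 = -2*alpha.
    return (t0 - 2 * t3, t1 - 2 * t4, t2)


def norm(u):
    """Norm of a + b*alpha + c*alpha^2 as in (2.14)."""
    a, b, c = u
    return a**3 - 2 * b**3 + 4 * c**3 + 6 * a * b * c


def first_hit(a_max=100, b_max=20):
    """First (a, b, g) with g = gcd(N(a + b*alpha), F5) > 1, a outer, b inner."""
    for a in range(1, a_max + 1):
        for b in range(1, b_max + 1):
            g = gcd(norm((a, b, 0)), F5)
            if g > 1:
                return a, b, g
    return None


# Multiply out (1 + alpha)(-1 + alpha)(alpha)(-9 + 2*alpha - alpha^2).
factors = [(1, 1, 0), (-1, 1, 0), (0, 1, 0), (-9, 2, -1)]
product = (1, 0, 0)
for f in factors:
    product = mul(product, f)

print(first_hit(), product, [norm(f) for f in factors])
```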

The method in Example 2.13 works well largely because of the small value of $F_5$. However, it may not be feasible for larger values to check all of the gcd conditions over a much larger range. The following method of Pollard, which he introduced in 1991 in [78], uses the above notions of factorizations in $\mathbb{Z}[\alpha]$ to factor $F_7$, which was first accomplished in 1970.

An important role in factorization is played by the following notion, which we will need as part of the algorithm to be described.


Definition 2.21 Smooth Integers

A rational integer z is said to be smooth with respect to $y \in \mathbb{Z}$, or simply y-smooth, if all prime factors of z are less than or equal to y.
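Smoothness in the sense of Definition 2.21 can be tested by stripping out every factor up to the bound. The following is a minimal sketch under that reading; `is_smooth` is a hypothetical helper, and dividing by every integer up to y (not only primes) is a simplification that gives the same answer, since a composite trial divisor has already had its prime factors removed.

```python
def is_smooth(z, y):
    """Return True if every prime factor of the nonzero integer z is <= y."""
    z = abs(z)
    if z == 0:
        raise ValueError("smoothness is only tested for nonzero integers")
    p = 2
    while p <= y and z > 1:
        # Remove every power of p; composite p never divides z at this point.
        while z % p == 0:
            z //= p
        p += 1
    # Whatever remains is a product of primes larger than y.
    return z == 1


print(is_smooth(-1936, 41), is_smooth(1261, 41))
```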

As in the above case, suppose that $n \in \mathbb{N}$ with

$$2n = m^3 + 2.$$

For instance,

$$2F_7 = m^3 + 2,$$

where $m = 2^{43}$. Pollard’s idea to factor $n = F_7$ involves B-smooth numbers of the form $a + bm$, for some suitable B that will be the number of primes in a prescribed set defined in the algorithm below. Also, $a + b\alpha$ will be B-smooth, meaning that its norm is B-smooth in the sense of Definition 2.21. Thus, if we get a factorization of $a + b\alpha$ in $\mathbb{Z}[\alpha]$, we also get a corresponding factorization of $a + bm$ modulo $F_7$. To see this, one must understand a notion that we will generalize when we discuss the number field sieve in Appendix A. We let

$$\psi : \mathbb{Z}[\alpha] \to \mathbb{Z}/n\mathbb{Z}$$

be a ring homomorphism such that $\psi(\alpha) = m$. Thus, in $\mathbb{Z}/n\mathbb{Z}$,

$$m^3 = -2 = -(1 + 1), \text{ where 1 is the identity of } \mathbb{Z}/n\mathbb{Z}.$$

Hence, $\psi$ is that unique map which is defined element-wise by the following.

$$\psi\left( \sum_{j=0}^{2} z_j \alpha^j \right) = \sum_{j=0}^{2} z_j m^j \in \mathbb{Z}/n\mathbb{Z}, \text{ where } z_j \in \mathbb{Z}.$$
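The map $\psi$ is a ring homomorphism precisely because $m^3 \equiv -2 \pmod n$ mirrors $\alpha^3 = -2$. A short sketch with $n = F_7$ and $m = 2^{43}$ checks this on sample elements; the triple representation `(z0, z1, z2)` for $z_0 + z_1\alpha + z_2\alpha^2$ and the helper names are assumptions made for illustration.

```python
n = 2 ** 128 + 1  # the seventh Fermat number F7
m = 2 ** 43       # chosen so that 2n = m^3 + 2


def mul(u, v):
    """Multiply two elements of Z[alpha], alpha^3 = -2, given as triples."""
    a1, b1, c1 = u
    a2, b2, c2 = v
    t3 = b1 * c2 + c1 * b2  # coefficient of alpha^3 before reduction
    t4 = c1 * c2            # coefficient of alpha^4 before reduction
    return (
        a1 * a2 - 2 * t3,
        a1 * b2 + b1 * a2 - 2 * t4,
        a1 * c2 + b1 * b2 + c1 * a2,
    )


def psi(u):
    """Image of z0 + z1*alpha + z2*alpha^2 in Z/nZ, sending alpha to m."""
    z0, z1, z2 = u
    return (z0 + z1 * m + z2 * m * m) % n


u, v = (5, 1, 0), (-1, -2, -2)
multiplicative = psi(mul(u, v)) == (psi(u) * psi(v)) % n
print(2 * n == m ** 3 + 2, multiplicative)
```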

The role of this map $\psi$ in attempting to factor a number n is given by the following.

Suppose that we have a set S of polynomials

$$g(x) = \sum_{j=0}^{2} z_j x^j \in \mathbb{Z}[x]$$

such that

$$\prod_{g \in S} g(\alpha) = \beta^2,$$

where $\beta \in \mathbb{Z}[\alpha]$, and

$$\prod_{g \in S} g(m) = y^2,$$

where $y \in \mathbb{Z}$. Then if $\psi(\beta) = x \in \mathbb{Z}$, we have

$$x^2 \equiv \psi(\beta)^2 \equiv \psi(\beta^2) \equiv \psi\left( \prod_{g \in S} g(\alpha) \right) \equiv \prod_{g \in S} g(m) \equiv y^2 \pmod n.$$

In other words, this method finds a pair of integers x, y such that

$$x^2 - y^2 \equiv (x - y)(x + y) \equiv 0 \pmod n,$$

so we may have a nontrivial factor of n by looking at $\gcd(x - y, n)$.

We now describe the algorithm, but give a simplified version of it, since this is meant to be a simple introduction to the ideas behind the number field sieve, which we will present in detail in Appendix A. The following is adapted from [64].

We use a very small value of n as an example for the sake of simplicity, namely $n = 23329$. Note that $2n = 36^3 + 2 = m^3 + 2$. We will also make suitable references in the algorithm in terms of how Pollard factored $n = F_7$.

Pollard’s Algorithm

Step 1. Compute a factor base.

The term “factor base” means the choice of a suitable set of rational primes over which we may factor a set of integers. In the case of cubic integers in $\mathbb{Z}[\alpha] = \mathbb{Z}[\sqrt[3]{-2}]$, we take for $n = 23329$ only the first thirteen primes, those up to and including 41 (or for $n = F_7$, Pollard chose the first five hundred rational primes) as FB1, the first part of the factor base, and for the second part, FB2, we take those primes of $\mathbb{Z}[\alpha]$ with norms $\pm p$, where $p \in$ FB1. (The reasons behind the choice of the number of primes in FB1 are largely empirical.) Also, we include the units $-1$, $1 + \alpha$, and $1/(1 + \alpha) = -1 + \alpha - \alpha^2$ in FB2. Here, we have discarded the $\mathbb{Z}[\alpha]$-primes of norm $p^2$ or $p^3$, since these cannot divide our n, given that they cannot divide the $a + b\alpha$, with the assumptions we are making.

Step 2. Run the sieve.

In this instance, the sieve involves finding numbers $a + bm$ that are composed of some primes from FB1. For $n = 23329$, we sieve over values of a from $-5$ to 5 and values of b from 1 to 10 (or for $n = F_7$, Pollard chose values of a from $-4800$ to 4800, and values of b from 1 to 2000). Save only coprime pairs (a, b).

Step 3. Look for smooth values of the norm, and obtain factorizations of $a + bm$ and $a + b\alpha$.

Here, smooth values of the norm means that $N = N_F(a + b\alpha) = a^3 - 2b^3$ is not divisible by any primes bigger than those in FB1. For those (a, b) pairs, factor $a + bm$ by trial division, and eliminate unsuccessful trials. Factor $a + b\alpha$ by computing the norm $N_F(a + b\alpha)$ and using trial division. When a prime p is found, then divide out a $\mathbb{Z}[\alpha]$-prime of norm $\pm p$ from $a + b\alpha$. This will involve getting primes in the factorization of the form $a + b\alpha + c\alpha^2$ where $c \neq 0$. Units may also come into play in the factorizations, and a table of values of $(1 + \alpha)^j$ is kept for such purposes with $j = -2, \ldots, 2$ for $n = 23329$ (or for $F_7$, one should choose to keep a record of units for $j = -8, -7, \ldots, 8$). Some data extracted for the run on $n = 23329$ is given as follows.

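Table 2.1

$a + b\alpha + c\alpha^2$          $N$                  factorization of $a + b\alpha + c\alpha^2$
$5 + \alpha$                       $3 \cdot 41$         $(-1 + \alpha)(-1 - 2\alpha - 2\alpha^2)$
$4 + 10\alpha$                     $-2^4 \cdot 11^2$    $-(3 + 2\alpha)^2\alpha^4(-1 + \alpha - \alpha^2)^2$
$-1 + \alpha$                      $-3$                 $-1 + \alpha$
$-1 - 2\alpha - 2\alpha^2$         $-41$                $-1 - 2\alpha - 2\alpha^2$
$3 + 2\alpha$                      $11$                 $3 + 2\alpha$
$\alpha$                           $-2$                 $\alpha$
$-1 + \alpha - \alpha^2$           $-1$                 unit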

Table 2.2

$a + bm + cm^2$            factorization of $a + bm + cm^2$
$5 + m$                    $41$
$4 + 10m$                  $2^2 \cdot 7 \cdot 13$
$-1 + m$                   $5 \cdot 7$
$-1 - 2m - 2m^2$           $-5 \cdot 13 \cdot 41$
$3 + 2m$                   $3 \cdot 5^2$
$m$                        $2^2 \cdot 3^2$
$-1 + m - m^2$             $-13 \cdot 97$
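Steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: FB1 is taken to be the primes up to and including 41, the helper names `is_smooth` and `smooth_pairs` are hypothetical, and the sketch only tests smoothness of $a + bm$ and of the norm $a^3 - 2b^3$. It does not carry out the factorization in $\mathbb{Z}[\alpha]$ itself.

```python
from math import gcd

N = 23329   # the integer to be factored
M = 36      # m, chosen so that m**3 = 2n - 2, i.e. m**3 is congruent to -2 mod n
# Assumption: FB1 is the set of rational primes up to and including 41.
FB1 = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]


def is_smooth(value, primes):
    """Return True if |value| is nonzero and factors completely over primes."""
    value = abs(value)
    if value == 0:
        return False
    for p in primes:
        while value % p == 0:
            value //= p
    return value == 1


def smooth_pairs(a_values, b_values):
    """Coprime pairs (a, b) with a + b*m and the norm a^3 - 2b^3 both smooth."""
    found = []
    for b in b_values:
        for a in a_values:
            if gcd(a, b) != 1:
                continue  # Step 2 keeps only coprime pairs
            if is_smooth(a + b * M, FB1) and is_smooth(a**3 - 2 * b**3, FB1):
                found.append((a, b))
    return found


# Sieve ranges quoted in Step 2 for n = 23329.
pairs = smooth_pairs(range(-5, 6), range(1, 11))
```

Pairs such as $(5, 1)$, $(-1, 1)$ and $(3, 2)$, which correspond to rows of Tables 2.1 and 2.2, pass both smoothness tests.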

Step 4. Complete the factorization.

By selecting $-1$ times the first four rows in the third column of Table 2.1, we get a square in $\mathbb{Z}[\alpha]$:

$$\beta^2 = (-1 + \alpha)^2(-1 - 2\alpha - 2\alpha^2)^2(3 + 2\alpha)^2\alpha^4(-1 + \alpha - \alpha^2)^2, \tag{2.15}$$

and correspondingly, since $\beta^2$ is also $-1$ times the first four rows in the first column of Table 2.1, we get:

$$\beta^2 = (5 + \alpha)(-4 - 10\alpha)(-1 + \alpha)(-1 - 2\alpha - 2\alpha^2). \tag{2.16}$$

Then we get a square in $\mathbb{Z}$ from Table 2.2 by applying $\psi$ to (2.16):

$$\psi(\beta^2) = (5 + m)(-4 - 10m)(-1 + m)(-1 - 2m - 2m^2) = 2^2 \cdot 5^2 \cdot 7^2 \cdot 13^2 \cdot 41^2 = y^2.$$

Also, by applying $\psi$ to $\beta$ via (2.15), we get:

$$\psi(\beta) = (-1 + m)(-1 - 2m - 2m^2)(3 + 2m)m^2(-1 + m - m^2) \equiv 9348 \pmod{23329},$$

so by setting $x = \psi(\beta)$, we have $x^2 = \psi^2(\beta) = \psi(\beta^2) \equiv y^2 \pmod{n}$. Since $y = 2 \cdot 5 \cdot 7 \cdot 13 \cdot 41 \equiv 13981 \pmod{23329}$, then $y - x \equiv 4633 \pmod{23329}$. However, $\gcd(4633, 23329) = 41$. In fact $23329 = 41 \cdot 569$.
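The arithmetic of Step 4 is easy to check by machine. The following Python sketch is an illustration only: the helper `psi` is a hypothetical name for the map sending $a + b\alpha + c\alpha^2$ to $a + bm + cm^2$ reduced modulo $n$, and the factors are copied from Tables 2.1 and 2.2.

```python
from math import gcd

N = 23329
M = 36  # m, with m**3 congruent to -2 modulo N


def psi(a, b, c=0):
    """Image of a + b*alpha + c*alpha^2 under alpha -> m, reduced modulo N."""
    return (a + b * M + c * M * M) % N


# x is the image of beta, read off from the right-hand side of (2.15).
x = (psi(-1, 1) * psi(-1, -2, -2) * psi(3, 2) * psi(0, 1) ** 2 * psi(-1, 1, -1)) % N

# y is the integer square root of the image of beta^2, read off from Table 2.2.
y = (2 * 5 * 7 * 13 * 41) % N

difference = (y - x) % N
factor = gcd(difference, N)
```

Here `x` and `y` come out as 9348 and 13981, `difference` is 4633, and `factor` is 41, in agreement with the values quoted in the text.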

Pollard used the algorithm in a similar fashion to find integers $X$ and $Y$ for the more serious factorization $\gcd(X - Y, F_7) = 59649589127497217$. Hence, we have a factorization of $F_7$ as follows.

$$F_7 = 59649589127497217 \cdot 5704689200685129054721.$$


Essentially, the idea of factoring using cubic integers above is akin to the strategy used in the quadratic sieve method. There, we try to generate sufficiently many smooth quadratic residues of $n$ close to $\sqrt{n}$. In the cubic case, we try to factor numbers that are close to perfect cubes. In Appendix A, we will extend these ideas to show how $F_9$ was factored using the number field sieve, and $\mathbb{Z}[\sqrt[5]{2}]$.

Exercises

2.21. Prove that $\mathbb{Z}[\sqrt[3]{-2}]$ is the ring of integers of $\mathbb{Q}(\sqrt[3]{-2})$.

2.22. Let $F$ be a number field and let $\alpha \in \mathbb{A}$ such that $F = \mathbb{Q}(\alpha)$. Prove that if $\alpha_j$ for $j = 1, 2, \ldots, n$ are all the $F$-conjugates of $\alpha$, then all the fields $\mathbb{Q}(\alpha_j)$ are isomorphic for $j = 1, 2, \ldots, n$.

2.23. Prove that if $F \subseteq K \subseteq E \subseteq \mathbb{C}$, where $F, K, E$ are fields, then $|E : F| = |E : K| \cdot |K : F|$, where any of the degrees may be infinite.

2.24. Suppose that $F = \mathbb{Q}(\alpha)$ where $\alpha \in \mathbb{A}$, $\beta \in F$, and $\beta = \beta_1, \beta_2, \ldots, \beta_n$ are the $F$-conjugates of $\beta$. Prove that if $m_{\beta,\mathbb{Q}}(x) = x^d + q_{d-1}x^{d-1} + \cdots + q_1x + q_0$ is the minimal polynomial of $\beta$ over $\mathbb{Q}$, then

$$\prod_{j=1}^{n}(x - \beta_j) = m_{\beta,\mathbb{Q}}(x)^{n/d} \in \mathbb{Q}[x].$$

2.25. If $F$ is a number field and $\beta \in F$ prove that $N_F(\beta) \in \mathbb{Q}$ and $T_F(\beta) \in \mathbb{Q}$, and if $\beta \in \mathcal{O}_F$, then $N_F(\beta) \in \mathbb{Z}$ and $T_F(\beta) \in \mathbb{Z}$. Conclude that if $f_F(x) = \prod_{j=1}^{n}(x - \beta_j)$ is the field polynomial of $\beta$, then $-T_F(\beta)$ is the coefficient of $x^{n-1}$ and $\pm N_F(\beta)$ is the constant term.

2.26. Prove that every nonzero ideal in a Dedekind domain $R$ must contain a prime element.

2.27. Prove that (2.14) holds in Example 2.13 on page 91.

2.28. Prove that the norm given in Definition 2.19 on page 91 is multiplicative and the trace is additive. In other words, for any $\alpha, \beta \in \mathcal{O}_F$, $N_F(\alpha\beta) = N_F(\alpha)N_F(\beta)$, and $T_F(\alpha + \beta) = T_F(\alpha) + T_F(\beta)$.

2.29. Prove that if $\beta \mid \gamma$ for $\beta, \gamma \in \mathcal{O}_F$, where $F$ is a number field, then $N_F(\beta)$ divides $N_F(\gamma)$.

2.30. Use Pollard’s method to factor $F_6$.

In Exercises 2.31–2.33, use the gcd method described before Pollard’s method to find an odd factor of the given integer.

2.31. $5^{77} - 1$.

2.32. $7^{149} + 1$. (Hint: Use $\mathbb{Z}[\sqrt[3]{-7}]$.)

2.33. $3^{239} - 1$. (Hint: Use $\mathbb{Z}[\sqrt[3]{3}]$.)

Chapter 3

Binary Quadratic Forms

What is it indeed that gives us the feeling of elegance in a solution, in demonstration? It is the harmony of the diverse parts, their symmetry, their happy balance; in a word it is all that introduces order, all that gives unity, that permits us to see clearly and to comprehend at once both the ensemble and the details.

Henri Jules Poincaré (1854–1912), French mathematician; see Biography 3.8 on page 147

This chapter requires that the reader have a basic understanding of the fundamental background material on abstract algebra such as to be found, for instance, in [68, Appendix A]. We take an algebraic approach to binary quadratic forms that is straightforward and unmasks some of the otherwise difficult-to-interpret underpinnings of the theory.

3.1 Basics

Lagrange was the first to introduce the theory of quadratic forms, later expanded by Legendre, and greatly magnified even later by Gauss (see [68, Biography 2.7, p. 114], [68, Biography 4.1, p. 181], and [68, Biography 1.7, p. 33]). An integral binary quadratic form is given by

$$f(x, y) = ax^2 + bxy + cy^2 \text{ with } a, b, c \in \mathbb{Z}. \tag{3.1}$$

For simplicity, we may suppress the variables, and denote $f$ by $(a, b, c)$. The value $a$ is called the leading coefficient, the value $b$ is called the middle coefficient, and $c$ is called the last coefficient. If $\gcd(a, b, c) = 1$, then we say that $f(x, y)$ is a primitive form.


The aforementioned three great mathematicians looked at the representation problem: Given a binary quadratic form (3.1), which $n \in \mathbb{Z}$ are represented by $f(x, y)$? In other words, for which $n$ do there exist integers $x, y$ such that $f(x, y) = n$? If $\gcd(x, y) = 1$, then we say that $n$ is properly represented by $f(x, y)$. For instance, when studying criteria for the representation of a natural number $n$ as sums of two squares, such as in Theorem 1.13 on page 26, or [68, Section 6.1, pp. 243–251], a simple answer can be given. When looking at norm-forms $x^2 + Dy^2 = n$, where $D \in \mathbb{Z}$, such as in [18] or [68, Section 7.1, pp. 265–273], the problem can be given a relatively simple answer for certain $n$ and $D$. In general, there is no simple complete answer. Moreover, an even more general and difficult problem arises, namely when can an integer be represented by a binary quadratic form from a given set of such forms? The theory of binary quadratic forms deals with this question via the following notion. In the balance of our discussion, we use the term form to mean binary quadratic form.

Definition 3.1 Equivalent Binary Quadratic Forms

Two forms $f(x, y)$ and $g(x, y)$ are said to be equivalent if there exist integers $p, q, r, s$, such that

$$f(x, y) = g(px + qy, rx + sy) \text{ and } ps - qr = \pm 1. \tag{3.2}$$

For simplicity, we may denote equivalence of $f$ and $g$ by $f \sim g$. If $ps - qr = 1$, then $f$ and $g$ are said to be properly equivalent, and if $ps - qr = -1$, they are said to be improperly equivalent. Two forms $f$ and $g$ are said to be in the same equivalence class or simply in the same class, if $f$ is properly equivalent to $g$.

Remark 3.1 Definition 3.1 says that equivalent forms represent the same integers, and the same is true for proper representation (see Exercise 3.1 on page 103). Moreover, since

$$\det\begin{pmatrix} p & q \\ r & s \end{pmatrix} = ps - qr = \pm 1,$$

this means that

$$\begin{pmatrix} p & q \\ r & s \end{pmatrix} \in GL(2, \mathbb{Z})$$

(see Exercise 2.5 on page 66). Note, as well, that proper equivalence means that $ps - qr = 1$ so

$$\begin{pmatrix} p & q \\ r & s \end{pmatrix} \in SL(2, \mathbb{Z}),$$

the subgroup of $GL(2, \mathbb{Z})$ with elements having determinant 1. Properly equivalent forms are said to be related by a unimodular transformation, namely $X = px + qy$ and $Y = rx + sy$ with $ps - qr = 1$. Note as well, by Exercise 3.3 on page 103, proper equivalence of forms is an equivalence relation.


The notion of proper and improper equivalence is due to Gauss. Lagrange initiated the idea of equivalence, although he did not use the term. He merely said that one could be “transformed into another of the same kind,” but did not make the distinction between the two kinds. Similarly Legendre did not recognize proper equivalence. However, there is a very nice relationship between proper representation and proper equivalence, since as Exercise 3.2 on page 103 shows, the form $f(x, y)$ properly represents $n \in \mathbb{Z}$ if and only if $f(x, y)$ is properly equivalent to the form $nx^2 + bxy + cy^2$ for some $b, c \in \mathbb{Z}$.

Example 3.1 For $f(x, y) = x^2 + 7y^2$ and $n = 29 = 1 + 7 \cdot 2^2 = f(1, 2)$, the form $f(x, y)$ is properly equivalent to $g(x, y) = 29x^2 + 86xy + 64y^2$ since $f(x, y) = g(3x - y, -2x + y)$, where $p = 3$, $q = -1$, $r = -2$, $s = 1$. With reference to Remark 3.1 on the facing page, $X = 3x - y$, $Y = -2x + y$ represents a unimodular transformation.
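The substitution in Example 3.1 can be checked mechanically. The sketch below is illustrative only: forms are stored as coefficient triples $(a, b, c)$, and the helper names `substitute` and `discriminant` are hypothetical. Expanding $g(px + qy, rx + sy)$ for $g = (A, B, C)$ gives the three coefficients computed in `substitute`.

```python
def substitute(form, p, q, r, s):
    """Coefficients of g(px + qy, rx + sy) for g = (A, B, C)."""
    A, B, C = form
    return (
        A * p * p + B * p * r + C * r * r,              # coefficient of x^2
        2 * A * p * q + B * (p * s + q * r) + 2 * C * r * s,  # coefficient of xy
        A * q * q + B * q * s + C * s * s,              # coefficient of y^2
    )


def discriminant(form):
    """Discriminant b^2 - 4ac of the form (a, b, c)."""
    a, b, c = form
    return b * b - 4 * a * c


g = (29, 86, 64)
p, q, r, s = 3, -1, -2, 1
determinant = p * s - q * r       # 1, so the transformation is unimodular
f = substitute(g, p, q, r, s)     # recovers x^2 + 7y^2 as (1, 0, 7)
```

Both forms have discriminant $-28$, as equivalent forms must.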

The following notion is central to the discussion and links equivalent forms in another way.

Definition 3.2 Discriminants of Forms

The discriminant of the form $f(x, y) = ax^2 + bxy + cy^2$ is given by

$$D = b^2 - 4ac.$$

If $D > 0$, then $f$ is called an indefinite form. If $D < 0$ and $a < 0$, then $f$ is called a negative definite form, and if $D < 0$ and $a > 0$, then $f$ is called a positive definite form.

Remark 3.2 By Exercise 3.7 on page 103, if forms $f$ and $g$ have discriminants $D$ and $D_1$, respectively, and $f(x, y) = g(px + qy, rx + sy)$, then $D = (ps - qr)^2D_1$. Thus, equivalent forms have the same discriminant. However, forms with the same discriminant are not necessarily equivalent (see Exercise 3.8 on page 104). Furthermore, if $f(x, y) = ax^2 + bxy + cy^2$, then by completing the square, we get

$$4af(x, y) = (2ax + by)^2 - Dy^2,$$

so when $D > 0$, the form $f(x, y)$ represents both positive and negative integers. This is the justification for calling such forms “indefinite.” If $D < 0$ and $a < 0$, then $f(x, y)$ represents only negative integers, thus the reason they are called “negative definite,” and if $a > 0$, then they represent only positive integers, whence the term “positive definite.” Since we may change a negative definite form into a positive definite one by changing the signs of all the coefficients, it is sufficient to consider only positive definite forms when $D < 0$. We will, therefore, not consider negative definite forms in any discussion hereafter.

Congruence properties of the discriminant of a form may provide us with information on representation. For instance, Exercise 3.9 on page 104 tells us that congruence properties modulo 4 determine when an integer may be represented by forms with discriminant $D \equiv 0, 1 \pmod{4}$. Furthermore, what this tells us is that we can take the equation $D = b^2 - 4ac$ and let $a = 1$ and $b = 0$ or $1$ according as $D \equiv 0$ or $1 \pmod{4}$, so then $c = -D/4$ or $-(D - 1)/4$, respectively. Thus, we get a distinguished form of discriminant $D$ given as follows.

Definition 3.3 Principal Forms

If $D \equiv 0, 1 \pmod{4}$, then $(1, 0, -D/4)$ or $(1, 1, -(D - 1)/4)$, respectively, are called principal forms of discriminant $D$.
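As a small illustration of this definition, the following sketch returns the principal form for a given discriminant. The function name `principal_form` is a hypothetical helper, and forms are again written as triples $(a, b, c)$.

```python
def principal_form(D):
    """Principal form of discriminant D, for D congruent to 0 or 1 modulo 4."""
    if D % 4 == 0:
        return (1, 0, (-D) // 4)
    if D % 4 == 1:
        return (1, 1, (1 - D) // 4)   # -(D - 1)/4 written as (1 - D)/4
    raise ValueError("a discriminant must be congruent to 0 or 1 modulo 4")


form = principal_form(-28)   # the form x^2 + 7y^2
```

For instance, $D = -28$ gives $(1, 0, 7)$, that is, $x^2 + 7y^2$, and $D = -23$ gives $(1, 1, 6)$.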

Remark 3.3 Via Exercise 3.10, we see that if $D = -4m$, we get the form $x^2 + my^2$. As we shall see, these forms are particularly important in the historical development of the representation problem. Indeed, entire books, such as [18], are devoted to discussing this issue. There is a general notion that allows us to look at canonical forms for more illumination of the topic. This is given in the following, which is due to Lagrange.

Definition 3.4 Reduced Forms

A primitive form $f(x, y) = ax^2 + bxy + cy^2$, of discriminant $D$, is said to be reduced if

(a) When $D < 0$ and $a > 0$, then

$$|b| \leq a \leq c, \text{ and if either } |b| = a \text{ or } a = c, \text{ then } b \geq 0. \tag{3.3}$$

(b) When $D > 0$, then

$$0 < b < \sqrt{D} \text{ and } \sqrt{D} - b < 2|a| < \sqrt{D} + b. \tag{3.4}$$

Note that since $f$ is positive definite in part (a) of Definition 3.4, then by Definition 3.2 on the preceding page, both $a$ and $c$ are positive.

With the notion of reduction in hand, we have the following result, which provides us with a unique canonical representative for equivalence classes of positive definite forms.

Theorem 3.1 Positive Definite and Reduced Forms

Every positive definite form is properly equivalent to a unique reduced form.

Proof. Let $f(x, y) = ax^2 + bxy + cy^2$ be a primitive positive definite form. Let $n$ be the least positive integer represented by $f$. By Exercise 3.2 on page 103, there exist $B, C \in \mathbb{Z}$ such that $f \sim g$ properly, where $g(X, Y) = nX^2 + BXY + CY^2$. For any integer $z$, the transformation $X = x - zy$, $Y = y$ yields

$$g(X, Y) = nx^2 + (B - 2nz)xy + (nz^2 - Bz + C)y^2.$$

If we set $z = \mathrm{Ne}\left(\frac{B}{2n}\right)$, the nearest integer to $B/(2n)$, then

$$-\frac{1}{2} < \frac{B}{2n} - z \leq \frac{1}{2}, \quad -n \leq B - 2nz \leq n, \quad \text{and } |B - 2nz| \leq n.$$

Thus, if we set $b_1 = B - 2nz$ and $c_1 = nz^2 - Bz + C$, then

$$g(X, Y) = nx^2 + b_1xy + c_1y^2,$$

where $|b_1| \leq n$. Thus, $f$ is properly equivalent to $g$, $g$ is positive definite, and $g(0, 1) = c_1$. Therefore, $g$ represents $c_1$, which implies $c_1 \in \mathbb{N}$, and $c_1 \geq n$ by the minimality of $n$. We have shown that $f$ is properly equivalent to a reduced form. The balance of the result will follow from the next result.

Claim 3.1 Any two properly equivalent reduced forms must be identical.

Suppose that the form $f(x, y) = ax^2 + bxy + cy^2$ is reduced and properly equivalent to the reduced form $g(x, y) = Ax^2 + Bxy + Cy^2$ via the transformation $g(x, y) = f(px + qy, rx + sy)$ with $ps - qr = 1$. We may assume without loss of generality that $a \geq A$. Also, a straightforward calculation shows that

$$A = ap^2 + bpr + cr^2,$$

$$B = 2apq + b(ps + qr) + 2crs, \tag{3.5}$$

$$C = aq^2 + bqs + cs^2.$$

Furthermore, we have that

$$|b| \leq a \leq c. \tag{3.6}$$

Using (3.6) we get,

$$A = ap^2 + bpr + cr^2 \geq ap^2 - |bpr| + cr^2 \geq ap^2 - |bpr| + ar^2 = a(p^2 + r^2) - |bpr|. \tag{3.7}$$

However, since $p^2 + r^2 \geq 2|pr|$, then (3.7) is greater than or equal to $2a|pr| - |bpr| \geq a|pr|$, where the latter inequality follows from (3.6) again. We have shown that

$$A \geq a|pr|. \tag{3.8}$$

However, by assumption $a \geq A$, so $|pr| \leq 1$. If $|pr| = 0$, then

$$A = ap^2 + bpr + cr^2 \geq ap^2 + ar^2 = a(p^2 + r^2) \geq a,$$

from which it follows that $A = a$. On the other hand, if $|pr| = 1$, then by (3.8) $A \geq a$, so again we get $A = a$.

It remains to show that $B = b$ since, once shown, it follows from Exercise 3.7 on page 103 that $C = c$ since $B^2 - 4AC = b^2 - 4ac$.

Suppose that $c > C$. Then $c > a$ since $C \geq A = a$. If $|pr| = 1$, then by (3.7), using the fact that $cr^2 > ar^2$, we get that $A > a$, a contradiction. Hence, $|pr| = 0$. If $p = 0$, then again using (3.7), we conclude that $A > a$, so $r = 0$. Since $ps - qr = 1$, then $ps = 1$. Moreover, since $|B| \leq A = a$ given that $g$ is reduced, then from (3.6), we get $-a \leq |B| - |b| \leq a$. However, by (3.5), $B = 2apq + b$. It follows that $q = 0$ and $B = b$.

Lastly, suppose that $c < C$. By solving for $a, b, c$ in terms of $A, B, C$ we may reverse the roles of the variables and argue as above to the same conclusion that $B = b$. This completes the proof. □
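The existence half of Theorem 3.1 can also be realized algorithmically. The sketch below does not follow the minimality argument of the proof; it uses the familiar reduction procedure that alternates the translation $(x, y) \mapsto (x + ky, y)$ with the proper equivalence $(x, y) \mapsto (-y, x)$, which sends $(a, b, c)$ to $(c, -b, a)$. The function name `reduce_form` is a hypothetical helper, and the input is assumed to be a positive definite form given as integers $(a, b, c)$ with $a > 0$ and $b^2 - 4ac < 0$.

```python
def reduce_form(a, b, c):
    """Return the reduced form properly equivalent to the positive definite (a, b, c)."""
    if a <= 0 or b * b - 4 * a * c >= 0:
        raise ValueError("expected a positive definite form")
    while True:
        # Translate x -> x + k*y so that the middle coefficient lies in (-a, a].
        if not (-a < b <= a):
            k = (a - b) // (2 * a)
            c = a * k * k + b * k + c   # uses the old value of b
            b = b + 2 * a * k
        # If a > c, apply (x, y) -> (-y, x), which strictly decreases a.
        if a > c:
            a, b, c = c, -b, a
            continue
        # When a = c the definition requires b >= 0; (x, y) -> (-y, x) flips the sign.
        if a == c and b < 0:
            b = -b
        return (a, b, c)


reduced = reduce_form(29, 86, 64)   # the form g of Example 3.1
```

Applied to $29x^2 + 86xy + 64y^2$ from Example 3.1, the procedure returns $(1, 0, 7)$, that is, $x^2 + 7y^2$. The loop terminates because each swap replaces $a$ by a strictly smaller positive integer.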

Remark 3.4 The above says that there is a unique representative for each equivalence class of positive definite binary quadratic forms. Furthermore, by Exercise 3.11 on page 104, when $D < 0$, the number $h_D$ of classes of primitive positive definite forms of discriminant $D$ is finite, and $h_D$ is equal to the number of reduced primitive forms of discriminant $D$. (Note that we prove $h_D < \infty$ in general for field discriminants in Theorem 3.7 on page 116.)

The case for indefinite forms is not so straightforward. The uniqueness issue, in particular, is complicated since we may have many reduced forms equivalent to one another, and the determination as to which reduced forms are equivalent is more difficult. Yet, we resolve this issue in Theorem 3.5 on page 110.

We conclude this section with a result due to Landau (see Biography 3.1 on page 104). This result precisely delineates the negative discriminants $D = -4n$ for which $h_D = 1$, and the proof is essentially that of Landau [48].

Theorem 3.2 When $h_{-4n} = 1$ for $n > 0$

If $n \in \mathbb{N}$, then $h_{-4n} = 1$ if and only if $n \in \{1, 2, 3, 4, 7\}$.

Proof. Suppose that $h_{-4n} = 1$. The form $f(x, y) = x^2 + ny^2$ is clearly reduced since $a = 1$, $b = 0$, and $c = n \geq 1$ in Definition 3.4 on page 100. The result is clear for $n = 1$, so we assume that $n > 1$.

Case 3.1 n is not a prime power.

There exists a prime $p \mid n$ such that $p^d \,\|\, n$, for $d \in \mathbb{N}$, where $\|$ denotes proper division, also commonly called exactly divides, namely $p^d \mid n$, but $p^{d+1} \nmid n$ (see [68, Definition 1.3, p. 16] for the general notion). Let $a = \min(p^d, n/p^d)$ and $c = \max(p^d, n/p^d)$. Thus, $\gcd(a, c) = 1$, where $1 < a < c$, since $n$ is not a prime power. Thus, $g(x, y) = ax^2 + cy^2$ is a reduced form of discriminant $-4ac = -4n$, so $h_{-4n} > 1$, given that $f(x, y)$ is also a reduced form of discriminant $D$, unequal to $g(x, y)$. This completes Case 3.1.

Case 3.2 $n = 2^\ell$ where $\ell \in \mathbb{N}$.

We need to show that $h_{-4n} > 1$ for $\ell \geq 3$. If $\ell = 3$, then $D = -32$ and the form $g(x, y) = 3x^2 + 2xy + 3y^2$ is a reduced form of discriminant $2^2 - 4 \cdot 3 \cdot 3 = -32$ not equal to $f(x, y)$, so we may assume that $\ell \geq 4$. Set

$$g(x, y) = 4x^2 + 4xy + (2^{\ell-2} + 1)y^2,$$

which is primitive since $\gcd(4, 4, 2^{\ell-2} + 1) = 1$, and reduced since $4 < 2^{\ell-2} + 1$. Moreover, the discriminant is

$$D = 4^2 - 4 \cdot 4 \cdot (2^{\ell-2} + 1) = -16 \cdot 2^{\ell-2} = -2^{\ell+2} = -4n,$$

but $g \neq f$. This completes Case 3.2.

Case 3.3 $n = p^k$ where $p > 2$ is prime and $k \in \mathbb{N}$.

Suppose that $n + 1$ is not a prime power. Then, as in Case 3.1, we may write $n + 1 = ac$, where $1 < a < c$ and $\gcd(a, c) = 1$. Thus, $g(x, y) = ax^2 + 2xy + cy^2$ is a reduced form of discriminant $2^2 - 4ac = 4 - 4(n + 1) = -4n$, and $f \neq g$, so $h_{-4n} > 1$.

Lastly suppose that $n + 1 = 2^t$ where $t \in \mathbb{N}$, observing that $n + 1 = p^k + 1$ is even. If $t \geq 6$, then $g(x, y) = 8x^2 + 6xy + (2^{t-3} + 1)y^2$ is reduced since $8 < 2^{t-3} + 1$, and $\gcd(8, 6, 2^{t-3} + 1) = 1$. Also, $g$ has discriminant

$$D = 6^2 - 4 \cdot 8(2^{t-3} + 1) = 4 - 4 \cdot 2^t = 4 - 4(n + 1) = -4n,$$

and $f \neq g$, so $h_{-4n} > 1$. For $t \leq 5$, the values $t \in \{1, 2, 3, 4, 5\}$ have the corresponding values $n \in \{1, 3, 7, 15, 31\}$. It remains to exclude $n = 15, 31$.

If $n = 15$, then $n$ is not a prime power so this violates the hypothesis of Case 3.3. If $n = 31$, then the form $g(x, y) = 5x^2 + 4xy + 7y^2$ is reduced since $b = 4 < a = 5 < c = 7$, and is primitive since $\gcd(a, b, c) = 1$. Lastly, the discriminant is

$$D = 4^2 - 4 \cdot 5 \cdot 7 = -4 \cdot 31.$$

This completes Case 3.3, and we are done for this direction of the proof. Now we assume that $n \in \{1, 2, 3, 4, 7\}$. That $h_{-4n} = 1$ is Exercise 3.13. □
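Since $h_D$ for $D < 0$ equals the number of reduced primitive forms of discriminant $D$ (Remark 3.4), Theorem 3.2 can be checked numerically for small $n$ by direct enumeration. The sketch below is an illustration only: the helper names `reduced_forms` and `class_number` are hypothetical, and the search bound uses the fact that a reduced form satisfies $-D = 4ac - b^2 \geq 3a^2$.

```python
from math import gcd


def reduced_forms(D):
    """All reduced primitive positive definite forms (a, b, c) of discriminant D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("expected a negative discriminant congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -D:
        # |b| <= a, and b = -a is excluded because |b| = a forces b >= 0.
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:
                continue  # a = c forces b >= 0
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms


def class_number(D):
    """Number of reduced primitive positive definite forms of discriminant D."""
    return len(reduced_forms(D))


# Values of n up to 60 with exactly one class of discriminant -4n.
one_class = [n for n in range(1, 61) if class_number(-4 * n) == 1]
```

For $n \leq 60$ the list `one_class` is `[1, 2, 3, 4, 7]`, and for $n = 31$ the enumeration includes the form $(5, 4, 7)$ used in Case 3.3.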

Exercises

3.1. Prove that equivalent forms represent the same integers, and the same is true for proper representation.

3.2. Prove that the form $f(x, y)$ properly represents $n$ if and only if $f(x, y)$ is properly equivalent to the form $nx^2 + Bxy + Cy^2$ for some $B, C \in \mathbb{Z}$.

3.3. Prove that proper equivalence of forms is an equivalence relation, namely that the properties of reflexivity, symmetry, and transitivity are satisfied.

3.4. Prove that improper equivalence is not an equivalence relation.

3.5. Prove that any form equivalent to a primitive form must itself be primitive.

3.6. Prove that if $f$ represents $n \in \mathbb{Z}$, then there exists a $g \in \mathbb{N}$ such that $n = g^2n_1$ and $f$ properly represents $n_1$.

3.7. Suppose that $f \sim g$ where $f$ is a form of discriminant $D$ and $g$ is a form of discriminant $D_1$. Prove that $D = (ps - qr)^2D_1 = D_1$, where $f(x, y) = g(px + qy, rx + sy)$.

3.8. Provide an example of forms with the same discriminant that are not equivalent.

3.9. Let $D \equiv 0, 1 \pmod{4}$ and let $n$ be an integer relatively prime to $D$. Prove that if $n$ is properly represented by a primitive form of discriminant $D$, then $D$ is a quadratic residue modulo $|n|$, and if $n$ is even, then $D \equiv 1 \pmod{8}$. Conversely, if $n$ is odd and $D$ is a quadratic residue modulo $|n|$, or $n$ is even and $D$ is a quadratic residue modulo $4|n|$, then $n \in \mathbb{Z}$ is properly represented by a primitive form of discriminant $D$.

3.10. Let $n \in \mathbb{Z}$ and $p > 2$ a prime not dividing $n$. Prove that $p$ is represented by a primitive form of discriminant $-4n$ if and only if the Legendre symbol equality $(-n/p) = 1$ holds. (Hint: Use Exercise 3.9.)

3.11. For a fixed integer $D < 0$, let $h_D$ be the number of classes of primitive positive definite forms of discriminant $D$. Prove that $h_D$ is finite and is equal to the number of reduced forms of discriminant $D$.

3.12. Let $n \in \mathbb{N}$ and $p > 2$ prime with $p \nmid n$. Prove that the Legendre symbol $(-n/p) = 1$ if and only if $p$ is represented by one of the $h_{-4n}$ reduced forms of discriminant $-4n$. (Hint: See Exercises 3.10–3.11 and Theorem 3.1 on page 100.)

3.13. Prove that if $n \in \{1, 2, 3, 4, 7\}$, then $h_{-4n} = 1$.

Biography 3.1 Edmund Landau (1877–1938) was born in Berlin, Germany on February 14, 1877. He studied mathematics at the University of Berlin, where his doctoral thesis, awarded in 1899, was supervised by Frobenius. Landau taught at the University of Berlin for the decade 1899–1909. In 1909, when he was appointed as ordinary professor at the University of Göttingen, he had amassed nearly seventy publications. His appointment at Göttingen was as a successor to Minkowski. Hilbert and Klein were also colleagues there (see Biography 3.5 on page 127). He was a full professor there until the Nazis forced him out in 1933. On November 19, 1933, he was given permission to work at Groningen, Netherlands, where he remained until he retired on February 7, 1934. He returned to Berlin where he died of a heart attack on February 19, 1938.

Landau’s major contributions were in analytic number theory and the distribution of primes. For instance, his proof of the prime number theorem, published in 1903, was much more elementary than those given by Poussin and Hadamard (see [68, §1.9, pp. 65–72] for a detailed overview). He produced more than 250 publications in number theory and wrote several influential books on number theory.


3.2 Composition and the Form Class Group

The further mathematical theory is developed, the more harmoniously and uniformly does its construction proceed, and unsuspected relations are disclosed between hitherto separated branches of the science.

David Hilbert; see Biography 3.5 on page 127

Gauss is responsible for being the first to see the deep connections within genus theory (which we will study in §3.4) and composition, even though the seeds were there in the earlier work of Lagrange. However, Gauss’s definition of composition is difficult to use. Something close to Gauss’s idea is given via Exercise 3.30 on page 145 in the positive definite case, where the product of two forms $f(x_1, y_1)$ and $g(x_2, y_2)$ of discriminant $-4n$ is equal to a form $F(X, Y)$ where $X$ and $Y$ are integral bilinear forms. We take an approach that is due to Dirichlet and is much easier. First we need to develop some new notions. The first result allows us to select a canonical form in each equivalence class. For ease of elucidation, we restrict our attention to discriminants that are field discriminants (see Definition 1.6 on page 7).

Lemma 3.1 Canonical Forms

Let $F = \mathbb{Q}(\sqrt{\Delta_F})$ be a quadratic field of discriminant $\Delta_F$ and let $m \in \mathbb{Z}$. Then every proper equivalence class of forms of discriminant $\Delta_F$ contains a primitive form with positive leading coefficient that is relatively prime to $m$.

Proof. Let $f = (a, b, c) \in C_{\Delta_F}$ and set

$$P_{a,m,c} = \prod_p p,$$

where the product ranges over all distinct primes $p$ such that $p \mid a$, $p \mid c$ and $p \mid m$. Also set

$$P_{a,m} = \prod_q q,$$

where the product ranges over all distinct primes $q$ such that $q \mid a$, $q \mid m$, but $q \nmid c$, let

$$P_{c,m} = \prod_r r,$$

where the product ranges over all distinct primes $r$ such that $r \mid c$, $r \mid m$, but $r \nmid a$, and

$$S_m = \prod_s s,$$

where the product ranges over all distinct primes $s$ such that $s \mid m$ but $s \nmid P_{a,m,c}P_{a,m}P_{c,m}$. Then $f$ represents

$$aP_{a,m}^2 + bP_{a,m}P_{c,m}S_m + c(P_{c,m}S_m)^2 = N. \tag{3.9}$$

Claim 3.2 $\gcd(N, m) = 1$.

Assume that a prime $t \mid N$ and $t \mid m$. Assume first that $t \mid a$. Then

$$t \mid P_{a,m,c}P_{a,m}$$

by the definition of the latter. If $t \mid P_{a,m}$, then by (3.9),

$$t \mid cP_{c,m}S_m.$$

However, $t \nmid P_{c,m}S_m$, so $t \mid c$. This contradicts the fact that $t \mid P_{a,m}$. Hence, $t \nmid P_{a,m}$, so $t \mid P_{a,m,c}$. It follows from (3.9) that

$$t \mid bP_{a,m}P_{c,m}S_m.$$

However, we have already shown that $t \nmid P_{a,m}$ and since $t \mid a$, then $t \nmid P_{c,m}$. Also, $t \mid P_{a,m,c}$, so $t \nmid S_m$, which implies that $t \mid b$. We have shown that $t \mid \gcd(a, b, c)$, contradicting that $f$ is primitive. Hence, our initial assumption was false, namely we have shown that $t \nmid a$. Therefore,

$$t \mid P_{c,m}S_m$$

by the definition of the latter. However, by (3.9), this implies that $t \mid aP_{a,m}$, a contradiction to what we have already shown. This secures the claim.

By Exercise 3.2 on page 103, Claim 3.2 tells us that $f$ is properly equivalent to the form

$$g(x, y) = Nx^2 + Bxy + Cy^2$$

for some $B, C \in \mathbb{Z}$. If $N > 0$, then we have our result. If $N < 0$, then by setting $x_0 = Bm\ell + 1$ and $y_0 = -2N\ell m$ for some $\ell \in \mathbb{Z}$,

$$g(x_0, y_0) = Nx_0^2 + Bx_0y_0 + Cy_0^2$$

$$= N(Bm\ell + 1)^2 + B(Bm\ell + 1)(-2N\ell m) + C(2N\ell m)^2$$

$$= NB^2m^2\ell^2 + 2NBm\ell + N - 2NB^2m^2\ell^2 - 2NB\ell m + 4CN^2\ell^2m^2$$

$$= N(1 - m^2\ell^2(B^2 - 4NC)) = N(1 - m^2\ell^2\Delta_F) = Q,$$

where $Q > 0$ if $N < 0$. Since $f$ represents

$$Q = N(1 - m^2\ell^2\Delta_F)$$

and $Q$ is relatively prime to $m$, given that $N$ and $1 - m^2\ell^2\Delta_F$ are relatively prime to $m$, then Exercise 3.2 gives us the complete result. □

Now we make a connection with ideals that greatly simplifies the presentation.


Theorem 3.3 Ideals and Composition of Forms

Suppose that $\mathcal{O}_F$ is the ring of integers of a quadratic field of discriminant $\Delta_F$ and

$$f(x, y) = ax^2 + bxy + cy^2$$

is a primitive form, with $a > 0$, of discriminant $\Delta_F = b^2 - 4ac$. Then

$$I = (a, (-b + \sqrt{\Delta_F})/2)$$

is an $\mathcal{O}_F$-ideal.

Proof. Since $\Delta_F = b^2 - 4ac$, then $b^2 \equiv \Delta_F \pmod{4a}$, so by Exercise 2.4 on page 66, $I$ is an $\mathcal{O}_F$-ideal. □

Note that in Theorem 3.3, we must exclude the case $a < 0$ since the norm of an ideal must be positive. This excludes the negative definite case, but in view of Remark 3.2 on page 99, there is no loss of generality. Moreover, in the indefinite case, with $a < 0$, we may circumvent this via the techniques given in the proof of Theorem 3.5 on page 110. In particular, see (3.13) on page 112.

Definition 3.5 United Forms

Two primitive forms $f = (a_1, b_1, c_1)$ and $g = (a_2, b_2, c_2)$ of discriminant $D$ are called united if $\gcd(a_1, a_2, (b_1 + b_2)/2) = 1$.

Note that in Definition 3.5, since $b_1^2 - 4a_1c_1 = b_2^2 - 4a_2c_2$, then $b_1$ and $b_2$ have the same parity so $(b_1 + b_2)/2 \in \mathbb{Z}$.

Theorem 3.4 United Forms and Uniqueness

If $f = (a_1, b_1, c_1)$ and $g = (a_2, b_2, c_2)$ are united forms of discriminant $D$, where $D$ is a field discriminant, then there exists a unique integer $b_3$ modulo $2a_1a_2$ such that $b_3 \equiv b_j \pmod{2a_j}$ for $j = 1, 2$ and $b_3^2 \equiv D \pmod{4a_1a_2}$.

Proof. This is an immediate consequence of the multiplication formulas for quadratic ideals on page 59. □

Definition 3.6 Dirichlet Composition

Suppose that $f = (a_1, b_1, c_1)$ and $g = (a_2, b_2, c_2)$ are primitive, united forms of discriminant $\Delta_F$ where $\Delta_F$ is a field discriminant, $a_3 = a_1a_2$, $b_3$ is the value given in Theorem 3.4, and

$$c_3 = \frac{b_3^2 - \Delta_F}{4a_3}.$$

Then the Dirichlet composition of $f$ and $g$ is the form

$$f \circ g = G = (a_3, b_3, c_3).$$
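Definition 3.6 translates directly into a computation. The sketch below is illustrative and rests on stated assumptions: the function name `dirichlet_compose` is hypothetical, both leading coefficients are taken to be positive, and $b_3$ is found by a plain search over the residues modulo $2a_1a_2$ rather than by the ideal multiplication formulas cited in Theorem 3.4. The code checks only that the forms are united and share a discriminant; it does not check that the discriminant is a field discriminant.

```python
from math import gcd


def dirichlet_compose(f, g):
    """Dirichlet composition (a3, b3, c3) of united forms f and g."""
    a1, b1, c1 = f
    a2, b2, c2 = g
    if a1 <= 0 or a2 <= 0:
        raise ValueError("leading coefficients are assumed positive")
    D = b1 * b1 - 4 * a1 * c1
    if b2 * b2 - 4 * a2 * c2 != D:
        raise ValueError("forms must have the same discriminant")
    if gcd(gcd(a1, a2), (b1 + b2) // 2) != 1:
        raise ValueError("forms are not united")
    a3 = a1 * a2
    # Search the residues modulo 2*a1*a2 for the b3 described in Theorem 3.4.
    for b3 in range(2 * a3):
        if ((b3 - b1) % (2 * a1) == 0
                and (b3 - b2) % (2 * a2) == 0
                and (b3 * b3 - D) % (4 * a3) == 0):
            return (a3, b3, (b3 * b3 - D) // (4 * a3))
    raise ValueError("no admissible b3 found")


# Two compositions of forms of discriminant -23.
square = dirichlet_compose((2, 1, 3), (2, 1, 3))
product = dirichlet_compose((2, 1, 3), (3, 1, 2))
```

For discriminant $-23$, composing $(2, 1, 3)$ with itself gives $(4, 5, 3)$, and composing $(2, 1, 3)$ with $(3, 1, 2)$ gives $(6, 1, 1)$. Both results again have discriminant $-23$.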


Remark 3.5 Note that

$$(a_3, (b_3 + \sqrt{\Delta_F})/2)$$

is an $\mathcal{O}_F$-ideal where $F = \mathbb{Q}(\sqrt{\Delta_F})$ by the multiplication formulas given on page 59. This shows the intimate connection between multiplication of quadratic ideals and composition of forms. Indeed, we need not restrict to field discriminants for this to work. We could expand the discussion to nonmaximal orders in quadratic fields but then the delineation becomes more complicated since we must rely on special conditions for invertibility of ideals and other considerations, all of which are satisfied in the so-called maximal order $\mathcal{O}_F$. See [62] for the more general approach.

The form $G$, in Definition 3.6, is a form of discriminant

$$b_3^2 - 4a_3c_3 = b_3^2 - 4a_3(b_3^2 - \Delta_F)/(4a_3) = b_3^2 - b_3^2 + \Delta_F = \Delta_F.$$

Also it is primitive since if a prime $p \mid \gcd(a_3, b_3, c_3)$, then $p \mid a_1$ or $p \mid a_2$. Without loss of generality suppose it divides $a_1$. Then since $p \mid b_3$, we must have that $p \mid b_1$ since $b_3 \equiv b_1 \pmod{2a_1}$ by Theorem 3.4 on the preceding page. However, since $p \mid c_3$ and $b_3^2 - 4a_3c_3 = \Delta_F$, then $p^2 \mid \Delta_F$. However, $\Delta_F$ is a field discriminant so $p = 2$ and $\Delta_F \equiv 0 \pmod{4}$ is the only possibility. By Definition 1.6 on page 7, $\Delta_F/4 \equiv 2, 3 \pmod{4}$. If $\Delta_F/4 \equiv 2 \pmod{4}$, then by Theorem 3.4, $b_3/2$ is even since

$$\left(\frac{b_3}{2}\right)^2 \equiv \frac{\Delta_F}{4} \pmod{a_1a_2},$$

given that $2 \mid a_1$. However, we have

$$\left(\frac{b_3}{2}\right)^2 - a_3c_3 = \frac{\Delta_F}{4}, \tag{3.10}$$

so since $2 \mid a_3$ and $2 \mid c_3$, then $\Delta_F/4 \equiv 0 \pmod{4}$, a contradiction. Thus,

$$\Delta_F/4 \equiv 3 \pmod{4},$$

so by (3.10), $b_3/2$ is odd. However, (3.10) implies $\Delta_F/4 \equiv 1 \pmod{4}$, a contradiction. We have shown that, indeed, $G$ is a primitive form of discriminant $\Delta_F$.

Remark 3.6 The opposite of

$$f = (a, b, c)$$

is

$$f^{-1} = (a, -b, c),$$

which is the inverse of $f$ under Dirichlet composition. To see this we note that under the proper equivalence that sends $(x, y)$ to $(-y, x)$, $f^{-1} \sim (c, b, a)$, for which $\gcd(a, c, b) = 1$. This allows us to choose a united form in the class of $f^{-1}$ by Definition 3.5 on page 107, so we may perform Dirichlet composition to get

$$f \circ f^{-1} = G = \left(ac, b, \frac{b^2 - \Delta_F}{4ac}\right) = (ac, b, 1).$$

Moreover, by Exercise 3.31 on page 145,

$$G \sim (1, 0, -\Delta_F/4) \text{ when } \Delta_F \equiv 0 \pmod{4}$$

and

$$G \sim (1, 1, (1 - \Delta_F)/4) \text{ when } \Delta_F \equiv 1 \pmod{4}.$$

Thus, $G$ is in the principal class by Corollary 3.1 on page 112.

We now need to introduce the ideal class group as a vehicle for defining the form class group since Theorem 3.3 on page 107 gives us the connection.

Definition 3.7 Equivalence of Ideals

Let $\mathcal{O}_F$ be the ring of integers of a number field $F$. Then two $\mathcal{O}_F$-ideals $I, J$ are said to be in the same equivalence class if there exist nonzero $\alpha, \beta \in \mathcal{O}_F$ such that $(\alpha)I = (\beta)J$, denoted by $I \sim J$.

Remark 3.7 By Theorem 2.9 on page 73 and Exercise 2.17 on page 85, we know that the set of all fractional $\mathcal{O}_F$-ideals forms a multiplicative abelian group. If we denote this group by $I_{\Delta_F}$ and let $P_{\Delta_F}$ denote the group of principal ideals, then the quotient group

$$\frac{I_{\Delta_F}}{P_{\Delta_F}} = C_{\mathcal{O}_F}$$

is called the class group of $\mathcal{O}_F$. Also, the class of an $\mathcal{O}_F$-ideal $I$ is denoted by $\mathbf{I}$. Thus a product of classes $\mathbf{I}\mathbf{J} = \mathbf{C}$ is the class belonging to any ideal $C = IJ$ formed by multiplying representatives $I \in \mathbf{I}$ and $J \in \mathbf{J}$. The identity element $\mathbf{1}$ is the principal class, namely all principal ideals $(\alpha) \sim (1)$, meaning $(\alpha) \in \mathbf{1}$. The existence of inverse classes $\mathbf{I}^{-1}$ for any class $\mathbf{I}$ is guaranteed by Exercise 2.18 and Theorem 2.9, namely $\mathbf{I}\mathbf{I}^{-1} = \mathbf{1}$. The commutative and associative laws are clear, namely

$$\mathbf{I}\mathbf{J} = \mathbf{J}\mathbf{I}, \text{ and } \mathbf{I}(\mathbf{J}\mathbf{K}) = (\mathbf{I}\mathbf{J})\mathbf{K}, \text{ for } \mathcal{O}_F\text{-ideals } I, J, K.$$

Note as well, that the conjugate ideal $I'$ for $I$, first mentioned in Remark 2.2 on page 63, satisfies

$$\mathbf{I}^{-1} = \mathbf{I}'$$

(see Exercise 3.19 on page 127). In what follows, we will need to refine this concept a bit in order to be able to include indefinite binary quadratic forms. We let $P^+_{\Delta_F}$ denote the group of principal ideals $(\alpha)$ where $N_F(\alpha) > 0$ (see Definition 2.19 on page 91). Then we let

$$\frac{I_{\Delta_F}}{P^+_{\Delta_F}} = C^+_{\mathcal{O}_F},$$

known as the narrow ideal class group, or sometimes called the strict ideal class group. Clearly, when $F$ is a complex quadratic field, then $C_{\mathcal{O}_F} = C^+_{\mathcal{O}_F}$, since norms are necessarily positive in this case. In the real case we will learn more as we progress.

Note that in what follows, we use the symbol $\sim$ to denote both equivalence in the ordinary ideal class group $C_{\mathcal{O}_F}$ as well as equivalence of forms, but this will not lead to confusion when taken in context.

We use the symbol $\approx$ to denote strict equivalence in $C^+_{\mathcal{O}_F}$, i.e., $I \approx J$ in $C^+_{\mathcal{O}_F}$ when there exist $\alpha, \beta \in \mathcal{O}_F$ such that

$$(\alpha)I = (\beta)J$$

where $N_F(\alpha\beta) > 0$. The next result shows that this is tantamount to form equivalence.

Theorem 3.5 Form and Ideal Class Groups

If $C_{\Delta_F}$ denotes the set of classes of primitive forms of discriminant $\Delta_F$, where $F$ is a quadratic field, then $C_{\Delta_F}$ is a group with multiplication given by Dirichlet composition and

$$C^+_{\mathcal{O}_F} \cong C_{\Delta_F}.$$

Proof. Let $f = (a_1, b_1, c_1)$ and $g = (a_2, b_2, c_2)$, then by Exercises 3.2 on page 103 and 3.9 on page 104, $g \sim (a_2', b_2', c_2')$ where $\gcd(a_1, a_2') = 1$. Thus, Dirichlet composition is defined so we may assume $f$ and $g$ to be united, without loss of generality. Let $F = (a_3, b_3, c_3)$ be given as in Definition 3.6 on page 107. Then we know that via the ideal correspondence given in Theorem 3.3 on page 107,

$$(a_1, (b_1 - \sqrt{\Delta_F})/2)(a_2, (b_2 - \sqrt{\Delta_F})/2) = (a_3, (b_3 - \sqrt{\Delta_F})/2), \tag{3.11}$$

via the multiplication formulas on page 59. Thus, by Theorem 3.3 and (3.11), the Dirichlet composition of $f(x, y)$ and $g(x, y)$ corresponds to the product of the corresponding ideal classes, which shows that Dirichlet composition induces a well defined binary operation on $C_{\Delta_F}$.

Note that in what follows, if we have strict equivalence of ideals given by

$$I = (a, (-b + \sqrt{\Delta_F})/2) \approx J = (a', (-b' + \sqrt{\Delta_F})/2), \tag{3.12}$$

then we may replace $I$ by $(aa')I$ and $J$ by $(a^2)J$, so we may assume without loss of generality that $a = a'$. Via Theorem 3.3, we may define a mapping from $C^+_{\mathcal{O}_F}$ to $C_{\Delta_F}$ as follows

$$\tau : (a, (-b + \sqrt{\Delta_F})/2) \mapsto f = (a, b, c),$$

where

$$c = (b^2 - \Delta_F)/(4a).$$

Moreover, by the above,

$$\tau(IJ) = \tau(I)\tau(J)$$

since we have shown that ideal multiplication corresponds to form multiplication. To see that $\tau$ is well defined, assume that $a' > 0$ and $b' \in \mathbb{Z}$ in (3.12). Thus, since there are $\delta, \sigma \in O_F$ such that $(\delta)I = (\sigma)J$ where $N_F(\delta\sigma) > 0$, then

$$N_F(\delta/\sigma)N(I) = N(J) = a,$$

so $N_F(\delta/\sigma) = 1$. By Exercise 3.21 on page 127, there is a $\gamma \in O_F$ such that $\delta/\sigma = \gamma/\gamma'$. If

$$m_{\gamma,\mathbb{Q}}(x) = ux^2 + vx + w$$

is the minimal polynomial of $\gamma$ over $\mathbb{Q}$, then it is for $\gamma'$ as well, so $\tau((\gamma)) = \tau((\gamma')) = (u, v, w)$. Hence,

$$\tau((\delta/\sigma)I) = \tau((\gamma/\gamma'))\,\tau(I) = \tau(I).$$

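Hence, it suffices to prove that $\tau(I) = \tau(J)$ when $I = J$. By Exercise 2.5 on page 66, there exists

$$X = \begin{pmatrix} p & q \\ r & s \end{pmatrix} \in GL(2, \mathbb{Z}),$$

such that

$$\begin{pmatrix} (-b + \sqrt{\Delta_F})/2 \\ a \end{pmatrix} = X \begin{pmatrix} (-b' + \sqrt{\Delta_F})/2 \\ a \end{pmatrix}.$$

Therefore,

$$p\left(\frac{-b' + \sqrt{\Delta_F}}{2}\right) + qa = \frac{-b + \sqrt{\Delta_F}}{2}$$

and

$$r\left(\frac{-b' + \sqrt{\Delta_F}}{2}\right) + sa = a,$$

from which it follows that $r = 0$, $s = p = 1$, and $b = b' - 2qa$. Hence,

$$ax^2 + bxy + cy^2 = f(x, y) = g(x - qy, y) = a(x - qy)^2 + b'(x - qy)y + c'y^2,$$

so $f$ and $g$ are properly equivalent, namely they are in the same class in $C_{\Delta_F}$, so $\tau$ is well defined. Now we establish the isomorphism.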

First we show that $\tau$ is injective. Let

$$\tau(a, (-b + \sqrt{\Delta_F})/2) = f = (a, b, c) \sim \tau(a', (-b' + \sqrt{\Delta_F})/2) = g = (a', b', c')$$

in $C_{\Delta_F}$. Since

$$(aa')(a, (-b + \sqrt{\Delta_F})/2) \approx (a^2)(a', (-b' + \sqrt{\Delta_F})/2)$$

as $O_F$-ideals, then we may assume that $a = a'$ without loss of generality since, if they are not equal, we may change the preimage to make it so as above. Now since

$$f\left(\frac{-b + \sqrt{\Delta_F}}{2a}, 1\right) = 0 = f\left(\frac{-b' + \sqrt{\Delta_F}}{2a}, 1\right),$$

then either

$$\frac{-b + \sqrt{\Delta_F}}{2a} = \frac{-b' + \sqrt{\Delta_F}}{2a} \quad\text{or}\quad \frac{-b + \sqrt{\Delta_F}}{2a} = \frac{-b' - \sqrt{\Delta_F}}{2a},$$

given that these are the only two roots of $f(x, 1) = ax^2 + bx + c = 0$. However, the latter is impossible by comparing coefficients, so the former holds, from which we get that $b = b'$ so $c = c'$. Thus, $\tau$ is injective.

Lastly, we show that $\tau$ is surjective. Let

$$f = (a, b, c)$$

be a primitive form of discriminant $\Delta_F$ and let

$$\alpha = (-b + \sqrt{\Delta_F})/(2a).$$

Then $f(\alpha, 1) = 0$, and $a\alpha \in O_F$. Define an $O_F$-ideal as follows. Set

$$I = \begin{cases} (a, a\alpha) & \text{if } a > 0,\\ (\sqrt{\Delta_F})(a, a\alpha) & \text{if } a < 0 \text{ and } \Delta_F > 0. \end{cases} \tag{3.13}$$

Therefore, $\tau(I) = (a, b, c)$ in the first instance is clear. In the second instance, we note that $I \approx (a, (-b + \sqrt{\Delta_F})/2)$ so

$$\tau(I) = \tau((a, (-b + \sqrt{\Delta_F})/2)) = (a, b, c).$$

Hence, $\tau$ is surjective and the isomorphism is established. $\Box$

Corollary 3.1 The identity element of $C_{\Delta_F}$ is the class containing the principal form $(1, 0, -\Delta_F/4)$ or $(1, 1, (1 - \Delta_F)/4)$ for $\Delta_F \equiv 0, 1 \pmod 4$, respectively.

Proof. Since

$$\tau(1, \sqrt{\Delta_F}/2) = (1, 0, -\Delta_F/4) \quad\text{or}\quad \tau(1, (-1 + \sqrt{\Delta_F})/2) = (1, 1, (1 - \Delta_F)/4)$$

depending on congruence modulo 4 of $\Delta_F$, and the preimages are the identity elements in the principal class of $C^+_{O_F}$, then the images are clearly the identity elements in the principal class of $C_{\Delta_F}$. $\Box$
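As a small illustration of the correspondence used in Theorem 3.5 and Corollary 3.1, the following Python sketch computes the form attached to an ideal $(a, (-b + \sqrt{\Delta_F})/2)$ through $c = (b^2 - \Delta_F)/(4a)$, together with the principal form. The function names `form_from_ideal` and `principal_form` are hypothetical helpers introduced only for this example, and the ideal is represented simply by the integer pair $(a, b)$.

```python
def form_from_ideal(a, b, disc):
    """Form (a, b, c) attached to the ideal (a, (-b + sqrt(disc))/2).

    Assumes a != 0 and that 4a divides b^2 - disc, as required for the
    pair (a, b) to describe an ideal in the sense of the text.
    """
    num = b * b - disc
    if num % (4 * a) != 0:
        raise ValueError("4a must divide b^2 - disc")
    return (a, b, num // (4 * a))


def principal_form(disc):
    """Principal form of discriminant disc, which must be 0 or 1 modulo 4."""
    if disc % 4 == 0:
        return (1, 0, -disc // 4)
    if disc % 4 == 1:
        return (1, 1, (1 - disc) // 4)
    raise ValueError("a discriminant must be 0 or 1 modulo 4")


# The ideals (1, sqrt(12)/2) and (1, (-1 + sqrt(5))/2) map to the principal forms.
print(form_from_ideal(1, 0, 12), principal_form(12))
print(form_from_ideal(1, 1, 5), principal_form(5))
```

For $\Delta_F = 12$ both calls give $(1, 0, -3)$, and for $\Delta_F = 5$ both give $(1, 1, -1)$, in line with Corollary 3.1.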

Remark 3.8 When $F$ is a complex quadratic field, as noted in Remark 3.7 on page 109,

$$C_{O_F} = C^+_{O_F},$$

so by Theorem 3.5 on page 110,

$$C_{\Delta_F} \cong C_{O_F}.$$

However, in the real case, this is not always true. For instance, by Exercise 3.14, in the case where $\Delta_F = 12$, $C_{\Delta_F} \neq \{1\}$ and $C_{O_F}$ has order 1. Yet by Theorem 3.5,

$$C^+_{O_F} \cong C_{\Delta_F}.$$

Indeed, in the case where the field $F$ is real and has a unit of norm $-1$ or $F$ is complex, then by Exercise 3.17 on page 117, $C_{O_F} = C^+_{O_F}$ always holds. When $F$ is real and has no such unit, for instance as in the $\Delta_F = 12$ case, then by Exercise 3.16,

$$|C^+_{O_F} : C_{O_F}| = 2.$$

Note as well, if

$$h^+_{O_F} = |C^+_{O_F}|,$$

the narrow ideal class number, then by Theorem 3.5,

$$h^+_{O_F} = h_{\Delta_F},$$

the number of classes of forms of discriminant $\Delta_F$. Also, if

$$h_{O_F} = |C_{O_F}|,$$

the ordinary or wide class number, by the above discussion, we have demonstrated the following.

Theorem 3.6 Class Numbers of Forms and Ideals

If $\Delta_F$ is the discriminant of a quadratic field $F$, then the class number of the form class group $h_{\Delta_F}$, as well as that of both the wide ideal class group $h_{O_F}$ and the narrow ideal class group $h^+_{O_F}$, are related by the following.

$$h_{\Delta_F} = h^+_{O_F} = \begin{cases} h_{O_F} & \text{if } \Delta_F < 0,\\ h_{O_F} & \text{if } \Delta_F > 0 \text{ and there exists a } u \in U_F \text{ with } N_F(u) = -1,\\ 2h_{O_F} & \text{if } \Delta_F > 0 \text{ and there is no } u \in U_F \text{ with } N_F(u) = -1. \end{cases}$$

We conclude this section with a verification that $h_{\Delta_F}$ is finite. To do this we first need the following result.

Lemma 3.2 A Form of Reduction

If $\Delta_F$ is the discriminant of a quadratic field $F$, then in each class of $C_{\Delta_F}$ there is a form $f = (a, b, c)$ such that

$$|b| \le |a| \le |c|.$$


Proof. Let the form $f = (a_1, b_1, c_1)$ be in an arbitrary class of $C_{\Delta_F}$. We may select an integer $a$ such that $|a|$ is the least value from the set of nonzero integers represented by forms in the class of $f$. Then there exist $p, r \in \mathbb{Z}$ such that

$$a = a_1p^2 + b_1pr + c_1r^2. \tag{3.14}$$

If $g = \gcd(p, r)$, then $a/g^2$ is represented by $f$, contradicting the minimality of $|a|$ unless $g = 1$. Therefore, by the Euclidean algorithm, there exist integers $q, s$ such that $ps - qr = 1$. Also,

$$f(px + qy, rx + sy) = a_1(px + qy)^2 + b_1(px + qy)(rx + sy) + c_1(rx + sy)^2$$
$$= (a_1p^2 + b_1pr + c_1r^2)x^2 + (2a_1pq + b_1(ps + qr) + 2c_1rs)xy + (a_1q^2 + b_1qs + c_1s^2)y^2$$
$$= ax^2 + Bxy + Cy^2,$$

where the coefficient for $x^2$ comes from (3.14),

$$B = 2pqa_1 + (ps + qr)b_1 + 2rsc_1,$$

and

$$C = a_1q^2 + b_1qs + c_1s^2.$$

Set $g(x, y) = ax^2 + Bxy + Cy^2$ and we have $f \sim g$ in $C_{\Delta_F}$. We may select an integer $m$ such that

$$|2am + B| \le |a|. \tag{3.15}$$

Thus,

$$g(x + my, y) = a(x + my)^2 + B(x + my)y + Cy^2 = ax^2 + (2am + B)xy + (am^2 + Bm + C)y^2 = ax^2 + bxy + cy^2,$$

with

$$b = 2am + B,$$

and

$$c = am^2 + Bm + C.$$

Set

$$h(x, y) = ax^2 + bxy + cy^2.$$

Then, since $\Delta_F = b^2 - 4ac$, given that $f \sim g \sim h$, then $c = 0$ implies that $\Delta_F = b^2$, a contradiction to the fact that $\Delta_F$ is a field discriminant. Hence, since $h(0, 1) = c$, then $|c| \ge |a|$ by the minimality of $|a|$. Thus, from (3.15), we have the result. $\Box$
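The proof above is an existence argument based on a minimal represented value. As a computational companion, here is a minimal Python sketch of a different, constructive route to the same inequality $|b| \le |a| \le |c|$: alternate the translation $x \mapsto x + my$ (which realizes (3.15)) with the swap $(a, b, c) \mapsto (c, -b, a)$, which comes from the substitution $(x, y) \mapsto (-y, x)$ of determinant 1. This is not the method of the proof, only an assumed alternative; the name `lemma_reduce` is hypothetical, and the sketch assumes a nonsquare discriminant so that no outer coefficient ever becomes zero.

```python
def lemma_reduce(a, b, c):
    """Return a properly equivalent form (a', b', c') with |b'| <= |a'| <= |c'|.

    Sketch only: assumes a != 0 and that b^2 - 4ac is not a perfect square.
    """
    disc = b * b - 4 * a * c
    while True:
        abs_a = abs(a)
        # Pick the residue r of b modulo 2|a| lying in (-|a|, |a|].
        r = b % (2 * abs_a)
        if r > abs_a:
            r -= 2 * abs_a
        m = (r - b) // (2 * a)          # exact: 2|a| divides r - b
        c_new = a * m * m + b * m + c   # third coefficient of f(x + my, y)
        assert r * r - 4 * a * c_new == disc  # translation keeps the discriminant
        if abs_a <= abs(c_new):
            return (a, r, c_new)
        # Otherwise |c_new| < |a|: swap the outer coefficients and repeat.
        a, b, c = c_new, -r, a


print(lemma_reduce(1, 4, 1))    # a form of discriminant 12
print(lemma_reduce(13, 12, 2))  # a form of discriminant 40
```

Each swap strictly decreases $|a|$, so the loop stops after finitely many steps. The two calls return $(1, 0, -3)$ and $(2, 0, -5)$ respectively.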

Corollary 3.2 Any form of discriminant $\Delta_F$ is equivalent to a reduced form of the same discriminant.

Proof. By Theorem 3.1 on page 100, we need only prove the result for $\Delta_F > 0$.

Claim 3.3 We may assume that $(a, b, c)$ satisfies $|a| \le |c|$ with

$$\sqrt{\Delta_F} - 2|a| < b < \sqrt{\Delta_F}.$$

By Lemma 3.2, we may select a form $(a, b, c)$ such that $|b| \le |a| \le |c|$. If $\sqrt{\Delta_F} - 2|a| > b$, then by setting

m ==&

#F

2c+

b|c|2c

+ 3

>,

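where

$$\varepsilon = \begin{cases} 1 & \text{if } c < 0,\\ 0 & \text{if } c > 0, \end{cases}$$

we get

$$\sqrt{\Delta_F} - 2|c| < -b + 2cm < \sqrt{\Delta_F}.$$

We now show that

$$(a, b, c) \sim (c, -b + 2cm, a - bm + cm^2). \tag{3.16}$$

Via the map $\tau$ in Theorem 3.5 on page 110,

$$\tau : \left(a, \frac{-b + \sqrt{\Delta_F}}{2}\right) \mapsto (a, b, c),$$

and

$$\tau : \left(c, \frac{b - 2cm + \sqrt{\Delta_F}}{2}\right) \mapsto (c, -b + 2cm, a - bm + cm^2),$$

as $O_F$-ideals. However, by Exercise 2.6 on page 66,

$$\left(c, \frac{b - 2cm + \sqrt{\Delta_F}}{2}\right) = \left(c, \frac{b + \sqrt{\Delta_F}}{2}\right),$$

so

$$\left(c, \frac{b - 2cm + \sqrt{\Delta_F}}{2}\right) \sim \left(\frac{b - \sqrt{\Delta_F}}{2c}\right)\left(c, \frac{b + \sqrt{\Delta_F}}{2}\right) = \left(a, \frac{b - \sqrt{\Delta_F}}{2}\right) = \left(a, \frac{-b + \sqrt{\Delta_F}}{2}\right).$$

Since $\tau$ is a bijection, we have established (3.16).

If $|a - bm + cm^2| < |c|$, then we repeat the (finite) process, this time on

$$(c, -b + 2cm, a - bm + cm^2),$$

which must terminate in

$$(A, B, C) \sim (a, b, c)$$

with

$$|A| \le |C| \quad\text{and}\quad \sqrt{\Delta_F} - 2|A| < B < \sqrt{\Delta_F}.$$

This is Claim 3.3.

Therefore,

$$0 < \sqrt{\Delta_F} - b < 2|a| \le 2|c| = \frac{|\Delta_F - b^2|}{2|a|} < \left|\sqrt{\Delta_F} + b\right|.$$

Hence, $b > 0$, so $b^2 < \Delta_F$ and $|2a|^2 \le 4|ac| = \Delta_F - b^2 < \Delta_F$, so

$$2|a| < \sqrt{\Delta_F} < \sqrt{\Delta_F} + b,$$

from which it follows that $(a, b, c)$ is reduced. $\Box$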

Theorem 3.7 $h_{\Delta_F} < \infty$

If $F$ is a quadratic field with discriminant $\Delta_F$, then $h_{\Delta_F}$ is finite.

Proof. Note that by Exercise 3.11 on page 104, we need only consider the case where $\Delta_F > 0$. By Lemma 3.2 on page 113, for any class of $C_{\Delta_F}$, there is a form $f = (a, b, c)$ in the class with

$$|ac| \ge b^2 = \Delta_F + 4ac > 4ac,$$

so $ac < 0$. Moreover,

$$4a^2 \le 4|ac| = -4ac = \Delta_F - b^2 \le \Delta_F.$$

Therefore,

$$|a| \le \sqrt{\Delta_F}/2, \tag{3.17}$$

so by Lemma 3.2,

$$|b| \le \sqrt{\Delta_F}/2. \tag{3.18}$$

Hence, by the bounds in (3.17)–(3.18), there can only be finitely many choices for the values $a$ and $b$ for a given discriminant $\Delta_F$. Since

$$c = \frac{b^2 - \Delta_F}{4a},$$

we have established the result. $\Box$
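The finiteness argument can be made concrete. The following Python sketch, under the assumption $\Delta_F > 0$, lists every primitive form $(a, b, c)$ of discriminant $\Delta_F$ with $|b| \le |a| \le |c|$ by running over the finitely many $a$ allowed by (3.17). The name `lemma_forms` is a hypothetical helper; the list it returns is only a finite set of candidates containing a representative of each class, and different entries may still be equivalent.

```python
from math import gcd


def lemma_forms(disc):
    """Primitive forms (a, b, c) of discriminant disc > 0 with |b| <= |a| <= |c|."""
    forms = []
    a_abs = 1
    while 4 * a_abs * a_abs <= disc:          # the bound (3.17)
        for a in (a_abs, -a_abs):
            for b in range(-a_abs, a_abs + 1):  # |b| <= |a|, hence (3.18)
                num = b * b - disc
                if num % (4 * a) != 0:
                    continue                  # c must be an integer
                c = num // (4 * a)
                if abs(c) >= a_abs and gcd(gcd(a, b), c) == 1:
                    forms.append((a, b, c))
        a_abs += 1
    return forms


print(lemma_forms(12))
```

For $\Delta_F = 12$ the list is $(1, 0, -3)$ and $(-1, 0, 3)$, the two forms that Exercise 3.14 asks the reader to show are not properly equivalent.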

Corollary 3.3 Positive Definite Forms and Reduction

When $\Delta_F < 0$, then the number of inequivalent positive definite forms with discriminant $\Delta_F$ is the same as the number of reduced forms.

Proof. See Exercise 3.11. $\Box$

Corollary 3.4 $h_{O_F} < \infty$

If $\Delta_F$ is the discriminant of a quadratic field $F$, then $h_{O_F}$ is finite.

Proof. This follows from Theorem 3.6 on page 113 and Theorem 3.7. $\Box$

Exercises

3.14. Prove that when $\Delta_F = 12$ where $F = \mathbb{Q}(\sqrt{3})$, then the form $f = (-1, 0, 3)$ is not properly equivalent to the form $g = (1, 0, -3)$. This shows that $C_{\Delta_F} \neq \{1\}$. Show, however, that $C_{O_F} = \{1\}$.
(Hint: See Remark 1.19 on page 50 and Theorem 2.17 on page 83.)

In Exercises 3.15–3.17, assume that $\Delta_F$ is the discriminant of a quadratic field $F$.

3.15. Let $F$ be a real quadratic field and set

$$\alpha = \begin{cases} (1, 0, -\Delta_F/4) & \text{if } \Delta_F \equiv 0 \pmod 4,\\ (1, 1, (1 - \Delta_F)/4) & \text{if } \Delta_F \equiv 1 \pmod 4. \end{cases}$$

Prove that $\alpha \sim -\alpha$ in $C_{\Delta_F}$ if and only if $O_F$ has a unit $u$ such that $N_F(u) = -1$.

3.16. Let $F$ be a real quadratic field. Assume that $O_F$ does not have a unit of norm $-1$. Prove that

$$|C^+_{O_F} : C_{O_F}| = 2.$$

(Hint: Use Exercise 3.15.)

3.17. Prove that $C^+_{O_F} = C_{O_F}$ if $F$ is either a complex quadratic field or $F$ is a real quadratic field such that $O_F$ has a unit $u$ with $N_F(u) = -1$.
(Hint: Use Exercise 3.15.)

3.18. Let $F$ be a number field and let $h_{O_F}$ be the (wide) class number of $F$. Prove that if $I$ is an integral $O_F$-ideal, then $I^{h_{O_F}} \sim 1$.
(Hint: By Theorem 3.7 on the facing page, $|h_{O_F}| < \infty$.)


3.3 Applications via Ambiguity

Seal up the mouth of outrage for a while,
Till we can clear the ambiguities.

from act five, scene 3, line 216 of Romeo and Juliet (1595)
William Shakespeare

In Remark 2.2 on page 63, we first mentioned the conjugate $I'$ of an ideal $I$ in $C_{O_F}$ and we mentioned norms of ideals in Exercise 2.4 on page 66. These are important concepts that we now formalize.

Definition 3.8 Conjugates and Norms of Ideals

Suppose that $F$ is a quadratic field of discriminant $\Delta_F$. If

$$I = \left(a, \frac{-b + \sqrt{\Delta_F}}{2}\right) \tag{3.19}$$

is an $O_F$-ideal, then

$$I' = \left(a, \frac{b + \sqrt{\Delta_F}}{2}\right)$$

is called the conjugate ideal of $I$. The representation of $I$ given in (3.19) is called the Hermite normal form of $I$, and similarly for its conjugate (see Biography 3.4 on page 126). The value $a > 0$ is called the norm of $I$ (and of $I'$) denoted by

$$a = N(I) = N(I'),$$

the smallest positive integer in the ideal. Also,

$$N(IJ) = N(I)N(J) \text{ for } O_F\text{-ideals } I, J.$$

By Exercises 3.19–3.20 on page 127,

an ideal $I$ has order at most 2 in $C_{O_F}$ if and only if $I \sim I' \sim I^{-1}$ in $C_{O_F}$.

The elements of order 2 in both the form and ideal class groups are intimately linked and play an important role, including some interesting and valuable applications that we present in this section. First, we need the following which will be the gateway to linking forms and ideals in this context.

Definition 3.9 Ambiguous Ideals

If $F$ is a quadratic field of discriminant $\Delta_F$ and

$$I = (a, (-b + \sqrt{\Delta_F})/2)$$

is an $O_F$-ideal, then $I$ is called ambiguous if

$$I = I' = (a, (b + \sqrt{\Delta_F})/2).$$

An ambiguous class of ideals in $C^+_{O_F}$ is one that contains an ideal $I$ such that $I \approx I'$.


Definition 3.10 Ambiguous Forms

If $F$ is a quadratic field of discriminant $\Delta_F$ and $f = (a, b, c)$ is a primitive form of discriminant $\Delta_F = b^2 - 4ac$, then $f$ is said to be ambiguous if $a \mid b$. An ambiguous class in $C_{\Delta_F}$ is one that contains an ambiguous form.

Now we embark upon linking Definitions 3.9–3.10.

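Lemma 3.3 If $F$ is a quadratic field of discriminant $\Delta_F$ and $I$ is a primitive $O_F$-ideal, then

$$I = I'$$

if and only if

$$N(I) \mid \Delta_F.$$

Proof. If $I = I'$, then

$$\left(N(I), \frac{-b + \sqrt{\Delta_F}}{2}\right) = \left(N(I), \frac{b + \sqrt{\Delta_F}}{2}\right),$$

so

$$\frac{b + \sqrt{\Delta_F}}{2} + \frac{-b + \sqrt{\Delta_F}}{2} = \sqrt{\Delta_F} \in I.$$

Thus, $(\Delta_F) \subseteq I$, so by Corollary 2.5 on page 76, $N(I) \mid \Delta_F$.

Conversely, suppose that $N(I) \mid \Delta_F$. Since Exercise 2.4 on page 66 tells us that

$$N(I) \mid N_F((b + \sqrt{\Delta_F})/2),$$

then $N(I) \mid b$, so we set $b = N(I)d$ for some $d \in \mathbb{Z}$. Then from Exercise 2.6,

$$I = \left(N(I), \frac{-b + \sqrt{\Delta_F}}{2}\right) = \left(N(I), \frac{-2dN(I) + b + \sqrt{\Delta_F}}{2}\right) = \left(N(I), \frac{b + \sqrt{\Delta_F}}{2}\right) = I',$$

as required. $\Box$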

Corollary 3.5 An $O_F$-ideal $I = (a, (-b + \sqrt{\Delta_F})/2)$ is ambiguous if and only if $a \mid b$.

Proof. If $I = I'$, then

$$N(I) = a \mid \Delta_F = b^2 - 4ac,$$

by Lemma 3.3, so $a \mid b$ since $\Delta_F$ is either $4D$ or $D$ where $D$ is squarefree. Note that it is not possible that $a = 4$ since, when $\Delta_F \equiv 0 \pmod 4$, we must have that $D \equiv 2, 3 \pmod 4$. Conversely, if $a \mid b$, then $a = N(I) \mid \Delta_F$, so by Lemma 3.3, $I = I'$. $\Box$
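The equivalence between $N(I) \mid \Delta_F$ (Lemma 3.3) and $a \mid b$ (Corollary 3.5) can be spot-checked numerically. The Python sketch below is only a brute-force sanity check over small norms, not a proof; it represents an ideal by the pair $(a, b)$ with $4a \mid b^2 - \Delta_F$, and the function name `check_ambiguity_criterion` and the cutoff `max_a` are assumptions made for this illustration.

```python
def check_ambiguity_criterion(disc, max_a=60):
    """Check that a | disc holds exactly when a | b, for all pairs (a, b)
    with 1 <= a <= max_a and 4a | b^2 - disc.

    Since (b + 2a)^2 is congruent to b^2 modulo 4a, it is enough to let b
    run over one full set of residues modulo 2a.
    """
    for a in range(1, max_a + 1):
        for b in range(2 * a):
            if (b * b - disc) % (4 * a) != 0:
                continue  # (a, b) does not describe an ideal
            if (disc % a == 0) != (b % a == 0):
                return False
    return True


for disc in (5, 12, 40, 221, -3, -4, -20):
    print(disc, check_ambiguity_criterion(disc))
```

All seven field discriminants pass. The hypothesis that $\Delta_F$ is a field discriminant matters: for the non-fundamental value 20 the pair $(a, b) = (4, 2)$ satisfies $4a \mid b^2 - 20$ and $a \mid 20$ but not $a \mid b$, so `check_ambiguity_criterion(20)` returns `False`.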

The next result gives us conditions on strict equivalence, namely equivalence in $C^+_{O_F}$, not explicit in the literature. The reader should be reminded of the distinction between strict ideal equivalence, denoted by $I \approx J$, and ordinary ideal equivalence, denoted by $I \sim J$, as discussed in Remark 3.7 on page 109.


Lemma 3.4 If $I$ is a primitive $O_F$-ideal of $C_{O_F}$, then the following are equivalent.

(a) $I \approx I'$.

(b) There exists an $O_F$-ideal $J$ such that $N(J) \mid \Delta_F$ and $I \sim J$.

(c) There exists a primitive $O_F$-ideal $J$ such that $I \sim J$ and $J = J'$.

Proof. If $I \approx I'$, then there exist $\alpha, \beta \in O_F$ such that

$$(\alpha)I = (\beta)I'$$

where $N_F(\alpha) > 0$ and $N_F(\beta) > 0$. Thus, $N_F(\alpha/\beta) = 1$, so by Exercise 3.21 on page 127, we know there exists $\gamma \in O_F$ such that

$$\frac{\alpha}{\beta} = \frac{\gamma}{\gamma'},$$

so

$$(\gamma)I = (\gamma')I'. \tag{3.20}$$

Suppose now that $n \in \mathbb{N}$ is the largest value such that

$$(\gamma)I = (n)J,$$

where $J$ is a primitive $O_F$-ideal. Then from (3.20), $J = J'$. Hence, from Lemma 3.3, $N(J) \mid \Delta_F$ and from (3.20), $I \sim J$. Thus, (a) implies (b).

If (b) holds, then (c) holds by Lemma 3.3. If (c) holds, then $I$ is in an ambiguous class of $C_{O_F}$ having an ambiguous ideal $J$, so there exist $\alpha, \beta \in O_F$ such that

$$(\alpha)I = (\beta)J.$$

Hence, since $J = J'$, it follows that

$$(\beta\alpha')I' = (\beta'\alpha)I.$$

Since

$$N_F(\beta\alpha') = N_F(\beta'\alpha),$$

then $N_F(\beta\alpha'\beta'\alpha) > 0$, so by Remark 3.7 on page 109, $I \approx I'$. Thus, (c) implies (a) and this completes the logical circle. $\Box$

Now we bring in forms and the connection to ambiguous ideals will materialize.

Theorem 3.8 Forms of Order $\le 2$ Are in an Ambiguous Class

Suppose that $f$ is a binary quadratic form with discriminant $\Delta_F$ where $F$ is a quadratic field, and $C_{\Delta_F}$ is the form class group. Then the following are equivalent.

(a) $f$ has order 1 or 2 in $C_{\Delta_F}$.

(b) $f \sim f^{-1}$.

(c) $f$ is equivalent to an ambiguous form.

Proof. Suppose that $f = (a, b, c)$. If $f$ has order at most 2, then

$$f \circ f \sim 1$$

so

$$f \sim f^{-1}.$$

Thus, (a) implies (b). If (b) holds, then by Theorem 3.5 on page 110,

$$I = (a, (-b + \sqrt{\Delta_F})/2) \approx (a, (b + \sqrt{\Delta_F})/2) = I',$$

so by Lemma 3.4, $I \sim J$ where $J$ is ambiguous. Thus,

either $I \approx J$ or $I \approx (\sqrt{\Delta_F})J$, where $(\sqrt{\Delta_F})J$ is also ambiguous.

Hence, $f$ is equivalent to an ambiguous form. If (c) holds, then by the multiplication formulas for ideals on page 59, and the correspondence via Theorem 3.5, $f^2 \sim 1$, so (c) implies (a). This establishes the equivalence of (a), (b) and (c). $\Box$

To show that the concept of ambiguity has even more formidable relationships, we state two of them as closing features to highlight the connections.

We need the following concept.

Definition 3.11 Radicands of Quadratic Fields

If $\Delta_F$ is the discriminant of a quadratic field $F$, then the radicand is given by

$$D_F = \begin{cases} \Delta_F/4 & \text{if } \Delta_F \equiv 0 \pmod 4,\\ \Delta_F & \text{if } \Delta_F \equiv 1 \pmod 4. \end{cases}$$

It was an outstanding problem to give criteria for a sum of two squares under the following situation. We quote from the well-written paper [82]: “An apparently open problem is to characterize those $D$ that are a sum of two relatively prime squares but for which $x^2 - Dy^2$ does not represent $-1$. Such $D$ include 34, 146, 178, 194, 205, 221, 305, 377, 386, and 410.” This is accomplished as follows. We state the result for quadratic field radicands, although in [70] it is proved for arbitrary nonsquare integers.
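The ten values of $D$ quoted above can be checked directly. The Python sketch below relies on a classical fact that is not stated in this section, namely that $x^2 - Dy^2 = -1$ is solvable in integers exactly when the continued fraction expansion of $\sqrt{D}$ has odd period length. The function names are hypothetical, and the search for two coprime squares is plain brute force.

```python
from math import gcd, isqrt


def cf_period_length(D):
    """Period length of the simple continued fraction of sqrt(D), D nonsquare."""
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    m, d, a, length = 0, 1, a0, 0
    while a != 2 * a0:            # the period ends with the partial quotient 2*a0
        m = d * a - m
        d = (D - m * m) // d
        a = (a0 + m) // d
        length += 1
    return length


def negative_pell_solvable(D):
    """True when x^2 - D*y^2 = -1 has an integer solution (odd period)."""
    return cf_period_length(D) % 2 == 1


def coprime_two_squares(D):
    """Return (x, y) with x^2 + y^2 = D and gcd(x, y) = 1, or None."""
    for x in range(1, isqrt(D) + 1):
        y2 = D - x * x
        y = isqrt(y2)
        if y > 0 and y * y == y2 and gcd(x, y) == 1:
            return (x, y)
    return None


for D in (34, 146, 178, 194, 205, 221, 305, 377, 386, 410):
    print(D, coprime_two_squares(D), negative_pell_solvable(D))
```

Each of the ten values is reported as a sum of two coprime squares (for example $34 = 3^2 + 5^2$) with `negative_pell_solvable` returning `False`, whereas for $D = 13$ the period is odd and indeed $18^2 - 13 \cdot 5^2 = -1$.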

Theorem 3.9 Ambiguity and Sums of Squares

Let $\Delta_F$ be the discriminant of a quadratic field $F$. Then the following are equivalent.

(a) There is an element of order 2 in $C_{O_F}$ that is not the image of an ambiguous class under the natural mapping $\rho : C^+_{O_F} \mapsto C_{O_F}$.

(b) $D_F$ is a sum of two (relatively prime) squares and there is no unit $u \in O_F$ such that $N_F(u) = -1$.

Proof. If (a) holds, then by Lemma 3.4 on page 120, there is an $O_F$-ideal $I$ such that $I \not\approx I'$ and $\rho(I)$ is an element of order 2 in $C_{O_F}$. Therefore,

$$C_{O_F} \not\cong C^+_{O_F},$$

so by Theorem 3.6 on page 113, $\Delta_F > 0$ and there is no unit $u \in O_F$ such that $N_F(u) = -1$. Thus, we need only show that there is no prime $p \mid \Delta_F$ with $p \equiv 3 \pmod 4$ since, once established, $D_F$ is a sum of two squares (see Example 1.15 on page 28, for instance). If such a prime $p$ exists, then there exists

$$\sigma = (x + y\sqrt{\Delta_F})/(2z) \in \mathbb{Q}(\sqrt{\Delta_F})$$

such that

$$I = (\sigma)I' \text{ where } N_F(\sigma) = -1,$$

since $I \sim I'$, but $I \not\approx I'$. We may assume without loss of generality that $\gcd(z, p) = 1$ given that $D_F$ is squarefree. Since

$$\frac{x^2 - y^2\Delta_F}{4z^2} = -1, \text{ then } x^2 \equiv -4z^2 \pmod p, \text{ so } (x \cdot (2z)^{-1})^2 \equiv -1 \pmod p.$$

However, this is a contradiction since we know $-1$ is not a quadratic residue of such primes. We have shown that (a) implies (b). Now assume that (b) holds. Thus,

$$D_F = 4a^2 + b^2, \text{ for some } a, b \in \mathbb{N}. \tag{3.21}$$

By (3.21), and Exercise 2.4 on page 66,

$$I = (a, (-b + \sqrt{\Delta_F})/2)$$

is an $O_F$-ideal. Also, from the multiplication formulas on page 59, it follows that

$$II' = (a) \tag{3.22}$$

and $I^2 = ((b + \sqrt{\Delta_F})/2)$. Assume that $I \approx I'$, so by Lemma 3.4, $I \sim J = J'$. Therefore, $I'J \sim (1)$ so there exists $\alpha \in O_F$ such that $(\alpha) = I'J$. Hence,

$$(\alpha)I = II'J = (a)J \tag{3.23}$$

where the last equality comes from (3.22). Taking conjugates in (3.23), we get

$$(\alpha')I' = (a)J' = (a)J = (\alpha)I.$$

Thus, $(\alpha')I'I = (\alpha)I^2$, which implies that $(\alpha')(a) = (\alpha)((b + \sqrt{\Delta_F})/2)$. Hence,

$$\frac{\alpha'}{\alpha} = u\left(\frac{b + \sqrt{\Delta_F}}{2a}\right), \tag{3.24}$$

for some unit $u \in O_F$. However, by the hypothesis in (b), $N_F(u) = 1$, so (3.24) implies

$$1 = N_F\left(\frac{\alpha'}{\alpha}\right) = N_F\left(\frac{b + \sqrt{\Delta_F}}{2a}\right) = \frac{b^2 - \Delta_F}{4a^2} = -1,$$

a contradiction that proves $I \not\approx I'$, so $I$ is not the image under $\rho$ of an ambiguous class. Hence, we have (a) holds, so the result is secured. $\Box$

We conclude the section and thus the chapter with a look at applications of ambiguity to another concept (see Biography 3.3 on page 126).

Definition 3.12 Markov Triples

A Markov triple is a triple of positive integers $(a, b, c)$ satisfying the Markov equation $a^2 + b^2 + c^2 = 3abc$, and $a, b, c$ are called Markov numbers.

Conjecture 3.1 The Markov Conjecture

If $(a_1, b_1, c)$ and $(a_2, b_2, c)$ are Markov triples with $a_j \le b_j \le c$ for $j = 1, 2$, then $a_1 = a_2$ and $b_1 = b_2$, in which case $c$ is said to be unique. In other words, the maximal element of a Markov triple uniquely determines the triple.
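To get a feel for the conjecture, the following Python sketch enumerates Markov triples up to a bound and checks that no maximal element repeats in that range. It uses the standard observation that, for fixed $a$ and $b$, the Markov equation is quadratic in $c$ with root sum $3ab$, so $(a, b, 3ab - c)$ is again a Markov triple; starting from $(1, 1, 1)$ and applying this move in each coordinate reaches the triples listed. The function names are hypothetical, and a finite check of this kind is of course no evidence beyond the bound chosen.

```python
def is_markov_triple(a, b, c):
    """True when (a, b, c) satisfies a^2 + b^2 + c^2 = 3abc."""
    return a * a + b * b + c * c == 3 * a * b * c


def markov_triples(bound):
    """Sorted Markov triples (a <= b <= c) with c <= bound, found from (1, 1, 1)."""
    root = (1, 1, 1)
    seen = {root}
    stack = [root]
    while stack:
        a, b, c = stack.pop()
        # Replace one coordinate by the other root of the quadratic it satisfies.
        neighbours = ((a, b, 3 * a * b - c),
                      (a, 3 * a * c - b, c),
                      (3 * b * c - a, b, c))
        for t in neighbours:
            t = tuple(sorted(t))
            if t[2] <= bound and t not in seen:
                seen.add(t)
                stack.append(t)
    return sorted(seen, key=lambda t: (t[2], t))


triples = markov_triples(200)
maxima = [t[2] for t in triples]
print(triples)
print(len(maxima) == len(set(maxima)))  # no maximal element occurs twice
```

With the bound 200 this yields nine triples, from $(1, 1, 1)$ up to $(5, 13, 194)$, with pairwise distinct maximal elements.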

Markov came across this topic in 1879 when he was looking for the minimum positive value represented by real indefinite binary quadratic forms. He looked at the equation $x^2 + y^2 + z^2 = 3xyz$ and sought integral solutions $x, y, z$. We will adapt this quest to suit our setup drawing upon some recent results in the literature.

The following is essentially the approach taken from [4]. Note that below the discriminants may not be field discriminants. Let $c$ be a Markov number and consider the Diophantine equation

$$x^2 + y^2 - 3cxy = -c^2. \tag{3.25}$$

Let

$$\omega = \frac{-3c + \sqrt{9c^2 - 4}}{2},$$

and set $F = \mathbb{Q}(\omega)$. Then there is a one-to-one correspondence between the solutions of (3.25) and elements $\beta = x + y\omega \in \mathbb{Z}[\omega]$ with

$$N_F(\beta) = \left(\frac{2x - 3cy + y\sqrt{9c^2 - 4}}{2}\right)\left(\frac{2x - 3cy - y\sqrt{9c^2 - 4}}{2}\right) = -c^2. \tag{3.26}$$

If we look at the group of automorphisms, acting on (3.25) that fix $c$, given by

$$\gamma : (x, y, c) \mapsto (y, x, c),$$

$$\rho : (x, y, c) \mapsto (-x, -y, c),$$

and

$$\lambda : (x, y, c) \mapsto (y, 3yc - x, c),$$

then it follows from what Markov proved in 1879, that $\gamma, \rho, \lambda$ essentially generate all solutions to (3.25). Now we show how this relates to $R = \mathbb{Z}[\omega]$ and put it into our context. First, $\rho$ corresponds to multiplication by $-1$ in $R$, $\lambda$ corresponds to multiplication by $\omega$ in $R$, and $\gamma$ is a permutation that corresponds to taking a conjugate followed by multiplication by $\omega$. Furthermore, $\omega$ is the smallest positive unit in $R$ when $m \neq 1$. Therefore, there is a one-to-one correspondence between the solutions of (3.25) and pairs of principal ideals $I = (\beta)$ and $I' = (\beta')$ generated by elements $\beta, \beta' \in R$ satisfying (3.26). We have just proved the following.

Theorem 3.10 An integer $c$ is the maximal element of exactly one Markov triple if and only if there exists exactly one pair of primitive, principal ideals in $O_F$, $\{(\beta), (\beta')\}$ where $N_F(\beta) = -c^2$.
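The link between Markov triples with maximal element $c$ and solutions of (3.25) can be explored by brute force. In the Python sketch below, `markov_pairs` lists the solutions with $1 \le x \le y \le c$, and `norm_form` evaluates $x^2 - 3cxy + y^2$, which is the norm of $x + y\omega$ because $\omega$ has trace $-3c$ and norm 1. Both names are hypothetical helpers for this example, and the quadratic-time search is only meant for small $c$.

```python
def norm_form(x, y, c):
    """Norm of x + y*omega, where omega = (-3c + sqrt(9c^2 - 4))/2."""
    return x * x - 3 * c * x * y + y * y


def markov_pairs(c):
    """Solutions (x, y) of x^2 + y^2 - 3cxy = -c^2 with 1 <= x <= y <= c."""
    return [(x, y)
            for x in range(1, c + 1)
            for y in range(x, c + 1)
            if norm_form(x, y, c) == -c * c]


print(markov_pairs(5))    # pairs completing c = 5 to a Markov triple
print(markov_pairs(194))  # pairs completing c = 194 to a Markov triple
```

For $c = 5$ the only pair is $(1, 2)$ and for $c = 194$ the only pair is $(5, 13)$, so in both cases $c$ is unique in the sense of Conjecture 3.1.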

Biography 3.2 Eduard Kummer (1810–1893) was born on January 29, 1810 in Sorau, Brandenburg, Prussia (now Germany). He entered the University of Halle in 1828. By 1833, he was appointed to a teaching post at the Gymnasium in Liegnitz which he held for 10 years. In 1836, he published an important paper in Crelle’s Journal on hypergeometric series, which led to his correspondence with Jacobi and Dirichlet, who were impressed with his talent. Indeed, upon Dirichlet’s recommendation, Kummer was elected to the Berlin academy in 1839, and was Secretary of the Mathematics Section of the Academy from 1863 to 1878. In 1842, with the support of Dirichlet and Jacobi, Kummer was appointed a full professor at the University of Breslau, now Wrocław, in Poland. In 1843, Kummer was aware that his attempts to prove Fermat’s Last Theorem were flawed due to the lack of unique factorization in general (see the discussions of the topic in Chapter 1). He introduced his “ideal numbers”, which were the basis for the concept of an ideal, thus allowing the development of ring theory, and a substantial amount of abstract algebra later. In 1855, Dirichlet left Berlin to succeed Gauss at Göttingen, and recommended to Berlin that they offer the vacant chair to Kummer, which they did. In 1857, the Paris Academy of Sciences awarded Kummer the Grand Prize for his work. In 1863, the Royal Society of London elected him as a Fellow. He died in Berlin on May 14, 1893.

Example 3.2 If $\Delta_F = 221$, we have $(\beta) = (14 + \sqrt{221})$ and $(\beta') = (14 - \sqrt{221})$ with $c = 5$ and $N_F(\beta) = -25$. Here $(a, b, c) = (1, 2, 5)$ is the Markov triple, $14 = (3cb - 2a)/2$, and $9c^2 - 4 = 221$.

Example 3.3 If $\Delta_F = 776 = 2^3 \cdot 97$ and $c = 194$, then $(\beta) = (3778 + 13\sqrt{84680})$ and $(\beta') = (3778 - 13\sqrt{84680})$ where $N_F(\beta) = -194^2$. In this case, the Markov triple is $(5, 13, 194)$, $3778 = (3cb - 2a)/2$, and $9(c/2)^2 - 1 = 84680$.
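The arithmetic in Examples 3.2 and 3.3 is easy to confirm mechanically. The Python sketch below checks, for a Markov triple $(a, b, c)$ and an element $u + v\sqrt{d}$, that the triple satisfies the Markov equation, that $u^2 - dv^2 = -c^2$, and that $u = (3cb - 2a)/2$. The function name `check_markov_example` and its argument order are assumptions made for this illustration.

```python
def check_markov_example(a, b, c, u, v, d):
    """Verify the numerical claims attached to the element u + v*sqrt(d)."""
    markov_ok = a * a + b * b + c * c == 3 * a * b * c
    norm_ok = u * u - d * v * v == -c * c    # norm of u + v*sqrt(d)
    generator_ok = 2 * u == 3 * c * b - 2 * a
    return markov_ok and norm_ok and generator_ok


# Example 3.2: beta = 14 + sqrt(221), triple (1, 2, 5), and 9c^2 - 4 = 221.
print(check_markov_example(1, 2, 5, 14, 1, 221), 9 * 5 ** 2 - 4 == 221)

# Example 3.3: beta = 3778 + 13*sqrt(84680), triple (5, 13, 194),
# and 9(c/2)^2 - 1 = 84680.
print(check_markov_example(5, 13, 194, 3778, 13, 84680),
      9 * (194 // 2) ** 2 - 1 == 84680)
```

Both lines print `True True`.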


Also, to bring in ideals we have the following result, which is taken from [92].

Theorem 3.11 Let $\Delta_F = 9c^2 - 4$ and suppose that $c$ is a Markov number that is not unique. Then there exist relatively prime integers $p$ and $q$ such that $1 < p < q < c$ with $c = pq$ such that the following conditions hold.

(a) There exist $O_F$-ideals $I$ and $J$ of norm $p^2$ and $q^2$ respectively, such that

$$J \approx J' \approx I\sqrt{\Delta_F}.$$

(b) There exists a form $f$ with $f \sim f^{-1}$ such that $f$ represents both $-p^2$ and $q^2$.

Proof. If $c$ is not unique, then by Theorem 3.10 on the preceding page, there exist two distinct pairs of principal ideals

$$(\beta_1) = \left(c^2, \frac{-b_1 + \sqrt{\Delta_F}}{2}\right), \quad\text{and}\quad (\beta_2) = \left(c^2, \frac{-b_2 + \sqrt{\Delta_F}}{2}\right).$$

Since $4c^2 \mid (\Delta_F - b_j^2)$ for $j = 1, 2$, then $b_1^2 \equiv b_2^2 \pmod{2c^2}$. Also, since the ideals are distinct, then $b_1 \not\equiv \pm b_2 \pmod{2c^2}$. Thus, there exist relatively prime integers $p, q$ with $1 < p < q < c$ with $c = pq$ such that

$$b_1 + b_2 \equiv 0 \pmod{2p^2}, \quad\text{and}\quad b_1 - b_2 \equiv 0 \pmod{2q^2}. \tag{3.27}$$

Set

$$I = \left(p^2, \frac{-b_1 + \sqrt{\Delta_F}}{2}\right), \quad\text{and}\quad J = \left(q^2, \frac{-b_2 + \sqrt{\Delta_F}}{2}\right),$$

which are $O_F$-ideals. Since $p$ and $q$ are relatively prime, then

$$O_F \approx (\beta_1)\sqrt{\Delta_F} \approx IJ\sqrt{\Delta_F} \approx (\beta_2)\sqrt{\Delta_F} \approx I'J\sqrt{\Delta_F},$$

and multiplying through by $I$ we get

$$J \approx J' \approx I\sqrt{\Delta_F},$$

which yields part (a).

For part (b), we associate the $O_F$-ideal $I(\sqrt{\Delta_F})$ with the form $(-p^2, b_1, c_1) = f$ and the $O_F$-ideal $J$ with the form $(q^2, b_2, c_2) = g$, so by the equivalences in part (a), $f \sim f^{-1} \sim g$, which gives us part (b). $\Box$

These applications conclude the chapter with forays into other dominions and show the beautiful architecture underlying this mathematics (see the quote by Dyson on page 155).


Biography 3.3 Andrei Andreyevich Markov (1856–1922) was born in Ryazan, Russia on June 14, 1856. He showed mathematical ability at an early age when he wrote a paper on integration of linear differential equations before he entered university. In 1884, he was awarded his doctorate from St. Petersburg University with a thesis on applications of continued fractions. Markov was a professor at St. Petersburg University from 1886 until his retirement in 1905. He worked in number theory, analysis, continued fractions applied to probability theory, approximation theory, and convergence of series. In particular, his work on what we now call Markov chains began the study of stochastic processes. However, it was not until 1923 when Norbert Wiener first gave a rigorous treatment of Markov processes that the true value of the theory came to light. The general theory can be said to have been established by Andrei Kolmogorov in the 1930s. Markov’s son also became a mathematician in his own right (under the same last name). Among the honours in his life was the election to the Russian Academy of Sciences in 1902. Markov died on July 20, 1922 in Petrograd (now St. Petersburg), Russia.

Biography 3.4 Charles Hermite (1822–1901) was born on December 24, 1822 in Dieuze, Lorraine, France. In 1840-41, he studied at the same institution where Galois had studied a decade and a half earlier, namely the Collège Louis-le-Grand, and had the same instructor as Galois, Louis Richard. He was also tutored by Catalan in those years. He then went to the École Polytechnique, where he was eventually awarded his degree in 1847. He was appointed there as répétiteur and admissions examiner. His most far-reaching mathematical results were accomplished in that appointment over the next decade. One of these was his proof that doubly periodic functions can be represented as quotients of periodic entire functions. He also worked on quadratic forms, including his result on a reciprocity law relating to binary quadratic forms. In 1855, he established a theory of transformations, which found an interface among number theory, theta functions, and transformations of abelian functions, the latter of which he had established. In 1858, he proved that although it was known to Ruffini and Abel that an algebraic equation of the fifth degree cannot be solved by radicals, an algebraic equation of the fifth degree could be solved using elliptic functions. In 1862, he was appointed maître de conférence at École Polytechnique, becoming examiner in 1863, and then professor in 1869. He left for the Sorbonne in 1876, where he stayed until he retired in 1897.

Among his other accomplishments for which he is well known is the proof of the transcendence of e (see Biography 3.6 on page 128 and Theorem 4.6 on page 172). He also is known for a variety of topics that bear his name among which are: Hermite differential equations, Hermite matrices, Hermite polynomials, and his formula for interpolation. He died in Paris, France on January 14, 1901.


Biography 3.5 David Hilbert (1862–1943) was born in Königsberg, Prussia, which is now Kaliningrad, Russia. He studied at the University of Königsberg where he received his doctorate under the supervision of Lindemann. He was employed at Königsberg from 1886 to 1895. In 1895, he was appointed to fill the chair of mathematics at the University of Göttingen, where he remained for the rest of his life. Hilbert was very eminent in the mathematical world after 1900 and it may be argued that his work was a major influence throughout the twentieth century. In 1900, at the Paris meeting of the Second International Congress of Mathematicians, he delivered his now-famous lecture The Problems of Mathematics, which outlined twenty-three problems that continue to challenge mathematicians today. Among these were Goldbach’s conjecture and the Riemann hypothesis. Some of the Hilbert problems have been resolved and some have not such as the latter two. Hilbert made contributions to many branches of mathematics including algebraic number theory, the calculus of variations, functional analysis, integral equations, invariant theory, and mathematical physics. He also had Hermann Weyl as one of his students (see Biography 1.1 on page 31). Hilbert retired in 1930 at which time the city of Königsberg made him an honourary citizen. He died on February 14, 1943 in Göttingen.

Exercises

In Exercises 3.19–3.21, $\Delta_F$ denotes the discriminant of a quadratic field $F$ with ring of integers $O_F$. Also,

$$I = (a, (-b + \sqrt{\Delta_F})/2)$$

is an $O_F$-ideal, with

$$I' = (a, (b + \sqrt{\Delta_F})/2)$$

its conjugate ideal.

3.19. Prove that $I' = I^{-1}$ in $C_{O_F}$.
(Hint: Use the multiplication formulas on page 59.)

3.20. Prove that $I$ has order at most 2 in $C_{O_F}$ if and only if $I \sim I'$.
(Hint: Use Exercise 3.19.)

3.21. Let $u$ be a unit in $O_F$ such that $N_F(u) = 1$. Prove that there exists an $\alpha \in O_F$ such that $\alpha = u\alpha'$, where $\alpha'$ is the algebraic conjugate of $\alpha$.
(This is the quadratic analogue of Hilbert’s Theorem 90, but is actually due to Kummer; see Biographies 3.2 on page 124 and 3.5.)

3.22. Prove that if $F = \mathbb{Q}(\sqrt{221})$, then $O_F$ has no unit $u$ with $N_F(u) = -1$.


Biography 3.6 Carl Louis Ferdinand von Lindemann (1852–1939) was born in Hannover, Hanover, which is now Germany. He studied at the University of Göttingen which he entered in 1870. However, as was a practice at the time, he moved from one university to another studying at Munich and at Erlangen. He was awarded his doctorate in 1873 under the direction of Klein at Erlangen. In 1877, he was awarded his habilitation by the University of Würzburg. Also, in 1877, he was appointed as extraordinary professor to the University of Freiburg, and promoted there to ordinary professor in 1879. In 1883, he was appointed professor at the University of Königsberg. In 1893, he accepted a chair at the University of Munich where he remained for the rest of his life.

Lindemann is probably best known for his proof that $\pi$ is transcendental (see Theorem 4.7 on page 175). He proved this in 1882, using methods of Hermite who had shown, in 1873, that e is transcendental (see Biography 3.4 on page 126). Lindemann was also interested in physics as well as in the history of mathematics, including the translation and expansion of some of Poincaré’s work. Among the honours bestowed upon him were being elected to the Bavarian Academy of Sciences and being given an honourary degree from the University of St. Andrews. As noted above, Hilbert was one of his students, as was Oskar Perron. He died in Munich on March 6, 1939.


3.4 Genus

My mind rebels at stagnation. Give me a problem, give me work, give me the most abstrusive cryptogram, or the most intricate analysis, and I am in my own proper atmosphere.

spoken by Sherlock Holmes in The Sign of Four (1890)
Sir Arthur Conan Doyle (see page 55)

In §3.1, we looked at representation of integers by binary quadratic forms. Thus, if the discriminant of forms $f$ and $g$ is $D$ and $f$ and $g$ are not in the same equivalence class, then there is the problem of distinguishing those numbers represented by $f$ from those represented by $g$. This is, in particular, of value when the forms are positive definite in view of Theorem 3.2 on page 102, when we know $h_D = h_{-4n} > 1$. The notion in the title of this section was created by Gauss to express this type of distinction. In order to be able to precisely define it, we need the following result. As usual, we use the term form herein to mean binary quadratic form.

Lemma 3.5 Jacobi Symbols and Representation

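Let $F$ be a quadratic field with discriminant $\Delta_F$ with

$$|\Delta_F| = p_1p_2 \cdots p_r \text{ if } \Delta_F \equiv 1 \pmod 4$$

and

$$|\Delta_F| = 2^{\alpha}p_2p_3 \cdots p_r \text{ if } \Delta_F \equiv 0 \pmod 4,$$

where $\alpha \in \{2, 3\}$ and $p_j$, for $j = 1, 2, \ldots, r \in \mathbb{N}$, are distinct odd primes. If $n_1, n_2 \in \mathbb{Z}$ are properly represented by a form of discriminant $\Delta_F$ with $\gcd(2\Delta_F, n_1) = \gcd(2\Delta_F, n_2) = 1$, then

$$(\Delta_F/|n_1|) = (\Delta_F/|n_2|) = 1,$$

where $(\ast/\ast)$ denotes the Jacobi symbol. Also,

$$(n_1/p_j) = (n_2/p_j) \text{ for } j = 2, \ldots, r,$$

and

$$(\varepsilon_1/\alpha_1) = (\varepsilon_2/\alpha_2)$$

where $(\varepsilon_j/\alpha_j)$ are defined by the following with $\operatorname{sign}(n_j) = 1$ if $n_j > 0$ and $\operatorname{sign}(n_j) = -1$ if $n_j < 0$ for $j = 1, 2$:

$$\left(\frac{\varepsilon_j}{\alpha_j}\right) = \begin{cases} \left(\dfrac{n_j}{p_1}\right) & \text{if } \Delta_F \equiv 1 \pmod 4,\\[2ex] \left(\dfrac{-1}{|n_j|}\right) \cdot \operatorname{sign}(n_j) & \text{if } \Delta_F \equiv 12 \pmod{16},\\[2ex] \left(\dfrac{2}{|n_j|}\right) & \text{if } \Delta_F \equiv 8 \pmod{32},\\[2ex] \left(\dfrac{-2}{|n_j|}\right) \cdot \operatorname{sign}(n_j) & \text{if } \Delta_F \equiv 24 \pmod{32}. \end{cases}$$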

Proof. Suppose that the integers n1 and n2 are properly represented by theform f = (a, b, c). Since there are relatively prime integers xj , yj for j = 1, 2such that

f(xj , yj) = nj = ax2j + bxjyj + cy2

j ,

where gcd(2#F , n1) = gcd(2#F , n2) = 1, then

4af(xj , yj) = (2axj + byj)2 !#F y2j .

Therefore, for each odd pi

## #F ,

4anj + (2axj + byj)2 (mod pi).

Hence, $anj

pi

&= 1,

from which it follows that$

n1

pi

&=

$n2

pi

&=

$a

pi

&.

It remains to deal with the case when # + 0(mod 4) and show that (31/%1) =(32/%2) and to show that (#F /|nj |) = 1 for j = 1, 2. The latter follows fromExercise 3.9 on page 104. The balance of the result will now follow from theproduct formula that we establish as follows.

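Claim 3.4 For $j = 1, 2$,

\[ \left(\frac{\nu_j}{\alpha_j}\right) \prod_{i=2}^{r} \left(\frac{n_j}{p_i}\right) = \left(\frac{\Delta_F}{|n_j|}\right). \]

First, we know from the quadratic reciprocity law (see [68, Theorem 4.11, p. 196]) that

\[ \prod_{i=2}^{r} \left(\frac{n_j}{p_i}\right) = \left(\frac{(-1)^{((n_j-1)/2)((\Delta_F/2^{\alpha}-1)/2)}\,\Delta_F/2^{\alpha}}{|n_j|}\right) \quad \text{where } \alpha \in \{2, 3\}. \qquad (3.28) \]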

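If $\Delta_F \equiv 12 \pmod{16}$, then $\alpha = 2$ and $\Delta_F/4 \equiv 3 \pmod 4$, so from (3.28),

\[ \left(\frac{\nu_j}{\alpha_j}\right) \prod_{i=2}^{r} \left(\frac{n_j}{p_i}\right) = \left(\frac{\Delta_F}{|n_j|}\right) \left(\frac{(-1)^{(n_j+1)/2}}{|n_j|}\right) \cdot \operatorname{sign}(n_j) = \left(\frac{\Delta_F}{|n_j|}\right). \]

If $\Delta_F \equiv 8 \pmod{32}$, then we get $\alpha = 3$, $\Delta_F/8 \equiv 1 \pmod 4$, and so from (3.28),

\[ \left(\frac{\nu_j}{\alpha_j}\right) \prod_{i=2}^{r} \left(\frac{n_j}{p_i}\right) = \left(\frac{\Delta_F}{|n_j|}\right). \]

Lastly, if $\Delta_F \equiv 24 \pmod{32}$, then $\alpha = 3$ and $\Delta_F/8 \equiv 3 \pmod 4$, so from (3.28),

\[ \left(\frac{\nu_j}{\alpha_j}\right) \prod_{i=2}^{r} \left(\frac{n_j}{p_i}\right) = \left(\frac{\Delta_F}{|n_j|}\right) \left(\frac{(-1)^{(n_j+1)/2}}{|n_j|}\right) \cdot \operatorname{sign}(n_j) = \left(\frac{\Delta_F}{|n_j|}\right). \]

This is Claim 3.4 and so the entire result. $\Box$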

We are now in a position to define the salient feature that will provide the mechanism for the primary definition we are seeking. In what follows, $\operatorname{sign}(n)$ is as defined in Lemma 3.5.

Definition 3.13 Assigned Values of Generic Characters

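Let $F$ be a quadratic field with discriminant $\Delta_F$,

\[ |\Delta_F| = p_1 p_2 \cdots p_r \quad \text{if } \Delta_F \equiv 1 \pmod 4, \]

and

\[ |\Delta_F| = 2^{\alpha} p_2 p_3 \cdots p_r \quad \text{if } \Delta_F \equiv 0 \pmod 4, \]

where $\alpha \in \{2, 3\}$ and $p_j$, for $j = 1, 2, \ldots, r \in \mathbb{N}$, are distinct odd primes. Suppose that $n$ is a nonzero integer with $\gcd(2\Delta_F, n) = 1$. Let $\chi_1$ be defined as follows, where $(\ast/\ast)$ denotes the Jacobi symbol:

\[
\chi_1 =
\begin{cases}
\left(\dfrac{n}{p_1}\right) & \text{if } \Delta_F \equiv 1 \pmod 4,\\[2mm]
\left(\dfrac{-1}{|n|}\right) \cdot \operatorname{sign}(n) & \text{if } \Delta_F \equiv 12 \pmod{16},\\[2mm]
\left(\dfrac{2}{|n|}\right) & \text{if } \Delta_F \equiv 8 \pmod{32},\\[2mm]
\left(\dfrac{-2}{|n|}\right) \cdot \operatorname{sign}(n) & \text{if } \Delta_F \equiv 24 \pmod{32},
\end{cases}
\]

and, for $j = 2, 3, \ldots, r$, let $\chi_j$ be the Jacobi symbol $(n/p_j)$. Then the values $\chi_j$ are called the generic characters of $n$, and their assigned values are given by the $r$-tuple

\[ (\chi_1, \chi_2, \ldots, \chi_r). \qquad (3.29) \]

If $n$ is represented by a form $f$ of discriminant $\Delta_F$, then (3.29) are the generic characters of the form $f$, denoted by $\chi_j(f)$ for $j = 1, 2, \ldots, r$.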
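The case analysis in Definition 3.13 can be sketched computationally. The following Python fragment is only an illustration under stated assumptions: the helper names `jacobi`, `odd_prime_factors`, and `generic_characters` are hypothetical, the odd primes dividing the discriminant are taken in increasing order, and the input is assumed to be a field discriminant.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0 (standard binary algorithm)."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:            # pull out factors of 2 using (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                  # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def odd_prime_factors(m):
    """Distinct odd primes dividing m, in increasing order (trial division)."""
    m = abs(m)
    while m % 2 == 0:
        m //= 2
    primes, d = [], 3
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 2
    if m > 1:
        primes.append(m)
    return primes

def generic_characters(delta, n):
    """Assigned values (chi_1, ..., chi_r) of n, following Definition 3.13."""
    assert n != 0 and gcd(2 * delta, n) == 1
    primes = odd_prime_factors(delta)
    sign = 1 if n > 0 else -1
    if delta % 4 == 1:                       # chi_j = (n/p_j) for every j
        return tuple(jacobi(n, p) for p in primes)
    if delta % 16 == 12:
        chi1 = jacobi(-1, abs(n)) * sign
    elif delta % 32 == 8:
        chi1 = jacobi(2, abs(n))
    elif delta % 32 == 24:
        chi1 = jacobi(-2, abs(n)) * sign
    else:
        raise ValueError("delta is not a field discriminant")
    return (chi1,) + tuple(jacobi(n, p) for p in primes)

print(generic_characters(-35, 3))   # 3 is represented by (3, 1, 3)
print(generic_characters(-20, 9))   # 9 is represented by (1, 0, 5)
```

In each of these sample cases the product of the assigned values is $+1$, in line with Claim 3.4.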

Remark 3.9 Note that the assigned characters in Definition 3.13 satisfy the multiplicative property

\[ \chi_j(fg) = \chi_j(f)\chi_j(g) \]

by the properties of Jacobi symbols. Also, we may view the multiplicative characters as functions mapping from $\mathbb{Z}$ to $\{\pm 1\}$, so $(\chi_1, \ldots, \chi_r)$ may be considered as a vector-valued function from $r$-tuples of integers to $r$-tuples with entries $\pm 1$. With this in mind, Lemma 3.5 tells us that the vector of assigned values remains invariant over all integers represented by a form from a class in $C_{\Delta_F}$. Hence, the following holds.


Corollary 3.6 All integers $n$ relatively prime to $2\Delta_F$ that are representable by forms in a given equivalence class of $C_{\Delta_F}$ have the same assigned values of generic characters, and $(\Delta_F/|n|) = 1$.

By Corollary 3.6, the characters of $f$ are the same for all integers represented by $f$, so the notion of the characters of $f$ in Definition 3.13 is indeed a well-defined concept. Now we have the tools to define the main topic.

Definition 3.14 Genus

A collection of classes of forms in $C_{\Delta_F}$ having the same assigned characters is called a genus of forms. The genus of forms having all assigned characters $+1$ contains the principal form and is called the principal genus.

The following is an important consequence of Corollary 3.6.

Corollary 3.7 The product of the assigned values for the characters for any given genus is $+1$.

Proof. This is immediate from Claim 3.4 on page 130 and Exercise 3.9 on page 104. $\Box$

Remark 3.10 It follows that equivalent forms necessarily represent the same integers, so they are in the same genus. Also, we will see later that each genus consists of a finite number of classes of forms, the same for each genus, and there are only finitely many genera (see Theorem 3.14 on page 142). Also, by Exercise 3.26 on page 145, each genus is a coset of the principal genus. It is also known that the principal genus is actually the subgroup of squares $C_{\Delta_F}^2$ of $C_{\Delta_F}$ (see Remark 3.13 on page 143).

The following is a general aspect of genus theory applied to the principal genus. Note that we will be using Dirichlet's Theorem on primes in arithmetic progression (see Chapter 7, where we provide a proof). This result guarantees that every class in $(\mathbb{Z}/|D|\mathbb{Z})^{*}$ includes an odd prime. Moreover, in the proof of the following, we will be using properties of the Jacobi symbol (see [68, pp. 192–200]).

Theorem 3.12 Principal Forms and Genus

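Let $\Delta_F$ be the discriminant of a quadratic field $F$ and let $f$ be a primitive form of discriminant $\Delta_F$. Set

\[ U_{\Delta_F} = \left\{ \overline{m} \in \left(\frac{\mathbb{Z}}{|\Delta_F|\mathbb{Z}}\right)^{*} : \text{there is an odd prime } p \in \overline{m} \text{ and } (\Delta_F/p) = 1 \right\}. \]

Then each of the following hold.

(a) If $m \in \mathbb{Z}$ with $\gcd(\Delta_F, m) = 1$ is represented (not necessarily properly) by a form of discriminant $\Delta_F$, then $\overline{m} \in U_{\Delta_F}$.

(b) The elements $\overline{m} \in (\mathbb{Z}/|\Delta_F|\mathbb{Z})^{*}$ such that $m$ is represented by the principal genus of discriminant $\Delta_F$ form a subgroup $H_{\Delta_F}$ of $U_{\Delta_F}$.

(c) The cosets of $H_{\Delta_F}$ in $U_{\Delta_F}$ are precisely the elements of

\[ L_f = \left\{ \overline{\ell} \in \left(\frac{\mathbb{Z}}{|\Delta_F|\mathbb{Z}}\right)^{*} : f(x, y) \equiv \ell \pmod{|\Delta_F|} \text{ for some } x, y \in \mathbb{Z} \right\}, \]

where $f$ ranges over the primitive forms of discriminant $\Delta_F$ which represent distinct values.

(d) Forms $f, g$ of discriminant $\Delta_F$ are in the same genus if and only if $L_f = L_g$ (see Footnote 3.1).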

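Proof. First of all, we show that $U_{\Delta_F}$ is a group. If $p_1, p_2$ are odd primes with $\overline{p_1}, \overline{p_2} \in U_{\Delta_F}$, then it suffices to show that $\overline{p_1} \cdot \overline{p_2}^{-1} \in U_{\Delta_F}$. Let $\overline{p_3} = \overline{p_2}^{-1}$ where $p_3$ is an odd prime. Then by the quadratic reciprocity law for the Jacobi symbol in the case where $\Delta_F \equiv 1 \pmod 4$ (see, for instance, [68, Exercise 4.25, p. 200]),

\[ \left(\frac{\Delta_F}{p_1 p_3}\right) = \left(\frac{p_1 p_3}{|\Delta_F|}\right) = \left(\frac{\overline{p_1 p_3}}{|\Delta_F|}\right) = \left(\frac{1}{|\Delta_F|}\right) = 1, \]

since

\[ 1 \equiv p_1 \cdot p_2^{-1} \equiv p_1 \cdot p_3 \pmod{|\Delta_F|}, \]

so

\[ \overline{p_1} \cdot \overline{p_3} = \overline{p_1} \cdot \overline{p_2}^{-1} \in U_{\Delta_F}. \]

Note that $\overline{1} = \overline{p_1} \cdot \overline{p_2}^{-1} = \overline{p_4}$ for some odd prime $p_4$. This comment holds for the remaining cases.

If $\Delta_F \equiv 0 \pmod 8$, then

\[ \left(\frac{\Delta_F}{p_1 p_3}\right) = \left(\frac{\Delta_F/4}{p_1 p_3}\right) = \left(\frac{2}{p_1 p_3}\right) \left(\frac{\Delta_F/8}{p_1 p_3}\right) \]

\[ = (-1)^{\frac{(p_1 p_3)^2 - 1}{8}} \cdot (-1)^{\frac{p_1 p_3 - 1}{2} \cdot \frac{\Delta_F/8 - 1}{2}} \left(\frac{p_1 p_3}{\Delta_F/8}\right) = \left(\frac{p_1 p_3}{\Delta_F/8}\right) = \left(\frac{1}{\Delta_F/8}\right) = 1, \]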

since $p_1 p_3 \equiv 1 \pmod{\Delta_F}$, so $(p_1 p_3 - 1)/2$ is even, as is $((p_1 p_3)^2 - 1)/8$.

If $\Delta_F \equiv 4 \pmod 8$, then

\[ \left(\frac{\Delta_F}{p_1 p_3}\right) = \left(\frac{\Delta_F/4}{p_1 p_3}\right) = (-1)^{\frac{p_1 p_3 - 1}{2} \cdot \frac{\Delta_F/4 - 1}{2}} \left(\frac{p_1 p_3}{\Delta_F/4}\right) = \left(\frac{p_1 p_3}{\Delta_F/4}\right) = 1, \]

since $p_1 p_3 \equiv 1 \pmod{\Delta_F}$, forcing $(p_1 p_3 - 1)/2$ to be even. We have shown that $U_{\Delta_F}$ is indeed a group.

Footnote 3.1. An important fact must be highlighted here. Part (d) says that two forms, $f, g$, are in the same genus if and only if they represent the same values modulo $|\Delta_F|$. Therefore, although it is possible that $f$ and $g$ are in the same genus and yet there exists an integer $n$ such that $g(x, y) = n$ but $f$ does not represent $n$ (meaning that there are no integers $X, Y$ such that $f(X, Y) = n$), it must hold that there exist integers $u, v$ such that $\overline{f(u, v)} = \overline{n} \in (\mathbb{Z}/|\Delta_F|\mathbb{Z})^{*}$. This means that $f$ and $g$ are in the same genus if and only if they represent the same values in $U_{\Delta_F}$, namely if and only if $L_f = L_g$. See Example 3.6 on page 136 for an explicit depiction of these facts.

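Now if $m \in \mathbb{Z}$ with $\gcd(\Delta_F, m) = 1$ is represented by a form of discriminant $\Delta_F$, then by Exercise 3.6 on page 103, we may let $m = m_1^2 m_2$ where $m_2$ is properly represented by a form of discriminant $\Delta_F$. Suppose that $p > 2$ is prime with $\overline{p} = \overline{m_2} \in (\mathbb{Z}/|\Delta_F|\mathbb{Z})^{*}$. By Exercise 3.9 on page 104, $(\Delta_F/p) = 1$, so $\overline{p} = \overline{m_2} \in U_{\Delta_F}$. Also, since

\[ \left( \left(\frac{\mathbb{Z}}{|\Delta_F|\mathbb{Z}}\right)^{*} \right)^{2} \subseteq U_{\Delta_F}, \]

and $U_{\Delta_F}$ has been shown to be a group, then $\overline{m_1}^2 \in U_{\Delta_F}$ and $\overline{m_2} \in U_{\Delta_F}$, so

\[ \overline{m} = \overline{m_1^2 m_2} = \overline{m_1}^2 \cdot \overline{m_2} \in U_{\Delta_F}. \]

Hence, $\overline{m} \in U_{\Delta_F}$ and we have completed the proof of part (a).

For part (b), we have that $H_{\Delta_F} \subseteq U_{\Delta_F}$ by part (a). Also, products of classes in $U_{\Delta_F}$ all of whose assigned characters are $+1$ must also have all assigned characters $+1$. It follows that if $\overline{m}, \overline{n} \in H_{\Delta_F}$, then $\overline{m} \cdot \overline{n}^{-1} \in H_{\Delta_F}$, so $H_{\Delta_F}$ is a subgroup of $U_{\Delta_F}$. This is part (b).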

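For part (c), let $\overline{\ell} \in L_f$. Since $L_f \subseteq U_{\Delta_F}$ by part (a), there is an odd prime $p$ such that $\overline{p} = \overline{\ell}$ and $f$ properly represents $p$. By Exercise 3.2, there exist $x, y, b, c \in \mathbb{Z}$ such that

\[ f(x, y) = p x^2 + b x y + c y^2. \]

Therefore, by setting $\sigma = 2$ if $\Delta_F \equiv 1 \pmod 4$ and $\sigma = 1$ if $\Delta_F \equiv 0 \pmod 4$, we have

\[ \sigma^2 p f(x, y) = (\sigma p x + \sigma b y/2)^2 - y^2 \Delta_F \sigma^2/4. \qquad (3.30) \]

Hence, $\overline{\sigma^2 p f(x, y)} \in H_{\Delta_F}$, namely $\overline{f(x, y)} \in (\overline{\sigma^2 p})^{-1} H_{\Delta_F}$. We have shown that

\[ L_f \subseteq (\overline{\sigma^2 p})^{-1} H_{\Delta_F}. \]

Conversely, if $\overline{m} \in (\overline{\sigma^2 p})^{-1} H_{\Delta_F}$, then by the discussion in Footnote 3.1 on the preceding page, there are $X, Y \in \mathbb{Z}$ such that

\[ \sigma^2 p m \equiv X^2 + (\sigma - 1) X Y + \frac{(\sigma - 1 - \Delta_F)}{4} Y^2 \pmod{|\Delta_F|}. \qquad (3.31) \]

Hence, from (3.30)–(3.31), we can find $u, v \in \mathbb{Z}$ such that

\[ f(u, v) \equiv m \pmod{|\Delta_F|}. \]

In other words, $\overline{m} \in L_f$. This shows that $(\overline{\sigma^2 p})^{-1} H_{\Delta_F} \subseteq L_f$, so

\[ L_f = (\overline{\sigma^2 p})^{-1} H_{\Delta_F}, \]

securing part (c).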


For part (d), if $f$ and $g$ are in the same genus, then $f$ and $g$ have the same assigned characters $\chi_j(f) = \chi_j(g)$ for each $j$ as in Definition 3.13 on page 131. Therefore, by Remark 3.9 on page 131,

\[ \chi_j(f \circ g^{-1}) = \chi_j(f)\chi_j(g^{-1}) = \chi_j(g)\chi_j(g^{-1}) = \chi_j(g \circ g^{-1}) = \chi_j(1_{\Delta_F}) = +1 \]

for all such $j$, where $1_{\Delta_F}$ is the principal form in $C_{\Delta_F}$ as given in Definition 3.3 on page 100. Thus, $f \circ g^{-1}$ is in the principal genus, so $L_{f \circ g^{-1}} = H_{\Delta_F}$ by part (b). It follows from part (c) that $L_f = L_g$. Conversely, if $L_f = L_g$, then $L_{f \circ g^{-1}} = H_{\Delta_F}$ by part (c), so $f \circ g^{-1}$ is in the principal genus by part (b). This means that $\chi_j(f \circ g^{-1}) = +1$ for all $j$, so

\[ 1 = \chi_j(f \circ g^{-1}) = \chi_j(f)\chi_j(g^{-1}), \]

which means that

\[ \chi_j(g) = \chi_j(f)\chi_j(g^{-1})\chi_j(g) = \chi_j(f)\chi_j(g^{-1} \circ g) = \chi_j(f)\chi_j(1_{\Delta_F}) = \chi_j(f), \]

so $f$ and $g$ have the same assigned values, namely $f$ and $g$ are in the same genus. This completes part (d) and so the total result. $\Box$

Remark 3.11 With $L_f$ and $U_{\Delta_F}$ as defined in Theorem 3.12 on page 132,

\[ U_{\Delta_F} = \bigsqcup_{f} L_f, \]

where the disjoint union is over forms of discriminant $\Delta_F$ which represent distinct values. In other words, $L_f$ is a coset of $H_{\Delta_F}$ in $U_{\Delta_F}$. This allows the following notion.

Definition 3.15 Genus and Cosets

The genus of the coset $L_f$, as given in Theorem 3.12 on page 132, consists of all the forms of discriminant $\Delta_F$ that represent the values of $L_f$ modulo $|\Delta_F|$.

Notice, as well, the nice manner in which the coset approach yields the generic interpretation of forms given in Definition 3.13 on page 131. If $\overline{\ell}, \overline{m} \in L_f$ for a form $f$, then $\overline{\ell} \cdot \overline{m}^{-1}$ is in the principal genus, so the assigned characters for $\overline{\ell} \cdot \overline{m}^{-1}$ are all $+1$. Hence, the generic characters of $\ell$ and $m$ are the same.

Historically, it was Lagrange who first introduced the notion of looking at congruence classes in $(\mathbb{Z}/|\Delta_F|\mathbb{Z})^{*}$ represented by a single form. To do this he gathered together those forms that represent the same equivalence classes in $(\mathbb{Z}/|\Delta_F|\mathbb{Z})^{*}$. Thus, Lagrange was prescient in this regard since this was the fundamental idea behind genus theory.

Following the notation of the proof of Theorem 3.12 on page 132, $\overline{x}$ will continue to denote an element in $U_{\Delta_F}/H_{\Delta_F}$ in the ensuing developments.

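Example 3.4 Let $\Delta_F = -20$ where $h_{\Delta_F} = 2$, $f = (1, 0, 5)$, and $g = (2, 2, 3)$, with

\[ U_{\Delta_F} = \{\overline{1}, \overline{3}, \overline{7}, \overline{9}\}, \]

with

\[ H_{\Delta_F} = L_f = \{\overline{1}, \overline{9}\}, \quad \text{and} \quad L_g = \{\overline{3}, \overline{7}\}. \]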


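Example 3.5 Let $\Delta_F = -35$, which has $h_{\Delta_F} = 2$ with $f = (1, 1, 9)$ being the principal form and $g = (3, 1, 3)$ being in a different genus. Here

\[ U_{\Delta_F} = \{\overline{1}, \overline{3}, \overline{4}, \overline{9}, \overline{11}, \overline{12}, \overline{13}, \overline{16}, \overline{17}, \overline{27}, \overline{29}, \overline{33}\}, \]

and

\[ H_{\Delta_F} = \{\overline{1}, \overline{4}, \overline{9}, \overline{11}, \overline{16}, \overline{29}\}. \]

The above illustrate parts (a)–(b) of Theorem 3.12, and what follows illustrates part (c). Also, with reference to Remark 3.11, notice that

\[ U_{\Delta_F} = \bigcup_f L_f = L_{(1,1,9)} \cup L_{(3,1,3)} = \{\overline{1}, \overline{4}, \overline{9}, \overline{11}, \overline{16}, \overline{29}\} \cup \{\overline{3}, \overline{12}, \overline{13}, \overline{17}, \overline{27}, \overline{33}\}. \]

Since $f$ represents $\{\overline{1}, \overline{4}, \overline{9}, \overline{11}, \overline{16}, \overline{29}\}$, then $\chi_1(f) = (1/5) = 1$ and $\chi_2(f) = (1/7) = 1$, so the assigned values for $f$ are $(1, 1)$, as stated in Definition 3.14 on page 132. Since $g$ represents $\{\overline{3}, \overline{12}, \overline{13}, \overline{17}, \overline{27}, \overline{33}\}$, $\chi_1(g) = (3/5) = -1$ and $\chi_2(g) = (3/7) = -1$. Thus, the assigned values for $g$ are $(-1, -1)$. Indeed, it follows from Corollary 3.7 on page 132 that if $\Delta_F$ has $r = 2$ generic characters, then the assigned values must be $(+1, +1)$ and $(-1, -1)$. Furthermore, to depict the mechanism of the proof of part (c) in Theorem 3.12, we have the following. Since

\[ L_g = L_{(3,1,3)} = \{\overline{3}, \overline{12}, \overline{13}, \overline{17}, \overline{27}, \overline{33}\}, \]

then if we let $\overline{\ell} = \overline{3}$, then $\overline{\ell} \in L_g$, and we have $(\overline{4\ell})^{-1} = \overline{3}$, so

\[ (\overline{4\ell})^{-1} H_{\Delta_F} = \{\overline{3 \cdot 1}, \overline{3 \cdot 4}, \overline{3 \cdot 9}, \overline{3 \cdot 11}, \overline{3 \cdot 16}, \overline{3 \cdot 29}\} = \{\overline{3}, \overline{12}, \overline{27}, \overline{33}, \overline{13}, \overline{17}\} = L_g. \]

Also, if $\overline{\ell} = \overline{4}$, then $(\overline{4\ell})^{-1} = \overline{11}$, so

\[ (\overline{4\ell})^{-1} H_{\Delta_F} = \{\overline{11 \cdot 1}, \overline{11 \cdot 4}, \overline{11 \cdot 9}, \overline{11 \cdot 11}, \overline{11 \cdot 16}, \overline{11 \cdot 29}\} = \{\overline{11}, \overline{9}, \overline{29}, \overline{16}, \overline{1}, \overline{4}\} = L_f. \]

We have shown that the cosets of $H_{\Delta_F}$ in $U_{\Delta_F}$ are precisely the elements of $L_f$ and $L_g$, as asserted by part (c) of Theorem 3.12.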
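The sets in Examples 3.4 and 3.5 can be recomputed by brute force. The sketch below is an illustration only: the function names `U` and `L` are hypothetical, and `U` relies on the standard fact (not proved here) that for a field discriminant the value of $(\Delta_F/p)$ depends only on the class of $p$ modulo $|\Delta_F|$, so testing the smallest odd prime in each class is enough.

```python
from math import gcd

def is_odd_prime(p):
    """Trial-division primality test for odd primes."""
    if p < 3 or p % 2 == 0:
        return False
    d = 3
    while d * d <= p:
        if p % d == 0:
            return False
        d += 2
    return True

def U(delta):
    """Unit classes mod |delta| containing an odd prime p with (delta/p) = 1."""
    mod = abs(delta)
    classes = set()
    for m in range(1, mod):
        if gcd(m, mod) != 1:
            continue
        p = m
        while not is_odd_prime(p):       # terminates by Dirichlet's Theorem
            p += mod
        if pow(delta, (p - 1) // 2, p) == 1:   # Euler's criterion
            classes.add(m)
    return classes

def L(form, delta):
    """Unit classes mod |delta| taken by f(x, y) = a x^2 + b x y + c y^2."""
    a, b, c = form
    mod = abs(delta)
    values = {(a * x * x + b * x * y + c * y * y) % mod
              for x in range(mod) for y in range(mod)}
    return {v for v in values if gcd(v, mod) == 1}

H = L((1, 1, 9), -35)                     # principal form of discriminant -35
print(sorted(U(-35)))
print(sorted(H), sorted(L((3, 1, 3), -35)))
print(sorted((3 * h) % 35 for h in H))    # the coset 3 * H
```

Running the same two functions with `-20`, `(1, 0, 5)`, and `(2, 2, 3)` reproduces the sets of Example 3.4.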

Example 3.6 Now we illustrate Theorem 3.12 on page 132 part (c) when the principal genus has more than one class of forms, in light of Footnote 3.1 on page 133. For instance, if $\Delta_F = -23$, then $h_{\Delta_F} = 3$ and there is a single genus of forms, the principal genus, having three distinct forms: the principal form $f = (1, 1, 6)$, as well as $g = (2, 1, 3)$ and $h = (2, -1, 3)$. Also,

\[ U_{\Delta_F} = H_{\Delta_F} = \{\overline{1}, \overline{2}, \overline{3}, \overline{4}, \overline{6}, \overline{8}, \overline{9}, \overline{12}, \overline{13}, \overline{16}, \overline{18}\}, \]

so the only coset of $H_{\Delta_F}$ in $U_{\Delta_F}$ is $U_{\Delta_F} = H_{\Delta_F}$ itself. Moreover, $L_f = L_g = L_h$. It is a direct computation to show, for instance, that $f$ directly represents $\{1, 4, 6, 8, 9, 12, 16, 18\}$ in the sense that we can find $x, y$ values for $f(x, y)$ to equal any member of this set. Yet the situation for $\overline{2}, \overline{3}, \overline{13}$ is less clear, since $f$ does not represent 2, 3, or 13. However, the definition of $L_f$ requires only that we find any element in one of these classes (not necessarily the same element for each value) that $f$ does represent. Since $\ell = 117 = 2 + 23 \cdot 5 \in \overline{2}$ and

\[ f(3, 4) = \ell = 3^2 + 3 \cdot 4 + 6 \cdot 4^2, \]

then $\overline{2} \in L_f$; since $f(7, 0) = \ell = 49 \in \overline{3}$, then $\overline{3} \in L_f$, where we note that proper representation is not a requirement in Theorem 3.12 on page 132. Also, since $f(5, 2) = \ell = 59 \in \overline{13}$, then $\overline{13} \in L_f$. The reader may verify that $L_g = L_h = U_{\Delta_F}$ so, as the genus of a coset given in Definition 3.15 on page 135 tells us, the genus of $L_f = U_{\Delta_F}$ consists of all the forms of discriminant $-23$ that represent the values of $L_f$ modulo 23, namely all of $U_{\Delta_F}$.
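The search for an element of a residue class that $f$ actually represents, as carried out by hand in Example 3.6, can be sketched as a small bounded search. This is only an illustration: the function name `representative` and the search bound are hypothetical choices, and the search returns the smallest positive value found in the box, which need not be the element used in the text.

```python
def representative(form, modulus, residue, bound=10):
    """Smallest positive n = f(x, y) with n ≡ residue (mod modulus),
    searching |x|, |y| <= bound.  Returns (n, x, y) or None."""
    a, b, c = form
    best = None
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            n = a * x * x + b * x * y + c * y * y
            if n > 0 and n % modulus == residue:
                if best is None or n < best[0]:
                    best = (n, x, y)
    return best

f = (1, 1, 6)                      # principal form of discriminant -23
for r in (2, 3, 13):               # classes whose least residue f misses
    print(r, representative(f, 23, r))
```

The values $117 = f(3, 4)$, $49 = f(7, 0)$, and $59 = f(5, 2)$ from the example lie in the same three classes; any represented element of a class suffices for membership in $L_f$.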

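Example 3.7 This example illustrates the $\Delta_F > 0$ case when each genus has a single class, as a real analogue of Example 3.5 on the facing page. Let

\[ \Delta_F = 105 = 3 \cdot 5 \cdot 7, \]

for which

\[ U_{\Delta_F} = \{\overline{1}, \overline{2}, \overline{4}, \overline{8}, \overline{13}, \overline{16}, \overline{23}, \overline{26}, \overline{32}, \overline{41}, \overline{46}, \overline{52}, \overline{53}, \overline{59}, \overline{64}, \overline{73}, \overline{79}, \overline{82}, \overline{89}, \overline{92}, \overline{97}, \overline{101}, \overline{103}, \overline{104}\}. \]

Also, $h_{\Delta_F} = 4$ and there is a single class in each genus, where the inequivalent reduced forms are given by

\[ f = (1, 1, -26), \quad g = (2, 9, -3), \quad h = (7, 7, -2), \quad \text{and} \quad k = (5, 5, -4). \]

We have

\[ H_{\Delta_F} = L_f = \{\overline{1}, \overline{4}, \overline{16}, \overline{46}, \overline{64}, \overline{79}\}, \quad L_g = \{\overline{2}, \overline{8}, \overline{23}, \overline{32}, \overline{53}, \overline{92}\}, \]

\[ L_h = \{\overline{13}, \overline{52}, \overline{73}, \overline{82}, \overline{97}, \overline{103}\}, \quad \text{and} \quad L_k = \{\overline{26}, \overline{41}, \overline{59}, \overline{89}, \overline{101}, \overline{104}\}. \]

Thus,

\[ U_{\Delta_F} = L_f \cup L_g \cup L_h \cup L_k. \]

To illustrate a comment made in Remark 3.11 on page 135, we have the following. Since $\overline{2}, \overline{8} \in L_g$, then $\overline{2} \cdot \overline{8}^{-1} = \overline{2} \cdot \overline{92} = \overline{79} \in H_{\Delta_F} = L_f$. In other words, for any of the forms, if $\overline{m}, \overline{n}$ are in one of the cosets, then $\overline{m} \cdot \overline{n}^{-1} \in H_{\Delta_F} = L_f$, the elements of $U_{\Delta_F}$ represented by the principal genus $H_{\Delta_F}$, and so by the principal form $f$, as described in Footnote 3.1 on page 133.

For an example of a discriminant $\Delta_F > 0$ which is a real analogue of Example 3.6 on the preceding page, see Exercise 3.36 on page 146.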
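The coset property used in Example 3.7 is easy to confirm with modular inverses. In the sketch below, the four sets are copied from the example, the helper name `quotients_land_in` is hypothetical, and the three-argument `pow` with exponent `-1` assumes Python 3.8 or later.

```python
MOD = 105
H  = {1, 4, 16, 46, 64, 79}               # L_f, the principal genus
Lg = {2, 8, 23, 32, 53, 92}
Lh = {13, 52, 73, 82, 97, 103}
Lk = {26, 41, 59, 89, 101, 104}

def quotients_land_in(coset, subgroup, mod):
    """True if m * n^{-1} (mod mod) lies in subgroup for all m, n in coset."""
    return all((m * pow(n, -1, mod)) % mod in subgroup
               for m in coset for n in coset)

print(pow(8, -1, MOD))                    # inverse of 8 modulo 105
print((2 * pow(8, -1, MOD)) % MOD)        # the class of 2 * 8^{-1}
print([quotients_land_in(c, H, MOD) for c in (H, Lg, Lh, Lk)])
print(len(H | Lg | Lh | Lk))              # size of the union
```

The four sets are pairwise disjoint and each is a translate of `H`, which is exactly the statement that they are the cosets of $H_{\Delta_F}$ in $U_{\Delta_F}$.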

The above allows us to state a fundamental result in genus theory.


Theorem 3.13 Cosets and Genus

Let $\Delta_F$ be the discriminant of a quadratic field $F$, and let $H_{\Delta_F}$ be as in Theorem 3.12. If $J$ is a coset of $H_{\Delta_F}$ in $U_{\Delta_F}$ and $p > 2$ is a prime not dividing $2\Delta_F$, then $\overline{p} \in J$ if and only if $p$ is represented by a reduced form of discriminant $\Delta_F$ in the genus of $J$.

Proof. By Theorem 3.12 on page 132, $J = L_f$ for some primitive form $f$ of discriminant $\Delta_F$. Also,

\[ f(x, y) \equiv p \pmod{|\Delta_F|} \]

by the definition of $L_f$. Therefore, the Legendre symbol

\[ (\Delta_F/p) = (\Delta_F/f(x, y)) = 1. \]

Thus, by Exercise 3.9 on page 104, $p$ is properly represented by a primitive form $g$ with $L_g = L_f$, and $p = g(X, Y)$ for some $X, Y \in \mathbb{Z}$. By Corollary 3.2 on page 114, $g$ may be assumed to be reduced. Conversely, if $p$ is represented by a reduced form $f$ of discriminant $\Delta_F$ in the genus of $J$, then $\overline{p} \in L_f = J$ by Theorem 3.12. $\Box$

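Corollary 3.8 Let $\Delta_F$ be the discriminant of a quadratic field $F$, and let $p$ be a prime not dividing $2\Delta_F$. Then $p$ is represented by a form of discriminant $\Delta_F$ in the principal genus if and only if there exists an integer $z$ such that

\[ p \equiv z^2 + m \pmod{|\Delta_F|}, \]

where $m = 0$ or

\[
m =
\begin{cases}
z + (1 - \Delta_F)/4 & \text{if } \Delta_F \equiv 1 \pmod 4,\\
-\Delta_F/4 & \text{if } \Delta_F \equiv 0 \pmod 4.
\end{cases}
\]

Proof. By Theorem 3.12 on page 132 and Theorem 3.13, $p$ is represented by a form in the principal genus if and only if, modulo $|\Delta_F|$,

\[
p \equiv
\begin{cases}
x^2 + xy + (1 - \Delta_F)y^2/4 & \text{if } \Delta_F \equiv 1 \pmod 4,\\
x^2 - \Delta_F y^2/4 & \text{if } \Delta_F \equiv 0 \pmod 4.
\end{cases}
\]

In the case where $y$ is even, this says

\[
p \equiv
\begin{cases}
x^2 + xy + (y/2)^2 \equiv (x + y/2)^2 \pmod{|\Delta_F|} & \text{if } \Delta_F \equiv 1 \pmod 4,\\
x^2 \pmod{|\Delta_F|} & \text{if } \Delta_F \equiv 0 \pmod 4.
\end{cases}
\]

In the case where $y$ is odd, it says

\[
p \equiv
\begin{cases}
\left(x + \frac{y-1}{2}\right)^2 + x + \frac{y-1}{2} + (1 - \Delta_F)/4 \pmod{|\Delta_F|} & \text{if } \Delta_F \equiv 1 \pmod 4,\\
x^2 - \Delta_F/4 \pmod{|\Delta_F|} & \text{if } \Delta_F \equiv 0 \pmod 4.
\end{cases}
\]

Setting $z = x + y/2$ with $m = 0$ in the first case, and $z = x + (y - 1)/2$ with $m = z + (1 - \Delta_F)/4$ in the second case, yields the result for $\Delta_F \equiv 1 \pmod 4$, and setting $z = x$ in both cases yields the result for $\Delta_F \equiv 0 \pmod 4$. $\Box$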
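As a numerical sanity check on Corollary 3.8, one can list the unit residues of the shape $z^2 + m$ and compare them with the principal-genus sets computed in the examples of this section. The function name `corollary_residues` below is a hypothetical label for this sketch.

```python
from math import gcd

def corollary_residues(delta):
    """Unit classes mod |delta| of the form z^2 + m with m as in Corollary 3.8."""
    mod = abs(delta)
    residues = set()
    for z in range(mod):
        residues.add((z * z) % mod)                       # m = 0
        if delta % 4 == 1:
            residues.add((z * z + z + (1 - delta) // 4) % mod)
        else:
            residues.add((z * z - delta // 4) % mod)
    return {r for r in residues if gcd(r, mod) == 1}

for delta in (-4, -7, -8, -20, -35):
    print(delta, sorted(corollary_residues(delta)))
```

For $\Delta_F = -20$ and $\Delta_F = -35$ the output agrees with $H_{\Delta_F}$ in Examples 3.4 and 3.5.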

Example 3.8 Considering Example 3.5 on page 136 again, for $\Delta_F = -35$, we see that the cosets of $H_{\Delta_F}$ in $U_{\Delta_F}$ are $L_f$ and $L_g$ where, for instance, $\overline{12} = \overline{47} \in L_g$ with

\[ 47 = p = 3x^2 + xy + 3y^2 = 3 \cdot 1^2 + 1 \cdot (-4) + 3(-4)^2. \]

Also, $\overline{4} = \overline{109} \in L_f$ where

\[ 109 = p = x^2 + xy + 9y^2 = 4^2 + 4 \cdot 3 + 9 \cdot 3^2, \qquad (3.32) \]

and $\overline{16} = \overline{191} \in L_f$ with

\[ 191 = p = x^2 + xy + 9y^2 = 13^2 + 13 \cdot 1 + 9 \cdot 1^2. \]

To illustrate Corollary 3.8, using the notation therein, we note that in the principal genus, the prime

\[ p = 109 \equiv z^2 + z + (1 - \Delta_F)/4 \equiv 5^2 + 5 + 9 \equiv 39 \equiv 4 \pmod{|\Delta_F|}, \]

where $z = x + (y - 1)/2 = 5$ and $m = z + (1 - \Delta_F)/4$, from the $x, y$ given in (3.32). This illustrates the $\Delta_F \equiv 1 \pmod 4$ with $m \neq 0$ case in the proof of Corollary 3.8. Also, since $\overline{1} = \overline{71}$, we have

\[ p = 71 = 5^2 + 5 \cdot 2 + 9 \cdot 2^2 \equiv z^2 \equiv (x + y/2)^2 \pmod{|\Delta_F|}, \]

where $z = x + y/2 = 5 + 2/2 = 6$ and $m = 0$, which illustrates the $\Delta_F \equiv 1 \pmod 4$ with $y$ even case in the proof of Corollary 3.8.

Example 3.9 To illustrate the case where $\Delta_F \equiv 0 \pmod 4$ in Corollary 3.8, as well as to motivate the next illustration, we let $\Delta_F = -8$, where $U_{\Delta_F} = \{\overline{1}, \overline{3}\}$. Here $\overline{3} = \overline{19}$, and

\[ 19 = 1^2 + 2 \cdot 3^2 \equiv x^2 - \Delta_F/4 \equiv 3 \pmod{|\Delta_F|}, \]

illustrating the case where the $y$ value is odd. On the other hand, if $\Delta_F = -4$, then $U_{\Delta_F} = \{\overline{1}\}$ and $\overline{1} = \overline{5}$ where

\[ 5 = 1^2 + 2^2 = x^2 - y^2 \Delta_F/4 \equiv x^2 \pmod{|\Delta_F|}, \]

illustrating the remaining case where $y$ is even.

The two discriminants $\Delta_F = -4, -8$ are special from another perspective that we explore in the following depiction of representation of primes and class numbers, which we will study in greater detail in §3.5.

For an illustration of Corollary 3.8 on page 138 in the case where $\Delta_F > 0$, see Exercise 3.37 on page 146.

For the following illustration, the reader should solve Exercise 3.23 on page 144 in preparation.

Example 3.10 Corollary 3.8 allows us to categorize the principal genus via congruence conditions, especially when there is exactly one class in the principal genus. For instance, when $\Delta_F = -4$, there is only one class for the principal genus given by the unique reduced form $f(x, y) = x^2 + y^2$ of discriminant $-4$, which is our problem of representation as a sum of two squares. In this case, an odd prime $p = x^2 + y^2$ for some integers $x, y$ if and only if, by Corollary 3.8, there exists an integer $z$ such that

\[ p \equiv z^2 \pmod 4 \ \text{ or } \ p \equiv z^2 + 1 \pmod 4, \ \text{ i.e., if and only if } p \equiv 1 \pmod 4, \]

a result we have already seen in Theorem 1.13 on page 26. Similarly, if $\Delta_F = -8$, then there is only one class for the principal genus since the unique reduced form of discriminant $-8$ is $x^2 + 2y^2$. By Corollary 3.8, $p = x^2 + 2y^2$ if and only if there is an integer $z$ such that

\[ p \equiv z^2 \pmod 8 \ \text{ or } \ p \equiv z^2 + 2 \pmod 8, \ \text{ i.e., if and only if } p \equiv 1, 3 \pmod 8. \]

This is tantamount to saying that the Legendre symbol $(-2/p) = 1$ if and only if $p \equiv 1, 3 \pmod 8$, a result we know from elementary number theory (see [68, Exercise 4.3, p. 187] for instance). Also, see Exercise 3.10 on page 104.

When $\Delta_F \equiv 1 \pmod 4$ we have as an illustration the unique reduced form $x^2 + xy + 2y^2$ in the principal genus of discriminant $-7$. Here, by Corollary 3.8, an odd prime

\[ p = x^2 + xy + 2y^2 \]

if and only if for some integer $z$,

\[ p \equiv z^2 \ \text{ or } \ z^2 + z + 2 \pmod 7, \ \text{ i.e., if and only if } p \equiv 1, 2, 4 \pmod 7. \]

The latter is tantamount to saying that $p$ is a quadratic residue modulo 7, and this holds if and only if $-7$ is a quadratic residue modulo $p$. For instance, $p = 29 = (-1)^2 + (-1)(4) + 2 \cdot 4^2$.

Lastly, consider $\Delta_F = -43$, for which there is the unique reduced form $x^2 + xy + 11y^2$. Then, by Corollary 3.8, an odd prime $p = x^2 + xy + 11y^2$ if and only if there is a $z \in \mathbb{Z}$ with $p \equiv z^2 + z + 11 \pmod{43}$ or $p \equiv z^2 \pmod{43}$. However, the former congruence implies $4p \equiv (2z + 1)^2 \pmod{43}$, so this representation occurs if and only if $p$ is a quadratic residue modulo 43, and this holds if and only if $-43$ is a quadratic residue modulo $p$.
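The congruence criterion for discriminant $-7$ in Example 3.10 can be tested against a direct search. The sketch below uses the identity $4p = (2x + y)^2 + 7y^2$ to bound $y$; the function names and the search limit 500 are hypothetical choices made for this illustration.

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def represented_by_1_1_2(p):
    """Is p = x^2 + x*y + 2*y^2 solvable in integers?
    Uses 4p = (2x + y)^2 + 7y^2, so only 7y^2 <= 4p needs checking."""
    y = 0
    while 7 * y * y <= 4 * p:
        t2 = 4 * p - 7 * y * y
        t = isqrt(t2)
        if t * t == t2 and (t - y) % 2 == 0:   # then x = (t - y)/2
            return True
        y += 1
    return False

odd_primes = [p for p in range(3, 500) if is_prime(p) and p != 7]
mismatches = [p for p in odd_primes
              if represented_by_1_1_2(p) != (p % 7 in (1, 2, 4))]
print(mismatches)          # primes where search and congruence disagree
```

An empty list of mismatches in the tested range is consistent with the criterion $p \equiv 1, 2, 4 \pmod 7$; it is a check over finitely many primes, not a proof.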

At this juncture, it is worth pointing out a rather beautiful result by Jacobi (see [68, Biography 4.4, p. 192]). He discovered that if $p \equiv 3 \pmod 4$ is a prime and $p > 3$, then if $R$ is the sum of all the quadratic residues modulo $p$, and $NR$ is the sum of the quadratic nonresidues, then

\[ h_{-p} = \frac{NR - R}{p}. \]

For instance, for $p = 43$,

\[ R = 1 + 4 + 6 + 9 + 10 + 11 + 13 + 14 + 15 + 16 + 17 + 21 + 23 + 24 + 25 + 31 + 35 + 36 + 38 + 40 + 41 = 430, \]

and

\[ NR = 2 + 3 + 5 + 7 + 8 + 12 + 18 + 19 + 20 + 22 + 26 + 27 + 28 + 29 + 30 + 32 + 33 + 34 + 37 + 39 + 42 = 473, \]

so

\[ h_{-43} = (473 - 430)/43 = 1, \]

which we know from Exercise 3.23 on page 144. The first proof of this remarkable result was provided by Dirichlet in 1838, a result known today as Dirichlet's class number formula (see [68, Biography 1.8, p. 35]). The problem of determining all discriminants $\Delta_F < 0$ with $h_{\Delta_F} = 1$ has been solved for some time, and the values for which we have class number one are

\[ \Delta_F \in \{-3, -4, -7, -8, -11, -19, -43, -67, -163\}. \]

In 1934, Heilbronn and Linfoot proved that the above list could contain at most one more value (see Biography 1.3 on page 50). In 1966, Stark proved that the list is in fact complete. However, in 1952, a proof had been given by Heegner, in [40], but this proof was fragmentary and not well understood, so it was generally discredited. It turns out that it is a valid proof, as was later acknowledged after Deuring cleared it up (see Biography 3.7 on page 146).
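Jacobi's residue-sum formula stated above lends itself to a direct computation. In the following sketch, the function name is hypothetical and the argument is assumed (not checked) to be a prime $p > 3$ with $p \equiv 3 \pmod 4$.

```python
def class_number_from_residue_sums(p):
    """(NR - R)/p for a prime p > 3 with p ≡ 3 (mod 4).
    Primality of p is assumed by the caller, not verified here."""
    assert p > 3 and p % 4 == 3
    residues = {(a * a) % p for a in range(1, p)}   # nonzero squares mod p
    R = sum(residues)
    NR = p * (p - 1) // 2 - R     # all of 1..p-1 sum to p(p-1)/2
    assert (NR - R) % p == 0
    return (NR - R) // p

for p in (7, 11, 19, 23, 43):
    print(p, class_number_from_residue_sums(p))
```

For $p = 43$ the intermediate sums are $R = 430$ and $NR = 473$, matching the hand computation above, and for $p = 23$ the result agrees with $h_{-23} = 3$ from Example 3.6.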

Remark 3.12 The conditions in Example 3.10 for representations of primes do not always occur. In other words, given a form $f(x, y) = ax^2 + bxy + cy^2$ of discriminant $\Delta_F$, it is not always the case that there exist natural numbers $s, a_1, \ldots, a_s, m$, depending on $a$, $b$, and $c$, such that for an odd prime $p$ not dividing $\Delta_F$ we have

\[ p = ax^2 + bxy + cy^2 \ \text{ if and only if } \ p \equiv a_1, \ldots, a_s \pmod m. \qquad (3.33) \]

In Example 3.10, we saw several instances where (3.33) is satisfied, but these relied on $h_{\Delta_F} = 1$. When the class number is greater than one, we may not have (3.33). For instance, if $\Delta_F = -56$, then by Exercises 3.25–3.29, $h_{-56} = 4$, and there are two genera, with $x^2 + 14y^2$ and $2x^2 + 7y^2$ being in the one genus and $3x^2 + 2xy + 5y^2$ and $3x^2 - 2xy + 5y^2$ being in the other genus. Moreover, as shown in the very readable [91, Theorem, p. 424], (3.33) fails for $p = x^2 + 14y^2$. The authors, Spearman and Williams, do this by proving that every arithmetic progression $\{a + km : k = 0, 1, \ldots\}$, where $m$ is assumed even without loss of generality and $\gcd(a, m) = 1$, either contains no primes of the form $x^2 + 14y^2$ or contains primes of both forms $x^2 + 14y^2$ and $2x^2 + 7y^2$. Note, as well, that $x^2 + 14y^2$ represents 23 but not 79, and $2x^2 + 7y^2$ represents 79 but not 23. However, it is worth observing that this is not to be confused with the fact, noted in Footnote 3.1 on page 133, that since $f = (1, 0, 14)$ and $g = (2, 0, 7)$ are both in the principal genus, they represent $\{\overline{1}, \overline{9}, \overline{15}, \overline{23}, \overline{25}, \overline{39}\}$ modulo 56. In other words, even though $f$ does not represent 79, we do have that $f(3, -1) = 23 \in \overline{79} \in (\mathbb{Z}/56\mathbb{Z})^{*}$, and similarly, $g(8, -3) = 191 \in \overline{23} \in (\mathbb{Z}/56\mathbb{Z})^{*}$, even though $g$ does not represent 23. This latter interpretation via Theorem 3.12 on page 132 allowed us to view the cosets and genera with an ease that the above more rigid interpretation did not allow. Yet to consider the solvability of (3.33), we cannot allow the coset interpretation since it does not apply to this most interesting question.
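The distinction drawn in Remark 3.12 between representing an integer and representing its residue class can be seen in a short search over the two forms of the principal genus of discriminant $-56$. The helper names and the bound 1000 are hypothetical choices for this sketch.

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def represents(a, c, n):
    """Is n = a*x^2 + c*y^2 solvable in integers (a, c > 0)?"""
    y = 0
    while c * y * y <= n:
        rest = n - c * y * y
        if rest % a == 0:
            s = rest // a
            if isqrt(s) ** 2 == s:
                return True
        y += 1
    return False

classes = {1, 9, 15, 23, 25, 39}          # principal-genus classes mod 56
primes = [p for p in range(3, 1000) if is_prime(p) and p % 56 in classes]
by_f = [p for p in primes if represents(1, 14, p)]   # x^2 + 14 y^2
by_g = [p for p in primes if represents(2, 7, p)]    # 2 x^2 + 7 y^2

print(by_f[:6])
print(by_g[:6])
print(23 % 56, 79 % 56)    # 23 and 79 lie in the same class modulo 56
```

In the tested range every such prime is represented by exactly one of the two forms, while 23 and 79 share a class modulo 56 but fall on opposite sides, which is the phenomenon the remark describes.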

Now we are ready for the exact number of genera. The following was proved, in greater generality, by Gauss in 1801.

Theorem 3.14 The Genus Group

Suppose that $F$ is a quadratic field of discriminant $\Delta_F$ divisible by $r$ distinct primes. Then each of the following holds.

(a) The $h_{\Delta_F}$ proper equivalence classes of forms can be subdivided into $2^{r-1}$ genera consisting of $h_{\Delta_F}/2^{r-1}$ classes of forms each, which comprise a subgroup $G_{\Delta_F}$ of $C_{\Delta_F}$ under Dirichlet composition.

(b) With $U_{\Delta_F}$ and $H_{\Delta_F}$ as given in Theorem 3.12 on page 132,

\[ G_{\Delta_F} \cong \frac{U_{\Delta_F}}{H_{\Delta_F}}, \]

and $|G_{\Delta_F}| = 2^{r-1}$.

Proof. By Exercise 3.9 on page 104 there exists at least one class of forms in each genus. Also, by Exercise 3.26 on page 145, there are an equal number of classes in each genus. Lastly, by Lemma 3.5 on page 129 there are $2^{r-1}$ possible genera, with the product, given in Claim 3.4 on page 130, being $+1$, since there are that many possible $r$-tuples of $+1$'s and $-1$'s corresponding to the Jacobi symbols. Under Dirichlet composition given in Definition 3.6 on page 107, with the identity element being the principal genus, $P$, this is a subgroup $G_{\Delta_F}$ of $C_{\Delta_F}$, which establishes part (a).

For part (b), let

\[ f \sim_{\mathrm{gen}} g \]

denote that $f$ and $g$ are in the same genus, namely the same equivalence class in $G_{\Delta_F}$. Also, let $\overline{f}^{\mathrm{gen}}$ denote this class and define the map

\[ \rho : G_{\Delta_F} \to \frac{U_{\Delta_F}}{H_{\Delta_F}} \]

via

\[ \overline{f}^{\mathrm{gen}} \mapsto \overline{L_f}, \]

where

\[ \overline{L_f} = \frac{L_f}{H_{\Delta_F}}, \]

observing, from Remark 3.11 on page 135, that

\[ U_{\Delta_F} = \bigcup_f L_f, \]

so

\[ \frac{U_{\Delta_F}}{H_{\Delta_F}} \cong \bigcup_f \overline{L_f}. \]

In addition, note that by parts (c)–(d) of Theorem 3.12 on page 132,

\[ \bigcup_f \overline{L_f} \cong \left\{ L_{\overline{f}^{\mathrm{gen}}} \right\}_{\overline{f}^{\mathrm{gen}} \in G_{\Delta_F}}, \]

and

\[ L_{\overline{f}^{\mathrm{gen}}} = L_{\overline{g}^{\mathrm{gen}}} \ \text{ if and only if } \ f \sim_{\mathrm{gen}} g, \]

so $\rho$ is not only well defined but is indeed a bijection. By part (a), $|G_{\Delta_F}| = 2^{r-1}$, which is the entire result. $\Box$

Example 3.11 If, as in Example 3.6 on page 136, $\Delta_F = -23$, then $r = 1$, so by Theorem 3.14 there is $2^{r-1} = 1$ genus. Since $h_{\Delta_F} = 3$, the classes of forms $f = (1, 1, 6)$, $g = (2, 1, 3)$, and $h = (2, -1, 3)$ are all in the principal genus, with

\[ U_{\Delta_F} = H_{\Delta_F} = \left((\mathbb{Z}/|\Delta_F|\mathbb{Z})^{*}\right)^2 = \{\overline{1}, \overline{2}, \overline{3}, \overline{4}, \overline{6}, \overline{8}, \overline{9}, \overline{12}, \overline{13}, \overline{16}, \overline{18}\}, \]

so there are $h_{\Delta_F}/2^{r-1} = 3$ proper equivalence classes of forms in the principal genus.

On the other hand, if $\Delta_F = -35$ as in Example 3.5 on page 136, then $r = 2 = h_{-35}$, so there are $2^{r-1} = 2$ genera, each having $h_{-35}/2^{r-1} = 1$ proper equivalence class.
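The counts in Theorem 3.14 can be checked for small negative discriminants by listing reduced forms and grouping them by the set $L_f$, which by Theorem 3.12(d) determines the genus. The sketch below is an illustration only: the function names are hypothetical, and the reduction conditions used are the usual ones for positive definite forms, $-a < b \le a \le c$ with $b \ge 0$ when $a = c$.

```python
from math import gcd

def reduced_forms(delta):
    """Primitive reduced positive definite forms (a, b, c) of discriminant
    delta < 0, with -a < b <= a <= c and b >= 0 whenever a == c."""
    assert delta < 0 and delta % 4 in (0, 1)
    forms = []
    a = 1
    while 3 * a * a <= -delta:            # reduced forms have 3a^2 <= |delta|
        for b in range(-a + 1, a + 1):
            if (b * b - delta) % (4 * a) != 0:
                continue
            c = (b * b - delta) // (4 * a)
            if c < a or (c == a and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

def genera(delta):
    """Group the reduced forms by L_f, the unit classes mod |delta| they hit."""
    mod = abs(delta)
    groups = {}
    for (a, b, c) in reduced_forms(delta):
        values = {(a * x * x + b * x * y + c * y * y) % mod
                  for x in range(mod) for y in range(mod)}
        key = frozenset(v for v in values if gcd(v, mod) == 1)
        groups.setdefault(key, []).append((a, b, c))
    return list(groups.values())

for delta in (-23, -35, -56):
    g = genera(delta)
    print(delta, len(reduced_forms(delta)), len(g), g)
```

For $\Delta_F = -23$, $-35$, and $-56$ this gives $1$, $2$, and $2$ genera respectively, each containing $h_{\Delta_F}/2^{r-1}$ classes.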

Example 3.12 If $\Delta_F = -420$, then it can be shown that $h_{-420} = 8$ and in this case, $r = 4$, so

\[ h_{-420} = 2^{r-1} \qquad (3.34) \]

and each genus therefore has exactly one class of forms by Theorem 3.14, and $|G_{\Delta_F}| = 8$. Indeed, by Exercise 3.27 on page 145, the criterion for the property that every genus of forms of discriminant $\Delta_F = -4n$ consists of a single class is that (3.34) holds. Also, see Exercise 3.33 for other criteria.

Remark 3.13 Gauss not only proved Theorem 3.14 on the preceding page, but he also showed that the principal genus contains exactly those forms that are squares of some form under Dirichlet composition, a result sometimes called the duplication or squaring theorem. In other words, if $F$ is a quadratic field of discriminant $\Delta_F$ and if $P$ denotes the principal genus of discriminant $\Delta_F$, then $P \cong C_{\Delta_F}^2$. It is also the case, related to the above, that the set of ambiguous forms $A_{\Delta_F}$ is a subgroup of $C_{\Delta_F}$ and has cardinality

\[ |A_{\Delta_F}| = 2^{r-1}, \]

where $r$ is the number of distinct prime divisors of the discriminant $\Delta_F$. It follows that the genus group $G_{\Delta_F}$ and group of ambiguous forms $A_{\Delta_F}$ are related by

\[ A_{\Delta_F} \cong G_{\Delta_F}. \]

Remark 3.14 Some concluding remarks for this section to summarize the above developments are in order. Roughly speaking, when we look at forms in $C_{\Delta_F}$, we are essentially considering sets of integers represented by forms. In this case, it is sufficient to consider whether or not $\Delta_F$ is a quadratic residue modulo a given prime to determine whether or not such a prime is represented by a form of discriminant $\Delta_F$. Essentially this is what Exercise 3.9 on page 104 tells us. When looking at forms in $G_{\Delta_F}$, we are considering sets of congruence classes modulo $|\Delta_F|$ to which the represented integers belong. When there is a single class of forms (from $C_{\Delta_F}$) in each genus, then the question of which primes are represented by a given form of discriminant $\Delta_F$ is completely answered by congruence conditions. Many such illustrations were considered in Example 3.10 on page 140. However, if there exists more than one class (from $C_{\Delta_F}$) in a given genus, then it is possible that no such congruence conditions exist to determine which of the forms from the distinct classes, in the same genus, represent a given prime. For instance, the case $\Delta_F = -56$, considered in Remark 3.12 on page 141, is one such case.

Essentially, two forms of discriminant $\Delta_F$ are in the same genus if they represent the same values in $(\mathbb{Z}/|\Delta_F|\mathbb{Z})^{*}$, and this is what Theorem 3.12 on page 132 tells us. Theorem 3.14 on page 142 groups these forms into equivalence classes related to the results in Theorem 3.12 in a very natural fashion. The cosets $L_f$ of $H_{\Delta_F}$ in $U_{\Delta_F}$ determine in which genus the form $f$ lies. This is tied to the fact that forms $f$ and $g$ are in the same genus if and only if they have the same assigned characters, and this is tantamount to $L_f = L_g$ as cosets, a beautiful interconnection. Furthermore, the duplication theorem mentioned in Remark 3.13 on the previous page tells us that the principal genus consists of just the squares of forms under composition. Also, Remark 3.13 tells us that the genus group is essentially the group of ambiguous forms, the central topic of §3.3.

Exercises

3.23. Prove that for $D \in \{-4, -7, -8, -11, -12, -16, -19, -28, -43\}$ there is exactly one class in the principal genus; indeed that $h_D = 1$.
(Hint: See the solution of Exercise 3.13 on page 413.)

3.24. By Exercise 3.23, the first negative discriminant $D \equiv 1 \pmod 4$ with $h_D > 1$ is $D = -15$. Show that $h_D = 2$, and determine the congruence classes for representation by the principal form $(1, 1, 4)$.

3.25. Prove that $x^2 + 14y^2$ and $2x^2 + 7y^2$ are in the same genus of discriminant $-56$ by showing that if $p \neq 2, 7$ is a prime then

\[ p = x^2 + 14y^2 \ \text{ or } \ p = 2x^2 + 7y^2 \ \text{ for some } x, y \in \mathbb{Z} \]

if and only if

\[ p \equiv 1, 9, 15, 23, 25, 39 \pmod{56}. \]

3.26. Prove that the classes belonging to the principal genus $P$ form a subgroup of $C_{\Delta_F}$. Then show that every genus forms a coset of $P$ in $C_{\Delta_F}$. Conclude that there are an equal number of classes in each genus.

3.27. Prove that the following are equivalent:

(a) Every genus of forms with discriminant $\Delta_F = -4n$ for $n \in \mathbb{N}$ consists of a single class.

(b) $h_{-4n} = 2^{r-1}$ where $r$ is the number of distinct prime divisors of $\Delta_F$.

(Euler found 65 values of $n$ for which this holds and called such $n$ convenient numbers. No others are known.)

3.28. Prove that $3x^2 + 2xy + 5y^2$ and $3x^2 - 2xy + 5y^2$ are in the same genus of discriminant $-56$ by showing that if $p \neq 2, 7$ is a prime then

\[ p = 3x^2 \pm 2xy + 5y^2 \ \text{ for some } x, y \in \mathbb{Z} \]

if and only if

\[ p \equiv 3, 5, 13, 19, 27, 45 \pmod{56}. \]

3.29. Prove that $h_{-56} = 4$.

3.30. Prove that the product of any two forms of discriminant $-4n$ for $n \in \mathbb{N}$ is of the form $X^2 + nY^2$ for some $X, Y \in \mathbb{Z}$.

3.31. Prove the assertion made in Remark 3.6 on page 108 that $(ac, b, 1) \sim (1, 0, -\Delta_F/4)$ when $\Delta_F \equiv 0 \pmod 4$ and $(ac, b, 1) \sim (1, 1, (1 - \Delta_F)/4)$ when $\Delta_F \equiv 1 \pmod 4$.
(Hint: When $\Delta_F \equiv 0 \pmod 4$, in Definition 3.1 on page 98, select $p = b/2$, $q = 1$, $r = -1$, and $s = 0$, and when $\Delta_F \equiv 1 \pmod 4$ select $p = -(1 + b)/2$, $q = -1$, $r = 1$, and $s = 0$.)

3.32. Let $\Delta_F < 0$. Prove that a reduced positive definite form $f = (a, b, c)$ of discriminant $\Delta_F$ has order 1 or 2 if and only if $b = 0$, $a = b$, or $a = c$.

3.33. Prove that the following are equivalent.

(a) Every genus of forms with discriminant $\Delta_F = -4n$ for $n \in \mathbb{N}$ consists of a single class.

(b) Every reduced positive definite form $f = (a, b, c)$ of discriminant $-4n$ satisfies either $b = 0$, $a = b$, or $a = c$.

(Hint: Use Exercise 3.32 in conjunction with Remark 3.13 on page 143, where it is noted that the principal genus is the group of squares.)

To solve Exercises 3.34–3.35, use the techniques employed in the solution of Exercise 3.13 on page 413.

3.34. Find all primitive reduced forms of discriminant $\Delta_F = -71$.

3.35. Find all primitive reduced forms of discriminant $\Delta_F = -80$.

3.36. Given $\Delta_F = 229$, we have that $h_{\Delta_F} = 3$ where Theorem 3.14 on page 142 tells us there is a single genus of forms. Find three inequivalent reduced forms of discriminant $\Delta_F$ and a distinct prime $p$ represented by each form such that $p \equiv 1 \pmod{229}$.

3.37. With reference to Exercise 3.36, find an integer $z \neq 0$ for each of the three primes $p$ found therein such that $p \equiv z^2 + z - 57 \pmod{229}$. This illustrates Corollary 3.8 on page 138 for the case $\Delta_F > 0$.

Biography 3.7 Max Deuring (1907–1984) was born in Göttingen, Germany on December 9, 1907. He entered the University of Göttingen in 1926, where he studied mathematics and physics. In 1931, under the supervision of Emmy Noether, he received his doctorate for a thesis entitled Arithmetische Theorie der algebraischen Funktionen – see Biography 2.1 on page 73. One of his strengths was the ability to simplify and generalize existing results, one of these being the aforementioned work of Heegner on page 141. In his first paper, published in 1931, he generalized Hilbert’s theory of prime divisors in Galois fields to more general fields – see Biography 3.5 on page 127. In 1931, at the University of Leipzig, Deuring was appointed as van der Waerden’s assistant. In 1937, Deuring went to the University of Jena where he stayed for six years. In 1937 and 1940 he published two papers which were his greatest contributions. These publications generalized Hasse’s results on the Riemann hypothesis for the zeta function associated with an elliptic curve over a finite field. To do this he extended Hasse’s idea from curves of genus 1 to curves of higher genus using his algebraic theory of correspondences, which André Weil later used to generalize the Riemann hypothesis to function fields of arbitrary genus. In 1950, Deuring was appointed to fill the chair vacated by Herglotz at Göttingen, which Deuring held until his retirement in 1976. He died in Göttingen on December 20, 1984.

Among his honours were election to the Academy of Science and Literature in Mainz, and the Göttingen Academy of Sciences.


Biography 3.8 Henri Jules Poincaré (1854–1912) was born in Nancy, France on April 29, 1854. He entered the École Polytechnique in 1873, and graduated in 1875. After receiving his doctorate, he was appointed to teach at the University of Caen, but remained there for only two years. In 1881, he was appointed to a chair in the Faculty of Science in Paris. Also, in 1886, with the support of Hermite – see Biography 3.4 on page 126 – he was nominated for a chair at the Sorbonne. He held these two chairs until his untimely death at the age of 58 on July 17, 1912 in Paris.

Poincaré created the theory of automorphic forms, non-Euclidean geometry, and complex functions. His contributions to algebraic topology are also seminal and the Poincaré conjecture in that area remains a major challenge. In his paper, published in 1890, on the three-body problem, he created new avenues in celestial mechanics, and gave the first mathematical description of chaotic motion, which essentially began the modern study of dynamical systems. Indeed, in three volumes published between 1892 and 1899, he aimed to completely characterize all motions of mechanical systems. He also wrote on the philosophy of mathematics and science in general. In that vein, a quote from an article published in 1904 is germane: “It is by logic we prove, it is by intuition that we invent.” Another quote, from an address given at his funeral, is a fitting bottom line: “He was a mathematician, geometer, philosopher, and man of letters, who was a kind of poet of the infinite, a kind of bard of science.”


3.5 Representation

Mathematics, rightly viewed, possesses not only truth, but supreme beauty – a beauty cold and austere, like that of sculpture, without appeal to any part of our weaker nature, without the gorgeous trappings of painting or music, sublimely pure, and capable of stern perfection such as only the greatest art can show.

from Philosophical Essays (1910)
Bertrand Russell (1872–1970), British philosopher and mathematician

We have looked at some representation problems already in Example 3.10 on page 140 for positive definite forms where we looked at representation of primes. We now look at the problem more extensively. Indeed, as mentioned in Example 3.10, the problem of representation as a sum of two integer squares is solved via the consideration of $\Delta_F = -4$, the discriminant of $F = \mathbb{Q}(\sqrt{-1})$. We also looked, in that example, at representations of the form $x^2 + 2y^2$, emanating from $\Delta_F = -8$, the discriminant of $\mathbb{Q}(\sqrt{-2})$. Some other special forms were considered as well. We now look at more general results based upon the class numbers of quadratic fields that we linked to the form class group in §3.2. Recall that by Corollary 3.4 on page 117, we know that $h_{\mathcal{O}_F} < \infty$.

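Theorem 3.15 Prime Representation and $h_{\mathcal{O}_F}$

Let $F$ be a quadratic field with discriminant $\Delta_F$ and (wide) class number $h_{\mathcal{O}_F}$. Suppose that $p > 2$ is a prime such that $\gcd(\Delta_F, p) = 1$ and $\Delta_F$ is a quadratic residue modulo $p$. Then the following hold.

(a) If either $\Delta_F < 0$ or $\Delta_F > 0$ and there exists a $u \in U_F$ with $N_F(u) = -1$, then there exist relatively prime integers $a, b$ such that

\[ p^{h_{\mathcal{O}_F}} = \begin{cases} a^2 - \Delta_F b^2 & \text{if } \Delta_F \equiv 1 \pmod 8, \\ a^2 - \frac{\Delta_F}{4} b^2 & \text{if } \Delta_F \equiv 0 \pmod 4, \\ a^2 + ab + \frac{1}{4}(1 - \Delta_F) b^2 & \text{if } \Delta_F \equiv 5 \pmod 8. \end{cases} \]

(b) If $\Delta_F > 0$ and there does not exist a $u \in U_F$ with $N_F(u) = -1$, then there exist relatively prime integers $a, b$ such that

\[ p^{h_{\mathcal{O}_F}} = \begin{cases} \pm(a^2 - \Delta_F b^2) & \text{if } \Delta_F \equiv 1 \pmod 8, \\ \pm\left(a^2 - \frac{\Delta_F}{4} b^2\right) & \text{if } \Delta_F \equiv 0 \pmod 4, \\ \pm\left(a^2 + ab + \frac{1}{4}(1 - \Delta_F) b^2\right) & \text{if } \Delta_F \equiv 5 \pmod 8. \end{cases} \]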

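Proof. By Theorem 2.4 on page 60, since $p > 2$, then if $(\Delta_F/p) = 1$, we have $(p) = \mathcal{P}_1 \mathcal{P}_2$ where $\mathcal{P}_j$ are distinct prime $\mathcal{O}_F$-ideals for $j = 1, 2$. Thus,

\[ (p^{h_{\mathcal{O}_F}}) = (p)^{h_{\mathcal{O}_F}} = \mathcal{P}_1^{h_{\mathcal{O}_F}} \mathcal{P}_2^{h_{\mathcal{O}_F}} \sim (1), \]

since $\mathcal{P}_j^{h_{\mathcal{O}_F}} \sim (1)$ for $j = 1, 2$ by Exercise 3.18 on page 117. Hence, $\mathcal{P}_j^{h_{\mathcal{O}_F}}$ is a principal ideal for $j = 1, 2$. Let

\[ \mathcal{P}_1^{h_{\mathcal{O}_F}} = \left( \frac{u + v\sqrt{\Delta_F}}{2} \right) \]

where $u \equiv v \pmod 2$ if $\Delta_F \equiv 1 \pmod 4$, and $u$ is even if $\Delta_F \equiv 0 \pmod 4$. Then via the proof of Theorem 2.4 on page 60 we know that $\mathcal{P}_2$ must be the conjugate of $\mathcal{P}_1$, namely

\[ \mathcal{P}_2^{h_{\mathcal{O}_F}} = \left( \frac{u - v\sqrt{\Delta_F}}{2} \right). \]

Hence,

\[ (p^{h_{\mathcal{O}_F}}) = \left( \frac{u^2 - \Delta_F v^2}{4} \right), \]

so there exists an $\alpha \in U_F$ such that

\[ p^{h_{\mathcal{O}_F}} = \alpha \left( \frac{u^2 - \Delta_F v^2}{4} \right). \]

However,

\[ \alpha = \frac{4 p^{h_{\mathcal{O}_F}}}{u^2 - \Delta_F v^2} \in \mathbb{Q}. \]

But, by Corollary 1.2 on page 4, $\mathcal{O}_F \cap \mathbb{Q} = \mathbb{Z}$, so $\alpha \in U_{\mathbb{Z}} = \{\pm 1\}$. Thus,

\[ 4 p^{h_{\mathcal{O}_F}} = \pm(u^2 - \Delta_F v^2). \tag{3.35} \]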

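Claim 3.5 If $\Delta_F \equiv 0 \pmod 4$, then $\gcd(u/2, v) = 1$, and if $\Delta_F \equiv 1 \pmod 4$, $\gcd(u, v) = 1$ or $2$.

If $\Delta_F \equiv 1 \pmod 4$, let $q > 2$ be a prime such that $q \mid \gcd(u, v)$. Then there exist integers $x, y$ such that $u = qx$ and $v = qy$, where $x \equiv y \pmod 2$. Therefore, by (3.35), $q^2 \mid 4 p^{h_{\mathcal{O}_F}}$, but $q > 2$ so $q = p$. Hence,

\[ \mathcal{P}_1^{h_{\mathcal{O}_F}} = (p) \left( \frac{x + y\sqrt{\Delta_F}}{2} \right) = \mathcal{P}_1 \mathcal{P}_2 \left( \frac{x + y\sqrt{\Delta_F}}{2} \right), \]

which forces $\mathcal{P}_2 \mid \mathcal{P}_1^{h_{\mathcal{O}_F}}$, contradicting that $\mathcal{P}_1$ and $\mathcal{P}_2$ are distinct $\mathcal{O}_F$-ideals. We have shown that $\gcd(u, v) = 2^c$ for some integer $c \geq 0$. It follows from (3.35) that $4^c \mid 4$ so $c = 0$ or $c = 1$.

If $\Delta_F \equiv 0 \pmod 4$, and $q$ is a prime such that $q \mid \gcd(u/2, v)$, then there exist integers $x, y$ such that $u = 2qx$ and $v = qy$, so

\[ p^{h_{\mathcal{O}_F}} = \pm\left( (qx)^2 - (\Delta_F/4)(qy)^2 \right) \]

which forces $p = q$ and this leads to a contradiction as above. This is Claim 3.5.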


If $\Delta_F < 0$ then the plus sign holds in (3.35), since $u^2 - \Delta_F v^2 > 0$. When $\Delta_F > 0$ and there exists an $\alpha \in U_F$ with $N_F(\alpha) = -1$, we may multiply by

\[ N_F(\alpha) = N(r + s\sqrt{\Delta_F}) = r^2 - \Delta_F s^2 = -1 \]

to get

\[ -(u^2 - \Delta_F v^2) = (r^2 - \Delta_F s^2)(u^2 - \Delta_F v^2) = (ru + \Delta_F sv)^2 - \Delta_F (rv + su)^2. \]

To complete the proof, we need only show how the $a, b$ may be selected to satisfy parts (a)–(b) of our theorem.

When $\Delta_F \equiv 1 \pmod 8$, then by (3.35), if $u$ and $v$ are odd, we get $u^2 - \Delta_F v^2 \equiv 0 \pmod 8$, so $4 p^{h_{\mathcal{O}_F}} \equiv 0 \pmod 8$, contradicting that $p > 2$. Thus, by Claim 3.5, $\gcd(u, v) = 2$ so we select $a = u/2$ and $b = v/2$. If $\Delta_F \equiv 0 \pmod 4$, then by Claim 3.5, we may select $a = u/2$ and $b = v$. Lastly, when $\Delta_F \equiv 5 \pmod 8$, since $u \equiv v \pmod 2$, set $u = b + 2a$ and $b = v$ where $a, b \in \mathbb{Z}$. Then (3.35) becomes,

\[ \pm 4 p^{h_{\mathcal{O}_F}} = u^2 - \Delta_F v^2 = (b + 2a)^2 - \Delta_F b^2 = 4a^2 + 4ab + (1 - \Delta_F) b^2, \]

so

\[ p^{h_{\mathcal{O}_F}} = \pm\left( a^2 + ab + \tfrac{1}{4}(1 - \Delta_F) b^2 \right), \]

which secures our result. □
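As a numerical illustration of Theorem 3.15 when the class number exceeds 1, the following sketch takes $\Delta_F = -20$, for which Exercise 3.46 quotes class number 2, and searches for coprime $a, b$ with $p^2 = a^2 + 5b^2$ for each odd prime $p < 200$ having $(-20/p) = 1$. The function names and the search bound are our own choices for illustration and are not part of the text.

```python
from math import gcd, isqrt

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

def coprime_representation(n, k):
    """Return coprime (a, b) with a*a + k*b*b == n, or None if none exists."""
    for b in range(isqrt(n // k) + 1):
        rest = n - k * b * b
        a = isqrt(rest)
        if a * a == rest and gcd(a, b) == 1:
            return a, b
    return None

DELTA, H = -20, 2  # discriminant and class number as quoted in Exercise 3.46

def check(limit=200):
    """Map each odd prime p < limit with (DELTA/p) = 1 to a representation of p^H."""
    found = {}
    for p in range(3, limit):
        if is_prime(p) and DELTA % p != 0 and legendre(DELTA, p) == 1:
            found[p] = coprime_representation(p ** H, -DELTA // 4)
    return found

if __name__ == "__main__":
    for p, rep in check().items():
        print(p, rep)
```

For instance, $3^2 = 2^2 + 5 \cdot 1^2$ and $7^2 = 2^2 + 5 \cdot 3^2$, although neither 3 nor 7 is itself of the form $a^2 + 5b^2$.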

Remark 3.15 As a counterfoil to Theorem 3.15 on page 148, we note that, by Exercise 3.9 on page 104, if $\Delta_F$ is not a quadratic residue modulo a prime $p > 2$, then there is no binary quadratic form that represents $p^k$ for any positive integer $k$. Hence, there cannot exist integers $(a, b, c)$ such that $p^k = ax^2 + bxy + cy^2$ for any integers $x, y$.

Theorem 3.15 has certain value when $h_{\mathcal{O}_F} = 1$. In particular, we have the following results, the first two of which are a recapitulation of what we discussed in Example 3.10 on page 140 (the first of which also appears in Theorem 1.13 on page 26), via Theorem 3.15 this time.

Corollary 3.9 Let $p$ be a prime. Then there exist relatively prime integers $a, b$ such that

\[ p = a^2 + b^2 \text{ if and only if } p = 2 \text{ or } p \equiv 1 \pmod 4. \]

Proof. By Theorem 3.2 on page 102 and Theorem 3.6 on page 113, for $\Delta_F = -4$,

\[ h_{\mathcal{O}_F} = h_{\mathbb{Z}[\sqrt{-1}]} = 1. \]

Thus, by Theorem 3.15, if $(\Delta_F/p) = 1$, namely $p \equiv 1 \pmod 4$, then $p = a^2 + b^2$ for $a, b \in \mathbb{N}$. Since $2 = 1^2 + 1^2$, then we have one direction. Conversely, if $p = a^2 + b^2$, and $p > 2$, then by Exercise 3.9 on page 104, $(-4/p) = (-1/p) = 1$, which implies that $p \equiv 1 \pmod 4$. □


Corollary 3.10 Let $p$ be a prime. Then there exist relatively prime integers $a, b$ such that

\[ p = a^2 + 2b^2 \text{ if and only if } p = 2 \text{ or } p \equiv 1, 3 \pmod 8. \]

Proof. First, we know that $(-8/p) = (-2/p) = 1$ if and only if $p \equiv 1, 3 \pmod 8$ – see Example 3.10. By Theorem 3.2 and Theorem 3.6, for $\Delta_F = -8$,

\[ h_{\mathcal{O}_F} = h_{\mathbb{Z}[\sqrt{-2}]} = 1. \]

Therefore, by Theorem 3.15, if $(-8/p) = 1$, $p = a^2 + 2b^2$ for $a, b \in \mathbb{N}$. Also, $2 = 0^2 + 2 \cdot 1^2$. Conversely, if

\[ p = a^2 + 2b^2, \text{ and } p > 2, \]

then by Exercise 3.9 on page 104, $(-8/p) = (-2/p) = 1$. □

Corollary 3.11 Let $p$ be a prime. Then there exist relatively prime integers $a, b$ such that

\[ p = a^2 + ab + b^2 \text{ if and only if } p = 3 \text{ or } p \equiv 1 \pmod 3. \]

Proof. From Exercise 3.38 on the next page, $(-3/p) = 1$ if and only if $p \equiv 1 \pmod 3$. By Example 3.10, Theorem 1.3 on page 6, and Theorem 3.6 on page 113, we have that

\[ h_{-3} = h_{\mathbb{Z}[(1+\sqrt{-3})/2]} = 1. \]

Thus, by Theorem 3.15, if $(\Delta_F/p) = (-3/p) = 1$, then

\[ p = a^2 + ab + b^2 \text{ for some integers } a, b. \]

Also $3 = 1^2 + 1 \cdot 1 + 1^2$. Conversely, by Exercise 3.9 on page 104, if $p > 3$ and $p = a^2 + ab + b^2$, then $(-3/p) = 1$. □

Corollary 3.12 Let $p$ be a prime. Then there exist relatively prime integers $a, b$ such that $p = a^2 + 7b^2$ if and only if $p = 7$ or

\[ p \equiv 1, 9, 11, 15, 23, 25 \pmod{28}. \]

Proof. By Exercise 3.39 on the next page, $(-7/p) = 1$ if and only if

\[ p \equiv 1, 9, 11, 15, 23, 25 \pmod{28}. \]

Also, by Theorem 1.3, Theorem 3.6, and Example 3.10, for $\Delta_F = -7$,

\[ h_{\mathcal{O}_F} = h_{\mathbb{Z}[(1+\sqrt{-7})/2]} = h_{-7} = 1. \]

Therefore, by Theorem 3.15, if $(-7/p) = 1$, $p = a^2 + 7b^2$ for $a, b \in \mathbb{N}$. Also, $7 = 0^2 + 7 \cdot 1^2$. Conversely, if

\[ p = a^2 + 7b^2, \text{ and } p \neq 7, \]

then by Exercise 3.9 on page 104, $(-7/p) = 1$. □
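The congruence conditions in Corollaries 3.9–3.12 are easy to test by brute force. The sketch below compares, for every prime $p < 500$, whether $p$ is represented by each form with whether $p$ satisfies the stated condition. Searching only over $a, b \geq 0$ is our simplification: for these four forms a representation with a negative entry can always be traded for a nonnegative one, and coprimality of $a, b$ is automatic when the value is prime. The helper names are illustrative only.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

def represented(p, form):
    """True if form(a, b) == p for some integers 0 <= a, b <= sqrt(p)."""
    bound = isqrt(p)
    return any(form(a, b) == p
               for a in range(bound + 1)
               for b in range(bound + 1))

# corollary label -> (form, congruence condition stated in the corollary)
CASES = {
    "3.9": (lambda a, b: a * a + b * b,
            lambda p: p == 2 or p % 4 == 1),
    "3.10": (lambda a, b: a * a + 2 * b * b,
             lambda p: p == 2 or p % 8 in (1, 3)),
    "3.11": (lambda a, b: a * a + a * b + b * b,
             lambda p: p == 3 or p % 3 == 1),
    "3.12": (lambda a, b: a * a + 7 * b * b,
             lambda p: p == 7 or p % 28 in (1, 9, 11, 15, 23, 25)),
}

def mismatches(limit=500):
    """List (corollary, prime) pairs where representation and condition disagree."""
    bad = []
    for name, (form, predicted) in CASES.items():
        for p in range(2, limit):
            if is_prime(p) and represented(p, form) != predicted(p):
                bad.append((name, p))
    return bad

if __name__ == "__main__":
    print(mismatches())
```

An empty list of mismatches is evidence, not a proof; the proofs above are what establish the statements for all primes.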

Exercises

3.38. Prove that $(-3/p) = 1$ for a prime $p > 3$ if and only if $p \equiv 1 \pmod 3$.
(Hint: You may use the fact from [68, Example 4.11, p. 191], that $(3/p) = 1$ if and only if $p \equiv \pm 1 \pmod{12}$ and $(3/p) = -1$ if and only if $p \equiv \pm 5 \pmod{12}$.)

3.39. Prove that $(-7/p) = 1$ for an odd prime $p$ if and only if $p \equiv 1, 9, 11, 15, 23, 25 \pmod{28}$.

In Exercises 3.40–3.43, use the techniques of Corollary 3.11 on the previous page to solve the representation problems.

3.40. Prove that a prime $p$ is representable in the form

\[ p = a^2 + ab + 3b^2 \text{ for relatively prime } a, b \in \mathbb{Z} \]

if and only if

\[ p = 11 \text{ or } p \equiv 1, 3, 5, 9, 15, 23, 25, 27, 31, 37 \pmod{44}. \]


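3.41. Prove that a prime $p$ is representable in the form

\[ p = a^2 + ab + 5b^2 \text{ for relatively prime } a, b \in \mathbb{N} \]

if and only if $p = 19$ or

\[ p \equiv 1, 5, 7, 9, 11, 17, 23, 25, 35, 39, 43, 45, 47, 49, 55, 61, 63, 73 \pmod{76}. \]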




\[ p \equiv 1, 9, 11, 13, 15, 17, 21, 23, 25, 31, 35, 41, 47, 49, 53, 57, 59, 67, 79, 81, 83, 87, 95, 97, 99, 101, 103, 107, 109, 111, 117, 121, 127, 133, 135, 139, 143, 145, 153, 165, 167, 169 \pmod{172}. \]





\[ p \equiv 1, 9, 15, 17, 19, 21, 23, 25, 29, 33, 35, 37, 39, 47, 49, 55, 59, 65, 71, 73, 77, 81, 83, 89, 91, 93, 103, 107, 121, 123, 127, 129, 131, 135, 143, 149, 151, 153, 155, 157, 159, 163, 167, 169, 171, 173, 181, 183, 189, 193, 199, 205, 207, 211, 215, 217, 223, 225, 227, 237, 241, 255, 257, 261, 263, 265 \pmod{268}. \]

3.44. From Theorem 1.3 on page 6, Example 3.10 on page 140, and Theorem 3.6 on page 113, we know that $h_{\mathcal{O}_F} = h_{\mathbb{Z}[(1+\sqrt{-163})/2]} = 1$. Thus, Theorem 3.15 on page 148 informs us that odd primes $p$ with $(\Delta_F/p) = (-163/p) = 1$ satisfy $p = a^2 + ab + 41b^2$ for some relatively prime integers $a, b$. Show that for $b = 1$, $a^2 + a + 41$ is indeed prime for $a = 0, 1, \ldots, 39$.

(This is related to a result of Rabinowitsch [79], which states that for negative $\Delta_F$, with $\Delta_F \equiv 1 \pmod 4$, we have that $h_{\mathcal{O}_F} = 1$ if and only if $x^2 + x + (1 - \Delta_F)/4$ is prime for $x = 0, 1, \ldots, \lfloor |\Delta_F|/4 - 1 \rfloor$. The reader may now go to Exercises 3.40–3.43, and indeed for all values in Example 3.10, and verify this fact for those values as well.)

(See Biography 3.9 on the next page.)

3.45. Related to the Rabinowitsch result in Exercise 3.44 is the following, known as the Rabinowitsch–Mollin–Williams criterion for real quadratic fields – see [63]. If $F$ is a real quadratic field with discriminant $\Delta_F \equiv 1 \pmod 4$, then $|x^2 + x + (1 - \Delta_F)/4|$ is 1 or prime for all $x = 1, 2, \ldots, \lfloor (\sqrt{\Delta_F} - 1)/2 \rfloor$ if and only if $h_{\mathcal{O}_F} = 1$ and either $\Delta_F = 17$ or $\Delta_F = n^2 + r \equiv 5 \pmod 8$ where $r \in \{\pm 4, 1\}$ – see [65, Theorem 6.5.13, p. 352]. Verify this primality for the values

\[ \Delta_F \in \{17, 21, 29, 37, 53, 77, 101, 173, 197, 293, 437, 677\}. \]

(See Biography 3.10 on the following page.)

3.46. It is known that for $\Delta_F = -20$, $h_{\mathcal{O}_F} = 2$ and $\mathcal{P} = (2, 1 + \sqrt{-5})$ is an ideal representing the nonprincipal class. Use the identification given in the proof of Theorem 3.5 on page 110 to prove the following, where $p \neq 5$ is an odd prime.

(a) $p = a^2 + 5b^2$ if and only if $p \equiv 1, 9 \pmod{20}$.
(b) $p = 2a^2 + 2ab + 3b^2$ if and only if $p \equiv 3, 7 \pmod{20}$.
(c) Conclude that for $\Delta_F = -20$, there are two genera each consisting of a single class.
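The primality checks requested in Exercises 3.44 and 3.45 can be carried out mechanically. The following sketch evaluates $x^2 + x + (1 - \Delta_F)/4$ over the ranges stated in those exercises. The list of negative discriminants is our own reading of the text ($-7$ from Corollary 3.12, $-11, -19, -43, -67$ from the moduli 44, 76, 172, 268 in Exercises 3.40–3.43, and $-163$ from Exercise 3.44); the function names are illustrative.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

def rabinowitsch_values(delta):
    """x^2 + x + (1 - delta)/4 for x = 0..floor(|delta|/4 - 1); delta < 0, delta = 1 mod 4."""
    c = (1 - delta) // 4
    top = abs(delta) // 4 - 1
    return [x * x + x + c for x in range(0, top + 1)]

def rmw_values(delta):
    """|x^2 + x + (1 - delta)/4| for x = 1..floor((sqrt(delta) - 1)/2); delta > 0."""
    c = (1 - delta) // 4
    top = (isqrt(delta) - 1) // 2  # equals floor((sqrt(delta) - 1) / 2)
    return [abs(x * x + x + c) for x in range(1, top + 1)]

NEGATIVE = [-7, -11, -19, -43, -67, -163]          # assumed list, see lead-in
REAL = [17, 21, 29, 37, 53, 77, 101, 173, 197, 293, 437, 677]  # from Exercise 3.45

if __name__ == "__main__":
    for d in NEGATIVE:
        print(d, all(is_prime(v) for v in rabinowitsch_values(d)))
    for d in REAL:
        print(d, all(v == 1 or is_prime(v) for v in rmw_values(d)))
```

By contrast, for $\Delta_F = -15$ the very first value is $(1 + 15)/4 = 4$, which is composite, consistent with a class number larger than 1.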


Biography 3.9 The following was taken from a most interesting article about G. Rabinowitsch by Mordell [72]. Mordell writes: “In 1923, I attended a meeting of the American Mathematical Society held at Vassar College in New York State. Someone called Rainich from the University of Michigan at Ann Arbor, gave a talk upon the class number of quadratic fields, a subject in which I was very much interested. I noticed that he made no reference to a rather pretty paper written by Rabinowitz from Odessa and published in Crelle’s journal. I commented upon this. He blushed and stammered and said, ‘I am Rabinowitz.’ He had moved to the U.S.A. and changed his name....” The spelling of Rabinowitsch in this book coincides with that which appears in Crelle [79].

Biography 3.10 Hugh Cowie Williams was born in London, Ontario, Canada on July 23, 1943. He graduated with a doctorate in computer science from the University of Waterloo in 1969. Since that time, his research interests have been in using computational techniques to solve problems in number theory, and in particular, those with applications to cryptography. Currently he holds a Chair under Alberta Informatics Circle of Research Excellence (iCORE) at the University of Calgary (U of C). He oversees the Centre for Information Security and Cryptography (CISaC), a multi-disciplinary research centre at the U of C devoted to research and development towards providing security and privacy in information communication systems. There are also more than two dozen graduate students and postdoctoral fellows being trained at the centre. The iCORE Chair is in algorithmic number theory and cryptography (ICANTC), which is the main funder of CISaC. The initial funding from iCORE was $3 million for the first five years and this has been renewed for another five years. In conjunction with this iCORE Chair, Professor Williams has set up a research team in pure and applied cryptography to investigate the high-end theoretical foundations of communications security. Professor Williams comes from the University of Manitoba, where he was Associate Dean of Science for Research and Development, and Adjunct Professor for the Department of Combinatorics and Optimization at the University of Waterloo. He has an extensive research and leadership background and a strong international reputation for his work in cryptography and number theory. CISaC and ICANTC were acronyms coined by this author, who initiated the application for the Chair, and is currently a member of the academic staff of CISaC, as well as professor at the U of C’s mathematics department. This author and Professor Williams have coauthored more than two dozen papers in number theory, and computational mathematics, over the past quarter century.


3.6 Equivalence Modulo p

The bottom line for mathematicians is that the architecture has to be right. In all the mathematics that I did, the essential point was to find the right architecture. It’s like building a bridge. Once the main lines of the structure are right, then the details miraculously fit. The problem is the overall design.

From the interview in [1]
Freeman Dyson (1923–), American physicist, mathematician, and author

Now we turn to equivalence of forms modulo a prime, a topic that has some rather palatable results.

Definition 3.16 Forms Equivalent Modulo a Prime

Let $p$ be a prime and for $j = 1, 2$, let $\Delta_{F_j}$ be the discriminant of a quadratic field $F_j$. Also, let

\[ f_j = (a_j, b_j, c_j) \]

be primitive forms of discriminant $\Delta_{F_j}$ for $j = 1, 2$. If there is a transformation

\[ x = rX + sY, \quad y = tX + uY, \]

where

\[ f_1(x, y) \equiv f_2(X, Y) \pmod p \]

with $\gcd(ru - st, p) = 1$, we say that $f_1$ and $f_2$ are equivalent modulo $p$, and we denote this by

\[ f_1 \sim f_2 \pmod p. \]

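Remark 3.16 If the forms $f_1$ and $f_2$ are equivalent modulo a prime $p$, as given in Definition 3.16, then if $p \nmid \Delta_{F_j}$ for $j = 1, 2$,

\[ \Delta_{F_2} \equiv (ru - st)^2 (b_1^2 - 4a_1c_1) \equiv (ru - st)^2 \Delta_{F_1} \pmod p. \tag{3.36} \]

Thus, from (3.36), the following Legendre symbol equality holds,

\[ \left( \frac{\Delta_{F_1}}{p} \right) = \left( \frac{\Delta_{F_2}}{p} \right). \]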

Lemma 3.6 Vanishing Middle Term Modulo p

If $f = (a, b, c)$ is a primitive form of discriminant $\Delta_F$ for a quadratic field $F$, and $p$ is an odd prime not dividing $\Delta_F$, then for some $a_1, c_1 \in \mathbb{Z}$,

\[ (a, b, c) \sim (a_1, 0, c_1) \pmod p. \]

Proof. Since $f$ is primitive, then $\gcd(a, b, c) = 1$, so if $p \nmid a$, then by setting

\[ X \equiv x + \frac{b}{2a} y \pmod p, \text{ and } Y \equiv y \pmod p, \]

we get

\[ ax^2 + bxy + cy^2 \equiv a\left( x + \frac{b}{2a} y \right)^2 - \frac{\Delta_F}{4a} y^2 \equiv aX^2 - \frac{\Delta_F}{4a} Y^2 \pmod p. \]

Similarly, we get such a result when we assume that $p \nmid c$. On the other hand, if $p \mid \gcd(a, c)$, then by setting

\[ x = X + Y \text{ and } y = X - Y, \]

we achieve

\[ ax^2 + bxy + cy^2 \equiv bX^2 - bY^2 \pmod p. \]

We have shown that we always have $f$ equivalent modulo $p$ to a form of type $(a_1, 0, c_1)$. □
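The three cases in the proof of Lemma 3.6 translate directly into a small routine. The sketch below returns a pair $(a_1, c_1)$ and then confirms the congruence by evaluating both sides at every pair of residues. It relies on Python 3.8 or later for modular inverses via `pow(n, -1, p)`; the function names are our own and are not taken from the text.

```python
def diagonalize_mod_p(a, b, c, p):
    """Return (a1, c1) with (a, b, c) ~ (a1, 0, c1) (mod p), following the proof's cases."""
    delta = b * b - 4 * a * c
    if p == 2 or delta % p == 0:
        raise ValueError("need an odd prime p not dividing the discriminant")
    if a % p:                      # case p does not divide a
        return a % p, (-delta * pow(4 * a, -1, p)) % p
    if c % p:                      # case p does not divide c (roles of x, y swapped)
        return (-delta * pow(4 * c, -1, p)) % p, c % p
    return b % p, (-b) % p         # case p divides both a and c

def check(a, b, c, p):
    """Verify f(x, y) = a1*X^2 + c1*Y^2 (mod p) for all residues x, y."""
    a1, c1 = diagonalize_mod_p(a, b, c, p)
    for x in range(p):
        for y in range(p):
            f = a * x * x + b * x * y + c * y * y
            if a % p:
                X, Y = x + b * pow(2 * a, -1, p) * y, y
            elif c % p:
                X, Y = x, y + b * pow(2 * c, -1, p) * x
            else:
                # the proof sets x = X + Y, y = X - Y, so X = (x+y)/2, Y = (x-y)/2
                inv2 = pow(2, -1, p)
                X, Y = (x + y) * inv2, (x - y) * inv2
            if (f - (a1 * X * X + c1 * Y * Y)) % p:
                return False
    return True

if __name__ == "__main__":
    print(diagonalize_mod_p(3, 1, 5, 7), check(3, 1, 5, 7))
    print(diagonalize_mod_p(7, 3, 14, 7), check(7, 3, 14, 7))
```

For $(3, 1, 5)$ and $p = 7$ the first case applies and gives $(3, 0, 2)$; for $(7, 3, 14)$ and $p = 7$ both outer coefficients vanish modulo 7 and the third case gives $(3, 0, -3)$.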

Remark 3.17 Lemma 3.6 shows that, when we consider a form $(a, b, c)$ modulo $p$, we may always assume that $p \mid b$ and $p \nmid ac$. Now we have sufficient tools to establish the first main result.

Theorem 3.16 Canonical Equivalence Modulo p

Suppose that $F$ is a quadratic field of discriminant $\Delta_F$ and $p$ is an odd prime not dividing $\Delta_F$. If $(a, b, c)$ is a primitive form of discriminant $\Delta_F$, then each of the following holds.

(a) If $\Delta_F$ is a quadratic residue modulo $p$, then $(a, b, c) \sim (1, 0, -1) \pmod p$.

(b) If $\Delta_F$ is a quadratic nonresidue modulo $p$, then

\[ (a, b, c) \sim (1, 0, -\Delta_F) \not\sim (1, 0, -1) \pmod p. \]

Proof. We begin with a claim.

Claim 3.6 If $p \nmid ac$, then there exist $x, y \in \mathbb{Z}$ such that

\[ ax^2 + cy^2 \equiv 1 \pmod p. \]

For $x = 0, 1, \ldots, p - 1$, $ax^2$ takes on $(p + 1)/2$ distinct values modulo $p$ and as $y$ ranges over $0, 1, \ldots, p - 1$, $1 - cy^2$ takes on $(p + 1)/2$ distinct values modulo $p$. Since there are only $p$ residue classes, by the Pigeonhole Principle – see [68, p. 35] – there must exist $x, y \in \mathbb{Z}$ such that

\[ ax^2 \equiv 1 - cy^2 \pmod p, \]

securing the claim.
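The counting argument behind Claim 3.6 can be watched in action for small primes. The sketch below tabulates the values of $ax^2$ modulo $p$, confirms there are $(p+1)/2$ of them, and then looks up a matching value of $1 - cy^2$. The function names are ours, chosen for illustration.

```python
def square_values(a, p):
    """The set of residues a*x^2 mod p as x ranges over 0..p-1."""
    return {a * x * x % p for x in range(p)}

def solve_claim(a, c, p):
    """Find (x, y) with a*x^2 + c*y^2 = 1 (mod p); p odd prime, p not dividing a*c."""
    first_x = {}
    for x in range(p):
        first_x.setdefault(a * x * x % p, x)   # remember one x for each value of a*x^2
    for y in range(p):
        x = first_x.get((1 - c * y * y) % p)   # the pigeonhole collision
        if x is not None:
            return x, y
    return None

if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, len(square_values(1, p)), solve_claim(2, 3 % p or 1, p))
```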


By Claim 3.6, we may let $r, t$ be integers such that

\[ ar^2 + ct^2 \equiv 1 \pmod p, \]

and select fixed integers $s, u$ with $p \nmid (ru - st)$. Now set

\[ b_1 \equiv 2ars + 2ctu \pmod p \text{ and } c_1 \equiv as^2 + cu^2 \pmod p. \]

Therefore, via the transformation

\[ x = rX + sY, \quad y = tX + uY, \]

we get

\[ (a, 0, c) \sim (1, b_1, c_1) \pmod p. \]

If we set

\[ \Delta_{F_1} = b_1^2 - 4c_1, \]

then since $p \mid b_1$ and $p \nmid c_1$, via Remark 3.17 on the facing page, we get

\[ (1, b_1, c_1) \sim (1, 0, -\Delta_{F_1}/4) \sim (1, 0, -\Delta_{F_1}) \pmod p. \]

Thus, if

\[ \Delta_{F_1} \equiv z^2 \pmod p, \]

then via Remark 3.16 on page 155, we know that

\[ \Delta_F \equiv z^2 w^2 \pmod p, \]

so via the transformation $x = X$ and $y = wzY$,

\[ (1, 0, -\Delta_{F_1}) \sim (1, 0, -1) \pmod p. \]

Since $p \mid b$ and $p \nmid ac$ may be assumed via Remark 3.17, then we have shown that when $\Delta_F$ is a quadratic residue modulo $p$,

\[ (a, b, c) \sim (a, 0, c) \sim (1, 0, -\Delta_{F_1}) \sim (1, 0, -1) \pmod p. \]

This is part (a).

If $\Delta_F$ is not a quadratic residue modulo $p$, then we have shown that

\[ (a, b, c) \sim (1, 0, -\Delta_F) \pmod p. \]

That

\[ (1, 0, -\Delta_F) \not\sim (1, 0, -1) \]

is Exercise 3.47 on the following page. This is (b) and we have secured the result. □
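For very small primes, Theorem 3.16 can be checked by exhausting all transformations in Definition 3.16. The sketch below substitutes $x = rX + sY$, $y = tX + uY$ into a form, reduces the resulting coefficients modulo $p$, and searches over all $(r, s, t, u)$ with $p \nmid (ru - st)$. The search costs $p^4$ steps, so it is meant only as an illustration; the names are ours.

```python
from itertools import product

def transform(form, r, s, t, u, p):
    """Coefficients mod p of f(rX + sY, tX + uY) for f = (a, b, c)."""
    a, b, c = form
    A = a * r * r + b * r * t + c * t * t
    B = 2 * a * r * s + b * (r * u + s * t) + 2 * c * t * u
    C = a * s * s + b * s * u + c * u * u
    return A % p, B % p, C % p

def equivalent_mod_p(f1, f2, p):
    """True if some transformation with p not dividing ru - st carries f1 to f2 mod p."""
    target = tuple(v % p for v in f2)
    for r, s, t, u in product(range(p), repeat=4):
        if (r * u - s * t) % p and transform(f1, r, s, t, u, p) == target:
            return True
    return False

def is_quadratic_residue(n, p):
    """Euler's criterion for an odd prime p not dividing n."""
    return pow(n % p, (p - 1) // 2, p) == 1

if __name__ == "__main__":
    f, delta = (1, 1, 1), -3
    for p in (5, 7):
        print(p, is_quadratic_residue(delta, p),
              equivalent_mod_p(f, (1, 0, -1), p),
              equivalent_mod_p(f, (1, 0, -delta), p))
```

With $f = (1, 1, 1)$ of discriminant $-3$: modulo 7, where $-3$ is a quadratic residue, $f \sim (1, 0, -1)$; modulo 5, where it is not, $f \sim (1, 0, 3)$ but $f \not\sim (1, 0, -1)$.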

Corollary 3.13 If $p$ is an odd prime not dividing $\Delta_F$, then any two forms with discriminant $\Delta_F$ must be equivalent modulo $p$.

The reader may go to Exercises 3.48–3.50 for further results on equivalence modulo 2.

Exercises

In Exercises 3.47–3.49, $\Delta_F$ denotes the discriminant of a quadratic field $F$.

3.47. Prove the fact stated in part (b) of Theorem 3.16 on page 156, that

\[ (1, 0, -\Delta_F) \not\sim (1, 0, -1). \]

3.48. Prove that any form $f = (a, b, c)$ of odd discriminant $\Delta_F$ must satisfy

\[ (a, b, c) \sim (0, 1, 0) \pmod 2 \text{ if } 2 \mid ac, \]

and

\[ (a, b, c) \sim (1, 1, 1) \pmod 2 \text{ if } 2 \nmid ac, \]

in the sense of Definition 3.16 on page 155.

3.49. With reference to Exercise 3.48, show that

\[ (0, 1, 0) \not\sim (1, 1, 1) \pmod 2. \]

3.50. Prove that any two forms with the same odd discriminant must be equivalent modulo 2.

3.51. Let $p$ be an odd prime and let $f_j = (a_j, b_j, c_j)$ be forms of discriminant $\Delta_{F_j}$ where $p \mid \Delta_{F_j}$ for $j = 1, 2$. Prove that

\[ (a_1, b_1, c_1) \sim (a_2, b_2, c_2) \pmod p \]

if and only if the Legendre symbol equality

\[ \left( \frac{n_1}{p} \right) = \left( \frac{n_2}{p} \right) \]

holds where $n_j$ is an integer represented by $f_j$ with $\gcd(n_j, \Delta_{F_j}) = 1$ for $j = 1, 2$.
(Hint: Use Lemma 3.1 on page 105.)

Chapter 4

Diophantine Approximation

We could use up two Eternities in learning all that is to be learned about our own world and the thousands of nations that have arisen and flourished and vanished from it. Mathematics alone would occupy me eight million years.

from Notebook No. 22, Spring 1883–September 1884.
Mark Twain (1835–1910), born Samuel Langhorne Clemens, American writer

In this chapter, we assume the background on continued fractions, rational approximations, quadratic irrationals, and related topics covered, for instance, in [68, Chapter 5].

4.1 Algebraic and Transcendental Numbers

We have already looked at some Diophantine equations in §1.1. In particular, in Definition 1.10 on page 13, and Theorem 1.8 on page 14, we considered the Ramanujan–Nagell equation, the generalization of which we will study later in the text. The relationship between the solution of Diophantine equations and approximation of algebraic numbers by rational numbers is the focus of this section. In particular, we know from [68, Corollary 5.3, p. 215, Exercise 5.10, p. 220], for instance, that for an irrational $\alpha \in \mathbb{R}$ there are infinitely many rational numbers $p/q$ such that

\[ \left| \alpha - \frac{p}{q} \right| < \frac{1}{q^2}. \tag{4.1} \]

A natural query is: Can the exponent 2 be increased to get a general result that improves upon (4.1)? In a drive to answer this question, Roth was awarded the Fields medal in 1958 for his 1955 result: If $\alpha$ is an algebraic number, then for a given $\varepsilon > 0$, there exist at most finitely many rational numbers $p/q$, with $q > 0$, such that

\[ \left| \alpha - \frac{p}{q} \right| < \frac{1}{q^{2+\varepsilon}} \tag{4.2} \]

– see [21]. Roth’s work was preceded by results of Thue in 1909 and Siegel in 1921 – see [68, Biography 1.12, p. 45] and Biography 4.4 on page 170. Both of the latter two improved upon the following result of Liouville – see Biography 4.3 on page 168.

Biography 4.1 Klaus Friedrich Roth (1925–) was born on October 29, 1925 in Breslau, Germany (now Wroclaw, Poland). He achieved his BA in 1945 from Peterhouse, Cambridge. In 1946, he entered University College, London where he was awarded his master’s degree in 1948. In that year he was appointed lecturer there and was awarded his doctorate in 1950, under the direction of Davenport. In 1955, when he was a lecturer at University College in London, he proved what is now known as the Thue–Siegel–Roth Theorem, or just Roth’s Theorem, (4.2), for which he won the Fields medal. Indeed, the medal was awarded by Davenport at the International Congress of Mathematicians in 1958 – see Biography 1.6 on page 54. To date, he is the oldest Fields medalist.

Roth became a professor at University College, London in 1961. Then he moved to a chair at Imperial College, London, a position he held until his retirement in 1988. He came back as a visiting professor there and remained at Imperial College until 1996 when he returned to Scotland. He is also known for his 1952 proof that subsets of the integers of positive density must contain infinitely many arithmetic progressions of length three, which established the first non-trivial case of what we now call Szemerédi’s theorem.

Among Roth’s many honours were also fellowship in the Royal Society of London in 1990, and in the Royal Society of Edinburgh in 1993. Moreover, other medals he won were the De Morgan Medal of the London Mathematical Society in 1983, and the Sylvester Medal of the Royal Society in 1991.

In the aforementioned presentation of the medal by Davenport, he said of Roth’s work: “It will stand as a landmark in mathematics for as long as mathematics is cultivated.”

Remark 4.1 Before stating the result, we will need an elementary fact from introductory calculus, the Mean-Value Theorem, which says that, given a function $f$ continuous on the interval $[a, b]$, $a \neq b$, in $\mathbb{R}$ and differentiable on the open interval $(a, b)$, there exists a $\xi \in (a, b)$ such that

\[ f'(\xi) = \frac{f(b) - f(a)}{b - a}. \]

Theorem 4.1 Liouville’s Theorem

If $\alpha$ is a real algebraic number of degree $n > 1$, then there is a constant $c_\alpha > 0$ such that for any rational number $p/q$, $q > 0$,

\[ \left| \alpha - \frac{p}{q} \right| > \frac{c_\alpha}{q^n}. \]

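Proof. Let

\[ f(x) = \sum_{j=0}^{n} a_j x^j \]

be the minimal polynomial of $\alpha$ over $\mathbb{Q}$, where $a_j \in \mathbb{Z}$ for $0 \leq j \leq n$ by Lemma 1.1 on page 9. We may assume that

\[ |\alpha - p/q| < 1 \tag{4.3} \]

since otherwise, choosing $c_\alpha \leq 1/2$, if

\[ |\alpha - p/q| \geq 1, \]

we must have

\[ |\alpha - p/q| > c_\alpha/q^n \]

because $1 > c_\alpha/q^n$. By the Mean-Value Theorem cited in Remark 4.1, there exists a $\xi$ between $p/q$ and $\alpha$ such that

\[ f'(\xi)\left( \alpha - \frac{p}{q} \right) = f(\alpha) - f\left( \frac{p}{q} \right) = -f\left( \frac{p}{q} \right). \tag{4.4} \]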

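We require the following which is of interest in its own right.

Claim 4.1 If we set

\[ c_\alpha = \frac{1}{n^2 \left( \max_{0 \leq j \leq n} \{|a_j|\} \right) (1 + |\alpha|)^{n-1}}, \tag{4.5} \]

a positive constant, depending only on $\alpha$, then

\[ |f'(\xi)| < \frac{1}{c_\alpha}. \]

By (4.3), $|\xi| < 1 + |\alpha|$, so

\[ |f'(\xi)| = \left| \sum_{j=1}^{n} j a_j \xi^{j-1} \right| < \sum_{j=1}^{n} n \max_{0 \leq j \leq n} \{|a_j|\} (1 + |\alpha|)^{j-1} \leq n^2 \left( \max_{0 \leq j \leq n} \{|a_j|\} \right) (1 + |\alpha|)^{n-1} = \frac{1}{c_\alpha}, \]

since $n > 1$. This secures Claim 4.1.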


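Since $f$ is irreducible over $\mathbb{Q}$ of degree $n > 1$, it has no rational root, so $q^n f(p/q)$ is a nonzero integer and we have that

\[ \left| f\left( \frac{p}{q} \right) \right| = \frac{\left| a_n p^n + \sum_{j=0}^{n-1} a_j p^j q^{n-j} \right|}{q^n} \geq \frac{1}{q^n}. \tag{4.6} \]

Then by (4.4), Claim 4.1, and (4.6),

\[ \left| \alpha - \frac{p}{q} \right| = \frac{|f(p/q)|}{|f'(\xi)|} > \frac{c_\alpha}{q^n}, \]

as required. □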
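To get a feel for the size of the constant in (4.5), the following sketch evaluates it for $\alpha = \sqrt{2}$, whose minimal polynomial is $x^2 - 2$, and compares it with the smallest value of $q^2 |\alpha - p/q|$ over denominators $q \leq 2000$. Floating-point arithmetic is used, which is adequate at this scale but is an assumption of the sketch; the function names are ours.

```python
from math import sqrt

def liouville_constant(coeffs, alpha):
    """c_alpha from (4.5): 1 / (n^2 * max|a_j| * (1 + |alpha|)^(n-1)).

    coeffs lists a_0, ..., a_n of the minimal polynomial.
    """
    n = len(coeffs) - 1
    height = max(abs(a) for a in coeffs)
    return 1.0 / (n * n * height * (1 + abs(alpha)) ** (n - 1))

def worst_scaled_error(alpha, n, q_max):
    """min over 1 <= q <= q_max of q^n * |alpha - p/q|, p the nearest integer to q*alpha."""
    return min(q ** n * abs(alpha - round(q * alpha) / q)
               for q in range(1, q_max + 1))

if __name__ == "__main__":
    alpha = sqrt(2)
    c = liouville_constant([-2, 0, 1], alpha)   # x^2 - 2
    worst = worst_scaled_error(alpha, 2, 2000)
    print(c, worst, worst > c)
```

Only the nearest numerator $p$ needs to be tried for each $q$, since any other choice makes $|\alpha - p/q|$ larger. The constant from (4.5) is far from sharp here, which is typical: the theorem asserts existence of some positive constant, not the best one.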

Remark 4.2 In the definition of $c_\alpha$ given in (4.5), there is

\[ H(\alpha) = \max_{0 \leq j \leq n} \{|a_j|\}, \]

which is called the height of $\alpha$, also known as the height of the minimal polynomial $f(x)$. Liouville’s Theorem actually states that algebraic numbers are not too well approximated by rational numbers. Moreover, the statement of the theorem seems to suggest that the degree of approximation depends on the given algebraic number $\alpha$. However, Roth’s Theorem shows that this is not the case – see (4.2) on page 160. Indeed, transcendental numbers can be better approximated by rational numbers. (Recall that a transcendental number is a complex number that is not algebraic.) To see this, note that if $\alpha$ is an algebraic number with continued fraction expansion

\[ \alpha = \langle q_0; q_1, q_2, \ldots \rangle, \]

having convergents $C_j = A_j/B_j$ for $j = 0, 1, 2, \ldots$, then by Exercise 4.1 on page 167,

\[ \left| \alpha - \frac{A_j}{B_j} \right| \leq \frac{1}{q_{j+1} B_j^2}, \]

and by Liouville’s Theorem,

\[ \left| \alpha - \frac{A_j}{B_j} \right| > \frac{c_\alpha}{B_j^n}, \]

so by combining the two, we get

\[ c_\alpha q_{j+1} < B_j^{n-2}. \tag{4.7} \]

In particular, when $n = 2$, the sequence of partial quotients is bounded, since

\[ q_{j+1} < \frac{1}{c_\alpha}. \]

In reference to the Liouville numbers cited in Biography 4.3 on page 168, consider the continued fraction expansion of $\beta \in \mathbb{R}$ given by

\[ \beta = \langle 1; 10^{1!}, 10^{2!}, 10^{3!}, \ldots \rangle, \]

from which it follows that $q_j = 10^{j!}$ and

\[ B_j = 10^{j!(1 + o(1))}. \]

(Recall that the “little oh” symbol is defined for functions $f$ and $g$, denoted by $f = o(g)$, to mean that $\lim_{x \to \infty} f(x)/g(x) = 0$.) Hence,

\[ \frac{q_{j+1}}{B_j^k} \to \infty \text{ as } j \to \infty \text{ for any } k, \]

so by (4.7), $\beta$ must be transcendental. In fact, this motivates a major result later in this section, namely that almost all real numbers are transcendental. Here “almost all” means all but an “enumerable” set, a concept we now define.
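The growth claim in Remark 4.2 for the continued fraction with partial quotients $1, 10^{1!}, 10^{2!}, 10^{3!}, \ldots$ can be observed with exact integer arithmetic. The sketch below builds the convergent denominators from the standard recurrence $B_j = q_j B_{j-1} + B_{j-2}$ with $B_{-1} = 0$, $B_0 = 1$, and reports the first index $j$ at which $q_{j+1}$ exceeds $B_j^k$. The function names and the default number of quotients are illustrative choices.

```python
from math import factorial

def partial_quotients(count):
    """q_0 = 1 and q_j = 10^(j!) for 1 <= j < count."""
    return [1] + [10 ** factorial(j) for j in range(1, count)]

def denominators(quotients):
    """Convergent denominators B_0, B_1, ... via B_j = q_j*B_{j-1} + B_{j-2}."""
    prev, cur = 0, 1          # B_{-1}, B_0
    out = [cur]
    for q in quotients[1:]:
        prev, cur = cur, q * cur + prev
        out.append(cur)
    return out

def first_index_beating(k, count=8):
    """Smallest j >= 1 with q_{j+1} > B_j^k among the first `count` quotients, else None."""
    qs = partial_quotients(count)
    bs = denominators(qs)
    for j in range(1, count - 1):
        if qs[j + 1] > bs[j] ** k:
            return j
    return None

if __name__ == "__main__":
    print(denominators(partial_quotients(4)))
    for k in (2, 3, 5):
        print(k, first_index_beating(k))
```

For every fixed $k$ the inequality $q_{j+1} > B_j^k$ eventually holds, which is exactly what is incompatible with (4.7) for an algebraic number of degree $n = k + 2$.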

Definition 4.1 Cardinal Numbers and Enumerable Sets

If there exists a bijection between two sets $A$ and $B$, namely there exists a one-to-one correspondence between them, then the sets are said to have the same cardinal number. Equivalently they are said to be equipotent to one another. Any set that is equipotent to $\mathbb{N}$, the natural numbers, is called enumerable. Any set that is either finite or enumerable is called a countable set. If a set is not countable, it is called uncountable.

If $\alpha$ is an algebraic number, then there exists a polynomial of degree $d \in \mathbb{N}$,

\[ f(x) = a_0 + a_1 x + \cdots + a_d x^d, \tag{4.8} \]

with $a_j \in \mathbb{Z}$ for $j = 0, 1, 2, \ldots, d$ not all zero such that $f(\alpha) = 0$. We define the rank of (4.8) to be

\[ \ell = d + |a_0| + |a_1| + \cdots + |a_d|, \tag{4.9} \]

where we see that $\ell \geq 2$.

Now we show that the set of algebraic numbers $\overline{\mathbb{Q}}$ is countable – see Definition 1.4 on page 2.

Theorem 4.2 The Set of Algebraic Numbers is Enumerable

$\overline{\mathbb{Q}}$ is enumerable.

Proof. For a given value of $\ell \in \mathbb{N}$ with $\ell \geq 2$, there are only finitely many equations (4.8) for which (4.9) holds. Thus for a given $\ell \in \{2, 3, \ldots\}$ let those finitely many equations be given by

\[ E_{\ell,1}, E_{\ell,2}, \ldots, E_{\ell,n_\ell}. \]

For each $\ell = 2, 3, \ldots$, we may arrange the equations in a sequence

\[ E_{2,1}, E_{2,2}, \ldots, E_{2,n_2}, E_{3,1}, E_{3,2}, \ldots, E_{3,n_3}, E_{4,1}, \ldots \]

and let the set of all of these equations be denoted by $S$. We may now put $S$ in a one-to-one correspondence with $\mathbb{N}$ via the mapping, for $\ell = 2, 3, \ldots$, with $j = 1, 2, \ldots, n_\ell$ where $n_1 = 1$, given by

\[ \psi : E_{\ell,j} \mapsto \sum_{i=1}^{\ell-1} n_i + j - 1. \]

Clearly $\psi(S) \subseteq \mathbb{N}$, and now we show that $\psi$ is surjective. Let $k \in \mathbb{N}$ be arbitrary, and let $m_k \geq 2$ be the largest value such that $k \geq \sum_{i=1}^{m_k - 1} n_i$. Thus, there exists an integer $s_k \geq 0$ such that

\[ k = \sum_{i=1}^{m_k - 1} n_i + s_k. \]

If $s_k \geq n_{m_k}$, then $k \geq \sum_{i=1}^{m_k} n_i$, contradicting the definition of $m_k$, so $0 \leq s_k \leq n_{m_k} - 1$. Hence,

\[ \psi(E_{m_k, s_k + 1}) = \sum_{i=1}^{m_k - 1} n_i + s_k = k, \]

and this shows that $\psi(S) = \mathbb{N}$. Now we show that $\psi$ is injective. If there exist $k, \ell \in \{2, 3, \ldots\}$ and $j, m \in \mathbb{N}$ such that $1 \leq j \leq n_k$, $1 \leq m \leq n_\ell$, and

\[ \psi(E_{k,j}) = \sum_{i=1}^{k-1} n_i + j - 1 = \sum_{i=1}^{\ell-1} n_i + m - 1 = \psi(E_{\ell,m}), \]

then we need to show that $k = \ell$ from which we get that $j = m$ and $\psi$ is then shown to be injective. If $k \neq \ell$, then we may assume without loss of generality that $\ell > k$, so

\[ 0 = \sum_{i=k}^{\ell-1} n_i + m - j \geq \sum_{i=k}^{\ell-1} n_i + 1 - n_k \geq 1, \]

a contradiction. This secures the entire result. □
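The enumeration by rank used in the proof of Theorem 4.2 can be made concrete. The sketch below lists, for each rank, the coefficient tuples $(a_0, \ldots, a_d)$ with $d + |a_0| + \cdots + |a_d|$ equal to that rank, and checks that the running index of an equation agrees with the formula $\sum_{i=1}^{\ell-1} n_i + j - 1$ from the proof. Two choices here are assumptions of the sketch rather than statements from the text: we require $a_d \neq 0$ so that the degree really is $d$, and we order the equations of a given rank by degree and then lexicographically. The name `psi` is ours.

```python
from itertools import product

def equations_of_rank(rank):
    """Coefficient tuples (a_0, ..., a_d), d >= 1, a_d != 0, with d + sum|a_j| == rank."""
    found = []
    for d in range(1, rank):
        weight = rank - d                      # required value of |a_0| + ... + |a_d|
        for coeffs in product(range(-weight, weight + 1), repeat=d + 1):
            if coeffs[-1] != 0 and sum(abs(a) for a in coeffs) == weight:
                found.append(coeffs)
    return found

def enumerate_equations(max_rank):
    """Yield (index, rank, j, coeffs) in the order E_{2,1}, E_{2,2}, ..., E_{3,1}, ..."""
    index = 0
    for rank in range(2, max_rank + 1):
        for j, coeffs in enumerate(equations_of_rank(rank), start=1):
            index += 1
            yield index, rank, j, coeffs

def psi(rank, j, counts):
    """The map from the proof: sum_{i=1}^{rank-1} n_i + j - 1, with n_1 = 1."""
    return sum(counts[i] for i in range(1, rank)) + j - 1

if __name__ == "__main__":
    counts = {1: 1}
    counts.update({r: len(equations_of_rank(r)) for r in range(2, 6)})
    print(counts)
    print(all(idx == psi(r, j, counts) for idx, r, j, _ in enumerate_equations(5)))
```

Rank 2 yields exactly the two equations $x = 0$ and $-x = 0$, and rank 3 yields eight; each equation has finitely many roots, so listing roots equation by equation visits every algebraic number.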

Corollary 4.1 The set of all rational numbers is countable.

Proof. Since $\mathbb{Q} \subseteq \overline{\mathbb{Q}}$, the result follows from Theorem 4.2 and Exercise 4.2 on page 167. □

The following was proved by Cantor.

Theorem 4.3 The set of real numbers is uncountable.

Proof. If $\mathbb{R}$ is countable, then by Exercise 4.2, the interval $(0, 1) \subseteq \mathbb{R}$ is countable. Thus, we may let $\alpha_j \in (0, 1)$ for $j = 1, 2, \ldots$ be an enumeration of these unit interval numbers. Each $\alpha_j$ has a decimal expansion which we will denote by

\[ \alpha_j = 0.d_{j,1} d_{j,2} \cdots d_{j,n} \cdots, \text{ with } 0 \leq d_{j,n} \leq 9. \]

Now define

\[ \alpha = 0.c_1 c_2 \cdots c_n \cdots, \]

where

\[ c_j = \begin{cases} d_{j,j} + 1 & \text{if } 0 \leq d_{j,j} \leq 5, \\ d_{j,j} - 1 & \text{if } 6 \leq d_{j,j} \leq 9. \end{cases} \]

Then the $j$-th decimal place of $\alpha$ differs from that of $\alpha_j$ for every $j$, and $\alpha \in (0, 1)$. Also, since $c_j \neq 0, 9$ for any $j$, then $\alpha$ can have only one decimal representation. Thus, since $\alpha$ is not on the list of $\alpha_j$, this contradicts the enumerability of $(0, 1)$. □
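The diagonal rule in the proof of Theorem 4.3 is simple enough to run on a finite list. The sketch below applies it to the first few digits of a handful of expansions, given as digit strings; it illustrates the construction only, since the actual argument concerns an infinite list. The function name and the sample digits are ours.

```python
def diagonal_digits(expansions):
    """Build c_1 c_2 ... c_n from digit strings d_{j,1} d_{j,2} ... using the proof's rule."""
    out = []
    for j, digits in enumerate(expansions):
        d = int(digits[j])                         # the diagonal digit d_{j,j}
        out.append(d + 1 if d <= 5 else d - 1)     # never produces 0 or 9
    return "".join(str(c) for c in out)

if __name__ == "__main__":
    listed = ["1234", "5678", "9012", "3456"]      # hypothetical digits after "0."
    print("0." + diagonal_digits(listed))
```

With the sample list the diagonal digits are 1, 6, 1, 6, so the rule produces 0.2525, which differs from the $j$-th listed expansion in its $j$-th place.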

Hence, we have the following result promised in Remark 4.2 on page 162.

Corollary 4.2 Almost all real numbers are transcendental.

Proof. By Theorem 4.3, $\mathbb{R}$ is uncountable and by Theorem 4.2, $\overline{\mathbb{Q}}$ is countable. Hence, almost all real numbers are transcendental. □

Biography 4.2 Georg Cantor (1845–1918) was born in St. Petersburg, Russia. He attended university at Zurich, then later at the University of Berlin, where he studied under Kummer, Weierstrass, and Kronecker (see Biography 4.6 on page 179). In 1867, he obtained his doctorate for his work in number theory. In 1869 he took a position at the University of Halle, which he kept until he retired in 1913. Unfortunately, he suffered from mental illness in the later years of his life and died of a heart attack in a psychiatric clinic in 1918.

Cantor is known as the founder of set theory, as well as for his contributions to mathematical analysis. Cantor even wrote on the connections between set theory and metaphysics, displaying his interest in philosophy as well. There were some, such as Kronecker, who did not agree with Cantor's views on set theory. Indeed, Kronecker blocked an application by Cantor for a better-paying position at Berlin.

In Exercise 4.3 on page 167, we have an irreducibility criterion for polynomials that allows us to establish a result on the degree of roots of natural numbers.

Theorem 4.4 Rational Roots of Natural Numbers

If $n \in \mathbb{N}$ and $m > 1$ is an integer such that $m \neq r^d$ for any $r, d \in \mathbb{N}$ such that $d \mid n$ and $d > 1$, then $m^{1/n}$ is an algebraic integer of degree $n$.

Proof. Let $f(x) = x^n - m$ for a given integer $m > 1$ which is not a $d$th power of a natural number for any divisor $d > 1$ of $n$. If $\alpha = m^{1/n}$, then $f(\alpha) = 0$. By Definition 1.4 on page 2, it suffices to show that $f$ is irreducible over $\mathbb{Q}$. Suppose that $f(x) = g_1(x)g_2(x)$, where we may assume via Gauss' Lemma, elucidated in Exercise 4.3 on page 167, that $g_j(x) \in \mathbb{Z}[x]$ for $j = 1, 2$. If $\zeta_n$ denotes a primitive $n$th root of unity (see Definition 1.2 on page 2), then we may write

\[ f(x) = \prod_{j=0}^{n-1} (x - \zeta_n^j\alpha). \]

Let $S_j$ for $j = 1, 2$ be disjoint sets such that $S_1 \cup S_2 = \{0, 1, 2, \ldots, n-1\}$, defined via the following,

\[ g_1(x) = \prod_{j \in S_1} (x - \zeta_n^j\alpha) \quad\text{and}\quad g_2(x) = \prod_{j \in S_2} (x - \zeta_n^j\alpha). \]

If $|S_j| = s_j$, for $j = 1, 2$, then

\[ g_1(0) = (-1)^{s_1}\alpha^{s_1}\zeta_n^{\sum_{j \in S_1} j} \quad\text{and}\quad g_2(0) = (-1)^{s_2}\alpha^{s_2}\zeta_n^{\sum_{j \in S_2} j}. \tag{4.10} \]

Now since $g_1(0)g_2(0) = f(0) = -m$, and $(-1)^{s_1+s_2} = (-1)^n$ while

\[ \zeta_n^{\sum_{j \in S_1} j + \sum_{j \in S_2} j} = \zeta_n^{\sum_{j=0}^{n-1} j} = \zeta_n^{n(n-1)/2} \]

(see [68, Theorem 1.1, p. 2] for the last equality), then $(-1)^n\zeta_n^{n(n-1)/2} = -1$, observing that $\zeta_n^{n(n-1)/2} = -1$ when $n$ is even. Moreover, since $g_j(0) \in \mathbb{Z}$ and $|\zeta_n| = 1$, taking absolute values in (4.10) gives $\alpha^{s_j} = |g_j(0)| \in \mathbb{N}$ for $j = 1, 2$.

Let $t \in \mathbb{N}$ be the least value such that $\alpha^t \in \mathbb{Q}$.

Claim 4.2 If $j \in \mathbb{N}$ with $\alpha^j \in \mathbb{Q}$, then $t \mid j$.

There exist $q, r \in \mathbb{Z}$ such that $j = tq + r$ where $0 \leq r < t$. However, $\alpha^r = \alpha^j\alpha^{-tq} \in \mathbb{Q}$, so by the minimality of $t$, if $r > 0$, then $r \geq t$, a contradiction. Hence, $r = 0$ and $t \mid j$, and we have the claim.

By Claim 4.2, $t \mid s_j$ for $j = 1, 2$ and $t \mid n$. Since $(\alpha^t)^{n/t} = m$, and $\alpha^t = q_1/q_2$ for some $q_i \in \mathbb{N}$, $i = 1, 2$, with $\gcd(q_1, q_2) = 1$, then $mq_2^{n/t} = q_1^{n/t}$, so each prime factor of $q_2$ divides $q_1$. But since $\gcd(q_1, q_2) = 1$, this means that $q_2 = 1$, so $\alpha^t \in \mathbb{N}$. Now, if we set $d = n/t$, then $m = (\alpha^t)^d$, contradicting that $m$ is not the $d$th power of any natural number if $n \neq t$. Hence, $n = t$, and since $t \mid s_j$ with $s_1 + s_2 = n$, either $S_1 = \emptyset$ or $S_2 = \emptyset$. In other words, $f$ is irreducible over $\mathbb{Q}$. □
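The hypothesis of Theorem 4.4 is easy to test mechanically. The sketch below is illustrative only (the function names are assumptions, not from the text): it checks whether $m$ is a perfect $d$th power for some divisor $d > 1$ of $n$, using exact integer arithmetic.

```python
def integer_root(m, d):
    """Return r with r**d == m if such a natural number exists, else None."""
    lo, hi = 1, m
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** d
        if power == m:
            return mid
        if power < m:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


def satisfies_hypothesis(m, n):
    """True when m is not a d-th power for any divisor d > 1 of n."""
    divisors = [d for d in range(2, n + 1) if n % d == 0]
    return all(integer_root(m, d) is None for d in divisors)
```

For instance, `satisfies_hypothesis(8, 4)` holds because 8 is neither a square nor a fourth power, so the theorem gives that $8^{1/4}$ has degree 4, whereas `satisfies_hypothesis(4, 6)` fails since $4 = 2^2$ and $2 \mid 6$.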

Theorem 4.4 speaks about rational powers of algebraic numbers. A natural question to pose is: what happens when we raise algebraic numbers to irrational powers? In 1934, Gel'fond and Schneider proved, independently, that if $\alpha \neq 0, 1$ is an algebraic integer and $\beta$ is an irrational algebraic integer, then $\alpha^\beta$ is transcendental, a result known as the Gel'fond–Schneider Theorem. This result was generalized substantively by Baker [3] in 1966, when he showed that if $\{\alpha_j\}_{1 \leq j \leq n}$ and $\{\gamma_j\}_{1 \leq j \leq n}$ are algebraic integers where $\{1, \gamma_1, \ldots, \gamma_n\}$ and $\{2\pi i, \log_e(\alpha_1), \ldots, \log_e(\alpha_n)\}$ are linearly independent over $\mathbb{Q}$, then

\[ \prod_{j=1}^{n} \alpha_j^{\gamma_j} \text{ is transcendental} \]

and $\{\log_e(\alpha_j)\}_{1 \leq j \leq n}$ are linearly independent over $\overline{\mathbb{Q}}$, where $\overline{\mathbb{Q}}$ is the field of all algebraic numbers. (Recall that a set $\{\alpha_j\}_{j=1}^{n}$ is linearly independent over $\overline{\mathbb{Q}}$ if $\sum_{j=1}^{n} q_j\alpha_j = 0$ for $q_j \in \overline{\mathbb{Q}}$ implies $q_j = 0$ for $j = 1, 2, \ldots, n$.)


Baker's result yields methods that are applicable to Diophantine equations. For instance, one such quantitative result is the following.

Suppose that, with $n \geq 3$, we have the irreducible form

\[ f(x, y) = \sum_{j=0}^{n} a_jx^jy^{n-j} \in \mathbb{Z}[x, y]. \tag{4.11} \]

Then if $m \in \mathbb{N}$, a solution $(X, Y) \in \mathbb{Z}^2$ to the Diophantine equation $f(x, y) = m$ satisfies

\[ \log_e \max\{|X|, |Y|\} \leq C \]

for some constant $C$ depending on $m$, $n$ and

\[ H = \max_{0 \leq j \leq n} \{|a_j|\}, \]

the height of $f$ (see Remark 4.2 on page 162). Indeed, it can be shown that we may select

\[ C = (nH)^{(10n)^5} + (\log_e m)^{2n+2}. \]
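To get a feel for the size of this constant, the following sketch computes $\log_{10} C$ through logarithms, since $C$ itself is far too large for floating point. The function name is an illustrative assumption, and the formula is simply the one displayed above.

```python
import math


def log10_baker_constant(n, H, m):
    """Return log10 of C = (n*H)**((10*n)**5) + (ln m)**(2*n + 2).

    Assumes n >= 3, H >= 1, m >= 1, so that log10(n*H) > 0.
    """
    first = (10 * n) ** 5 * math.log10(n * H)  # log10 of the first term
    ln_m = math.log(m)
    if ln_m <= 0:  # m = 1: the second term vanishes
        return first
    second = (2 * n + 2) * math.log10(ln_m)  # log10 of the second term
    hi, lo = max(first, second), min(first, second)
    # log10(10**hi + 10**lo), computed without forming the huge numbers
    return hi + math.log10(1 + 10 ** (lo - hi))
```

Even in the smallest case $n = 3$, $H = 1$, $m = 1$, the value of $\log_{10} C$ exceeds eleven million, so the bound is effective in principle but far beyond direct search.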

In §4.2, we examine the role of transcendental numbers, including the contributions of Liouville and others discussed above.

Exercises

4.1. Given a simple continued fraction expansion

\[ \alpha = \langle q_0; q_1, \ldots \rangle, \]

of an algebraic number $\alpha$, with convergents $A_j/B_j$ for $j = 0, 1, \ldots$, prove that

\[ \left|\alpha - \frac{A_j}{B_j}\right| \leq \frac{1}{q_{j+1}B_j^2}. \]

(Hint: you may use the fact that

\[ \alpha - \frac{A_j}{B_j} = \frac{(-1)^j}{B_j(\alpha_{j+1}B_j + B_{j-1})}, \]

where $\alpha_{j+1} = q_{j+1} + 1/\alpha_{j+2}$, which is a fact that follows from [68, Theorem 1.12, p. 25].)
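Before attempting a proof, the inequality of Exercise 4.1 can be checked exactly on an example. The sketch below does so for $\alpha = \sqrt{2} = \langle 1; 2, 2, \ldots \rangle$ using rational arithmetic; the function names are illustrative assumptions, and the standard recurrences $A_j = q_jA_{j-1} + A_{j-2}$, $B_j = q_jB_{j-1} + B_{j-2}$ are used for the convergents.

```python
from fractions import Fraction


def convergents(partial_quotients):
    """Yield (A_j, B_j) for the simple continued fraction <q_0; q_1, ...>."""
    a_prev, a_cur = 1, partial_quotients[0]
    b_prev, b_cur = 0, 1
    yield a_cur, b_cur
    for q in partial_quotients[1:]:
        a_prev, a_cur = a_cur, q * a_cur + a_prev
        b_prev, b_cur = b_cur, q * b_cur + b_prev
        yield a_cur, b_cur


def bound_holds_for_sqrt2(j_max):
    """Check |sqrt(2) - A_j/B_j| <= 1/(q_{j+1} B_j^2) exactly for j = 0..j_max."""
    qs = [1] + [2] * (j_max + 1)
    results = []
    for j, (A, B) in enumerate(convergents(qs)):
        if j > j_max:
            break
        bound = Fraction(1, qs[j + 1] * B * B)
        lo = Fraction(A, B) - bound
        hi = Fraction(A, B) + bound
        # lo <= sqrt(2) <= hi is tested by squaring, which is exact on rationals
        results.append((lo < 0 or lo * lo <= 2) and hi * hi >= 2)
    return results
```

A check of this kind is no substitute for the proof, but it confirms the indexing ($q_{j+1}$ paired with $B_j$) before one starts manipulating the hint.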

4.2. Prove that every subset of a countable set is countable.

4.3. Let $f_1(x), f_2(x) \in \mathbb{Z}[x]$, set $f_3(x) = f_1(x)f_2(x)$, and define $\gcd(f_j)$ to be the gcd of the coefficients of $f_j(x)$ for $j = 1, 2, 3$. Prove that

\[ \gcd(f_3) = \gcd(f_1)\gcd(f_2). \]

Conclude that if $f(x) \in \mathbb{Z}[x]$ and $f(x) = h(x)g(x)$, for $h(x), g(x) \in \mathbb{Q}[x]$, then $f(x) = G(x)H(x)$ for some $G(x), H(x) \in \mathbb{Z}[x]$.

(This is often called Gauss' Lemma on integral polynomial factorization. Essentially, it says that any polynomial that is irreducible in $\mathbb{Z}[x]$ is also irreducible in $\mathbb{Q}[x]$, or speaking in the contrapositive, if $f(x)$ is reducible in $\mathbb{Q}[x]$, then it is already reducible in $\mathbb{Z}[x]$.)
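The multiplicativity of the coefficient gcd in Exercise 4.3 can be observed numerically. This is a small sketch with illustrative names; polynomials are stored as coefficient lists, constant term first.

```python
from functools import reduce
from math import gcd


def content(coeffs):
    """gcd of the coefficients of an integer polynomial."""
    return reduce(gcd, (abs(c) for c in coeffs))


def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


f1 = [6, 4, 2]   # 6 + 4x + 2x^2, coefficient gcd 2
f2 = [9, 3]      # 9 + 3x, coefficient gcd 3
f3 = poly_mul(f1, f2)
```

Here the product is $54 + 54x + 30x^2 + 6x^3$, whose coefficient gcd is $6 = 2 \cdot 3$, in line with the identity to be proved.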

Biography 4.3 Joseph Liouville (1809–1882) was born in Saint-Omer, France on March 24, 1809. He entered the École Polytechnique in 1825 and graduated in 1827, with Poisson being one of his examiners (see [68, Biography 1.22, p. 68]). After graduation, he suffered some health problems, but in 1831 found his first academic post with an appointment as assistant to Claude Mathieu, who held a chair at the École Polytechnique after succeeding Ampère. This and other positions he held were largely teaching positions with up to 40 hours a week of instruction. Yet in 1836, he founded the Journal de Mathématiques Pures et Appliquées, now commonly called Journal de Liouville, which was influential in France in the nineteenth century. In 1837, he was appointed Professor of Analysis and Mechanics at the École Polytechnique, and in 1838 he was elected to the astronomy section of the Académie des Sciences. Then Poisson died and Liouville was appointed to the Bureau des Longitudes to fill the vacancy in 1840. During much of the next decade, he was involved in politics. In 1851, he won the bid for the chair vacated by Libri at the Collège de France, beating Cauchy, and began lecturing there that year. Also in 1851, he published results on transcendental numbers that eliminated their dependence on continued fractions. In particular, he presented the first proof of the existence of a transcendental number, now called the Liouvillian number, $0.110001\ldots$, where there are zeros except in the $n!$ place, for each $n \in \mathbb{N}$, where there is a 1 (see Remark 4.2 on page 162).

Liouville's mathematical interests ranged widely from mathematical physics to astronomy and pure mathematics. For instance, his work on differential equations resulted in the Sturm–Liouville theory, used in solving integral equations, which have applications to mathematical physics. As well, he made inroads in differential geometry when he studied conformal transformations. He also proved a major result involving the measure-preserving property of Hamiltonian dynamics, which is fundamental in statistical mechanics. He published more than four hundred papers, of which half were in number theory. He died in Paris, France on September 8, 1882.

4.4. Use Roth's result (4.2) to prove the following. Let $n \geq 3$ and assume that

\[ f(x, y) = \sum_{j=0}^{n} a_{n-j}x^jy^{n-j} \in \mathbb{Z}[x, y] \]

is an irreducible (homogeneous) polynomial. Suppose furthermore that

\[ g(x, y) = \sum_{0 \leq k+\ell \leq n-3} b_{k\ell}x^ky^\ell \in \mathbb{Q}[x, y]. \]

Prove that $f(x, y) = g(x, y)$ has only finitely many solutions $(x, y) \in \mathbb{Z}^2$.

(Hint: Suppose that $\alpha_j$ for $j = 1, 2, \ldots, n$ are all the solutions of $f(x, 1) = 0$. Show that there is a constant $K$ such that

\[ \left|a_0\prod_{j=1}^{n} (x - \alpha_jy)\right| \leq Ky^{n-3}. \]

Proceed to conclude that there must exist some natural number $m \leq n$ such that

\[ \left|\alpha_m - \frac{x}{y}\right| < \frac{C}{y^3}, \]

for some constant $C$.)

4.5. Show that

\[ \sum_{j=1}^{\infty} a^{-j^2} \]

is irrational, where $a > 1$ is an integer.

4.6. Prove the following result due to Thue. Let $n \geq 3$ and let

\[ f(x, y) = \sum_{j=0}^{n} a_jx^{n-j}y^j \in \mathbb{Q}[x, y] \]

be an irreducible homogeneous polynomial. If $m \in \mathbb{Q}$, then $f(x, y) = m$ has only finitely many solutions $(x, y) \in \mathbb{Z}^2$.


Biography 4.4 Carl Ludwig Siegel (1896–1981) was born in Berlin, Germany. He entered the University of Berlin in 1915, attending lectures by Frobenius and Planck. In 1917, his studies ended when he was called to military duties. After being discharged, he returned to his studies in Göttingen in 1919 under the supervision of Landau (see Biography 3.1 on page 104), achieving his doctorate in 1920. Siegel improved upon Thue's result, which in turn extended Liouville's Theorem 4.1 on page 160. Thue proved that, given an algebraic number $\alpha$ of degree $n \geq 2$, there exists a positive constant $c_\alpha$ such that for all rational numbers $p/q$ and any $\varepsilon > 0$,

\[ \left|\alpha - \frac{p}{q}\right| > \frac{c_\alpha}{q^{n/2+1+\varepsilon}}. \]

Siegel improved this by showing that the above exponent on $q$ could be replaced by $2\sqrt{n} + \varepsilon$. In 1947, Dyson improved this to show that the exponent could be replaced by $\sqrt{2n} + \varepsilon$ (see page 155).

In 1922, Siegel was appointed professor at Johann-Wolfgang-Goethe University of Frankfurt to succeed Schönflies. For more than a decade Siegel collaborated with his colleagues Hellinger, Epstein, and Dehn at Frankfurt. This included a history of mathematics seminar they held for thirteen years.

On January 30, 1933, Hitler came to power, enacting the Civil Service Law on April 7, 1933. This was used as a mechanism for firing Jewish teachers from positions at universities. Although Siegel was not Jewish, he disagreed with the Nazi policies so vehemently that he left for the U.S.A. in 1935, where he spent a year at the Institute for Advanced Study at Princeton. However, in 1937, he accepted a professorship at the University of Göttingen. But when Germany went to war in 1939, he felt he could not stay in his homeland. In 1940, he spent a brief time in Norway, then went back to the Institute at Princeton, where he remained from 1940 to 1951. In that year, he returned to Göttingen, where he remained until his death on April 4, 1981.

Siegel contributed to many areas of mathematics, including, as noted above, approximation of algebraic numbers by rational numbers, but also transcendence theory, zeta functions, the geometry of numbers, quadratic forms, and celestial mechanics. Siegel never married and had very few doctoral students. He devoted his life to research. Perhaps his most prestigious honour was the Wolf Prize bestowed on him in 1978.


4.2 Transcendence

The meaning doesn't matter if it's only idle chatter of a transcendental kind.
From act I of Patience (1881)
William Schwenck Gilbert (1836–1911), English writer of comic and satirical verse

In Corollary 4.2 on page 165, we proved that almost all real numbers are transcendental. We now look more closely at such numbers. In Remark 4.2 on page 162 and Biography 4.3 on page 168, we defined a Liouvillian number, which is transcendental. We now generalize this notion.

Definition 4.2 Liouville Number

A real number $\alpha$ is called a Liouville number if for all $m \in \mathbb{N}$ there exist $a_m, b_m \in \mathbb{Z}$ such that $b_m > 1$ and

\[ 0 < \left|\alpha - \frac{a_m}{b_m}\right| < \frac{1}{b_m^m}. \tag{4.12} \]

The Liouvillian number cited above from §4.1 is a special case of Definition 4.2. Given that

\[ \lim_{m \to \infty} \frac{1}{b_m^m} = 0, \]

then $\alpha$ is approximated by $a_m/b_m$ better as $m$ grows large, which shows that the set of these $b_m$ is unbounded. We establish this via Liouville's Theorem 4.1 on page 160 in the following result.
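As a sanity check that the Liouvillian number $\sum_{k \geq 1} 10^{-k!}$ fits Definition 4.2, the sketch below verifies (4.12) exactly for small $m$ with $b_m = 10^{m!}$ and $a_m/b_m$ the $m$-th partial sum. The function names are illustrative assumptions; the infinite tail is bounded above using the elementary estimate $\sum_{k > K} 10^{-k!} < 2 \cdot 10^{-(K+1)!}$.

```python
from fractions import Fraction
from math import factorial


def liouville_partial(m):
    """a_m / b_m = sum_{k=1}^{m} 10^{-k!}, whose denominator divides 10^{m!}."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, m + 1))


def satisfies_4_12(m, extra_terms=2):
    """Check 0 < alpha - a_m/b_m < 1 / b_m^m with exact rational bounds."""
    b_m = 10 ** factorial(m)
    K = m + extra_terms
    tail_lower = liouville_partial(K) - liouville_partial(m)  # part of the tail
    # everything beyond the K-th term is below 2 * 10^{-(K+1)!}
    tail_upper = tail_lower + Fraction(2, 10 ** factorial(K + 1))
    return 0 < tail_lower and tail_upper < Fraction(1, b_m ** m)
```

The check succeeds because the first omitted term, $10^{-(m+1)!} = 10^{-m \cdot m!} \cdot 10^{-m!}$, is already smaller than $1/b_m^m = 10^{-m \cdot m!}$ by a factor of $10^{m!}$.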

Theorem 4.5 Liouville Numbers Are Transcendental

Every Liouville number is transcendental.

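Proof. Assume that $\alpha$ is a Liouville number that is not transcendental. Then $\alpha$ is an algebraic number of degree $n > 1$. (A rational number $c/d$ cannot satisfy (4.12) for all $m$, since a nonzero difference $|c/d - a_m/b_m|$ is at least $1/(db_m)$, which would force $b_m^{m-1} < d$ for every $m$.) By (4.12) and Liouville's Theorem 4.1,

\[ \frac{c_\alpha}{b_m^n} < \left|\alpha - \frac{a_m}{b_m}\right| < \frac{1}{b_m^m}, \]

so $0 < c_\alpha < b_m^{n-m}$. However, since $b_m \geq 2$, we have $b_m^{n-m} \leq 2^{n-m}$ for $m > n$, and $2^{n-m} \to 0$ as $m \to \infty$, a contradiction. □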

Now that we have established the existence of transcendental numbers and provided sets thereof, we turn to the problem of the transcendence of specific numbers such as $e$ and $\pi$ (see Biographies 3.4 on page 126 and 3.6 on page 128 for background on the solution of these two problems). Compared to the methodology for establishing existence above and in §4.1, establishing the transcendence of individual numbers is a more intricate problem. One open question is the transcendence of

\[ \gamma = \lim_{n \to \infty} \left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log_e(n)\right), \tag{4.13} \]

called Euler's constant. Indeed, it is unknown if $\gamma$ is irrational. Other well known numbers that have resisted attempts to prove transcendence are the values of the Riemann zeta function $\zeta(2n + 1)$ for $n = 1, 2, \ldots$, although $\zeta(3)$ has been proved irrational by Apéry, and thus is known as Apéry's constant (see §5.3).
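The limit (4.13) converges slowly, which a direct computation makes visible. The following is a minimal sketch; the function name is an illustrative assumption and floating-point arithmetic is used throughout.

```python
import math


def euler_gamma_estimate(n):
    """Return 1 + 1/2 + ... + 1/n - ln(n), the n-th term of the limit (4.13)."""
    harmonic = math.fsum(1.0 / k for k in range(1, n + 1))
    return harmonic - math.log(n)
```

The error of the $n$-th term is roughly $1/(2n)$, so `euler_gamma_estimate(100000)` agrees with $\gamma = 0.5772156649\ldots$ only to about five decimal places; numerical evidence of this kind says nothing about irrationality.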

The following proof is essentially due to Hermite, and we follow the approach given in [93, Theorem 9.5, p. 145]. We assume knowledge of elementary calculus, including the following generalization of the product rule known as the Leibniz formula (see Biography 4.5 on page 175):

\[ (fg)^{(i)}(x) = \sum_{k=0}^{i} \binom{i}{k} f^{(k)}(x)g^{(i-k)}(x), \tag{4.14} \]

where $f^{(j)}$ denotes the $j$-th derivative of $f$. Also, recall the integration by parts formula

\[ \int d(uv) = uv = \int u\,dv + \int v\,du. \tag{4.15} \]

Theorem 4.6 The Transcendence of e

The real number e is transcendental.

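Proof. We use properties of the following integral defined for $t \geq 0$,

\[ I(t) = \int_0^t e^{t-x}f(x)\,dx, \]

where $f(x) \in \mathbb{R}[x]$ has degree $d_f$. Employing integration by parts repeatedly, we get

\begin{align*}
I(t) &= -e^{t-x}f(x)\Big|_{x=0}^{t} + \int_0^t e^{t-x}f'(x)\,dx \\
&= e^tf(0) - f(t) - e^{t-x}f'(x)\Big|_{x=0}^{t} + \int_0^t e^{t-x}f''(x)\,dx \\
&\;\;\vdots \\
&= e^t\sum_{i=0}^{d_f} f^{(i)}(0) - \sum_{i=0}^{d_f} f^{(i)}(t).
\end{align*}

Therefore,

\[ I(t) = e^t\sum_{i=0}^{d_f} f^{(i)}(0) - \sum_{i=0}^{d_f} f^{(i)}(t). \tag{4.16} \]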
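Identity (4.16) can be checked numerically for a sample polynomial. The sketch below compares the closed form with a Simpson's rule approximation of the integral; all names are illustrative assumptions, and polynomials are stored as coefficient lists with the constant term first.

```python
import math


def derivative(coeffs):
    """Coefficient list of the derivative (empty list for a constant)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))


def derivative_sum(coeffs, x):
    """Sum of f^{(i)}(x) over i = 0, ..., deg f."""
    total = 0.0
    while coeffs:
        total += evaluate(coeffs, x)
        coeffs = derivative(coeffs)
    return total


def closed_form(coeffs, t):
    """Right-hand side of (4.16)."""
    return math.exp(t) * derivative_sum(coeffs, 0.0) - derivative_sum(coeffs, t)


def simpson_integral(coeffs, t, steps=2000):
    """Approximate I(t), the integral of e^{t-x} f(x) over [0, t]; steps must be even."""
    h = t / steps
    g = lambda x: math.exp(t - x) * evaluate(coeffs, x)
    total = g(0.0) + g(t)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3
```

For $f(x) = x^2$ and $t = 1$, both sides equal $2e - 5$, since the derivative sums are $2$ at $0$ and $t^2 + 2t + 2$ at $t$.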


Now if we let $f_{ab}(x)$ be $f(x)$ with its coefficients replaced by their absolute values, then

\[ |I(t)| \leq \int_0^t |e^{t-x}f(x)|\,dx \leq te^tf_{ab}(t). \tag{4.17} \]

We will employ the above for a specific function $f$ that we will define below.

We proceed to prove that $e$ is transcendental by contradiction. Assume, to the contrary, that $e$ is algebraic. Then there is a minimal polynomial

\[ P(x) = \sum_{j=0}^{d} b_jx^j \in \mathbb{Z}[x], \]

with

\[ P(e) = \sum_{j=0}^{d} b_je^j = 0. \tag{4.18} \]

Note that by Exercise 4.7 on page 179, $e$ is irrational, so we may assume that $d > 1$, allowing for the following. We select a large prime $p$, which we may specify later, and set

\[ f(x) = x^{p-1}\prod_{j=1}^{d} (x - j)^p. \tag{4.19} \]

Observe that the degree of $f$ is

\[ d_f = (d + 1)p - 1. \tag{4.20} \]

Now consider the sum

\[ J = \sum_{j=0}^{d} b_jI(j). \tag{4.21} \]

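Thus, by (4.16) and (4.18),

\begin{align*}
J &= \sum_{j=0}^{d} b_j\left(e^j\sum_{i=0}^{d_f} f^{(i)}(0) - \sum_{i=0}^{d_f} f^{(i)}(j)\right) \\
&= \sum_{i=0}^{d_f} f^{(i)}(0)\sum_{j=0}^{d} b_je^j - \sum_{i=0}^{d_f}\sum_{j=0}^{d} b_jf^{(i)}(j) = -\sum_{i=0}^{d_f}\sum_{j=0}^{d} b_jf^{(i)}(j).
\end{align*}

Thus,

\[ J = -\sum_{i=0}^{d_f}\sum_{j=0}^{d} b_jf^{(i)}(j). \tag{4.22} \]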

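Now suppose that $1 \leq k \leq d$ and define

\[ h_k(x) = \frac{f(x)}{(x - k)^p} = x^{p-1}\prod_{\substack{m=1 \\ m \neq k}}^{d} (x - m)^p \in \mathbb{Z}[x]. \]

By (4.14),

\[ f^{(i)}(x) = \left((x - k)^p \cdot h_k(x)\right)^{(i)} = \sum_{\ell=0}^{i} \binom{i}{\ell} \left((x - k)^p\right)^{(\ell)} \left(h_k(x)\right)^{(i-\ell)}. \tag{4.23} \]

If $0 \leq i < p$, then $f^{(i)}(k) = 0$ since each term of the sum in (4.23) has a factor of $(k - k) = 0$. On the other hand, if $i \geq p$, then

\[ f^{(i)}(k) = \binom{i}{p} p! \cdot h_k^{(i-p)}(k). \]

Hence, for any $i \geq 0$, $p! \mid f^{(i)}(k)$. By a similar analysis, $f^{(i)}(0) = 0$ for any $i < p - 1$. Also, if $i \geq p - 1$ and we set

\[ m(x) = \frac{f(x)}{x^{p-1}} \in \mathbb{Z}[x], \]

then

\[ f^{(i)}(0) = \binom{i}{p-1} (p - 1)! \cdot m^{(i-p+1)}(0). \]

Here, $p \mid m^{(i)}(0) \in \mathbb{Z}$ for $i > 0$ and

\[ m(0) = (-1)^{dp}(d!)^p. \]

It follows that for $i \neq p - 1$, $p! \mid f^{(i)}(0) \in \mathbb{Z}$ and that $(p - 1)! \mid f^{(p-1)}(0) \in \mathbb{Z}$, but for $p > d$, $p \nmid f^{(p-1)}(0)$. Since we may select $p$ as large as we like, take $p > \max\{d, |b_0|\}$, where $b_0 \neq 0$ by the minimality of $P$. Then every term of (4.22) is divisible by $(p - 1)!$, and every term except $b_0f^{(p-1)}(0)$ is divisible by $p!$, so $(p - 1)! \mid J$ and $J \neq 0$. Hence, $|J| \geq (p - 1)!$. However, we also have from (4.19)–(4.20), for $0 \leq k \leq d$, that

\[ f_{ab}(k) \leq k^{p-1}\prod_{j=1}^{d} (k + j)^p \leq (2d)^{(d+1)p-1} = (2d)^{d_f} < (2d)^{2dp}. \tag{4.24} \]

Thus, (4.24) tells us via (4.17) and (4.21) that

\[ |J| \leq \sum_{k=0}^{d} |b_k|ke^kf_{ab}(k) \leq d(d + 1)Ke^d(2d)^{2dp} < C^p, \]

where

\[ K = \max_{0 \leq k \leq d} \{|b_k|\}, \]

and

\[ C = d(d + 1)Ke(2d)^{2d}, \]

which is a constant not depending on $p$. We have shown that

\[ (p - 1)! \leq |J| \leq C^p, \]

which bounds $p$, a contradiction to the fact that we have arbitrarily chosen $p$ large. This contradiction establishes the result. □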
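The divisibility properties of the derivatives of $f$ that drive the proof can be confirmed with exact integer arithmetic on a toy case. The sketch below uses $d = 2$ and $p = 5$, values chosen purely for illustration; the helper names are assumptions.

```python
from math import factorial


def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_deriv(a):
    return [i * c for i, c in enumerate(a)][1:] or [0]


def poly_eval(a, x):
    acc = 0
    for c in reversed(a):  # Horner's rule
        acc = acc * x + c
    return acc


def hermite_f(d, p):
    """Coefficients of f(x) = x^(p-1) * prod_{j=1}^{d} (x - j)^p as in (4.19)."""
    f = [0] * (p - 1) + [1]
    for j in range(1, d + 1):
        for _ in range(p):
            f = poly_mul(f, [-j, 1])
    return f


def derivative_values(f, x):
    """List of f^{(i)}(x) for i = 0, ..., deg f."""
    values, g = [], f
    for _ in range(len(f)):
        values.append(poly_eval(g, x))
        g = poly_deriv(g)
    return values


d, p = 2, 5
f = hermite_f(d, p)
at_zero = derivative_values(f, 0)
```

In this case $f^{(p-1)}(0) = 4! \cdot (-1)^{10}(2!)^5 = 768$, which is divisible by $(p-1)! = 24$ but not by $p = 5$, while every other derivative value at $0, 1, 2$ is divisible by $p! = 120$.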


Remark 4.3 Theorem 4.6 on page 172 illustrates a few of the techniques involved in the theory of transcendental numbers. Although the proof of the transcendence of $\pi$ does not really use any deeper results, more is needed in the proof in terms of algebraic conjugates of $\alpha \in \overline{\mathbb{Q}}$ and the use of symmetric polynomials (see Exercise 4.8 on page 180). The following is due to Lindemann (see Biography 3.6 on page 128).

Biography 4.5 Gottfried Wilhelm von Leibniz (1646–1716) was born on July 1, 1646 in Leipzig, Saxony (now Germany). He studied law at Leipzig from 1661 to 1666 and ultimately received a doctorate in law from the University of Altdorf in February 1667. Then he pursued a career in law at the courts of Mainz from 1667 to 1672. From 1672 to 1676, he spent his time in Paris, where he studied mathematics and physics under Christian Huygens (1629–1695). In 1676, he left for Hannover, Hanover (now Germany), where he remained for the balance of his life. Leibniz began looking for a uniform and useful notation for the calculus in 1673, and by the autumn of 1676, he discovered the differential notation $d(x^n) = nx^{n-1}dx$ for $n \in \mathbb{Q}$. In 1684, he published the details of the differential calculus, three years before Newton published his famed Principia. There remained a bitter dispute over priority concerning discovery of the calculus. In 1700, Leibniz created the Brandenburg Society of Sciences, which led to the creation of the Berlin Academy some years later. Then he became increasingly reclusive until his death in Hannover on November 14, 1716.

Much of the mathematical activity in his last years involved the aforementioned priority dispute over the invention of the calculus. In 1714, he published a pamphlet indicating a mistake made by Newton in understanding higher order derivatives, an error that was discovered by Johann Bernoulli, as evidence of his case.

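Theorem 4.7 The Transcendence of $\pi$

The real value $\pi$ is transcendental.

Proof. If $\pi$ is algebraic, then given that $\overline{\mathbb{Q}}$ is a field, $\alpha = i\pi$ is also algebraic, where $i^2 = -1$. Let the algebraic conjugates of $\alpha$ be

\[ \alpha = \alpha_1, \alpha_2, \ldots, \alpha_d, \quad\text{for some } d \in \mathbb{N}. \]

Since

\[ e^{\alpha_1} = e^{i\pi} = -1, \]

called Euler's identity for $e$, then

\[ (1 + e^{\alpha_1})(1 + e^{\alpha_2}) \cdots (1 + e^{\alpha_d}) = 0. \]

We may write

\[ \prod_{j=1}^{d} (1 + e^{\alpha_j}) = \sum_{\theta} e^{\theta}, \tag{4.25} \]

where the sum ranges over all $\theta = \sum_{i=1}^{d} \varepsilon_i\alpha_i$ with $\varepsilon_i \in \{0, 1\}$. If we let

\[ \{\theta_1, \theta_2, \ldots, \theta_n\} \]

be the exponents in the sum (4.25) that are nonzero, then

\[ 2^d - n + \sum_{j=1}^{n} e^{\theta_j} = 0. \]

Now we may invoke the techniques used in the proof of Theorem 4.6 by comparing

\[ \sum_{i=1}^{n} I(\theta_i) \quad\text{(where $I(t)$ is given in (4.16) on page 172)} \]

with

\[ f(x) = a^{np}x^{p-1}\prod_{i=1}^{n} (x - \theta_i)^p, \]

where $a$ is the leading coefficient of the minimal polynomial of $\alpha$ and $p$ is an arbitrarily chosen large prime to be specified later. Since $a\theta_i \in \mathbb{A}$, where $\mathbb{A} \cap \mathbb{Q} = \mathbb{Z}$ by Corollary 1.1 on page 4, and since

\[ \prod_{\theta} (x - \theta) = x^{2^d-n}\prod_{i=1}^{n} (x - \theta_i) \]

is symmetric with respect to $\alpha_1, \ldots, \alpha_d$, then by Exercise 4.8 on page 180, $f(x) \in \mathbb{Z}[x]$. Now we let

\[ n_f = (n + 1)p - 1, \]

and

\[ g = -(2^d - n)\sum_{j=0}^{n_f} f^{(j)}(0) - \sum_{j=0}^{n_f}\sum_{i=1}^{n} f^{(j)}(\theta_i) \in \mathbb{Z}, \tag{4.26} \]

where

\[ \sum_{i=1}^{n} f^{(j)}(\theta_i) \]

is symmetric in the $a\theta_i$. Hence, by Exercise 4.8 again,

\[ \sum_{i=1}^{n} f^{(j)}(\theta_i) \in \mathbb{Z}. \]

However, for $j < p$, $f^{(j)}(\theta_i) = 0$, so

\[ \sum_{i=1}^{n} f^{(j)}(\theta_i) \equiv 0 \pmod{p!}. \]

Also, if $j \neq p - 1$, $f^{(j)}(0) \equiv 0 \pmod{p!}$. As well, for $p$ sufficiently large,

\[ f^{(p-1)}(0) = (p - 1)!(-a)^{np}(\theta_1 \cdots \theta_n)^p \equiv 0 \pmod{(p - 1)!}, \]

but

\[ f^{(p-1)}(0) = (p - 1)!(-a)^{np}(\theta_1 \cdots \theta_n)^p \not\equiv 0 \pmod{p!}. \]

On the other hand,

\[ |g| \leq \sum_{i=1}^{n} |\theta_i|f_{ab}(|\theta_i|) \leq c^p, \]

where $c \in \mathbb{R}$ is independent of $p$. Then we proceed as in the proof of Theorem 4.6, this time using Exercise 4.9 on page 180, to get a contradiction to $\pi$ being algebraic. □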

Lindemann proved a stronger result than Theorem 4.7, namely that if $\alpha \in \mathbb{C}$, $\alpha \neq 0$, then at least one of $\alpha$ or $e^\alpha$ is transcendental. Then this result was generalized considerably by Weierstrass to linear combinations, stated in our next result.

Theorem 4.8 The Lindemann–Weierstrass Result

Given $\alpha_i, \beta_j \in \overline{\mathbb{Q}}$, where $\alpha_i$, for $i = 1, 2, \ldots, n$, are distinct and $\beta_j \neq 0$ for $j = 1, 2, \ldots, n$,

\[ \sum_{j=1}^{n} \beta_je^{\alpha_j} \neq 0. \]

Proof. See [54]. □
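As a sketch of how Theorem 4.8 is typically applied, both earlier transcendence results can be recovered from it. The particular choices of $\alpha_j$ and $\beta_j$ below are ours and are not taken from [54].

```latex
\begin{align*}
&\textbf{Case } e:\ \text{suppose } \sum_{j \in T} b_j e^{j} = 0
   \text{ with } \emptyset \neq T \subseteq \{0, 1, \ldots, d\},\ b_j \in \mathbb{Z} \setminus \{0\}. \\
&\qquad \text{Take } \alpha_j = j,\ \beta_j = b_j \ (j \in T):
   \text{ the exponents are distinct and algebraic, the coefficients nonzero,} \\
&\qquad \text{so Theorem 4.8 gives } \sum_{j \in T} b_j e^{j} \neq 0, \text{ a contradiction.} \\[4pt]
&\textbf{Case } \pi:\ \text{suppose } \pi \in \overline{\mathbb{Q}}, \text{ so that } i\pi \in \overline{\mathbb{Q}}. \\
&\qquad \text{Take } \alpha_1 = i\pi,\ \alpha_2 = 0,\ \beta_1 = \beta_2 = 1:
   \quad e^{i\pi} + e^{0} = -1 + 1 = 0, \\
&\qquad \text{which contradicts Theorem 4.8 since } \alpha_1 \neq \alpha_2.
\end{align*}
```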

Theorem 4.6 on page 172 is immediate from Theorem 4.8, and Theorem 4.7 follows from Theorem 4.8 via Euler's identity $e^{i\pi} + 1 = 0$, cited on page 175. The very notion of transcendence itself can be generalized as follows.

Definition 4.3 Algebraic Independence

If $\alpha_j \in \mathbb{R}$ for $j = 1, 2, \ldots, n \in \mathbb{N}$, then $\{\alpha_j\}_{j=1}^{n}$ is said to be algebraically independent over $\mathbb{Q}$ if there does not exist a nonzero polynomial

\[ f(x_1, x_2, \ldots, x_n) \in \mathbb{Q}[x_1, x_2, \ldots, x_n] \]

with $f(\alpha_1, \alpha_2, \ldots, \alpha_n) = 0$.

Since the concept of a single $\alpha$ being transcendental is the case $n = 1$ of Definition 4.3, we have our generalization.

To take the theory of transcendental numbers to its pinnacle, we state a result that is more general still than Theorem 4.8, namely the following open conjecture, the verification of which would fell numerous open questions in the theory of transcendental numbers.


Conjecture 4.1 Schanuel’s Conjecture

If $\alpha_j \in \mathbb{C}$ are linearly independent over $\mathbb{Q}$ for $j = 1, 2, \ldots, n \in \mathbb{N}$, then there exists a subset $S$ of $\{\alpha_1, \alpha_2, \ldots, \alpha_n, e^{\alpha_1}, e^{\alpha_2}, \ldots, e^{\alpha_n}\}$ such that $|S| \geq n$ where $S$ is algebraically independent over $\mathbb{Q}$.

We conclude with some numbers known to be transcendental, and some that are not. From Theorem 4.8 on the preceding page, we know that $e^\alpha$ is transcendental if $\alpha \in \overline{\mathbb{Q}}$ is nonzero. It also follows from Theorem 4.8 that $\sin(\alpha)$, $\cos(\alpha)$, $\tan(\alpha)$ are transcendental for any nonzero $\alpha \in \overline{\mathbb{Q}}$, as well as $\log_e(\alpha)$ for any $\alpha \in \overline{\mathbb{Q}}$ with $\alpha \neq 0, 1$. Gel'fond's constant $e^\pi$ and the Gel'fond–Schneider constant $\sqrt{2}^{\sqrt{2}}$ are known to be transcendental by the Gel'fond–Schneider Theorem, a result that follows from Conjecture 4.1 (see page 166). Also, Gel'fond's constant and the Gel'fond–Schneider constant were noted in Hilbert's seventh problem as examples of numbers whose transcendence was unknown at the turn of the twentieth century (see Biography 3.5 on page 127).

The number whose binary expansion is given by

\[ p = 0.01101001100101101001011001101001\ldots \]

is known as the Prouhet–Thue–Morse constant. To see how this number is defined, let the first term be $t_0 = 0$ and for $n \in \mathbb{N}$, define $t_n = 1$ if the number of ones in the binary expansion of $n$ is odd, and $t_n = 0$ if the number of ones is even. Thus, the Thue–Morse sequence $t_n$ is given by $t_0 = 0$, $t_{2n} = t_n$, and $t_{2n+1} = 1 - t_n$ for all $n \in \mathbb{N}$. The generating function for the signs $(-1)^{t_n}$ satisfies

\[ \sum_{n=0}^{\infty} (-1)^{t_n}x^n = \prod_{n=0}^{\infty} (1 - x^{2^n}), \]

(see [68, §1.7]). The sequence was independently discovered by E. Prouhet, Axel Thue, and Marston Morse. This constant $p$ was shown to be transcendental by Mahler in 1929 (see Biography 4.7 on page 181).
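The two descriptions of the Thue–Morse sequence, and the product formula for its sign generating function, can be compared on an initial segment. The following is a minimal sketch with illustrative function names.

```python
def thue_morse(n):
    """t_n = parity of the number of ones in the binary expansion of n."""
    return bin(n).count("1") % 2


def thue_morse_by_recurrence(count):
    """First `count` terms via t_0 = 0, t_{2n} = t_n, t_{2n+1} = 1 - t_n."""
    t = [0] * count
    for n in range(1, count):
        t[n] = t[n // 2] if n % 2 == 0 else 1 - t[n // 2]
    return t


def product_coefficients(count):
    """Coefficients of x^0..x^(count-1) in the product of (1 - x^(2^n)), n >= 0."""
    coeffs = [1] + [0] * (count - 1)
    step = 1
    while step < count:
        # multiply by (1 - x^step), discarding terms of degree >= count
        coeffs = [c - (coeffs[i - step] if i >= step else 0)
                  for i, c in enumerate(coeffs)]
        step *= 2
    return coeffs


bits = "".join(str(thue_morse(n)) for n in range(32))
```

Factors $1 - x^{2^n}$ with $2^n$ at least `count` cannot affect the retained coefficients, which is why the loop may stop early.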

Some numbers not known to be transcendental are Euler's constant, discussed on page 172, as well as Apéry's constant mentioned there. There is also Catalan's constant defined by

\[ K = \sum_{j=0}^{\infty} \frac{(-1)^j}{(2j + 1)^2}, \]

which, like Euler's constant, is not known to be irrational. Also, sums, products, and powers of $\pi$ and $e$, except Gel'fond's constant, such as $\pi^\pi$, $\pi + e$, or $e^e$, are not known to be transcendental. It is of interest to note that since $\pi$ is known to be transcendental, it is not possible to get the square root of $\pi$ from rational numbers, so it is impossible to find the length of the side of a square having the same area as a given circle using ruler and compass. This means that the classical problem of squaring the circle cannot be accomplished.

For a nice discussion of many open problems in diophantine analysis, see [100].

Biography 4.6 Karl Theodor Wilhelm Weierstrass (1815–1897) was born on October 31, 1815 in Ostenfelde, Westphalia (now Germany). His early education was spotty in terms of his commitment. He entered the Catholic Gymnasium in Paderborn in 1829, and graduated in 1834. Then he entered the University of Bonn, where he was enrolled in the study of law, finance, and economics, largely to satisfy the wishes of his father, which conflicted with his love of mathematics. This led to a conflict within him that resulted in his not studying any subjects, rather spending four years of exhaustive drinking and fencing. He left Bonn in 1838 without taking the examinations. In 1839, he entered the Academy at Münster to study to become a secondary school teacher, and began his career as such in 1842 at the Pro-Gymnasium in West Prussia (now Poland). In 1848, he moved to the Collegium Hoseanum in Braunsberg. During much of this time he studied mathematics on his own, including his reading of Crelle's Journal, for instance. Given his lack of formal training, his publication on abelian functions in the Braunsberg school prospectus was largely ignored. However, he published a paper in Crelle's Journal in 1854 on his (partial) theory of inversion of hyperelliptic integrals, which was more than noticed. On the basis of this paper alone, the University of Königsberg presented him with an honorary doctorate on March 31, 1854. This made Weierstrass decide to ultimately leave secondary school teaching, never to return to it. When he published his full theory of inversion of hyperelliptic integrals in Crelle's Journal in 1856, he began receiving many offers for chairs at various universities. He accepted an offer of a professorship at the University of Berlin in October of 1856. His lectures on applications of Fourier series and integrals to mathematical physics, the theory of analytic functions, and of elliptic functions, as well as applications to problems in geometry and mechanics, were received with enthusiasm from the many students from around the globe who came to attend. Among those who benefited from his teaching were Cantor, Frobenius, Hensel, Hurwitz, Klein, Lie, Mertens, Minkowski, Mittag-Leffler, and Stolz. Indeed, together with his colleagues Kummer and Kronecker at Berlin, he provided the university with a reputation as a leader for excellence in mathematics. He died on February 19, 1897 in Berlin.

Weierstrass is known as the father of modern analysis. He created tests for convergence of series, and established fundamental work in the theory of periodic functions, functions of a real variable, elliptic functions, abelian functions, converging infinite products, and the calculus of variations, not to mention the theory of quadratic forms. He set a standard of rigour, for instance, establishing irrational numbers as limits of convergent series, that is with us today.

Exercises

4.7. Prove the result first established by Euler that $e$ is irrational.

(Hint: Prove that $e^{-1}$ is irrational by using the formula $e^{-1} = \sum_{i=0}^{\infty} \frac{(-1)^i}{i!}$, and breaking it into two parts, $\alpha_n = \sum_{i=0}^{n} \frac{(-1)^i}{i!}$ and $\beta_n = \sum_{i=n+1}^{\infty} \frac{(-1)^i}{i!}$, demonstrating that $n!\alpha_n + n!\beta_n(-1)^{n+1}$ cannot be an integer.)

4.8. This exercise deals with symmetric polynomials. These are defined to be those $f(x_1, x_2, \ldots, x_n) \in R[x_1, x_2, \ldots, x_n]$, for a given commutative ring with identity $R$, such that for any permutation $\sigma$ of $\{1, 2, \ldots, n\}$,

\[ f(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(n)}) = f(x_1, x_2, \ldots, x_n), \]

denoted succinctly by $f^\sigma = f$. The elementary symmetric polynomials $s_j$ in the variables $x_j$ for $j = 1, 2, \ldots, n$, are, up to sign, the coefficients of the monic polynomial $(X - x_1)(X - x_2) \cdots (X - x_n) = X^n - s_1X^{n-1} \pm \cdots + (-1)^ns_n$, which are homogeneous, symmetric, and

\[ s_1 = \sum_{j=1}^{n} x_j, \quad \ldots, \quad s_k = \sum_{1 \leq i_1 < i_2 < \cdots < i_k \leq n} x_{i_1}x_{i_2} \cdots x_{i_k}, \quad \ldots, \quad s_n = \prod_{j=1}^{n} x_j. \]

The Fundamental Theorem of Symmetric Polynomials is the following. Let $f(x_1, x_2, \ldots, x_n) \in \mathbb{Q}[x_1, x_2, \ldots, x_n]$ be symmetric. Then there exists a polynomial $g(x_1, x_2, \ldots, x_n) \in \mathbb{Q}[x_1, x_2, \ldots, x_n]$ such that $f(x_1, x_2, \ldots, x_n) = g(s_1, s_2, \ldots, s_n)$.

Prove the fundamental theorem.

(Hint: Since $f$ is a sum of monomials $ax_1^{a_1}x_2^{a_2} \cdots x_n^{a_n}$ where $a \in \mathbb{Q}$ and $a_j \geq 0$ for all $j = 1, 2, \ldots, n$, order them according to the exponent tuples $(a_1, a_2, \ldots, a_n)$, called a dictionary ordering. Select a largest one $ax_1^{a_1}x_2^{a_2} \cdots x_n^{a_n}$. Then consider $as_1^{a_1-a_2}s_2^{a_2-a_3} \cdots s_n^{a_n} = g_1$, which is symmetric in $x_1, x_2, \ldots, x_n$ and is a sum of monomials in $x_1, x_2, \ldots, x_n$. The largest monomial appearing in $g_1$ is $ax_1^{a_1-a_2}(x_1x_2)^{a_2-a_3} \cdots (x_1x_2 \cdots x_n)^{a_n}$, which is the largest one appearing in $f$. Consider $f_1 = f - g_1$ and repeat the process, which must terminate.)
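The reduction described in the hint is effectively an algorithm, and running it on small inputs can guide the termination argument. The sketch below is one possible rendering, not the text's own: polynomials are dictionaries mapping exponent tuples to coefficients, Python's tuple comparison supplies the dictionary ordering, and all function names are illustrative assumptions.

```python
from itertools import combinations


def elementary(n, k):
    """Elementary symmetric polynomial s_k in n variables."""
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), k)}


def mul(p, q):
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}


def power(p, k, n):
    out = {(0,) * n: 1}
    for _ in range(k):
        out = mul(out, p)
    return out


def to_elementary(f, n):
    """Return g as {(b_1..b_n): coeff} with f = sum of coeff * s_1^b_1 ... s_n^b_n."""
    f = dict(f)
    g = {}
    while f:
        lead = max(f)  # largest exponent tuple in dictionary order
        a = f[lead]
        b = tuple(lead[i] - (lead[i + 1] if i + 1 < n else 0) for i in range(n))
        if min(b) < 0:
            raise ValueError("input is not symmetric")
        g[b] = g.get(b, 0) + a
        term = {(0,) * n: a}  # build g_1 = a * s_1^(a_1-a_2) * ... * s_n^(a_n)
        for k in range(n):
            term = mul(term, power(elementary(n, k + 1), b[k], n))
        for e, c in term.items():  # f <- f - g_1
            f[e] = f.get(e, 0) - c
            if f[e] == 0:
                del f[e]
    return g
```

For example, $x_1^2 + x_2^2 + x_3^2$ reduces in two steps to $s_1^2 - 2s_2$; each step removes the current largest monomial and introduces only smaller ones, which is the observation the termination proof needs.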

4.9. Prove that $\pi \notin \mathbb{Q}$.

(Hint: Assume to the contrary that $\pi = a/b$ and let $f(x) = x^n(a - bx)^n/n!$. Consider the sum $\sum_{j=0}^{n} (-1)^jf^{(2j)}(x)$ and show that its values at $x = 0$ and $x = \pi$ are integers, so that you may demonstrate that $\int_0^\pi f(x)\sin(x)\,dx$ is an integer. Reach a contradiction by showing that for large enough $n$ the integral lies strictly between 0 and 1.)


Biography 4.7 Kurt Mahler (1903–1988) was born in Krefeld, Prussian Rhineland on July 26, 1903. From an early age he taught himself mathematics by reading the masters such as Landau, Klein, and Hilbert, as well as many others. In 1925, he moved to Göttingen, where he attended lectures by many, including Emmy Noether, Landau, Heisenberg, Hilbert, and Ostrowski. In particular, Noether was influential in that she taught him about $p$-adic numbers. By 1927 he had enough to submit a thesis to Frankfurt on zeros of the gamma function. This was sufficient for his doctoral requirements. His first appointment was to the University of Königsberg in 1933. However, with Hitler's rise to power he had to leave Germany. Mordell invited him to Manchester, where he stayed from 1933 to 1934. Then he went to Groningen in the Netherlands for 1934–1936, and returned to Manchester in 1937, where he remained until 1962, when he went to Canberra, Australia for the last six years of his career. He died there in his eighty-fifth year on February 25, 1988.

Among his works was the proof of the transcendence of $\sqrt{2}^{\sqrt{2}}$. Also, he classified real and complex numbers into classes which are algebraically independent. As well, he worked on $p$-adic numbers, $p$-adic Diophantine approximation, the geometry of numbers, and measure of polynomials. Among the honours in his life was the De Morgan medal awarded in 1971. Moreover, he was elected a Fellow of the Australian Academy of Science in 1965 and received its Lyle Medal in 1977. In November 1977, he received a diploma at a special ceremony in Frankfurt to mark the golden jubilee of his doctorate. The Dutch Mathematical Society made him an honorary member in 1957, as did the Australian Mathematical Society in 1986. Among his nonmathematical activities was photography. Indeed, many of his pictures are displayed at the University House of Australian National University, where he lived for more than two decades.


4.3 Minkowski’s Convex Body Theorem

Poetry is a subject as precise as geometry.
From a letter to Louise Colet, August 14, 1853 in Correspondence 1853–1856, M. Nadeau (ed.) (1964)
Gustave Flaubert (1821–1880), French novelist

Minkowski coined the term geometry of numbers to mean the use of geometric methods, especially in Euclidean $n$-space, to solve deep problems in number theory (see Biography 4.8 on page 190). Perhaps the most celebrated of these is the convex body theorem, which he proved in 1896. Before presenting this result, we need to develop some basic ideas in the theory of the geometry of numbers, the first of which is given as follows. Some of the material in this section is adapted from [64]. The reader should be familiar with the basics of vector spaces such as that to be found in [68, Appendix A].

Definition 4.4 Lattices and Parallelotopes

Let $\ell_1, \ell_2, \ldots, \ell_m \in \mathbb{R}^n$ ($m, n \in \mathbb{N}$, $m \leq n$) be $\mathbb{R}$-linearly independent vectors. If

\[ L = \left\{\ell \in \mathbb{R}^n : \ell = \sum_{j=1}^{m} z_j\ell_j \text{ for some } z_j \in \mathbb{Z}\right\} = \mathbb{Z}[\ell_1, \ldots, \ell_m], \]

then $L$ is called a lattice of dimension $m$ in $\mathbb{R}^n$. When $m = n$, $L$ is called a full lattice. In other words, a full lattice $L$ is a free abelian group of rank $n$ having a $\mathbb{Z}$-basis that is also an $\mathbb{R}$-basis for $\mathbb{R}^n$. Furthermore, the set

\[ P = \left\{\sum_{j=1}^{n} r_j\ell_j : r_j \in \mathbb{R},\ 0 \leq r_j < 1 \text{ for } j = 1, 2, \ldots, n\right\} \]

is called the fundamental parallelotope, or fundamental parallelepiped, or fundamental domain of $L$. An invariant of $P$ is

\[ V(P) = |\det(\ell_j)|, \]

called the volume of $P$, and also called the discriminant of $L$, denoted by $D(L)$.
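To make the invariance of $V(P)$ concrete, the sketch below computes $|\det(\ell_j)|$ for a basis of a full lattice in $\mathbb{R}^2$ and again after passing to another basis by an integer matrix of determinant 1. The function names and the sample basis are illustrative assumptions; exact rational arithmetic avoids rounding.

```python
from fractions import Fraction


def determinant(rows):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return det


def discriminant(basis):
    """V(P) = |det| of the matrix whose rows are the basis vectors."""
    return abs(determinant(basis))


def change_basis(U, basis):
    """Rows of U * basis: new basis vectors as integer combinations of the old."""
    size = len(basis)
    return [[sum(U[i][k] * basis[k][j] for k in range(size))
             for j in range(len(basis[0]))] for i in range(len(U))]


basis = [[2, 0], [1, 3]]
U = [[1, 1], [0, 1]]          # integer matrix with determinant 1
new_basis = change_basis(U, basis)
```

Both bases generate the same lattice and give discriminant 6, as expected from $\det(UB) = \det(U)\det(B)$ with $\det(U) = \pm 1$.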

Remark 4.4 Recall that a free abelian group with a basis of n elements is anadditive abelian group with a linearly independent subset S of order n thatgenerates it, meaning that G equals the intersection of all subgroups containingS. See Exercise 2.5 on page 66 for a reminder of the definition of GL(n, Z), ifneeded.

As well, note that the term “invariant” in Definition 4.4 means that, irrespective of which basis we choose for $L$, the volume of $P$ remains the same. It is an easy exercise for the reader to verify that the determinant remains the same under change of basis using Exercise 4.10 on page 189. For the reader with a knowledge of measure theory, or Lebesgue measure in $\mathbb{R}^n$, the volume of a so-called measurable set $S \subseteq \mathbb{R}^n$ is called the measure of $S$. For the fundamental parallelotope $P$, this measure can be shown to be the absolute value of the determinant of the matrix with rows $\alpha_j$ for $j = 1, 2, \ldots, n$ for any basis $\{\alpha_j\}$ of $L$. Thus, the Lebesgue measure of $S$ is called the volume of $S$.

Example 4.1 $\mathbb{Z}^n$ is a full lattice in $\mathbb{R}^n$ for any $n \in \mathbb{N}$. In other words, a free abelian group of rank $n$ in $\mathbb{R}^n$ is a full lattice. Hence, $\mathcal{O}_F$ is a full lattice in $\mathbb{R}^n$, where $|F : \mathbb{Q}| = n$. Also, note that any lattice of dimension $m \in \mathbb{N}$ is full in $\mathbb{R}^m$.

We will now show that lattices as subsets of $\mathbb{R}^n$ are characterized by the following property, where the notation $|S| < \infty$ for the cardinality of a set means that $S$ has finitely many elements.

Definition 4.5 Discrete Sets

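Suppose that $S \subseteq \mathbb{R}^n$, $n \in \mathbb{N}$, $r \in \mathbb{R}^+$, and

$$S_r = \{s \in \mathbb{R}^n : |s| \le r\}$$

is the sphere or ball in $\mathbb{R}^n$, with radius $r$, centered at the origin. Then $S$ is called discrete if

$$|S \cap S_r| < \infty,$$

for all $r \in \mathbb{R}^+$.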

Remark 4.5 For what follows, the reader is asked to recall that if

$$s = (s_1, s_2, \ldots, s_n) \in \mathbb{R}^n,$$

then $|s| \le r$ means that

$$\sum_{j=1}^{n} s_j^2 \le r^2,$$

since

$$|s| = \Big(\sum_{j=1}^{n} s_j^2\Big)^{1/2},$$

so $|s_j| \le r$ for each such $j$. Also, the symbol

$$G \oplus H$$

denotes the additive free abelian group structure on free abelian groups $G$, $H$, called a direct sum of $G$ and $H$.


Theorem 4.9 Lattices are Discrete

Let $L \subseteq \mathbb{R}^n$, $L \ne \emptyset$. Then $L$ is a lattice if and only if $L$ is a discrete, additive subgroup of $\mathbb{R}^n$.

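Proof. Let $L$ be a lattice of dimension $n$, namely a full lattice in $\mathbb{R}^n$. If

$$L = \alpha_1\mathbb{Z} \oplus \cdots \oplus \alpha_n\mathbb{Z},$$

then

$$\{\alpha_1, \ldots, \alpha_n\}$$

is an $\mathbb{R}$-basis for $\mathbb{R}^n$. Thus, any $\delta \in \mathbb{R}^n$ can be written in the form

$$\delta = \sum_{j=1}^{n} r_j\alpha_j \quad (r_j \in \mathbb{R}).$$

If $\delta \in L \cap S_r$ for any $r \in \mathbb{R}^+$, then each $r_j \in \mathbb{Z}$ and $|r_j| \le r$ for each $j = 1, 2, \ldots, n$. Hence, there exist only finitely many points in $L \cap S_r$. In other words, $L$ is discrete.

Conversely, assume that $L$ is a discrete, additive subgroup of $\mathbb{R}^n$. We use induction on $n$. For $n = 1$, let $\{\alpha\}$ be a basis for $\mathbb{R}$, namely

$$\mathbb{R}^1 = \mathbb{R}\alpha.$$

Since $S_r \cap L$ is finite for all $r \in \mathbb{R}^+$, there exists a smallest positive value $r_1$ such that $r_1\alpha \in L$. Therefore,

$$\mathbb{Z}r_1\alpha \subseteq L.$$

Since any $s \in \mathbb{R}$ may be written as

$$s = \Big\lfloor\frac{s}{r_1}\Big\rfloor r_1 + s_1r_1,$$

for some real number $s_1$ with $0 \le s_1 < 1$, then any $s\alpha \in L$ may be written in the form

$$s\alpha = nr_1\alpha + s_1r_1\alpha,$$

with

$$n = \Big\lfloor\frac{s}{r_1}\Big\rfloor \in \mathbb{Z},$$

and $0 \le s_1 < 1$. Therefore, by the minimality of $r_1$, we must have that $s_1 = 0$, so

$$L = \mathbb{Z}[r_1\alpha].$$

This establishes the base case. Assume the induction hypothesis, namely that any discrete subgroup of $\mathbb{R}^k$ for $k < n$ is a lattice. Hence, we may assume that

$$L \subseteq \mathbb{R}^n \text{ is discrete and } L \not\subseteq \mathbb{R}^k \text{ for any } k < n.$$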


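Hence, we may choose a basis

$$\{\alpha_1, \ldots, \alpha_n\}$$

of $\mathbb{R}^n$ with $\alpha_j \in L$ for each $j = 1, 2, \ldots, n$. Set

$$V = \mathbb{R}[\alpha_1, \ldots, \alpha_{n-1}].$$

By the induction hypothesis,

$$L_V = L \cap V$$

is a lattice of dimension $n - 1$. Let

$$\{\beta_1, \ldots, \beta_{n-1}\}$$

be a basis for $L_V$. Therefore, any element $\gamma \in L$ may be written

$$\gamma = \sum_{j=1}^{n-1} r_j\beta_j + r_n\alpha_n \quad (r_j \in \mathbb{R}).$$

By the discreteness of $L$, there exist only finitely many such $\gamma$ with all $r_j$ bounded. Thus, we may choose one with $r_n > 0$, and minimal with respect to $|r_j| < 1$ for all $j \ne n$. Let $\beta_n$ denote this choice. Thus,

$$\mathbb{R}^n = \mathbb{R}[\beta_1, \ldots, \beta_n].$$

Then for any $\sigma \in L$,

$$\sigma = \sum_{j=1}^{n} t_j\beta_j \quad (t_j \in \mathbb{R}).$$

Let

$$\tau = \sigma - \sum_{j=1}^{n} \lfloor t_j\rfloor\beta_j = \sum_{j=1}^{n} s_j\beta_j.$$

Therefore, $0 \le s_j < 1$ for all $j = 1, \ldots, n$. By the minimality of $r_n$, we must have that $s_n = 0$. Hence,

$$\tau \in L_V,$$

so

$$\sigma \in L_V \oplus \mathbb{Z}\beta_n.$$

This gives us, in total, that

$$L \subseteq L_V \oplus \mathbb{Z}\beta_n \subseteq L.$$

Therefore,

$$L = L_V \oplus \mathbb{Z}\beta_n$$

is a lattice. □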

We also need other fundamental notions from geometry.


Definition 4.6 Bounded, Convex, and Symmetric Sets

A set $S$ in $\mathbb{R}^n$ is said to be convex if, whenever $s, t \in S$, the point

$$\lambda s + (1 - \lambda)t \in S$$

for all $\lambda \in \mathbb{R}$ such that $0 \le \lambda \le 1$. In other words, $S$ is convex if it satisfies the property that, for all $s, t \in S$, the line segment joining $s$ and $t$ is also in $S$. The volume of a convex set $S$ is given by the multiple integral

$$V(S) = \int_S \cdots \int dx_1\,dx_2 \cdots dx_n$$

carried out over the set $S$. A set $S$ in $\mathbb{R}^n$ is said to be bounded if there exists a sufficiently large $r \in \mathbb{R}$ such that $|s| \le r$ for all $s \in S$. Another way of looking at this geometrically is that $S$ is bounded if it can fit into a sphere with center at the origin of $\mathbb{R}^n$ and radius $r$.

A set $S$ in $\mathbb{R}^n$ is symmetric provided that, for each $s \in S$, we have $-s \in S$.

Remark 4.6 A theorem of W. Blaschke says that the volume of every bounded, convex set exists. Hence, the integral in Definition 4.6 always exists for bounded convex sets.

Example 4.2 Clearly, ellipses and squares are convex in $\mathbb{R}^2$, but a crescent shape, for instance, is not. Also, an $n$-dimensional cube

$$S = \{s = (s_1, \ldots, s_n) \in \mathbb{R}^n : -1 \le s_j \le 1 \text{ for } j = 1, 2, \ldots, n\}$$

is a bounded, symmetric convex set, as is an $n$-dimensional unit sphere

$$\{s \in \mathbb{R}^n : |s| \le 1\}.$$

Before proceeding to the main result, we need a technical lemma.

Lemma 4.1 Translates and Volume

Let $S \subseteq \mathbb{R}^n$ be a bounded set and let $L$ be an $n$-dimensional lattice. If the translates of $S$ by $L$, given by

$$S_z = \{s + z : s \in S\},$$

for a given $z \in L$, are pairwise disjoint, namely

$$S_z \cap S_y = \emptyset,$$

for each $y, z \in L$ with $y \ne z$, then

$$V(S) \le V(P)$$

where $P$ is a fundamental parallelotope of $L$.


Proof. Since $P$ is a fundamental parallelotope of $L$, we have the following description of $S$ as a disjoint union:

$$S = \bigcup_{z \in L}(S \cap P_{-z}),$$

where

$$P_{-z} = \{x - z : x \in P\},$$

so it follows that

$$V(S) = \sum_{z \in L} V(S \cap P_{-z}).$$

Since the translate of the set

$$S \cap P_{-z}$$

by the vector $z$ is

$$S_z \cap P,$$

then

$$V(S \cap P_{-z}) = V(S_z \cap P). \qquad (4.27)$$

Therefore,

$$V(S) = \sum_{z \in L} V(S_z \cap P).$$

If the translates $S_z$ are pairwise disjoint, then so are $S_z \cap P$. Since

$$S_z \cap P \subseteq P,$$

then Equation (4.27) tells us that

$$\sum_{z \in L} V(S_z \cap P) \le V(P),$$

so the result is proved. □

Remark 4.7 The interested reader will note that the term convex body, used in what follows, refers to a nonempty, convex, bounded, and closed subset $S$ of $\mathbb{R}^n$. The topological term “closed” means that every accumulation point of a sequence of elements in $S$ must also be in $S$. This is equivalent to saying that $S$ is closed in the topological space $\mathbb{R}^n$, with its natural topology. However, we do not need to concern ourselves here with this, since it is possible to state and prove the result without such topological considerations. It can also be shown that if $S$ is “compact,” namely every “cover” (a union of sets containing $S$) contains a finite cover, then it suffices to assume that

$$V(S) \ge 2^nV(P).$$

Now we are in a position to state the central result of this section.


Theorem 4.10 Minkowski’s Convex Body Theorem

Suppose that $L$ is a lattice of dimension $n$, and let $V(P)$ be the volume of a fundamental parallelotope $P$ of $L$. If $S$ is a symmetric, convex set in $\mathbb{R}^n$ with volume $V(S)$ such that

$$V(S) > 2^nV(P),$$

then there exists an $x \in S \cap L$ such that $x \ne 0$.

Proof. It suffices to prove the result for a bounded set $S$. To see this, we observe that when $S$ is unbounded, we may restrict attention to the intersection of $S$ with an $n$-dimensional sphere, centered at the origin, having a sufficiently large radius. Let

$$T = \tfrac{1}{2}S = \{s/2 : s \in S\}.$$

Then

$$V(T) = \frac{V(S)}{2^n} > V(P).$$

If the translates

$$T_z = \tfrac{1}{2}S + z$$

were pairwise disjoint, then by Lemma 4.1,

$$V(P) \ge V(T),$$

a contradiction. Therefore, there must exist two distinct elements $s, t \in L$ such that

$$\big(\tfrac{1}{2}S - s\big) \cap \big(\tfrac{1}{2}S - t\big) \ne \emptyset.$$

Let $x, y \in S$ be such that

$$\tfrac{1}{2}x - s = \tfrac{1}{2}y - t.$$

Then

$$t - s = \tfrac{1}{2}y - \tfrac{1}{2}x.$$

Since $S$ is symmetric, then $-x \in S$, and since $S$ is convex, then

$$\tfrac{1}{2}y + \tfrac{1}{2}(-x) \in S.$$

Hence,

$$t - s \in S \cap L,$$

and $t - s \ne 0$, as required. □

We summarize the contents of this section as a closing feature of this chapter. Minkowski’s convex body result given in Theorem 4.10 is an exceptionally simple test to guarantee a convex symmetric set contains a nonzero lattice point. It has a broad range of applications, some of which are beyond the scope of this book (see [64], for instance). However, we may conclude with the application of Minkowski’s result to verify (4.1) on page 159. Let $\alpha$ be a real number such that $0 < \alpha < 1$ and let $n \in \mathbb{N}$. Define

$$S = \Big\{(x, y) \in \mathbb{R}^2 : -n - \tfrac{1}{2} \le x \le n + \tfrac{1}{2}, \text{ and } |x\alpha - y| < \tfrac{1}{n}\Big\}.$$

This is a convex, symmetric set with area

$$(2n + 1)\frac{2}{n} = 4 + \frac{2}{n} > 4.$$

Therefore, Minkowski, applied to the lattice $\mathbb{Z}^2$ whose fundamental parallelotope has volume 1, tells us that there is a nonzero lattice point $(q, p) \in S$, say. Here $q \ne 0$, since $q = 0$ would force $|p| < 1/n \le 1$ and hence $p = 0$. By symmetry we may therefore assume without loss of generality that $q > 0$. Hence, by the definition of $S$, $q \le n$ and

$$\Big|\alpha - \frac{p}{q}\Big| < \frac{1}{qn} \le \frac{1}{q^2},$$

which is (4.1).
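The existence statement above can be illustrated numerically. The following is a minimal sketch using exhaustive search rather than any geometric argument; the function name `dirichlet_approximation` and the sample value of $\alpha$ are our own assumptions, and floating-point arithmetic is assumed to be accurate enough for the small inputs shown.

```python
import math

def dirichlet_approximation(alpha, n):
    """Search 1 <= q <= n for an integer p with |q*alpha - p| < 1/n.

    Dividing by q gives |alpha - p/q| < 1/(q*n) <= 1/q**2.
    Returns (p, q) for the smallest such q, or None if the search fails.
    """
    for q in range(1, n + 1):
        p = round(q * alpha)          # nearest integer to q*alpha
        if abs(q * alpha - p) < 1 / n:
            return p, q
    return None

alpha = math.sqrt(2) - 1              # hypothetical sample value in (0, 1)
p, q = dirichlet_approximation(alpha, 10)
print(p, q, abs(alpha - p / q), 1 / (q * 10))
```

For this sample the search stops at $q = 5$ with $p = 2$, and the printed error is below $1/(qn) = 0.02$.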

Exercises

4.10. Let $G$ be a free abelian group with basis

$$S = \{g_1, g_2, \ldots, g_n\}.$$

Suppose that $A = (a_{i,j})$ is an $n \times n$ matrix with entries from $\mathbb{Z}$. Prove that the elements

$$h_i = \sum_{j=1}^{n} a_{i,j}g_j \quad \text{for } i = 1, 2, \ldots, n$$

form a basis for $G$ if and only if $A \in GL(n, \mathbb{Z})$.

4.11. Let $G$ be a free abelian group of rank $n$, and let $H$ be a subgroup of $G$. Prove that $G/H$ is finite if and only if the rank of $H$ is $n$. Conclude that a subgroup $H$ of a lattice $L$ that has finite index in $L$ must also be a lattice. (See Exercise 4.10.)


Biography 4.8 Hermann Minkowski (1864–1909) was born on June 22, 1864 in Alexotas of what was then the Russian empire, but is now Kaunas, Lithuania. He studied at the Universities of Berlin, then Königsberg where he received his doctorate in 1885. He taught at both Bonn and Zurich, until Hilbert created a chair for him at Göttingen, which he accepted in 1902 and remained there for the rest of his life. He pioneered the area we now call the geometry of numbers. This led to work on convex bodies and to packing problems (see Remark 4.7 on page 187). He is also known for having laid the groundwork for relativity theory by thinking of space and time as linked together in a four-dimensional space-time continuum. Indeed by 1907, he came to the conclusion that the work of Einstein and others could be best formulated in a non-euclidean space. Later Einstein used these ideas to formulate the general theory of relativity (see also Biography 2.1 on page 73 for Noether’s influence on Einstein’s theory). Furthermore, his geometric insights paved the way for modern functional analysis. He died from a ruptured appendix on January 12, 1909 in Göttingen.

Minkowski is best known for his ideas applied as cited above, especially his creation of the geometry of numbers in 1890. However, he had an early interest in pure mathematics such as his study of binary quadratic forms and continued fractions. In 1907, he published Diophantische Approximationen: Eine Einführung in die Zahlentheorie, which provided an elementary discussion of his work on the geometry of numbers, and the applications to the theories of Diophantine approximation and algebraic numbers.

Chapter 5

Arithmetic Functions

To still be searching what we know not, by what we know, still closing up truth to truth as we find it (for all her body is homogeneal and proportional), this is the golden rule in theology as well as in arithmetic, and makes up the best harmony in a church.

from Areopagitica (1644)

John Milton (1608–1674), British poet

Arithmetic functions, studied in a first course in number theory, are those functions whose domain is $\mathbb{N}$ and whose range is a subset of $\mathbb{C}$ (for instance, see [68, §2.3–§2.5]). In this chapter we look at a more in-depth analysis of these functions, especially from the perspective of their behaviour for large values of $n$. Actually plotting an arithmetic function seems to show chaotic behaviour, but most such functions do behave well on “average,” a term we will define precisely in §5.2. First, we need a strong result from the number-theoretic toolkit provided in the following.

5.1 The Euler–Maclaurin Summation Formula

We seek to establish the formula in the title, and explore some of the applications such as Fourier series of Bernoulli polynomials (see Definitions 5.2 on page 192 and 5.3 on page 194 as well as Biographies 5.1 on page 197 and 5.4 on page 207). First, we need to introduce the following, which first appeared in the posthumous work Ars Conjectandi by Jacob (Jacques) Bernoulli in 1713. Also, the reader should be familiar with the background on the basics concerning series (for instance, see [68, Appendix A, pp. 307–310]).


Definition 5.1 Bernoulli Numbers

In the Taylor series, for a complex variable $x$,

$$F(x) = \frac{x}{e^x - 1} = \sum_{j=0}^{\infty} \frac{B_jx^j}{j!},$$

the coefficients $B_j$ are called the Bernoulli numbers.

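Example 5.1 Using the recursion formula given in Exercise 5.2 on page 206, we calculate the first few Bernoulli numbers:

$$\begin{array}{c|ccccccccccc}
n & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline
B_n & 1 & -\frac{1}{2} & \frac{1}{6} & 0 & -\frac{1}{30} & 0 & \frac{1}{42} & 0 & -\frac{1}{30} & 0 & \frac{5}{66}
\end{array}$$

$$\begin{array}{c|ccccccccc}
n & 11 & 12 & 13 & 14 & 15 & 16 & 17 & 18 & 19 \\ \hline
B_n & 0 & -\frac{691}{2730} & 0 & \frac{7}{6} & 0 & -\frac{3617}{510} & 0 & \frac{43867}{798} & 0
\end{array}$$

Example 5.1 suggests that $B_{2n+1} = 0$ for all $n \in \mathbb{N}$ and this is indeed the case (see Exercise 5.1 on page 205).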
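The table can be reproduced with exact rational arithmetic. The following is a minimal sketch. Since the recursion of Exercise 5.2 is not reproduced in this section, the code assumes the standard recurrence $\sum_{j=0}^{n}\binom{n+1}{j}B_j = 0$ for $n \ge 1$ with $B_0 = 1$, which may differ in form from the exercise; the function name `bernoulli_numbers` is our own.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] as exact fractions (convention B_1 = -1/2).

    Assumed recurrence: sum_{j=0}^{n} C(n+1, j) * B_j = 0 for n >= 1.
    """
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        partial = sum(comb(n + 1, j) * B[j] for j in range(n))
        # The coefficient of B_n in the recurrence is C(n+1, n) = n + 1.
        B.append(-partial / (n + 1))
    return B

for n, b in enumerate(bernoulli_numbers(19)):
    print(n, b)
```

The printed values can be compared entry by entry with the table in Example 5.1.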

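Suppose that $x$, $s$ are complex variables and set

$$F(s, x) = \frac{se^{xs}}{e^s - 1} = \sum_{n=0}^{\infty} \frac{B_n(x)s^n}{n!}, \quad \text{for } |s| < 2\pi. \qquad (5.1)$$

Then by comparing coefficients of $s^n$ in

$$\sum_{n=0}^{\infty} \frac{B_n(x)s^n}{n!} = F(s, x) = F(s)e^{xs} = \sum_{n=0}^{\infty} \frac{B_ns^n}{n!}\sum_{j=0}^{\infty} \frac{x^js^j}{j!},$$

we get the following.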

Definition 5.2 Bernoulli Polynomials

For $x \in \mathbb{C}$,

$$B_n(x) = \sum_{j=0}^{n} \binom{n}{j}B_jx^{n-j},$$

called the $n$-th Bernoulli polynomial.

Example 5.2 Using the recursion formula in Exercise 5.2 on page 206 again, we calculate the first few Bernoulli polynomials:

$$B_0(x) = 1, \quad B_1(x) = x - \frac{1}{2}, \quad B_2(x) = x^2 - x + \frac{1}{6},$$

$$B_3(x) = x^3 - \frac{3}{2}x^2 + \frac{1}{2}x = x(x - 1)\Big(x - \frac{1}{2}\Big),$$

$$B_4(x) = x^4 - 2x^3 + x^2 - \frac{1}{30},$$

$$B_5(x) = x^5 - \frac{5}{2}x^4 + \frac{5}{3}x^3 - \frac{1}{6}x,$$

$$B_6(x) = x^6 - 3x^5 + \frac{5}{2}x^4 - \frac{1}{2}x^2 + \frac{1}{42}.$$
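Definition 5.2 translates directly into a short computation. The following is a minimal sketch in which a polynomial is stored as a list of coefficients indexed by power; the helper names are our own, and the Bernoulli numbers are generated from the standard recurrence $\sum_{j=0}^{n}\binom{n+1}{j}B_j = 0$, which is an assumption about the form of Exercise 5.2.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """[B_0, ..., B_{n_max}] via the assumed recurrence sum C(n+1, j) B_j = 0."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B

def bernoulli_poly(n):
    """Coefficients of B_n(x); entry k is the coefficient of x**k.

    Uses Definition 5.2: B_n(x) = sum_j C(n, j) * B_j * x**(n - j).
    """
    B = bernoulli_numbers(n)
    coeffs = [Fraction(0)] * (n + 1)
    for j in range(n + 1):
        coeffs[n - j] = comb(n, j) * B[j]
    return coeffs

def poly_eval(coeffs, x):
    """Evaluate the polynomial at x with exact rational arithmetic."""
    return sum(c * Fraction(x) ** k for k, c in enumerate(coeffs))

print(bernoulli_poly(3))                          # coefficients of B_3(x), lowest power first
print(poly_eval(bernoulli_poly(3), Fraction(1, 2)))  # B_3 vanishes at x = 1/2
```

The second printed value is zero, in agreement with the factorization of $B_3(x)$ in Example 5.2.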

Now we are in a position to prove the result in the section’s header. We will be invoking the integration by parts formula several times in what follows (see (4.15) on page 172). The following formula has the dual attribution since it was discovered independently and almost simultaneously by the two authors in the first half of the eighteenth century, but neither of them obtained the remainder term displayed in the second line of the theorem, and that is an essential ingredient.

Theorem 5.1 The Euler–Maclaurin Summation Formula

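Let $a < b$ be integers and let $n \in \mathbb{N}$. If $f(x)$ has $n$ continuous derivatives on the interval $[a, b]$, then

$$\sum_{j=a+1}^{b} f(j) = \int_a^b f(x)\,dx + \sum_{i=1}^{n} (-1)^i\frac{B_i}{i!}\Big(f^{(i-1)}(b) - f^{(i-1)}(a)\Big)$$

$$+ \frac{(-1)^{n-1}}{n!}\int_a^b B_n(x - \lfloor x\rfloor)f^{(n)}(x)\,dx.$$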

Proof. If we set

$$\int_0^1 f(x)\,dx = \int_0^1 B_0(x)f(x)\,dx,$$

then we may integrate by parts $n$ times,

$$\int_0^1 f(x)\,dx = \sum_{i=1}^{n} (-1)^{i-1}\frac{B_i(x)}{i!}f^{(i-1)}(x)\Big|_0^1 + (-1)^n\int_0^1 \frac{B_n(x)}{n!}f^{(n)}(x)\,dx$$

$$= \sum_{i=1}^{n} (-1)^{i-1}\frac{B_i}{i!}\Big(f^{(i-1)}(1) - f^{(i-1)}(0)\Big) + f(1) + (-1)^n\int_0^1 \frac{B_n(x)}{n!}f^{(n)}(x)\,dx,$$

where the $f(1)$ comes from the fact that we must add it back on given that $B_1 = -1/2$, but $B_1(1) = 1/2$ by Exercise 5.4, whereas $B_i(1) = B_i$ for $i > 1$, and $B_i(0) = B_i$ by Definition 5.2 on page 192. Now by replacing $f(x)$ by $f(j - 1 + x)$, we obtain that $f(1)$ becomes $f(j)$ so by the above,

$$f(j) = \int_0^1 f(j - 1 + x)\,dx + \sum_{i=1}^{n} (-1)^i\frac{B_i}{i!}\Big(f^{(i-1)}(j) - f^{(i-1)}(j - 1)\Big)$$

$$+ (-1)^{n-1}\int_0^1 \frac{B_n(x)}{n!}f^{(n)}(j - 1 + x)\,dx.$$

Since we have

$$\sum_{j=a+1}^{b}\int_0^1 f(j - 1 + x)\,dx = \int_a^b f(x)\,dx,$$

$$\sum_{j=a+1}^{b}\Big(f^{(i-1)}(j) - f^{(i-1)}(j - 1)\Big) = f^{(i-1)}(b) - f^{(i-1)}(a),$$

and

$$\sum_{j=a+1}^{b}\int_0^1 B_n(x)f^{(n)}(j - 1 + x)\,dx = \int_a^b B_n(x - \lfloor x\rfloor)f^{(n)}(x)\,dx,$$

then we have secured the result. □
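For a polynomial $f$ the formula can be tested exactly, because choosing $n$ larger than the degree of $f$ makes $f^{(n)} = 0$, so the remainder integral vanishes. The following is a minimal sketch under that assumption; polynomials are coefficient lists indexed by power, all helper names are our own, and the Bernoulli numbers come from the standard recurrence $\sum_{j=0}^{n}\binom{n+1}{j}B_j = 0$, which is assumed rather than taken from the text.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli_numbers(n_max):
    """[B_0, ..., B_{n_max}] via the assumed recurrence sum C(n+1, j) B_j = 0."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B

def poly_eval(c, x):
    return sum(ck * Fraction(x) ** k for k, ck in enumerate(c))

def poly_deriv(c):
    return [k * ck for k, ck in enumerate(c)][1:]

def poly_integral(c, a, b):
    """Exact integral of the polynomial over [a, b]."""
    anti = [Fraction(0)] + [Fraction(ck) / (k + 1) for k, ck in enumerate(c)]
    return poly_eval(anti, b) - poly_eval(anti, a)

def euler_maclaurin_sum(c, a, b):
    """Right-hand side of Theorem 5.1 for the polynomial with coefficients c.

    n = len(c) exceeds the degree, so the n-th derivative is zero and the
    remainder term contributes nothing.
    """
    n = len(c)
    B = bernoulli_numbers(n)
    total = poly_integral(c, a, b)
    d = list(c)                      # d holds the coefficients of f^(i-1)
    for i in range(1, n + 1):
        total += (-1) ** i * B[i] / factorial(i) * (poly_eval(d, b) - poly_eval(d, a))
        d = poly_deriv(d)
    return total

cubic = [0, 0, 0, 1]                 # f(x) = x^3
print(euler_maclaurin_sum(cubic, 0, 10), sum(j ** 3 for j in range(1, 11)))
```

Both printed numbers are the sum of the first ten cubes, so the two sides of the theorem agree for this choice of $f$.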

In order to be able to apply Theorem 5.1 to Fourier series, we need to know more about the functions $f_n(x) = B_n(x - \lfloor x\rfloor)$ in the remainder term of the Euler–Maclaurin summation formula. Thus, we need the formal definition in order to introduce such expansions for $f_n(x)$.

Definition 5.3 Fourier Series

A Fourier series is a periodic function $f$, defined for $x \in [-\pi, \pi]$, given by the convergent series

$$f(x) = \frac{a_0}{2} + \sum_{j=1}^{\infty}\big(a_j\cos(jx) + b_j\sin(jx)\big).$$

The study of Fourier series is known as harmonic analysis.

It is known that one may compute the Fourier series of a function $f$ of period 1 via the following:

$$f(x) = \frac{a_0}{2} + \sum_{j=1}^{\infty}\big(a_j\cos(2\pi jx) + b_j\sin(2\pi jx)\big),$$

where

$$a_0 = 2\int_0^1 f(x)\,dx,$$

$$a_j = 2\int_0^1 f(x)\cos(2\pi jx)\,dx,$$

and

$$b_j = 2\int_0^1 f(x)\sin(2\pi jx)\,dx.$$


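Since $f_n(x) = B_n(x - \lfloor x\rfloor)$ is periodic with period length 1, we have

$$f_n(x) = \frac{a_0^{(n)}}{2} + \sum_{j=1}^{\infty}\Big(a_j^{(n)}\cos(2\pi jx) + b_j^{(n)}\sin(2\pi jx)\Big),$$

with

$$a_0^{(n)} = 2\int_0^1 B_n(x)\,dx,$$

$$a_j^{(n)} = 2\int_0^1 B_n(x)\cos(2\pi jx)\,dx,$$

and

$$b_j^{(n)} = 2\int_0^1 B_n(x)\sin(2\pi jx)\,dx.$$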

However, by Exercises 5.4 and 5.6 on page 206 in conjunction with Definition 5.2 on page 192, it holds for any $n \in \mathbb{N}$ that

$$a_0^{(n)} = 2\int_0^1 B_n(x)\,dx = 2\int_0^1 \frac{B'_{n+1}(x)}{n + 1}\,dx = \frac{2}{n + 1}\big(B_{n+1}(x)\big)\Big|_0^1$$

$$= \frac{2}{n + 1}\big(B_{n+1}(1) - B_{n+1}(0)\big) = \frac{2}{n + 1}\big(B_{n+1} - B_{n+1}\big) = 0.$$

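Also, using integration by parts

$$a_j^{(n)} = 2\int_0^1 B_n(x)\cos(2\pi jx)\,dx = 2\int_0^1 B_n(x)\,d\Big(\frac{\sin(2\pi jx)}{2\pi j}\Big)$$

$$= 2B_n(x)\frac{\sin(2\pi jx)}{2\pi j}\Big|_0^1 - \frac{1}{\pi j}\int_0^1 B'_n(x)\sin(2\pi jx)\,dx$$

$$= -\frac{n}{\pi j}\int_0^1 B_{n-1}(x)\sin(2\pi jx)\,dx = -\frac{n}{2\pi j}b_j^{(n-1)},$$

for any $n \ge 2$ and $a_j^{(1)} = 0$ for any $j \in \mathbb{N}$. Furthermore, again employing integration by parts,

$$b_j^{(1)} = -2B_1(x)\frac{\cos(2\pi jx)}{2\pi j}\Big|_0^1 + \frac{1}{\pi j}\int_0^1 \cos(2\pi jx)\,dx = -\frac{1}{\pi j},$$

and for any $n \ge 2$,

$$b_j^{(n)} = -2B_n(x)\frac{\cos(2\pi jx)}{2\pi j}\Big|_0^1 + \frac{1}{\pi j}\int_0^1 B'_n(x)\cos(2\pi jx)\,dx$$

$$= \frac{n}{\pi j}\int_0^1 B_{n-1}(x)\cos(2\pi jx)\,dx = \frac{n}{2\pi j}a_j^{(n-1)}.$$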

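Thus far, we have demonstrated that for any $j \in \mathbb{N}$,

$$a_0^{(n)} = 0, \quad a_j^{(1)} = 0, \quad b_j^{(1)} = -\frac{1}{\pi j},$$

$$a_j^{(n)} = -\frac{n}{2\pi j}b_j^{(n-1)} = -\frac{n(n - 1)}{(2\pi j)^2}a_j^{(n-2)} \quad \text{for any } n \ge 2,$$

and

$$b_j^{(n)} = \frac{n}{2\pi j}a_j^{(n-1)} = -\frac{n(n - 1)}{(2\pi j)^2}b_j^{(n-2)} \quad \text{for any } n \ge 2.$$

Continuing in this fashion, an inductive process gives us that for any $j, k \in \mathbb{N}$,

$$a_j^{(2k-1)} = 0, \quad a_j^{(2k)} = (-1)^{k-1}\frac{2(2k)!}{(2\pi j)^{2k}},$$

$$b_j^{(2k)} = 0, \quad \text{and} \quad b_j^{(2k-1)} = (-1)^k\frac{2(2k - 1)!}{(2\pi j)^{2k-1}}.$$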

We have therefore proved the following.

Theorem 5.2 Fourier Series for Bernoulli Polynomials

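For all $x \in \mathbb{R}$ and $k \in \mathbb{N}$,

$$B_{2k-1}(x - \lfloor x\rfloor) = (-1)^k2(2k - 1)!\sum_{j=1}^{\infty}\frac{\sin(2\pi jx)}{(2\pi j)^{2k-1}}, \quad \text{for } k \ge 2, \qquad (5.2)$$

$$B_{2k}(x - \lfloor x\rfloor) = (-1)^{k-1}2(2k)!\sum_{j=1}^{\infty}\frac{\cos(2\pi jx)}{(2\pi j)^{2k}}, \quad \text{for } k \ge 1. \qquad (5.3)$$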

Remark 5.1 We have deliberately left out from Theorem 5.2 the case of

$$B_1(x - \lfloor x\rfloor) = x - \lfloor x\rfloor - \frac{1}{2}$$

since at integer values of $x$ the Fourier series vanishes, while $B_1(x - \lfloor x\rfloor)$ jumps from $+1/2$ to $-1/2$ there. This is the only case where $B_n(x - \lfloor x\rfloor)$ is not continuous of period 1. Note, as well, that by setting $x = 0$ in (5.2), we get that $B_{2k-1} = 0$ for any $k \ge 2$, which is Exercise 5.1 on page 205. Similarly, we have the next result.

Corollary 5.1 If $k \in \mathbb{N}$, then $(-1)^{k-1}B_{2k} > 0$.


Proof. Set $x = 0$ in (5.3) to get

$$B_{2k} = 2(-1)^{k-1}\frac{(2k)!}{(2\pi)^{2k}}\sum_{n=1}^{\infty}\frac{1}{n^{2k}} \qquad (5.4)$$

from which the result follows. □

Biography 5.1 Jean Baptiste Joseph Fourier (1768–1830) was born on March 21, 1768 in Auxerre, Bourgogne, France. His early teenage schooling began at the École Militaire of Auxerre, and he later became a teacher at the Benedictine college there. Unfortunately, he got enmeshed in the politics of the French revolution. By July of 1794, he was arrested and imprisoned, then freed later that year but was arrested again and imprisoned in 1795. However, by September 1, 1795, he was teaching at the École Polytechnique where he had been during his brief stint of freedom earlier. He stayed out of trouble, remained free, and in 1797 succeeded Lagrange to the chair of analysis and mechanics. However, in 1797, he joined Napoleon’s army in its invasion of Egypt, acting as a scientific advisor. While Fourier was in Cairo, he assisted in the founding of the Cairo Institute, and was one of the members of the division of mathematics, later being elected secretary to the Institute. He held this position during the entirety of France’s occupation of Egypt. In 1801, Fourier returned to his position as Professor of Analysis at the École Polytechnique. However, Napoleon requested that Fourier go to Grenoble as Prefect. Although he did not want to leave the world of academe, he could not refuse the request and so he went, where he spent an inordinate amount of time on the historical document Description of Egypt, which was completed in 1810, largely a rewriting of Napoleon’s influence there. Yet it was in Grenoble that Fourier accomplished his best work on the theory of heat. By 1807 he had completed his memoir On the Propagation of Heat in Solid Bodies, which contained expansions of functions, which we now call Fourier series. In 1811, he was awarded a prize by the Paris Institute for this work. When Napoleon was defeated on July 1, 1815, Fourier returned to Paris, where he was elected to the Académie des Sciences in 1817. In 1822, Fourier filled the post as Secretary to the mathematical section of the Académie des Sciences, a vacancy created by the death of Delambre. In 1822, Fourier published Théorie analytique de la chaleur, which was a prize winning essay. Fourier continued his mathematical output during his eight years in Paris. He died there on May 16, 1830. Fourier’s work paved the way for subsequent work on trigonometric series and the theory of functions of a real variable, which are vital areas in today’s modern world.

Remark 5.2 Bernoulli numbers are among the most distinguished and important numbers in all of mathematics. Indeed, they play a vital role in number theory, especially in connection with Fermat’s last theorem; see Remark 1.17 on page 41, as well as Biography 5.6 on page 228.

The Bernoulli numbers may also be calculated from the integral

$$B_n = \frac{n!}{2\pi i}\int \frac{z}{e^z - 1}\,\frac{dz}{z^{n+1}},$$

as well as from the derivative

$$B_n = \Big[\frac{d^n}{dx^n}\Big(\frac{x}{e^x - 1}\Big)\Big]_{x=0},$$

and they have connections to the Riemann $\zeta$-function

$$\zeta(s) = \sum_{j=1}^{\infty} j^{-s} = \prod_{p = \text{prime}} (1 - p^{-s})^{-1},$$

via the identity given in (5.4), namely the following formula first proved by Euler (see Exercise 10.14 on page 346),

$$\zeta(2k) = \frac{(2\pi)^{2k}}{2(2k)!}|B_{2k}| \qquad (5.5)$$

(see [68, §1.9, pp. 65–72]). We will look, in detail, at the Riemann $\zeta$-function in §5.3.
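Identity (5.5) is easy to test numerically against partial sums of the defining series. The following is a minimal sketch in floating point; the function names are our own, and the Bernoulli values passed in are those listed in Example 5.1.

```python
import math
from fractions import Fraction

def zeta_even(k, B2k):
    """Right-hand side of (5.5): (2*pi)^(2k) * |B_2k| / (2 * (2k)!)."""
    return (2 * math.pi) ** (2 * k) * float(abs(B2k)) / (2 * math.factorial(2 * k))

def zeta_partial(s, terms):
    """Partial sum of the series sum_{j >= 1} j^(-s)."""
    return sum(j ** (-s) for j in range(1, terms + 1))

print(zeta_even(1, Fraction(1, 6)), math.pi ** 2 / 6)       # zeta(2)
print(zeta_even(2, Fraction(-1, 30)), zeta_partial(4, 1000))  # zeta(4)
```

The first pair agrees to rounding error, and the second pair differs only by the tail of the series, which is of order $10^{-10}$ after 1000 terms.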

Now we proceed to demonstrate yet more applications of the Euler–Maclaurin summation formula by deriving, from it, a well-known and very accurate approximation for $n!$. First of all, we need the following basic formula from elementary calculus.

Lemma 5.1 Integral of Powers of Sine

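For any $n \in \mathbb{N}$,

$$\int_0^{\pi/2}\sin^n(x)\,dx = \begin{cases}\dfrac{(n - 1)(n - 3)\cdots 3\cdot 1}{n(n - 2)\cdots 4\cdot 2}\cdot\dfrac{\pi}{2} & \text{if } 2 \mid n,\\[2ex] \dfrac{(n - 1)(n - 3)\cdots 4\cdot 2}{n(n - 2)\cdots 5\cdot 3} & \text{if } 2 \nmid n.\end{cases} \qquad (5.6)$$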

Proof. If we set

$$I_n = \int_0^{\pi/2}\sin^n x\,dx,$$

then using integration by parts we get

$$I_n = \int_0^{\pi/2}(\sin^{n-1}x)(\sin x)\,dx = -(\sin^{n-1}x)(\cos x)\Big|_0^{\pi/2}$$

$$+ (n - 1)\int_0^{\pi/2}(\sin^{n-2}x)(\cos^2x)\,dx = (n - 1)(I_{n-2} - I_n).$$

Therefore,

$$I_n = \frac{n - 1}{n}\cdot I_{n-2} \qquad (5.7)$$

for any integer $n \ge 2$. By including $I_0 = \pi/2$, and $I_1 = 1$, we get (5.6) from the recursion in (5.7). □

From the above we are able to obtain the following renowned formula (see Biography 5.3 on page 205).

Theorem 5.3 The Wallis Formula

Given $n \in \mathbb{N}$,

$$\lim_{n\to\infty}\frac{2^{2n}(n!)^2}{(2n)!\sqrt{n}} = \sqrt{\pi}. \qquad (5.8)$$

Proof. Since for any $n \in \mathbb{N}$ we have

$$0 < I_{2n+1} < I_{2n} < I_{2n-1},$$

by Lemma 5.1 we have

$$0 < \frac{(2n)(2n - 2)\cdots 4\cdot 2}{(2n + 1)(2n - 1)\cdots 5\cdot 3} < \frac{(2n - 1)(2n - 3)\cdots 3\cdot 1}{2n(2n - 2)\cdots 4\cdot 2}\cdot\frac{\pi}{2} < \frac{(2n - 2)(2n - 4)\cdots 4\cdot 2}{(2n - 1)(2n - 3)\cdots 5\cdot 3}.$$

By inverting this inequality and multiplying through by

$$\frac{(2n - 2)(2n - 4)\cdots 4\cdot 2}{(2n - 1)(2n - 3)\cdots 5\cdot 3\cdot 1}\cdot\pi$$

we get

$$\frac{2n + 1}{2n}\cdot\pi > \frac{1}{n}\Big(\frac{(2n)(2n - 2)(2n - 4)\cdots 4\cdot 2}{(2n - 1)(2n - 3)\cdots 5\cdot 3\cdot 1}\Big)^2 > \pi.$$

By letting $n \to \infty$ and observing that the outside values go to $\pi$, the center is squeezed to $\pi$ as well. Therefore,

$$\lim_{n\to\infty}\frac{1}{n}\Big(\frac{(2n)(2n - 2)(2n - 4)\cdots 4\cdot 2}{(2n - 1)(2n - 3)\cdots 5\cdot 3\cdot 1}\Big)^2 = \pi,$$

namely

$$\lim_{n\to\infty}\frac{1}{n}\Big(\frac{2^{2n}(n!)^2}{(2n)!}\Big)^2 = \pi,$$

and by taking square roots we get (5.8). □
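The squeeze in the proof gives explicit bounds, $\sqrt{\pi} < 2^{2n}(n!)^2/((2n)!\sqrt{n}) < \sqrt{\pi(2n+1)/(2n)}$, that can be observed numerically. The following is a minimal sketch; the function name `wallis_ratio` is our own, and the logarithm of the factorial is taken from `math.lgamma` to avoid overflow, which is an implementation choice rather than part of the argument.

```python
import math

def wallis_ratio(n):
    """2^(2n) * (n!)^2 / ((2n)! * sqrt(n)), computed through logarithms."""
    log_value = (2 * n * math.log(2)
                 + 2 * math.lgamma(n + 1)      # log((n!)^2)
                 - math.lgamma(2 * n + 1)      # log((2n)!)
                 - 0.5 * math.log(n))
    return math.exp(log_value)

for n in (1, 10, 100, 1000, 10000):
    lower = math.sqrt(math.pi)
    upper = math.sqrt(math.pi * (2 * n + 1) / (2 * n))
    print(n, lower, wallis_ratio(n), upper)
```

Each printed ratio lies between the two bounds, and the gap between the bounds shrinks as $n$ grows.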

Now we have one more result before we present the approximation for n!.


Definition 5.4 Asymptotically Equal

In what follows the notation

$$f(n) \sim g(n)$$

will signify that

$$\lim_{n\to\infty}\frac{f(n)}{g(n)} = 1,$$

which is sometimes referenced as $f$ and $g$ being asymptotically equal.

The following is a renowned constant (see Biography 5.2 on page 204).

Theorem 5.4 Stirling’s Constant

For $N \in \mathbb{N}$,

$$\lim_{N\to\infty}\Big(\log_e(N!) - \Big(N + \frac{1}{2}\Big)\log_e(N) + N\Big) = \log_e\sqrt{2\pi}.$$

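Proof. Let

$$C = \lim_{N\to\infty}\Big(\log_e(N!) - \Big(N + \frac{1}{2}\Big)\log_e(N) + N\Big).$$

Then

$$e^C = \lim_{N\to\infty}\frac{N!\,e^N}{N^{N+1/2}}.$$

In other words,

$$N! \sim e^{C-N}N^{N+1/2}. \qquad (5.9)$$

Also, by inverting Wallis’ formula (5.8) on page 199, we get

$$\lim_{n\to\infty}\frac{(2n)!}{(2^nn!)^2}\sqrt{n} = \frac{1}{\sqrt{\pi}}.$$

Now by using $N = 2n$ and $N = n$ in the latter employing (5.9), we get

$$\lim_{n\to\infty}\frac{e^{C-2n}(2n)^{2n+1/2}}{(2^ne^{C-n}n^{n+1/2})^2}\sqrt{n} = \frac{1}{\sqrt{\pi}},$$

which simplifies to

$$\frac{\sqrt{2}}{e^C} = \frac{1}{\sqrt{\pi}},$$

from which we get

$$e^C = \sqrt{2\pi},$$

yielding that $C = \log_e\sqrt{2\pi}$. □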

Now we are ready for the approximation for the factorial.


Theorem 5.5 Stirling’s Formula

For any $N \in \mathbb{N}$

$$N! \sim \sqrt{2\pi}\cdot e^{-N}\cdot N^{N+1/2} = \sqrt{2\pi N}\Big(\frac{N}{e}\Big)^N. \qquad (5.10)$$

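Proof. For

$$f(x) = \log_e(x)$$

and $n \in \mathbb{N}$, the $n$-th derivative is given by

$$f^{(n)}(x) = (-1)^{n-1}(n - 1)!\,x^{-n}.$$

We now apply Theorem 5.1 on page 193 to $f(x)$, with $a = 1$, $b = N \ge 2$, and $n = 2k$, to get

$$\log_e(N!) = \sum_{j=2}^{N}\log_e(j) = \int_1^N\log_e(x)\,dx + \sum_{j=1}^{2k}(-1)^j\frac{B_j}{j!}\Big(f^{(j-1)}(N) - f^{(j-1)}(1)\Big) + \frac{1}{2k}\int_1^N B_{2k}(x - \lfloor x\rfloor)x^{-2k}\,dx$$

$$= \int_1^N\log_e(x)\,dx - B_1\big(f(N) - f(1)\big) + \sum_{j=2}^{2k}(-1)^j\frac{B_j}{j!}\Big[(-1)^{j-2}N^{1-j}(j - 2)! - (-1)^{j-2}(j - 2)!\Big] + \frac{1}{2k}\int_1^N B_{2k}(x - \lfloor x\rfloor)x^{-2k}\,dx$$

$$= \int_1^N\log_e(x)\,dx + \frac{\log_e(N)}{2} + \sum_{i=1}^{k}\frac{B_{2i}(2i - 2)!(N^{1-2i} - 1)}{(2i)!} + \int_1^N\frac{B_{2k}(x - \lfloor x\rfloor)x^{-2k}}{2k}\,dx,$$

where the last step uses $B_1 = -1/2$ and $B_j = 0$ for odd $j \ge 3$. Using integration by parts on the first integral while rewriting the remainder yields that the above equals

$$\Big(N + \frac{1}{2}\Big)\log_e(N) - N + 1 + \sum_{i=1}^{k}\frac{B_{2i}}{(2i - 1)2i}N^{1-2i} + \frac{1}{2k}\int_1^N B_{2k}(x - \lfloor x\rfloor)x^{-2k}\,dx - \sum_{i=1}^{k}\frac{B_{2i}}{(2i - 1)2i}. \qquad (5.11)$$

Claim 5.1 For $k \in \mathbb{N}$,

$$\frac{1}{2k}\int_1^{\infty}B_{2k}(x - \lfloor x\rfloor)x^{-2k}\,dx = \log_e(\sqrt{2\pi}) + \sum_{i=1}^{k}\frac{B_{2i}}{(2i - 1)2i} - 1.$$

From (5.11),

$$\frac{1}{2k}\int_1^{\infty}B_{2k}(x - \lfloor x\rfloor)x^{-2k}\,dx = \lim_{N\to\infty}\frac{1}{2k}\int_1^N B_{2k}(x - \lfloor x\rfloor)x^{-2k}\,dx$$

$$= \lim_{N\to\infty}\Big[\log_e(N!) - \Big(N + \frac{1}{2}\Big)\log_e(N) + N\Big] - 1 - \lim_{N\to\infty}\Big(\sum_{i=1}^{k}\frac{B_{2i}}{(2i - 1)2i}N^{1-2i}\Big) + \sum_{i=1}^{k}\frac{B_{2i}}{(2i - 1)2i}$$

$$= \log_e(\sqrt{2\pi}) - 1 + \sum_{i=1}^{k}\frac{B_{2i}}{(2i - 1)2i},$$

by Theorem 5.4 on page 200, which is the claim.

Plugging the result of Claim 5.1 into (5.11), we get

$$\lim_{N\to\infty}\log_e(N!) = \lim_{N\to\infty}\Big[\Big(N + \frac{1}{2}\Big)\log_e(N) - N\Big] + \log_e(\sqrt{2\pi}),$$

and by rewriting using the laws for logs,

$$\lim_{N\to\infty}\Big[\log_e\Big(\frac{N!}{N^{N+1/2}}\Big) + N\Big] = \log_e(\sqrt{2\pi}),$$

and exponentiating,

$$\lim_{N\to\infty}\Big(\frac{N!}{N^{N+1/2}}\cdot e^N\Big) = \sqrt{2\pi},$$

namely,

$$\frac{N!}{N^{N+1/2}}\cdot e^N \sim \sqrt{2\pi}.$$

In other words,

$$N! \sim \sqrt{2\pi}\,e^{-N}N^{N+1/2} = \sqrt{2\pi N}\Big(\frac{N}{e}\Big)^N,$$

as required. □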
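The quality of the approximation (5.10) can be seen by tabulating the ratio of $N!$ to the Stirling estimate. The following is a minimal sketch in floating point; the function name `stirling` and the sample values of $N$ are our own choices.

```python
import math

def stirling(N):
    """Stirling's approximation sqrt(2*pi*N) * (N/e)^N from (5.10)."""
    return math.sqrt(2 * math.pi * N) * (N / math.e) ** N

sample = (1, 5, 10, 50, 100)
ratios = [math.factorial(N) / stirling(N) for N in sample]
for N, r in zip(sample, ratios):
    print(N, r)
```

The ratios decrease toward 1 as $N$ grows, which is what asymptotic equality in the sense of Definition 5.4 asserts.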

One of the really slick applications of the Euler–Maclaurin summation formula is to Euler’s constant (4.13), which we introduced in the discussion of transcendence on page 172. (We do not know if this constant is irrational, let alone transcendental.) The definition given in (4.13) is an exceptionally bad method for computing $\gamma$ given by

$$\lim_{N\to\infty}\Big(\sum_{n=1}^{N}\frac{1}{n} - \log_eN\Big)$$

since we are taking the limit of a quantity that is within a constant times $N^{-1}$ of $\gamma$. This means that we require approximately $10^{10}$ summation terms to compute $\gamma$ to ten decimal places. Even using a computer to do this will lead to astronomical round-off errors and so the loss of significant figures is devastating. Euler–Maclaurin comes to the rescue. In Theorem 5.1 on page 193, take $f(x) = 1/x$, $n = m$, $a = 1$, and $b = N$. Using the techniques of this section such as used in the derivation of Stirling’s approximation, it follows that

$$\sum_{n=1}^{N}\frac{1}{n} = \log_eN + \gamma + \frac{1}{2N} - \sum_{j=1}^{m}\frac{B_{2j}}{2j}N^{-2j} + R_{2m}(N), \qquad (5.12)$$

where

$$|R_{2m}(N)| \le \frac{|B_{2m}|}{2m}N^{-2m}. \qquad (5.13)$$

We now demonstrate how the estimates given by (5.12) are far more precise than that given in (4.13) can be for $\gamma$. We choose small values for pedagogical reasons, but larger values for $m$ and $N$ yield more precision.

Example 5.3 Let $m = 5$ and $N = 8$. Then

$$2.717857142857 = \sum_{n=1}^{8}\frac{1}{n} = \log_e8 + \gamma + \frac{1}{16} - \frac{B_2}{2}8^{-2} - \frac{B_4}{4}8^{-4} - \frac{B_6}{6}8^{-6} - \frac{B_8}{8}8^{-8} - \frac{B_{10}}{10}8^{-10} + R_{10}(8).$$

By Example 5.1 on page 192, we know the values of $B_{2j}$ for $j = 1, 2, 3, 4, 5$, so we know from (5.13) that with error no greater than

$$|R_{10}(8)| \le \frac{|B_{10}|}{10}8^{-10} = 0.000000000007055,$$

we have

$$\gamma = 2.717857142857143 - \log_e8 - \frac{1}{16} + \frac{6^{-1}}{2}8^{-2} - \frac{30^{-1}}{4}8^{-4} + \frac{42^{-1}}{6}8^{-6} - 30^{-1}8^{-9} + \frac{66^{-1}\cdot 5}{10}8^{-10} \approx 0.577215664901822.$$

Since higher values for $N$ and $m$ will yield more accurate estimates, we note that the above is accurate within the error expected since

$$\gamma = 0.577215664901532860606512090082402431042\ldots.$$
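The computation in Example 5.3 can be repeated in a few lines. The following is a minimal sketch of (5.12) solved for $\gamma$, with the Bernoulli numbers copied from Example 5.1; the function names are our own, and the result is subject to ordinary double-precision rounding.

```python
import math
from fractions import Fraction

# Even-index Bernoulli numbers from Example 5.1.
B = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42),
     8: Fraction(-1, 30), 10: Fraction(5, 66)}

def gamma_naive(N):
    """Direct use of the definition: H_N - log(N)."""
    return sum(1 / n for n in range(1, N + 1)) - math.log(N)

def gamma_estimate(N, m):
    """Solve (5.12) for gamma, dropping the remainder R_{2m}(N)."""
    harmonic = sum(Fraction(1, n) for n in range(1, N + 1))   # exact H_N
    correction = sum(B[2 * j] / (2 * j) / Fraction(N) ** (2 * j)
                     for j in range(1, m + 1))
    return float(harmonic) - math.log(N) - 1 / (2 * N) + float(correction)

print(gamma_naive(8))           # off already in the second decimal place
print(gamma_estimate(8, 5))     # compare with the value quoted in Example 5.3
```

With only eight terms of the harmonic series, the corrected estimate agrees with $\gamma$ to within the error bound of (5.13), while the naive difference is wrong in the second decimal place.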

This value is sometimes called the Euler–Mascheroni constant since it was calculated to sixteen digits of decimal accuracy by Euler in 1781, but later by Mascheroni to double that length in 1790. However, Mascheroni’s calculations were correct only to the first nineteen digits. In 1809, Soldner correctly computed it to forty decimal digits, which Gauss verified in 1812. The latest calculation was by Alexander Yee and Raymond Chan done March 13, 2009, accurate to 29,844,489,545 decimal digits, the world record at the time of this writing. To check for future updates see:

http://en.wikipedia.org/wiki/Euler-Mascheroni_constant#Known_digits


Biography 5.2 James Stirling (1692–1770) was born in Garden, near Stirling, Scotland. Little is known of his early education, or even his exact birth date. It is known that he matriculated at Balliol College in Oxford on January 18, 1711 with two scholarships, one of which was the Bishop Warner Exhibition and the other was the Snell Exhibition. However, he lost both of them when he refused to swear an oath of allegiance to the king since it went against his Jacobite sympathies. The Jacobite cause was that of King James II of England, also known as James the VII of Scotland (Jacobus in Latin) and his descendants. This king was one of the Stuarts, who were Scottish but not Roman Catholics, and who offered an alternative to the British crown. Stirling’s father was a strong Jacobite supporter and was even imprisoned for his sympathies and accused of high treason when Stirling was only seventeen, but was later acquitted. Stirling himself was charged with blaspheming the British King George, but was acquitted as well. In 1717, Stirling published Lineae Tertii Neutonianae, a generalization of Newton’s theory of plane curves of degree three, as well as results on curves of quickest descent, and on orthogonal trajectories. The latter problem was coined by Leibniz, and was advanced not only by Stirling, but also by Johann Bernoulli, Nicolaus (I) Bernoulli, Nicolaus (II) Bernoulli, and Euler. Stirling solved the problem in 1716. He held the chair at the University of Padua from 1716 to 1722, when he returned to Glasgow. What he did between 1722 and 1724 is not clearly known. Yet he went to London in 1724 where he stayed for the next decade. There he was friends with Newton and was very active mathematically. Indeed, Newton supported Stirling in a bid for fellowship of the Royal Society of London, and on November 3, 1726, Stirling was elected. In 1730, he published Methodus Differentialis, a book on infinite series, summation, interpolation and quadrature, including results on the Gamma function and the hypergeometric function. Theorem 5.5 on page 201 appears in this book as Example 2 of Proposition 28. Thus, this was Stirling’s most important work. In 1735, Stirling returned to Scotland where he was appointed manager of the Scottish mining company, Leadhills, in Lanarkshire. In 1745, he published a paper on the ventilation of mine shafts. In that year arose the greatest of the Jacobite rebellions. On September 17, 1745, Charles Edward, the Young Pretender, entered Edinburgh with his army. Maclaurin played a very active part in the defence of the city against the Jacobites. In fact, he died in 1746 from consequences of the battles in the previous year. Stirling was subsequently considered for his chair at Edinburgh. However, his Jacobite sympathies prevented that from happening. In 1746, Stirling was elected to membership of the Royal Academy of Berlin. In 1752 was his last work in the realm of science when he conducted the first survey of the River Clyde for the Corporation of Glasgow. He fell ill later in his life and died on December 5, 1770 in Edinburgh where he was buried at Greyfriars Churchyard. There his contributions to the theory of infinite series are honoured by a small plaque in the cemetery wall.

In §5.2, we will use results in this section to get asymptotic facts for certain arithmetic functions.


Biography 5.3 John Wallis (1616–1703) was born in Ashford, Kent, England, the son of a minister, who died when John was only six years old. His mother left Ashford when there was an outbreak of the plague in the area. When he was only thirteen, he felt that he was ready for university. However, it was not until 1632 that he entered Emmanuel College, Cambridge. In 1637 he was awarded his bachelor’s degree, and he received his master’s degree in 1640. In that year he was also ordained by the bishop of Winchester and appointed chaplain to Sir Richard Darley at Butterworth in Yorkshire. During the next few years he excelled as a cryptanalyst by deciphering messages sent by the Royalists, who were engaged in a civil war with the Parliamentarians. (For background on this and related historical and cryptological issues, see [67].) By 1649, his support for the Parliamentarians paid off when he was appointed to the Savilian Chair of Geometry at Oxford by Cromwell, who had dismissed the previous chair holder for his Royalist views. (Oliver Cromwell (1599–1658) was a soldier and statesman who was instrumental in the execution of King Charles I on January 30, 1649. Then the monarchy was abolished and Cromwell made himself chairman of the Council of State of the new Commonwealth. By 1653, he had reorganized the Church of England, established Puritanism, brought prosperity to Scotland, and granted Irish representation in Parliament.) Indeed, Wallis held this chair for fifty years until his death. Yet, in 1657, he was appointed as keeper of the University archives there. Wallis is known for his contributions to the foundations of the calculus and was, arguably, the most prominent English mathematician before Newton. His most renowned work was Arithmetica Infinitorum, published in 1656, which built upon Cavalieri’s method of indivisibles.
He contributed further to the history of mathematics by restoring some Greek texts from antiquity such as Ptolemy’s Harmonics, as well as Archimedes’ Sand-reckoner, among others. In the mathematics that he did, Wallis may be said to have helped to build a calculus established upon arithmetical, rather than geometrical, conceptions. This work won the respect and support of his contemporaries such as James Gregory. Those who saw the solution of problems through geometric means opposed this point of view, including Thomas Hobbes, with whom Wallis had an ongoing public dispute that lasted over twenty years. Hobbes’ views of mathematics were rooted in the Greek thought that accepted mathematics as derived from the senses by abstraction from real objects, rather than as an abstract branch of formal logic. Yet the analytic symbolism of Descartes, Fermat, and Wallis may be seen today in the calculus as embodying the rules of differentiation and integration, even the fundamental theorem of calculus. For this and many other contributions, Wallis will be remembered. He died on October 28, 1703 in Oxford, England.

Exercises

5.1. Without using Theorem 5.2 on page 196, prove that the Bernoulli numbers with odd index bigger than one are equal to zero, namely $B_{2n+1} = 0$ for all $n \in \mathbb{N}$.


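5.2. Prove the following recursion formula for Bernoulli numbers for $n \in \mathbb{N}$,

\[ \sum_{i=0}^{n-1} \binom{n}{i} B_i = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1, \end{cases} \]

where $\binom{n}{i}$ is the binomial coefficient.

(Hint: Use the fact that $e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}$.)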

5.3. Prove that $\sum_{j=1}^{\infty} (1/j)$ diverges.

(Hint: Assume $\sum_{j=1}^{\infty} (1/j) = d \in \mathbb{R}$ and reach a contradiction.)

5.4. Prove that, from Definition 5.2 on page 192,

\[ B_n(1) = \begin{cases} 1/2 & \text{if } n = 1, \\ B_n & \text{if } n > 1. \end{cases} \]


5.5. Prove the following result by Jacob Bernoulli on the sums of $n$-th powers, namely that, for every nonnegative $n \in \mathbb{Z}$ and $k \in \mathbb{N}$,

\[ S_n(k) = \sum_{j=1}^{k-1} j^n = \frac{B_{n+1}(k) - B_{n+1}}{n+1} = \frac{1}{n+1} \sum_{j=0}^{n} \binom{n+1}{j} B_j k^{n+1-j}. \]

(Hint: Compare the coefficients of $s^n$ on both sides of $F(s, x) - F(s, x-1)$; see (5.1) on page 192.)

5.6. Prove the following derivative formula for Bernoulli polynomials.

\[ B'_{n+1}(x) = (n+1) B_n(x). \]

(Hint: Replace $x$ by $x+1$ in Equation (S21) on page 422 and differentiate with respect to $x$.)

5.7. Prove that for any real $a \le b$, and integers $n \ge 0$,

\[ \int_a^b B_n(t)\,dt = \frac{1}{n+1}\left(B_{n+1}(b) - B_{n+1}(a)\right). \]



Biography 5.4 Jacob Bernoulli (1654–1705) was born on December 27, 1654 in Basel, Switzerland. He was one of ten children of Nicolaus and Margaretha Bernoulli. His brother Johann (1667–1748) was the tenth child of the union, and the two brothers had an influence on each other’s mathematical development. Jacob was the first to explore the realms of mathematics, and being the pioneer in the family in this regard, he had no tradition to follow as did his brothers after him. In fact, his parents forced him to study philosophy and theology, which he silently resented. However, he obtained a licentiate in theology in 1676, after which he moved to Geneva, where he was employed as a tutor. Then he travelled to France, where he studied with Nicolas Malebranche, a leader among René Descartes’ followers. (Malebranche represented the synthesis of the philosophies of St. Augustine and Descartes. This resulted in the Malebranche doctrine, which says that we see bodies through ideas in God and that God is the only real cause.) In 1681, Bernoulli travelled to the Netherlands, where he met the mathematician Hudde, then to England, where he met with Boyle and Hooke. This began a correspondence with numerous mathematicians that continued over several years. In 1683, he returned to Switzerland to teach at the University in Basel. He studied the work of leading mathematicians there and cultivated an increasing love of mathematics. In 1687, his brother Johann was appointed professor of mathematics at Basel. The two brothers embarked upon a study of mathematical publications, including the calculus proposed by Leibniz (see Biography 4.5 on page 175). However, their collaboration turned to rivalry with numerous public and private recriminations. Yet they both made significant contributions. Jacob’s first such important work was in his 1685 publications on logic, algebra, and probability. In 1689, he published significant work on infinite series and on his law of large numbers. The latter is a mathematical interpretation of probability as relative frequency. This means that if an experiment is carried out for a large number of trials, then the relative frequency with which an event occurs approaches the probability of the event. By 1704, Jacob had published five works on infinite series containing such fundamental results as the fact that $\sum_{j=1}^{\infty} 1/j$ diverges (see Exercise 5.3 on the preceding page). Although Jacob thought he had discovered the latter, it had already been discovered by Mengoli some four decades earlier. In 1690, Jacob published an important result in the history of mathematics by solving a differential equation using, in modern terms, separation of variables. This was the first time that the term integral was employed with its proper meaning for integration. In 1692, he investigated curves, including the logarithmic spiral, and in 1694, conceived of what we now call the lemniscate of Bernoulli. By 1696, he had solved what we now call the Bernoulli equation: $y' = p(x)y + q(x)y^n$. Eight years after his death, the Ars Conjectandi was published in 1713, a book in which the Bernoulli numbers first appear (see Definition 5.1 on page 192). In the book, they appear in his discussion of exponential series. Jacob held his chair at Basel until his death on August 16, 1705, when it was filled by his brother Johann. Jacob was always enthralled with the logarithmic spiral mentioned above. Indeed, he requested that it be carved on his tombstone with the (Latin) inscription I shall arise the same though changed.


5.2 Average Orders

If all the arts aspire to the condition of music, all the sciences aspire to the condition of mathematics.

from Some Turns of Thought in Modern Philosophy (1933)
George Santayana (1863–1952)
Spanish-born American skeptical philosopher

In this section, we look at methods for getting accurate estimates for the behaviour of arithmetic functions for large $n$. More precisely, we look at the following notion.

Definition 5.5 Average Order of Arithmetic Functions

If $f(n)$ is an arithmetic function and $g(n)$ is an elementary function, then we say that $f(n)$ is of the average order of $g(n)$ if

\[ \sum_{j=1}^{n} f(j) \sim \sum_{j=1}^{n} g(j), \]

where $\sim$ is given by Definition 5.4 on page 200.

One of the arithmetic functions studied in a first course in number theory is the number of divisors function $\tau(n)$, which is the number of positive divisors of $n \in \mathbb{N}$. This is the first arithmetic function we explore from the perspective of Definition 5.5. If we were to simply look at $\tau(n)$ as $n$ gets large, we see that $\tau(n)$ is equal to 2 infinitely often, since there are infinitely many primes. Furthermore, since $\tau(p^a) = a + 1$ for any prime $p$ and $a \in \mathbb{N}$, $\tau(n)$ can be made as large as desired infinitely often. However, looking at the average order of $\tau(n)$ tames the process considerably. In order to determine this, we first need the following result (see Biography 3.4 on page 126).

Lemma 5.2 Hermite’s Formula

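For $n \in \mathbb{N}$,

\[ \sum_{j=1}^{n} \tau(j) = 2 \sum_{j=1}^{\lfloor \sqrt{n} \rfloor} \left\lfloor \frac{n}{j} \right\rfloor - \lfloor \sqrt{n} \rfloor^{2}. \tag{5.14} \]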

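Proof. It is easy to see that the number of solutions to $rs = j$ for $r, s \in \mathbb{N}$ is the same as $\tau(j)$. Hence, $\sum_{j=1}^{n} \tau(j)$ is the number of solutions of the inequality

\[ rs \le n, \tag{5.15} \]

for $r, s \in \mathbb{N}$. We partition the solutions of the inequality (5.15) into sets for each given $s \le n$, as follows. Define

\[ T_s = \{ r \in \mathbb{N} : rs \le n \}, \]

and let $t_s$ be the cardinality of $T_s$. We now calculate $t_s$ explicitly. If $s \in \mathbb{N}$, $s \le n$ is fixed, then the number of solutions of $r \le n/s$ is clearly

\[ t_s = \left\lfloor \frac{n}{s} \right\rfloor. \]

Hence,

\[ \sum_{j=1}^{n} \tau(j) = \sum_{s=1}^{n} \left\lfloor \frac{n}{s} \right\rfloor. \tag{5.16} \]

Also, we can split this sum as follows.

\[ \sum_{j=1}^{n} \tau(j) = \sum_{s=1}^{\lfloor \sqrt{n} \rfloor} \left\lfloor \frac{n}{s} \right\rfloor + \sum_{s=\lfloor \sqrt{n} \rfloor + 1}^{n} \left\lfloor \frac{n}{s} \right\rfloor. \]

The second summand counts the pairs $r, s$ with $rs \le n$ and $s > \sqrt{n}$, which forces $r \le \sqrt{n}$. For each such $r$ we have

\[ \sqrt{n} < s \le n/r. \]

There are

\[ \left\lfloor \frac{n}{r} \right\rfloor - \lfloor \sqrt{n} \rfloor \]

such values of $s$, since the cardinality of the set of those $s \le \sqrt{n}$ is $\lfloor \sqrt{n} \rfloor$. Summing over $r \le \sqrt{n}$ and renaming the index $r$ as $s$, we get

\[ \sum_{s=\lfloor \sqrt{n} \rfloor + 1}^{n} \left\lfloor \frac{n}{s} \right\rfloor = \sum_{s=1}^{\lfloor \sqrt{n} \rfloor} \left( \left\lfloor \frac{n}{s} \right\rfloor - \lfloor \sqrt{n} \rfloor \right) = \sum_{s=1}^{\lfloor \sqrt{n} \rfloor} \left\lfloor \frac{n}{s} \right\rfloor - \sum_{s=1}^{\lfloor \sqrt{n} \rfloor} \lfloor \sqrt{n} \rfloor = \sum_{s=1}^{\lfloor \sqrt{n} \rfloor} \left\lfloor \frac{n}{s} \right\rfloor - \lfloor \sqrt{n} \rfloor^{2}. \]

Hence,

\[ \sum_{s=1}^{n} \left\lfloor \frac{n}{s} \right\rfloor = \sum_{j=1}^{n} \tau(j) = 2 \sum_{s=1}^{\lfloor \sqrt{n} \rfloor} \left\lfloor \frac{n}{s} \right\rfloor - \lfloor \sqrt{n} \rfloor^{2}, \]

which is Hermite’s formula. □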
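As a sanity check on Lemma 5.2, the following minimal Python sketch compares both sides of (5.14) for a few values of $n$. The function names `tau`, `tau_sum_direct`, and `tau_sum_hermite` are illustrative choices made here and are not notation from the text.

```python
from math import isqrt

def tau(n: int) -> int:
    """Number of positive divisors of n (trial division up to sqrt(n))."""
    count = 0
    for d in range(1, isqrt(n) + 1):
        if n % d == 0:
            # d and n // d are both divisors; count once if they coincide
            count += 1 if d * d == n else 2
    return count

def tau_sum_direct(n: int) -> int:
    """Left-hand side of (5.14): sum of tau(j) for 1 <= j <= n."""
    return sum(tau(j) for j in range(1, n + 1))

def tau_sum_hermite(n: int) -> int:
    """Right-hand side of (5.14): 2 * sum_{j <= sqrt(n)} floor(n/j) - floor(sqrt(n))^2."""
    r = isqrt(n)
    return 2 * sum(n // j for j in range(1, r + 1)) - r * r

for n in (1, 10, 100, 1000):
    print(n, tau_sum_direct(n), tau_sum_hermite(n))
```

The Hermite side needs only about $\sqrt{n}$ integer divisions, which is why it is the convenient starting point for the sharper estimate later in this section.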

In what follows, we remind the reader that the big O symbol for positive real-valued functions $f$ and $g$, denoted by $f = O(g)$, means that there is a constant $c \in \mathbb{R}$ such that $f(x) < c\,g(x)$ for all sufficiently large $x$; see [68, Appendix B].

Remark 5.3 Comparing the symbols $f \sim g$ with $f = O(g)$, we see that the former is generally weaker than the latter. For instance, from (5.12) on page 203, we may deduce that

\[ \sum_{j=1}^{n} \frac{1}{j} = \log_e n + \gamma + O\left(\frac{1}{n}\right). \tag{5.17} \]

However, since it may also be deduced from (4.13) on page 172 that

\[ \sum_{j=1}^{n} \frac{1}{j} - \log_e n \sim \gamma, \tag{5.18} \]

(5.17) is a stronger statement than (5.18), the reason being that the former cannot be deduced from the latter. In fact, (5.18) is tantamount to merely saying

\[ \sum_{j=1}^{n} \frac{1}{j} \sim \log_e n. \tag{5.19} \]

In other words, terms may not be transposed in a relation between asymptotically equal functions.

Now we are in a position to derive the average order for the number of divisors function.

Theorem 5.6 Average Order of the Number of Divisors Function

If $n \in \mathbb{N}$ and $\tau(n)$ is the number of divisors function, then

\[ \sum_{j=1}^{n} \tau(j) \sim n \log_e n, \tag{5.20} \]

and the average order of $\tau(n)$ is $\log_e n$.

Proof. From (5.16), we know that

\[ \sum_{j=1}^{n} \tau(j) = \sum_{j=1}^{n} \left\lfloor \frac{n}{j} \right\rfloor, \]

and the latter equals

\[ n \sum_{j=1}^{n} \frac{1}{j} + O(n) = n \log_e n + O(n), \]

since removal of the floor function introduces an error of less than 1 for each $j$. Note that the last equality may be deduced from (5.19). Hence,

\[ \sum_{j=1}^{n} \tau(j) \sim n \log_e n, \]

which is (5.20). Since

\[ \sum_{j=1}^{n} \log_e(j) = \log_e(n!), \]

then by Stirling’s formula (5.10) on page 201,

\[ \sum_{j=1}^{n} \log_e(j) \sim \left(n + \frac{1}{2}\right) \log_e n - n \sim n \log_e n. \]

Hence,

\[ \sum_{j=1}^{n} \tau(j) \sim \sum_{j=1}^{n} \log_e(j), \]

so by Definition 5.5, $\log_e n$ is the average order of $\tau$. □

Remark 5.4 Although Theorem 5.6 tells us that the average order of $\tau(n)$ is $\log_e n$, this should not be interpreted as saying that almost all $n \in \mathbb{N}$ have approximately $\log_e n$ divisors. Here the term “almost all,” when used in reference to $n \in \mathbb{N}$ satisfying a certain property $P$, means that the number of natural numbers $n \le x$ not possessing property $P$ is $o(x)$; see Remark 4.2 on page 162. In other words, if $P(x)$ denotes the number of $n \le x$ satisfying property $P$ and

\[ P(x) \sim x, \]

then almost all $n \in \mathbb{N}$ have property $P$. Indeed, it can be shown that almost all $n \in \mathbb{N}$ have approximately $(\log_e n)^{\log_e 2}$ divisors, since it holds for any $\varepsilon > 0$ that

\[ (\log_e n)^{-\varepsilon} < \frac{\tau(n)}{(\log_e n)^{\log_e 2}} < (\log_e n)^{\varepsilon}. \]

The reason that the average order of $\tau(n)$ is $\log_e n$ arises from the contributions of a small proportion of $n \in \mathbb{N}$ where $\tau(n)$ is unusually big. What this means is that for a very small minority of $n \in \mathbb{N}$, $\tau(n)$ is closer to a power of $n$ than of $\log_e n$.

We use Lemma 5.2 and results of the last section to derive the following more accurate estimate for the sum of $\tau(j)$, which was proved by Dirichlet in 1838.

Theorem 5.7 A Precise Estimate for $\tau(n)$

If $n \in \mathbb{N}$ and $\gamma$ is the Euler constant given by (4.13) on page 172, then

\[ \sum_{j=1}^{n} \tau(j) = n \log_e n + (2\gamma - 1)n + O(\sqrt{n}). \]

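Proof. From Hermite’s formula (5.14),

\[ \sum_{j=1}^{n} \tau(j) = 2 \sum_{j=1}^{\lfloor \sqrt{n} \rfloor} \left\lfloor \frac{n}{j} \right\rfloor - \lfloor \sqrt{n} \rfloor^{2}, \]

and this in turn equals

\[ 2 \sum_{j=1}^{\lfloor \sqrt{n} \rfloor} \frac{n}{j} - n + O(\sqrt{n}), \]

which, by (5.12) on page 203, equals

\[ 2n \log_e(\sqrt{n}) + 2\gamma n + O(n/\sqrt{n}) - n + O(\sqrt{n}) = n \log_e n + (2\gamma - 1)n + O(\sqrt{n}), \]

as required. □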
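The two estimates can be compared numerically. The sketch below, under the assumption that Hermite’s formula (5.14) is used to evaluate the sum exactly, prints the ratio from Theorem 5.6 and the error of Theorem 5.7 scaled by $\sqrt{n}$. The names `tau_sum` and `dirichlet_main_term` and the decimal value of Euler’s constant are supplied here for illustration.

```python
from math import isqrt, log, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision

def tau_sum(n: int) -> int:
    """Exact sum of tau(j) for j <= n via Hermite's formula (5.14)."""
    r = isqrt(n)
    return 2 * sum(n // j for j in range(1, r + 1)) - r * r

def dirichlet_main_term(n: int) -> float:
    """Main term n log n + (2*gamma - 1) n of Theorem 5.7."""
    return n * log(n) + (2 * EULER_GAMMA - 1) * n

for n in (10**3, 10**5, 10**7):
    total = tau_sum(n)
    ratio = total / (n * log(n))                          # Theorem 5.6: tends to 1
    scaled = (total - dirichlet_main_term(n)) / sqrt(n)   # Theorem 5.7: stays bounded
    print(f"n={n:>9}  ratio={ratio:.5f}  scaled_error={scaled:+.4f}")
```

The ratio approaches 1 only slowly, because the omitted term $(2\gamma - 1)n$ is of relative size $1/\log_e n$, whereas the scaled error remains small.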

Remark 5.5 The value

\[ \Delta(x) = \sum_{n \le x} \tau(n) - x \log_e x - (2\gamma - 1)x \]

is called the error term in Theorem 5.7 on the preceding page, which says that $\Delta(x) = O(\sqrt{x})$. The problem of estimating $\Delta(x)$ is known as the Dirichlet divisor problem, a celebrated area of research that is largely open. The difficulty of solving this problem has led to much more complex problems involving what are called exponential sums, which have intimate connections with other problems such as the Riemann hypothesis. Hence, any progress on the Dirichlet divisor problem will probably have implications for a variety of other unsolved problems. Typically, estimates are of the type

\[ \Delta(x) = O(x^{\varepsilon}). \]

Theorem 5.7 shows us that $\varepsilon = 1/2$ may be chosen. The consensus is that $\varepsilon = 1/4$ works, but this is still open. However, G.H. Hardy showed that $\varepsilon \ge 1/4$. Also, G.F. Voronoi proved, in 1903, that $\varepsilon = 1/3$ may be selected, but since that time about a century ago, there has not been much advancement. To date the best known value is

\[ \varepsilon \le \frac{131}{416} = 0.314903846\ldots \]

obtained by M.N. Huxley in 2003.

Now we turn our attention to the sum of divisors function $\sigma(n)$, which is the sum of all the positive divisors of $n$, where the irregularities are far less pronounced than those for $\tau(n)$ discussed in Remark 5.4 on the previous page.

Theorem 5.8 Average Order of $\sigma(n)$

For $n \in \mathbb{N}$,

\[ \sum_{j=1}^{n} \sigma(j) = \frac{(\pi n)^2}{12} + O(n \log_e n), \tag{5.21} \]

and the average order of $\sigma(n)$ is $\pi^2 n/6$.


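Proof. We know from a first course in number theory that

\[ \sum_{j=1}^{n} \sigma(j) = \sum_{k=1}^{n} k \left\lfloor \frac{n}{k} \right\rfloor; \]

see [68, Corollary 2.4, p. 110], for instance. Also, since

\[ \sum_{k=1}^{n} k \left\lfloor \frac{n}{k} \right\rfloor = (1 + 2 + \cdots + n) + \left(1 + 2 + \cdots + \left\lfloor \frac{n}{2} \right\rfloor\right) + \left(1 + 2 + \cdots + \left\lfloor \frac{n}{3} \right\rfloor\right) + \cdots + \left(1 + 2 + \cdots + \left\lfloor \frac{n}{k} \right\rfloor\right) + \cdots + (1), \]

then

\[ \sum_{j=1}^{n} \sigma(j) = \sum_{j=1}^{n} \sum_{k=1}^{\lfloor n/j \rfloor} k. \]

However, since we know that

\[ \sum_{\ell=1}^{m} \ell = m(m+1)/2 \]

(see [68, Theorem 1.1, p. 2], for instance), then

\[ \sum_{j=1}^{n} \sigma(j) = \sum_{j=1}^{n} \frac{\lfloor n/j \rfloor (\lfloor n/j \rfloor + 1)}{2}, \]

and by the same reasoning as in the proof of Theorem 5.6 on page 210, the latter equals

\[ \frac{1}{2} \sum_{j=1}^{n} \left(\frac{n}{j} + O(1)\right)\left(\frac{n}{j} + O(1)\right) = \frac{n^2}{2} \sum_{j=1}^{n} \frac{1}{j^2} + O\left(n \sum_{j=1}^{n} \frac{1}{j}\right) + O(n). \]

Given that (5.5) on page 198 tells us

\[ \sum_{j=1}^{n} \frac{1}{j^2} = \sum_{j=1}^{\infty} \frac{1}{j^2} + O\left(\frac{1}{n}\right) = \zeta(2) + O\left(\frac{1}{n}\right) = \frac{\pi^2}{6} + O\left(\frac{1}{n}\right), \]

and since

\[ O\left(n \sum_{j=1}^{n} 1/j\right) + O(n) = O(n)\,O\left(\sum_{j=1}^{n} 1/j\right) = O(n)\,O(\log_e n) = O(n \log_e n), \]

we have

\[ \sum_{j=1}^{n} \sigma(j) = \frac{\pi^2 n^2}{12} + O(n \log_e n), \]

which is (5.21). Also, since

\[ \sum_{j=1}^{n} j \sim n^2/2, \]

then

\[ \sum_{j=1}^{n} \sigma(j) \sim \frac{\pi^2 n^2}{12} \sim \frac{\pi^2}{6} \sum_{j=1}^{n} j = \sum_{j=1}^{n} \frac{\pi^2 j}{6}, \]

and then the average order of $\sigma(n)$ is $\pi^2 n/6$. □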
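A quick numerical illustration of Theorem 5.8, using the identity $\sum_{j \le n} \sigma(j) = \sum_{k \le n} k \lfloor n/k \rfloor$ quoted at the start of the proof. This is only a sketch; the function name `sigma_sum` is chosen here for illustration.

```python
from math import log, pi

def sigma_sum(n: int) -> int:
    """Sum of sigma(j) for j <= n, computed as sum_{k <= n} k * floor(n/k)."""
    return sum(k * (n // k) for k in range(1, n + 1))

for n in (10**2, 10**3, 10**4, 10**5):
    total = sigma_sum(n)
    main = pi**2 * n**2 / 12
    # The second column tends to 1; the third is the error measured against n log n.
    print(n, total / main, (total - main) / (n * log(n)))
```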

Lastly, in this section, we look at Euler’s totient $\phi(n)$ from the perspective of average order. Recall that the totient is equal to the number of positive integers not exceeding $n$ and relatively prime to it. In what follows $\mu(n)$ denotes the Möbius function defined by

\[ \mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n \text{ is not squarefree}, \\ (-1)^k & \text{if } n = \prod_{j=1}^{k} p_j \text{ where the } p_j \text{ are distinct primes}. \end{cases} \tag{5.22} \]

We remind the reader of a fundamental relationship between the totient and the Möbius function, which we will use in the closing result,

\[ \phi(n) = n \sum_{d \mid n} \frac{\mu(d)}{d} \tag{5.23} \]

(see [68, Theorem 2.17, p. 99]). There is also a relationship between the Möbius function and the zeta function (studied in detail in §5.3) that is an important component of what follows. It is given by

\[ \sum_{d=1}^{\infty} \frac{\mu(d)}{d^s} = \frac{1}{\zeta(s)}, \quad \text{for } s \in \mathbb{R},\ s > 1 \tag{5.24} \]

(see [68, top formula, page 112]).

Theorem 5.9 The Average Order of the Totient

For $n \in \mathbb{N}$,

\[ \sum_{j=1}^{n} \phi(j) = \frac{3n^2}{\pi^2} + O(n \log_e n), \tag{5.25} \]

and the average order of $\phi(n)$ is $6n/\pi^2$.

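Proof. From (5.23), we get

\[ \sum_{j=1}^{n} \phi(j) = \sum_{j=1}^{n} j \sum_{d \mid j} \frac{\mu(d)}{d} = \sum_{1 \le dd' \le n} d' \mu(d) = \sum_{d=1}^{n} \mu(d) \sum_{d'=1}^{\lfloor n/d \rfloor} d', \]

and the latter is equal to the following by the same reasoning as in the proof of Theorem 5.8:

\[ \frac{1}{2} \sum_{d=1}^{n} \mu(d) \left( \left\lfloor \frac{n}{d} \right\rfloor^{2} + \left\lfloor \frac{n}{d} \right\rfloor \right) = \frac{1}{2} \sum_{d=1}^{n} \mu(d) \left( \frac{n^2}{d^2} + O\left(\frac{n}{d}\right) \right) \]

\[ = \frac{n^2}{2} \sum_{d=1}^{n} \frac{\mu(d)}{d^2} + O\left(n \sum_{d=1}^{n} \frac{1}{d}\right) = \frac{n^2}{2} \sum_{d=1}^{\infty} \frac{\mu(d)}{d^2} + O\left(n^2 \sum_{d=n+1}^{\infty} \frac{1}{d^2}\right) + O(n \log_e n). \]

However, by (5.24) on page 214, and (5.5) on page 198,

\[ \sum_{d=1}^{\infty} \frac{\mu(d)}{d^2} = \frac{1}{\zeta(2)} = \frac{6}{\pi^2}, \]

and since

\[ O\left(n^2 \sum_{d=n+1}^{\infty} \frac{1}{d^2}\right) + O(n \log_e n) = O(n^2)\,O\left(\sum_{d=n+1}^{\infty} \frac{1}{d^2}\right) + O(n)\,O(\log_e n) \]

\[ = O(n^2)\,O\left(\frac{1}{n}\right) + O(n)\,O(\log_e n) = O(n) + O(n)\,O(\log_e n) = O(n \log_e n), \]

then

\[ \sum_{j=1}^{n} \phi(j) = \frac{3n^2}{\pi^2} + O(n \log_e n), \]

which is (5.25). Also, by the same reasoning as in the proof of Theorem 5.8, we have that

\[ \sum_{j=1}^{n} \phi(j) \sim \frac{3n^2}{\pi^2} \sim \frac{6}{\pi^2} \sum_{j=1}^{n} j = \sum_{j=1}^{n} \frac{6j}{\pi^2}, \]

so the average order of $\phi(n)$ is $6n/\pi^2$. □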
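Theorem 5.9 can be illustrated with a short sieve. The sketch below is one standard way to tabulate the totient, not a method taken from the text; the names `totients_upto` and `totient_sum` are illustrative.

```python
from math import pi

def totients_upto(n: int) -> list:
    """Return a list phi with phi[m] = Euler's totient of m for 0 <= m <= n."""
    phi = list(range(n + 1))
    for p in range(2, n + 1):
        if phi[p] == p:  # still untouched, so p is prime
            for m in range(p, n + 1, p):
                phi[m] -= phi[m] // p  # multiply phi[m] by (1 - 1/p)
    return phi

def totient_sum(n: int) -> int:
    """Sum of phi(j) for 1 <= j <= n."""
    return sum(totients_upto(n)[1:])

for n in (10, 1000, 100000):
    print(n, totient_sum(n), 3 * n * n / pi**2)
```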

Remark 5.6 An application of Theorem 5.9 is given in Exercise 5.9 on the next page, where it is shown that two integers, less than $n \in \mathbb{N}$, have a probability of being relatively prime equal to $6/\pi^2$. Here the probability means the following. If $A(n)$ is the total number of pairs of integers less than $n$ and $B(n)$ is the number of them that are relatively prime (in lowest terms), then the probability that any two are coprime is

\[ \lim_{n \to \infty} \frac{B(n)}{A(n)}. \]

Another application is to Farey sequences; see [68, Page 239]. The number of terms in a Farey sequence of order $n$ is $\sum_{j=1}^{n} \phi(j) + 1$, so by Theorem 5.9, the number of terms in a Farey sequence of order $n$ is approximately $3n^2/\pi^2$.
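Both applications in Remark 5.6 can be checked by brute force for small $n$. The following sketch counts pairs $1 \le x \le y \le n$ as in the hint to Exercise 5.9; the function names are illustrative and the approach is deliberately naive.

```python
from math import gcd, pi

def coprime_pair_fraction(n: int) -> float:
    """Fraction of pairs 1 <= x <= y <= n with gcd(x, y) = 1."""
    total = n * (n + 1) // 2
    coprime = sum(1 for y in range(1, n + 1)
                    for x in range(1, y + 1) if gcd(x, y) == 1)
    return coprime / total

def farey_length(n: int) -> int:
    """Number of terms of the Farey sequence of order n: 0/1 plus every reduced x/y with 1 <= x <= y <= n."""
    return 1 + sum(1 for y in range(1, n + 1)
                     for x in range(1, y + 1) if gcd(x, y) == 1)

for n in (10, 100, 1000):
    print(n, coprime_pair_fraction(n), 6 / pi**2, farey_length(n), 3 * n * n / pi**2)
```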


Exercises

5.8. Prove that for $x \in \mathbb{R}^{+}$, if $SF(x)$ denotes the number of squarefree $n \in \mathbb{N}$ with $n \le x$, then

\[ SF(x) = \frac{6x}{\pi^2} + O(\sqrt{x}). \]

(Hint: Use (5.24) on page 214 and the Möbius inversion formula, which says: If $f$ and $g$ are arithmetic functions, then

\[ f(n) = \sum_{d \mid n} g(d) \quad \text{for every } n \in \mathbb{N}, \]

if and only if

\[ g(n) = \sum_{d \mid n} \mu(d) f\left(\frac{n}{d}\right) \quad \text{for every } n \in \mathbb{N}; \]

see [68, Theorem 2.16, p. 98].)

5.9. Given $n \in \mathbb{N}$ and integers $x$ and $y$ satisfying $1 \le x \le y \le n$, prove that the probability they are relatively prime is $6/\pi^2$.
(Hint: See Remark 5.6 on the preceding page and note that the total number of pairs $1 \le x \le y \le n$ is equal to $n(n+1)/2$, and the number of them that are relatively prime is $\sum_{j=1}^{n} \phi(j)$.)

5.10. Given arithmetic functions $f$ and $g$ related by

\[ f(n) = \sum_{d \mid n} g\left(\frac{n}{d}\right) = \sum_{d_1 d_2 = n} g(d_2), \]

and given $\sum_{n=1}^{\infty} \frac{|g(n)|}{n} < \infty$, prove that

\[ \lim_{x \to \infty} \frac{1}{x} \sum_{n \le x} f(n) = \sum_{n=1}^{\infty} \frac{g(n)}{n}. \]

Remark 5.7 The result in this exercise is known as Wintner’s mean value theorem; see [104]. The mean value of an arithmetic function $f$ is defined to be

\[ \lim_{x \to \infty} \frac{1}{x} \sum_{n \le x} f(n), \]

provided the limit exists. For instance, by Exercise 5.8, the mean value of $SF$ is $6/\pi^2$.

(Hint: You may use the fact, following from the hint to Exercise 5.8, that

\[ \sum_{n \le x} f(n) = \sum_{d \le x} g(d) \lfloor x/d \rfloor, \]

and that

\[ \lim_{x \to \infty} \frac{1}{x} \sum_{d \le x} |g(d)| = 0, \]

the latter of which follows from a result known as Kronecker’s Lemma, which states that if $f$ is an arithmetic function and

\[ \sum_{n=1}^{\infty} \frac{f(n)}{n^s} \]

converges for a complex number $s$ with $\Re(s) > 0$, then

\[ \lim_{x \to \infty} \frac{1}{x^s} \sum_{n \le x} f(n) = 0. \]

In particular, if

\[ \sum_{n=1}^{\infty} \frac{f(n)}{n} \]

converges, then $f$ has mean value zero since

\[ \lim_{x \to \infty} (1/x) \sum_{n \le x} f(n) = 0.) \]

5.11. Find $\sum_{n \le x} |\mu(n)|$, and show that the mean value of $\mu^2$ is $6/\pi^2$.
(Hint: Use Theorem 5.9 on page 214.)


5.3 The Riemann ζ-function

To see a world in a grain of sand
And a heaven in a wild flower
Hold infinity in the palm of your hand
An eternity in an hour.

from Auguries of Innocence (1803)
William Blake (1757–1827)
English Poet

In this section, we will be looking at infinite series, especially the renowned zeta function. To this end, we remind the reader of a few salient facts. An analytic, or holomorphic, function of a complex variable is one which has derivatives wherever the function is defined. Also, absolutely convergent series are those with the property that the series formed by the absolute values of the terms converges. Convergence of an infinite series means that the sequence formed by the partial sums of the terms of the series converges, in which case this limit is the sum of the series. In other words, if we have an infinite series given by

\[ \sum_{n=1}^{\infty} a_n, \]

then the partial sums are

\[ s_m = \sum_{n=1}^{m} a_n, \]

and if

\[ \lim_{m \to \infty} s_m = S \in \mathbb{R}, \]

then $S$ is the sum of the (convergent) series. Series that do not converge are said to diverge. Exercises 5.12–5.14 are designed to test some basic knowledge of series and lay a foundation for some facts established below.

Herein, we explore the Riemann ζ-function given for $s = a + bi \in \mathbb{C}$ with $\Re(s) = a > 1$ by

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p = \text{prime}} (1 - p^{-s})^{-1}, \tag{5.26} \]

which we discussed briefly in Remark 5.2 on page 197, as well as in §5.2. The last equality, which follows from Exercise 5.13 on page 227, is known as the Euler product, which provides a fundamental relationship between the primes and the zeta function.
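To see (5.26) at work numerically, the sketch below truncates both the series and the Euler product at $s = 2$ and compares them with $\zeta(2) = \pi^2/6$. The helper names and the truncation point are assumptions made for illustration.

```python
from math import isqrt, pi

def primes_upto(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]

def zeta_partial_sum(s: float, limit: int) -> float:
    """Sum of n^(-s) for 1 <= n <= limit."""
    return sum(n ** -s for n in range(1, limit + 1))

def euler_partial_product(s: float, limit: int) -> float:
    """Product of (1 - p^(-s))^(-1) over primes p <= limit."""
    product = 1.0
    for p in primes_upto(limit):
        product /= 1.0 - p ** -s
    return product

N = 10**4
print(zeta_partial_sum(2.0, N), euler_partial_product(2.0, N), pi**2 / 6)
```

With the same cutoff the truncated product is typically closer to $\zeta(2)$ than the truncated series, since the product over primes up to $N$ already accounts for every integer whose prime factors are all at most $N$.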

The series on the left is absolutely convergent, which implies that $\zeta(s)$ is analytic on the half plane $\Re(s) > 1$. To see this we may employ Theorem 5.1 on page 193. In addition, this will provide us with a formula that is a means of computationally evaluating the Riemann ζ-function, as well as extending its domain of definition to the entire complex plane, with one singularity. The following proof follows the line of reasoning given in [74, Section 3.3].

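Theorem 5.10 A Formula for $\zeta(s)$ from Euler–Maclaurin

For $s \in \mathbb{C}$ and $\Re(s) > 1 - n$, for $n \in \mathbb{N}$, $\zeta(s)$ is convergent, except at $s = 1$, and

\[ \zeta(s) = \frac{1}{s-1} + \frac{1}{2} + \sum_{j=2}^{n} \frac{B_j}{j!}\, s(s+1) \cdots (s+j-2) - \frac{1}{n!}\, s(s+1) \cdots (s+n-1) \int_1^{\infty} B_n(t - \lfloor t \rfloor)\, t^{-s-n}\, dt. \tag{5.27} \]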

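Proof. Let $n \in \mathbb{N}$ and set $f(x) = x^{-s}$, $a = 1$, and $b = N$ in Theorem 5.1. Since

\[ f^{(n)}(x) = (-1)^n s(s+1)(s+2) \cdots (s+n-1)\, x^{-s-n} \]

and

\[ \zeta(s) = 1 + \lim_{N \to \infty} \sum_{j=2}^{N} f(j), \]

then

\[ \begin{aligned} \zeta(s) - 1 &= \lim_{N \to \infty} \Bigg[ \int_1^N x^{-s}\, dx - \sum_{j=1}^{n} \frac{B_j}{j!}\, s(s+1) \cdots (s+j-2)\left(N^{-s+1-j} - 1\right) \\ &\qquad - \frac{1}{n!}\, s(s+1) \cdots (s+n-1) \int_1^N B_n(x - \lfloor x \rfloor)\, x^{-s-n}\, dx \Bigg] \\ &= \lim_{N \to \infty} \Bigg[ \frac{N^{1-s} - 1}{1-s} + \frac{N^{-s} - 1}{2} - \sum_{j=2}^{n} \frac{B_j}{j!}\, s(s+1) \cdots (s+j-2)\left(N^{-s-j+1} - 1\right) \\ &\qquad - \frac{1}{n!}\, s(s+1) \cdots (s+n-1) \int_1^N B_n(x - \lfloor x \rfloor)\, x^{-s-n}\, dx \Bigg]. \end{aligned} \]

For $\Re(s) > 1$, we may pull the limit through. Thus, since $\lim_{N \to \infty} N^{1-s} = 0 = \lim_{N \to \infty} N^{-s}$, we get (5.27), the right-hand side of which converges for $\Re(s) > 1 - n$, except at $s = 1$. □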

Remark 5.8 To delve into some deeper complex analysis, Theorem 5.10 says that $\zeta(s)$ can be analytically continued to a meromorphic function in the whole complex plane with its only singularity a simple pole at $s = 1$. The principle of analytic continuation says that two analytic functions that agree on a sufficiently dense set are identical. A set $S$ is said to be “dense” in a set $T$ if the smallest “closed” set in $T$ containing $S$ is equal to $T$. Think of a closed set as one that contains all of its limit points. A function is “meromorphic” on a region if it is analytic there except for some “poles”, which are singularities that behave like the singularity of $f(x) = 1/x^n$ at $x = 0$.


Theorem 5.10 may be employed as a useful tool to calculate the zeta function for values of $s$; see Exercise 5.15 on page 227, for instance. Note that we may estimate the error term via Theorem 5.2 on page 196 as follows. For $n > 1 - \Re(s)$, we have from Theorem 5.10:

\[ |B_n(x - \lfloor x \rfloor)| \le \frac{2\, n!}{(2\pi)^n} \sum_{j=1}^{\infty} \frac{1}{j^n} = \frac{2\, n!}{(2\pi)^n}\, \zeta(n) \le \frac{2\, n!}{(2\pi)^n}\, \zeta(2) = \frac{n!}{12 (2\pi)^{n-2}}, \]

and for even $n = 2m$ we have from (5.5) on page 198 that $|B_{2m}(x - \lfloor x \rfloor)| \le |B_{2m}|$.

Remark 5.9 Another application of the Riemann ζ-function is to probability, as discussed in Remark 5.6 on page 215. Via Exercise 5.9 on page 216, we showed that the probability of two randomly selected integers being relatively prime is approximately equal to

\[ \frac{1}{\zeta(2)} = \frac{6}{\pi^2} = 0.608\ldots. \]

This is also the probability that a randomly selected integer is squarefree; see Exercise 5.11 on page 217. The reason the latter is true, in terms of the Riemann ζ-function given by the Euler product in (5.26), is that for a number to be squarefree it must not be divisible by the same prime more than once. In other words, either it is not divisible by $p$ or it is divisible by $p$ but not divisible by it again. Thus, the probability that an integer is not divisible by the square of a prime $p$ equals

\[ \left(1 - \frac{1}{p}\right) + \frac{1}{p}\left(1 - \frac{1}{p}\right) = 1 - \frac{1}{p^2}, \]

and taking the product over all primes (assuming the independence of the divisibility by different primes), the probability that an integer is squarefree tends to

\[ \prod_{p = \text{prime}} \left(1 - p^{-2}\right) = \zeta(2)^{-1}. \]

This has a generalization, which can be shown by the same reasoning as for $n = 2$, namely that the probability that $n$ randomly selected integers are coprime is

\[ P_n \sim \zeta(n)^{-1}. \tag{5.28} \]

Thus, we may calculate $P_3 \approx 0.832$, $P_4 \approx 90/\pi^4 \approx 0.9239$, and so forth. Again using similar reasoning to the above, the probability that a randomly selected integer is cube-free, or fourth-power free, etc., is also given by (5.28). Thus, the probability that a randomly selected integer is cube-free equals roughly 83%, and that an integer is fourth-power free is roughly 92%.
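The densities quoted in Remark 5.9 can be estimated by direct counting. The sketch below marks every multiple of a $k$-th power $d^k$ with $d \ge 2$ and reports the surviving fraction; the function name `kfree_fraction` and the numerical value used for $\zeta(3)$ are supplied here for illustration and are not taken from the text.

```python
from math import pi

ZETA_3 = 1.2020569031595942  # zeta(3), quoted numerically (an assumption of this sketch)

def kfree_fraction(limit: int, k: int) -> float:
    """Fraction of 1 <= n <= limit that are not divisible by any k-th power d^k with d >= 2."""
    is_free = bytearray([1]) * (limit + 1)
    d = 2
    while d**k <= limit:
        step = d**k
        # Clear every multiple of d^k; composite d are redundant but harmless.
        is_free[step::step] = bytearray(len(range(step, limit + 1, step)))
        d += 1
    return sum(is_free[1:]) / limit

N = 10**5
print("squarefree  ", kfree_fraction(N, 2), 6 / pi**2)
print("cube-free   ", kfree_fraction(N, 3), 1 / ZETA_3)
print("4th-pow free", kfree_fraction(N, 4), 90 / pi**4)
```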

A more general question still is the following. What is the probability that $n$ randomly selected integers have greatest common divisor equal to $g$? Let this probability be denoted by $P_n(g)$; see [84, page 48]. To resolve this question, let $I_N = [1, N]$ where $N \ge 1$ is real, and let $P_n^N(g)$ be the probability of selecting $n$ random integers from $I_N$ with gcd equal to $g$. Then

\[ P_n^N(g) = \frac{1}{g^n} P_n^{\lfloor N/g \rfloor}(1) + o(N), \]

observing that $o(N) = 0$ if $N \in \mathbb{N}$ with $N \equiv 0 \pmod{g}$, so,

\[ P_n(g) = \lim_{N \to \infty} P_n^N(g) = \frac{1}{g^n} P_n(1) = \frac{1}{g^n \zeta(n)}, \]

with thanks to Thomas Hagedorn of the College of New Jersey, USA for the idea behind the proof of the above generalization.

Now we turn to the relationship between the Riemann ζ-function and the distribution of primes, extending what is covered in [68, §1.9, pp. 65–72], to which we refer the reader for background, especially pertaining to the Riemann hypothesis that we will discuss with the covered material from [68] in mind. We begin by reminding the reader that $\pi(x)$ denotes the number of primes $\le x$. The first celebrated result, the history of the proof of which is given in detail in [68], is our starting point.

Theorem 5.11 The Prime Number Theorem (PNT)

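For $x \in \mathbb{R}^{+}$,

\[ \pi(x) \sim \frac{x}{\log_e x}. \]

The close relationship between the Riemann ζ-function and $\pi(x)$ is given by

\[ \log_e \zeta(s) = s \int_2^{\infty} \frac{\pi(x)}{x(x^s - 1)}\, dx, \tag{5.29} \]

for $\Re(s) > 1$; see Exercise 5.20 on page 227.

It is noteworthy that the Euler product (5.26) for the Riemann ζ-function tells us that since $\zeta(s) \to \infty$ as $s \to 1$, there are infinitely many primes. To see this consider $\zeta(s)$ for $s \in \mathbb{R}^{+}$. By the series expansion in (5.26), $\zeta(s)$ diverges as $s \to 1^{+}$ since the harmonic series $\sum_{j=1}^{\infty} 1/j$ diverges, whereas a product over finitely many primes would remain bounded. Actually, more can be said, namely

\[ \log_e \left( \prod_{p = \text{prime}} (1 - p^{-s})^{-1} \right) = - \sum_{p = \text{prime}} \log_e (1 - p^{-s}) = \sum_{p = \text{prime}} p^{-s} + O(1) < \sum_{p = \text{prime}} p^{-1} + O(1) \quad \text{when } s > 1. \tag{5.30} \]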

Understanding the sums $\sum_{p \le x} p^{-1}$ is implicit in the development of Theorem 5.11. Indeed, the following predates the PNT, and follows from it.


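Theorem 5.12 Mertens’ Theorem

\[ \sum_{p \le x} \frac{1}{p} = \log_e \log_e x + M + o(1), \]

and

\[ M = \gamma + \sum_{p = \text{prime}} \left( \log_e \left(1 - \frac{1}{p}\right) + \frac{1}{p} \right), \]

where $\gamma$ is Euler’s constant and $M$ is called Mertens’ constant.

Note that Theorem 5.12 is equivalent to the asymptotic relationship

\[ \prod_{p \le x} \left(1 - \frac{1}{p}\right) \sim \frac{e^{-\gamma}}{\log_e x}. \]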
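Both forms of Theorem 5.12 are easy to probe numerically. In the sketch below, the decimal values of $\gamma$ and of Mertens’ constant $M$ are quoted constants supplied for comparison, and the helper names are illustrative.

```python
from math import exp, isqrt, log

EULER_GAMMA = 0.5772156649015329    # Euler's constant (quoted numerically)
MERTENS_M = 0.2614972128476428      # Mertens' constant (quoted numerically)

def primes_upto(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]

def prime_reciprocal_sum(x: int) -> float:
    """Sum of 1/p over primes p <= x."""
    return sum(1.0 / p for p in primes_upto(x))

def mertens_product(x: int) -> float:
    """Product of (1 - 1/p) over primes p <= x."""
    product = 1.0
    for p in primes_upto(x):
        product *= 1.0 - 1.0 / p
    return product

for x in (10**3, 10**4, 10**5, 10**6):
    print(x,
          prime_reciprocal_sum(x) - log(log(x)), MERTENS_M,
          mertens_product(x) * log(x), exp(-EULER_GAMMA))
```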

There is an equivalent formulation of Theorem 5.11 via the following function called Mertens’ function; see [68, Biography 2.4, p. 100].

\[ M(x) = \sum_{n \le x} \mu(n), \]

for any $x \in \mathbb{R}$, where $\mu$ is the Möbius function defined in (5.22) on page 214. It can be shown that Theorem 5.11 is equivalent to the following.

Theorem 5.13 Mertens’ Equivalence to the PNT

\[ M(x) = o(x). \]

Even more, Theorem 5.11 is also equivalent to the following.

Theorem 5.14 Möbius’ Equivalence to the PNT

\[ \sum_{n=1}^{\infty} \frac{\mu(n)}{n} = 0. \]
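The following sketch tabulates the Möbius function by a sieve and prints $M(x)/x$ together with the partial sums of $\mu(n)/n$, both of which should drift toward 0 according to Theorems 5.13 and 5.14. The sieve is one standard construction and the names are illustrative.

```python
def mobius_upto(n: int) -> list:
    """Return a list mu with mu[m] = Moebius function of m for 1 <= m <= n (mu[0] is unused)."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(p, n + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more distinct prime factor
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0               # divisible by a square
    return mu

def mertens_function(x: int) -> int:
    """M(x) = sum of mu(n) for n <= x."""
    return sum(mobius_upto(x)[1:])

for x in (10**2, 10**3, 10**4, 10**5):
    mu = mobius_upto(x)
    m_x = sum(mu[1:])
    weighted = sum(mu[n] / n for n in range(1, x + 1))
    print(x, m_x, m_x / x, weighted)
```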

The relationship between the Riemann ζ-function and Mertens’ function is evoked from (5.24) on page 214, namely

\[ \frac{1}{\zeta(s)} = \sum_{d=1}^{\infty} \frac{\mu(d)}{d^s} = s \int_1^{\infty} \frac{M(x)}{x^{s+1}}\, dx, \]

for $\Re(s) > 1$. This brings us to one of the most important and celebrated outstanding problems.


Conjecture 5.1 The Riemann Hypothesis (RH)

All of the zeros of $\zeta(s)$ in the critical strip $0 < \Re(s) < 1$ lie on the critical line $\Re(s) = 1/2$.

In terms of Mertens’ function we may reformulate Conjecture 5.1 as being equivalent to the following.

Conjecture 5.2 Mertens’ Equivalence to the RH

\[ M(x) = O_{\varepsilon}\left(x^{\frac{1}{2} + \varepsilon}\right), \]

for any fixed $\varepsilon > 0$, where $O_{\varepsilon}$ means that, in the big O notation, the constant depends on $\varepsilon$ only.

Also, Riemann postulated the following in 1859, which is also equivalent to Conjecture 5.1.

Conjecture 5.3 Integral Equivalence to the RH

\[ \pi(x) = \operatorname{li}(x) + O\left(\sqrt{x} \log_e x\right), \]

where

\[ \operatorname{li}(x) = \int_2^{x} \frac{dt}{\log_e t}, \]

called the logarithmic integral.
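To get a feel for the size of the error in Conjecture 5.3, the sketch below compares $\pi(x)$ with a numerical value of $\int_2^x dt/\log_e t$. The integral is evaluated by Simpson’s rule after the substitution $t = e^u$, which is a choice made here for numerical convenience; the function names are illustrative.

```python
from math import exp, isqrt, log, sqrt

def prime_count(x: int) -> int:
    """pi(x): number of primes p <= x, by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve[: x + 1])

def li_from_2(x: float, steps: int = 20_000) -> float:
    """Integral of dt / log t from 2 to x, as the integral of e^u / u du from log 2 to log x (Simpson's rule, steps even)."""
    a, b = log(2.0), log(x)
    h = (b - a) / steps
    total = exp(a) / a + exp(b) / b
    for i in range(1, steps):
        u = a + i * h
        total += (4 if i % 2 else 2) * exp(u) / u
    return total * h / 3

for x in (10**3, 10**4, 10**5, 10**6):
    p, l = prime_count(x), li_from_2(x)
    print(x, p, round(l, 2), round(l - p, 2), round(sqrt(x) * log(x), 2))
```

In this range the difference between the two is far smaller than $\sqrt{x} \log_e x$, which is consistent with, but of course no evidence for, the conjectured bound.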

Conjecture 5.3 will hold if and only if the Riemann ζ-function does not vanish on the half plane $\Re(s) > 1/2$. In other words, Conjecture 5.1 is equivalent to the statement that the error which occurs, when $\pi(x)$ is estimated by $\operatorname{li}(x)$, is $O(\sqrt{x} \log_e x)$.

Now we are in a position to establish a fundamental equation, which puts the above more in focus. Indeed, with Remark 5.9 on page 220 in mind, the following shows the central role that the Riemann ζ-function plays in analytic number theory via the functional equation, $\zeta(s) = f(s)\zeta(1-s)$, where we define $f(s)$ below.

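Let $n = 3$ in (5.27) to get

\[ \zeta(s) = \frac{1}{s-1} + \frac{1}{2} + \frac{B_2 s}{2} - \frac{s(s+1)(s+2)}{6} \int_1^{\infty} B_3(t - \lfloor t \rfloor)\, t^{-s-3}\, dt, \]

for $\Re(s) > -2$. By Exercise 5.21 on page 227,

\[ \frac{s(s+1)(s+2)}{6} \int_0^{1} B_3(t - \lfloor t \rfloor)\, t^{-s-3}\, dt = -\frac{B_2 s}{2} - \frac{1}{2} - \frac{1}{s-1}, \tag{5.31} \]

so

\[ \zeta(s) = -\frac{s(s+1)(s+2)}{6} \int_0^{\infty} B_3(t - \lfloor t \rfloor)\, t^{-s-3}\, dt. \]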


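Replacing $B_3(t - \lfloor t \rfloor)$ by the Fourier series in Theorem 5.2 on page 196, we get

\[ \zeta(s) = -\frac{s(s+1)(s+2)}{6} \sum_{j=1}^{\infty} 12 \int_0^{\infty} \frac{\sin(2\pi j t)}{(2\pi j)^3}\, t^{-s-3}\, dt, \]

and by setting $x = 2\pi j t$, this equals

\[ -2s(s+1)(s+2) \sum_{j=1}^{\infty} \frac{1}{(2\pi j)^{1-s}} \int_0^{\infty} x^{-s-3} \sin x\, dx. \]

Since $\sin x = \sum_{i=0}^{\infty} (-1)^i x^{2i+1}/(2i+1)!$ is the analytic continuation of the usual trigonometric function, and converges for all $x \in \mathbb{C}$, we may interchange the sum and integral above, so the latter equals

\[ -2^s \pi^{s-1} s(s+1)(s+2) \left( \int_0^{\infty} x^{-s-3} \sin x\, dx \right) \sum_{j=1}^{\infty} \frac{1}{j^{1-s}}. \]

Hence,

\[ \zeta(s) = -2^s \pi^{s-1} s(s+1)(s+2) \left( \int_0^{\infty} x^{-s-3} \sin x\, dx \right) \cdot \zeta(1-s). \tag{5.32} \]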

In order to complete the derivation of the functional equation, we need the following concept due to Euler.

Definition 5.6 The Gamma Function

For $s \in \mathbb{C}$ and $\Re(s) > 0$, $\Gamma(s) = \int_0^{\infty} e^{-t} t^{s-1}\, dt$ is called the gamma function.

We will employ two well-known formulas for the gamma function given as follows.

For $0 < \Re(z) < 1$, the Wolfskehl equation (see Biography 5.6 on page 228) is given by

\[ \left(\sin \frac{\pi z}{2}\right) \cdot \Gamma(1 - z) = z \int_0^{\infty} y^{-z-1} \sin y\, dy, \tag{5.33} \]

a formula known since 1886 (see [105]), and

\[ (-z)\Gamma(-z) = \Gamma(1 - z) \tag{5.34} \]

(see Exercise 5.23 on page 227).

Now we are ready for the functional equation.


Biography 5.5 Andrew John Wiles (1953–) was born on April 11, 1953 in Cambridge, England. When he was merely ten years old he had an interest in FLT. In 1971, he entered Merton College, Oxford and achieved his B.A. in 1974, after which he entered Clare College, Cambridge and studied under John Coates, obtaining his doctorate in 1980. However, he did not work on FLT at that time. In 1981, he took a position at the Institute for Advanced Study at Princeton, and he was appointed professor at Princeton in 1982. Wiles learned, in the mid 1980s, that the works of G. Frey and K. Ribet established that FLT would follow from the Shimura–Taniyama conjecture, namely that every elliptic curve defined over the rational numbers is modular. Eventually, Wiles proved that all semistable elliptic curves defined over the rational numbers are modular, from which FLT follows. On June 23, 1993, he announced he had a proof of FLT and wrote up the results for publication. However, a subtle error was discovered. Over the next year, with help from R. Taylor, he eventually filled the gap and the proof was published in the Annals of Mathematics in 1995. Wiles commented: “There’s no other problem that will mean the same to me. I had this very rare privilege of being able to pursue in my adult life what had been my childhood dream. I know it’s a rare privilege but I know if one can do this it’s more rewarding than anything one can imagine.” See §10.4 for details.

Theorem 5.15 Riemann’s Functional Equation for $\zeta(s)$

For $s \in \mathbb{C}$,

\[ \zeta(s) = 2^s\pi^{s-1}\,\Gamma(1-s)\,\zeta(1-s)\,\sin\left(\frac{\pi s}{2}\right). \]

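Proof. From (5.32), we have to show only that

\[ -s(s+1)(s+2)\int_0^\infty x^{-s-3}\sin x\,dx = \sin\left(\frac{\pi s}{2}\right)\Gamma(1-s). \]

To this end, we employ (5.33)–(5.34) as follows, using (5.33) with $z = s+2$ and $\sin(\pi(s+2)/2) = -\sin(\pi s/2)$ in the first step,

\begin{align*}
-s(s+1)(s+2)\int_0^\infty x^{-s-3}\sin x\,dx &= s(s+1)\sin\left(\frac{\pi s}{2}\right)\Gamma(1-(s+2)) \\
&= s(s+1)\sin\left(\frac{\pi s}{2}\right)\Gamma(-1-s) = -s\sin\left(\frac{\pi s}{2}\right)\bigl(-(s+1)\bigr)\Gamma(-(s+1)) \\
&= -s\sin\left(\frac{\pi s}{2}\right)\Gamma(-s) = \sin\left(\frac{\pi s}{2}\right)\bigl(-s\Gamma(-s)\bigr) = \sin\left(\frac{\pi s}{2}\right)\Gamma(1-s),
\end{align*}

and we have our functional equation. □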

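Note that the standard form for the functional equation is given by

\[ \pi^{-s/2}\,\Gamma\left(\frac{s}{2}\right)\zeta(s) = \pi^{-(1-s)/2}\,\Gamma\left(\frac{1-s}{2}\right)\zeta(1-s), \]

which can be derived from the form in Theorem 5.15 via Legendre’s duplication formula given by

\[ \Gamma(2z) = (2\pi)^{-1/2}\,2^{2z-1/2}\,\Gamma(z)\,\Gamma\left(z+\frac{1}{2}\right). \tag{5.35} \]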
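The two gamma-function identities used here, (5.34) and (5.35), can be checked numerically for real arguments. The following is only a floating-point sanity check under that restriction, not a proof; the function names are illustrative and rely solely on `math.gamma` from the Python standard library.

```python
import math

def duplication_rhs(z):
    """Right-hand side of Legendre's duplication formula (5.35) for real z > 0."""
    return ((2 * math.pi) ** -0.5) * 2 ** (2 * z - 0.5) * math.gamma(z) * math.gamma(z + 0.5)

def recurrence_gap(z):
    """Difference of the two sides of (5.34), (-z)Gamma(-z) = Gamma(1 - z).

    Assumes z is real and not a nonnegative integer, so Gamma(-z) is finite.
    """
    return (-z) * math.gamma(-z) - math.gamma(1 - z)

for z in (0.3, 1.0, 2.5, 4.75):
    # The two printed values should agree up to rounding error.
    print(z, math.gamma(2 * z), duplication_rhs(z))
```

Agreement to many digits is consistent with the formulas but says nothing about complex arguments, which are what the functional equation actually needs.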

Remark 5.10 The functional equation is valid for all complex numbers $s$ where both sides are defined. We know that $\zeta(s)$ has no zeros for $\Re(s) \ge 1$ and has only trivial zeros for $\Re(s) \le 0$, which correspond to poles of $\Gamma(s/2)$, and has infinitely many zeros on the critical strip $0 < \Re(s) < 1$. We may define a related function, which shows symmetry properties more readily than does Theorem 5.15. If we define

\[ \xi(s) = \pi^{-s/2}\,\Gamma(s/2)\,\zeta(s), \]

then by Exercise 5.24 on the facing page,

\[ \xi(s) = \pi^{-(1-s)/2}\,\Gamma\left(\frac{1-s}{2}\right)\zeta(1-s) = \xi(1-s) \tag{5.36} \]

showing that $\xi(s)$ is symmetric about the critical line $\Re(s) = 1/2$. Note that $s(s-1)\xi(s)$ is analytic on the whole plane (such functions are called entire), since the factor $s-1$ eliminates the pole of $\zeta(s)$ at $s = 1$ and the factor $s$ eliminates the pole of $\Gamma(s/2)$ at $s = 0$. (Often $\xi(s)$ is called the completed zeta function.) The functional equation given in Theorem 5.15 shows that if $s$ is a zero in the critical strip, then so is $1-s$, and by Theorem 5.10 on page 219, zeros occur in complex conjugate pairs. So if the Riemann hypothesis were (incredibly) false, then zeros in the critical strip that are not on the critical line would occur in four-tuples corresponding to vertices of rectangles in the complex plane.

The zeros of the $\zeta$-function are intimately connected with the distribution of primes. If $U$ denotes the upper bound of the real parts of the zeros of $\zeta(s)$, with $1/2 \le U \le 1$, then $|\pi(x) - \mathrm{li}(x)| \le c\,x^U\log_e x$ for a constant $c \in \mathbb{R}^+$. The Riemann hypothesis is tantamount to $U = 1/2$.

Furthermore, as discussed on page 178 in reference to transcendence, the values of $\zeta(2n+1)$ are largely a mystery, with one notable exception, Apéry’s constant

\[ \zeta(3) = \frac{5}{2}\sum_{j=1}^{\infty}\frac{(-1)^{j-1}}{j^3\binom{2j}{j}} \notin \mathbb{Q}. \]
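As a rough numerical illustration of the series above, the following sketch compares its partial sums with a direct truncation of $\sum 1/n^3$. It uses floating-point arithmetic only, and the function names are illustrative, not taken from the text.

```python
from math import comb

def apery_series(terms):
    """Partial sum of (5/2) * sum_{j>=1} (-1)^(j-1) / (j^3 * C(2j, j))."""
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) / (j ** 3 * comb(2 * j, j))
    return 2.5 * total

def zeta3_direct(terms):
    """Partial sum of the defining series sum_{n>=1} 1/n^3."""
    return sum(1.0 / n ** 3 for n in range(1, terms + 1))

print(apery_series(30))        # converges quickly: terms shrink roughly like 4^(-j)
print(zeta3_direct(100000))    # converges slowly: the tail is about 1/(2 N^2)
```

The contrast in convergence speed is the practical appeal of the binomial series; the irrationality statement itself is of course not something a numerical check can address.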

Exercises

5.12. Suppose that $f(n)$ is a multiplicative arithmetic function and the series $S = \sum_{n=1}^{\infty} f(n)$ is absolutely convergent. Prove that the product

\[ P = \prod_{p\ \mathrm{prime}}\left(\sum_{j=0}^{\infty} f(p^j)\right) \]

is also absolutely convergent and $S = P$. (Recall that an arithmetic function $f : \mathbb{N} \to \mathbb{C}$ is multiplicative if $f(ab) = f(a)f(b)$ when $\gcd(a,b) = 1$.)

(Hint: Set $S_N = \{n \in \mathbb{N} : n = \prod_{j=1}^{N} p_j^{a_j} \text{ where } a_j \ge 0\}$, and reformulate $S$ in terms of $S_N$. Then look at the limit as $N$ goes to infinity of $|S - P|$.)

5.13. Suppose that $f(n)$ is a completely multiplicative arithmetic function and the series $S$ in Exercise 5.12 is absolutely convergent. Prove that

\[ S = \prod_{p\ \mathrm{prime}}\frac{1}{1 - f(p)}. \]

(Recall that a completely multiplicative function $f$ is one for which $f(ab) = f(a)f(b)$ for any $a, b \in \mathbb{N}$.) (Hint: Prove that $|f(p)| < 1$ for all primes $p$.)

5.14. If $f(n)$ is a multiplicative function, $s \in \mathbb{C}$ and the following series is absolutely convergent,

\[ \sum_{n=1}^{\infty} f(n)n^{-s}, \tag{5.37} \]

then

\[ \sum_{n=1}^{\infty} f(n)n^{-s} = \prod_{p\ \mathrm{prime}}\sum_{j=0}^{\infty} f(p^j)p^{-js}. \]

(Hint: Use Exercise 5.12.) (The series in (5.37) is called a Dirichlet series, a special case of which is our Riemann $\zeta$-function given in (5.26) on page 218.)

5.15. Use Theorem 5.10 on page 219 to evaluate $\zeta(-k)$ for any $k \in \mathbb{N}$. (Hint: Use Exercise 5.5 on page 206 once a formulation from Theorem 5.10 is obtained.)

5.16. Prove that for any $N \in \mathbb{N}$, $\zeta(1-2N) = -B_{2N}/(2N)$. (Hint: Use Exercise 5.15.)

5.17. Prove that $\zeta(-2N) = 0$ for any $N \in \mathbb{N}$, called the trivial zeros or real zeros of the Riemann $\zeta$-function. (Hint: Use Exercise 5.15.)

5.18. Prove that $\lim_{s\to 1}(s-1)\zeta(s) = 1$.

5.19. Prove that $\zeta(0) = -1/2$.

5.20. Prove that (5.29) on page 221 holds.

5.21. Prove that (5.31) on page 223 holds.

5.22. Prove that $\Gamma(s) = (s-1)\Gamma(s-1)$. (Hint: Use integration by parts on Definition 5.6 on page 224 for a real argument.)

5.23. Establish formula (5.34) on page 224. (Hint: Use Exercise 5.22.)

5.24. Prove the formula for $\xi(s)$ given in (5.36), displayed on page 226. (Hint: You may use the formula $\Gamma(z)\Gamma(1-z) = \pi/\sin \pi z$ (see [101, Formula (25), page 697]), as well as (5.35) on page 226.)

5.25. Prove that for any $n \in \mathbb{N}$, $\Gamma(n) = (n-1)!$. (Hint: Use Exercise 5.22.)

Biography 5.6 Paul Wolfskehl (1856–1906) was born on June 30, 1856 in Darmstadt, Germany. He achieved a doctorate in medicine around 1880. (It is difficult to be accurate since documentation of some parts of his life does not exist.) However, he suffered from multiple sclerosis (MS) and decided to leave medicine for the more solitary study of mathematics. In 1881, in Berlin, he began his mathematical journey. He was deeply influenced by the lectures of Kummer in 1883–84, and largely due to that connection, he decided to study number theory. Indeed, Wolfskehl himself gave lectures in number theory at the Institute of Technology in Darmstadt starting in 1887. However, his MS worsened and he was completely paralyzed by 1890, giving up his lectures there. In January of 1905, he added to his will that “whosoever first succeeds in proving the great Theorem of Fermat” would receive 100,000 marks. He entrusted the Royal Society of Science in Göttingen with the money and with the task of judging and awarding the prize. This speaks to the influence that Kummer’s work must have had on him in Kummer’s failed attempts to prove Fermat’s last theorem (FLT). Wolfskehl died on September 13, 1906. On June 27, 1908, the Göttingen Royal Society of Science published their conditions for awarding the prize. Ironically, exactly eighty-nine years later, to the day, the prize was awarded to Andrew Wiles for his solution of FLT on June 27, 1997, a total of DM 75,000. The value had decreased due to the hyperinflation of the Weimar Republic in the early 1920s. See [68, Biography 1.10, p. 38] for background on FLT and the life of Fermat.

Chapter 6

Introduction to p-Adic Analysis

The Analytical Engine weaves algebraic patterns just as the Jacquard loom$^a$ weaves flowers and leaves.

from Luigi Menabrea’s Sketch of the Analytical Engine invented by Charles Babbage (1843), translated and annotated by Ada Lovelace (1815–1852), English mathematician, and daughter of Lord Byron

$^a$ Joseph Marie Jacquard was a silk-weaver, who invented an improved textile loom in 1801. Jacquard’s loom used interchangeable punched cards that controlled the weaving of the cloth so that any desired pattern could be obtained automatically to produce beautiful patterns in a style previously accomplished only with very hard manual labour. These punched cards were adopted by the pioneer English inventor Charles Babbage as an input-output medium for his proposed “analytical engine.” They were eventually used as a means of inputting data into digital computers but were later replaced by electronic devices.

6.1 Solving Modulo $p^n$

The topic of this chapter is due to Hensel. The theory of p-adic numbers is rich with numerous applications, not only to number theory, but also to algebra in general, as well as to algebraic functions and algebraic geometry. This section is devoted to motivating the definitions of the theory by starting with elementary congruential arithmetic (see [68, Chapter 2]). In particular, we look at integral polynomial congruences

\[ f(x) \equiv 0 \pmod{p^k}, \text{ for } k \in \mathbb{N} \tag{6.1} \]

for a prime $p$. The goal is to begin with $k = 1$ and build upon solutions of (6.1) for successively higher powers of $p$, then show how this translates into a


power series in $p$ that will be the foundation for the theory. In order to do this, we call upon the pioneering work of Hensel, and remind the reader of his fundamental result presented in an introductory course in number theory, such as [68, Theorem 2.24, p. 115].

Lemma 6.1 Hensel’s Lemma

Let $f(x)$ be an integral polynomial, $p$ a prime, and $k \in \mathbb{N}$. Suppose that $r_1, r_2, \ldots, r_m$ for some $m \in \mathbb{N}$ are all of the incongruent solutions of $f(x) \equiv 0$ modulo $p^k$, where $0 \le r_i < p^k$ for each $i = 1, 2, \ldots, m$. If $a \in \mathbb{Z}$ is such that

\[ f(a) \equiv 0 \pmod{p^{k+1}} \text{ with } 0 \le a < p^{k+1}, \tag{6.2} \]

then there exists $q \in \mathbb{Z}$ such that

(a) for some $i \in \{1, 2, \ldots, m\}$, $a = qp^k + r_i$ with $0 \le q < p$, and

(b) $f(r_i) + qf'(r_i)p^k \equiv 0 \pmod{p^{k+1}}$.

Additionally, if $f'(r_i) \not\equiv 0 \pmod p$, then

\[ f(qp^k + r_i) \equiv 0 \pmod{p^{k+1}} \tag{6.3} \]

has a unique solution for the value of $q$ given by

\[ q \equiv -\frac{f(r_i)}{p^k}\,(f'(r_i))^{-1} \pmod p, \tag{6.4} \]

with $(f'(r_i))^{-1}$ being a multiplicative inverse of $f'(r_i)$ modulo $p$.

If $f'(r_i) \equiv 0 \pmod p$ and $f(r_i) \equiv 0 \pmod{p^{k+1}}$, then all values of $q = 0, 1, 2, \ldots, p-1$ yield incongruent solutions to (6.3).

If $f'(r_i) \equiv 0 \pmod p$ and $f(r_i) \not\equiv 0 \pmod{p^{k+1}}$, then $f(x) \equiv 0 \pmod{p^{k+1}}$ has no solutions of the form $qp^k + r_i$.

Remark 6.1 Note that in Hensel’s Lemma for $k = m$,

\[ q \equiv -\frac{f(r_{m-1})}{p}\,(f'(r_{m-1}))^{-1} \pmod p \]

uniquely determines $q$ and

\[ x = r_1 + r_2p + r_3p^2 + \cdots + r_mp^{m-1} \tag{6.5} \]

is a solution of (6.1) for $k = m$.


Example 6.1 Consider $f(x) = x^3 + 5x^2 + 1$ and solve $f(x) \equiv 0 \pmod{7^3}$. By inspection, we see that $x = 1 = r_1$ is a solution of $f(x) \equiv 0 \pmod 7$. Also, we observe that $f'(1) = 13 \equiv -1 \pmod 7$, so we set $r_2 = 1 + 7q$, where $k = 1$ and $q$ is uniquely determined by

\[ q \equiv -\frac{f(r_1)}{p^k}\,(f'(r_1))^{-1} \equiv -1\cdot(13)^{-1} \equiv -6 \equiv 1 \pmod 7. \]

Thus, $r_2 = 8$, and we set $r_3 = 8 + 7^2q$, where $k = 2$ and $q$ is uniquely determined by

\[ q \equiv -\frac{f(r_2)}{p^k}\,(f'(r_2))^{-1} \equiv -3\cdot 6 \equiv 3 \pmod 7, \]

so $r_3 \equiv 155 \pmod{7^3}$, which is the solution we sought to find. Moreover, since $m = 1$ this is the only such solution.
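The lifting procedure of Hensel's Lemma, as used in Example 6.1, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: polynomials are given as integer coefficient lists with the highest degree first, the roots modulo $p$ are found by brute force, and the names `poly_eval`, `poly_derivative`, and `hensel_roots` are hypothetical helpers, not notation from the text. It needs Python 3.8 or later for `pow(x, -1, p)`.

```python
def poly_eval(coeffs, x, mod):
    """Evaluate a polynomial (highest-degree coefficient first) at x modulo mod."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % mod
    return acc

def poly_derivative(coeffs):
    """Coefficient list of the formal derivative."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def hensel_roots(coeffs, p, k):
    """All roots of f(x) = 0 (mod p**k) with 0 <= x < p**k, lifted one power at a time."""
    dcoeffs = poly_derivative(coeffs)
    roots = [r for r in range(p) if poly_eval(coeffs, r, p) == 0]
    pk = p
    for _ in range(k - 1):
        lifted = []
        for r in roots:
            fr = poly_eval(coeffs, r, pk * p)      # f(r) mod p^(k+1), a multiple of p^k
            dfr = poly_eval(dcoeffs, r, p)         # f'(r) mod p
            if dfr != 0:
                # Unique lift, formula (6.4).
                q = (-(fr // pk) * pow(dfr, -1, p)) % p
                lifted.append(r + q * pk)
            elif fr == 0:
                # f'(r) = 0 and f(r) = 0 (mod p^(k+1)): every q works.
                lifted.extend(r + q * pk for q in range(p))
            # Otherwise r does not lift at all.
        roots = lifted
        pk *= p
    return sorted(roots)

print(hensel_roots([1, 5, 0, 1], 7, 3))   # x^3 + 5x^2 + 1 modulo 7^3
```

For the polynomial of Example 6.1 the intermediate roots are 1, 8, and 155, matching $r_1$, $r_2$, $r_3$ above.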

We need not stop the process illustrated in Example 6.1, since it may be continued indefinitely to obtain a power series in $p$,

\[ x = \sum_{j=0}^{\infty} r_{j+1}p^j = r_1 + r_2p + \cdots + r_jp^{j-1} + \cdots, \text{ where } 0 \le r_j < p, \tag{6.6} \]

which we call a p-adic solution to $f(x) \equiv 0 \pmod p$. The power series solutions may be approximated to higher degrees of accuracy as we solve modulo $p^k$ for successively higher powers $k$. The values in (6.6) are known, formally, as p-adic numbers. However, in the most general sense of the term, we allow for a finite number of negative powers of $p$. Therefore, a p-adic number is formally an expression of the form

\[ r_{-m}p^{-m-1} + \cdots + r_1 + r_2p + \cdots + r_np^{n-1} + \cdots \text{ for } m, n \in \mathbb{N}. \tag{6.7} \]

The reader will note that this is akin to binary expansions of $\alpha \in \mathbb{R}$ given by

\[ a_{-m}2^{-m-1} + \cdots + a_02^{-1} + a_1 + \cdots + a_n2^{n-1} + \cdots \text{ where } 0 \le a_j < 2, \]

or decimal expansions or expansions to any base $b > 1$, given the base representation theorem (see [68, Theorem 1.5, p. 8]). Indeed, if $q \in \mathbb{Q}$, then the p-adic representation of $q$ is the p-adic solution of $x = q$, and if $q = z \in \mathbb{N}$ this is just the representation of $z$ in base $p$. Thus, the methodology for finding a p-adic representation of $z$ is to divide it by $p$ and set $q_0 = \lfloor z/p \rfloor$, and write $z = q_0p + z_0$ where $0 \le z_0 < p$. Then, divide $q_0$ by $p$ and write $q_1 = \lfloor q_0/p \rfloor$, so $q_0 = q_1p + z_1$ with $0 \le z_1 < p$, and $z = z_0 + z_1p + q_1p^2$. Continuing in this fashion, we get the unique p-adic representation of $z$, namely

\[ z = z_0 + z_1p + z_2p^2 + \cdots + z_\ell p^\ell. \]
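The repeated-division procedure just described is ordinary base-$p$ conversion. A minimal sketch follows; the function name `base_p_digits` is illustrative.

```python
def base_p_digits(z, p):
    """Digits z_0, z_1, ..., z_l of a nonnegative integer z in base p, least significant first."""
    if z < 0 or p < 2:
        raise ValueError("expects z >= 0 and p >= 2")
    digits = []
    while True:
        z, remainder = divmod(z, p)   # z becomes the quotient floor(z / p)
        digits.append(remainder)
        if z == 0:
            return digits

print(base_p_digits(155, 7))
```

Applied to the root 155 found in Example 6.1 with $p = 7$, the digits are the successive values $1$, $q = 1$, $q = 3$ computed there, since $155 = 1 + 1\cdot 7 + 3\cdot 7^2$.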

Note that the addition and subtraction of p-adic numbers is just obtained in the usual way by increasing the next coefficient by 1 if a given coefficient is at least $p$, such as $(5+3)5^2 = 3\cdot 5^2 + 1\cdot 5^3$. Similarly, when subtracting, if a given coefficient becomes negative, we “borrow” from the next term, such as $(2-4)3^2 + (2-1)3^3 = (2-4+3)3^2 + (2-2)3^3 = 3^2$. Also, multiplication of p-adic numbers is just formal multiplication of power series, allowing for shifting terms to ensure all coefficients are nonnegative and less than $p$.

Example 6.2 A 7-adic solution of $3x = 5$ is given by

\[ 4 + 2\cdot 7 + 2\cdot 7^2 + 2\cdot 7^3 + 2\cdot 7^4 + \cdots. \]

Looking at a power series expansion of $3x$, we have

\begin{align*}
3(4 + 2\cdot 7 + 2\cdot 7^2 + 2\cdot 7^3 + 2\cdot 7^4 + \cdots) &= 12 + 6\cdot 7 + 6\cdot 7^2 + 6\cdot 7^3 + 6\cdot 7^4 + \cdots \\
&= 5 + 7 + 6\cdot 7 + 6\cdot 7^2 + 6\cdot 7^3 + 6\cdot 7^4 + \cdots \\
&= 5 + 0\cdot 7 + 7\cdot 7^2 + 6\cdot 7^3 + 6\cdot 7^4 + \cdots \\
&= 5 + 0\cdot 7 + 0\cdot 7^2 + 7\cdot 7^3 + 6\cdot 7^4 + \cdots \\
&= 5 + 0\cdot 7 + 0\cdot 7^2 + 0\cdot 7^3 + 7\cdot 7^4 + \cdots \\
&= 5 + 0\cdot 7 + 0\cdot 7^2 + 0\cdot 7^3 + 0\cdot 7^4 + \cdots = 5.
\end{align*}
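The digits in Example 6.2 can be generated mechanically: solve $bx \equiv a \pmod p$ for the next digit, subtract, and divide by $p$. The sketch below assumes $p \nmid b$ and uses the hypothetical helper name `padic_digits`; it requires Python 3.8 or later for the modular inverse via `pow`.

```python
def padic_digits(a, b, p, count):
    """First `count` p-adic digits of the rational a/b, assuming p does not divide b."""
    if b % p == 0:
        raise ValueError("this sketch assumes p does not divide the denominator")
    b_inv = pow(b, -1, p)              # inverse of b modulo p
    digits = []
    for _ in range(count):
        d = (a * b_inv) % p            # next digit: the solution of b*d = a (mod p)
        digits.append(d)
        a = (a - d * b) // p           # exact division, since a - d*b is a multiple of p
    return digits

print(padic_digits(5, 3, 7, 6))        # digits of 5/3 in base 7
```

For $5/3$ with $p = 7$ the running numerator becomes $-1$ after the first step and stays there, which is why the digit 2 repeats.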

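Example 6.3 A 5-adic solution to $x^2 = 11$ is

\[ 1 + 5 + 2\cdot 5^2 + 2\cdot 5^5 + 3\cdot 5^7 + 3\cdot 5^8 + 2\cdot 5^9 + 5^{11} + 4\cdot 5^{12} + 3\cdot 5^{13} + 2\cdot 5^{14} + 3\cdot 5^{15} + 3\cdot 5^{16} + \cdots, \]

which corresponds to the positive square root. Another 5-adic solution corresponding to the negative square root is given by

\[ 4 + 3\cdot 5 + 2\cdot 5^2 + 4\cdot 5^3 + 4\cdot 5^4 + 2\cdot 5^5 + 4\cdot 5^6 + 5^7 + 5^8 + 2\cdot 5^9 + 4\cdot 5^{10} + 3\cdot 5^{11} + 5^{13} + 2\cdot 5^{14} + \cdots. \]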
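Expansions such as those in Example 6.3 can be produced digit by digit by applying formula (6.4) to $f(x) = x^2 - n$, for which $f'(x) = 2x$. The following is a sketch under the assumptions that $p$ is an odd prime not dividing $n$ and that a square root of $n$ modulo $p$ is supplied as the first digit; `padic_sqrt_digits` is an illustrative name.

```python
def padic_sqrt_digits(n, p, first_digit, count):
    """Digits of a p-adic square root of n with the given leading digit.

    Assumes p is an odd prime, p does not divide n, and first_digit**2 = n (mod p).
    Returns the digit list and the integer approximation x with x*x = n (mod p**count).
    """
    if (first_digit * first_digit - n) % p != 0:
        raise ValueError("first_digit is not a square root of n modulo p")
    inv = pow(2 * first_digit, -1, p)            # inverse of f'(x) = 2x modulo p
    x, pk = first_digit, p
    digits = [first_digit]
    for _ in range(count - 1):
        q = (-((x * x - n) // pk) * inv) % p     # formula (6.4) with f(x) = x^2 - n
        x += q * pk
        pk *= p
        digits.append(q)
    return digits, x

digits, x = padic_sqrt_digits(11, 5, 1, 10)
print(digits)
print((x * x - 11) % 5 ** 10)                    # 0 when x is a root modulo 5^10
```

Starting instead from the first digit 4 gives the other root, whose digits are those of the negative of the first.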

Exercises

6.1. Find all solutions of $x^3 + 3x^2 + 12 \equiv 0 \pmod{7^2}$.

6.2. Find all solutions of $x^3 + 4x + 1 \equiv 0 \pmod{5^3}$.

6.3. Find all solutions of $x^3 + x^2 + x + 1 \equiv 0 \pmod{13^3}$.

6.4. Find all solutions of $x^3 + x^2 - 11 \equiv 0 \pmod{17^3}$.

6.5. Find the 7-adic solution to $2x = 3$.

6.6. Find the 11-adic solution to $2x = 3$.

6.7. Find the 5-adic solution to $x^3 + 4x + 1 = 0$.

6.8. Find the 13-adic solution to $x^3 + x^2 + x + 1 = 0$.


6.2 Introduction to Valuations

Eureka! [I’ve got it!]

from Preface, Section 10 of Vitruvius Pollio De Architectura book 9

Archimedes (c. 287–212 B.C.)

Greek mathematician and philosopher

In this section, we address the problem of convergence of the power series we considered in §6.1. Indeed if we look at Example 6.2 on the preceding page, then we see that we are getting higher and higher powers of 7 as we zero out the previous terms. Thus, we need a formal definition of the notion.

Definition 6.1 Valuations Over Q

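If $\varphi$ is a function mapping $\mathbb{Q}$ to $\mathbb{Q}$, satisfying the following conditions,

(a) $\varphi(x) \ge 0$ with equality if and only if $x = 0$,

(b) $\varphi(xy) = \varphi(x)\varphi(y)$ for any $x, y \in \mathbb{Q}$,

(c) $\varphi(x+y) \le \varphi(x) + \varphi(y)$ for any $x, y \in \mathbb{Q}$,

then $\varphi$ is called a valuation on $\mathbb{Q}$.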

Two important types of valuations are isolated as follows.

Definition 6.2 Absolute Value on a Field

An absolute value on a field $F$ is a function $|\cdot| : F \to \mathbb{R}$ satisfying each of the following.

(a) $|x| \ge 0$ for all $x \in F$ and $|x| = 0$ if and only if $x = 0$.

(b) $|x\cdot y| = |x|\cdot|y|$ for all $x, y \in F$.

(c) $|x+y| \le |x| + |y|$ for all $x, y \in F$. (Triangle inequality)

If the triangle inequality can be replaced by the condition

\[ |x+y| \le \max\{|x|, |y|\} \text{ for all } x, y \in F, \tag{6.8} \]

then the absolute value is said to be non-Archimedean, and otherwise it is called Archimedean.

Definition 6.3 p-Adic Absolute Value and Valuations

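Let $x \in \mathbb{Q}$ be nonzero, and set

\[ x = \pm\prod_{p\ \mathrm{prime}} p^{\nu_p(x)}, \text{ where } \nu_p(x) \in \mathbb{Z}, \]

then for a fixed prime $p$, observing that there are only finitely many of the $\nu_p(x)$ that are not zero, there exist nonzero integers $a, b$ such that

\[ x = \frac{a}{b}\cdot p^{\nu_p(x)} \text{ with } ab \not\equiv 0 \pmod p. \tag{6.9} \]

Then the p-adic absolute value on $\mathbb{Q}$ is given by

\[ |x|_p = \begin{cases} p^{-\nu_p(x)} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases} \]

The function that maps $x \mapsto \nu_p(x)$ is called a p-adic valuation.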
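Definition 6.3 translates directly into exact rational arithmetic. The sketch below uses `fractions.Fraction` from the Python standard library; the names `nu_p` and `abs_p` are illustrative choices mirroring $\nu_p$ and $|\cdot|_p$, and $p$ is assumed to be a prime supplied by the caller.

```python
from fractions import Fraction

def nu_p(x, p):
    """p-adic valuation of a nonzero rational x: the exponent of p in its factorization."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("nu_p(0) is left undefined in this sketch")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:      # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:      # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v

def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-nu_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-nu_p(x, p))

print(nu_p(Fraction(50, 3), 5), abs_p(Fraction(50, 3), 5))
print(nu_p(Fraction(3, 50), 5), abs_p(Fraction(3, 50), 5))
```

Note how a number divisible by a high power of $p$ is p-adically small, while a denominator divisible by $p$ makes it large.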

Example 6.4 If we define $\varphi(x) = 1$ if $x \ne 0$ and $\varphi(0) = 0$, this is known as the unitary, identical, or trivial absolute value, which is non-Archimedean. See Exercise 6.10 on page 238 for some more elementary properties of valuations.

Example 6.5 The ordinary absolute value is given by

\[ |x|_\infty = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0, \end{cases} \]

where the symbol $|\cdot|_\infty$ is used in the context of “p-adic numbers” which we define below, where we typically allow $p = \infty$ to denote the ordinary absolute value in what follows.

Remark 6.2 Given a fixed prime $p$, any nonzero rational number $x$ may be uniquely written in the form (6.9). By Exercise 6.9, the p-adic absolute value $|\cdot|_p$ is indeed an absolute value in the sense of Definition 6.2. Hensel’s idea was to ensure that the number $x$ has small p-adic absolute value precisely when $x$ is divisible by a large power of $p$, so the magnitude of $x$ has no effect in this context. The p-adic absolute value gives us an arithmetical notion of “distance.” Two rationals are close together under the p-adic absolute value if the numerator of their difference has a large power of $p$ as a factor. Indeed, if we look only at integers, then the following holds. If $p$ is a prime and $z, w \in \mathbb{Z}$, then

\[ |z - w|_p \le 1/p^n \text{ if and only if } z \equiv w \pmod{p^n} \]

for any nonnegative integer $n$ (see Exercise 6.18).

Definition 6.4 Cauchy Sequences

Let $|\cdot|_p$ be a p-adic absolute value on $\mathbb{Q}$ for $p \le \infty$. Then a sequence of rational numbers $\{q_j\}_{j=1}^{\infty}$ is called a Cauchy sequence (relative to $|\cdot|_p$, also called a p-adic Cauchy sequence) if for every rational $\varepsilon > 0$, there exists an integer $n = n(\varepsilon)$ such that

\[ |q_j - q_k|_p < \varepsilon \text{ for all } j, k > n. \]

A Cauchy sequence is called a null sequence if $\lim_{j\to\infty}|q_j|_p = 0$. Two Cauchy sequences $\{q_j\}_{j=1}^{\infty}$ and $\{q'_j\}_{j=1}^{\infty}$ are said to be equivalent if they differ by a null sequence, namely if

\[ \lim_{j\to\infty}|q_j - q'_j|_p = 0, \]

and we denote equivalence of sequences by

\[ \{q_j\} \sim \{q'_j\}, \]

using the notation $\{q_j\}$ for the class containing $\{q_j\}$.

Definition 6.4 tells us that a sequence is Cauchy if the terms get arbitrarily close to each other with respect to the p-adic absolute value.
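To make this concrete, the partial sums of the 7-adic expansion from Example 6.2 can be examined with exact arithmetic. The sketch below is illustrative only: `abs_p` and `partial_sums` are hypothetical helper names, and the digit list is truncated to a handful of terms. The integers themselves grow without bound in the ordinary sense, yet their 7-adic distance to $5/3$ shrinks by a factor of 7 at every step.

```python
from fractions import Fraction

def abs_p(x, p):
    """p-adic absolute value of a rational x (Definition 6.3), computed exactly."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

def partial_sums(digits, p):
    """Partial sums d_0, d_0 + d_1 p, d_0 + d_1 p + d_2 p^2, ... of a digit list."""
    total, power, sums = 0, 1, []
    for d in digits:
        total += d * power
        power *= p
        sums.append(total)
    return sums

sums = partial_sums([4] + [2] * 7, 7)          # 4 + 2*7 + 2*7^2 + ...
for j, x in enumerate(sums, start=1):
    print(j, x, abs_p(Fraction(5, 3) - x, 7))  # the last column is 7^(-j)
```

The same helper shows that two partial sums agreeing in their first $j$ digits differ by at most $7^{-j}$ in the 7-adic sense, which is the Cauchy condition for this sequence.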

By Exercises 6.12–6.13 on page 238, Cauchy sequences are partitioned into equivalence classes since $\{q_j\} \sim \{q'_j\}$ is an equivalence relation. Let

\[ \mathbb{Q}_p = \{\{q_j\} : q_j \in \mathbb{Q} \text{ and } \{q_j\} \text{ is a Cauchy sequence}\}. \]

If $p = \infty$, then an equivalence class $\{q_j\}$ is called a real number, and if $p < \infty$, then it is called a p-adic number. (In fact Cantor employed Cauchy sequences to provide a constructive definition of $\mathbb{R}$ without using Dedekind cuts, which are more difficult to manipulate than Cauchy sequences.) Also, by the aforementioned exercises,

\[ \{q_j\}\cdot\{q'_j\} = \{q_j\cdot q'_j\} \]

and

\[ \{q_j\} \pm \{q'_j\} = \{q_j \pm q'_j\} \]

are well defined. This makes the classes of Cauchy sequences into a commutative ring with identity. Here, the class of the null sequence is the zero element, and the constant sequence $q_j = 1$ for all $j \in \mathbb{N}$ provides the unity element. It follows that when $\{q'_j\} \ne \{0\}$, then

\[ \{q_j\}\cdot\{q'_j\}^{-1} = \{q_j\cdot(q'_j)^{-1}\}, \]

so the classes, excluding the null sequence class, form a multiplicative commutative group. Hence, $\mathbb{Q}_p$ is a field, called the field of p-adic numbers. When $p = \infty$, then $\mathbb{Q}_p = \mathbb{R}$. The p-adic fields $\mathbb{Q}_p$ are known as completions of $\mathbb{Q}$ with respect to the p-adic valuation. This larger field contains $\mathbb{Q}$.

We define

\[ \lim_{j\to\infty}^{(p)}\{q_j\} = \{q_j\}, \text{ and } \lim_{j\to\infty}^{(p)}|q_j|_p = |\{q_j\}|_p \tag{6.10} \]

and say the sequence $\{q_j\}_{j=1}^{\infty}$ converges p-adically. The p-adic field is complete in the sense that all Cauchy sequences converge to a p-adic number. Observe that $\mathbb{Q}$ is not complete with respect to the p-adic valuation because Cauchy sequences may have irrational limit points. For instance, when $p = \infty$, the well-known sequence $q_j = F_j/F_{j-1}$ where $F_j$ is the j-th Fibonacci number converges to the golden ratio

\[ g = (1+\sqrt{5})/2, \]

which is clearly irrational (see [68, p. 4, ff.]). Note as well that exponential and trigonometric functions such as $f(x) = e^x$ and $g(x) = \sin x$ are known to be irrational for any nonzero rational value of $x$, but may be defined as the limit of a Cauchy sequence via Maclaurin series (see §5.1). In other words, there are “holes” in $\mathbb{Q}$, missing some points to which Cauchy sequences converge in $\mathbb{R}$. We filled those holes by completing $\mathbb{Q}$ to the fields $\mathbb{Q}_p$ for each $p \le \infty$, a much larger field. In the case of $p = \infty$, we get $\mathbb{Q}_p = \mathbb{R}$, so we can build the real numbers by using the rationals and the notion of distance in the reals provided by the usual absolute value function. The notion of distance provided by p-adic valuations is also an absolute value, as noted in Remark 6.2 on page 234. When $\nu_p(x) \ge 0$, then $x$ is called a p-adic integer, and the set $\mathcal{O}_p$ of all p-adic integers is easily checked to be an integral domain whose units are the p-adic integers with $|x|_p = 1$, that is, with $\nu_p(x) = 0$, and $\mathbb{Q}_p$ is the quotient field of $\mathcal{O}_p$. For $p < \infty$, $\mathbb{Q}_p$ is the non-Archimedean analogue of $\mathbb{R}$.

This is summarized in the following.

Theorem 6.1 The p-Adic Fields and Domains

For any prime $p \le \infty$, $\mathbb{Q}_p$, the field of p-adic numbers, forms a field where $\mathbb{Q}_p = \mathbb{R}$ when $p = \infty$ and each of the p-adic fields, for $p < \infty$, has an isomorphic copy of $\mathbb{Q}$ via the embedding

\[ q \in \mathbb{Q} \mapsto (q, q, q, \ldots) \in \mathbb{Q}_p, \]

(where $(q, q, q, \ldots)$ is a Cauchy sequence). Furthermore, if

\[ \mathcal{O}_p = \{x \in \mathbb{Q}_p : \nu_p(x) \ge 0\} = \{x \in \mathbb{Q}_p : |x|_p \le 1\}, \]

then $\mathcal{O}_p$ is an integral domain and the units in $\mathcal{O}_p$ are those for which $|x|_p = 1$, and $\mathbb{Q}_p$ is the quotient field of $\mathcal{O}_p$.

In order to classify valuations, we need the following concept.

Definition 6.5 Equivalent Valuations

If $\varphi$ and $\varphi'$ are valuations, then we say that $\varphi$ and $\varphi'$ are equivalent provided that for any $x, y \in \mathbb{Q}$,

\[ \varphi(x) < \varphi(y) \text{ if and only if } \varphi'(x) < \varphi'(y). \]


Theorem 6.2 Equivalent Valuations are Powers

A nontrivial valuation $\varphi$ is equivalent to a valuation $\varphi'$ on $\mathbb{Q}$, if and only if there exists a positive real number $r$ such that $\varphi' = \varphi^r$.

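Proof. Since $\varphi$ is nontrivial, there exists a $q_0 \in \mathbb{Q}$ such that $\varphi(q_0) \ne 0, 1$. If $\varphi(q_0) > 1$, then by property (b) of Definition 6.1 on page 233, $\varphi(1/q_0) < 1$. Hence, we may assume without loss of generality that $0 < \varphi(q_0) < 1$. Let $q$ be an arbitrarily chosen nonzero rational number and set

\[ S = \{(m,n) \in \mathbb{N}^2 : \varphi(q_0^m) = \varphi(q_0)^m < \varphi(q)^n = \varphi(q^n)\}, \]

where the equalities in the definition of $S$ above also come from property (b). Thus, if $(m,n) \in S$, then

\[ \frac{m}{n} > \frac{\log_e\varphi(q)}{\log_e\varphi(q_0)}. \]

If $\varphi'$ is equivalent to $\varphi$, then the set $S$ is unchanged when $\varphi$ is replaced by $\varphi'$, so for any nonzero $q \in \mathbb{Q}$,

\[ \frac{\log_e\varphi(q)}{\log_e\varphi(q_0)} = \frac{\log_e\varphi'(q)}{\log_e\varphi'(q_0)}, \]

so there exists a constant $r \in \mathbb{R}^+$, depending solely upon $\varphi$ and $\varphi'$, such that

\[ \frac{\log_e\varphi'(q)}{\log_e\varphi(q)} = \frac{\log_e\varphi'(q_0)}{\log_e\varphi(q_0)} = r > 0. \]

Hence, since we know from elementary calculus that

\[ \frac{\log_e\varphi'(q)}{\log_e\varphi(q)} = \log_{\varphi(q)}(\varphi'(q)), \]

then $\varphi'(q) = \varphi(q)^r$.

Conversely, if $\varphi' = \varphi^r$ for some $r \in \mathbb{R}^+$, then $\varphi'(x) < \varphi'(y)$ if and only if $\varphi^r(x) < \varphi^r(y)$ if and only if $\varphi(x) < \varphi(y)$, which secures the result. □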

Remark 6.3 Exercise 6.19 on the following page tells us that for $p$ a prime, all triangles are p-adically isosceles. This shows the difference between Archimedean and non-Archimedean geometry. We explore this difference in more depth in §6.3.
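The isosceles phenomenon can be observed experimentally on rational points. The following sketch, with the hypothetical helpers `abs_p` and `side_lengths`, computes the three p-adic side lengths of a triangle and sorts them; in every case tried, the two longest sides coincide. This is an illustration on a few sample points, not a substitute for the proof requested in Exercise 6.19.

```python
from fractions import Fraction
from itertools import combinations

def abs_p(x, p):
    """p-adic absolute value of a rational x (Definition 6.3), computed exactly."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

def side_lengths(x, y, z, p):
    """Sorted p-adic side lengths of the triangle with vertices x, y, z."""
    return sorted([abs_p(x - y, p), abs_p(y - z, p), abs_p(x - z, p)])

points = [Fraction(0), Fraction(1), Fraction(5), Fraction(7, 25), Fraction(-3, 2), Fraction(126)]
for x, y, z in combinations(points, 3):
    shortest, middle, longest = side_lengths(x, y, z, 5)
    print(x, y, z, shortest, middle, longest, middle == longest)
```

With the ordinary absolute value the same points give scalene triangles in general, which is the contrast the remark draws.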

Exercises

6.9. Prove that $|\cdot|_p$ given in Definition 6.3 on page 233 is an absolute value in the sense of Definition 6.2, and that the absolute value is non-Archimedean.

6.10. Prove that if $\varphi$ is a valuation on $\mathbb{Q}$, then $\varphi(1) = \varphi(-1) = 1$, $\varphi(-x) = \varphi(x)$ for any $x \in \mathbb{Q}$, and if $n \in \mathbb{N}$ then $\varphi(n) \le n$.

6.11. Prove that all Cauchy sequences are bounded. In other words, if $\{q_j\}$ is a Cauchy sequence, then there exists an $M \in \mathbb{R}^+$ such that $|q_j|_p < M$ for all $j \in \mathbb{N}$.

6.12. Show the sum $\{q_j\} + \{q'_j\} = \{q_j + q'_j\}$ and the product $\{q_j\}\cdot\{q'_j\} = \{q_j\cdot q'_j\}$ of Cauchy sequences is again a Cauchy sequence. (Hint: Use Exercise 6.11.)

6.13. Prove that equivalence of Cauchy sequences, given in Definition 6.4 on page 234, is an equivalence relation, namely that it satisfies the three properties of being reflexive, symmetric, and transitive.

6.14. Prove that every Cauchy sequence is convergent in $\mathbb{R}$. (Recall that a sequence $\{q_j\}$ is convergent in $\mathbb{R}$ if there exists an $L \in \mathbb{R}$ satisfying the property that for any $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that $|q_j - L| < \varepsilon$ for all $j \ge N$.) (Hint: Use Exercise 6.11 and the fact that every bounded sequence has a convergent subsequence, which is the interpretation of the well-known Bolzano–Weierstrass theorem for $\mathbb{R}$.)

6.15. Prove that every sequence that converges in $\mathbb{R}$ is a Cauchy sequence.

6.16. Prove that if $p$ is prime, then when $|x|_p \ne |y|_p$, we have

\[ |x+y|_p = \max\{|x|_p, |y|_p\}. \]

6.17. With reference to Exercise 6.16, provide an example where $|x|_p = |y|_p$ and $|x+y|_p < \max\{|x|_p, |y|_p\}$. (The inequality (6.8) is called the strong triangle inequality.)

6.18. Prove that if $z, w \in \mathbb{Z}$, then

\[ |z - w|_p \le \frac{1}{p^n} \]

if and only if $z \equiv w \pmod{p^n}$ for some integer $n \ge 0$.

6.19. Prove that all p-adic triangles are isosceles, i.e., all sets of vertices $x, y, z$ with $x, y, z \in \mathbb{Q}_p$ are isosceles. In other words, demonstrate that, with respect to a p-adic valuation as a measure of distance, the length of two of the sides must always be the same. (Hint: Use Exercise 6.16.)

6.20. Prove that Exercise 6.16 holds if $|\cdot|_p$ is replaced by any non-Archimedean absolute value on a field $F$.


Biography 6.1 Augustin-Louis Cauchy (1789–1857) was born on August 21, 1789 in Paris, France. When he was still a teenager, Laplace and Lagrange were visitors to the Cauchy home. Indeed, it was on the recommendation of Lagrange that Cauchy’s father had the young Cauchy well educated in languages before studying mathematics in earnest. Thus, in 1802, he entered the École Centrale du Panthéon where he devoted two years to the study of classical languages. Then he went on to study mathematics, graduating from École Polytechnique in 1807, after which he entered the École des Ponts et Chaussées. In 1810, he assumed his first job in Cherbourg to work on port facilities for Napoleon’s English invasion fleet. Despite what was a heavy workload in this position, he engaged in mathematical research. One well-known result that he proved in 1811 was that the angles of a convex polyhedron are determined by its faces. In 1812, Cauchy returned to Paris when his health took a turn for the worse. By 1814, he had published his now-famous memoir on definite integrals that became the foundation for our modern theory of complex functions. In 1815, he was appointed assistant professor of analysis at the École Polytechnique, and there, in 1816, he was awarded the Grand Prix of the French Academy of Sciences for his work on waves. In 1817, he took a post at the Collège de France. There he lectured on his integration methodology that involved the first rigorous scheme for convergence of infinite series and a formal definition of the integral. By 1829, he defined the meaning of a complex function of a complex variable, which he published in Leçons sur le Calcul Différentiel, which was a culmination, among other works, of the study of the calculus of residues begun in 1824. Politics intervened in 1830 when he left for Switzerland and, after refusing to swear an oath of allegiance to the new regime and failing to return to Paris, he lost all his positions there. In 1831, he went to Turin and taught there in 1832–33, after which he left for Prague on an order from Charles X to tutor his grandson. In 1838, he returned to Paris, and reclaimed his position at the Academy, but was not allowed to teach since he continued to refuse to take the aforementioned oath. Between 1840 and 1847, he published his renowned four-volume Exercices d’analyse et de physique mathématique. In 1848, when Louis-Philippe was overthrown, Cauchy reclaimed his university positions. In 1850, he lost an election to Liouville for the chair at the Collège de France, which led to ill feeling between the two of them from that time on. Also, during the last years of his life, he had a dispute with Duhamel over a priority claim on a result in inelastic shocks, a claim, it turns out, about which Cauchy was wrong. He died in Sceaux outside of Paris on May 23, 1857. He managed to publish 789 papers in mathematics. Indeed, Cauchy’s name is present on various terms in modern-day mathematics including the Cauchy integral theorem, the Cauchy-Kovalevskaya existence theorem, the Cauchy-Riemann equations, and the Cauchy sequences that we are studying in this section. Also, his contributions to the foundation of mathematical physics and theoretical mechanics via his work on the theory of light and his theory of elasticity necessitated that he develop not only his calculus of residues, but also new techniques such as Fourier transforms and diagonalization of matrices.


6.3 Non-Archimedean vs. Archimedean Valuations

Philosophy is written in that great book which ever lies before our eyes, I mean the universe. . . This book is written in mathematical language and its characters are triangles, circles, and other geometrical figures, without whose help. . . one wanders in vain through a dark labyrinth.

from The Assayer (1623)

Galileo Galilei (1564–1642)

Italian astronomer and physicist

In §6.2 we got a taste of the difference between Archimedean and non-Archimedean valuations. In particular, the counterintuitive result in Exercise 6.19 on page 238, which says that all p-adic triangles are isosceles, is seemingly incredible. Let us explore the differences at greater length. The non-Archimedean case $\mathbb{Q}_p$ for $p < \infty$ has no analogue when $p = \infty$ and $\mathbb{Q}_p = \mathbb{R}$ since there is no proper subdomain of $\mathbb{R}$ that has $\mathbb{R}$ as its quotient field, whereas by Theorem 6.1 on page 236, $\mathcal{O}_p$ is a subdomain of $\mathbb{Q}_p$, which is its quotient field. The fields $\mathbb{R}$ and $\mathbb{Q}_p$ for $p < \infty$ are all uncountable and no two of them are isomorphic. Furthermore, and most importantly, we have exhausted all possible valuations on $\mathbb{Q}$ since every such valuation is equivalent to a $|\cdot|_p$ for some $p \le \infty$. This is the following, proved in 1918 (see Biography 6.2 on page 242). Recall the definition of a trivial absolute value given in Example 6.4 on page 234 in what follows.

Theorem 6.3 Ostrowski’s Theorem

Every nontrivial valuation on $\mathbb{Q}$ is equivalent to one of the absolute values $|\cdot|_p$ for a prime $p$ or $p = \infty$.

Proof. First assume that for every integer n > 1 we have that |n| > 1.

Claim 6.1 There exists an $r \in \mathbb{R}^+$ such that for any integer $m > 1$, $|m| = m^r$.

Let $n > 1$ and $t \ge 1$ be integers. Write $n^t$ to base $m$,

\[ n^t = \sum_{j=0}^{\ell} c_jm^j, \]

where the $c_j \in \mathbb{Z}$ with $0 \le c_j \le m-1$ and $c_\ell \ne 0$. By the triangle inequality, $|c_j| = |1 + 1 + \cdots + 1| \le c_j|1| = c_j$ for each $j$. Also, since $n^t \ge m^\ell$, then

\[ \ell \le \frac{\log_e(n^t)}{\log_e m}, \]

so by the triangle inequality again,

\[ |n^t| \le \sum_{j=0}^{\ell}|c_jm^j| = \sum_{j=0}^{\ell}|c_j||m^j| < m\sum_{j=0}^{\ell}|m^j| \le m\sum_{j=0}^{\ell}|m|^\ell \le m(\ell+1)|m|^{\log_e(n^t)/\log_e m}. \]

Hence,

\begin{align*}
|n| &\le \lim_{t\to\infty}\left(m^{1/t}(\ell+1)^{1/t}|m|^{\log_e(n^t)/(t\log_e m)}\right) \\
&= \lim_{t\to\infty}\left(m^{1/t}(\ell+1)^{1/t}\right)\cdot\left(\lim_{t\to\infty}|m|^{(t\log_e n)/(t\log_e m)}\right) \\
&= \lim_{t\to\infty}|m|^{\log_e n/\log_e m} = |m|^{\log_e n/\log_e m}.
\end{align*}

By reversing the roles of $m$ and $n$ in the above argument, we also get

\[ |m| \le |n|^{\log_e m/\log_e n}, \]

so

\[ |m|^{1/\log_e m} = |n|^{1/\log_e n}, \]

for every $m > 1$ and $n > 1$. By setting the constant

\[ K = |m|^{1/\log_e m} = |n|^{1/\log_e n}, \]

we get

\[ |m| = K^{\log_e m} = e^{(\log_e m)\cdot(\log_e K)} = m^{\log_e K} = m^r, \]

where $r = \log_e K \in \mathbb{R}^+$. This establishes the claim.

By Claim 6.1 in the case where $n > 1$ implies $|n| > 1$, we must have that $|m| = |m|_\infty^r$ for every integer $m > 1$, so $|\cdot|$ is equivalent to $|\cdot|_\infty$ by Theorem 6.2 on page 237.

Now assume that $|n| < 1$ for some integer $n > 1$. Since $|\cdot|$ is nontrivial, there exists a least value $q \in \mathbb{N}$ such that $|q| < 1$. Assume that $q$ is the least such value. If $q = q_1\cdot q_2$ for $q_1, q_2 \in \mathbb{N}$ with $q_j < q$ for $j = 1, 2$, then $|q_1| = 1 = |q_2|$, by the minimality of $q$, so $|q| = |q_1|\cdot|q_2| = 1$, contradicting that $|q| < 1$. Hence, $q$ is prime. Let $p \ne q$ be a prime with $|p| < 1$, then for sufficiently large $N \in \mathbb{N}$, we have that $|p^N| = |p|^N < 1/2$. Similarly, $|q^M| < 1/2$, for sufficiently large $M \in \mathbb{N}$. Hence, since $\gcd(p, q) = 1$, there exist $u, v \in \mathbb{Z}$ such that $up^N + vq^M = 1$, so by the triangle inequality,

\[ 1 = |1| = |up^N + vq^M| \le |up^N| + |vq^M| < \frac{1}{2} + \frac{1}{2} = 1, \]

a contradiction. Hence, our assumption that there exists a prime $p$ different from $q$ with $|p| < 1$ is false. This proves that $|p| = 1$ for all primes $p \ne q$. Hence, for any $z \in \mathbb{Z}$ with $q \nmid z$, $|z| = 1 = |z|_q$. Since any nonzero $x \in \mathbb{Q}$ may be written uniquely in the form

\[ x = \frac{a}{b}q^{\nu_q(x)} \text{ where } |a| = |a|_q = 1 = |b|_q = |b|, \]

then

\[ |x| = \frac{|a|}{|b|}|q|^{\nu_q(x)} = |q|^{\nu_q(x)} \]

and since $|q| < 1$, then for some $r \in \mathbb{R}^+$, $|q| = q^{-r}$, where $r$ is independent of $x$. Therefore, $|x| = q^{-r\nu_q(x)} = (q^{-\nu_q(x)})^r = |x|_q^r$, so $|\cdot|$ is equivalent to the q-adic valuation by Theorem 6.2. □


Biography 6.2 Alexander Markowich Ostrowski (1893–1986) was born on September 25, 1893 in Kiev, Ukraine. He began his post-secondary studies at Marburg University in Germany in 1912 under Hensel’s supervision. However, after the outbreak of World War I, Ostrowski was imprisoned as a hostile foreigner. When the war ended in 1918, he was allowed his freedom, and went to Göttingen, where he worked on his doctorate under Hilbert and Landau (see Biographies 3.5 on page 127 and 3.1 on page 104). In 1920, his doctoral dissertation was published in Mathematische Zeitschrift, and this was already his fifteenth publication, having written his first paper before he even entered university. In that year, he went to Hamburg to work for his habilitation as Hecke’s assistant, and was awarded it in 1922. In 1923, he accepted a lecturing position at Göttingen. He moved around in the mid 1920s and finally settled on a position offered to him at the University of Basel in Switzerland, where he stayed until he retired in 1958. He published approximately 275 papers in his career, in diverse areas such as determinants, algebraic equations, number theory, topology, differential equations, conformal mappings, among many others. In particular, concerning the topic of this section, he provided a comprehensive description of valuations in 1934. He also worked on the Euler-Maclaurin formula and the Fourier integral formula, among other valued topics. He died on November 20, 1986 in Montagnola, Lugano, Switzerland.

Exercises

6.21. Prove that a sequence of rational numbers $\{q_j\}_{j=1}^{\infty}$ is a Cauchy sequence with respect to the $p$-adic absolute value $|\cdot|_p$ for a prime $p < \infty$ if and only if

$$\overset{(p)}{\lim_{j \to \infty}}\, |q_{j+1} - q_j|_p = 0.$$

Conclude that $\{q_j\}_{j=1}^{\infty}$ is $p$-adically convergent. (See (6.10) on page 235.)

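6.22. Prove that the series

$$\alpha = \sum_{j=k}^{\infty} c_j p^j, \qquad k \in \mathbb{Z}, \tag{6.11}$$

for a prime $p < \infty$ with $c_j \in \mathbb{Z}$ with $0 \le c_j \le p - 1$ is $p$-adically convergent, and that the partial sums $\alpha_n = \sum_{j=k}^{n} c_j p^j$, $n \in \mathbb{N}$, form a Cauchy sequence. (Note that a series $\sum_{j=1}^{\infty} q_j$ with $q_j \in \mathbb{Q}_p$ converges in $\mathbb{Q}_p$ if and only if $\overset{(p)}{\lim_{j \to \infty}}\, |q_j|_p = 0$.)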


6.4 Representation of p-Adic Numbers

To get practice in being refused (on being asked why he was begging for alms from a statue)

from Diogenes Laertius, Lives of the Philosophers
Diogenes (c. 400–c. 325 B.C.)

Greek cynic philosopher

In this section, we explore the methodology for representation of $p$-adic numbers as power series. The series representation given by (6.11) in Exercise 6.22 on the facing page tells us, via Exercise 6.21, that such series are limits of Cauchy sequences of elements in $\mathbb{Q}$. We now demonstrate that a number has a representation as a series given in (6.11) if and only if it is a $p$-adic number. First we need the following.

Lemma 6.2 Given a prime $p < \infty$, every $\alpha \in \mathbb{Q}$ has a representation as a power series in $p$.

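Proof. Suppose first that $\alpha = a/b$ where $\gcd(a, b) = 1$, and $p \nmid b$. For a given $j \in \mathbb{N}$, we know from (6.5) on page 230 that a solution to $b x_j \equiv a \pmod{p^j}$ is given by

$$x_j = \sum_{i=0}^{j-1} c_i p^i \quad \text{with } 0 \le c_i < p,$$

so that

$$\left| \frac{a}{b} - x_j \right|_p \le p^{-j}.$$

Thus,

$$\overset{(p)}{\lim_{j \to \infty}} \left( \frac{a}{b} - x_j \right) = 0.$$

In other words,

$$\frac{a}{b} = \overset{(p)}{\lim_{j \to \infty}}\, x_j.$$

Now suppose that for $j < j'$, $x_j, x_{j'}$ are two solutions as above, then

$$|x_j - x_{j'}|_p = \left| \sum_{i=j}^{j'-1} c_i p^i \right|_p \le \sum_{i=j}^{j'-1} p^{-i} |c_i|_p \le \sum_{i=j}^{j'-1} p^{-i} = \frac{\dfrac{1}{p^{j}} - \dfrac{1}{p^{j'}}}{1 - \dfrac{1}{p}} < \varepsilon,$$

where $j$, and hence $j'$, may be assumed to be larger than some constant $J(\varepsilon)$ for any given $\varepsilon > 0$. Hence, by Exercise 6.21, the sequence $\{x_j\}_{j=1}^{\infty}$ is $p$-adically convergent. In other words,

$$\frac{a}{b} = \sum_{i=0}^{\infty} c_i p^i \quad \text{with } 0 \le c_i < p$$

is the $p$-adic representation of the rational number $\alpha$.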


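Now we consider the case where $\alpha = a/b$ where $\gcd(a, b) = 1$, and $p^{\ell} \,\|\, b$ for some $\ell \in \mathbb{N}$. Then $p^{\ell}\alpha = a/b'$ with $b = p^{\ell} b'$ and $p \nmid b'$, so the first case applies to $p^{\ell}\alpha$. From (6.11) in Exercise 6.22 on page 242 we know that a general $p$-adic representation of $\alpha$ is given by

$$\alpha = p^{-\ell} \left( \sum_{j=0}^{\infty} c_j p^j \right), \quad \text{where } 0 \le c_j < p. \tag{6.12}$$

□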
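The congruences $b x_j \equiv a \pmod{p^j}$ used in the proof determine the digits $c_i$ one at a time. The following is a minimal Python sketch of that digit extraction, assuming $p \nmid b$; the function name `padic_digits` and its parameters are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def padic_digits(a, b, p, count):
    """Return the first `count` digits c_0, c_1, ... of a/b as a power series in p.

    Assumes p is prime and does not divide b (first case of Lemma 6.2).
    """
    if b % p == 0:
        raise ValueError("p must not divide the denominator")
    digits = []
    x = Fraction(a, b)
    for _ in range(count):
        # c is the unique residue in [0, p) with x ≡ c (mod p)
        c = (x.numerator * pow(x.denominator, -1, p)) % p
        digits.append(c)
        # x - c is divisible by p, so the quotient again has denominator prime to p
        x = (x - c) / p
    return digits


def partial_sum(digits, p):
    """x_j = sum of c_i * p**i over the digits supplied."""
    return sum(c * p**i for i, c in enumerate(digits))
```

For instance, `padic_digits(1, 3, 5, 6)` gives the digits 2, 3, 1, 3, 1, 3, and the corresponding partial sum $x_6$ satisfies $3x_6 \equiv 1 \pmod{5^6}$. The modular inverse via `pow(x, -1, p)` requires Python 3.8 or later.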

If it is the case that in (6.12) in the above proof, there exist fixed $m, n \in \mathbb{N}$ such that for any integer $r \ge 0$,

$$c_{j+m} = c_{j+m+n} = \cdots = c_{j+m+rn} = \cdots \quad \text{for } j = 1, 2, \ldots, n,$$

then we call this $p$-adic representation of $\alpha$ periodic. In this case we may rewrite (6.12) as

$$\alpha = p^{-\ell} \left[ \left( \sum_{j=0}^{m} c_j p^j \right) + p^{m+1} \left( \sum_{j=m+1}^{m+n} c_j p^{j-m-1} \right) + p^{m+n+1} \left( \sum_{j=m+1}^{m+n} c_j p^{j-m-1} \right) + \cdots \right]$$

$$= p^{-\ell} \left( \left( \sum_{j=0}^{m} c_j p^j \right) + \sum_{j=0}^{\infty} p^{m+1+jn} C \right),$$

where

$$C = \sum_{j=m+1}^{m+n} c_j p^{j-m-1}.$$

In what follows we prove that every rational number must be so represented.

Theorem 6.4 p-Adic Numbers as Periodic Power Series

For a prime $p < \infty$, $\alpha \in \mathbb{Q}$ if and only if $\alpha$ has a representation as a periodic power series in $p$.

Proof. First assume that $\alpha \in \mathbb{Q}^+$, say $\alpha = a/b$ where $\gcd(a, b) = 1$, and $p^{\ell} \,\|\, b$ for some $\ell \ge 0$. By Lemma 6.2 on the preceding page, $\alpha$ has a representation via

$$p^{\ell} \alpha = \sum_{j=0}^{n} c_j p^j + \frac{u}{w}, \quad \text{where } 0 \le c_j < p,$$

and either $u/w = 0$ or $\gcd(u, w) = 1$, $w > 0 > u$, $0 > u/w > -1$, and $p \nmid w$. Assuming $u/w \neq 0$, let $i \in \mathbb{N}$ be the least value such that $p^i \equiv 1 \pmod{w}$, and there is a negative integer $j$ with $1 - p^i = jw$, so $u/w = ju/(1 - p^i)$. Since the above conditions imply that $0 < u(1 - p^i)/w < p^i - 1$, then

$$ju = \sum_{j=0}^{i-1} a_j p^j, \quad \text{with } 0 \le a_j < p.$$

Hence, since $\sum_{j=0}^{\infty} p^{ij} (1 - p^i) = 1$, then

$$\frac{u}{w} = \left( \sum_{j=0}^{i-1} a_j p^j \right) \cdot \left( \sum_{j=0}^{\infty} p^{ij} \right) = \left( \sum_{j=0}^{i-1} a_j p^j \right) + p^{i} \left( \sum_{j=0}^{i-1} a_j p^j \right) + p^{2i} \left( \sum_{j=0}^{i-1} a_j p^j \right) + \cdots,$$

so $\alpha$ has a periodic power series in $p$.

If $\alpha < 0$, then we perform the above to obtain the power series for $-\alpha$. We obtain that $\alpha = 0 - (-\alpha)$ has a power series in $p$ since we may represent $0$ as

$$0 = p + (p - 1) \sum_{j=1}^{\infty} p^j.$$

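Conversely, assume that

$$\alpha = p^{-\ell} \left( \left( \sum_{j=0}^{m} c_j p^j \right) + \sum_{j=0}^{\infty} p^{m+1+jn} C \right),$$

where

$$C = \sum_{j=m+1}^{m+n} c_j p^{j-m-1}.$$

Therefore,

$$\alpha p^{\ell} - \left( \sum_{j=0}^{m} c_j p^j \right) = \sum_{j=0}^{\infty} p^{m+1+jn} C = p^{m+1} C \left( \sum_{j=0}^{\infty} p^{jn} \right).$$

However,

$$\sum_{j=0}^{t} p^{jn} = \frac{1 - p^{(t+1)n}}{1 - p^n},$$

and for $t \ge t_0$, namely for $t$ sufficiently large, we have for any $\varepsilon > 0$ that

$$\left| \frac{1}{1 - p^n} - \frac{1 - p^{(t+1)n}}{1 - p^n} \right|_p = p^{-(t+1)n} < \varepsilon.$$

So

$$\sum_{j=0}^{\infty} p^{jn} = \frac{1}{1 - p^n}.$$

Hence,

$$\alpha p^{\ell} - \left( \sum_{j=0}^{m} c_j p^j \right) = p^{m+1} C\, \frac{1}{1 - p^n},$$

namely

$$\alpha = p^{-\ell} \left( \sum_{j=0}^{m} c_j p^j \right) + p^{m+1-\ell}\, C\, \frac{1}{1 - p^n} \in \mathbb{Q},$$

as required. □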

It is worth isolating a fact proved in the above.

Corollary 6.1 In $\mathbb{Q}_p$ for a prime $p < \infty$, given any $n \in \mathbb{N}$,

$$\sum_{j=0}^{\infty} p^{jn} = \frac{1}{1 - p^n}.$$
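The convergence in Corollary 6.1 can be checked with exact rational arithmetic: the gap between $1/(1 - p^n)$ and the $t$-th partial sum is $p^{(t+1)n}/(1 - p^n)$, whose $p$-adic valuation is $(t+1)n$. The sketch below verifies this for small cases; the helper names `vp` and `geometric_gap` are illustrative and not taken from the text.

```python
from fractions import Fraction


def vp(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("valuation of 0 is not finite")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def geometric_gap(p, n, t):
    """1/(1 - p^n) minus the partial sum of p^(jn) for j = 0, ..., t."""
    partial = sum(Fraction(p) ** (j * n) for j in range(t + 1))
    return Fraction(1, 1 - p**n) - partial


# The valuation of the gap grows linearly in t, so |gap|_p = p^(-(t+1)n) tends to 0.
valuations = [vp(geometric_gap(3, 2, t), 3) for t in range(6)]
```

With $p = 3$ and $n = 2$ the list `valuations` is 2, 4, 6, 8, 10, 12, matching $|\,\cdot\,|_p = p^{-(t+1)n}$ from the proof of Theorem 6.4.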

Exercises

6.23. Prove that for a prime $p < \infty$, $\alpha \in \mathcal{O}_p$ if and only if $\alpha = a/b$ where $p \nmid b$.

6.24. Prove that if $p < \infty$ is prime then

$$\mathcal{P} = \{\alpha \in \mathbb{Q}_p : |\alpha|_p < 1\}$$

is an ideal in $\mathcal{O}_p$. (Hint: See Theorem 6.1 on page 236.)

6.25. Prove that the ideal $\mathcal{P}$ in Exercise 6.24 is maximal in $\mathcal{O}_p$. (Hint: Use Hensel's Lemma 6.1 on page 230 with modulus $\mathcal{P}$ to show that $\mathcal{O}_p/\mathcal{P}$ is the set of invertible elements $\mathcal{U}_p$ in $\mathcal{O}_p$, then employ Theorem 2.7 on page 68.)

6.26. With reference to Exercises 6.24–6.25, prove that every nonzero ideal of $\mathcal{O}_p$ is of the form

$$I = p^n \mathcal{O}_p = \mathcal{P}^n$$

for some integer $n \ge 0$. (Hint: Prove that $I = \mathcal{P}_n = \{x : |x| \le p^{-n}\}$ then use induction on $n$ to establish that $\mathcal{P}_n = \mathcal{P}^n$.)

6.27. Prove that every nonzero $\alpha \in \mathbb{Q}_p$ may be written uniquely in the form $\alpha = u p^n$ where $n \in \mathbb{Z}$, $u \in \mathcal{U}_p$.

Chapter 7

Dirichlet: Characters, Density, and Primes in Progression

Talent develops in quiet places, character in the full current of human life.
translation from Act 1, Scene 2 of Torquato Tasso (1790)

Johann Wolfgang von Goethe (1749–1832)
German poet, novelist, and dramatist

7.1 Dirichlet Characters

A principal goal of the chapter is to establish the renowned Dirichlet Theorem on primes in arithmetic progression. In order to do so, we need to generalize the Riemann $\zeta$-function, which we studied in detail in §5.3. The generalization requires the introduction of the following notion, the topic of this section.

Definition 7.1 Dirichlet Characters

If $D \in \mathbb{N}$ is fixed and

$$\chi : \mathbb{N} \to \mathbb{C}$$

is a function satisfying the following for each $m, n \in \mathbb{N}$, then $\chi$ is called a Dirichlet character modulo $D$.

(a) $\chi(mn) = \chi(m)\chi(n)$.

(b) If $m \equiv n \pmod{D}$, then $\chi(m) = \chi(n)$.

(c) $\chi(n) = 0$ if and only if $\gcd(n, D) > 1$.

Example 7.1 If $D > 1$ is odd then the Jacobi symbol $(n/D)$ is a Dirichlet character for the modulus $D$.

Remark 7.1 If $\phi(D)$ denotes the Euler totient, and we have that $\gcd(n, D) = 1$, then

$$\chi(n)^{\phi(D)} = \chi(n^{\phi(D)}) = \chi(1)$$

by parts (a)–(b) of Definition 7.1 in conjunction with Euler's generalization of Fermat's little theorem, namely

$$n^{\phi(D)} \equiv 1 \pmod{D}.$$

Moreover, since

$$\chi(1) = \chi(1^2) = \chi(1)^2,$$

and $\chi(1) \neq 0$, then $\chi(1) = 1$. In particular, this shows that $\chi(n)$ is a $\phi(D)$-th root of unity for all nonvanishing values of $\chi$. Note, as well, that Dirichlet characters are completely multiplicative; see Exercise 5.13 on page 227.
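Example 7.1 and Remark 7.1 can be checked mechanically for a specific modulus. The sketch below implements the Jacobi symbol by the standard reciprocity-based algorithm and tests conditions (a)–(c) of Definition 7.1 on a finite range, together with the root-of-unity property. The modulus $D = 15$ and the helper names are illustrative assumptions; a finite check is evidence, not a proof.

```python
from math import gcd


def jacobi(n, D):
    """Jacobi symbol (n/D) for odd D > 0, via quadratic reciprocity."""
    assert D > 0 and D % 2 == 1
    n %= D
    result = 1
    while n != 0:
        while n % 2 == 0:
            n //= 2
            if D % 8 in (3, 5):        # (2/D) = -1 exactly when D ≡ 3, 5 (mod 8)
                result = -result
        n, D = D, n                    # reciprocity step
        if n % 4 == 3 and D % 4 == 3:
            result = -result
        n %= D
    return result if D == 1 else 0


def euler_phi(D):
    """Euler totient by direct count (fine for small D)."""
    return sum(1 for k in range(1, D + 1) if gcd(k, D) == 1)


def is_dirichlet_character(chi, D, bound):
    """Check conditions (a)-(c) of Definition 7.1 for 1 <= m, n <= bound."""
    for m in range(1, bound + 1):
        if (chi(m) == 0) != (gcd(m, D) > 1):       # condition (c)
            return False
        if chi(m + D) != chi(m):                   # condition (b)
            return False
        for n in range(1, bound + 1):
            if chi(m * n) != chi(m) * chi(n):      # condition (a)
                return False
    return True


D = 15
chi = lambda n: jacobi(n, D)
ok = is_dirichlet_character(chi, D, 60)
roots = all(chi(n) ** euler_phi(D) == 1 for n in range(1, D + 1) if gcd(n, D) == 1)
```

Since the Jacobi symbol takes only the values $0, \pm 1$, the nonzero values are trivially $\phi(D)$-th roots of unity whenever $\phi(D)$ is even, as it is here.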

Example 7.2 The character $\chi_0(n) = 1$ for all $n \in \mathbb{N}$ relatively prime to $D$ and $\chi_0(n) = 0$ otherwise is called the principal character for the modulus $D$. (When referring to a character $\chi$ modulo $D$, the modulus $D$ will be understood in context.) Moreover, if $\chi_1, \chi_2$ are Dirichlet characters modulo $D$, then it is clear that $\chi_1\chi_2$ is also a Dirichlet character modulo $D$, where

$$\chi_1\chi_2(n) = \chi_1(n)\chi_2(n).$$

In fact, by Remark 7.1, a Dirichlet character is a $\phi(D)$-th root of unity whenever it is nonvanishing. Also, it is completely multiplicative, and is constant on residue classes modulo $D$. Thus, the Dirichlet characters form a multiplicative group where $\chi_0$ is the identity element, and for any character $\chi$ the complex conjugate $\overline{\chi}$ is also a character with

$$\chi\overline{\chi}(n) = \chi(n)\overline{\chi(n)} = |\chi(n)|^2 = 1$$

when $\gcd(n, D) = 1$. In other words, $\chi\overline{\chi} = \chi_0$, so $\overline{\chi}$ is the multiplicative inverse of $\chi$. The group of characters maps homomorphically into the roots of unity in $\mathbb{C}$. We denote this group by $G^D_{\mathrm{char}}$, the group of Dirichlet characters modulo $D$. In what follows we establish the cardinality of $G^D_{\mathrm{char}}$.

Remark 7.2 If $\chi_1$ is a Dirichlet character modulo $D_1$ and $\chi_2$ is a Dirichlet character modulo $D_2$, then $\chi_1\chi_2$ is a Dirichlet character modulo $\operatorname{lcm}(D_1, D_2)$, where

$$\chi_1\chi_2(n) = \chi_1(n)\chi_2(n).$$

We will use this fact in the following result.


Theorem 7.1 The Number of Dirichlet Characters

For a given integer $D > 1$, there exist exactly $\phi(D)$ distinct Dirichlet characters modulo $D$. In other words,

$$|G^D_{\mathrm{char}}| = \phi(D).$$

Proof. Suppose that $p^a \,\|\, D$ for a prime $p$ and $a \in \mathbb{N}$. If $q = p^a$ is 2, 4, or an odd prime power, then there exists a primitive root modulo $q$, a fact from elementary number theory; see [68, Theorem 3.7, p. 151]. Let $g$ be one such primitive root. Then the values $g^i$ for $i = 1, 2, \ldots, \phi(q)$ form a complete set of reduced residues modulo $q$, namely those residues relatively prime to $q$; see [68, Theorem 3.1, p. 142]. Therefore, by fixing a primitive $\phi(q)$-th root of unity $\zeta_{\phi(q)}$ and selecting

$$\chi_i(g) = \zeta_{\phi(q)}^{\,i} \quad \text{for } i = 1, 2, \ldots, \phi(q),$$

we have defined $\phi(q)$ distinct Dirichlet characters modulo $q$. If $q = 2^a$ where $a > 2$, then $q$ has no primitive root. However, $\pm 5$ have order $2^{a-2}$ modulo $2^a$; see [68, Exercise 3.8, p. 152]. Thus, together $\pm 5^j$ for $j = 1, 2, \ldots, 2^{a-2}$ generate all odd residues modulo $2^a$. By selecting a primitive $2^{a-2}$-th root of unity $\zeta_{2^{a-2}}$ and defining characters $\chi_i$ via

$$\chi_i(5) = \zeta_{2^{a-2}}^{\,i}$$

and

$$\chi_i(2^a - 1) = \pm 1, \quad \text{for } i = 1, 2, \ldots, 2^{a-2},$$

we have constructed

$$\phi(2^a) = 2^{a-1} = 2 \cdot 2^{a-2}$$

distinct characters modulo $2^a$. By Remark 7.2, these characters may be put together to form $\phi(D)$ distinct characters modulo $D$. □

Immediate from the above is the following; see [68, p. 81 ff.]. Recall that the symbol $\overline{a}$ means the residue class of $a$.

Corollary 7.1 For any integer $D > 1$,

$$G^D_{\mathrm{char}} \cong (\mathbb{Z}/D\mathbb{Z})^{*} = \{\overline{a} \in \mathbb{Z}/D\mathbb{Z} : 0 < a < D \text{ and } \gcd(a, D) = 1\},$$

the group of units of $\mathbb{Z}/D\mathbb{Z}$.

Next are identities involving characters that will allow us to introduce an-other celebrated function due to Dirichlet.

Theorem 7.2 Orthogonality Identities for Dirichlet Characters

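If $D > 1$ is an integer, then the following both hold.

(a) $$\sum_{n=1}^{D} \chi(n) = \begin{cases} \phi(D) & \text{if } \chi = \chi_0, \\ 0 & \text{if } \chi \neq \chi_0. \end{cases}$$

(b) $$\sum_{\chi \in G^D_{\mathrm{char}}} \chi(n) = \begin{cases} \phi(D) & \text{if } n \equiv 1 \pmod{D}, \\ 0 & \text{if } n \not\equiv 1 \pmod{D}. \end{cases}$$

Proof. (a) If $\chi = \chi_0$, then the sum picks only those nonzero elements prime to $D$, for which $\chi(n) = 1$, so the result is immediate. If $\chi \neq \chi_0$, then there exists an integer $z$ relatively prime to $D$ such that $\chi(z) \neq 1$. Thus,

$$\chi(z) \sum_{n=1}^{D} \chi(n) = \sum_{n=1}^{D} \chi(zn) = \sum_{n=1}^{D} \chi(n),$$

since $zn$ runs over a complete set of residues modulo $D$ as $n$ does, so $\chi(zn)$ runs over all values of $\chi$ as does $\chi(n)$ for $n = 1, 2, \ldots, D$. Therefore,

$$(\chi(z) - 1) \sum_{n=1}^{D} \chi(n) = 0,$$

and since $\chi(z) \neq 1$, we may divide both sides by $(\chi(z) - 1)$ to get the result for part (a).

(b) If $n \equiv 0 \pmod{D}$, then the result is obvious. Assume that $n \not\equiv 0 \pmod{D}$. By Theorem 7.1 on the previous page, there are $\phi(D)$ distinct characters $\chi \in G^D_{\mathrm{char}}$, so if $n \equiv 1 \pmod{D}$, then by part (b) of Definition 7.1 on page 247,

$$\sum_{\chi \in G^D_{\mathrm{char}}} \chi(n) = \sum_{\chi \in G^D_{\mathrm{char}}} \chi(1) = \phi(D).$$

On the other hand, if $n \not\equiv 1 \pmod{D}$, then by Exercise 7.1 on the next page, there exists a character $\chi_n \in G^D_{\mathrm{char}}$ such that $\chi_n(n) \neq 1$. Thus,

$$\chi_n(n) \sum_{\chi \in G^D_{\mathrm{char}}} \chi(n) = \sum_{\chi \in G^D_{\mathrm{char}}} \chi_n(n)\chi(n) = \sum_{\chi \in G^D_{\mathrm{char}}} \chi_n\chi(n) = \sum_{\chi \in G^D_{\mathrm{char}}} \chi(n),$$

by Example 7.2 on page 248, since $\chi_n\chi$ is again a Dirichlet character for each $\chi$, and multiplication by $\chi_n$ permutes the group. Hence,

$$(\chi_n(n) - 1) \sum_{\chi \in G^D_{\mathrm{char}}} \chi(n) = 0,$$

and since $\chi_n(n) \neq 1$, then we divide both sides by $(\chi_n(n) - 1)$ to secure the result. □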
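The two orthogonality identities can be illustrated numerically for a prime modulus, where the characters are easy to write down from a primitive root, along the lines of the construction in the proof of Theorem 7.1. The sketch below assumes the modulus $q = 7$ with primitive root $g = 3$; the function name `characters_mod_prime` is hypothetical, and floating-point sums are compared only up to a small tolerance.

```python
import cmath


def characters_mod_prime(q, g):
    """All q - 1 Dirichlet characters modulo a prime q, given a primitive root g."""
    index = {}
    x = 1
    for j in range(q - 1):
        index[x] = j                   # discrete logarithm: x = g^j (mod q)
        x = x * g % q
    assert len(index) == q - 1, "g is not a primitive root modulo q"

    def make(k):
        def chi(n):
            n %= q
            if n == 0:
                return 0
            # chi_k(g^j) = zeta^(jk) with zeta = exp(2*pi*i/(q-1))
            return cmath.exp(2j * cmath.pi * k * index[n] / (q - 1))
        return chi

    return [make(k) for k in range(q - 1)]


q = 7
chars = characters_mod_prime(q, 3)     # chars[0] is the principal character
# Identity (a): sum over n = 1..q for each character
row_sums = [sum(chi(n) for n in range(1, q + 1)) for chi in chars]
# Identity (b): sum over all characters for each n = 1..q
col_sums = [sum(chi(n) for chi in chars) for n in range(1, q + 1)]
```

Up to rounding, `row_sums` is $\phi(7) = 6$ for the principal character and $0$ otherwise, while `col_sums` is $6$ at $n = 1$ and $0$ for $n = 2, \ldots, 7$.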

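Corollary 7.2 If $G = G^D_{\mathrm{char}}$, and $\chi, \psi \in G$, each of the following holds.

(a) If $\delta(\chi, \chi) = 1$ and $\delta(\chi, \psi) = 0$ if $\chi \neq \psi$, then

$$\sum_{n=1}^{D} \chi(n)\overline{\psi}(n) = \phi(D)\,\delta(\chi, \psi).$$

(b) If $\delta(m, n) = 1$ when $m \equiv n \pmod{D}$ and $\delta(m, n) = 0$ when $m \not\equiv n \pmod{D}$, then

$$\sum_{\chi \in G} \chi(m)\overline{\chi}(n) = \phi(D)\,\delta(m, n).$$

Proof. (a) Since

$$\sum_{n=1}^{D} \chi(n)\overline{\psi}(n) = \sum_{n=1}^{D} \chi(n)\psi(n)^{-1} = \sum_{n=1}^{D} \chi\psi^{-1}(n),$$

then by part (a) of Theorem 7.2, this sum is equal to $0$ if $\chi\psi^{-1} \neq \chi_0$ and is $\phi(D)$, otherwise. In other words, the sum is $0$ if $\chi \neq \psi$, and is $\phi(D)$ if $\chi = \psi$.

(b) Since

$$\sum_{\chi \in G} \chi(m)\overline{\chi}(n) = \sum_{\chi \in G} \chi(mn^{-1}),$$

then by part (b) of Theorem 7.2, this sum is equal to $0$ if

$$mn^{-1} \not\equiv 1 \pmod{D}$$

and if

$$mn^{-1} \equiv 1 \pmod{D}$$

it equals $\phi(D)$. In other words, it is $0$ if $m \not\equiv n \pmod{D}$ and is $\phi(D)$ otherwise. □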

Now we have the tools to proceed to §7.2 where we will provide the generalization of Riemann's function promised at the outset of this section.

Exercises

7.1. Let $n \in \mathbb{N}$ and $D > 1$ an integer such that $n \not\equiv 0, 1 \pmod{D}$. Prove that there exists a $\chi_n \in G^D_{\mathrm{char}}$ such that $\chi_n(n) \neq 1$.

7.2. Prove that if $\chi$ is a Dirichlet character modulo $D$, and $s \in \mathbb{C}$, then the series

$$\sum_{n=1}^{\infty} \chi(n) n^{-s}$$

converges absolutely for $\Re(s) > 1$. (Hint: Use Theorem 5.10 on page 219 by bounding $|\chi(n) n^{-s}|$.)


7.2 Dirichlet’s L-Function and Theorem

This frightful word [function] was born under other skies than those I have loved, those where the sun reigns supreme.

from the introduction of Le Corbusier (1974), Stephen Gardiner
Le Corbusier (Charles-Edouard Jeanneret) (1887–1965)

French architect

In §7.1 we laid the groundwork for the next notion that will be a generalization of the $\zeta$-function promised therein.

Definition 7.2 Dirichlet L-Functions

If $\chi$ is a Dirichlet character modulo $D > 1$ and $s \in \mathbb{C}$, then

$$L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}$$

is called a Dirichlet L-function.

Dirichlet defined and studied these L-functions primarily to prove Theorem 7.7 on page 258, which is a principal result of this chapter. We now develop some salient features of these functions. First, we note that by Exercise 7.2 on the previous page, $\sum_{n=1}^{\infty} \chi(n) n^{-s}$ converges absolutely for $\Re(s) > 1$. Note that these L-functions are special cases of the Dirichlet series we encountered in Exercise 5.14 on page 227. Indeed, we have the following.

Theorem 7.3 L-Functions and Euler Products

If $s \in \mathbb{C}$ and $\Re(s) > 1$, then for a Dirichlet character $\chi$ modulo $D > 1$,

$$L(s, \chi) = \prod_{p = \text{prime}} (1 - \chi(p) p^{-s})^{-1}.$$

Proof. This is Exercise 7.3 on page 260. □

When we restrict to the principal character, then we have a close relationship with the Riemann $\zeta$-function as follows.

Corollary 7.3 If $\chi = \chi_0$ in Theorem 7.3, then for $\Re(s) > 1$,

$$L(s, \chi_0) = \prod_{p \mid D} (1 - p^{-s}) \cdot \zeta(s).$$

Proof. By Theorem 7.3,

$$L(s, \chi_0) = \prod_{p = \text{prime}} (1 - \chi_0(p) p^{-s})^{-1}.$$

However, $\chi_0(p) = 1$ except when $p \mid D$, where we have $\chi_0(p) = 0$. Therefore, since

$$\zeta(s) = \prod_{p = \text{prime}} (1 - p^{-s})^{-1} = \prod_{p \mid D} (1 - p^{-s})^{-1} \cdot \prod_{p \nmid D} (1 - p^{-s})^{-1}$$

$$= \prod_{p \mid D} (1 - p^{-s})^{-1} \cdot \prod_{p \nmid D} (1 - \chi_0(p) p^{-s})^{-1} = \prod_{p \mid D} (1 - p^{-s})^{-1} L(s, \chi_0),$$

we have the result. □
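Corollary 7.3 lends itself to a quick numerical sanity check. The sketch below compares a partial sum of $L(2, \chi_0)$ for the modulus $D = 6$ with $\prod_{p \mid D}(1 - p^{-2}) \cdot \zeta(2)$, using the classical value $\zeta(2) = \pi^2/6$, which is not derived in this section. The choice of $D$, $s$, the truncation point and the function names are assumptions made for illustration.

```python
from math import gcd, pi


def L_principal_partial(D, s, N):
    """Partial sum of L(s, chi_0): n^(-s) over n <= N with gcd(n, D) = 1."""
    return sum(n ** (-s) for n in range(1, N + 1) if gcd(n, D) == 1)


def euler_factor_product(D, s):
    """Product of (1 - p^(-s)) over the primes p dividing D, by trial division."""
    prod, p, m = 1.0, 2, D
    while m > 1:
        if m % p == 0:
            prod *= 1 - p ** (-s)
            while m % p == 0:
                m //= p
        p += 1
    return prod


D, s = 6, 2
lhs = L_principal_partial(D, s, 200_000)
rhs = euler_factor_product(D, s) * pi**2 / 6    # zeta(2) = pi^2 / 6
```

The two values agree to within the truncation error of the partial sum, which is of order $1/N$ for $s = 2$.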

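The following provides a functional equation for L-functions based upon Corollary 7.3.

Corollary 7.4 If $\chi_0$ is the principal character modulo $D$, then $L(s, \chi_0)$ satisfies the functional equation

$$L(s, \chi_0) = 2^s \pi^{s-1} \prod_{p \mid D} \frac{1 - p^{-s}}{1 - p^{s-1}} \cdot \Gamma(1 - s) \left( \sin \frac{\pi s}{2} \right) L(1 - s, \chi_0),$$

which is tantamount to

$$L(1 - s, \chi_0) = 2^{1-s} \pi^{-s} \prod_{p \mid D} \frac{1 - p^{s-1}}{1 - p^{-s}} \cdot \Gamma(s) \left( \cos \frac{\pi s}{2} \right) L(s, \chi_0).$$

Proof. By Theorem 5.15 on page 225, for $s \in \mathbb{C}$,

$$\zeta(s) = 2^s \pi^{s-1} \Gamma(1 - s) \zeta(1 - s) \cdot \left( \sin \frac{\pi s}{2} \right),$$

and via Corollary 7.3, we may replace the zeta functions by L-functions to get

$$L(s, \chi_0) \prod_{p \mid D} (1 - p^{-s})^{-1} = 2^s \pi^{s-1} L(1 - s, \chi_0) \prod_{p \mid D} (1 - p^{s-1})^{-1} \cdot \Gamma(1 - s) \left( \sin \frac{\pi s}{2} \right),$$

from which the first result easily follows. For the second result, we rearrange the above to get

$$L(1 - s, \chi_0) = 2^{-s} \pi^{1-s} \prod_{p \mid D} \frac{1 - p^{s-1}}{1 - p^{-s}} \cdot \left( \frac{1}{\Gamma(1 - s) \sin\left(\frac{\pi s}{2}\right)} \right) L(s, \chi_0).$$

By Exercise 7.5 on page 260,

$$\frac{1}{\Gamma(1 - s) \sin\left(\frac{\pi s}{2}\right)} = \Gamma(s)\, \pi^{-1}\, 2 \cos \frac{\pi s}{2},$$

so

$$L(1 - s, \chi_0) = 2^{1-s} \pi^{-s} \prod_{p \mid D} \frac{1 - p^{s-1}}{1 - p^{-s}} \cdot \Gamma(s) \left( \cos \frac{\pi s}{2} \right) L(s, \chi_0),$$

which is the entire result. □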


Remark 7.3 Corollary 7.3 and Exercise 7.4 on page 260 provide an analytic continuation of $L(s, \chi_0)$ as a meromorphic function in the whole plane with a sole singularity at $s = 1$; see Remark 5.8 on page 219. Now we need to look at analytically continuing $L(s, \chi)$ to the region $\Re(s) > 0$ for arbitrary Dirichlet characters $\chi$. This will provide an essential step in the development of material to prove Theorem 7.7 on page 258.

Theorem 7.4 Analytic Continuation of L-Functions

If $\chi \neq \chi_0$ is a Dirichlet character modulo $D > 1$, then

$$L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}$$

converges for all $\Re(s) > 0$.

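Proof. We begin with a necessary bound.

Claim 7.1 If $\chi \neq \chi_0$, then for any $N \in \mathbb{N}$,

$$\left| \sum_{n=1}^{N} \chi(n) \right| \le \phi(D).$$

Let $N = qD + r$ where $q$ and $r$ are integers with $0 \le r < D$. Thus,

$$\sum_{n=1}^{N} \chi(n) = q \left( \sum_{n=1}^{D} \chi(n) \right) + \sum_{n=1}^{r} \chi(n)$$

since $\chi(n) = \chi(m)$ for $m \equiv n \pmod{D}$ by part (b) of Definition 7.1 on page 247. By part (a) of Theorem 7.2 on page 249, if $\chi \neq \chi_0$, then $\sum_{n=1}^{D} \chi(n) = 0$, so by the triangle inequality,

$$\left| \sum_{n=1}^{N} \chi(n) \right| = \left| \sum_{n=1}^{r} \chi(n) \right| \le \sum_{n=1}^{D} |\chi(n)| \le \phi(D),$$

which is the claim.

Now define for any real $x \ge 1$ and $m \in \mathbb{N}$, $S(0) = 0$, and

$$S(x) = \sum_{m \le x} \chi(m).$$

Then

$$\chi(n) = S(n) - S(n - 1).$$

Now suppose that $N \in \mathbb{N}$. Then

$$\sum_{n=1}^{N} \frac{\chi(n)}{n^s} = \sum_{n=1}^{N} \frac{S(n) - S(n - 1)}{n^s} = \sum_{n=1}^{N} \frac{S(n)}{n^s} - \sum_{n=1}^{N} \frac{S(n - 1)}{n^s}$$

$$= \sum_{n=1}^{N} \frac{S(n)}{n^s} - \sum_{n=0}^{N-1} \frac{S(n)}{(n + 1)^s} = \frac{S(N)}{N^s} + \sum_{n=1}^{N-1} \frac{S(n)}{n^s} - \sum_{n=1}^{N-1} \frac{S(n)}{(n + 1)^s}$$

$$= \sum_{n=1}^{N-1} S(n) \left( \frac{1}{n^s} - \frac{1}{(n + 1)^s} \right) + \frac{S(N)}{N^s}.$$

Hence,

$$\sum_{n=1}^{\infty} \chi(n) n^{-s} = \lim_{N \to \infty} \left( \sum_{n=1}^{N-1} S(n) \left( \frac{1}{n^s} - \frac{1}{(n + 1)^s} \right) + \frac{S(N)}{N^s} \right) = \sum_{n=1}^{\infty} S(n) \left( \frac{1}{n^s} - \frac{1}{(n + 1)^s} \right).$$

Moreover, we have that

$$\sum_{n=1}^{\infty} S(n) \left( \frac{1}{n^s} - \frac{1}{(n + 1)^s} \right) = s \sum_{n=1}^{\infty} S(n) \int_{n}^{n+1} x^{-s-1}\, dx = s \int_{1}^{\infty} S(x)\, x^{-s-1}\, dx.$$

Now by Claim 7.1, $|S(x)| \le \phi(D)$ for all $x$. Thus, the integral converges and defines an analytic function for all $s$ with $\Re(s) > 0$. □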
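The bound of Claim 7.1 and the summation-by-parts identity in the proof can both be tested on a concrete character. The sketch below uses the nonprincipal character modulo 4 (an illustrative choice), checks that its partial sums stay within $\phi(4) = 2$, compares the direct partial sum of $\chi(n) n^{-s}$ with the rearranged form at $s = 1/2$, and compares the partial sum at $s = 1$ with $\pi/4$, the classical value of the Leibniz series, which is not taken from this section.

```python
from math import pi


def chi4(n):
    """The nonprincipal Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[n % 4]


def S(N):
    """Partial sum S(N) = chi(1) + ... + chi(N)."""
    return sum(chi4(m) for m in range(1, N + 1))


def direct(N, s):
    """Partial sum of chi(n) / n^s for n = 1..N."""
    return sum(chi4(n) / n**s for n in range(1, N + 1))


def by_parts(N, s):
    """Same partial sum, rewritten by summation by parts as in the proof."""
    total, running = 0.0, 0
    for n in range(1, N):
        running += chi4(n)                       # running equals S(n)
        total += running * (1 / n**s - 1 / (n + 1) ** s)
    running += chi4(N)                           # running equals S(N)
    return total + running / N**s


max_partial = max(abs(S(N)) for N in range(1, 101))
gap = abs(direct(500, 0.5) - by_parts(500, 0.5))
approx_L1 = direct(100_000, 1)
```

Because the partial sums $S(n)$ stay bounded, the rearranged series converges for every real $s > 0$, which is the mechanism behind Theorem 7.4.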

Remark 7.4 Just as we commented in Remark 5.10 on page 226 to the effect that the zeros of Riemann's $\zeta$-function are intimately connected with the distribution of primes, so too the zeros of the L-functions $L(s, \chi)$ speak about the distribution of primes in arithmetic progression. In fact, the principal feature of the proof of Dirichlet's theorem on primes in arithmetic progression is the validation that $L(1, \chi) \neq 0$ when $\chi \neq \chi_0$. This is encapsulated in the following generalization of Conjecture 5.1 on page 223. See Remark 7.6 on page 258 and Exercise 7.7 on page 261.

Conjecture 7.1 The Generalized Riemann Hypothesis (GRH)

If $\chi$ is a Dirichlet character, then the zeros of $L(s, \chi)$ for $\Re(s) > 0$ lie on the line $\Re(s) = 1/2$.

Remark 7.5 Note that in the literature, Conjecture 7.1 is sometimes called the Extended Riemann hypothesis (ERH), and sometimes there is a distinction made between Conjecture 7.1 and a yet more general conjecture involving the Dedekind-zeta function for an algebraic number field $F$, which is given by the following sum over all nonzero ideals $I$ of $\mathcal{O}_F$,

$$\zeta_F(s) = \sum_{I} \frac{1}{(N(I))^s}$$

for every $s \in \mathbb{C}$ with $\Re(s) > 1$, where $N(I) = |\mathcal{O}_F / I|$ is the norm of the ideal $I$; see Exercise 8.32 on page 292. The more general assertion is: If $F$ is a number field and $s \in \mathbb{C}$ with $\zeta_F(s) = 0$ and $0 < \Re(s) < 1$, then $\Re(s) = 1/2$. Conjecture 5.1 follows from this with $F = \mathbb{Q}$ and $\mathcal{O}_F = \mathbb{Z}$. Depending on the source in the literature, Conjecture 7.1 is sometimes called the ERH and the last more general one the GRH and sometimes this is reversed. We maintain the GRH label for Conjecture 7.1 since it appears to be the most ubiquitous label. Indeed, for computational relevance and the historical significance of Conjecture 7.1, see [62, §5.4, pp. 172–186].

Now we proceed to verify the contents of the assertions made above in our quest to prove Dirichlet's theorem. In preparation, the reader should solve Exercise 7.6 on page 261.

Theorem 7.5 Nonvanishing of $L(1, \chi)$ for Complex $\chi$

If $\chi$ is a nontrivial complex Dirichlet character modulo $D$, then $L(1, \chi) \neq 0$.

Proof. By Theorem 7.4 on page 254,

$$L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s},$$

so for $s \in \mathbb{R}$, $s > 1$,

$$\overline{L(s, \chi)} = \sum_{n=1}^{\infty} \overline{\chi}(n) n^{-s} = L(s, \overline{\chi}).$$

Thus, if $L(1, \chi) = 0$, then $L(1, \overline{\chi}) = 0$.

Assume that $L(1, \chi) = 0$ for a complex character $\chi$. Then $L(s, \chi)$ and $L(s, \overline{\chi})$ are distinct L-functions, since $\chi \neq \overline{\chi}$, and $L(1, \chi) = 0 = L(1, \overline{\chi})$. In the product

$$F(s) = \prod_{\chi \in G^D_{\mathrm{char}}} L(s, \chi),$$

the term $L(s, \chi_0)$ has a simple pole at $s = 1$ and by Theorem 7.4 on page 254, $L(s, \chi)$ for $\chi \neq \chi_0$ is analytic about $s = 1$. Since two factors vanish at $s = 1$ and only one factor has a simple pole there, $F(1) = 0$. However, by Exercise 7.6, $F(s) \ge 1$ for all $s \in \mathbb{R}$ with $s > 1$, so

$$\lim_{s \to 1^+} F(s) = F(1) \neq 0,$$

a contradiction, so $L(1, \chi) \neq 0$ for any complex character $\chi$. □

Now in the final bid to establish the key result in the proof of Dirichlet's theorem, we need to establish the nonvanishing of $L(1, \chi)$ for real characters $\chi$. This is the more difficult case.

Theorem 7.6 Nonvanishing of $L(1, \chi)$ for Real $\chi$

If $\chi$ is a nontrivial real Dirichlet character modulo $D$, then $L(1, \chi) \neq 0$.

Proof. Suppose that $\chi$ is a real character and $L(1, \chi) = 0$. Now define the function

$$f(s) = \frac{L(s, \chi) L(s, \chi_0)}{L(2s, \chi_0)}.$$

Since $L(1, \chi) = 0$ and $L(s, \chi_0)$ has a simple pole at $s = 1$, the zero and the pole cancel, so $L(s, \chi) L(s, \chi_0)$ is analytic on $\Re(s) > 0$. Also, $L(2s, \chi_0)$ is analytic on $\Re(s) > 1/2$, has a pole at $s = 1/2$, and by Theorem 7.4 on page 254 may be continued to an interval containing $1/2$ with a simple pole at $s = 1/2$. Hence, $\lim_{s \to 1/2^+} f(s) = 0$.

If $s \in \mathbb{R}$ with $s > 1$, then $f$ has an infinite product expansion,

$$f(s) = \prod_{p = \text{prime}} (1 - \chi(p) p^{-s})^{-1} (1 - \chi_0(p) p^{-s})^{-1} (1 - \chi_0(p) p^{-2s})$$

$$= \prod_{p \nmid D} \frac{1 - p^{-2s}}{(1 - p^{-s})(1 - \chi(p) p^{-s})}. \tag{7.1}$$

By Exercise 7.8 on page 262, $\chi(p) = \pm 1$. If $\chi(p) = -1$, then from (7.1),

$$\frac{1 - p^{-2s}}{(1 - p^{-s})(1 - \chi(p) p^{-s})} = 1.$$

Hence, from (7.1),

$$f(s) = \prod_{\chi(p) = 1} \frac{1 - p^{-2s}}{(1 - p^{-s})(1 - p^{-s})} = \prod_{\chi(p) = 1} \frac{(1 - p^{-s})(1 + p^{-s})}{(1 - p^{-s})(1 - p^{-s})} = \prod_{\chi(p) = 1} \frac{1 + p^{-s}}{1 - p^{-s}}.$$

However,

$$\frac{1 + p^{-s}}{1 - p^{-s}} = (1 + p^{-s}) \left( \sum_{j=0}^{\infty} p^{-js} \right) = \sum_{j=0}^{\infty} p^{-js} + \sum_{j=0}^{\infty} p^{-(j+1)s}$$

$$= 1 + \sum_{j=1}^{\infty} p^{-js} + \sum_{j=1}^{\infty} p^{-js} = 1 + \sum_{j=1}^{\infty} 2 p^{-js}.$$

By Exercise 7.9 on page 262, $f(s) = \sum_{n=1}^{\infty} g_n n^{-s}$ where $g_n$ is nonnegative for all $n$ and the series converges for $s > 1$. Indeed, since $g_1 = 1$, and $f(s)$ is analytic for $\Re(s) > 1/2$, then $f(s) \ge 1$ for $s > 1/2$, whereas $\lim_{s \to 1/2^+} f(s) = 0$, a contradiction. Hence, $L(1, \chi) \neq 0$. □


Remark 7.6 Exercise 7.7 speaks to the comments made in Remark 7.4 on page 255. Equation (7.4) tells us that if we can prove that

$$\log_e L(s, \chi_0) + \sum_{\substack{\chi \in G^D_{\mathrm{char}} \\ \chi \neq \chi_0}} \overline{\chi}(a) \log_e L(s, \chi) \to \infty \quad \text{as } s \to 1^+,$$

then there are infinitely many primes $p \equiv a \pmod{D}$. We know from Exercise 7.4 that the term

$$L(s, \chi_0) \to \infty \quad \text{as } s \to 1^+,$$

but the other terms could cancel out this fact, so we get to the comments made in Remark 7.4 to the effect that the core of the proof of Dirichlet's theorem is to show that $L(1, \chi) \neq 0$ when $\chi \neq \chi_0$. This is what we proved in Theorem 7.5 on page 256 and Theorem 7.6 on the preceding page. We are now ready for the main result.

Theorem 7.7 Dirichlet: Primes in Arithmetic Progression

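If $a, m \in \mathbb{Z}$ with $\gcd(a, m) = 1$, then there are infinitely many primes of the form $p = mn + a$ for $n \in \mathbb{N}$.

Proof. By Exercise 7.7 on page 261, with modulus $D = m$,

$$\log_e L(s, \chi_0) + \sum_{\substack{\chi \in G^D_{\mathrm{char}} \\ \chi \neq \chi_0}} \overline{\chi}(a) \log_e L(s, \chi) = \phi(D) \sum_{p \equiv a \,(\mathrm{mod}\, D)} \frac{1}{p^s} + O(\phi(D)). \tag{7.2}$$

By Theorems 7.5–7.6,

$$\lim_{s \to 1^+} L(s, \chi) \neq 0$$

for $\chi \neq \chi_0$. Hence,

$$\lim_{s \to 1^+} \sum_{\chi \neq \chi_0} \overline{\chi}(a) \log_e L(s, \chi)$$

is finite, while by Exercise 7.4 on page 260, $\lim_{s \to 1^+} L(s, \chi_0) = \infty$, so the left-hand side of (7.2) increases indefinitely as $s \to 1^+$. If the number of primes in the arithmetic progression $p \equiv a \pmod{m}$ is finite, then

$$\lim_{s \to 1^+} \sum_{p \equiv a \,(\mathrm{mod}\, m)} \frac{1}{p^s} = \sum_{p \equiv a \,(\mathrm{mod}\, m)} \frac{1}{p} < \infty,$$

indeed it is rational, but this contradicts (7.2) since the left side goes to $\infty$ while the right side is finite. □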
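Theorem 7.7 is a statement about infinitely many primes, so no finite computation can confirm it, but counting primes in each reduced residue class up to a bound shows the phenomenon clearly. The sketch below uses a simple sieve of Eratosthenes; the bound $N = 10^4$, the modulus $m = 4$ and the function names are illustrative assumptions.

```python
from math import gcd


def primes_up_to(N):
    """Sieve of Eratosthenes: list of all primes p <= N."""
    sieve = bytearray([1]) * (N + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(N**0.5) + 1):
        if sieve[i]:
            # cross out multiples of i starting at i*i
            sieve[i * i :: i] = bytearray(len(range(i * i, N + 1, i)))
    return [i for i in range(N + 1) if sieve[i]]


def count_by_residue(m, N):
    """Number of primes p <= N in each residue class a mod m with gcd(a, m) = 1."""
    counts = {a: 0 for a in range(m) if gcd(a, m) == 1}
    for p in primes_up_to(N):
        if p % m in counts:
            counts[p % m] += 1
    return counts


counts = count_by_residue(4, 10_000)
```

Each reduced class modulo 4 receives roughly half of the odd primes up to the bound, in line with the density statement proved later as Theorem 7.9.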

Remark 7.7 Although Dirichlet's L-functions are generalizations of the Riemann $\zeta$-function, Dirichlet introduced them before Riemann developed complex function theory. Thus, Dirichlet did not have the complex variable tools at his disposal to establish the nonvanishing of $L(1, \chi)$. He did this by looking at class numbers $h_D$ of binary quadratic forms of discriminant $D$; see §3.1. He defined, for a quadratic number field $F$, the character given as follows (see Remark 2.2 on page 63 for a reminder of the terms used below),

$$\chi_F(p) = \begin{cases} 1 & \text{if } (p) \text{ is a split prime in } F, \\ -1 & \text{if } (p) \text{ is an inert prime in } F, \\ 0 & \text{if } (p) \text{ is a ramified prime in } F. \end{cases} \tag{7.3}$$

Then Dirichlet proved that

$$L(1, \chi_F) = N h_D,$$

where

$$N = \begin{cases} 2 \log_e \varepsilon_D / \sqrt{D} & \text{if } D > 0, \\ 2\pi / (w \sqrt{|D|}) & \text{if } D < 0, \end{cases}$$

where $w = 4$ if $D = -4$, $w = 6$ if $D = -3$, and $w = 2$ otherwise. The value $\varepsilon_D$ is the smallest unit in $\mathcal{O}_F$ that exceeds $1$ when $F$ is real. Also, when $F$ is real,

$$R_F = \log_e \varepsilon_D$$

is called the regulator of $F$ and $\varepsilon_D$ is called the fundamental unit of $F$. Clearly, $h_D > 0$ and $R_F > 0$, so $L(1, \chi) > 0$ is immediate.

It also follows that

$$L(s, \chi_F) = \frac{\zeta_F(s)}{\zeta(s)},$$

where $\zeta_F(s)$ is the Dedekind-zeta function given in Remark 7.5 on page 255. In Theorem 7.4 on page 254 we saw that $L(s, \chi)$ may be continued analytically for $\Re(s) > 0$. Riemann showed how to continue it to the entire complex plane. Thus, every zero of $\zeta(s)$ is cancelled by a zero of $\zeta_F(s)$ with at least the same multiplicity. If we look at the more general case where $\mathbb{Q}$ is replaced by any number field $K \subseteq F$, then is it still true that

$$\zeta_F(s) / \zeta_K(s)$$

is analytic on the whole complex plane? The affirmative answer to this is the, still open, Artin Conjecture.

The above notion (7.3) of a character defined for a quadratic field may be generalized to any algebraic number field in order to, therefore, associate a given (generalized) Dirichlet character $\chi_F$ with any number field $F$. Once done it can be shown that they form a group $G_F$ and

$$\zeta_F(s) = \prod_{\chi_F \in G_F} L(s, \chi_F),$$

and if $N$ is the order of a given character $\chi_F$ in $G_F$, then it can also be demonstrated that

$$\zeta_F(s) = \zeta(s) \prod_{n=1}^{N-1} L(s, \chi_F^n).$$

Therefore, since $\zeta(s)$ has only a simple pole at $s = 1$, none of the factors $L(s, \chi_F^n)$ can vanish at $s = 1$. In particular, $L(1, \chi_F) \neq 0$, providing a simple proof of the results we achieved in Theorems 7.5–7.6, albeit by employing a more general $\zeta$-function with ostensibly deeper results.

Exercises

7.3. Prove Theorem 7.3 on page 252. (Hint: Use Exercise 7.2 on page 251 in conjunction with Exercises 5.12–5.13 on page 227.)

7.4. Prove that if $\chi_0$ is the principal Dirichlet character modulo $D > 1$, and $s \in \mathbb{C}$ then

$$\lim_{s \to 1^+} (s - 1) L(s, \chi_0) = \prod_{p \mid D} (1 - p^{-1}) = \frac{\phi(D)}{D}.$$

Conclude that $\lim_{s \to 1^+} L(s, \chi_0) = \infty$. (Hint: Use Corollary 7.3 on page 252 in conjunction with Exercise 5.18 on page 227 and the fact that

$$\phi(D) = D \prod_{p \mid D} (1 - 1/p);$$

see [68, Corollary 2.1, p. 92].)

7.5. Prove that for the Gamma function given in Definition 5.6 on page 224,

$$\Gamma(s)\Gamma(1 - s) = \frac{\pi}{\sin \pi s} = \frac{\pi}{2 \left( \sin \frac{\pi s}{2} \right) \left( \cos \frac{\pi s}{2} \right)}.$$

(Hint: You may use the fact that

$$\int_{0}^{\infty} \frac{u^{s-1}}{1 + u}\, du = \frac{\pi}{\sin \pi s},$$

for $0 < \Re(s) < 1$. This integral is derivable from the relationship between the Beta function

$$B(x, y) = \int_{0}^{1} t^{x-1} (1 - t)^{y-1}\, dt,$$

and the following relationship with the Gamma function

$$B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)}.)$$

7.6. If

$$F(s) = \prod_{\chi \in G^D_{\mathrm{char}}} L(s, \chi),$$

namely the product over all Dirichlet characters modulo $D$, show that $F(s) \ge 1$ for $s \in \mathbb{R}$, $s > 1$. (Hint: Form the sum, $G(s, \chi) = \sum_{p} \sum_{n=1}^{\infty} \frac{1}{n} \chi(p^n) p^{-ns}$ and use Corollary 7.2 on page 250, observing that for $z \in \mathbb{C}$ with $|z| < 1$,

$$\exp\left( \sum_{n=1}^{\infty} \frac{1}{n} z^n \right) = \frac{1}{1 - z}$$

where $\exp(x) = e^x$. Note that $G(s, \chi)$ converges uniformly for $\Re(s) > 1$ since $\zeta(s)$ converges uniformly for $s \ge 1 + \varepsilon > 1$. Recall that a series converges uniformly if the sequence of partial sums converges uniformly, and a sequence $\{s_n\}_{n=1}^{\infty}$ converges uniformly for a set $S$ of values of $x$ provided that for each $\varepsilon > 0$, there exists an $N \in \mathbb{Z}$ with $|s_n(x) - s(x)| < \varepsilon$ for $n \ge N$ and all $x \in S$. From these considerations, $G(s, \chi)$ is continuous for $\Re(s) > 1$.)

7.7. Prove that if $s \in \mathbb{C}$ with $\Re(s) > 1$ and $a \in \mathbb{Z}$ such that $\gcd(a, D) = 1$, then the following equation holds for Dirichlet characters modulo $D$,

$$\log_e L(s, \chi_0) + \sum_{\substack{\chi \in G^D_{\mathrm{char}} \\ \chi \neq \chi_0}} \overline{\chi}(a) \log_e L(s, \chi) = \phi(D) \sum_{p \equiv a \,(\mathrm{mod}\, D)} \frac{1}{p^s} + O(\phi(D)). \tag{7.4}$$

(Hint: Use Theorem 7.2 on page 249, Corollary 7.2 on page 250, and Theorem 7.3 on page 252.) (Note that the left-hand side of (7.4) is a special case of another $\zeta$-function called the Hurwitz $\zeta$-function defined for $s, q \in \mathbb{C}$ with $\Re(s) > 1$ and $\Re(q) > 0$ by

$$\zeta(s, q) = \sum_{n=0}^{\infty} \frac{1}{(q + n)^s},$$

which is absolutely convergent for the aforementioned values of $s, q$ and can be extended to a meromorphic function defined for all $s \neq 1$. The Riemann $\zeta$-function is the case where $q = 1$, and (7.4) is given by $q = a/D$ when $D > 2$, namely

$$\zeta(s, a/D) = \sum_{\chi \in G^D_{\mathrm{char}}} \overline{\chi}(a) L(s, \chi).$$

Moreover, we can write the Dirichlet L-functions in terms of the Hurwitz $\zeta$-function as follows,

$$L(s, \chi) = \frac{1}{n^s} \sum_{j=1}^{n} \chi(j)\, \zeta\left( s, \frac{j}{n} \right),$$

and

$$\zeta(s) = \frac{1}{n^s} \sum_{j=1}^{n} \zeta\left( s, \frac{j}{n} \right),$$

as well.)

7.8. If $\chi$ is a Dirichlet character such that $\chi(n)$ is real for all $n \in \mathbb{Z}$, then prove that $\chi(n) = \pm 1$ when $\gcd(n, D) = 1$, and $\chi^2 = \chi_0$.

7.9. Let $f$ be a nonnegative multiplicative arithmetic function and assume there exists a $K \in \mathbb{R}^+$ such that $f(p^j) < K$ for all prime powers $p^j$. Prove that $\sum_{n=1}^{\infty} f(n) n^{-s}$ converges for all $s \in \mathbb{R}$ with $s > 1$. Also, prove that

$$\sum_{n=1}^{\infty} f(n) n^{-s} = \prod_{p} \left( 1 + \sum_{j=1}^{\infty} f(p^j) p^{-js} \right).$$

(Hint: Use Exercise 5.14 on page 227 for the last assertion.)


7.3 Dirichlet Density

That all things are changed, and that nothing really perishes, and that the sum of matter remains exactly the same, is sufficiently certain.

translation from Cogitationes de Natura Rerum Cogitatio
in The Works of Francis Bacon, Volume 5 (1858)

J. Spedding, editor
Francis Bacon (1561–1626)

English lawyer, courtier, philosopher, and essayist

This section deals with a concept that allows us to measure the size of a set of primes in an accurate fashion and will provide another interpretation of Dirichlet's Theorem 7.7 on page 258.

Definition 7.3 Dirichlet Density

If $S$ is a subset of the primes in $\mathbb{Z}$, and if

$$\lim_{s \to 1^+} \frac{\sum_{p \in S} p^{-s}}{\log_e (s - 1)^{-1}} = k \in \mathbb{R},$$

then we say that $S$ has Dirichlet density $k$, and we denote this by $D(S)$. If the limit does not exist then $S$ has no Dirichlet density. Dirichlet density is often called analytic density.

Remark 7.8 Note that Definition 7.3 may be reformulated in terms of Definition 5.4 on page 200, asymptotic equality, to say that as $s \to 1$,

$$-D(S) \log_e (s - 1) \sim \sum_{p \in S} p^{-s}.$$

One may also define another notion of "density" for two sets relative to one another in the following fashion. If $S \subseteq W \subseteq \mathbb{N}$ with $|W| = \infty$, then if

$$\lim_{N \to \infty} \frac{|\{n \in S : n \le N\}|}{|\{n \in W : n \le N\}|} = \ell \in \mathbb{R},$$

then we say that $S$ has natural density or asymptotic density $\ell$ in $W$, denoted by $ND(S, W)$. In other words, in terms of asymptotic equality, $S$ has natural density in $W$ if

$$ND(S, W)\, |\{n \in W : n \le N\}| \sim |\{n \in S : n \le N\}|.$$

Natural density is a more restrictive notion than Dirichlet density. For instance, it can be shown that for any integer $b > 2$, the set of primes with first digit 1 when written in base $b$ has Dirichlet density but does not have natural density. Yet any set of primes that has natural density, has Dirichlet density equal to the same value.


We digress from the main topic to provide an example of natural density and some most interesting consequences with the following, which was proved in 1926; see [5].

Theorem 7.8 Beatty's Theorem

Suppose that $\alpha, \beta \in \mathbb{R}^+$ are irrational and

$$1/\alpha + 1/\beta = 1.$$

If

$$\{s_n\}_{n=1}^{\infty} = \{n\alpha\}_{n=1}^{\infty} \quad \text{and} \quad \{t_n\}_{n=1}^{\infty} = \{n\beta\}_{n=1}^{\infty},$$

then for any $N \in \mathbb{N}$ there is exactly one element of the sequence

$$\{s_n\}_{n=1}^{\infty} \cup \{t_n\}_{n=1}^{\infty}$$

in the interval $(N, N + 1)$.

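Proof. Set

$$S_\alpha = \{\lfloor n\alpha \rfloor : n \in \mathbb{N}\}, \tag{7.5}$$

and

$$S_\beta = \{\lfloor n\beta \rfloor : n \in \mathbb{N}\}. \tag{7.6}$$

Then for each $N \in \mathbb{N}$, if

$$S_\alpha^N = \{x \in S_\alpha : x \le N\},$$

and

$$S_\beta^N = \{x \in S_\beta : x \le N\},$$

we have cardinalities

$$\left| S_\alpha^N \right| = \left\lfloor \frac{N}{\alpha} \right\rfloor \quad \text{and} \quad \left| S_\beta^N \right| = \left\lfloor \frac{N}{\beta} \right\rfloor.$$

We have

$$\frac{N}{\alpha} - 1 < \left\lfloor \frac{N}{\alpha} \right\rfloor < \frac{N}{\alpha} \tag{7.7}$$

and

$$\frac{N}{\beta} - 1 < \left\lfloor \frac{N}{\beta} \right\rfloor < \frac{N}{\beta}. \tag{7.8}$$

Adding (7.7) and (7.8) and using the fact that $1/\alpha + 1/\beta = 1$, we get

$$N - 2 < \left\lfloor \frac{N}{\alpha} \right\rfloor + \left\lfloor \frac{N}{\beta} \right\rfloor < N,$$

so

$$|S_\alpha^N \cup S_\beta^N| = \left\lfloor \frac{N}{\alpha} \right\rfloor + \left\lfloor \frac{N}{\beta} \right\rfloor = N - 1.$$

Hence,

$$\left| S_\alpha^{N+1} \cup S_\beta^{N+1} \right| = \left\lfloor \frac{N + 1}{\alpha} \right\rfloor + \left\lfloor \frac{N + 1}{\beta} \right\rfloor = N.$$

Thus, the number of elements of

$$\{s_n\}_{n=1}^{\infty} \cup \{t_n\}_{n=1}^{\infty}$$

in the interval $(N, N + 1)$ is

$$|S_\alpha^{N+1} \cup S_\beta^{N+1}| - |S_\alpha^N \cup S_\beta^N| = 1,$$

which secures the result. □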

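Corollary 7.5 With $S_\alpha$ and $S_\beta$ given by (7.5)–(7.6),

$$S_\alpha \cup S_\beta = \mathbb{N} \quad \text{and} \quad S_\alpha \cap S_\beta = \emptyset.$$

Also,

$$ND(S_\alpha) = \frac{1}{\alpha} \quad \text{and} \quad ND(S_\beta) = \frac{1}{\beta}.$$

Proof. Immediate from Theorem 7.8 is the first assertion. Also, from the proof,

$$ND(S_\alpha) = \lim_{N \to \infty} \frac{|S_\alpha^N|}{N} = \lim_{N \to \infty} \frac{\lfloor N/\alpha \rfloor}{N} = \frac{1}{\alpha},$$

and similarly,

$$ND(S_\beta) = \frac{1}{\beta},$$

as required. □

Remark 7.9 What is remarkable about the Beatty result is that the sequences complement each other in $\mathbb{N}$ as explicitly stated in Corollary 7.5. Indeed, two sequences that complement each other in $\mathbb{N}$ are called complementary.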
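The complementarity in Corollary 7.5 is easy to observe for a concrete pair. The sketch below takes $\alpha = \sqrt{2}$ and $\beta = 2 + \sqrt{2}$, an illustrative choice satisfying $1/\alpha + 1/\beta = 1$, and avoids floating-point error by computing $\lfloor n\sqrt{2} \rfloor$ exactly as the integer square root of $2n^2$. The bound `LIMIT` and the function names are assumptions for the example.

```python
from math import isqrt


def beatty_sqrt2(n):
    """floor(n * sqrt(2)), computed exactly as isqrt(2 * n * n)."""
    return isqrt(2 * n * n)


def beatty_2_plus_sqrt2(n):
    """floor(n * (2 + sqrt(2))) = 2n + floor(n * sqrt(2))."""
    return 2 * n + isqrt(2 * n * n)


LIMIT = 10_000
# Both sequences grow at least as fast as n, so n <= LIMIT covers every value <= LIMIT.
A = {beatty_sqrt2(n) for n in range(1, LIMIT + 1) if beatty_sqrt2(n) <= LIMIT}
B = {beatty_2_plus_sqrt2(n) for n in range(1, LIMIT + 1)
     if beatty_2_plus_sqrt2(n) <= LIMIT}

covers_everything = (A | B) == set(range(1, LIMIT + 1))
disjoint = not (A & B)
density_A = len(A) / LIMIT          # approximates 1/alpha = 1/sqrt(2)
```

The two sets partition $\{1, \ldots, 10^4\}$, and the proportion of elements in `A` is close to $1/\sqrt{2} \approx 0.7071$, in agreement with the natural densities stated in the corollary.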

Now that we have illustrated the natural density case, we return to Theo-rem 7.7 on page 258 from the perspective of Dirichlet density.

Theorem 7.9 Dirichlet: Primes and Density

If $a, m \in \mathbb{Z}$ with $\gcd(a, m) = 1$, and

$$S_p^a = \{p \in \mathbb{N} : p \text{ is prime and } p \equiv a \pmod{m}\},$$

then

$$D(S_p^a) = \frac{1}{\phi(m)}.$$


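Proof. We begin with some claims that will resolve the issue.

Claim 7.2

$$\sum_{p \equiv a \,(\mathrm{mod}\ m)} p^{-s} = \frac{1}{\phi(m)} \sum_{\chi \in G_m^{\mathrm{char}}} \chi^{-1}(a) \sum_{p \nmid m} \frac{\chi(p)}{p^s}.$$

We have

$$\frac{1}{\phi(m)} \sum_{\chi \in G_m^{\mathrm{char}}} \chi^{-1}(a) \sum_{p \nmid m} \frac{\chi(p)}{p^s} = \frac{1}{\phi(m)} \sum_{\chi \in G_m^{\mathrm{char}}} \sum_{p \nmid m} \frac{\chi(a^{-1}p)}{p^s}.$$

However, by Theorem 7.2 on page 249,

$$\sum_{\chi \in G_m^{\mathrm{char}}} \chi(a^{-1}p) = \begin{cases} \phi(m) & \text{if } a^{-1}p \equiv 1 \pmod{m},\\ 0 & \text{otherwise.} \end{cases}$$

Thus,

$$\frac{1}{\phi(m)} \sum_{\chi \in G_m^{\mathrm{char}}} \chi^{-1}(a) \sum_{p \nmid m} \frac{\chi(p)}{p^s} = \sum_{p \equiv a \,(\mathrm{mod}\ m)} p^{-s},$$

which is the claim.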

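Claim 7.3 For $\chi \neq \chi_0$, $\sum_{p \nmid m} \chi(p)/p^s$ remains bounded as $s \to 1$.

We have

$$\sum_{p \nmid m} \frac{\chi(p)}{p^s} = \sum_{p \nmid m} \sum_{n=1}^{\infty} \frac{\chi(p)^n}{n p^{ns}} - \sum_{p \nmid m} \sum_{n=2}^{\infty} \frac{\chi(p)^n}{n p^{sn}}. \tag{7.9}$$

However,

$$\left| \sum_{p \nmid m} \sum_{n=2}^{\infty} \frac{\chi(p)^n}{n p^{sn}} \right| \le \sum_{p \nmid m} \sum_{n=2}^{\infty} \frac{1}{p^{ns}} = \sum_{p \nmid m} \frac{1}{p^s(p^s - 1)},$$

where the last equality comes from a fact about geometric series (see [68, Theorem 1.2, p. 2]):

$$\sum_{n=2}^{\infty} \frac{1}{p^{ns}} = \sum_{n=0}^{\infty} \frac{1}{p^{ns}} - 1 - p^{-s} = \lim_{N\to\infty} \frac{p^{-(N+1)s} - 1}{p^{-s} - 1} - 1 - p^{-s} = \frac{1}{1 - p^{-s}} - 1 - p^{-s} = \frac{1}{p^s(p^s - 1)}.$$

Also, for $s \ge 1$,

$$\sum_{p \nmid m} \frac{1}{p^s(p^s - 1)} \le \sum_{p \nmid m} \frac{1}{p(p-1)} < \sum_{n=2}^{\infty} \frac{1}{n(n-1)} = \sum_{n=2}^{\infty} \left( \frac{1}{n-1} - \frac{1}{n} \right) = 1,$$

since the partial sums of the last series telescope to $1 - 1/N$. Hence we have shown that

$$\sum_{p \nmid m} \sum_{n=2}^{\infty} \frac{1}{p^{ns}}$$

is bounded as $s \to 1^+$. To complete Claim 7.3, it remains to show that the first sum on the right-hand side of (7.9) is bounded; see Exercise 7.6 on page 261 for some background to the following. Since

$$\exp\left( \sum_{n=1}^{\infty} \frac{z^n}{n} \right) = \frac{1}{1 - z},$$

then by substituting $z = \chi(p)p^{-s}$, we have

$$\exp\left( \sum_{n=1}^{\infty} \frac{\chi(p)^n}{n p^{ns}} \right) = (1 - \chi(p)p^{-s})^{-1},$$

so it follows by taking products over primes then taking logarithms that

$$\sum_{p} \sum_{n=1}^{\infty} \frac{\chi(p)^n}{n p^{ns}} = \log_e L(s, \chi),$$

and by Theorems 7.5 on page 256 and 7.6 on page 257, $L(1, \chi)$ remains bounded for $\chi \neq \chi_0$. Hence, the same is true for

$$\sum_{p \nmid m} \sum_{n=1}^{\infty} \frac{\chi(p)^n}{n p^{ns}},$$

so we have Claim 7.3.

Now by (5.30) on page 221 and Exercise 5.18 on page 227,

$$\sum_{p \nmid m} p^{-s} \sim \log_e (s-1)^{-1},$$

and by Claim 7.3,

$$\sum_{p \nmid m} \frac{\chi(p)}{p^s}$$

remains bounded for all $\chi \in G_m^{\mathrm{char}}$ with $\chi \neq \chi_0$ as $s \to 1$. Now we may use Claim 7.2 to conclude that

$$\sum_{p \equiv a \,(\mathrm{mod}\ m)} p^{-s} \sim \frac{1}{\phi(m)} \log_e (s-1)^{-1},$$

namely

$$D(S_p^a) = \lim_{s \to 1} \frac{\sum_{p \equiv a \,(\mathrm{mod}\ m)} p^{-s}}{\log_e (s-1)^{-1}} = \frac{1}{\phi(m)},$$

which secures our density result. □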


Corollary 7.6 Dirichlet’s Theorem 7.7 on page 258

$$|S_p^a| = \infty.$$

Proof. If $S_p^a$ were finite, then by Exercise 7.11, $D(S_p^a) = 0$, contradicting Theorem 7.9. □
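As a numerical illustration (a sketch only, not a proof), one can tabulate how the primes up to a bound distribute among the reduced residue classes modulo $m$. Theorem 7.9 concerns Dirichlet density, whereas the code below measures a finite proportion, so it should be read only as empirical evidence that each class receives a share close to $1/\phi(m)$. The function names are assumptions made for this example.

```python
from math import gcd, isqrt

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def residue_class_shares(m, bound):
    """Share of the primes p <= bound with gcd(p, m) = 1 in each class a (mod m)."""
    counts = {a: 0 for a in range(m) if gcd(a, m) == 1}
    for p in primes_up_to(bound):
        if p % m in counts:  # skips the finitely many primes dividing m
            counts[p % m] += 1
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}
```

For example, `residue_class_shares(4, 100000)` returns two shares, one for each of the classes 1 and 3 modulo 4, both close to $1/\phi(4) = 1/2$.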

Exercises

7.10. Prove that $D(\mathbb{N}) = 1$. Conclude that any $S \subseteq \mathbb{N}$ where $S$ contains all but finitely many primes must also satisfy $D(S) = 1$.

7.11. Prove that if $S$ is a set of primes in $\mathbb{Z}$, and $S \subseteq W \subseteq \mathbb{N}$ with $|S| < \infty = |W|$, then

$$ND(S, W) = 0 = D(S).$$

7.12. Given sets $S$ and $W$ of primes in $\mathbb{Z}$ with $S \cap W = \emptyset$, and such that $D(S), D(W)$ both exist, prove that

$$D(S \cup W) = D(S) + D(W).$$

7.13. In general a multiplicative character is a mapping from $\mathbb{F}_p$ (the finite field of $p$ elements for a prime $p$) into $\mathbb{C}$ such that

$$\chi(ab) = \chi(a)\chi(b)$$

for all $a, b \in \mathbb{F}_p$. For instance, the Legendre symbol $(a/p)$ for an odd prime $p$ is such a character. The principal character $\chi_0$ satisfies $\chi_0(a) = 1$ for all $a \in \mathbb{F}_p$, including $a = 0$, whereas $\chi(0) = 0$ for all $\chi \neq \chi_0$. Prove that each of the following hold if $a \in \mathbb{F}_p^*$, the multiplicative group of nonzero elements of $\mathbb{F}_p$.

(a) $\chi(1_p) = 1$, where $1_p$ is the unit in $\mathbb{F}_p^*$.

(b) $\chi(a) = \zeta_{p-1}^j$ for some $j = 1, 2, \ldots, p-1$, where $\zeta_{p-1}$ is a primitive $(p-1)$-st root of unity.

(c) $\chi(a^{-1}) = \chi(a)^{-1} = \overline{\chi(a)}$.

Exercises 7.14–7.18 are all with reference to Exercise 7.13.

7.14. If $\chi$ is a multiplicative character prove that

$$\sum_{a \in \mathbb{F}_p} \chi(a) = \begin{cases} 0 & \text{if } \chi \neq \chi_0,\\ p & \text{if } \chi = \chi_0. \end{cases}$$

(Hint: use the same technique as given in the proof of Theorem 7.2 on page 249.)

7.15. If $\chi, \psi$ are multiplicative characters, define the map

$$\chi\psi : \mathbb{F}_p^* \to \mathbb{C}$$

by $\chi\psi(a) = \chi(a)\psi(a)$, for $a \in \mathbb{F}_p^*$. Also, define the map

$$\chi^{-1} : \mathbb{F}_p^* \to \mathbb{C}$$

by $\chi^{-1}(a) = \chi(a)^{-1}$ for $a \in \mathbb{F}_p^*$. Prove that these are again multiplicative characters and that the set of all multiplicative characters is in fact a cyclic group $G$ of order $p - 1$.

7.16. With reference to Exercise 7.15, prove that if $a \in \mathbb{F}_p^*$ and $a \neq 1_p$, then

$$\sum_{\chi \in G} \chi(a) = 0.$$

(Hint: Use the same technique as given in the proof of Theorem 7.2 on page 249.)

7.17. Prove that if $a \in \mathbb{F}_p^*$, $m \mid (p-1)$, and $x^m \neq a$ for any $x \in \mathbb{F}_p^*$, then there is a character $\chi$ on $\mathbb{F}_p$ such that $\chi^m = \chi_0$ and $\chi(a) \neq 1$.

7.18. For $a \in \mathbb{F}_p$, let $N(m, a)$ denote the number of solutions of $x^m = a$ in $\mathbb{F}_p$, where $m \mid (p-1)$. Prove that

$$N(m, a) = \sum_{\chi^m = \chi_0} \chi(a).$$

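7.19. With reference to Exercise 7.18, prove that if $p > 2$ is prime, then

$$N(2, a) = 1 + (a/p)$$

where $(\cdot/p)$ is the Legendre symbol.

In Exercises 7.20–7.24, we will be referring to the following concept. If $\chi$ is a multiplicative character on $\mathbb{F}_p$, $a \in \mathbb{F}_p$, and $\zeta_p$ is a primitive $p$-th root of unity, then

$$G_a(\chi) = \sum_{j \in \mathbb{F}_p} \chi(j)\zeta_p^{aj}$$

is called a Gauss sum over $\mathbb{F}_p$ belonging to the character $\chi$.

7.20. Prove that if $a \neq 0$ and $\chi \neq \chi_0$, then $G_a(\chi) = \chi(a^{-1})G_1(\chi)$.

7.21. Prove that

$$\sum_{j \in \mathbb{F}_p} \zeta_p^{aj} = \begin{cases} 0 & \text{if } a \neq 0,\\ p & \text{if } a = 0. \end{cases}$$

7.22. Prove that

$$G_a(\chi) = \begin{cases} 0 & \text{if } \chi = \chi_0 \text{ and } a \neq 0,\\ 0 & \text{if } \chi \neq \chi_0 \text{ and } a = 0,\\ p & \text{if } \chi = \chi_0 \text{ and } a = 0. \end{cases}$$

(Hint: Use Exercises 7.21 and 7.14 on page 268.)

7.23. Prove that for a prime $p$,

$$p^{-1} \sum_{j \in \mathbb{F}_p} \zeta_p^{j(a-b)} = \begin{cases} 0 & \text{if } a \neq b,\\ 1 & \text{if } a = b. \end{cases}$$

7.24. Prove that if $\chi \neq \chi_0$, then

$$|G_a(\chi)| = \sqrt{p}.$$

(Hint: Use Exercises 7.20–7.21 and Exercise 7.23 to evaluate

$$\sum_{j \in \mathbb{F}_p} G_j(\chi)\overline{G_j(\chi)}$$

in two ways.)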
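Before attempting Exercises 7.20–7.24, it can help to see a Gauss sum evaluated numerically. The following is a minimal sketch, with the Legendre symbol taken as the character $\chi$ and $\zeta_p = e^{2\pi i/p}$. The function names are assumptions made for this illustration, and the floating-point values are only approximations to the exact algebraic numbers.

```python
import cmath

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def gauss_sum(a, p):
    """Numerical value of G_a(chi) for chi the Legendre symbol modulo p."""
    zeta = cmath.exp(2j * cmath.pi / p)  # a primitive p-th root of unity
    return sum(legendre(j, p) * zeta ** ((a * j) % p) for j in range(p))
```

With these definitions, `abs(gauss_sum(1, p))` should agree with $\sqrt{p}$ up to rounding for small odd primes $p$, as Exercise 7.24 asserts for $a \neq 0$.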

Chapter 8

Applications to Diophantine Equations

One must divide one’s time between politics and equations. But our equations are much more important to me.
from writings of C. P. Snow in Einstein (1980), M. Goldsmith et al. (eds.)

Albert Einstein (1879–1955)
German-born theoretical physicist

In a first course in number theory, elementary Diophantine equations are studied and we assume herein familiarity with the fundamentals such as in [68, Chapters 1, 5, & 7], where norm-form equations, including Pell’s equation, are completely solved via continued fractions, as are linear equations by congruence conditions. We have already encountered some nonlinear Diophantine equations in our developments in Chapter 1, especially in Theorem 1.8 on page 14, where we looked at the Ramanujan–Nagell equation and its solutions. We revisit this equation in §8.2, where we study solutions of the generalized Ramanujan–Nagell equation introduced in Definition 1.10 on page 13. We begin with a theory to solve these latter equations.

8.1 Lucas–Lehmer Theory

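Let $\alpha$ and $\beta$ be the roots of

$$x^2 - \sqrt{R}\,x + Q = 0, \tag{8.1}$$

where $R, Q \in \mathbb{Z}$, with $\gcd(R, Q) = 1$. By Exercise 8.1 on page 275,

$$\alpha + \beta = \sqrt{R}, \quad \alpha\beta = Q, \quad\text{and}\quad \alpha - \beta = \sqrt{R - 4Q}. \tag{8.2}$$

Set $\sqrt{\Delta} = \sqrt{R - 4Q}$.

By Exercise 8.2 on page 275,

$$2\alpha = \sqrt{R} + \sqrt{\Delta} = \sqrt{R} + \sqrt{R - 4Q}, \tag{8.3}$$

and

$$2\beta = \sqrt{R} - \sqrt{\Delta} = \sqrt{R} - \sqrt{R - 4Q}. \tag{8.4}$$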

Definition 8.1 Lucas Functions

Let $n \ge 0$ be an integer. Then the following are called Lucas functions:

$$U_n = \frac{\alpha^n - \beta^n}{\alpha - \beta},$$

and

$$V_n = \alpha^n + \beta^n.$$

The above were dubbed functions rather than sequences by Lucas, then later extended by D. H. Lehmer; see [68, Biographies 1.18–1.19, pp. 63–64].

Remark 8.1 Note that when discussing divisibility properties of Lucas functions in what follows, in order to avoid confusion, we assume that a factor of $\sqrt{R}$ may be ignored in $U_n$ or $V_n$. For instance, if $R = 5$ and $Q = -3$, then $U_3 = 8$ and $U_6 = 112\sqrt{5}$. We say that $\gcd(U_3, U_6) = 8$, and $U_6$ is called even, ignoring $\sqrt{5}$. Also, $m, n$ are nonnegative integers throughout.

Theorem 8.1 Properties of Lucas Functions

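(a) $U_{n+2} = \sqrt{R}\,U_{n+1} - QU_n$.

(b) $V_{n+2} = \sqrt{R}\,V_{n+1} - QV_n$.

(c) $2Q^m V_{n-m} = V_nV_m - \Delta U_nU_m$ $(n > m)$.

(d) $V_n^2 - \Delta U_n^2 = 4Q^n$.

(e) $2U_{m+n} = U_nV_m + V_nU_m$.

(f) $2V_{m+n} = V_mV_n + \Delta U_mU_n$.

(g) For all $m \in \mathbb{N}$, $\left((V_1 + U_1\sqrt{\Delta})/2\right)^m = (V_m + U_m\sqrt{\Delta})/2$.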

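Proof. (a): From (8.1)–(8.4) and Definition 8.1, we have that

$$\sqrt{R}\,U_{n+1} - QU_n = (\alpha + \beta)\frac{\alpha^{n+1} - \beta^{n+1}}{\alpha - \beta} - \alpha\beta\,\frac{\alpha^n - \beta^n}{\alpha - \beta}$$

$$= \frac{\alpha^{n+2} - \alpha\beta^{n+1} - \beta^{n+2} + \beta\alpha^{n+1} - \alpha^{n+1}\beta + \alpha\beta^{n+1}}{\alpha - \beta} = \frac{\alpha^{n+2} - \beta^{n+2}}{\alpha - \beta} = U_{n+2}.$$

(b): From (8.1)–(8.4) we also have that

$$\sqrt{R}\,V_{n+1} - QV_n = (\alpha + \beta)(\alpha^{n+1} + \beta^{n+1}) - \alpha\beta(\alpha^n + \beta^n)$$

$$= \alpha^{n+2} + \alpha\beta^{n+1} + \beta\alpha^{n+1} + \beta^{n+2} - \alpha^{n+1}\beta - \alpha\beta^{n+1} = \alpha^{n+2} + \beta^{n+2} = V_{n+2}.$$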

(c): We use induction on $m$. If $m = 0$, then the result is clear. Assume that

$$2Q^{m-j}V_{n-m+j} = V_nV_{m-j} - \Delta U_nU_{m-j},$$

for $1 \le j < m$. Then by parts (a)–(b),

$$V_nV_m - \Delta U_nU_m = V_n(\sqrt{R}\,V_{m-1} - QV_{m-2}) - \Delta U_n(\sqrt{R}\,U_{m-1} - QU_{m-2})$$

$$= \sqrt{R}\,(V_nV_{m-1} - \Delta U_nU_{m-1}) - Q(V_nV_{m-2} - \Delta U_nU_{m-2})$$

$$= \sqrt{R}\,(2Q^{m-1}V_{n-m+1}) - Q(2Q^{m-2}V_{n-m+2}),$$

where the last equality is from the induction hypothesis, and this equals

$$2Q^{m-1}(\sqrt{R}\,V_{n-m+1} - V_{n-m+2}) = 2Q^mV_{n-m},$$

where the last equality is from part (b).

(d): Use induction on $n$. The base case is

$$V_0^2 - \Delta U_0^2 = 4, \quad\text{with } U_0 = 0,\ V_0 = 2.$$

The induction hypothesis is $V_i^2 - \Delta U_i^2 = 4Q^i$ for all $i < n$. By parts (a)–(b),

$$V_n^2 - \Delta U_n^2 = (\sqrt{R}\,V_{n-1} - QV_{n-2})^2 - \Delta(\sqrt{R}\,U_{n-1} - QU_{n-2})^2$$

$$= R(V_{n-1}^2 - \Delta U_{n-1}^2) - 2\sqrt{R}\,Q(V_{n-1}V_{n-2} - \Delta U_{n-1}U_{n-2}) + Q^2(V_{n-2}^2 - \Delta U_{n-2}^2),$$

which, by the induction hypothesis and part (c), must equal

$$4RQ^{n-1} - 2\sqrt{R}\,Q(2Q^{n-2}V_1) + Q^2(4Q^{n-2}),$$

and since $V_1 = \sqrt{R}$ by (8.2) and Definition 8.1, the latter equals $4Q^n$, which secures part (d).

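(e): We have from Definition 8.1,

$$U_nV_m + V_nU_m = \frac{(\alpha^n - \beta^n)(\alpha^m + \beta^m)}{\alpha - \beta} + \frac{(\alpha^n + \beta^n)(\alpha^m - \beta^m)}{\alpha - \beta}$$

$$= \frac{\alpha^{n+m} + \alpha^n\beta^m - \alpha^m\beta^n - \beta^{m+n} + \alpha^{n+m} - \alpha^n\beta^m + \alpha^m\beta^n - \beta^{m+n}}{\alpha - \beta} = \frac{2(\alpha^{n+m} - \beta^{n+m})}{\alpha - \beta} = 2U_{n+m}.$$

(f): From Definition 8.1 and (8.2),

$$V_mV_n + \Delta U_nU_m = (\alpha^m + \beta^m)(\alpha^n + \beta^n) + (\alpha - \beta)^2\,\frac{(\alpha^n - \beta^n)(\alpha^m - \beta^m)}{(\alpha - \beta)^2}$$

$$= \alpha^{n+m} + \alpha^m\beta^n + \alpha^n\beta^m + \beta^{m+n} + \alpha^{n+m} - \alpha^n\beta^m - \alpha^m\beta^n + \beta^{m+n} = 2(\alpha^{n+m} + \beta^{n+m}) = 2V_{n+m}.$$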

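(g): We use induction on $m$. For $m = 1$, the result is obvious. Assume that

$$\left( \frac{V_1 + U_1\sqrt{\Delta}}{2} \right)^{m-1} = \frac{V_{m-1} + U_{m-1}\sqrt{\Delta}}{2}.$$

Then

$$\left( \frac{V_1 + U_1\sqrt{\Delta}}{2} \right)^{m} = \left( \frac{V_1 + U_1\sqrt{\Delta}}{2} \right)\left( \frac{V_1 + U_1\sqrt{\Delta}}{2} \right)^{m-1} = \left( \frac{V_1 + U_1\sqrt{\Delta}}{2} \right)\left( \frac{V_{m-1} + U_{m-1}\sqrt{\Delta}}{2} \right)$$

$$= \frac{(V_1V_{m-1} + U_1U_{m-1}\Delta) + (U_1V_{m-1} + V_1U_{m-1})\sqrt{\Delta}}{4}.$$

By parts (e)–(f), this equals

$$\frac{2V_m + 2U_m\sqrt{\Delta}}{4} = \frac{V_m + U_m\sqrt{\Delta}}{2},$$

which secures the entire result. □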
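The recurrences in parts (a)–(b) give a convenient way to tabulate Lucas functions exactly. The following is a minimal sketch under the assumption that $R$ is not a perfect square, so that every value can be stored as a pair of integers `(x, y)` standing for $x + y\sqrt{R}$. The function names are illustrative and not part of the text. It reproduces the values in Remark 8.1 and lets one spot-check identities (d) and (e).

```python
def lucas_functions(R, Q, n_max):
    """Return lists U, V with U[n], V[n] stored as (x, y), meaning x + y*sqrt(R)."""
    def step(cur, prev):
        # sqrt(R)*(x + y*sqrt(R)) = R*y + x*sqrt(R); then subtract Q*prev.
        return (R * cur[1] - Q * prev[0], cur[0] - Q * prev[1])

    U = [(0, 0), (1, 0)]  # U_0 = 0, U_1 = 1
    V = [(2, 0), (0, 1)]  # V_0 = 2, V_1 = sqrt(R)
    for _ in range(2, n_max + 1):
        U.append(step(U[-1], U[-2]))
        V.append(step(V[-1], V[-2]))
    return U, V

def mul(a, b, R):
    """Product of a = x1 + y1*sqrt(R) and b = x2 + y2*sqrt(R)."""
    return (a[0] * b[0] + R * a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def check_identity_d(R, Q, n_max):
    """Check V_n^2 - Delta*U_n^2 = 4*Q^n for 0 <= n <= n_max, with Delta = R - 4Q."""
    delta = R - 4 * Q
    U, V = lucas_functions(R, Q, n_max)
    for n in range(n_max + 1):
        v2, u2 = mul(V[n], V[n], R), mul(U[n], U[n], R)
        if (v2[0] - delta * u2[0], v2[1] - delta * u2[1]) != (4 * Q ** n, 0):
            return False
    return True
```

With $R = 5$ and $Q = -3$ as in Remark 8.1, `lucas_functions(5, -3, 6)` gives `U[3] == (8, 0)` and `U[6] == (0, 112)`, that is, $U_3 = 8$ and $U_6 = 112\sqrt{5}$.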

In §8.2, we will use the properties given in Theorem 8.1 to solve the generalized Ramanujan–Nagell equation for certain cases as well as some related equations that we will develop therein. The exercises below are designed to give the reader a grounding in the properties developed above by expanding the theory.


Exercises

8.1. Verify the equations in (8.2) on page 271, where the positive square root in the formula for $\alpha - \beta$ is guaranteed by an appropriate selection of $\alpha$ and $\beta$.

8.2. Verify (8.3)–(8.4) on page 272.

In Exercises 8.3–8.12, prove each of the statements involving the Lucas functions given in Definition 8.1 on page 272.

8.3. (a) $U_{2n+1} \in \mathbb{Z}$, $V_{2n} \in \mathbb{Z}$.
(b) $U_{2n}$ and $V_{2n+1}$ are integer multiples of $\sqrt{R}$.

8.4. $2Q^mU_{n-m} = U_nV_m - V_nU_m$ $(n > m)$.

8.5. For $n \in \mathbb{N}$,

$$2^{n-1}U_n = \sum_{j=1}^{\lfloor (n+1)/2 \rfloor} \binom{n}{2j-1} V_1^{n-2j+1}\Delta^{j-1},$$

and

$$2^{n-1}V_n = \sum_{j=0}^{\lfloor n/2 \rfloor} \binom{n}{2j} V_1^{n-2j}\Delta^{j}.$$

8.6. $\gcd(U_n, Q) = 1 = \gcd(V_n, Q)$, and $\gcd(U_n, V_n)$ divides 2.

8.7. If $U_n$ is even, then one of the following must hold:

(a) $R \equiv 0 \pmod 4$, $Q$ is odd and $n$ is even.
(b) $R \equiv 2 \pmod 4$, $Q$ is odd and $n \equiv 0 \pmod 4$.
(c) $R$ is odd, $Q$ is odd and $n \equiv 0 \pmod 3$.

8.8. If $V_n$ is even, then one of the following must hold:

(a) $R \equiv 0 \pmod 4$ and $Q$ is odd.
(b) $R \equiv 2 \pmod 4$, $Q$ is odd and $n$ is even.
(c) $R$ and $Q$ are odd and $n \equiv 0 \pmod 3$.

8.9. If $m \mid n$, $m \ge 1$, then $U_m \mid U_n$.

8.10. If $m \mid n$ and $n/m$ is odd, then $V_m \mid V_n$.

8.11. If $\gcd(m, n) = g$, then $\gcd(U_m, U_n) = U_g$.

8.12. (a) Assume that $|Q| > 1$. Prove that $U_n \neq 0$ for any $n \in \mathbb{N}$.
(b) Give an example for each of the cases $Q = \pm 1$ to show that $U_n = 0$ for some $n \in \mathbb{N}$.
(c) Assume that $|Q| > 1$, and $m \in \mathbb{N}$. Prove that if $U_m = U_n$ and $V_m = V_n$, then $m = n$.


8.2 Generalized Ramanujan–Nagell Equations

All generalizations are dangerous, even this one.
Alexandre Dumas (Dumas fils) (1824–1895)
French dramatist, novelist, and principal creator of the 19th-century comedy of manners; illegitimate son of Dumas Père, also named Alexandre Dumas (1802–1870), author of The Count of Monte Cristo and The Three Musketeers.

Recall from Definition 1.10 on page 13 that the generalized Ramanujan–Nagell equation is given by

$$x^2 - D = p^n. \tag{8.5}$$

In Theorem 1.8 on page 14, we provided all solutions for $p = 2$ and $D = -7$, which were known by Ramanujan and later proved by Nagell to indeed be all of them. For the odd prime $p$ case, the history is varied; see [62, p. 70ff] for details.

We may use the result of §8.1 to solve certain of the equations in the title of this section. We begin with a result due to Alter and Kubota from 1973 [2], albeit they use different methods than the Lucas–Lehmer theory coupled with ideal theory that we employ below. Some of the following is adapted from [65].

Remark 8.2 With reference to Exercises 2.2–2.4 on page 66, a primitive $R$-ideal $I = (a, (b + \sqrt{D})/2)$ with $b \equiv D \pmod 2$ is called invertible if

$$\gcd\left( a, b, \frac{b^2 - D}{4a} \right) = 1.$$

Then the multiplication formulas on page 59 hold for such ideals, given the discussion therein. It can be shown that the invertible ideals form a group in the same fashion as in Definition 3.7 on page 109, and equivalence of such ideals is similarly denoted by $I \sim J$. Also, the order $d$ of an ideal $I$ in this class group of $\mathcal{O}_D = \mathbb{Z}[\omega_D]$ is defined by the property that $I^d \sim 1$ and $I^n \not\sim 1$ for any $n < d$. Furthermore, if $I^n \sim 1$, then $d \mid n$. Moreover, as with the ideal theory developed in Chapter 2, invertible ideals can be uniquely factored into products of prime ideals; see [62, §1.5] for the general development of these notions. We will use these facts for our special case below to prove our desired result on the equations in the title, and pave the way for the use of Lucas–Lehmer theory.

Theorem 8.2 Generalized Ramanujan–Nagell Equations: Solutions

Suppose that $p$ is an odd prime, $D \in \mathbb{Z}$ with $D < 0$, and $D \equiv 5 \pmod 8$. If $d \in \mathbb{N}$ is the least value such that

$$a^2 - Db^2 = 4p^d \tag{8.6}$$

for some $a, b \in \mathbb{N}$ with $\gcd(bD, 2p) = 1$, and $(D, p) \neq (-3, 7)$, then the generalized Ramanujan–Nagell equation

$$x^2 - D = p^n \tag{8.7}$$

has a solution $x, n \in \mathbb{N}$ if and only if $b = 1$ and $D = -3a^2 \pm 8$. The unique solution is given by

$$x = \left| \frac{a(a^2 + 3D)}{8} \right| \quad\text{and}\quad n = 3d.$$

Proof. By Exercise 2.4 on page 66, $I = (p, (a + \sqrt{b^2D})/2)$ is an ideal in the ring $\mathbb{Z}[(1 + \sqrt{b^2D})/2]$. Since $d$ is the least natural number such that (8.6) holds, then we have $I^d = (p^d, (a + \sqrt{b^2D})/2) = (p^d)$ and for no smaller value $m$ do we have $I^m$ equal to a principal ideal. Thus, $d$ is the order of $I$. If (8.7) holds for some $x \in \mathbb{N}$, then $(p^n) = (x + \sqrt{D})(x - \sqrt{D})$. Therefore, $(p^n) = I^n(I')^n$, where $I' = (p, (a - \sqrt{b^2D})/2)$. Hence,

$$(x + \sqrt{D})(x - \sqrt{D}) = I^n(I')^n, \tag{8.8}$$

and we claim that $(x + \sqrt{D})$ and $(x - \sqrt{D})$ are relatively prime. If not, then by Remark 8.2, there is a prime ideal $\mathcal{P}$ dividing both of them. Hence, both $x + \sqrt{D}$ and $x - \sqrt{D}$ are in $\mathcal{P}$, by the same reasoning as in the proof of Corollary 2.5 on page 76. Thus, both $p \in \mathcal{P}$ and $D \in \mathcal{P}$. However, $\gcd(p, D) = 1$, so there exist $r, s \in \mathbb{Z}$ such that $pr + Ds = 1 \in \mathcal{P}$. Hence, $\mathcal{P} = \mathcal{O}_D$, a contradiction, so they are indeed relatively prime. Therefore, by (8.8), we may assume that $I^n = (x + \sqrt{D})$ and $(I')^n = (x - \sqrt{D})$, without loss of generality since $p$ is prime and the only units in $\mathcal{O}_D$ are $\pm 1$; see Exercise 8.18 on page 285. Thus, $I^n \sim (I')^n \sim 1$, so by Remark 8.2, $d \mid n$.

Now we may invoke the Lucas–Lehmer theory. Let $\Delta = b^2D$, $R = a^2$, $V_1 = a$, and $U_1 = 1$ in the notation of §8.1. Then

$$p^d = N((V_1 + \sqrt{\Delta})/2),$$

so

$$N(x + \sqrt{D}) = p^n = N([(V_1 + \sqrt{\Delta})/2]^{n/d}) = N([(V_{n/d} + U_{n/d}\sqrt{\Delta})/2]),$$

where the last equality follows from part (g) of Theorem 8.1 on page 272. Also, since $x + \sqrt{D}$ and $x - \sqrt{D}$ are relatively prime, then

$$(V_{n/d} + U_{n/d}\sqrt{\Delta})/2 = \pm(x + \sqrt{D}) \text{ or } \pm(x - \sqrt{D}). \tag{8.9}$$

Claim 8.1 $n > d$.

Suppose that $n = d$. Then by (8.9), $U_1b = b = \pm 2$, but $b$ is odd, so $b = 1$. Therefore, $a^2 - D = 4p^d$ and $x^2 - D = p^d$. By subtracting the two equations, we get $a^2 - x^2 = 3p^d$. If $a - x = 3p^r$ and $a + x = p^s$, then $2a = 3p^r + p^s$. Since $p \nmid 2a$, then $r = 0$, and $s = d$. Therefore, $x = a - 3$, and $p^d = a + x = 2a - 3$, so

$$D = x^2 - p^d = (a - 3)^2 - 2a + 3 = a^2 - 8a + 12 < 0.$$

The latter can hold only when $a < 6$. Since $a$ is odd and $D \equiv 5 \pmod 8$, then only $a = 5$ works, namely when $D = -3$, $x = 2$, $p = 7$, and $d = 1$. This is the entire analysis. The reason is that 3 can only divide one of $a - x$ or $a + x$, and since $p^d$ can only divide one of the factors, we would have $a + x = 3p^d$ and $a - x = 1$ otherwise, which one can easily show to be impossible.

By Claim 8.1, $n > d$ so $bU_{n/d} = \pm 2$, and $V_{n/d} = \pm 2x$. Since $b$ is odd, then $b = 1$. By Exercises 8.7–8.8 on page 275,

$$n/d \equiv 0 \pmod 3 \quad\text{and}\quad U_3 \mid U_{n/d}.$$

However, $U_3 = (3a^2 + D)/4$. Therefore, $(3a^2 + D)/4$ divides $\pm 2$. If $(3a^2 + D)/4 = \pm 1$, then $3a^2 \equiv 7 \pmod 8$, so $a^2 \equiv 5 \pmod 8$, a contradiction. It follows that

$$D = -3a^2 \pm 8. \tag{8.10}$$

Now we consider

$$\frac{V_{n/d} + U_{n/d}\sqrt{\Delta}}{2} = \left( \frac{V_{n/(3d)} + U_{n/(3d)}\sqrt{\Delta}}{2} \right)^3 = \frac{1}{8}\left[ V_{n/(3d)}^3 + 3U_{n/(3d)}^2V_{n/(3d)}D + \left( 3V_{n/(3d)}^2U_{n/(3d)} + U_{n/(3d)}^3D \right)\sqrt{\Delta} \right].$$

Hence,

$$\left( 3V_{n/(3d)}^2U_{n/(3d)} + U_{n/(3d)}^3D \right)/4 = U_{n/d},$$

and

$$\left( V_{n/(3d)}^3 + 3U_{n/(3d)}^2V_{n/(3d)}D \right)/4 = V_{n/d}.$$

In other words,

$$3V_{n/(3d)}^2U_{n/(3d)} + U_{n/(3d)}^3D = \pm 8, \tag{8.11}$$

and

$$V_{n/(3d)}^3 + 3U_{n/(3d)}^2V_{n/(3d)}D = \pm 8x. \tag{8.12}$$

From (8.11), $U_{n/(3d)} = \pm 1$ or $\pm 2$. If $U_{n/(3d)} = \pm 1$, then (8.11) becomes

$$3V_{n/(3d)}^2 + D = \pm 8. \tag{8.13}$$

In view of (8.10), we must have $V_{n/(3d)} = a = V_1$. Since $U_{n/(3d)} = \pm 1$, then by part (d) of Theorem 8.1, $4p^{n/3} = a^2 - D$. However, $a^2 - D = 4p^d$, by (8.6). Hence, $n = 3d$. Furthermore, from (8.12), $x = |a(a^2 + 3D)/8|$, as required.

If $U_{n/(3d)} = \pm 2$, then (8.10) forces (8.11) to become

$$(V_{n/(3d)}/2)^2 - a^2 = \pm 3,$$

for which only $a = 1$ is possible. From (8.10), we get $D = -11$, $p^d = 3 = n$, so $\pm 2 = U_{n/(3d)} = U_1 = 1$, a contradiction. (Notice that the case where $D = -11$ is covered by $U_{n/(3d)} = \pm 1$.)

The converse is clear. □


Example 8.1 If $D = -19$, $a = 3$, $b = 1$, and $p = 7$, then $d = 1$ and

$$x^2 + 19 = 7^n$$

has the unique positive solution $x = 18$ and $n = 3$.

Example 8.2 If $D = -83$, then the unique positive solution to

$$x^2 + 83 = 3^n$$

is $x = 140$ where $b = 1$, $n = 9$, $d = 3$, and $a = 5$.

Example 8.3 If $D = -41075 = -5^2 \cdot 31 \cdot 53$, then

$$x^2 + 41075 = 13691^n$$

has the unique positive solution $x = 1601964$ with $n = 3$, $b = d = 1$, and $a = 117$. (Here $a^2 - D = 54764 = 4 \cdot 13691$, in agreement with (8.6).)

Remark 8.3 In Theorem 8.2, we only considered the case where $D \equiv 5 \pmod 8$ and $D < 0$. We observe that if $D \equiv 3 \pmod 4$, then (8.6) cannot hold since $D \not\equiv a^2 \pmod 4$ in that case. Also, if $D \equiv 1 \pmod 8$, then (8.6) implies that $p = 2$. Hence, we need an equation different from (8.6) to treat other cases. We have a partial solution to the remaining cases in what follows.

Theorem 8.3 More Solutions to Ramanujan–Nagell

Let $p > 2$ be prime, $D \in \mathbb{Z}$, $D < 0$, and $p \nmid D$. Suppose that $d \in \mathbb{N}$ is the smallest solution to

$$a^2 - Db^2 = p^d,$$

for $a, b \in \mathbb{N}$. Then the Diophantine equation

$$x^2 - D = p^n \tag{8.14}$$

has a solution $x, n \in \mathbb{N}$ with $n > d$, if and only if $b = 1$, and $n = dq$, where $q > 2$ is prime. In particular, if $n = 3m$, then (8.14) has a solution $x, m \in \mathbb{N}$ with $n > d$ if and only if

$$d = m = 1, \quad D = -3a^2 \pm 1, \quad p = 4a^2 \pm 1, \quad\text{and}\quad x = 8a^3 \pm 3a.$$

Proof. Suppose that $q$ is any prime dividing $n \in \mathbb{N}$. Also, let $m = n/q$ be the least value such that $x^2 - D = p^{qm}$ has a solution $x \in \mathbb{N}$. The first part of this proof employs essentially the same reasoning as that of Theorem 8.2 on page 276, namely there is a primitive $\mathcal{O}_{4D} = [1, \sqrt{D}]$-ideal $I = [p, c + \sqrt{D}]$, with $I^d \sim 1$ and $d \mid n$. The only difference here is that we are working in the order $\mathcal{O}_{4D}$ rather than the order $[1, (1 + \sqrt{D})/2]$ used in Theorem 8.2. Now we invoke Lucas–Lehmer theory again.

Set $\Delta = 4b^2D$, $R = 4a^2$, $Q = p^d$, $U_1 = 1$, and $V_1 = 2a$. Then

$$p^d = a^2 - b^2D = N[(V_1 + \sqrt{\Delta})/2]$$

and

$$N(x + \sqrt{D}) = p^n = N[(V_{n/d} + U_{n/d}\sqrt{\Delta})/2].$$

Hence, $bU_{n/d} = \pm 1$ and $V_{n/d} = \pm 2x$. Thus, $b = 1$. If $n/d \not\equiv 0 \pmod q$, then $q \mid d$, and so by the minimality of $m$, we must have $qm = n = d$, contradicting the hypothesis. If $q = 2$, then $U_2 = V_1 = 2a \mid U_{n/d}$ by Exercise 8.9 on page 275. This contradiction ensures that $q > 2$, and $n/d \equiv 0 \pmod q$. By Exercise 8.9 again, $U_q \mid U_{n/d}$, so $U_q = \pm 1$. Thus $(V_q/2)^2 - D = p^{dq}$. By the minimality of $m$, we must have $n = dq$.

If $q = 3$, then $U_3 = U_{n/d}$. Since $U_3 = 3a^2 + D$, then $D = -3a^2 \pm 1$. Therefore, $p^d = a^2 - D = 4a^2 \pm 1$. If $p^d = 4a^2 - 1$, then $p^d = (2a - 1)(2a + 1)$, which is possible only for $a = 1 = d$, since $p$ is an odd prime. Hence, $D = -2$. The solutions are

$$a^2 - D = 1^2 + 2 = 3 = p^d \quad\text{with}\quad x^2 - D = 5^2 + 2 = 3^3 = p^n.$$

This exhausts the case where $p = 4a^2 - 1$, namely $D = -3a^2 + 1$. We assume henceforth that $p^d = 4a^2 + 1$, and $D = -3a^2 - 1$.

Claim 8.2 Since $p^d = 4a^2 + 1$, then $d = 1$.

By repeated use of the equation for sums of two squares given in Remark 1.12 on page 27, and a simple induction argument, we see that no prime power $p^d$ can be a sum of two squares with $1^2$ as one of the summands unless $d = 1$.

If $n$ is even, then $U_2 \mid U_n$, by Exercise 8.9 again. However, by part (e) of Theorem 8.1, $U_2 = V_1 = 2a$ divides $U_n = \pm 1$, a contradiction. Hence $n$ is odd. By Exercise 8.10, $(2a^3 + 6aD) = V_3 \mid V_n = \pm 2x$. Thus,

$$\pm 2x = 2a^3 + 6aD = 2a^3 + 6a(-3a^2 - 1) = -16a^3 - 6a.$$

Therefore, since $x \in \mathbb{N}$, $x = 8a^3 + 3a$. It remains to show that $n = 3$. If $n \neq 3$, then by Exercise 8.4, $2p^3U_{n-3} = U_nV_3 - V_nU_3$. However, $U_3 = -1$, $V_3 = -2x$, $U_n = \pm 1$, and $V_n = \pm 2x$, from the above analysis. Hence, $p^3U_{n-3} = 0$ or $\pm 2x$, a contradiction in any case since $p \nmid 2x$, and by part (a) of Exercise 8.12, $U_{n-3} \neq 0$. Hence, $n = 3$. □

The following is immediate from Theorem 8.3.

Corollary 8.1 Suppose that $p > 2$ is a prime not dividing $D \in \mathbb{Z}$, where $D < 0$. If there exist $a, b \in \mathbb{N}$ such that $p = a^2 - b^2D$, then

$$x^2 - D = p^{3d}$$

has a solution $x, d \in \mathbb{N}$ if and only if $b = d = 1$, $D = -3a^2 \pm 1$, $p = 4a^2 \pm 1$, and $x = 8a^3 \pm 3a$.
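The parametric family in Corollary 8.1 rests on a polynomial identity that is simple to spot-check. The sketch below fixes the sign pairing that appears in the proof of Theorem 8.3: $p = 4a^2 + 1$ goes with $D = -3a^2 - 1$ and $x = 8a^3 + 3a$, while $p = 4a^2 - 1$ goes with $D = -3a^2 + 1$ and $x = 8a^3 - 3a$. The function name and the `sign` parameter are assumptions made for illustration. The code checks only the identity $x^2 - D = p^3$ and does not test whether $p$ is prime.

```python
def corollary_8_1_family(a, sign):
    """Return (p, D, x) for sign = +1 or -1, using the pairing from the proof of Theorem 8.3."""
    p = 4 * a * a + sign
    D = -3 * a * a - sign  # note the opposite sign relative to p
    x = 8 * a ** 3 + sign * 3 * a
    return p, D, x

def satisfies_equation(a, sign):
    """Check both p = a^2 - D (so b = d = 1) and x^2 - D = p^3."""
    p, D, x = corollary_8_1_family(a, sign)
    return p == a * a - D and x * x - D == p ** 3
```

For example, `corollary_8_1_family(1, -1)` gives `(3, -2, 5)`, the solution $5^2 + 2 = 3^3$, and `corollary_8_1_family(2, 1)` gives `(17, -13, 70)`, that is, $70^2 + 13 = 17^3$.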


Example 8.4 Let $D = -2$. By Corollary 3.10 on page 151, $p = a^2 + 2b^2$ is solvable for any prime $p$ such that $p \equiv 1, 3 \pmod 8$. Therefore, by Corollary 8.1,

$$x^2 + 2 = p^{3m}$$

is solvable if and only if $b = m = 1$, $x = 5$, and $p = 3$, namely

$$1 + 2 = 3 \quad\text{and}\quad 5^2 + 2 = 3^3.$$

Here $a = 1$, $D = -3a^2 + 1$, $x = 8a^3 - 3a$, and $p = 4a^2 - 1$.

Example 8.5 If $D = -5$, and $p$ is a prime with $p \equiv 1, 9 \pmod{20}$, then by part (a) of Exercise 3.46 on page 153, $p = a^2 + 5b^2$ for some $a, b \in \mathbb{N}$. Therefore, by Corollary 8.1,

$$x^2 + 5 = p^{3m}$$

has no solutions $x, m \in \mathbb{N}$.

Remark 8.4 Note that in [12], Bugeaud and Shorey look at the generalized Ramanujan–Nagell equations of the form $D_1x^2 + D_2 = k^n$ in unknowns $x, n \in \mathbb{N}$. They provide necessary and sufficient conditions on $D_1$, $D_2$, and $k$ for the equation to have at most $2^{\omega(k)-1}$ solutions, where $\omega(k)$ denotes the number of distinct prime divisors of $k$. It follows that when $k$ is prime the necessary and sufficient conditions determine when the equation has at most one solution. They also completely solve the related equation $x^2 + 7 = 4y^n$, demonstrating that there are no solutions for $y > 2$, $n > 1$, and $x \in \mathbb{N}$. There are a couple of errors, however, in the paper, corrected by this author in [69], which closes the door on the equation in the title.

Exercises

8.13. If $D = -43$, and $x^2 + 43 = 47^{3d}$, find solutions if they exist.



8.16. If $D = -2209$, and $x^2 + 2209 = 17^n$, find solutions if they exist.

8.17. Find solutions of $x^2 + 161047 = 11^n$ if they exist.


8.3 Bachet’s Equation

Science is one thing, wisdom is another. Science is an edged tool, with which men play like children, and cut their own fingers.

Arthur Eddington (1882–1944)
British astrophysicist

We covered instances of Bachet’s equation (see [68, Biography 7.2, p. 279]),

$$y^2 = x^3 + k \tag{8.15}$$

in §1.4, and [68, §7.3, pp. 277–280]. We extend that investigation by looking at more advanced use of techniques to solve Bachet’s equation. In a beginning course in number theory Bachet’s equation is solved via elementary congruence conditions. Now that we have algebraic numbers at our disposal, we may proceed to show how those techniques may be applied. This falls in line with §8.2, where we applied the ideal theory and Lucas–Lehmer theory to solve instances of the generalized Ramanujan–Nagell equations. The reader should prepare by looking at Exercises 8.18–8.20 on page 285 to be reminded of the theory we developed in Chapters 1–2 and the facts we will use in the following.

Theorem 8.4 Solutions of Bachet’s Equation

Let $F = \mathbb{Q}(\sqrt{k})$ be a complex quadratic field with radicand $k < -1$ such that $k \not\equiv 1 \pmod 4$, and $h_{\mathcal{O}_F} \not\equiv 0 \pmod 3$. Then there are no solutions of (8.15) in integers $x, y$ except in the following cases: there exists an integer $u$ such that

$$(k, x, y) = (\pm 1 - 3u^2,\ 4u^2 \mp 1,\ \varepsilon u(3 \mp 8u^2)),$$

where the $\pm$ signs correspond to the $\mp$ signs and $\varepsilon = \pm 1$ is allowed in either case.

Proof. Suppose that for $k$ as given in the hypothesis, (8.15) has a solution.

Claim 8.3 $\gcd(x, 2k) = 1$.

Given that $y^2 \equiv 0, 1 \pmod 4$, and $k \equiv 2, 3 \pmod 4$, then

$$x^3 = y^2 - k \equiv 1, 2, 3 \pmod 4.$$

However, $x^3 \equiv 2 \pmod 4$ is not possible. Hence, $x$ is odd. Now let $p$ be a prime such that $p \mid \gcd(x, 2k)$, where $p > 2$ since $x$ is odd. Since $k$ is a radicand, it is squarefree, so

$$p \,\|\, k = y^2 - x^3. \tag{8.16}$$

However, $p \mid x$ so $p \mid y$, which implies that $p^2 \mid (y^2 - x^3)$, a contradiction to (8.16) that establishes the claim.

By Claim 8.3, there exist integers $r, s$ such that

$$rx + 2ks = 1. \tag{8.17}$$

Claim 8.4 The $\mathcal{O}_F$-ideals $(y + \sqrt{k})$ and $(y - \sqrt{k})$ are relatively prime.

If the claim does not hold, then there is a prime $\mathcal{O}_F$-ideal $\mathcal{P}$ dividing both of the given ideals by Theorem 2.13 on page 80. Therefore, by Corollary 2.5 on page 76, $y \pm \sqrt{k} \in \mathcal{P}$. Therefore, $2\sqrt{k} = y + \sqrt{k} - (y - \sqrt{k}) \in \mathcal{P}$, so

$$2\sqrt{k} \cdot \sqrt{k} = 2k \in \mathcal{P}. \tag{8.18}$$

Given that

$$(y + \sqrt{k})(y - \sqrt{k}) = (y^2 - k) = (x^3) = (x)^3,$$

then by Corollary 2.5 again, since $(x)^3 \subseteq \mathcal{P}$, then $\mathcal{P} \mid (x)^3$. However, since $\mathcal{P}$ is prime, $\mathcal{P} \mid (x)$, and once more by Corollary 2.5, we conclude that

$$x \in \mathcal{P}. \tag{8.19}$$

Now we invoke (8.17)–(8.19) to get that both $rx$ and $2ks$ are in $\mathcal{P}$, so $1 = rx + 2ks \in \mathcal{P}$, a contradiction that establishes the claim.

By Theorem 2.9 on page 73, $\mathcal{O}_F$ is a Dedekind domain, so by Claim 8.4 and Exercise 8.18 on page 285, there exists an integral $\mathcal{O}_F$-ideal $I$ such that $(y + \sqrt{k}) = I^3$. In other words, $I^3 \sim 1$, but $h_{\mathcal{O}_F} \not\equiv 0 \pmod 3$, so by Exercise 8.19, $I \sim 1$. Thus, by Theorem 1.3 on page 6, there exist integers $u, v$ such that $I = (u + v\sqrt{k})$. Hence, $(y + \sqrt{k}) = (u + v\sqrt{k})^3 = \left( [u + v\sqrt{k}]^3 \right)$.

By Exercise 8.20, there is a unit $w$ in $\mathcal{O}_F$ such that

$$y + \sqrt{k} = w(u + v\sqrt{k})^3, \tag{8.20}$$

and by Theorem 1.4 on page 8, $w = \pm 1$. Now we conjugate (8.20) to get

$$y - \sqrt{k} = w(u - v\sqrt{k})^3. \tag{8.21}$$

Hence,

$$x^3 = y^2 - k = (y - \sqrt{k})(y + \sqrt{k}) = w^2(u + v\sqrt{k})^3(u - v\sqrt{k})^3 = (u^2 - v^2k)^3.$$

Therefore,

$$x = u^2 - v^2k. \tag{8.22}$$

Now by adding (8.20)–(8.21), we get

$$2y = w\left[ (u + v\sqrt{k})^3 + (u - v\sqrt{k})^3 \right] = 2w(u^3 + 3uv^2k), \tag{8.23}$$

and by subtracting (8.21) from (8.20), we get

$$2\sqrt{k} = w\left[ (u + v\sqrt{k})^3 - (u - v\sqrt{k})^3 \right] = 2w\sqrt{k}\,(3u^2v + v^3k). \tag{8.24}$$

Hence, from (8.23)–(8.24), we get, respectively, that

$$y = w(u^3 + 3uv^2k) \tag{8.25}$$

and

$$1 = w(3u^2v + v^3k) = wv(3u^2 + v^2k). \tag{8.26}$$

From (8.26), we get that $v = \pm w$, so from (8.22), (8.25)–(8.26), we have

$$x = u^2 - k, \quad y = w(u^3 + 3uk), \quad\text{and}\quad 1 = \pm(3u^2 + k).$$

It follows that $k = \pm 1 - 3u^2$, $x = 4u^2 \mp 1$, and $y = \varepsilon u(3 \mp 8u^2)$, where $\varepsilon = \pm 1$ is allowed in either case. Therefore, the two cases are encapsulated in the following: $(k, x, y) = (\pm 1 - 3u^2,\ 4u^2 \mp 1,\ \varepsilon u(3 \mp 8u^2))$, and

$$x^3 + k = (4u^2 \mp 1)^3 \pm 1 - 3u^2 = 64u^6 \mp 48u^4 + 9u^2 = (\varepsilon u(3 \mp 8u^2))^2 = y^2,$$

as required. □
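The conclusion of Theorem 8.4 can be compared against a direct search. The following is a minimal sketch with hypothetical function names: one function lists the integer solutions of $y^2 = x^3 + k$ with $|x|$ up to a chosen bound, and the other evaluates the parametrization $(k, x, y) = (\pm 1 - 3u^2, 4u^2 \mp 1, \varepsilon u(3 \mp 8u^2))$ for a given $k$. A bounded search cannot prove that no further solutions exist; it only illustrates agreement within the search range. The code does not check the class number hypothesis.

```python
from math import isqrt

def bachet_solutions(k, x_max):
    """All integer (x, y) with y^2 = x^3 + k and |x| <= x_max, sorted."""
    sols = []
    for x in range(-x_max, x_max + 1):
        rhs = x ** 3 + k
        if rhs < 0:
            continue
        y = isqrt(rhs)
        if y * y == rhs:
            sols.append((x, y))
            if y != 0:
                sols.append((x, -y))
    return sorted(sols)

def theorem_8_4_prediction(k):
    """Solutions predicted by the parametrization, for k = +1 - 3u^2 or k = -1 - 3u^2."""
    out = set()
    for sign in (1, -1):
        t = sign - k  # t = 3u^2 when k = sign - 3u^2
        if t > 0 and t % 3 == 0:
            u = isqrt(t // 3)
            if 3 * u * u == t:
                x = 4 * u * u - sign
                y = u * (3 - sign * 8 * u * u)
                out.update({(x, y), (x, -y)})
    return sorted(out)
```

For $k = -2$ both functions return the pair of solutions $(3, \pm 5)$, and for $k = -13$ (where $u = 2$) both return $(17, \pm 70)$ when the search bound is at least 17.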

Remark 8.5 Note that in Theorem 8.4, $u$ is odd when $k = 1 - 3u^2$ and $u$ is even when $k = -1 - 3u^2$ by the hypothesis that $k \not\equiv 1 \pmod 4$, and the fact that $k$ is a radicand, which precludes that $k \equiv 0 \pmod 4$; see Definition 3.11 on page 121.

Example 8.6 We may now easily achieve a result that we proved about Bachet’s equation in Chapter 1 via Theorem 8.4 as follows. If $k = -2$, then $(x, y) = (3, \pm 5)$ are the only solutions of (8.15), which is Theorem 1.19 on page 47.

Example 8.7 We may also invoke some results from §8.2 to illustrate Theorem 8.4 as follows. In Example 8.4 on page 281, we looked at $y^2 + 2 = p^{3m}$, changing the notation to suit our current situation, when $p$ is a prime of the form $p = a^2 + 2b^2$. We saw that the only solutions are for $b = m = 1$, $y = 5$, and $p = 3$. In terms of Theorem 8.4, $k = -2$, $x = p^m = 3 = 4u^2 - 1$, where $u = 1$. This brings us back to Example 8.6 for yet another interpretation.

Example 8.8 Corollary 8.1 on page 280 in §8.2 may be illustrated here as well. That result told us that, in our current notation,

$$y^2 = p^{3d} + k$$

for a prime $p = u^2 - kv^2$ and $k < 0$ has a solution if and only if $v = d = 1$, $k = \pm 1 - 3u^2$, $y = 8u^3 \pm 3u$, so

$$p = 4u^2 \pm 1,$$

which we see is the conclusion of Theorem 8.4 with the relevant sign associations.

See Exercises 8.21–8.24 for more examples. Also, see Exercise 8.25 for results similar to Theorem 8.4 on page 282 for the case where $k > 0$.

Exercises

8.18. Suppose that $I, J$ are nonzero integral $R$-ideals where $R$ is a Dedekind domain with $I$ and $J$ relatively prime; see Definition 2.15 on page 79. Prove that if $K$ is an $R$-ideal and $n \in \mathbb{N}$ such that $IJ = K^n$, then there exist $R$-ideals $\mathcal{I}, \mathcal{J}$ such that $I = \mathcal{I}^n$, $J = \mathcal{J}^n$, and $K = \mathcal{I}\mathcal{J}$.
(Hint: use Theorem 2.12 on page 77.)

8.19. Let $\mathcal{O}_F$ be the ring of integers of an algebraic number field $F$ with class number $h_{\mathcal{O}_F}$. Prove that if $I$ is an integral $\mathcal{O}_F$-ideal such that $I^n \sim 1$ for some $n \in \mathbb{N}$ with $\gcd(h_{\mathcal{O}_F}, n) = 1$, then $I \sim 1$.

8.20. Let $\alpha, \beta$ be nonzero elements in a Dedekind domain $R$. Prove that the principal $R$-ideals satisfy $(\alpha) = (\beta)$ if and only if $\alpha = \beta u$ where $u$ is a unit in $R$.

8.21. Suppose that $p$ is a prime of the form $p = u^2 + 13v^2$ for some $u, v \in \mathbb{N}$. Find all solutions to $y^2 = p^{3m} - 13$, for $m \in \mathbb{N}$, if any exist.
(Note that $-13$ is the smallest value of $|k|$ of the form $k = -1 - 3u^2$ such that the hypothesis of Theorem 8.4 is satisfied. Also, $h_{\mathbb{Z}[\sqrt{-13}]} = 2$.)

8.22. Find all solutions of $y^2 = x^3 - 193$ if they exist.
(With reference to Exercise 8.21, the next smallest $|k|$ of the form $k = -1 - 3u^2$ such that the hypothesis of Theorem 8.4 is satisfied is $k = -193$. Also, $h_{\mathbb{Z}[\sqrt{-193}]} = 4$.)

8.23. Find all solutions of $y^2 = x^3 - 47$ if they exist. (Note that $h_{\mathbb{Z}[\sqrt{-47}]} = 5$.)

8.24. Find all solutions of $y^2 = x^3 - 57$ if they exist. (Note that $h_{\mathbb{Z}[\sqrt{-57}]} = 4$.)

8.25. Suppose that $k \in \mathbb{N}$ is a radicand of a real quadratic field $F = \mathbb{Q}(\sqrt{k})$ and $k \not\equiv 1 \pmod 4$, such that $h_{\mathcal{O}_F} \not\equiv 0 \pmod 3$, with $F$ having fundamental unit $\varepsilon_k$; see page 259. Let $\varepsilon = \varepsilon_k$ if $\varepsilon_k$ has norm 1, let $\varepsilon = \varepsilon_k^2$ otherwise, and set $\varepsilon = T + U\sqrt{k}$. Prove that (8.15) has no solutions if $k \equiv 4 \pmod 9$ and $U \equiv 0 \pmod 9$.
(Hint: Assume there is a solution $(x, y)$ to (8.15). Then you may assume that $y + \sqrt{k} = w(u + v\sqrt{k})^3$ for a unit $w \in \mathcal{O}_F$ and some $u, v \in \mathbb{Z}$, since the argument is the same as in the proof of Theorem 8.4.)
(Note that more results for $k > 0$ of this nature, which typically involve congruences on $T$ and $U$, may be found, for instance, in Mordell’s classic text [73] on Diophantine equations.)


8.4 The Fermat Equation

There are no such things as applied sciences, only applications of science.
from an address given on the inauguration of the Faculty of Science, University of Lille, France on December 7, 1854.

Louis Pasteur (1822–1895)
French chemist and bacteriologist

In this section, we look at FLT, and its related prime Fermat equation

$$x^p + y^p + z^p = 0, \tag{8.27}$$

solved for the case of $p = 3$ in Theorem 1.18 on page 41. It suffices to solve (8.27) in order to solve the general Fermat equation (1.44) on page 41. The following uses our techniques from Chapters 1 and 2, including factorization in prime cyclotomic fields $F = \mathbb{Q}(\zeta_p)$, where $\zeta_p$ is a primitive $p$-th root of unity for a prime $p > 2$ when $p \nmid h_{\mathcal{O}_F}$, in which case $p$ is called a regular prime. The proof is due to Kummer; see Biography 3.2 on page 124. Some of the following is adapted from [64].

Theorem 8.5 Kummer’s Proof of FLT for Regular Primes

If $p > 2$ is prime and $p \nmid h_{\mathcal{O}_F}$ for $F = \mathbb{Q}(\zeta_p)$, then (8.27) has no solutions with $p \nmid xyz \neq 0$.

Proof. Assume that (8.27) has a solution $xyz \neq 0$ for $x, y, z \in \mathbb{Z}$. Without loss of generality, we may assume that $x, y, z$ are pairwise relatively prime. Furthermore, we may write (8.27) as the ideal equation

$$\prod_{j=0}^{p-1} (x + \zeta_p^j y) = (z)^p. \tag{8.28}$$
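The ideal equation (8.28) comes from the elementwise factorization $x^p + y^p = \prod_{j=0}^{p-1}(x + \zeta_p^j y)$, valid for odd $p$. As a sanity check only, the sketch below evaluates this product in floating-point complex arithmetic with $\zeta_p = e^{2\pi i/p}$, along with the product $\prod_{j=1}^{p-1}(1 - \zeta_p^j)$, whose value $p$ is used later in the argument. The function names are assumptions made for this illustration, and the results are approximate because of rounding.

```python
import cmath

def cyclotomic_product(x, y, p):
    """Numerical value of the product of (x + zeta^j * y) for j = 0, ..., p-1."""
    zeta = cmath.exp(2j * cmath.pi / p)  # a primitive p-th root of unity
    prod = 1
    for j in range(p):
        prod *= x + zeta ** j * y
    return prod

def norm_of_one_minus_zeta(p):
    """Numerical value of the product of (1 - zeta^j) for j = 1, ..., p-1."""
    zeta = cmath.exp(2j * cmath.pi / p)
    prod = 1
    for j in range(1, p):
        prod *= 1 - zeta ** j
    return prod
```

For an odd prime `p`, `cyclotomic_product(x, y, p)` should agree with `x**p + y**p` up to rounding, and `norm_of_one_minus_zeta(p)` with `p`.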

Claim 8.5 (x + !jpy) and (x + !k

p y) are relatively prime for 0 # j '= k # p! 1.

Let $\mathfrak{P}$ be a prime $\mathfrak{O}_F$-ideal dividing both of the above ideals. Therefore, $\mathfrak{P}$ divides

\[ (x + \zeta_p^k y) - (x + \zeta_p^j y) = y\zeta_p^k (1 - \zeta_p^{j-k}). \]

By Exercise 8.26 on page 291, $\lambda = 1 - \zeta_p$ and $1 - \zeta_p^{j-k}$ are associates for $j \neq k$, and clearly $\zeta_p^k$ is a unit, so $\mathfrak{P} \mid (y\lambda)$. By primality, $\mathfrak{P} \mid (y)$ or $\mathfrak{P} \mid (\lambda)$. If $\mathfrak{P} \mid (y)$, then $\mathfrak{P} \mid (z)$ from (8.28). Since $\gcd(y, z) = 1$, there exist $u, v \in \mathbb{Z}$ such that $uy + vz = 1$. Since $y, z \in \mathfrak{P}$, then $1 \in \mathfrak{P}$, a contradiction. Hence, $\mathfrak{P} \mid (\lambda)$. By Theorem 2.3 on page 58 and Exercise 8.26, $(\lambda)$ is a prime $\mathfrak{O}_F$-ideal. Therefore, $\mathfrak{P} = (\lambda)$, so $(\lambda) \mid (z)$. By Exercise 2.29 on page 96, $N_F(\lambda) \mid N_F(z)$. However, by Exercise 8.27,

\[ N_F(z) = z^{p-1}, \]

so $p = N_F(\lambda) \mid z$, contradicting the hypothesis. This completes Claim 8.5.
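The proof of Claim 8.5 leans on the fact, left to Exercise 8.26, that $N_F(\lambda) = \prod_{j=1}^{p-1}(1 - \zeta_p^j) = p$. The following is only a floating-point sanity check of that identity for a few small primes, not a proof; the function name is ours and does not come from the text.

```python
import cmath

def norm_one_minus_zeta(p: int) -> complex:
    """Product of (1 - zeta_p^j) for j = 1, ..., p-1, in floating point."""
    zeta = cmath.exp(2j * cmath.pi / p)   # a primitive p-th root of unity
    product = 1 + 0j
    for j in range(1, p):
        product *= 1 - zeta ** j
    return product

for p in (3, 5, 7, 11, 13):
    value = norm_one_minus_zeta(p)
    # The real part should be close to p and the imaginary part close to 0.
    print(p, round(value.real, 9), abs(value.imag) < 1e-9)
```

Up to rounding error, the product agrees with $p$ in each case, as the norm formula predicts.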

By Claim 8.5 and Theorem 2.12 on page 77,

\[ (x + \zeta_p y) = I^p, \]

for some $\mathfrak{O}_F$-ideal $I$. Since $p \nmid h_F$, then by Exercise 8.19 on page 285, $I \sim 1$. Hence, there exists an $\alpha \in \mathfrak{O}_F$ such that

\[ x + \zeta_p y = u_1 \alpha^p, \tag{8.29} \]

where $u_1 \in U_F$. Our next task is to show that $u_1 \zeta_p^s \in \mathbb{R}$ for some $s \in \mathbb{Z}$. This first requires establishing the following.

Claim 8.6 $\mathfrak{O}_F = \mathbb{Z}[\zeta_p]$.

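Clearly $\mathbb{Z}[\zeta_p] \subseteq \mathfrak{O}_F$. If $\alpha \in \mathfrak{O}_F$, by Theorem 1.5 on page 10, there exist $q_j \in \mathbb{Q}$ for $j = 0, 1, \ldots, p - 2$ such that

\[ \alpha = \sum_{j=0}^{p-2} q_j \zeta_p^j. \tag{8.30} \]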

Now we show that $pq_j \in \mathbb{Z}$ for each such $j$. Let $T_F$ be as given in Definition 2.19 on page 91. Then $T_F(\zeta_p^k) = -1$ for any $k$ relatively prime to $p$ by Exercise 1.54 on page 46. Therefore, for any $k = 0, 1, \ldots, p - 2$,

\[ T_F(\alpha\zeta_p^{-k}) = \sum_{j=0}^{p-2} q_j T_F(\zeta_p^{j-k}) = -\sum_{j=0}^{k-1} q_j + (p - 1)q_k - \sum_{j=k+1}^{p-2} q_j = -\sum_{j=0}^{p-2} q_j + pq_k. \]

Hence,

\[ T_F(\alpha\zeta_p^{-k} - \alpha\zeta_p) = T_F(\alpha\zeta_p^{-k}) - T_F(\alpha\zeta_p) = -\sum_{j=0}^{p-2} q_j + pq_k + \sum_{j=0}^{p-2} q_j = pq_k, \]

for any such $k$. Since

\[ \alpha\zeta_p^{-k} - \alpha\zeta_p \in \mathfrak{O}_F, \]

then by Exercise 2.25 on page 96, $pq_k \in \mathbb{Z}$. Thus, from (8.30),

\[ p\alpha = \sum_{j=0}^{p-2} pq_j \zeta_p^j \]

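with $pq_j \in \mathbb{Z}$ for all such $j$. However, since $\zeta_p = 1 - \lambda$, then using the binomial theorem, we may write

\[ p\alpha = \sum_{j=0}^{p-2} z_j \lambda^j \tag{8.31} \]

with $z_j \in \mathbb{Z}$ for all such $j$. However, by Exercise 8.26, $\lambda \mid p$, since

\[ p = \prod_{j=1}^{p-1} (1 - \zeta_p^j), \]

so from (8.31), $\lambda \mid z_0$. However,

\[ p = N_F(\lambda) \mid N_F(z_0) = z_0^{p-1}, \]

so $p \mid z_0$ as well.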

Now, by Exercise 8.26, the $1 - \zeta_p^j$ are associates for $j = 1, 2, \ldots, p - 1$, so the following equation involving principal $\mathfrak{O}_F$-ideals holds,

\[ (\lambda)^{p-1} = \prod_{j=1}^{p-1} (1 - \zeta_p^j) = \Bigl( \prod_{j=1}^{p-1} [1 - \zeta_p^j] \Bigr) = (N_F(\lambda)) = (p), \tag{8.32} \]

where the last equality also holds by Exercise 8.26. Hence, this implies that $\lambda^{p-1} \mid z_0$. Now considering (8.31) modulo $\lambda^2$, we get that $\lambda^2 \mid z_1\lambda$, so $\lambda \mid z_1$, and as above $p \mid z_1$. Continuing in this fashion, we see that $p \mid z_j$ for $j = 0, 1, \ldots, p - 2$. Then dividing (8.31) by $p$ yields

\[ \alpha \in \mathbb{Z}[\lambda] = \mathbb{Z}[\zeta_p], \]

so $\mathfrak{O}_F \subseteq \mathbb{Z}[\zeta_p]$. We have shown that $\mathfrak{O}_F = \mathbb{Z}[\zeta_p]$, thereby securing Claim 8.6.

In the following, the reader is reminded of the notion of congruence modulo an ideal, explored in Exercises 8.32–8.39.

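Claim 8.7 If $z$ is a unit in $\mathbb{Z}[\zeta_p]$, then $z\zeta_p^s \in \mathbb{R}$ for some $s \in \mathbb{Z}$.

If $z$ is a unit in $\mathbb{Z}[\zeta_p]$, then so is its complex conjugate $\bar{z}$, and

\[ \rho = z/\bar{z} \in \mathbb{Z}[\zeta_p]. \tag{8.33} \]

By Exercise 8.27, and Theorem 2.19 on page 88, the only roots of unity in $\mathbb{Q}(\zeta_p)$ are $\pm\zeta_p^t$ for $t \in \mathbb{Z}$. Also, for any $F$-monomorphism $\sigma$,

\[ \sigma(\rho) = \sigma(z)/\sigma(\bar{z}) = \sigma(z)/\overline{\sigma(z)}, \]

so $|\sigma(\rho)| = 1$. By Exercises 2.23 on page 96 and 8.27 on page 291, $\mathbb{Q}(\zeta_p)$ can have only finitely many complex units, and $|\rho^k| = 1$ for all $k \in \mathbb{N}$, so $\rho^k = \rho^\ell$ for some $k < \ell$. Thus, $\rho^{\ell - k} = 1$, which implies that $\rho$ is a root of unity. Set

\[ \rho = \pm\zeta_p^t. \]

Since

\[ \zeta_p^j \equiv 1 \pmod{\lambda} \text{ for all } j, \tag{8.34} \]

then letting

\[ z = \sum_{j=0}^{p-2} a_j \zeta_p^j \]

and using the fact that $\sigma(\zeta_p) = \zeta_p^k$ for some $k$, we get that

\[ z \equiv \sigma(z) \pmod{\lambda}. \]

In particular, $z \equiv \bar{z} \pmod{\lambda}$. In the case that $\rho = -\zeta_p^t$, which implies that $z = -\zeta_p^t \bar{z}$ by (8.33), then by (8.34),

\[ z \equiv -\bar{z} \pmod{\lambda}, \]

which implies that $2z \equiv 0 \pmod{\lambda}$, an impossibility. Therefore,

\[ z = \zeta_p^t \bar{z} = \zeta_p^{-2s} \bar{z}, \]

where $-2s \equiv t \pmod{p}$. Hence,

\[ \zeta_p^s z = \zeta_p^{-s} \bar{z}, \]

which says that $z\zeta_p^s \in \mathbb{R}$, which is the claim.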

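Now returning to (8.29) on page 287, using Claim 8.7, there is a $k \in \mathbb{Z}$ and $w \in \mathbb{R} \cap U_F$, with

\[ x + \zeta_p y = w\zeta_p^k \alpha^p. \tag{8.35} \]

By Exercise 8.39 on page 293 there exists a $z_1 \in \mathbb{Z}$ such that

\[ \alpha \equiv z_1 \pmod{(\lambda)}. \]

By taking norms on the latter, we get

\[ \alpha^p - z_1^p = \prod_{j=0}^{p-1} (\alpha - \zeta_p^j z_1). \]

Since $\zeta_p \equiv 1 \pmod{(\lambda)}$, then for each $j = 0, 1, \ldots, p - 1$,

\[ \alpha - \zeta_p^j z_1 \equiv \alpha - z_1 \pmod{(\lambda)}. \]

Hence,

\[ \alpha^p \equiv z_1^p \pmod{(\lambda)^p}, \]

so (8.35) becomes

\[ x + \zeta_p y \equiv w z_1^p \zeta_p^k \pmod{(\lambda)^p}. \]

However, $(p) \mid (\lambda)^{p-1}$ by (8.32), so

\[ x + \zeta_p y \equiv w z_1^p \zeta_p^k \pmod{(p)}. \]

Since $\zeta_p^k$ is a unit, then

\[ \zeta_p^{-k}(x + \zeta_p y) \equiv w z_1^p \pmod{(p)}. \tag{8.36} \]

By taking complex conjugates in (8.36), we get

\[ \zeta_p^k (x + \zeta_p^{-1} y) \equiv w z_1^p \pmod{(p)}. \tag{8.37} \]

Subtracting (8.37) from (8.36), we get

\[ \zeta_p^{-k} x + \zeta_p^{1-k} y - \zeta_p^k x - \zeta_p^{k-1} y \equiv 0 \pmod{(p)}. \tag{8.38} \]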

Claim 8.8 $2k \equiv 1 \pmod{p}$.

If $p \mid k$, then $\zeta_p^k = 1$, so (8.38) becomes

\[ 0 \equiv y(\zeta_p - \zeta_p^{-1}) \equiv y\zeta_p^{-1}(\zeta_p^2 - 1) \equiv y\zeta_p^{-1}(\zeta_p - 1)(\zeta_p + 1) \equiv y\zeta_p^{-1}\lambda(\zeta_p + 1) \pmod{(p)}. \]

However, by setting $x = -1$ in

\[ \sum_{j=0}^{p-1} x^j = \prod_{j=1}^{p-1} (x - \zeta_p^j), \]

we get that $1 + \zeta_p \in U_F$, so

\[ y\lambda \equiv 0 \pmod{(p)}. \]

Also, from (8.32), and the fact that $p \ge 3$, we get that $\lambda \mid y$. By Exercise 2.29 again, $N_F(\lambda) \mid N_F(y)$, so we get that $p \mid y$, contradicting the hypothesis. Therefore, $k \not\equiv 0 \pmod{p}$. By (8.38) there exists an $\alpha_1 \in \mathfrak{O}_F$ such that

\[ \alpha_1 p = x\zeta_p^{-k} + y\zeta_p^{1-k} - x\zeta_p^k - y\zeta_p^{k-1}. \tag{8.39} \]

If $k \equiv 1 \pmod{p}$, then (8.38) becomes

\[ x(\zeta_p^{-1} - \zeta_p) \equiv 0 \pmod{(p)}. \]

In the same fashion as in the elimination of the case $k \equiv 0 \pmod{p}$, we get that $p \mid x$, contradicting the hypothesis. Since $k \not\equiv 0, 1 \pmod{p}$, then

\[ \alpha_1 = \frac{x}{p}\zeta_p^{-k} + \frac{y}{p}\zeta_p^{1-k} - \frac{x}{p}\zeta_p^k - \frac{y}{p}\zeta_p^{k-1}. \tag{8.40} \]

By Claim 8.6,

\[ \{1, \zeta_p, \ldots, \zeta_p^{p-2}\} \]

is a $\mathbb{Z}$-basis of $\mathfrak{O}_F$. Thus, if all exponents $-k$, $1 - k$, $k$, and $k - 1$ are incongruent modulo $p$, then $x/p \in \mathbb{Z}$, contradicting the hypothesis. Thus, two of the aforementioned exponents are congruent modulo $p$. The only possibility remaining after excluding $k \equiv 0, 1 \pmod{p}$ is

\[ 2k \equiv 1 \pmod{p}. \]


This establishes Claim 8.8.

Hence, (8.39) becomes

\[ \alpha_1 p \zeta_p^k = x + y\zeta_p - x\zeta_p^{2k} - y\zeta_p^{2k-1} = (x - y)\lambda. \]

By taking norms, we get $p \mid (x - y)$, namely $x \equiv y \pmod{p}$. Thus, by (8.27), $y \equiv z \pmod{p}$ as well. Therefore, since $p \nmid x$,

\[ 0 \equiv x^p + y^p + z^p \equiv 3x^p \pmod{p}. \]

Thus, $p = 3$, which was eliminated in Theorem 1.18, so we have completed the proof. $\Box$

Remark 8.6 The case where $p \nmid xyz$ for a regular prime is called Case I in FLT. Kummer conjectured that there exist infinitely many regular primes, but this problem remains open to this day. However, it is possible to show that there are infinitely many primes $p \mid h_{\mathfrak{O}_F}$ for $F = \mathbb{Q}(\zeta_p)$, called irregular primes; see [64, §3.6]. This is done using Bernoulli numbers and polynomials; see §5.1. For Kummer’s proof of FLT for regular primes $p \mid xyz$, called Case II for FLT, see [64, Theorem 4.124, p. 251].
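The link between regularity and Bernoulli numbers mentioned in Remark 8.6 can be made concrete. The sketch below assumes Kummer's criterion, which is a standard result but is not stated in this section: a prime $p \ge 5$ is irregular exactly when $p$ divides the numerator of some Bernoulli number $B_k$ with $k$ even and $2 \le k \le p - 3$. The function names are ours.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] from sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        partial = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-partial / (m + 1))
    return B

def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def irregular_primes(limit):
    """Primes p < limit dividing the numerator of B_k for some even k in [2, p - 3]."""
    B = bernoulli_numbers(limit)
    return [p for p in range(5, limit)
            if is_prime(p)
            and any(B[k].numerator % p == 0 for k in range(2, p - 2, 2))]

print(irregular_primes(100))
```

Under this criterion, every odd prime below 100 that the function does not list is regular, so Theorem 8.5 applies to it.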

In §8.5, we look at an equation related to the Fermat equation that has also been solved relatively recently, the Catalan equation, as well as the combined equations for the Fermat–Catalan conjecture and the impact of the ABC conjecture on the latter, which remains an open problem, as of course does the ABC conjecture.

Exercises

8.26. Prove that for a prime $p > 2$ and $F = \mathbb{Q}(\zeta_p)$, $N_F(1 - \zeta_p) = p$. Also, show that $1 - \zeta_p$ and $1 - \zeta_p^i$ are associates for $i = 1, 2, \ldots, p - 1$.

8.27. Prove that if $n \in \mathbb{N}$ with $n > 2$, then

\[ |\mathbb{Q}(\zeta_n) : \mathbb{Q}| = \phi(n). \]

(Hint: Use Exercise 2.24 on page 96 in conjunction with Theorem 1.7 and Definition 1.9 on page 11.)

For the following exercises, the reader should be familiar with the basics concerning “actions on rings” such as presented in [68, Appendix A, pp. 303–306].

8.28. Prove that if $R$ is a Dedekind domain and $I, J$ are $R$-ideals, then

\[ \frac{R}{I} \cong \frac{J}{IJ} \]

as additive groups.
(Hint: Use Exercises 2.10–2.11 on page 85, and employ the Fundamental Isomorphism Theorem for Rings which says: If $R$ and $S$ are commutative rings with identity and $\psi : R \mapsto S$ is a homomorphism of rings, then $R/\ker(\psi) \cong \operatorname{img}(\psi)$.)

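8.29. Let $\mathfrak{O}_F$ be the ring of integers of a number field $F$, $\mathfrak{P}$ a prime $\mathfrak{O}_F$-ideal, and $n \in \mathbb{N}$. Prove that

\[ \left| \frac{\mathfrak{O}_F}{\mathfrak{P}^n} \right| = \left| \frac{\mathfrak{O}_F}{\mathfrak{P}} \right|^n. \]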


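8.30. Let $R$ be a Dedekind domain, and let $I$ be an $R$-ideal with

\[ I = \prod_{j=1}^{r} \mathfrak{P}_j^{a_j}, \tag{8.41} \]

for distinct prime $R$-ideals $\mathfrak{P}_j$. Prove that

\[ \left| \frac{R}{I} \right| = \prod_{j=1}^{r} \left| \frac{R}{\mathfrak{P}_j} \right|^{a_j}. \]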

8.31. Let $R$ be a commutative ring with identity, and let $I$ be an $R$-ideal. Prove that the additive abelian group $R/I$ is a ring with identity, whose multiplication is given by $(a + I)(b + I) = ab + I$.

8.32. Let $F$ be a number field and $I$ a nonzero $\mathfrak{O}_F$-ideal. If $\alpha, \beta \in \mathfrak{O}_F$, we say that $\alpha$ and $\beta$ are congruent modulo $I$ if $\alpha - \beta \in I$, denoted by

\[ \alpha \equiv \beta \pmod{I}. \]

The set of those $\alpha \in \mathfrak{O}_F$ which are congruent to each other modulo $I$ is called a residue class modulo $I$. Prove that the number of residue classes is equal to the norm of $I$, defined by $N(I) = |\mathfrak{O}_F/I|$.
(Note that by Exercise 8.30, we know that $|\mathfrak{O}_F/I|$ is finite. Also, if $I$ is given by (8.41), then Exercise 8.30 tells us that

\[ N(I) = \prod_{j=1}^{r} \left| \frac{\mathfrak{O}_F}{\mathfrak{P}_j} \right|^{a_j}. \]

It follows that if $I, J$ are $R$-ideals, then $N(IJ) = N(I)N(J)$.)

The balance of the exercises are in reference to Exercise 8.32. The reader should recall the developments in Chapter 2 for the terminology used in what follows.


8.33. Let $R$ be a Dedekind domain. Prove that if $\gcd((\alpha), I) = 1$, then for any $\beta \in R$, there is a $\gamma \in R$, uniquely determined modulo $I$, such that $\alpha\gamma \equiv \beta \pmod{I}$. Furthermore, prove that this congruence is solvable for some $\gamma \in \mathfrak{O}_F$ if and only if $\gcd((\alpha), I) \mid (\beta)$.

8.34. In view of Exercise 8.33, two elements of $\mathfrak{O}_F$ that are congruent modulo $I$ have the same gcd with $I$. Hence, this is an invariant of the class, since it is a property of the whole residue class. We denote the number of residue classes relatively prime to $I$ by the symbol $\Phi(I)$. Let $I, J$ be relatively prime $\mathfrak{O}_F$-ideals. Prove that

\[ \Phi(I) = N(I) \prod_{\mathfrak{P} \mid I} \left( 1 - \frac{1}{N(\mathfrak{P})} \right), \]

where the product runs over all distinct prime divisors of $I$. Conclude that if $I, J$ are relatively prime $\mathfrak{O}_F$-ideals, then $\Phi(IJ) = \Phi(I)\Phi(J)$.

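8.35. Suppose that $I = \prod_{j=1}^{r} \mathfrak{P}_j^{a_j}$, where the $\mathfrak{P}_j$ are distinct prime $\mathfrak{O}_F$-ideals. Prove that

\[ \Phi(I) = N(I) \prod_{j=1}^{r} \left( 1 - \frac{1}{N(\mathfrak{P}_j)} \right). \]

Note that when $F = \mathbb{Q}$, then $\Phi$ is the ordinary Euler totient function $\phi$.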

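8.36. Let $\alpha_j \in \mathfrak{O}_F$ for $j = 1, \ldots, d$, and let $\mathfrak{P}$ be a prime $\mathfrak{O}_F$-ideal. Prove that the polynomial congruence

\[ f(x) = x^d + \alpha_1 x^{d-1} + \cdots + \alpha_{d-1} x + \alpha_d \equiv 0 \pmod{\mathfrak{P}} \]

has at most $d$ solutions $x \in \mathfrak{O}_F$ that are incongruent modulo $\mathfrak{P}$, or else $f(\alpha) \equiv 0 \pmod{\mathfrak{P}}$ for all $\alpha \in \mathfrak{O}_F$. (We also allow the case where $\deg(f) = 0$, in which case $f(x) = \alpha_0 \equiv 0 \pmod{\mathfrak{P}}$ means that $\alpha_0 \in \mathfrak{P}$.)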

8.37. Prove that the residue classes modulo $I$, relatively prime to $I$, form an abelian group under the multiplication given in Exercise 8.31 on the preceding page. Prove that this group has order $\Phi(I)$. In particular, show that if $I$ is a prime $\mathfrak{O}_F$-ideal, then the group is cyclic.

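8.38. Suppose that $I$ is a nonzero $\mathfrak{O}_F$-ideal and $\alpha \in \mathfrak{O}_F$ is relatively prime to $I$. Prove that

\[ \alpha^{\Phi(I)} \equiv 1 \pmod{I}, \]

called Euler’s Theorem for Ideals. Conclude that if $I = \mathfrak{P}$ is a prime $\mathfrak{O}_F$-ideal, then

\[ \alpha^{N(\mathfrak{P}) - 1} \equiv 1 \pmod{\mathfrak{P}}, \]

called Fermat’s Little Theorem for Ideals.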

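8.39. Let $\mathfrak{P}$ be a nonzero prime $\mathfrak{O}_F$-ideal, and let $\alpha \in \mathfrak{O}_F$. Prove that there exists a $z \in \mathbb{Z}$ such that

\[ \alpha \equiv z \pmod{\mathfrak{P}} \text{ if and only if } \alpha^p \equiv \alpha \pmod{\mathfrak{P}}, \text{ where } (p) = \mathfrak{P} \cap \mathbb{Z}. \]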


8.5 Catalan and the ABC Conjecture

The last thing one knows in constructing a work is what to put first.
translated from section 1, no. 19 of Pensées (1670), ed. L. Brunschvicg (1909)
Blaise Pascal (1623–1662), French mathematician, physicist, and moralist

In 1844, Charles Catalan conjectured that

\[ a^b - c^d = 1 \tag{8.42} \]

with all integers $a, b, c, d$ bigger than 1 has solutions for only $(a, b, c, d) = (3, 2, 2, 3)$. In an elementary course in number theory, one may look at this equation for special cases and solve it via congruence conditions and other such techniques; see [68, Biography 3.1, p. 144]. Indeed, in the Middle Ages, Hebraeus solved (8.42) for $(a, c) = (3, 2)$. In 1738, Euler solved it for $(b, d) = (2, 3)$, and in 1850, Lebesgue solved it for $d = 2$. Moving into the twentieth century, Nagell solved it in 1921 for $(b, d) = (3, 3)$, and C. Ko for the case $b = 2$ in 1967. In 1976, R. Tijdeman proved that (8.42) has solutions only for

\[ c^d < \exp(\exp(\exp(\exp(730)))), \]

a monster of a bound, but this shows that it can have solutions for only finitely many values. Not long after, M. Langevin proved that solutions to (8.42) must satisfy

\[ b, d < 10^{110}. \]

Then Mignotte improved this to

\[ \max\{b, d\} < 7.78 \cdot 10^{16}. \]

In the other direction, in 1997, Y. Roy and Mignotte proved that a lower bound on such solutions must satisfy

\[ \min\{b, d\} > 10^5. \]

It seemed, therefore, that the bounds were closing in. As with the Fermat equation, discussed in §8.4, a proof was eventually found. In 2002, Preda Mihailescu discovered a proof, which makes wide use of cyclotomic fields and Galois modules. In 2004, it was published in [58]. Thus, the Catalan conjecture is now known as Mihailescu’s theorem.
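For small values, the content of (8.42) can be checked directly: among perfect powers up to a modest bound, the only consecutive pair should be $8 = 2^3$ and $9 = 3^2$. The sketch below is a brute-force search, far below the bounds quoted above and in no way a substitute for the proof; the function names and the search bound are our own choices.

```python
def perfect_powers(limit):
    """Map each n = a**b <= limit (with a, b >= 2) to one representation (a, b)."""
    powers = {}
    a = 2
    while a * a <= limit:
        n, b = a * a, 2
        while n <= limit:
            powers.setdefault(n, (a, b))   # keep the representation with smallest base
            n *= a
            b += 1
        a += 1
    return powers

def catalan_pairs(limit):
    """Pairs (a**b, c**d) of perfect powers up to limit with a**b - c**d == 1."""
    powers = perfect_powers(limit)
    return sorted((n, n - 1) for n in powers if n - 1 in powers)

pairs = catalan_pairs(10**6)
powers = perfect_powers(10**6)
for upper, lower in pairs:
    print(upper, powers[upper], lower, powers[lower])
```

Within this range the search returns only the pair $(9, 8)$, in line with Mihailescu’s theorem.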

Now that both the Fermat equation and the Catalan equation have been resolved, we may look at a problem that combines them both, and is still open.


The Fermat–Catalan Conjecture

There are only finitely many powers $x^p, y^q, z^r$ satisfying

\[ x^p + y^q = z^r, \tag{8.43} \]

where $x, y, z \in \mathbb{N}$ are relatively prime, and $p, q, r \in \mathbb{N}$ with

\[ \frac{1}{p} + \frac{1}{q} + \frac{1}{r} < 1. \tag{8.44} \]

By the 1995 results of Darmon and Granville [20] it is known that for fixed $p, q, r$ with (8.44) satisfied, (8.43) has only finitely many solutions.

Although the Fermat–Catalan conjecture remains unresolved, there is a means of proving it under the assumption of yet another unresolved conjecture, a process that has become “fashionable” in the literature. In order to properly present these ideas, let us set the stage by looking at the very foundations of solving Diophantine equations from a historical perspective.

In 1900, Hilbert posed a list of 23 problems; see Biography 3.5 on page 127. Among them was the problem, which we would understand today as asking: Is there a comprehensive algorithm which can determine whether a given Diophantine polynomial equation (with integral coefficients) has a solution in integers? The very interpretation of this query and the resulting search for an answer ultimately was resolved in 1970 by Matiyasevich [55], who provided a rather definitive negative answer to what we now call Hilbert’s tenth problem. What this means for the modern mathematician is that we can never find an algorithm for the decision problem: Does a given Diophantine equation have a solution or not? However, this does not deter us from looking at certain classes of Diophantine equations, or as was done with the Catalan equation above, finding bounds on the number of solutions to determine whether or not such solutions exist.

Matiyasevich’s aforementioned proof is based upon the notion of Diophantine sets. Without getting embroiled in the definitions and technical aspects of this phenomenon, suffice it to say that in 1960, Putnam established that a set is Diophantine if and only if it coincides with the set of positive values of a suitable polynomial taken at nonnegative integers. Putting this together with the Matiyasevich result, we achieve that there exists a polynomial $f(x_1, x_2, \ldots, x_n)$ whose positive values at integers $n_j \ge 0$ are primes, and every prime is representable in this fashion. Indeed, in 1976, Jones, Sato, Wada, and Wiens [44] explicitly found a polynomial of degree 25 in 26 variables which produces all prime numbers. It also takes on negative values and a given prime may be repeated. Yet it is open as to what the minimal possible degree and minimal number of variables for such a polynomial happen to be. Moreover, and perhaps more striking, is the fact that the above implies that the set of prime numbers is Diophantine.


With the previous discussion in mind, it would be valuable to have a general methodology for solving Diophantine equations employing a theory that applies to some certain selected sets of Diophantine equations. There is a conjecture that, if proved, would apply to a wide variety of such equations, and is arguably one of the most important unsolved problems in number theory. It was first posed, independently, by David Masser and Joseph Oesterlé in 1985. In what follows, for any $n \in \mathbb{Z}$, $S(n)$ denotes the largest squarefree divisor of $n$, also known as the squarefree kernel of $n$, as well as the radical of $n$.

The ABC Conjecture

If $a, b, c$ are relatively prime integers which satisfy the equation

\[ a + b = c, \]

then for any $\kappa > 1$, with finitely many exceptions, we have that

\[ c < S(abc)^{\kappa}. \]
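To get a feel for the statement, one can list the coprime triples $a + b = c$ of positive integers up to a small bound that violate $c < S(abc)^{\kappa}$ for a chosen $\kappa > 1$. The sketch below does this by brute force; the names `radical_table` and `abc_exceptions`, the restriction to positive integers with $a \le b$, and the sample values of the bound and $\kappa$ are our assumptions. Floating-point powers are used, which is adequate at this scale.

```python
from math import gcd

def radical_table(limit):
    """rad[n] = S(n), the largest squarefree divisor of n, for 1 <= n <= limit."""
    rad = [1] * (limit + 1)
    for p in range(2, limit + 1):
        if rad[p] == 1:                      # no smaller prime divides p, so p is prime
            for m in range(p, limit + 1, p):
                rad[m] *= p
    return rad

def abc_exceptions(limit, kappa):
    """Coprime triples (a, b, c), a <= b, a + b = c <= limit, with c >= S(abc)**kappa."""
    rad = radical_table(limit)
    hits = []
    for c in range(3, limit + 1):
        for a in range(1, c // 2 + 1):
            b = c - a
            if gcd(a, b) == 1:
                s = rad[a] * rad[b] * rad[c]   # equals S(abc): a, b, c are pairwise coprime
                if c >= s ** kappa:
                    hits.append((a, b, c))
    return hits

print(abc_exceptions(200, 1.2))
```

For instance, $1 + 8 = 9$ has $S(abc) = 6$ and $3 + 125 = 128$ has $S(abc) = 30$, and both appear among the exceptions for $\kappa = 1.2$; the conjecture asserts that for each fixed $\kappa > 1$ the full list of exceptions is finite.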

To illustrate the power of this conjecture, if resolved affirmatively, the following list shows several results that would fall to the ABC conjecture.

# Consequences of the ABC Conjecture

The ABC conjecture implies each of (1)–(8):

(1) The Fermat–Catalan conjecture—see (8.43) on page 295.

(2) FLT–see (1.44) on page 41.

(3) The Thue–Siegel–Roth Theorem—see (4.2) on page 160.

(4) The Diophantine equation $y^m = x^n + k$ for $x, y, m, n, k \in \mathbb{Z}$ with $m > 1$ and $n > 1$ has only finitely many solutions. This is a generalization of Tijdeman’s theorem, which is the case $k = 1$.

(5) Hall’s conjecture, which says that if there are integer solutions $x, y$ to the Bachet equation

\[ y^2 = x^3 - k, \]

then for any $\varepsilon < 1/2$, there exists a constant $K(\varepsilon) > 0$ such that

\[ |x^3 - y^2| > K(\varepsilon)x^{\varepsilon}. \]

In other words, the nonzero difference in absolute value, $|x^3 - y^2|$, cannot be much less than $x^{1/2}$. This was posed by Marshall Hall in [37] in 1971 for any $k \neq 0$.


(6) The existence of infinitely many non-Wieferich primes, where a Wieferich prime $p$ is one that satisfies

\[ 2^{p-1} \equiv 1 \pmod{p^2}. \]

(7) The Erdős–Woods Conjecture, which says: There exists an integer $k$ such that, for $m, n \in \mathbb{N}$, the conditions

\[ S(m + j) = S(n + j) \text{ for } 0 \le j \le k - 1 \]

imply $m = n$. This arose from [24], where Erdős asked how many pairs of products of consecutive integers have the same prime factors.

(8) There are only finitely many triples of consecutive powerful numbers. (A powerful number $n \in \mathbb{N}$ satisfies $p^2 \mid n$ whenever a prime $p \mid n$.) The above is a weak form of the Erdős–Mollin–Walsh conjecture, which states that there are no consecutive triples of powerful numbers; see Granville [35], as well as Mollin–Walsh [71].

The ABC conjecture is equivalent to

(9) the Granville–Langevin conjecture, which says that if

\[ f(x, y) \in \mathbb{Z}[x, y] \]

is a square-free binary form of degree $n > 2$, then for every $\beta > 2$, there exists a constant $C(f, \beta) > 0$ such that

\[ S(f(x, y)) \ge C(f, \beta) \max\{|x|, |y|\}^{n - \beta} \text{ for every } x, y \in \mathbb{Z} \]

with $\gcd(x, y) = 1$, $f(x, y) \neq 0$.
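Item (6) in the list is easy to explore by machine. The following sketch searches for Wieferich primes below a bound using modular exponentiation; the function names and the bound are our own choices, and the search says nothing about how many non-Wieferich primes exist beyond the range examined.

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def wieferich_primes(limit):
    """Primes p < limit with 2**(p-1) congruent to 1 modulo p**2."""
    return [p for p in range(2, limit)
            if is_prime(p) and pow(2, p - 1, p * p) == 1]

print(wieferich_primes(4000))
```

Every other prime below the bound is a non-Wieferich prime, which illustrates how rare the Wieferich condition is in practice.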

The above list is by no means exhaustive, since numerous other results follow from, or are equivalent to, the ABC conjecture. However, we see that there is ample reason to believe that this is one of the most important outstanding problems in number theory. Now we are in a position to prove what we asserted earlier, namely number (1) on the above list.

Theorem 8.6 Fermat–Catalan Follows From ABC

The ABC conjecture implies the Fermat–Catalan conjecture.

Proof. Considering the left-hand side of (8.44) on page 295, we note that its largest possible value is attained for $(p, q, r) = (2, 3, 7)$, namely

\[ \frac{1}{2} + \frac{1}{3} + \frac{1}{7} = \frac{41}{42}, \]

so replacing $< 1$ by $\le 41/42$, and applying the ABC conjecture with $\kappa = 1.01$, observing that

\[ S(x^p y^q z^r) = S(xyz) \le xyz = (x^p)^{1/p} (y^q)^{1/q} (z^r)^{1/r} \le (z^r)^{1/p + 1/q + 1/r}, \]

we have, with finitely many possible exceptions,

\[ z^r < z^{\kappa(r/p + r/q + 1)}. \]

Hence,

\[ r < \kappa(r/p + r/q + 1), \]

which in turn implies

\[ 1 < \kappa(1/p + 1/q + 1/r). \]

However, $1/p + 1/q + 1/r \le 41/42$ and $\kappa = 1.01$, so

\[ 1 < \kappa(1/p + 1/q + 1/r) \le 1.01 \cdot \frac{41}{42} = 0.9859523809\cdots, \]

a contradiction. Hence, there can only be finitely many solutions to (8.43). $\Box$
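The two numerical facts used in the proof, that $41/42$ is the largest value of $1/p + 1/q + 1/r$ below 1 and that $1.01 \cdot 41/42 < 1$, can be checked with exact rational arithmetic. The sketch below searches only exponents up to a bound of our choosing, so it is a sanity check over a finite box and not a proof of maximality; the function name is ours.

```python
from fractions import Fraction

def largest_sum_below_one(bound):
    """Largest 1/p + 1/q + 1/r < 1 over 2 <= p <= q <= r <= bound, with its exponents."""
    best_value, best_triple = Fraction(0), None
    for p in range(2, bound + 1):
        for q in range(p, bound + 1):
            for r in range(q, bound + 1):
                s = Fraction(1, p) + Fraction(1, q) + Fraction(1, r)
                if best_value < s < 1:
                    best_value, best_triple = s, (p, q, r)
    return best_value, best_triple

value, triple = largest_sum_below_one(60)
print(value, triple)
print(Fraction(101, 100) * value < 1)   # kappa = 1.01 as an exact fraction
```

Exponents equal to 1 are omitted because they already make the sum exceed 1.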

The following are the only known examples of solutions to the Fermat–Catalan equation (8.43), the last five of which were discovered by F. Beukers and D. Zagier; see [19, pp. 382–383]:

\[
\begin{aligned}
1^p + 2^3 &= 3^2 \\
2^5 + 7^2 &= 3^4 \\
13^2 + 7^3 &= 2^9 \\
2^7 + 17^3 &= 71^2 \\
3^5 + 11^4 &= 122^2 \\
33^8 + 1549034^2 &= 15613^3 \\
1414^3 + 2213459^2 &= 65^7 \\
9262^3 + 15312283^2 &= 113^7 \\
17^7 + 76271^3 &= 21063928^2 \\
43^8 + 96222^3 &= 30042907^2
\end{aligned}
\]
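Since Python integers have arbitrary precision, the ten identities above can be verified exactly, together with the exponent condition (8.44) and coprimality. In the sketch below the tuple layout and the function name are ours, and for the first solution, which holds for every exponent on 1, the value $p = 7$ is an arbitrary illustrative choice satisfying (8.44).

```python
from fractions import Fraction
from math import gcd

# Each tuple (x, p, y, q, z, r) encodes x**p + y**q == z**r.
SOLUTIONS = [
    (1, 7, 2, 3, 3, 2),            # 1^p + 2^3 = 3^2, with p = 7 chosen for illustration
    (2, 5, 7, 2, 3, 4),
    (13, 2, 7, 3, 2, 9),
    (2, 7, 17, 3, 71, 2),
    (3, 5, 11, 4, 122, 2),
    (33, 8, 1549034, 2, 15613, 3),
    (1414, 3, 2213459, 2, 65, 7),
    (9262, 3, 15312283, 2, 113, 7),
    (17, 7, 76271, 3, 21063928, 2),
    (43, 8, 96222, 3, 30042907, 2),
]

def is_fermat_catalan_solution(x, p, y, q, z, r):
    equation_holds = x ** p + y ** q == z ** r
    exponents_ok = Fraction(1, p) + Fraction(1, q) + Fraction(1, r) < 1
    coprime = gcd(x, y) == 1       # with the equation, this forces pairwise coprimality
    return equation_holds and exponents_ok and coprime

for solution in SOLUTIONS:
    print(solution, is_fermat_catalan_solution(*solution))
```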

We now show how item (8) in the list on page 297 follows from ABC. The reader should solve Exercises 8.42–8.43 on page 300, which we will use in the following.


Theorem 8.7 ABC Implies Weak Erdős–Mollin–Walsh Conjecture

The ABC conjecture implies there exist only finitely many consecutive triples of powerful numbers.

Proof. By Exercises 8.42–8.43, if $(n - 1, n, n + 1)$ are powerful, then

\[ n = x^2 y^3 \]

is even and $n^2 - 1$ is powerful. Let

\[ a = 1, \quad b = n^2 - 1, \quad \text{and} \quad c = n^2 \]

in the ABC conjecture. Then since $c = a + b$,

\[ S(abc) \le \sqrt{bn} < n^{3/2}, \]

so for any $\kappa > 1$, with finitely many possible exceptions, we have

\[ n^2 = c < S(abc)^{\kappa} < n^{3\kappa/2}. \]

In particular, if $\kappa = 1.01$, then

\[ 1 < n^{0.485} = n^{2 - 3.03/2} < 1, \]

which is a contradiction. We have shown there are at most finitely many consecutive triples of powerful numbers. $\Box$
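As a small companion to Theorem 8.7, the sketch below lists runs of consecutive powerful numbers up to a bound by trial division. The function names and bounds are ours; the search merely illustrates that consecutive pairs occur while no triple shows up in the range examined, which is consistent with, but of course does not establish, the Erdős–Mollin–Walsh conjecture.

```python
def is_powerful(n):
    """True if p**2 divides n for every prime p dividing n (so 1 counts as powerful)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            exponent = 0
            while n % d == 0:
                n //= d
                exponent += 1
            if exponent < 2:
                return False
        d += 1
    return n == 1   # a leftover factor would be a prime occurring exactly once

def powerful_runs(limit, length):
    """All runs (n, n+1, ..., n+length-1) of powerful numbers with n+length-1 <= limit."""
    flags = [False] + [is_powerful(n) for n in range(1, limit + 1)]
    return [tuple(range(n, n + length))
            for n in range(1, limit - length + 2)
            if all(flags[n:n + length])]

print(powerful_runs(1000, 2))    # consecutive pairs
print(powerful_runs(10**4, 3))   # consecutive triples
```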

# Concluding comments

In 1994, Bombieri [9] proved that the ABC conjecture implies the Thue–Siegel–Roth Theorem, (3) in the list on page 297. A more far-reaching result was proved in 1999 by Frankenhuysen [26] that included not only Bombieri’s conclusion from ABC, but also Elkies’ [23] derivation of Mordell’s conjecture from ABC. In 1922, Mordell posed that any curve of genus bigger than 1 defined over a number field $F$ has only finitely many rational points in $F$. It is beyond the scope of this book to go into any depth on this topic. Suffice it to say that Elkies’ proof was based upon recasting the ABC conjecture in terms of a specified rational point in the one-dimensional projective line. Then the Mordell conjecture is boiled down to the ABC conjecture via the Riemann–Hurwitz formula, which describes the relationship of what is known as the Euler characteristic of two surfaces when one is a ramified covering of the other. For a nice explanation including terminology and methodology, see [42]. Of course, there is an unconditional proof of Mordell’s conjecture, given in 1983 by Faltings [25] using techniques from algebraic geometry, for which he won the Fields Medal in 1986. But Bombieri [10] provided an elementary proof in 1990, which the reader may also find presented in [42].

There are numerous other applications of the ABC conjecture upon which we have not touched, such as that proved by Granville and Stark in [36], which establishes that the ABC conjecture implies that there do not exist any Siegel zeros, also called Landau–Siegel zeros, of Dirichlet $L$-functions for characters of complex quadratic fields, where a Siegel zero is a potential counterexample to the Riemann hypothesis in that it is a value $s \in \mathbb{C}$ with $\Re(s) \neq 1/2$ such that $L(s, \chi) = 0$; see §7.2. There are also generalizations of the ABC conjecture to number fields, which were introduced by Vojta in [99]. However, we have covered a sufficient amount to demonstrate that the ABC conjecture is indeed one of the main open problems in number theory and may remain so well into the future.

Exercises

8.40. Prove that for sufficiently large $n \in \mathbb{N}$ the ABC conjecture implies FLT. In other words, there exists an $N \in \mathbb{N}$ such that

\[ x^n + y^n = z^n \]

has no nontrivial integer solutions for all $n > N$.

8.41. Prove that the ABC conjecture implies that the Erdős–Woods conjecture holds for $k = 3$, with finitely many possible exceptions. This is (7) of the list on page 297.

8.42. With reference to item (8) on the list on page 297, prove that the conjecture is equivalent to the following statement. There are only finitely many even powerful numbers $n$ such that

\[ n^2 - 1 \]

is also powerful (with $\gcd(n - 1, n + 1) = 1$).

8.43. With reference to Exercise 8.42, prove that $n \in \mathbb{N}$ is powerful if and only if

\[ n = x^2 y^3 \]

for some $x, y \in \mathbb{N}$.

8.44. Show that the ABC conjecture implies that the largest prime factor of $1 + x^2 y^3$ goes to infinity as $x + |y|$ goes to infinity.

8.45. Given any even $a \in \mathbb{N}$, prove that the ABC conjecture implies the existence of infinitely many $m \in \mathbb{N}$ such that

\[ a^{2m} - 1 \]

is not powerful.
(Hint: Use Exercise 8.42.)

Chapter 9

Elliptic Curves

Is it so bad, then, to be misunderstood? Pythagoras was misunderstood, and Socrates, and Jesus, and Luther, and Copernicus, and Galileo, and Newton, and every pure and wise spirit that ever took flesh. To be great is to be misunderstood.
from Self-Reliance in Essays (1841)
Ralph Waldo Emerson (1803–1882), American philosopher and poet

Although the history of elliptic curves is well over a century old and was initially developed in the context of classical analysis, these essentially algebraic constructs have found their way into other areas of mathematics in the modern day. Elliptic curves have had impact, at a deep level, on both applied mathematics, for instance in the area of cryptology, as well as in pure mathematics, such as in the proof of FLT. Indeed, a key ingredient in the resolution of Fermat’s equation, (1.44) on page 41, involved certain elliptic curves, which we will explore in §10.3. Moreover, as we shall see later in this chapter, elliptic curves are used in factoring algorithms, primality testing, as well as the discrete log problem, upon which certain elliptic curve ciphers base their security. In fact, elliptic curve methods are widely considered to be some of the most powerful and elegant tools available to the cryptographic community. To see the beauty, complexity, and power of this topic, we must begin with foundational material. Some of what follows is adapted from [64].

9.1 The Basics

In Chapter 8, we explored numerous applications of our methods, developed in earlier chapters, to a variety of Diophantine equations including the generalized Ramanujan–Nagell equation (8.5) in §8.2, Bachet’s equation (8.15) in §8.3, the Fermat equation in §8.4, as well as the Catalan and related equations in §8.5. In particular, Bachet’s Equation motivates the very definition of elliptic curves since it is a special case.

Definition 9.1 Elliptic Curves

Let $F$ be a field with $\operatorname{char}(F) \neq 2, 3$. If $a, b \in F$ are given such that

\[ 4a^3 + 27b^2 \neq 0 \]

in $F$, then the elliptic curve of

\[ y^2 = x^3 + ax + b \]

over $F$, denoted by $E(F)$, is the set of points $(x, y)$ with $x, y \in F$ such that the equation

\[ y^2 = x^3 + ax + b \tag{9.1} \]

holds in $F$ together with a point $\mathfrak{o}$, called the point at infinity. The value

\[ \Delta(E(F)) = -16(4a^3 + 27b^2) \]

is called the discriminant of the elliptic curve $E$. (Elliptic curves can also be defined for $\operatorname{char}(F) = 2, 3$ by an equation slightly different from (9.1), but we will not need those cases herein. We assume throughout that $\operatorname{char}(F) \neq 2, 3$.)
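Definition 9.1 translates directly into a few lines of code when $F = \mathbb{Q}$ and the coefficients are integers or fractions. In the sketch below the function names are ours, and representing the point at infinity by `None` is purely a convention of this illustration.

```python
from fractions import Fraction

def discriminant(a, b):
    """Discriminant -16(4a^3 + 27b^2) of y^2 = x^3 + a x + b."""
    return -16 * (4 * a ** 3 + 27 * b ** 2)

def is_elliptic(a, b):
    """True when 4a^3 + 27b^2 is nonzero, so that (9.1) defines an elliptic curve."""
    return discriminant(a, b) != 0

def is_on_curve(a, b, point):
    """True if point lies on y^2 = x^3 + a x + b; None stands for the point at infinity."""
    if point is None:
        return True
    x, y = point
    return y * y == x ** 3 + a * x + b

print(discriminant(3, 4), is_elliptic(3, 4))       # the curve y^2 = x^3 + 3x + 4
print(is_on_curve(3, 4, (Fraction(5), Fraction(12))))
print(is_elliptic(0, 0))                           # y^2 = x^3 fails the condition
```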

Remark 9.1 In order to understand the term point at infinity, we look at how projective geometry comes into play. Projective geometry studies the properties of geometric objects invariant under projection. For instance, projective 2-space over a field $F$, denoted by $\mathbb{P}^2(F)$, is the set

\[ \{(x, y, z) : x, y, z \in F\} - \{(0, 0, 0)\} \]

of all equivalence classes of projective points

\[ (tx, ty, tz) \sim (x, y, z) \]

for nonzero $t \in F$. So, if $z \neq 0$, then there exists a unique projective point in the class of $(x, y, z)$ of the form $(x, y, 1)$, namely $(x/z, y/z, 1)$. Thus, $\mathbb{P}^2(F)$ may be identified with all points $(x, y)$ of the ordinary, or affine, plane together with points for which $z = 0$. The latter are the points on the line at infinity, which one may regard as the horizon on the plane. With this definition, one sees that the point at infinity in Definition 9.1 is $(0, 1, 0)$ in $\mathbb{P}^2(F)$. This is the intersection of the $y$-axis with the line at infinity.

Remark 9.2 The historical significance of the very term “elliptic curve” is also worth exploring. The term elliptic curve is somewhat of a misnomer since elliptic curves are not ellipses. The term comes from the fact that elliptic curves made their initial appearance during attempts to calculate the arc length of an ellipse. The most appropriate name for elliptic curves comes from an area of mathematical inquiry called algebraic geometry. There they are classified as abelian varieties of dimension one. Furthermore, (9.1) is used rather than the seemingly more general

\[ Y^2 = X^3 + AX^2 + BX + C \]

since we may make the translation

\[ X \mapsto x - A/3 \]

to get (9.1) with

\[ a = B - A^2/3 \quad \text{and} \quad b = A^3/9 - AB/3 - A^3/27 + C. \]

Moreover, once the translation is made, we may find a root of

\[ x^3 + ax + b = 0 \]

from the formula:

\[ x = \sqrt[3]{-\frac{b}{2} + c} + \sqrt[3]{-\frac{b}{2} - c}, \]

where

\[ c = \sqrt{\frac{b^2}{4} + \frac{a^3}{27}}, \]

called Cardano’s Formula. Also see (10.26) on page 353 for another standard form of equations for elliptic curves.
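Cardano’s Formula is easy to try numerically. The sketch below works in real floating-point arithmetic and therefore assumes $b^2/4 + a^3/27 \ge 0$, the case in which the cubic has a single real root; the helper `real_cbrt` and the function name `cardano_root` are ours.

```python
import math

def real_cbrt(t):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)

def cardano_root(a, b):
    """A real root of x^3 + a x + b = 0 via Cardano's Formula (real-arithmetic case only)."""
    inside = b * b / 4 + a ** 3 / 27
    if inside < 0:
        raise ValueError("b^2/4 + a^3/27 < 0: this real-arithmetic sketch does not apply")
    c = math.sqrt(inside)
    return real_cbrt(-b / 2 + c) + real_cbrt(-b / 2 - c)

root = cardano_root(3, 4)            # x^3 + 3x + 4 = (x + 1)(x^2 - x + 4)
print(root, root ** 3 + 3 * root + 4)
```

For $x^3 + 3x + 4$ we have $c = \sqrt{5}$, and the two cube roots combine to the root $x = -1$, up to rounding.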

We now motivate the discussion of the group structure arising from elliptic curves by discussing some connections between elliptic curves and Diophantine equations that we studied earlier. As noted above, Bachet’s Equation is an example of an elliptic curve. However, there are other, not so obvious, ones such as the Fermat Equation

\[ x^3 + y^3 = z^3, \]

which is an elliptic curve after the transformations

\[ X = 12z/(x + y) \quad \text{and} \quad Y = 36(x - y)/(x + y), \]

which yield

\[ Y^2 = X^3 - 432, \tag{9.2} \]

having no rational solutions, except $X = 12$, $|Y| = 36$, by Exercise 9.1 on page 309, in view of Theorem 1.18 on page 41 (see also Exercise 9.2). Hence, in his proof of Theorem 1.18, Gauss was essentially dealing with points in $F = \mathbb{Q}(\sqrt{-3})$ on elliptic curves over $F$. In fact, it was through such connections that Andrew Wiles used elliptic curves to motivate his solution of FLT for the general case. Essentially, Wiles showed that the existence of a solution to the Fermat Equation (1.44) would imply the existence of an elliptic curve which would exhibit a special property called a modularity pattern. In 1990, Ken Ribet, whose work inspired Wiles, had already shown that such a curve cannot be modular, and FLT fell to the contradiction; see §10.4 for a more detailed explanation of the proof of FLT and the involvement of these contributors. Hence, we cannot have a greater motivator for looking at such curves than the felling of a centuries-old problem. But this is not the sole motivator since, as mentioned at the outset, there are modern-day cryptographic applications, which are one of the main topics of this chapter.
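The passage from $x^3 + y^3 = z^3$ to (9.2) rests on a polynomial identity that is not written out in the text: with $X$ and $Y$ as above, one finds $(Y^2 - X^3 + 432)(x + y)^3 = 1728\,(x^3 + y^3 - z^3)$, so $(X, Y)$ satisfies (9.2) exactly when $(x, y, z)$ satisfies the Fermat equation, provided $x + y \neq 0$. The sketch below spot-checks this identity, which we derived by direct expansion, using exact rational arithmetic; the function names are ours.

```python
from fractions import Fraction

def to_weierstrass(x, y, z):
    """Apply X = 12z/(x+y), Y = 36(x-y)/(x+y); requires x + y != 0."""
    s = Fraction(x + y)
    return 12 * Fraction(z) / s, 36 * Fraction(x - y) / s

def identity_gap(x, y, z):
    """(Y^2 - X^3 + 432)(x+y)^3 - 1728(x^3 + y^3 - z^3), expected to be 0."""
    X, Y = to_weierstrass(x, y, z)
    return (Y * Y - X ** 3 + 432) * (x + y) ** 3 - 1728 * (x ** 3 + y ** 3 - z ** 3)

print(to_weierstrass(1, 0, 1))            # the trivial solution maps to X = 12, Y = 36
print([identity_gap(x, y, z) for (x, y, z) in [(2, 1, 1), (5, -3, 7), (4, 9, -2)]])
```

The trivial solutions $(1, 0, 1)$ and $(0, 1, 1)$ of the Fermat equation land on the points with $X = 12$, $|Y| = 36$ mentioned above.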

Biography 9.1 Girolamo Cardano (1501–1576) was born in Pavia, Duchy of Milan, now Italy, on September 24, 1501. In his early years, Cardano assisted his father, who was a lawyer and lecturer of mathematics primarily at the Platti foundation in Milan. Then he entered his father’s alma mater, Pavia University, to study medicine. The university was closed when war erupted, so Cardano went to the University of Padua to continue his studies. Shortly after the death of his father, Cardano squandered his small inheritance, and became addicted to gambling, where his knowledge of probability served him well. However, the company he kept is told by the fact that he always carried a knife, and once slashed the face of an opponent over a question of cheating. Despite the time wasted in these endeavors, he achieved his doctorate in 1525. After a series of attempts at medical practice and gambling, Cardano obtained his father’s former post at the Platti foundation.

In 1541, Niccolo Tartaglia (ca. 1500–1557) gained fame for solving the cubic equation. However, he was not the first to do so. That honour goes to Scipione del Ferro (ca. 1465–1526), a name absent from many historical accounts of the matter. When Cardano learned of the solution, he invited Tartaglia to his home and extracted the solution from him after Cardano promised, under oath, not to disclose it. In 1543, Cardano learned of Ferro’s solution, and felt that he could therefore publish it despite his oath. In his book Ars Magna, published in 1545, he did that along with a solution of the quartic equation. The latter had been solved by Ludovico Ferrari (1522–1569).

Cardano became a respected professor at Bologna and Milan, and a prolific writer. He contributed to probability theory, hydrodynamics, mechanics, and geology. He died on September 21, 1576, ostensibly at his own hand, having correctly predicted the date of his demise some time earlier.

Example 9.1 Consider the elliptic curve

\[ y^2 = x^3 + 3x + 4. \]

By observation we see that $P = (-1, 0)$ and $Q = (0, 2)$ are points on the intersection of the curve with a line. Let us find the third. Since

\[ (2 - 0)/(0 - (-1)) = 2 \]

is the slope of the line through $P$ and $Q$, the equation of the line is $y = 2(x + 1)$. The combined graphs are given in Diagram 9.1. To find the third point of intersection with the curve, we put

\[ y = 2(x + 1) \tag{9.3} \]

into $y^2 = x^3 + 3x + 4$ to get

\[ 4(x + 1)^2 = x^3 + 3x + 4, \]

which simplifies to $x(x + 1)(x - 5) = 0$, so $x = 5$ and by plugging this into (9.3) we get $y = 12$.

Diagram 9.1: graphs of $y^2 = x^3 + 3x + 4$ and $y = 2(x + 1)$ (figure omitted; horizontal axis $x$, vertical axis $y$).

In Example 9.1 we used the geometry of the situation to find a third point from two given points. We observe that if we can indeed find two rational points on a curve, then the third point on the line through them must also be rational: once the root belonging to one of the known points is factored out, the $x$-coordinates of the other two points (intersecting a straight line, possibly repeated) are the roots of a quadratic equation with rational coefficients, one of whose roots is already rational. In Example 9.1 this quadratic is

$$x^2 - 4x - 5 = 0.$$

If we know only one rational point, then we cannot guarantee that the other two points on a line through that point, intersecting the curve, will be rational. For instance, if

$$y^2 = x^3 + x + 4, \tag{9.4}$$

then $(0, 2)$ is a point on the curve. However, if we take a line through this point with slope 1, say, then the equation of that line is

$$y = x + 2.$$

If we plug this into (9.4), we get

$$(x + 2)^2 = x^3 + x + 4,$$

which simplifies to $x(x^2 - x - 3) = 0$. By the quadratic formula, $x^2 - x - 3 = 0$ has the solutions $x = (1 \pm \sqrt{13})/2$, which are not rational. Thus, in our quest to find rational points on elliptic curves, we should choose a straight line that goes through two rational points on an elliptic curve, since then the third point is guaranteed to be rational by the quadratic formula. This process is illustrated by Example 9.1.
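The chord computation of Example 9.1 can be sketched in a few lines of Python with exact rational arithmetic. This is only an illustration: the function name `third_point` is ours, it assumes two points with different $x$-coordinates on $y^2 = x^3 + ax + b$, and it uses the fact that the three $x$-coordinates on the line sum to the square of the slope.

```python
from fractions import Fraction

def third_point(P, Q, a, b):
    """Third intersection of the line through P and Q with y^2 = x^3 + a*x + b.

    Assumes P and Q lie on the curve and have different x-coordinates
    (an assumption of this sketch, not a check it performs).
    """
    x1, y1 = map(Fraction, P)
    x2, y2 = map(Fraction, Q)
    m = (y2 - y1) / (x2 - x1)      # slope of the line through P and Q
    x3 = m * m - x1 - x2           # the three roots of the cubic sum to m^2
    y3 = m * (x3 - x1) + y1        # point on the line, not reflected
    assert y3 * y3 == x3**3 + a * x3 + b   # sanity check: it is on the curve
    return x3, y3

x3, y3 = third_point((-1, 0), (0, 2), 3, 4)   # the data of Example 9.1
print(x3, y3)                                  # prints: 5 12
```

Because the slope and the two known $x$-coordinates are rational, every quantity in the function stays rational, which is the point of the preceding discussion.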


Figure 9.1: $y^2 = x^3 - 4x$ (plotted against the $x$- and $y$-axes)

As seen earlier in (9.2) on page 303, there are elliptic curves with no nontrivial rational points, arising out of Diophantine problems. Figure 9.1 illustrates another elliptic curve with no nontrivial rational points, by Exercise 9.2 on page 309.

If one wishes to form a group out of the points of an elliptic curve, one must have a well-defined operation, such as addition. Let us look at adding two points $P$ and $Q$ on an elliptic curve $E(F)$. If $P \neq \mathfrak{o}$ and $P \neq \pm Q$, where $-Q$ is the reflection of $Q$ in the $x$-axis, then there must be a third point $R$ on $E(F)$, uniquely determined as the intersection point of $E(F)$ with the line through $P$ and $Q$. Note that $-Q$ is just the third point on the line joining $Q$ and $\mathfrak{o}$, namely the vertical line through $Q$. This means that if

$$Q = (x, y),$$

then

$$-Q = (x, -y).$$

Observe, as well, that if $P = (x, z)$, then necessarily $y = \pm z$, namely

$$P = \pm Q.$$

As discussed above, if we require that all points be rational, then the existence of two distinct rational points $P$ and $Q$ guarantees that the third point must be rational.

Now the issue is to define the meaning of $P + Q$. It is tempting to set $P + Q = R$. However, suppose that we do this, namely we define the sum of two distinct points $P$ and $Q$ on an elliptic curve $E$ to be the third point $R$ of intersection of $E$ with the line joining $P$ and $Q$. Suppose that this definition of addition leads to a group structure. Then in order to get $P + 0 = P$, where $0$ is the additive identity, the line through any point $P$ and $0$ must intersect the curve as a tangent at $P$. However, by definition, this means that $P + P = P$, since this is the only point of intersection. Hence, given the existence of additive inverses $-P$, we get $P = 0$ for all $P$. Hence, the assumption of two distinct points on the curve leads to a contradiction. Instead, we define $P + Q = -R$, the reflection of $R$ about the $x$-axis. The following figure illustrates this discussion.

Figure 9.2: Addition of distinct points on $y^2 = x^3 - 5x + 2$. Labelled in the plot: $P = (2, 0)$, $Q = (1/4, 7/8)$, $R = (-2, 2)$, $P + Q = -R = (-2, -2)$, the line $y = (2 - x)/2$ through $P$ and $Q$, and the vertical line $x = -2$.

On the other hand, if $P = Q \neq \mathfrak{o}$ and $P \neq -Q$, then we take the tangent line at $P$, which gives rise to a third point $R = (x_3, y_3)$, uniquely determined as the intersection point of $E(F)$ with the tangent line. Then the reflection about the $x$-axis gives us:

$$P + P = 2P = -R.$$

Thus, $2P$ is the reflection of the point $R$ about the $x$-axis, namely the other intersection $-R$ of the line $x = x_3$ with $E(F)$. Lastly, if $P = -Q$, then the line through $P$ and $-Q$ is vertical, so $\mathfrak{o}$ is the third point of intersection, in which case

$$P + Q = \mathfrak{o}.$$

In the above fashion, $E(F)$ becomes an additive abelian group with identity $\mathfrak{o}$. This is an easy exercise, except for proving the associativity, for which the reader may want to use some mathematical software package. The following illustrates the discussion for addition of nondistinct points $P = Q$, but $P \neq -Q$, namely $P$ is not on a vertical tangent line.

Figure 9.3: Addition of a point to itself on $y^2 = x^3 - 4x + 1$. Labelled in the plot: $P = (-1, 2)$, the tangent line $y = (7 - x)/4$ at $P$, $R = (33/16, 79/64)$, $2P = -R = (33/16, -79/64)$, and the vertical line $x = 33/16$.

The following definition, motivated by the preceding discussion, provides a summary by giving the addition of points in parametric form.

Definition 9.2 (Addition of Points on Elliptic Curves)

Let $E(F)$ be an elliptic curve with $\mathrm{char}(F) \neq 2, 3$. For any two points $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ on $E(F)$, define

$$P + Q = \begin{cases} \mathfrak{o} & \text{if } x_1 = x_2 \text{ and } y_1 = -y_2,\\ Q & \text{if } P = \mathfrak{o},\\ (x_3, y_3) & \text{otherwise,} \end{cases}$$

where

$$x_3 = m^2 - x_1 - x_2, \tag{9.5}$$

$$y_3 = m(x_1 - x_3) - y_1, \tag{9.6}$$

and

$$m = \begin{cases} (y_2 - y_1)/(x_2 - x_1) & \text{if } P \neq Q,\\ (3x_1^2 + a)/(2y_1) & \text{if } P = Q. \end{cases} \tag{9.7}$$
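Definition 9.2 translates almost line by line into code. The following is a minimal sketch over the rationals, assuming the curve is $y^2 = x^3 + ax + b$ and that both inputs lie on it; the name `ec_add` and the use of `None` for $\mathfrak{o}$ are our conventions, and the extra case $Q = \mathfrak{o}$ is added by symmetry.

```python
from fractions import Fraction

O = None  # stands for the point at infinity, written o in the text

def ec_add(P, Q, a):
    """P + Q on y^2 = x^3 + a*x + b over the rationals, following (9.5)-(9.7).

    The coefficient b is not needed by the formulas; the inputs are assumed
    to lie on the curve (this sketch does not verify that).
    """
    if P is O:
        return Q
    if Q is O:                      # symmetric case, added for convenience
        return P
    x1, y1 = map(Fraction, P)
    x2, y2 = map(Fraction, Q)
    if x1 == x2 and y1 == -y2:      # vertical line: the sum is o
        return O
    if (x1, y1) == (x2, y2):
        m = (3 * x1 * x1 + a) / (2 * y1)    # tangent slope, case P = Q of (9.7)
    else:
        m = (y2 - y1) / (x2 - x1)           # chord slope, case P != Q of (9.7)
    x3 = m * m - x1 - x2                    # (9.5)
    y3 = m * (x1 - x3) - y1                 # (9.6)
    return (x3, y3)

# Figure 9.2: P + Q on y^2 = x^3 - 5x + 2
print(ec_add((2, 0), (Fraction(1, 4), Fraction(7, 8)), -5))
# Figure 9.3: 2P on y^2 = x^3 - 4x + 1
print(ec_add((-1, 2), (-1, 2), -4))
```

The two calls give $(-2, -2)$ and $(33/16, -79/64)$, matching the labels in Figures 9.2 and 9.3.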

The preamble to Definition 9.2 provided a motivation for that definition in geometric terms. Now we have an algebraic explanation to supplement the geometry. Let $E(F)$ be given by

$$y^2 = x^3 + ax + b. \tag{9.8}$$

If $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ are on $E(F)$ with $x_1 \neq x_2$, so $P \neq \pm Q$, then $-(P + Q)$ is the third point of intersection, $R = (x_3, -y_3)$, of $E(F)$ with the line joining $P$ and $Q$. This line has slope $m = (y_1 - y_2)/(x_1 - x_2)$, which is (9.7) for the case $P \neq Q$. Its equation may be written as $y = m(x - x_1) + y_1$, and plugged into (9.8) to get:

$$m^2(x - x_1)^2 + 2m(x - x_1)y_1 + y_1^2 = x^3 + ax + b,$$

which simplifies to

$$x^3 - m^2x^2 + Ax + B = 0, \tag{9.9}$$

where $A = a + 2m^2x_1 - 2my_1$ and $B = b - y_1^2 + 2mx_1y_1 - m^2x_1^2$. However, since $x_1$, $x_2$, $x_3$ are the three roots of (9.9), Exercise 2.25 on page 96 gives $m^2 = x_1 + x_2 + x_3$, or by rewriting,

$$x_3 = m^2 - x_1 - x_2,$$

which is (9.5). Thus $P + Q = (x_3, y_3)$, where

$$y_3 = m(x_1 - x_3) - y_1,$$

which is (9.6). If $P = Q = (x_1, y_1)$ and $P \neq -Q$, namely $y_1 \neq 0$, then the slope of the tangent at $P$ is given by $2yy' = 3x^2 + a$, namely by

$$m = \frac{3x_1^2 + a}{2y_1},$$

which is the case (9.7) for $P = Q$. Lastly, if $P = -Q$, then the line through $P$ and $-Q$ is vertical, so the third point of intersection is $\mathfrak{o}$, as noted above, and $P + Q = -Q + Q = \mathfrak{o}$.
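As a quick check of the algebra against the geometry, the formulas can be applied to the points of Example 9.1. The worked values below are our own arithmetic, for $P = (-1, 0)$ and $Q = (0, 2)$ on $y^2 = x^3 + 3x + 4$:

```latex
\begin{aligned}
m   &= \frac{y_2 - y_1}{x_2 - x_1} = \frac{2 - 0}{0 - (-1)} = 2, \\
x_3 &= m^2 - x_1 - x_2 = 4 - (-1) - 0 = 5, \\
y_3 &= m(x_1 - x_3) - y_1 = 2(-1 - 5) - 0 = -12 .
\end{aligned}
```

So $P + Q = (5, -12)$, the reflection about the $x$-axis of the third intersection point $(5, 12)$ found in Example 9.1.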

Remark 9.3 All of the above can be summarized in a single equation that covers all cases, including the possibility that $P = \mathfrak{o}$ and the possibility that the points are nondistinct. It is that if $P, Q, R$ are three collinear points (all in the same straight line) on $E(F)$, then

$$P + Q + R = \mathfrak{o}.$$

Exercises

9.1. Prove that $x^3 + y^3 = z^3$ has solutions $x, y, z \in \mathbb{Z}$ with $xyz \neq 0$ if and only if $Y^2 = X^3 - 432$ has solutions $X, Y \in \mathbb{Q}$ with $|Y| \neq 36$.

9.2. Prove that $Y^2 = X^3 - 4X$ has nonzero solutions $X, Y \in \mathbb{Q}$ if and only if $x^4 + y^4 = z^2$ has nonzero solutions $x, y, z \in \mathbb{Z}$.


9.2 Mazur, Siegel, and Reduction

Mathematics, the non-empirical science par excellence ... the science of sciences, delivering the key to those laws of nature and the universe which are concealed by appearances.

from contributions to The New Yorker
Hannah Arendt (1906–1975)
German-born American political philosopher

The principal thrust of this section is the presentation of the celebrated results by Mazur on torsion points, of Siegel on the finiteness of integer points on elliptic curves, and Mordell’s result on elliptic curves over $\mathbb{Q}$ being finitely generated. First, we need to define some terms.

If we consider rational points on an elliptic curve $E(\mathbb{Q})$, then they are classified into two types as follows, with Definition 9.2 on page 308 in mind.

Definition 9.3 Torsion Points on Elliptic Curves

If $E(\mathbb{Q})$ is an elliptic curve over $\mathbb{Q}$, and $P$ is a point on $E(\mathbb{Q})$ such that

$$nP = \underbrace{P + P + \cdots + P}_{n \text{ summands}} = \mathfrak{o}$$

for some $n \in \mathbb{N}$, then $P$ is called a torsion point or a point of finite order. The smallest such value of $n$ is called the order of $P$. We call $\mathfrak{o}$ the trivial torsion point. If $P$ is not a torsion point, then $P$ is said to be a point of infinite order.

Remark 9.4 In 1922, Mordell proved that if $E(\mathbb{Q})$ is an elliptic curve over $\mathbb{Q}$, then $E(\mathbb{Q})$ is finitely generated (see Biography 9.2 on page 315). This remarkable result had been assumed without proof by Poincaré in 1901 (see Biography 3.8 on page 147). Essentially this result says that the points of infinite order can be represented as an integral linear combination of some finite set of points $\{P_j\}_{j=1}^{n}$ on $E(\mathbb{Q})$. The value $n$ is called the rank of $E(\mathbb{Q})$. The study of the rank of elliptic curves is one of the most active research areas in modern mathematics. In 1928, Weil generalized the Mordell result to elliptic curves $E(F)$, where $F$ is an arbitrary number field (see Biography 9.3 on page 316). Thus, today we call the generalized result the Mordell–Weil Theorem. For a proof of this celebrated result see [88, Theorem 6.7, p. 220]. There are many deep results such as this, which we will state without proof in this section in order to give the reader some flavour of the richness of the subject. There is a vast literature on the subject for the interested reader to pursue.

Example 9.2 Let $E(\mathbb{Q})$ be defined by

$$y^2 = x^3 + 1,$$

illustrated in Figure 9.4 on the next page. Consider the point $P = (2, 3)$. By Definition 9.2, we calculate that

$$2P = (0, 1), \quad 3P = (-1, 0), \quad 4P = (0, -1), \quad 5P = (2, -3), \quad\text{and}\quad 6P = \mathfrak{o}.$$

These points are illustrated in Figure 9.4. Notice that we begin with the tangent line $T$ at $P$, which intersects the curve at $(0, -1)$, so $2P = (0, 1)$, the reflection of $(0, -1)$ about the $x$-axis. Then the line $L$ through $P$ and $(0, 1)$ intersects the curve at $(-1, 0)$, which is $3P$ since it is its own reflection in the $x$-axis. The line through $P$ and $3P$ is again $L$, whose remaining intersection with $E(\mathbb{Q})$ is $(0, 1)$, so $4P = (0, -1)$, the reflection of $(0, 1)$ about the $x$-axis. Since $(0, -1)$ is on $T$, the remaining intersection of $T$ with $E(\mathbb{Q})$ is $(2, 3)$, so $5P = (2, -3)$, again the reflection of $(2, 3)$ about the $x$-axis. Since $P$ and $5P$ lie on the vertical line $V$, then $6P = \mathfrak{o}$. Thus, $P$ is a torsion point of order 6.
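The multiples listed in Example 9.2 can be reproduced mechanically. The sketch below is illustrative only: `add` restates Definition 9.2 compactly for points with rational coordinates, `None` stands for $\mathfrak{o}$, and `multiples` (our name) simply keeps adding $P$ until $\mathfrak{o}$ appears or a cut-off is reached.

```python
from fractions import Fraction

def add(P, Q, a):
    """Group law of Definition 9.2 over the rationals; None plays the role of o."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return None
    if P == Q:
        m = Fraction(3 * x1 * x1 + a, 2 * y1)   # tangent slope
    else:
        m = Fraction(y2 - y1, x2 - x1)          # chord slope
    x3 = m * m - x1 - x2
    return (x3, m * (x1 - x3) - y1)

def multiples(P, a, limit=12):
    """[P, 2P, 3P, ...], stopping at o (None) or after `limit` entries."""
    out, R = [P], P
    while R is not None and len(out) < limit:
        R = add(R, P, a)
        out.append(R)
    return out

pts = multiples((2, 3), 0)      # y^2 = x^3 + 1 has a = 0
print(len(pts))                 # prints: 6, so 6P = o and P has order 6
```

The cut-off `limit` is needed because a point of infinite order never returns to $\mathfrak{o}$.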

Figure 9.4: Multiples of Torsion Points on $y^2 = x^3 + 1$. Labelled in the plot: $P = (2, 3)$, $2P = (0, 1)$, $3P = (-1, 0)$, $4P = (0, -1)$, $5P = (2, -3)$, $6P$ = point at infinity, the tangent line $T$, the line $L$, and the vertical line $V$.

Example 9.2 illustrates a microcosm of a fact that is contained in the Mordell–Weil Theorem described in Remark 9.4 on the facing page, namely that every rational point on $E(\mathbb{Q})$ can be obtained from a finite set of points by repeatedly taking lines through pairs of them, intersecting with $E(\mathbb{Q})$, and reflecting about the $x$-axis to create new points. The torsion points, such as those given in Example 9.2, form a finite subgroup $E(\mathbb{Q})_t \subseteq E(\mathbb{Q})$, called the torsion subgroup. Thus, by the method illustrated in Example 9.2, we have an effective method for computing $E(\mathbb{Q})_t$. A pertinent result proved by Lutz and Nagell in the mid 1930s is given in the following.

Theorem 9.1 Nagell–Lutz Theorem

If $P = (x_1, y_1) \in E(\mathbb{Q})_t$, where $E(\mathbb{Q})$ is given by $y^2 = x^3 + ax + b$, $a, b \in \mathbb{Z}$, then $x_1, y_1 \in \mathbb{Z}$ and either $y_1 = 0$ or $y_1^2 \mid (4a^3 + 27b^2)$.

Proof. See [88]. □

Theorem 9.1 says that all elements of $E(\mathbb{Q})_t$ must have rational integer coordinates, called integer points, where the square of the ordinate ($y$-value), when nonzero, divides $4a^3 + 27b^2$, the discriminant of $E(\mathbb{Q})$ up to the factor $-16$. Thus, the Nagell–Lutz Theorem determines all integer points $P$ such that $2P$ is also an integer point. (Nagell was the first to prove the result. Then Lutz later refined the proof.) Thus, we may conclude that if a multiple of an integer point is not an integer point, then that point has infinite order. For instance, in Figure 9.3 on page 308 the integer point $(-1, 2)$ must be of infinite order since

$$2P = (33/16, -79/64).$$
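The Nagell–Lutz conditions give a finite search. The following sketch, under the assumption that $a$ and $b$ are small enough for brute force, lists every integer point on $y^2 = x^3 + ax + b$ with $y = 0$ or $y^2 \mid (4a^3 + 27b^2)$. The function name is ours, and the output is only a list of candidates: the theorem gives a necessary condition, so each candidate still has to be checked for finite order.

```python
from math import isqrt

def nagell_lutz_candidates(a, b):
    """Integer points (x, y) on y^2 = x^3 + a*x + b with y = 0 or y^2 | 4a^3 + 27b^2."""
    D = abs(4 * a**3 + 27 * b**2)
    if D == 0:
        raise ValueError("singular curve")
    ys = [0] + [y for y in range(1, isqrt(D) + 1) if D % (y * y) == 0]
    found = []
    for y in ys:
        c = b - y * y
        bound = abs(a) + abs(c) + 1          # any integer root x has |x| <= bound
        for x in range(-bound, bound + 1):
            if x**3 + a * x + c == 0:
                found.append((x, y))
                if y:
                    found.append((x, -y))
    return sorted(found)

print(nagell_lutz_candidates(0, 1))    # y^2 = x^3 + 1, the curve of Example 9.2
print(nagell_lutz_candidates(0, -2))   # y^2 = x^3 - 2
```

For $y^2 = x^3 + 1$ the candidates are exactly the five affine points met in Example 9.2, and for $y^2 = x^3 - 2$ the list is empty. On $y^2 = x^3 - 4x + 1$ the point $(-1, 2)$ is not a candidate, in line with the remark above that it has infinite order.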

The deeper problem of actually determining the cardinality $|E(\mathbb{Q})_t|$ as $E(\mathbb{Q})$ varies over all elliptic curves over $\mathbb{Q}$ was solved by B. Mazur, who proved in 1976:

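Theorem 9.2 Mazur’s Theorem

If $E(\mathbb{Q})$ is an elliptic curve over $\mathbb{Q}$, then either

$$E(\mathbb{Q})_t \cong \mathbb{Z}/n\mathbb{Z},$$

for some $n \in \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12\}$, or

$$E(\mathbb{Q})_t \cong \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2n\mathbb{Z},$$

where $n \in \{1, 2, 3, 4\}$.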

Thus, the torsion group cannot have order bigger than 16 for elliptic curves over $\mathbb{Q}$. For instance, by Exercise 9.3 on page 315, $E(\mathbb{Q})_t$ for the elliptic curve

$$y^2 = x^3 + 1$$

in Example 9.2 is made up of

$$(2, \pm 3), \quad (0, \pm 1), \quad (-1, 0), \quad\text{and}\quad \mathfrak{o},$$

so $|E(\mathbb{Q})_t| = 6$ in that case. Figure 9.1 on page 306 provides an instance where $|E(\mathbb{Q})_t| = 2$ since $(0, 0)$ is the only nontrivial torsion point and it has order 2. An example of the case where $n = 1$ is given by the elliptic curve $E(\mathbb{Q})$ given by

$$y^2 = x^3 - 2$$

since $\mathfrak{o}$ is the only torsion point by Exercise 9.4. The problem of determining $|E(F)_t|$, as $E(F)$ varies over all elliptic curves for an arbitrary number field $F$, remains open. However, in 1996 L. Merel [57] proved what is called the strong uniform boundedness conjecture (UBC) of Mazur and Kamienny, namely that for an elliptic curve $E(F)$ over $F$, $|E(F)_t| \leq B_F$, where $B_F$ is a constant depending only on $|F : \mathbb{Q}|$. For instance, Mazur’s Theorem tells us that

$$|E(\mathbb{Q})_t| \leq B_{\mathbb{Q}} = 16.$$


We have seen that the number of torsion points is finite, in fact quite small for a given elliptic curve by Mazur’s Theorem. However, we have also seen instances where an integer point is not a torsion point. Thus the question naturally arises: Are there infinitely many integer points on a given elliptic curve? In 1926, C.L. Siegel solved the problem by proving the following (see Biography 4.4 on page 170).

Theorem 9.3 Siegel’s Theorem

The equation $y^2 = x^3 + ax + b$, with $a, b \in \mathbb{Z}$ and $4a^3 + 27b^2 \neq 0$, has only finitely many solutions $x, y \in \mathbb{Z}$.

Remark 9.5 The nonvanishing condition on the discriminant in the hypothesis of Theorem 9.3 is necessary since, for instance, $y^2 = x^3$ has infinitely many integer solutions, which may be seen by letting $n \in \mathbb{N}$ and setting $y = n^3$, $x = n^2$.

Now that we have some basic knowledge of elliptic curves, we may turn our attention to elliptic curves over finite fields, since this is the gateway to the applications of elliptic curves to factoring and primality testing. To do this, the canonical approach is to begin with an elliptic curve $E(\mathbb{Q})$ over $\mathbb{Q}$ and reduce it modulo a prime $p$. To understand how this is done, we must first make precise what we mean by reduction of rational points.

Definition 9.4 Reduction of Rationals on Elliptic Curves

Let $n \in \mathbb{N}$ and $x_1, x_2 \in \mathbb{Q}$ with denominators prime to $n$. Then

$$x_1 \equiv x_2 \pmod{n} \text{ means } x_1 - x_2 = a/b \text{ where } \gcd(a, b) = 1,\ a, b \in \mathbb{Z}, \text{ and } n \mid a.$$

For any $x = c/d \in \mathbb{Q}$ with $\gcd(d, n) = 1 = \gcd(c, d)$, there exists a unique $r \in \mathbb{Z}$, with $0 \leq r \leq n - 1$, such that $x \equiv r \pmod{n}$, denoted by

$$r = x \pmod{n}.$$

Note that we may take

$$r \equiv cd^{-1} \pmod{n},$$

where $d^{-1}$ is the unique multiplicative inverse of $d$ modulo $n$. Hence, if $P = (x, y)$ is a point on an elliptic curve $E = E(\mathbb{Q})$ over $\mathbb{Q}$, with denominators of $x$ and $y$ prime to $n$, then

$$P \pmod{n} \text{ means } (x \pmod{n},\ y \pmod{n}).$$

Also, $E \pmod{n}$ denotes the curve reduced modulo $n$, namely the curve defined by

$$y^2 = x^3 + (a \pmod{n})\,x + (b \pmod{n}),$$

with

$$x = x \pmod{n}, \quad\text{and}\quad y = y \pmod{n}.$$

The cardinality of the set $E \pmod{n}$ is denoted by

$$|E \pmod{n}|.$$

It turns out that $E \pmod{n}$ in Definition 9.4 may not be a group, since certain elements may not be invertible. However, we may still use it for practical computational purposes, as illustrated below.

Example 9.3 If $x = 5/4$ and $n = 7$, then $x \pmod{7} = 3 = r$ is the unique integer (least positive residue) modulo 7 such that $x \equiv r \pmod{7}$, since $5/4 - 3 = -7/4$. Note that $5/4 \equiv 5 \cdot 4^{-1} \equiv 5 \cdot 2 \equiv 3 \pmod{7}$.
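The reduction of Definition 9.4 is a one-line computation once a modular inverse is available. A minimal sketch, assuming Python 3.8 or later (where `pow(d, -1, n)` returns the inverse of `d` modulo `n`); the helper names are ours:

```python
from fractions import Fraction
from math import gcd

def reduce_mod(x, n):
    """Least nonnegative r with x ≡ r (mod n), for rational x = c/d with gcd(d, n) = 1."""
    x = Fraction(x)                       # Fraction stores c/d in lowest terms
    c, d = x.numerator, x.denominator
    if gcd(d, n) != 1:
        raise ValueError("denominator is not prime to n")
    return c * pow(d, -1, n) % n          # r ≡ c * d^(-1) (mod n)

def reduce_point(P, n):
    """P (mod n) for an affine point P = (x, y)."""
    return tuple(reduce_mod(t, n) for t in P)

print(reduce_mod(Fraction(5, 4), 7))      # prints: 3, as in Example 9.3
print(reduce_point((Fraction(33, 16), Fraction(-79, 64)), 7))   # prints: (6, 5)
```

The second call reduces the point $2P$ of Figure 9.3 modulo 7; the result $(6, 5)$ satisfies $y^2 \equiv x^3 - 4x + 1 \pmod{7}$, as it should.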

The following result tells us how to add and reduce points on rational elliptic curves, and will be the chief tool in the description of the elliptic curve factoring method in §9.3.

Theorem 9.4 (Addition and Reduction of Points on Elliptic Curves)

Let $n \in \mathbb{N}$, $\gcd(6, n) = 1$, and let $E = E(\mathbb{Q})$ be an elliptic curve over $\mathbb{Q}$ with equation

$$y^2 = x^3 + ax + b, \quad a, b \in \mathbb{Z},$$

and

$$\gcd(4a^3 + 27b^2, n) = 1.$$

Let $P_1, P_2$ be points on $E$ where

$$P_1 + P_2 \neq \mathfrak{o},$$

and the denominators of $P_1, P_2$ are prime to $n$. Then $P_1 + P_2$ is on $E$ with coordinates having denominators prime to $n$ if and only if there does not exist a prime $p \mid n$ such that

$$P_1 \pmod{p} + P_2 \pmod{p} = \mathfrak{o} \pmod{p}$$

on the elliptic curve $E \pmod{p}$ over $\mathbb{F}_p$, with equation

$$y^2 = x^3 + (a \pmod{p})\,x + (b \pmod{p}).$$

Proof. See [47, Proposition VI.3.1, pp. 172–174]. □


Biography 9.2 Louis Joel Mordell (1888–1972) was born in Philadelphia, Pennsylvania on January 28, 1888. He was educated at Cambridge, and lectured at Manchester College of Technology from 1920 to 1922. In 1922, he went to Manchester University where he remained until 1945 when he held the Sadleirian Chair at the College of St. John’s in Cambridge. The topic for his inaugural lecture to the chair was the equation $y^2 = x^3 + k$. Although he retired from the chair in 1953, his mathematical output remained high. Indeed, roughly half of his 270 publications were published after he left the chair. In 1971, he was still traveling and lecturing, including an extensive tour of Asia after he attended a number theory conference in Moscow. Yet, he fell ill a few months later and died in Cambridge on March 12, 1972.

Among his honours were being elected as a member of the Royal Society in 1924, winning the De Morgan Medal in 1941, being president of the London Mathematical Society from 1943 to 1945, and winning the Sylvester Medal in 1949.

Exercises

9.3. Prove that the torsion points computed in Example 9.2 on page 310 are all of the points in $E(\mathbb{Q})_t$. (Hint: Use the Nagell–Lutz Theorem.)

9.4. Prove that there are no nontrivial torsion points on the elliptic curve $E(\mathbb{Q})$ given by $y^2 = x^3 - 2$. (Hint: Look at Theorem 1.19 on page 47, and use the Nagell–Lutz Theorem.)

9.5. Suppose that the equation defining an elliptic curve $E(\mathbb{F}_{p^k})$ over $\mathbb{F}_{p^k}$, $p$ a prime, is

$$y^2 = x^3 + ax + b, \quad a, b \in \mathbb{Z}.$$

Prove that the number of elements on $E$, counting the point at infinity, is

$$p^k + 1 + \sum_{x \in \mathbb{F}_{p^k}} \chi(x^3 + ax + b),$$

where $\chi$ is a quadratic Dirichlet character modulo $p^k$. In other words, $\chi(y) = -1, 0, 1$ according as $y$ is a quadratic nonresidue, 0, or a quadratic residue, respectively, for $y \in \mathbb{F}_{p^k}$.

In Exercises 9.6–9.9, use the Nagell–Lutz Theorem 9.1 and Mazur’s Theorem 9.2, both on page 312, to do the calculations.

9.6. If $y^2 = x^3 - 432$ defines the elliptic curve $E(\mathbb{Q})$, calculate $E(\mathbb{Q})_t$.

9.7. If $E(\mathbb{Q})$ is given by $y^2 = x^3 - 2x + 1$, determine $E(\mathbb{Q})_t$.

9.8. If $E(\mathbb{Q})$ is given by $y^2 = x^3 - x$, determine $E(\mathbb{Q})_t$.

9.9. If $E(\mathbb{Q})$ is given by $y^2 = x^3 + 1$, determine $E(\mathbb{Q})_t$.


Biography 9.3 André Weil, pronounced vay (1906–1998), was born on May 6, 1906 in Paris, France. As he said in his autobiography, The Apprenticeship of a Mathematician, he was passionately addicted to mathematics by the age of ten. He was also interested in languages, as evidenced by his having read the Bhagavad Gita in its original Sanskrit at the age of sixteen. After graduating from the École Normale in Paris, he eventually made his way to Göttingen, where he studied under Hadamard. His doctoral thesis contained a proof of the Mordell–Weil Theorem, namely that the group of rational points on an elliptic curve over $\mathbb{Q}$ is a finitely generated abelian group. His first position was at Aligarh Muslim University, India (1930–1932), then the University of Strasbourg, France (1933–1940), where he became involved with the controversial Bourbaki project, which attempted to give a unified description of mathematics. The name Nicolas Bourbaki was that of a citizen of the imaginary state of Poldavia, which arose from a spoof lecture given in 1923. Weil tried to avoid the draft, which earned him six months in prison. It was during this imprisonment that he created the Riemann hypothesis (see Conjecture 5.1 on page 223). In order to be released from prison, he agreed to join the French army. Then he came to the United States to teach at Haverford College in Pennsylvania. He also held positions at São Paulo University, Brazil (1945–1947), the University of Chicago (1947–1958), and thereafter at the Institute for Advanced Study at Princeton. In 1947 at Chicago, he began a study, which eventually led him to a proof of the Riemann hypothesis for algebraic curves. He went on to formulate a series of conjectures that won him the Kyoto Prize in 1994 from the Inamori Foundation of Kyoto, Japan. His conjectures provided the principles for modern algebraic geometry. His honours include an honorary membership in the London Mathematical Society in 1959, and election as a Fellow of the Royal Society of London in 1966. However, in his own official biography he lists his only honour as Member, Poldavian Academy of Science and Letters. He is also known for having said, “In the future, as in the past, the great ideas must be the simplifying ideas,” as well as, “God exists since mathematics is consistent, and the devil exists since we cannot prove it.” This is evidence of his being known for his poignant phrasing and whimsical individuality, as well as for the depth of his intellect. He died on August 6, 1998 in Princeton, and is survived by two daughters and three grandchildren. His wife Eveline died in 1986.


9.3 Applications: Factoring & Primality Testing

In mathematics you don’t understand things. You get used to them.

from The Dancing Wu Li Masters (see [106])
John von Neumann (1903–1957)
Hungarian-born American mathematician and computer pioneer

§9.1 and §9.2 put us in a position to describe Lenstra’s factorization method using elliptic curves (see Biography 9.4 on the next page).

◆ Lenstra’s Elliptic Curve Factoring Method

The following is the algorithm for factoring an odd composite $n \in \mathbb{N}$.

(1) In some random fashion, we generate a pair $(E, P)$, where $E = E(\mathbb{Q})$ is an elliptic curve over $\mathbb{Q}$ with equation

$$y^2 = x^3 + ax + b, \quad a, b \in \mathbb{Z},$$

and $P$ is a point on $E$.

(2) Check that $\gcd(n, 4a^3 + 27b^2) = 1$. If so, go to step (3). If not, then we have a factor of $n$, unless $\gcd(n, 4a^3 + 27b^2) = n$, in which case we choose a different pair $(E, P)$.

(3) Choose $M \in \mathbb{N}$ and bounds $A, B \in \mathbb{N}$ such that the canonical prime factorization of $M$ is

$$M = \prod_{j=1}^{\ell} p_j^{a_{p_j}},$$

for small primes $p_1 < p_2 < \cdots < p_\ell \leq B$, where

$$a_{p_j} = \lfloor \log_e A / \log_e p_j \rfloor$$

is the largest exponent such that $p_j^{a_{p_j}} \leq A$.

(4) For a sequence of divisors $s$ of $M$, compute

$$sP \pmod{n}$$

as follows. First compute

$$sP = p_1^{k}P \pmod{n},$$

for $1 \leq k \leq a_{p_1}$, then

$$sP = p_2^{k}\,p_1^{a_{p_1}}P \pmod{n},$$

for $1 \leq k \leq a_{p_2}$, and so on, until all primes $p_j$ dividing $M$ have been exhausted or the following occurs.


(5) If the calculation of either $(x_2 - x_1)^{-1}$ or $(2y_1)^{-1}$ in (9.7) on page 308, for some $s \mid M$ in step (4), shows that $x_2 - x_1$ or $2y_1$ is not prime to $n$, then there is a prime $p \mid n$ such that

$$sP = \mathfrak{o} \pmod{p}, \tag{9.10}$$

by Theorem 9.4 on page 314. This will give us a nontrivial factor of $n$ unless (9.10) occurs for all primes $p \mid n$. In that case the gcd of the offending denominator with $n$ is $n$ itself, and we go back and try the algorithm with a different $(E, P)$ pair.

The value of $B$ in step (3) of the above algorithm is the upper bound on the prime divisors of $s$, from which we form $sP$. If $B$ is large enough, then we increase the probability that $sP = \mathfrak{o} \pmod{p}$ for some prime $p \mid n$. On the other hand, the larger the value of $B$, the longer the computational time. Hence, we must also choose $B$ to minimize running time. Moreover, $A$ is an upper bound on the prime powers that divide $s$, so similar considerations apply. Lenstra has some convincing conjectural evidence that $n \in \mathbb{N}$ can be factored by his algorithm in expected running time

$$O\left(e^{\sqrt{(2 + \varepsilon)\log_e p\,(\log_e \log_e p)}}\,(\log_e n)^2\right),$$

where $p$ is the smallest prime factor of $n$ and $\varepsilon$ goes to zero as $p$ gets large. (A corollary of this fact is that the elliptic curve method can be used to factor $n$ in expected time

$$O\left(e^{\sqrt{(1 + \varepsilon)(\log_e n)(\log_e \log_e n)}}\right),$$

with $\varepsilon$ as above.)

Biography 9.4 Hendrik Willem Lenstra Jr. (1949–) was born in Zaandam, Netherlands. His father was a mathematician, and his brothers, Arjen and Jan, are also well-known mathematicians. Hendrik studied at the University of Amsterdam. He was an extraordinary student whose brilliance was demonstrated by his solution of a problem of Emmy Noether, which he published in Inventiones Mathematicae (see Biography 2.1 on page 73). In 1977, he obtained his doctorate under the direction of Frans Oort. Then, when only twenty-eight, he was appointed full professor at the University of Amsterdam. In 1987, he went to the United States, where he was appointed a full professor at Berkeley. In 2003, he retired from Berkeley to take a full-time position at the University of Leiden, the oldest university in the Netherlands.

His honours include the Fulkerson Prize in 1985, plenary lecturer at the International Congress of Mathematicians in 1986 at Berkeley, an honorary doctorate at the Université de Franche-Comté, Besançon in 1995, and Kloosterman-lecturer at the University of Leiden in 1995. Also, he received the Spinozapremie (Spinoza Prize) in 1998. The latter is an annual award by the Netherlands Research Council of 1.5 million Euros, to be spent on new research. The award, named after the philosopher Baruch Spinoza, is the highest scientific award in the Netherlands (see the quote on page 331).


In the next example, which illustrates Lenstra’s algorithm, we will make use of the following renowned result proved by Hasse.

Theorem 9.5 Hasse’s Bound for Elliptic Curves Over $\mathbb{F}_p$

If $E$ is an elliptic curve over $\mathbb{F}_{p^k}$ for a prime $p > 3$, and $k \in \mathbb{N}$, then

$$\left|\,|E \pmod{p^k}| - p^k - 1\,\right| \leq 2\sqrt{p^k}.$$

Note that Exercise 9.5 on page 315 is related to the following inequality emanating from Theorem 9.5 for the case where $k = 1$,

$$(\sqrt{p} - 1)^2 = p + 1 - 2\sqrt{p} < |E \pmod{p}| < p + 1 + 2\sqrt{p} = (\sqrt{p} + 1)^2. \tag{9.11}$$

Indeed, (9.11) represents the order of magnitude of the distance from $p$ for the possible orders of $E \pmod{p}$. Statistically speaking, the distance from the origin after addition over $p$ elements of the Legendre symbol, the $k = 1$ case of Exercise 9.5, is proportional to $\sqrt{p}$, so Theorem 9.5 gives an expected statistical result:

$$\frac{\left|\,|E \pmod{p}| - p - 1\,\right|}{\sqrt{p}} \leq 2.$$
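For small primes the count in Exercise 9.5 (case $k = 1$) and the bound of Theorem 9.5 can be checked by brute force. The sketch below is an illustration under our own choices: it evaluates the quadratic character by Euler's criterion, which the text does not mention, and the function names are ours.

```python
def count_points(a, b, p):
    """Number of points on y^2 = x^3 + a*x + b over F_p (p an odd prime),
    counting the point at infinity, via the character sum of Exercise 9.5."""
    def chi(v):
        v %= p
        if v == 0:
            return 0
        # Euler's criterion: v is a quadratic residue iff v^((p-1)/2) ≡ 1 (mod p)
        return 1 if pow(v, (p - 1) // 2, p) == 1 else -1
    return p + 1 + sum(chi(x**3 + a * x + b) for x in range(p))

def within_hasse(N, p):
    """|N - p - 1| <= 2*sqrt(p), tested as (N - p - 1)^2 <= 4p to avoid floats."""
    return (N - p - 1) ** 2 <= 4 * p

N = count_points(1, 1, 5)         # y^2 = x^3 + x + 1 over F_5
print(N, within_hasse(N, 5))      # prints: 9 True
```

The same function can be run on the curve $y^2 = x^3 + 9x + 1$ with $p = 1231$, for which Example 9.5 reports $|E \pmod{p}| = 2 \cdot 619$. The loop over all $x$ takes time proportional to $p$, so this is only practical for small primes.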

Based upon Hasse’s Theorem 9.5 for $k = 1$ and the above expected running time, Lenstra concludes that if we take

$$A = p + 1 + 2\sqrt{p}, \quad\text{and}\quad B = e^{\sqrt{(\log_e p)(\log_e \log_e p)/2}},$$

where $p$ is the smallest prime factor of $n$, then about one out of every $B$ iterations will be successful in factoring $n$. Of course, we do not know a prime divisor $p$ of $n$ in advance, so we replace $p$ by $\lfloor \sqrt{n} \rfloor$ and look at incremental values up to that bound.

Once the values of $A$ and $B$ have been chosen, then for a given prime $p$, the set $E \pmod{p}$ is a finite abelian group, since this is an elliptic curve over a finite field. Also, if the order $g$ of $E \pmod{p}$ is not divisible by any primes larger than $B$, and if $p$ is a prime such that

$$p + 1 + 2\sqrt{p} < A,$$

then Hasse’s Theorem 9.5 tells us that $g \mid M$ in the algorithm, so

$$MP = \mathfrak{o} \pmod{p}.$$

When $E \pmod{n}$ is not a group, this is not a problem in the algorithm. The reason is that if $P_1$ and $P_2$ are points on such a curve and $P_1 + P_2$ is not defined, then $n$ must be composite! The noninvertibility that would result in step (5) of the algorithm would then give us a factor of $n$. This is indeed the underlying key element in the elliptic curve algorithm.


Remark 9.6 There is also the following valuable result on the group structure of $E = E(\mathbb{F}_p)$. If $p > 3$ is prime, then there are $m, n \in \mathbb{N}$ such that $E$ is isomorphic to the product of a cyclic group of order $m$ with one of order $n$, where $m \mid \gcd(n, p - 1)$. See [47].

The following example is chosen to best illustrate the algorithm for pedagogical purposes, wherein we choose relatively small values of $n$ to factor. Even though modular reduction at each stage keeps the size of the rational points to a minimum, the larger the number, the higher the likelihood of a large number of stages before the algorithm terminates. Thus, we keep the value small so that the process may be illustrated without filling pages with calculations.

Example 9.4 Let $n = 3551$. Choose a family of elliptic curves

$$y^2 = x^3 + ax + 1,$$

each of which has the point $P = (0, 1)$ on it. We now choose successive natural numbers $a$ until the process described above is successful in factoring $n$. We take $B = 3$, and since

$$\lfloor \sqrt{n} \rfloor = 59 \geq p,$$

then by Hasse’s Theorem 9.5 on the preceding page, we may choose

$$A = 59 + 1 + 2\lfloor \sqrt{n} \rfloor = 178.$$

Thus,

$$M = 2^7 \cdot 3^4,$$

where

$$7 = \lfloor \log_e 178 / \log_e 2 \rfloor,$$

and

$$4 = \lfloor \log_e 178 / \log_e 3 \rfloor.$$

Using (9.5)–(9.7), we tabulate the following for $a = 1$. First we verify that the discriminant of $E$ is prime to $n$. We have

$$\Delta(E(\mathbb{Q})) = -16(4 \cdot 1^3 + 27 \cdot 1^2) = -16 \cdot 31,$$

which is prime to $n$, so we may proceed. We therefore begin with the $(E, P)$ pair

$$(y^2 = x^3 + x + 1,\ (0, 1)).$$

In Table 9.1, the value $m$ is given by (9.7) on page 308.


Table 9.1

s                  m       sP
1                  --      (0, 1)
2                  1776    (888, 3106)
$2^2$              2860    (3422, 796)
$2^3$              1218    (3015, 1341)
$2^4$              704     (3099, 3441)
$2^5$              3396    (72, 3208)
$2^6$              2022    (1139, 1877)
$2^7$              1977    (151, 1900)
$2^6 \cdot 3$      1700    (148, 3200)
$2^7 \cdot 3$      1085    (1548, 1179)
$2^6 \cdot 3^2$    3476    (525, 218)
$2^7 \cdot 3^2$    2939    (639, 2081)
$2^6 \cdot 3^3$    3287    (2932, 3152)
$2^7 \cdot 3^3$    117     (723, 3180)
$2^6 \cdot 3^4$    3297    (2612, 792)
$2^7 \cdot 3^4$    11      (1999, 2400)

We now abandon the above $(E, P)$ pair since we have exhausted all divisors of $M$ without achieving a point at infinity modulo any prime $p$ dividing $n$. Notice that on line nine of the column for $s$, we have

$$s = 2^6 \cdot 3 = (2 + 1) \cdot 2^6 = 2^7 + 2^6.$$

We are adding the two distinct points, the ones on lines seven and eight. Then on line ten, $s = 2^7 \cdot 3$ is twice $s = 2^6 \cdot 3$ on the previous line. Similarly, this also occurs for

$$s = 2^6 \cdot 3^2 = 3 \cdot 2^6 + 3 \cdot 2^7, \quad s = 2^6 \cdot 3^3 = 3^2 \cdot 2^7 + 3^2 \cdot 2^6, \quad s = 2^6 \cdot 3^4 = 3^3 \cdot 2^7 + 3^3 \cdot 2^6.$$

This natural process of doubling and reduction signifies the method in the algorithm that we are illustrating. (This method of repeated doubling is a method of multiplying a point $P$ on an elliptic curve $E$ by a given $s \in \mathbb{N}$. This is the analogue of raising an element of a finite field $\mathbb{F}_q$ to the power $s$. It is known that this can be accomplished in $O((\log_e s)(\log_e q)^3)$ bit operations.)

The reader may now go to Exercise 9.10 on page 325, which verifies that we also exhaust all divisors of $M$ for each $(E, P)$ pair

$$(y^2 = x^3 + ax + 1,\ (0, 1)) \quad\text{with}\quad 2 \leq a \leq 8.$$

We now move to the next $(E, P)$ pair, which is

$$(y^2 = x^3 + 9x + 1,\ (0, 1)).$$

Observe that

$$\gcd(\Delta(E), n) = \gcd(-2^4 \cdot 3^3 \cdot 109,\ 3551) = 1,$$

so we may proceed.


Table 9.2

s         m       $(x_3, y_3)$
1         --      (0, 1)
2         1780    (908, 3015)
$2^2$     --      --

We terminate the calculations at $m = (3 \cdot 908^2 + 9)/(2 \cdot 3015) = 2473401/6030$ since $\gcd(6030, 3551) = 67$. This gives us the factorization $3551 = 53 \cdot 67$. Thus, we have reached step (5) of the algorithm where $y_1^{-1} = 3015^{-1}$ does not exist modulo $n$ for the pair $(x_1, y_1) = (908, 3015)$, so we cannot use (9.7) to compute the $(x_3, y_3)$ pair for $2^2P$, and the algorithm terminates with a factorization.
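The calculation in Example 9.4 can be reproduced with a short program. What follows is a minimal sketch, not Lenstra's implementation: it fixes the family $y^2 = x^3 + ax + 1$ with $P = (0, 1)$ used in the example, all names (`FactorFound`, `lenstra_one_curve`, and so on) are ours, and it multiplies by each prime $p \leq B$ in turn, so the intermediate multiples are formed in a slightly different order from Table 9.1. It assumes Python 3.8 or later for `pow(d, -1, n)`.

```python
from math import gcd

class FactorFound(Exception):
    """Raised when a denominator in (9.7) is not invertible modulo n."""
    def __init__(self, g):
        super().__init__(g)
        self.g = g

def inverse(d, n):
    g = gcd(d % n, n)
    if g != 1:
        raise FactorFound(g)          # step (5): the gcd is a divisor of n
    return pow(d % n, -1, n)

def add(P, Q, a, n):
    """Addition by (9.5)-(9.7) with arithmetic modulo n; None stands for o."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % n == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * inverse(2 * y1, n) % n
    else:
        m = (y2 - y1) * inverse(x2 - x1, n) % n
    x3 = (m * m - x1 - x2) % n
    return (x3, (m * (x1 - x3) - y1) % n)

def multiply(k, P, a, n):
    """k*P by left-to-right doubling and adding."""
    R = P
    for bit in bin(k)[3:]:            # binary digits of k after the leading 1
        R = add(R, R, a, n)
        if bit == "1":
            R = add(R, P, a, n)
    return R

def lenstra_one_curve(n, a, A, B):
    """Try y^2 = x^3 + a*x + 1 with P = (0, 1); return a nontrivial factor or None."""
    g = gcd(4 * a**3 + 27, n)         # step (2), with b = 1
    if g != 1:
        return g if g < n else None
    Q = (0, 1)
    try:
        for p in (q for q in range(2, B + 1) if all(q % r for r in range(2, q))):
            e = 0
            while p ** (e + 1) <= A:  # largest e with p^e <= A, as in step (3)
                e += 1
            for _ in range(e):        # step (4): multiply by p, e times
                Q = multiply(p, Q, a, n)
    except FactorFound as hit:
        return hit.g if hit.g < n else None
    return None

print(add((0, 1), (0, 1), 9, 3551))          # prints: (908, 3015), the 2P of Table 9.2
print(lenstra_one_curve(3551, 9, 178, 3))    # prints: 67
```

Signalling the non-invertible denominator with an exception keeps the addition formulas uncluttered; the caller decides whether the gcd is a proper factor or the trivial value $n$.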

Example 9.4 provides ample illustrations of one reason for having to choose a new elliptic curve from the family, namely running out of divisors of $M$. The other reason for having to choose another such curve is the obtaining of the trivial factorization during the implementation of the algorithm. In other words, before exhaustion of the divisors of $M$, we could encounter a value whose gcd with $n$ is $n$, as indicated in step (5) of the algorithm.

Lenstra’s algorithm is exceptional at finding small prime factors (those with no more than forty digits) of large composite numbers. Moreover, since it requires relatively little storage space, it can be used as a subroutine in conjunction with other methods. For this reason, among many others, the elliptic curve methods enjoy great favour among modern-day cryptographers.

We now show how Lenstra’s algorithm may be modified to obtain a primality testing algorithm. The primality test is based upon the following result.

Theorem 9.6 Elliptic Curve Primality Test

Let $n \in \mathbb{N}$ with $\gcd(n, 6) = 1$, and let $E = E(\mathbb{Q})$ be an elliptic curve over $\mathbb{Q}$. Suppose that

(a) $n + 1 - 2\sqrt{n} \leq |E \pmod{n}| \leq n + 1 + 2\sqrt{n}$.

(b) $|E \pmod{n}| = 2p$, where $p > 2$ is prime.

If $P \neq \mathfrak{o}$ is a point on $E$ and $pP = \mathfrak{o}$ on $E \pmod{n}$, then $n$ is prime.

Proof. See [18, Lemma 14.23, p. 324]. □

Theorem 9.6 is employed by picking in some random fashion points $P_j$ for $j = 1, 2, \ldots, m \in \mathbb{N}$ on an elliptic curve $E$ and, for a given prime $p$, calculating $pP_j$ for each such $j$. If the outcome is that $pP_j = \mathfrak{o}$ for some $j = 1, 2, \ldots, m$, then $n$ is prime. For instance, a suitable choice for $P_1$ is $2Q_1$, where $Q_1$ is randomly chosen. If $P_1 \neq \mathfrak{o}$, but $pP_1 = \mathfrak{o}$, then $n$ is prime. If $P_1 \neq \mathfrak{o} \neq pP_1$, then $n$ is composite.

The following illustration is again chosen for pedagogical reasons. A “realistic” value of $n$ cannot be chosen, given the depth of calculations that would be involved.


Example 9.5 Let $n = 1231$. Since we enjoyed success in Example 9.4 on page 320 with the elliptic curve $E$ given by $y^2 = x^3 + 9x + 1$, we use it here. First we observe that

$$\gcd(n, 6) = \gcd(\Delta(E), n) = \gcd(2433109, 1231) = 1.$$

Now we proceed to check $n$ for primality. If $n$ were prime, then Exercise 9.5 on page 315 tells us that $|E \pmod n| = 2 \cdot 619$. Also,

$$1161 < n + 1 - 2\sqrt{n} < |E \pmod n| < 1302 < n + 1 + 2\sqrt{n}.$$

Therefore, conditions (a)–(b) of Theorem 9.6 are satisfied. To test $n$ for primality, we begin with a primitive element that has a chance of generating enough points on $E$. Let $P = (0, 1)$ and observe that

$$619 = 2^9 + 2^6 + 2^5 + 2^3 + 2^1 + 2^0,$$

so we calculate up to $2^9$ and test $619P$. Again, in what follows, $m$ is the value in (9.7) on page 308.

Table 9.3

s                    m       sP
1                    --      (0, 1)
2                    620     (328, 985)
2^2                  1213    (899, 676)
2^3                  1156    (134, 1037)
2^4                  226     (337, 1094)
2^5                  302     (667, 188)
2^6                  996     (958, 492)
2^7                  1173    (217, 846)
2^8                  1201    (466, 469)
2^9                  457     (1109, 1120)
576 = 2^9 + 2^6      852     (9, 520)
608 = 576 + 2^5      557     (592, 964)
616 = 608 + 2^3      954     (912, 275)
618 = 616 + 2        3       (0, 1230)
619 = 618 + 1        --      o

Observe that via (9.7), $(0, 1) + 618P$ has a zero denominator so we cannot invert in $\mathbb{Z}/n\mathbb{Z}$, thereby yielding that $619P = o$, so 1231 is prime by Theorem 9.6.
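The calculation in Table 9.3 can be replayed mechanically. The following is a minimal sketch, assuming the usual chord-and-tangent addition formulas for $y^2 = x^3 + ax + b$ (taken here to be what (9.7) refers to); the function names `ec_add` and `ec_mul` are illustrative and not from the text. A zero denominator when adding $P$ and $-P$ is reported as the point at infinity, represented by `None`.

```python
from math import sqrt

N, A = 1231, 9   # modulus n and the coefficient a of y^2 = x^3 + 9x + 1
O = None         # the point at infinity o

def ec_add(P, Q, n=N, a=A):
    """Add two points modulo n with the chord-and-tangent formulas."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % n == 0:
        return O                        # zero denominator: P + (-P) = o
    if P == Q:
        num, den = 3 * x1 * x1 + a, 2 * y1
    else:
        num, den = y2 - y1, x2 - x1
    m = num * pow(den % n, -1, n) % n   # the slope m of Table 9.3
    x3 = (m * m - x1 - x2) % n
    y3 = (m * (x1 - x3) - y1) % n
    return (x3, y3)

def ec_mul(k, P):
    """Compute kP by binary double-and-add."""
    R, Q = O, P
    while k:
        if k & 1:
            R = ec_add(R, Q)
        k >>= 1
        if k:
            Q = ec_add(Q, Q)
    return R

P = (0, 1)
p = 619
# Condition (a) of Theorem 9.6 for the claimed order 2p = 1238.
hasse_ok = N + 1 - 2 * sqrt(N) <= 2 * p <= N + 1 + 2 * sqrt(N)
print(hasse_ok, ec_mul(2, P), ec_mul(618, P), ec_mul(p, P))
```

The sketch only replays the arithmetic of the example; it takes the value $|E \pmod n| = 2 \cdot 619$ from the text rather than counting points.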

We observe that if part (a) of Theorem 9.6 fails to hold, then we have a compositeness test by Hasse’s Theorem 9.5 on page 319. Also, part (b) of Theorem 9.6 is very special and does not hold for many elliptic curves. The reader may get a sense of this by checking a few examples via Exercise 9.5 on page 315. Moreover, our $n$ in Example 9.5 was sufficiently small that we were able to calculate $|E \pmod n|$ with relative ease. However, as $n$ gets large, $|E \pmod n|$ gets large, so we may not be able to determine its value. In fact, calculating this cardinality may be as difficult as proving that $n$ is prime. These problems were overcome in a primality test by Goldwasser and Kilian [34]. Before discussing it, we present a primality proving algorithm, within our reach, upon which Goldwasser and Kilian based their test. Recall that a primality proving algorithm is one that, given an input $n$, verifies the hypothesis of a theorem whose conclusion is “$n$ is prime”; see [68, §1.8].

Theorem 9.7 Goldwasser–Kilian Primality Proving Algorithm

Let $n > 1$ be an integer with $\gcd(6, n) = 1$, and let $m, r \in \mathbb{N}$ with $r \mid m$. Furthermore, assume $E = E(\mathbb{Q})$ is an elliptic curve over $\mathbb{Q}$. If there exists a point $P$ on $E$ such that $mP = o$, and for every prime $p \mid r$ we have that

$$\left(\frac{m}{p}\right)P \ne o,$$

then for every prime $q \mid n$ we have

$$|E \pmod q| \equiv 0 \pmod r. \qquad (9.12)$$

Also, if

$$r > (n^{1/4} + 1)^2,$$

then $n$ is prime.

Proof. Let $q$ be a prime divisor of $n$ and let $d$ be the order of $P$ on $E \pmod q$. It follows that $r \mid d$, so (9.12) follows. Now assume that $r > (n^{1/4} + 1)^2$. By Hasse’s Theorem 9.5,

$$|E \pmod q| < (q^{1/2} + 1)^2.$$

Hence,

$$q^{1/2} + 1 > \sqrt{|E \pmod q|} \ge r^{1/2} > n^{1/4} + 1,$$

so $q > \sqrt{n}$. Yet $n = qt$ for some $t \in \mathbb{N}$, so if $t \ge 2$, then $t$ has a prime divisor $q'$, which also satisfies $q' > \sqrt{n}$ by the above, whence $n \ge qq' > n$, a contradiction, which yields that $n$ is prime. □

Goldwasser and Kilian employed Theorem 9.7 to provide a primality test where an input $n \in \mathbb{N}$ could be tested in an expected number of operations $O(\log_e^C n)$ for a constant $C$. The kernel of the idea in their test comes in two parts. One is to randomly select elliptic curves modulo $n$ for a large number of $n \in \mathbb{N}$. Then whenever we get

$$|E \pmod n| = 2p,$$

where $p$ is a probable prime, use Theorem 9.6 on page 322 to check for primality of $p$. If this test succeeds in demonstrating that $p$ is indeed prime, then it follows from probabilistic compositeness tests that $n$ is provably prime; see [68, §2.7, pp. 121–126].

The second idea is to make the above process recursive. They do this by proving $p$ is prime using Theorem 9.6 on an elliptic curve over $\mathbb{Z}/p\mathbb{Z}$ of order $2r$, where $r$ is a probable prime in Theorem 9.7. In this fashion, the primality of $r$ implies the primality of $p$. Moreover, since each iteration reduces the size by a half, given that $p \approx n/2$, the numbers will get sufficiently small so that trial division may be used to prove primality. Then by this process, the original $n$ may be shown to be, in the last iteration, (provably) prime. If, in any iteration, the probable prime is shown to be composite, then one goes back to the initial iteration with another candidate; see [49] for more details. Also, see [18] for other interesting and deep connections.

In §9.4, we will look at applications of elliptic curves to cryptography as a fitting close to this chapter where we may employ what we have learned herein thus far.

Exercises

9.10. Perform the calculations in Lenstra’s Elliptic Curve Factoring Method for each $(E, P)$ pair $(y^2 = x^3 + ax + 1, (0, 1))$ where $3 \le a \le 8$. This shows that all divisors of $M$ are exhausted in each case without achieving a nontrivial factor of 3551.

9.11. Use Lenstra’s Elliptic Curve Factoring Algorithm to factor each of the following.
(a) 16199 (b) 13261 (c) 53059 (d) 10403

9.12. Use Lenstra’s Elliptic Curve Method to factor each of the following.
(a) 2201 (b) 16199 (c) 9073 (d) 32107

9.13. Use the Elliptic Curve Primality Test to test each of the following for primality.
(a) 7489 (b) 8179 (c) 9533 (d) 26869


9.4 Elliptic Curve Cryptography (ECC)

Quod gratis asseritur, gratis negatur. What is asserted without reason (or proof), may be denied without reason (or proof).

Latin Maxim

For this section, the reader should be familiar with the basics on cryptology as set out for instance in [68, §2.8, pp. 127–138]. Part of the following is adapted from [66].

In the 1980s, there was a development of the notion of public-key cryptography in the realm of elliptic curves. In particular, in 1985, Miller (see [59]) and Koblitz (see [46]) independently proposed using elliptic curves for public-key cryptosystems. However, they did not invent a cryptographic algorithm for use with elliptic curves, but rather implemented then-existing public-key algorithms in elliptic curves over finite fields. These types of cryptosystems are more appealing than cryptosystems over finite fields since, rather than just the group $\mathbb{F}_p^*$ of a finite field, one has many elliptic curves over $\mathbb{F}_p$ from which to choose. Also, whenever the elliptic curve is properly chosen, there is no known subexponential time algorithm for cryptanalyzing such cryptosystems, where such an algorithm is defined as one for which the complexity for input $n \in \mathbb{N}$ is

$$O\left(\exp\left((c + o(1))(\log_e n)^r((\log_e n)(\log_e \log_e n))^{1-r}\right)\right),$$

where $r \in \mathbb{R}$ with $0 < r < 1$ and $c$ is a constant; see [68, Appendix B: Complexity]. Such algorithms are faster than exponential-time algorithms and slower than polynomial time algorithms. An example of a pioneer subexponential time algorithm is the Brillhart–Morrison continued fraction factoring method; see [68, §5.4, pp. 240–242].

The security of Elliptic Curve Cryptosystems depends upon the intractability of the following problem.

Definition 9.5 (Elliptic Curve Discrete Log Problem (ECDL))

If $E$ is an elliptic curve over a field $F$, then the Elliptic Curve Discrete Log Problem to base $Q \in E(F)$ is the problem of finding an $x \in \mathbb{Z}$ (if one exists) such that $P = xQ$ for a given $P \in E(F)$.
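To make Definition 9.5 concrete, the following is a minimal sketch of the naive exhaustive search for $x$ with $P = xQ$. It assumes, purely for illustration, the toy curve $y^2 = x^3 + 4x + 4$ over $\mathbb{F}_{13}$ with base point $(1, 3)$; the helper names `ec_add` and `ecdl_bruteforce` are hypothetical. The search takes time linear in the group order, which is why it is hopeless for curves of cryptographic size.

```python
O = None  # the point at infinity o

def ec_add(P, Q, a, p):
    """Chord-and-tangent addition on y^2 = x^3 + a*x + b over F_p."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O
    if P == Q:
        num, den = 3 * x1 * x1 + a, 2 * y1
    else:
        num, den = y2 - y1, x2 - x1
    m = num * pow(den % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)

def ecdl_bruteforce(target, base, a, p, max_steps):
    """Return the least x in [0, max_steps] with x*base == target, else None."""
    R = O  # R holds x*base at the top of each iteration
    for x in range(max_steps + 1):
        if R == target:
            return x
        R = ec_add(R, base, a, p)
    return None

# Toy instance (an assumption of this sketch): find x with x*(1, 3) = (10, 2).
print(ecdl_bruteforce((10, 2), (1, 3), a=4, p=13, max_steps=15))
```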

Currently, the Discrete Log Problem in elliptic curve groups is several orders of magnitude more difficult than the Discrete Log Problem in the multiplicative group of a finite field (of similar size); see [68, §3.5, p. 167]. What this means explicitly is that for a suitably chosen elliptic curve $E$ over $\mathbb{F}_q$, the discrete log problem for the group of $E(\mathbb{F}_q)$ appears to be (given our current state of knowledge) of complexity exponential in the size $\lceil \log_2 q \rceil$ of the field elements, whereas there exist subexponential algorithms in $\lceil \log_2 q \rceil$ for the discrete log problem in $\mathbb{F}_q^*$, where $\lceil \cdot \rceil$ is the ceiling function; see [68, §2.5]. The canonical choices for $F$ in ECC are $F = \mathbb{F}_p$ for a prime $p > 3$ or $\mathbb{F}_{2^k}$ for $k \in \mathbb{N}$. We focus upon the odd prime case.


Remark 9.7 In 1991, Menezes, Okamoto, and Vanstone found a new means of attacking the ECDL (appearing two years later in [56]). Their method, currently called the MOV attack in the literature, involves the use of what is called a Weil Pairing (see [88, Section 3.8, pp. 95–99]), which embeds an elliptic curve over a finite field into the multiplicative group of some finite extension field of the given finite field. Hence, their method reduces the problem to the discrete log problem in that extension field, called an MOV reduction. To be of any use, the degree of the extension field must be small, and essentially the only elliptic curves for which this degree is small are of a special type called supersingular; see [88, p. 137]. They demonstrated that if we have a supersingular curve, then the discrete log problem in an elliptic curve group can be reduced in expected polynomial time to the discrete log problem in the extension field of degree no more than 6 over the finite field. However, the vast majority of elliptic curves are not supersingular, called nonsupersingular or ordinary. For the nonsupersingular curves, the MOV reduction virtually never leads to a subexponential time algorithm. What this suggests is that one of the basic open questions in ECC is whether or not we can find a subexponential time algorithm for the ECDL on some set of nonsupersingular elliptic curves, a difficult question at the present time. The MOV attack was generalized by Frey and Rück [28] in 1994. Also, there is a useful test for approximating the security level of an ECC, called the MOV threshold; see [90], which may be accessed online at http://grouper.ieee.org/groups/1363/.

Another attack on elliptic curves $E$ with $|E| = p$ involves $p$-adic arithmetic, called the Semaev–Smart–Satoh–Araki attack; see [83], [86] and [89]. Also, there is the Silver–Pohlig–Hellman algorithm, which reduces the problem to subgroups of prime order; see [67, §D.2, p. 530]. Other attacks include Shanks’ baby-step-giant-step method (see [67, §D.3, p. 533]); Pollard’s methods including his rho method (see [68, §4.3, pp. 206–208]); and the Frey–Rück attack using the Weil Pairing, described above. Of all of these, only the Semaev–Smart–Satoh–Araki attack runs in polynomial time, while the others are, at best, subexponential. Up to the modern day, the ECDL remains a very hard computational problem. Indeed, evidence of the power of ECC is the fact that the NSA had adopted ECC, saying that it “provides greater security and more efficient performance than the first generation public key techniques (RSA and Diffie-Hellman) now in use. As vendors look to upgrade their systems they should seriously consider the elliptic curve alternative for the computational and bandwidth advantages they offer at comparable security.” See http://www.nsa.gov/business/programs/elliptic_curve.shtml.

Now we are in a position to present an explicit ECC whose security is based upon the assumption that the ECDL is intractable, in particular, in the cyclic subgroup of the elliptic curve group.

Menezes–Vanstone Elliptic Curve Cryptosystem

Let $E$ be an elliptic curve over $\mathbb{F}_p$ where $p > 3$ is prime and let $H$ be a subgroup of $E(\mathbb{F}_p)$ generated by a point $P \in E(\mathbb{F}_p)$. Assume that randomly chosen $k \in \mathbb{Z}/|H|\mathbb{Z}$ and $a \in \mathbb{N}$ are secret. If entity A wants to send message

$$m = (m_1, m_2) \in (\mathbb{Z}/p\mathbb{Z})^* \times (\mathbb{Z}/p\mathbb{Z})^*,$$

then A does the following.

Enciphering stage:

(1) $\beta = aP$, where $P$ and $\beta$ are public.

(2) $(y_1, y_2) = k\beta$.

(3) $c_0 = kP$.

(4) $c_j \equiv y_j m_j \pmod p$ for $j = 1, 2$.

Then A sends the following enciphered message to B,

$$E_k(m) = (c_0, c_1, c_2) = c,$$

and upon receipt, B calculates the following to recover $m$.

Deciphering stage:

(1) $ac_0 = (y_1, y_2)$.

(2) $D_k((c_1, c_2)) = (c_1 y_1^{-1} \pmod p,\ c_2 y_2^{-1} \pmod p) = m$.

Example 9.6 Let $E$ be the elliptic curve given by

$$y^2 = x^3 + 4x + 4$$

over $\mathbb{F}_{13}$, and let $P = (1, 3)$. Then by Exercise 9.5 on page 315, $|E(\mathbb{F}_p)| = 15$, which is necessarily cyclic. Also, $P = (1, 3)$ is a generator of $E$. If the private keys are $k = 5$ and $a = 2$, then given a message

$$m = (12, 7) = (m_1, m_2),$$

entity A computes

$$\beta = aP = 2(1, 3) = (12, 8),$$

$$(y_1, y_2) = k\beta = 5(12, 8) = (10, 11),$$

$$c_0 = kP = 5(1, 3) = (10, 2),$$

$$c_1 \equiv y_1 m_1 = 10 \cdot 12 \equiv 3 \pmod{13}, \text{ and } c_2 \equiv y_2 m_2 = 11 \cdot 7 \equiv 12 \pmod{13}.$$

Then A sends

$$E_k(m) = E_5(12, 7) = (c_0, c_1, c_2) = ((10, 2), 3, 12) = c$$

to B. Upon receipt, B computes

$$ac_0 = 2(10, 2) = (10, 11) = (y_1, y_2)$$

and

$$D_k((c_1, c_2)) = D_5(3, 12) = (3 \cdot 10^{-1} \pmod{13},\ 12 \cdot 11^{-1} \pmod{13}) = (12, 7) = m.$$

(See Exercise 9.18.)
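The enciphering and deciphering stages can be traced in a few lines of code. The following is a minimal sketch using the data of Example 9.6, assuming the standard chord-and-tangent addition law; the function names (`ec_add`, `ec_mul`, `encipher`, `decipher`) are illustrative choices, not from the text, and nothing here is suitable for real use at this key size.

```python
PRIME, A = 13, 4   # the curve y^2 = x^3 + 4x + 4 over F_13
O = None           # the point at infinity o

def ec_add(P, Q):
    """Chord-and-tangent addition on the curve over F_13."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % PRIME == 0:
        return O
    if P == Q:
        num, den = 3 * x1 * x1 + A, 2 * y1
    else:
        num, den = y2 - y1, x2 - x1
    m = num * pow(den % PRIME, -1, PRIME) % PRIME
    x3 = (m * m - x1 - x2) % PRIME
    y3 = (m * (x1 - x3) - y1) % PRIME
    return (x3, y3)

def ec_mul(k, P):
    """Compute kP by binary double-and-add."""
    R, Q = O, P
    while k:
        if k & 1:
            R = ec_add(R, Q)
        k >>= 1
        if k:
            Q = ec_add(Q, Q)
    return R

def encipher(m, k, P, beta):
    """Steps (2)-(4) of the enciphering stage; beta = aP is public."""
    y1, y2 = ec_mul(k, beta)
    c0 = ec_mul(k, P)
    return (c0, y1 * m[0] % PRIME, y2 * m[1] % PRIME)

def decipher(c, a):
    """Recover m from c = (c0, c1, c2) with the private key a."""
    c0, c1, c2 = c
    y1, y2 = ec_mul(a, c0)
    return (c1 * pow(y1, -1, PRIME) % PRIME, c2 * pow(y2, -1, PRIME) % PRIME)

P, a, k = (1, 3), 2, 5
beta = ec_mul(a, P)
c = encipher((12, 7), k, P, beta)
print(beta, c, decipher(c, a))
```

Note that deciphering works because $ac_0 = a(kP) = k(aP) = k\beta$, so both parties arrive at the same mask $(y_1, y_2)$.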

Exercises

9.14. A given $n \in \mathbb{N}$ is called a congruent number or simply congruent if it is the area of a right-angled triangle. Prove that the following are equivalent.

(1) $n = ab/2$ is congruent, where $(a, b, c)$ is a Pythagorean triple. (Recall that such triples are solutions $(x, y, z) \in \mathbb{N}^3$ to

$$x^2 + y^2 = z^2.$$

Furthermore, such a solution with $\gcd(x, y, z) = 1$, called a primitive Pythagorean triple, exists with $x$ even, if and only if

$$(x, y, z) = (2uv, v^2 - u^2, v^2 + u^2)$$

for relatively prime natural numbers $u$ and $v$ of opposite parity; see [68, Theorem 7.6, p. 281].)

(2) There exists an integer $x$ such that $x$, $x - n$, and $x + n$ are all perfect squares of rational numbers.

9.15. Let $E$ be an elliptic curve over $\mathbb{Q}$ given by

$$y^2 = (x - \alpha_1)(x - \alpha_2)(x - \alpha_3),$$

where $\alpha_j \in \mathbb{Q}$ for $j = 1, 2, 3$. Assume that for a given point $(x_2, y_2) \ne o$ on $E$, there exists a point $(x_1, y_1)$ on $E$ such that

$$2(x_1, y_1) = (x_2, y_2).$$

Prove that $x_2 - \alpha_j$ are squares of rational numbers for $j = 1, 2, 3$.

9.16. Let $E$ be an elliptic curve over $\mathbb{Q}$ defined by

$$y^2 = x^3 - n^2 x$$

for some squarefree $n \in \mathbb{N}$. Prove that the conditions in Exercise 9.14 are equivalent to $E$ having a rational point other than $(\pm n, 0)$, $(0, 0)$, and $o$. In other words, $n$ is congruent if and only if $E$ has a rational point other than $(\pm n, 0)$, $(0, 0)$, and $o$.
(It can be shown (see [45, Theorem 5.2, p. 134]) that when $E$ is given by

$$y^2 = x^3 + Ax$$

with $A \in \mathbb{Z}$ assumed to be fourth-power free, then

$$E(\mathbb{Q})_t \cong \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}$$

if $-A$ is a perfect square,

$$E(\mathbb{Q})_t \cong \mathbb{Z}/4\mathbb{Z}$$

when $A = 4$, and

$$E(\mathbb{Q})_t \cong \mathbb{Z}/2\mathbb{Z}$$

otherwise. Thus, for the case given in this exercise, $n$ is congruent if and only if $E$ has a point of infinite order.)

9.17. Let $n \in \mathbb{N}$ be squarefree. Prove that the following are equivalent.

(1) $n$ is a congruent number.

(2) The simultaneous (homogeneous Diophantine) equations

$$x^2 + ny^2 = z^2 \quad\text{and}\quad x^2 - ny^2 = t^2$$

have a solution in integers $x, y, z, t$ with $y \ne 0$. (A polynomial of degree $d$ is said to be homogeneous if each term has degree $d$. For example,

$$x^3 + xyz = z^3$$

is a homogeneous polynomial of degree $d = 3$ and $x + y = z$ is one of degree $d = 1$.)

9.18. Given the same curve $E$ and point $P$ as in Example 9.6, decipher

$$c = ((12, 8), 2, 8)$$

assuming that it was enciphered using the Menezes–Vanstone Elliptic Curve Cryptosystem with $k = 2$ and $a = 5$.

Chapter 10

Modular Forms

There is no hope without fear, and no fear without hope.
from part one paragraph six of Ethics (1677)
Baruch Spinoza (1632–1677)
Dutch philosopher (see Biography 9.4 on page 318)

10.1 The Modular Group

In Remark 3.1 on page 98, we discussed unimodular transformations in the context of binary quadratic forms involving $SL(2, \mathbb{Z})$. Also, in Exercise 2.5 on page 66, the content therein is that two $\mathbb{Z}$-modules having the same basis are connected by a unimodular transformation, namely via those $A \in GL(2, \mathbb{Z})$ with $\det(A) = \pm 1$.

In order to discuss modular forms, and their connection with elliptic curves studied in Chapter 9, we need to expand this discussion into the analytic realm.

First, we let

$$SL(2, \mathbb{R})$$

be the generalization of $SL(2, \mathbb{Z})$ to $\mathbb{R}$, namely the group of $2 \times 2$-matrices with coefficients in $\mathbb{R}$ and determinant 1.

Then we let

$$\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\},$$

called the Riemann sphere.


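Definition 10.1 Möbius Transformations

Define an action of $SL(2, \mathbb{R})$ on $\hat{\mathbb{C}}$ via the fractional linear transformation, also called a Möbius transformation, where $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{R})$:

$$\sigma : z \mapsto \alpha z = \sigma(z) = \begin{cases} (az + b)/(cz + d) & \text{if } z \in \mathbb{C} \text{ and } z \ne -d/c, \\ \infty & \text{if } z = -d/c, \\ a/c & \text{if } z = \infty \text{ and } c \ne 0, \\ \infty & \text{if } z = \infty \text{ and } c = 0. \end{cases}$$

A value $\sigma(\infty) = a/c \ne \infty$ is called a cusp of $\alpha$.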

By Exercise 10.1 on page 335, the imaginary part of $\alpha z \in \mathbb{C}$ is given by

$$\Im(\alpha z) = \frac{\Im(z)}{|cz + d|^2}. \qquad (10.1)$$

Now set

$$H = \{z \in \mathbb{C} : \Im(z) > 0\},$$

namely the upper half plane. Thus, by (10.1), the Möbius transformation $\sigma$ maps $H \to H$, which says that $H$ is stable, meaning $H$ is preserved under the action of $SL(2, \mathbb{R})$. Also, since

$$\sigma(z) = \alpha z = (-\alpha) z,$$

namely $\alpha$ and $-\alpha$ represent the same transformation, then

$$-1 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$$

acts trivially on $H$, so the group

$$PSL(2, \mathbb{R}) = SL(2, \mathbb{R})/\{\pm 1\},$$

called the projective special linear group, is actually isomorphic to the group of fractional linear transformations. When we specialize to $\mathbb{Z}$, we have the topic in this section’s header.
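Equality (10.1) is easy to sanity-check numerically before proving it. The following is a minimal sketch using Python's built-in complex numbers; the matrices and the sample point are arbitrary choices made for illustration, and the function names are hypothetical.

```python
def mobius(a, b, c, d, z):
    """Apply the fractional linear transformation z -> (az + b)/(cz + d)."""
    return (a * z + b) / (c * z + d)

def imag_identity_gap(a, b, c, d, z):
    """Return |Im(alpha z) - Im(z)/|cz + d|^2|, which (10.1) says is zero."""
    lhs = mobius(a, b, c, d, z).imag
    rhs = z.imag / abs(c * z + d) ** 2
    return abs(lhs - rhs)

z = complex(0.3, 0.7)             # an arbitrary point of the upper half plane
samples = [
    (2, 1, 1, 1),                 # integer entries, determinant 1
    (1.5, 0.5, 1.0, 1.0),         # real entries, determinant 1
    (0, -1, 1, 0),                # z -> -1/z
]
for a, b, c, d in samples:
    assert abs(a * d - b * c - 1) < 1e-12
    print((a, b, c, d), imag_identity_gap(a, b, c, d, z), mobius(a, b, c, d, z).imag > 0)
```

The positivity of the imaginary part of each image illustrates the stability of $H$ noted above.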

Definition 10.2 The Modular Group

The group

$$\Gamma = PSL(2, \mathbb{Z}) = SL(2, \mathbb{Z})/\{\pm 1\}$$

is called the modular group.

Note that $\Gamma$ in Definition 10.2 is the image of $SL(2, \mathbb{Z})$ in $PSL(2, \mathbb{R})$. Moreover, the following describes properties of the modular group in detail.


Theorem 10.1 Generation of the Modular Group

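Let $\Gamma$ be the modular group given in Definition 10.2, and set

$$T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

Then $\Gamma$ is generated by $S$ and $T$. In other words, every $\alpha \in \Gamma$ may be expressed (not uniquely) in the following form

$$\alpha = T^{a_1} S T^{a_2} S \cdots S T^{a_n},$$

for integers $a_j$, $j = 1, 2, \ldots, n$.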

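Proof. Suppose that $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$. If $c < 0 \le |a|$, then

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = S^2 \begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}, \qquad (10.2)$$

so we may assume that $c \ge 0$, since the right-hand side of (10.2), with $-c \ge 0$, tells us that this case suffices. If $c = 0$, then

$$1 = ad - bc = ad,$$

so $a = d = \pm 1$. Hence,

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \pm 1 & b \\ 0 & \pm 1 \end{pmatrix} = \begin{pmatrix} 1 & \pm b \\ 0 & 1 \end{pmatrix} = T^{\pm b}.$$

Now we use induction on $c > 0$. If $c = 1$, then

$$1 = ad - bc = ad - b,$$

so $b = ad - 1$. Thus,

$$\alpha = \begin{pmatrix} a & ad - 1 \\ 1 & d \end{pmatrix} = \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix} = T^a S T^d.$$

So we may now assume that the result holds for all $\alpha \in \Gamma$ with lower left-hand element $< c$ for some $c > 1$. Since $ad - bc = 1$, we have $\gcd(c, d) = 1$, so by the division algorithm there exist $q, r \in \mathbb{Z}$ with $d = cq + r$ where $0 < r < c$ (here $r \ne 0$ since $\gcd(c, d) = 1$ and $c > 1$), with

$$\alpha T^{-q} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 & -q \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & -aq + b \\ c & r \end{pmatrix},$$

where we note that $a \ne 0$ since $c > 1$. Also,

$$\alpha T^{-q} S = \begin{pmatrix} a & -aq + b \\ c & r \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -aq + b & -a \\ r & -c \end{pmatrix}. \qquad (10.3)$$

The right-hand side of (10.3) is now available to the induction hypothesis since $r < c$, so this completes the induction. □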
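The induction in the proof is effectively an algorithm: repeatedly divide $d$ by $c$, multiply on the right by $T^{-q}S$, and stop when the lower left-hand entry reaches 0. The following is a minimal sketch of that procedure; the function names and the word representation (a list of `("T", k)` and `("S", 1)` factors) are choices made for this illustration. Since $S^2 = -1$ acts trivially, the word reproduces the input matrix only up to sign, which is all that is claimed in $PSL(2, \mathbb{Z})$.

```python
S = ((0, -1), (1, 0))

def t_pow(k):
    """The matrix T^k = [[1, k], [0, 1]]."""
    return ((1, k), (0, 1))

def mat_mul(X, Y):
    """Product of two 2x2 integer matrices."""
    return tuple(
        tuple(sum(X[i][t] * Y[t][j] for t in range(2)) for j in range(2))
        for i in range(2)
    )

def decompose(a, b, c, d):
    """Write +-[[a, b], [c, d]] (determinant 1) as a word in S and powers of T."""
    assert a * d - b * c == 1
    if c < 0:                      # replace the matrix by its negative, as in (10.2)
        a, b, c, d = -a, -b, -c, -d
    word = []                      # factors collected from right to left
    while c != 0:
        q, r = divmod(d, c)        # d = cq + r with 0 <= r < c
        word.append(("T", q))
        word.append(("S", 1))
        a, b, c, d = -a * q + b, -a, r, -c   # right-hand side of (10.3)
    word.append(("T", a * b))      # c = 0 forces a = d = +-1, a power of T up to sign
    word.reverse()
    return word

def evaluate(word):
    """Multiply out a word returned by decompose."""
    M = ((1, 0), (0, 1))
    for letter, k in word:
        M = mat_mul(M, t_pow(k) if letter == "T" else S)
    return M

word = decompose(7, 5, 4, 3)
print(word, evaluate(word))
```

Allowing $r = 0$ in the division (which happens only when $c = 1$) lets the loop absorb the separate case $c = 1$ of the proof.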


Remark 10.1 We have shown that $\Gamma$ has generators $S$ and $T$ with relations $(ST)^3 = (TS)^3 = 1$. One can show that $\Gamma$ is the product of the cyclic group of order 2 generated by $S$ and the cyclic group of order 3 generated by $ST$; see Exercise 10.4 on the next page. Indeed, $T$ and $S$ are matrix representations of the linear transformations

$$T : z \mapsto z + 1$$

and

$$S : z \mapsto -\frac{1}{z},$$

where clearly

$$S^2 = 1 \quad\text{and}\quad (ST)^3 = 1.$$

Thus, the argument to prove the above comment is essentially a topological argument that shows $\Gamma$ has a presentation of the form

$$\Gamma = \langle S, T ; S^2, (ST)^3 \rangle,$$

which is another way of stating that it is a free product of the cyclic groups mentioned above. Recall that a “presentation” of a group is defined to be a group $G$, generated by a subset $S$ and some collection of relations $R_1, R_2, \ldots, R_n$, where $R_j$ is an equation in the elements from $S \cup \{1\}$, and is denoted by

$$G = \langle S ; R_1, R_2, \ldots, R_n \rangle.$$

Also, a “free product” is a product of two or more groups $G$ and $H$ such that, given presentations of $G$ and of $H$, we take the generators of $G$ and of $H$, from the disjoint union of those, and adjoin the corresponding relations for $G$ and for $H$. This is a presentation of the product of $G$ and $H$, with the property that there should be no “interaction” between $G$ and $H$, justifying the term “free product.”

Also, there is a correspondence between positive definite binary quadratic forms and points of $H$ as follows. If

$$f(x, y) = ax^2 + bxy + cy^2$$

is a positive definite binary quadratic form, then

$$f(x, y) = a(x - \omega y)(x - \overline{\omega} y)$$

with $\omega \in H$. Hence, the association

$$f \mapsto \omega$$

is a one-to-one correspondence between the positive definite binary quadratic forms with fixed discriminant $D = b^2 - 4ac$ and the points of $H$. Moreover, two forms are equivalent if and only if the points lie in the same $SL(2, \mathbb{Z})$ orbit, where an orbit means the equivalence relation given in Definition 3.1 on page 98 for properly equivalent forms. As well, Theorem 10.1 on the previous page implies that every positive definite binary quadratic form is equivalent to a reduced form, and two reduced forms are equivalent if and only if they are equal; see Theorem 3.1 on page 100.


Exercises

10.1. Verify equality (10.1) on page 332.

10.2. Let $\Gamma$ be the modular group given in Definition 10.2 on page 332, and set

$$D = \{z \in \mathbb{C} : |z| \ge 1 \text{ and } |\Re(z)| \le 1/2\}.$$

Prove that for every $z \in H$, there exists a $\gamma \in \Gamma$ such that $\gamma z \in D$.
(Hint: Use Theorem 10.1 and Equation (10.1).)

10.3. With reference to Exercise 10.2, prove that if $z \in D$ and $\alpha \in \Gamma$, with $\alpha$ not the identity, such that $\alpha z \in D$, then either $|\Re(z)| = 1/2$ and $\alpha z = z \pm 1$, or else $|z| = 1$ and $\alpha z = -1/z$.
(Note that $D$ is called a fundamental domain for the action of $\Gamma$ on $H$, with the properties in Exercises 10.2–10.3 being the two main properties that a fundamental domain must satisfy. Typically, the approach to proving Theorem 10.1 is the use of facts concerning $D$. However, the more elementary approach provided herein is more constructive and informative. Exercises 10.2–10.4 are designed to provide information on fundamental domains for the edification of the reader, since we will be using these facts in §10.2.)

10.4. With reference to Exercise 10.2, prove that if $z \in D$, then $\alpha \in \Gamma$ satisfies $\alpha z = z$ if and only if one of the following holds, where $S$, $T$ are given in Theorem 10.1 on page 333.

(a) $\alpha$ is the identity.

(b) $z = \sqrt{-1}$, in which case $\alpha = S$.

(c) $z = \zeta_3^2 = ((-1 + \sqrt{-3})/2)^2$, in which case $\alpha = (ST)^j$ for $j \in \{1, 2\}$.

(d) $z = \zeta_3$, in which case $\alpha = (TS)^j$ for $j \in \{1, 2\}$.


10.2 Modular Forms and Functions

The Answer to the Great Question of. . . Life, the Universe, and Everything. . . is forty-two.

from Chapter 27 of The Hitchhiker’s Guide to the Galaxy (1979)
Douglas Adams (1951–2001)
English science fiction writer

We now build upon the modular group $\Gamma$ introduced in §10.1 by presenting and studying forms related to it. The reader will need to have solved Exercises 10.2–10.4 before proceeding.

Definition 10.3 Modular Forms and Functions

A function $f(z)$ defined for $z \in H$ is called a modular function of weight $k \in \mathbb{Z}$ associated with the modular group $\Gamma$ if the following properties hold.

(a) $f$ is analytic in $H$.

(b) $f$ satisfies the functional equation:

$$f(z) = (cz + d)^{-k} f\left(\frac{az + b}{cz + d}\right) = (cz + d)^{-k} f(\gamma z),$$

with $z \in H$ and $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$.

(c) The Fourier series of $f$ in the variable $q = \exp(2\pi i z)$ is given by:

$$f(z) = \sum_{n = n_0(f)}^{\infty} c_n q^n, \qquad (10.4)$$

where $n_0(f) \in \mathbb{Z}$; see §5.1.

A modular function of weight $k$ is called a modular form of weight $k$ if, in addition, $n_0(f) = 0$. In this case, we say that $f$ is analytic at $\infty$ and write $f(\infty) = c_0$. In the case where $f(\infty) = c_0 = 0$, we say that $f$ is a cusp form.

In the literature modular functions of weight $k$ are sometimes called weakly modular functions of weight $k$ or unrestricted modular forms of weight $k$. However, the definition of modular form or cusp form of weight $k$ appears to be uniform. Sometimes the cusp form is called a parabolic form.

Remark 10.2 If $\gamma = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ in Definition 10.3, then $\gamma z = z$ for all $z \in H$. Therefore, if $f$ is a modular form of weight $k = 2m + 1$ for $m \in \mathbb{Z}$, then

$$f(z) = (-1)^{-k} f(\gamma z) = -f(z),$$

so if $f(z) \ne 0$, then dividing through the equation by $f(z)$, we get $1 = -1$, a contradiction. Thus, $f$ is just the zero map, sometimes called identically zero. Hence, a nontrivial modular form on $\Gamma$ must necessarily be of even weight. Also, by taking $\gamma = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = T$ in Definition 10.3, we obtain that

$$f(z + 1) = f(z), \qquad (10.5)$$

namely $f$ is invariant under the transformation $z \mapsto z + 1$. This is what allows us to expand $f$ into the expansion (10.4), which is called the $q$-expansion of $f$. (If we went into the details, we could invoke the Cauchy integral theorem using (10.5) to show symmetry in a certain line integral on $f(z)\exp(-2\pi i z)$, and the interested reader with knowledge of this area can derive the $q$-expansion in this fashion.) Note that if $z = x + yi$, then $q \to 0$ as $y \to \infty$. Thus the $q$-expansion (10.4) may be considered as an expansion about $z = \infty$, which justifies the reference to $f$ being called holomorphic at $\infty$. The condition above for a cusp form tells us, therefore, that $f$ vanishes as $y \to \infty$.

Example 10.1 The Eisenstein series of weight $k \ge 2$ are defined by the infinite series

$$G_{2k}(z) = \sum_{m, n \in \mathbb{Z} - (0,0)} (nz + m)^{-2k}, \quad\text{for } \Im(z) > 0, \qquad (10.6)$$

where the notation $m, n \in \mathbb{Z} - (0, 0)$ means that $m$ and $n$ run over all integers except that $m = n = 0$ is not allowed. The Eisenstein series of even weight are the first nontrivial examples of modular forms on $\Gamma$. Indeed, the following, which establishes this fact, is of interest from the viewpoint of arithmetic functions studied in Chapter 5.

Theorem 10.2 Eisenstein Series as Modular Forms

For $q = \exp(2\pi i z)$ and $\Im(z) > 0$, the Eisenstein series given in (10.6) has Fourier expansion given by

$$G_{2k}(z) = 2\zeta(2k) + \frac{2(2\pi i)^{2k}}{(2k - 1)!} \sum_{n=1}^{\infty} \sigma_{2k-1}(n) q^n,$$

where $k \ge 2$, $\zeta(s)$ is the Riemann $\zeta$-function, and $\sigma_a(n) = \sum_{d \mid n} d^a$ is a sum of $a$-th powers of positive divisors of $n$. Accordingly, $G_{2k}(z)$ is a modular form of weight $2k$.

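Proof. We know from elementary calculus that the following identity holds

$$\pi \cot(\pi z) = \frac{1}{z} + \sum_{m=1}^{\infty} \left(\frac{1}{z + m} + \frac{1}{z - m}\right); \qquad (10.7)$$

see [101, p. 344]. For $\Im(z) > 0$ (so $|q| < 1$) we get

$$\pi \cot(\pi z) = \pi \frac{\cos(\pi z)}{\sin(\pi z)} = i\pi \frac{q + 1}{q - 1} = i\pi - \frac{2\pi i}{1 - q} = i\pi - 2\pi i \sum_{c=0}^{\infty} q^c, \qquad (10.8)$$

where the second equality comes from the fact that

$$\cot(\pi z) = \frac{i(e^{2\pi i z} + 1)}{e^{2\pi i z} - 1}, \qquad (10.9)$$

and the last equality follows from the standard geometric formula

$$\lim_{N \to \infty} \sum_{c=0}^{N} q^c = \lim_{N \to \infty} \frac{q^{N+1} - 1}{q - 1} = \frac{1}{1 - q},$$

where the last equality follows from the fact that $|q| < 1$; see [68, Theorem 1.2, p. 2]. Therefore, (10.7)–(10.8) imply that

$$\frac{1}{z} + \sum_{m=1}^{\infty} \left(\frac{1}{z + m} + \frac{1}{z - m}\right) = i\pi - 2\pi i \sum_{c=0}^{\infty} q^c. \qquad (10.10)$$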

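Now differentiating (10.10) $2k - 1$ times with respect to $z$ we get

$$(-1)^{2k-1}(2k - 1)!\, z^{-2k} + (-1)^{2k-1}(2k - 1)! \sum_{m=1}^{\infty} \left(\frac{1}{(z + m)^{2k}} + \frac{1}{(z - m)^{2k}}\right) = -(2\pi i)^{2k} \sum_{c=1}^{\infty} c^{2k-1} q^c,$$

which implies that

$$z^{-2k} + \sum_{m=1}^{\infty} \left(\frac{1}{(z + m)^{2k}} + \frac{1}{(z - m)^{2k}}\right) = \frac{(2\pi i)^{2k}}{(2k - 1)!} \sum_{c=1}^{\infty} c^{2k-1} q^c,$$

so

$$\sum_{m=-\infty}^{\infty} \frac{1}{(z + m)^{2k}} = \frac{(2\pi i)^{2k}}{(2k - 1)!} \sum_{c=1}^{\infty} c^{2k-1} q^c. \qquad (10.11)$$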

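However, since

$$G_{2k}(z) = \sum_{m, n \in \mathbb{Z} - (0,0)} (nz + m)^{-2k} = \sum_{m \ne 0} \frac{1}{m^{2k}} + \sum_{n \ne 0} \sum_{m=-\infty}^{\infty} \frac{1}{(nz + m)^{2k}}, \qquad (10.12)$$

and we know from (5.26) on page 218 that

$$\sum_{m=-\infty}^{-1} \frac{1}{m^{2k}} = \sum_{m=1}^{\infty} \frac{1}{m^{2k}} = \zeta(2k),$$

as well as the fact that the sum over nonzero values of $n$ is twice the sum over positive values of $n$ in the second summand of (10.12), then

$$G_{2k}(z) = 2\zeta(2k) + 2 \sum_{n=1}^{\infty} \sum_{m=-\infty}^{\infty} \frac{1}{(m + nz)^{2k}}. \qquad (10.13)$$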

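Hence, by replacing $z$ by $nz$ in (10.11) and applying it to the last summand in (10.13), we achieve that

$$G_{2k}(z) = 2\zeta(2k) + \frac{2(2\pi i)^{2k}}{(2k - 1)!} \sum_{c=1}^{\infty} \sum_{a=1}^{\infty} c^{2k-1} q^{ac} = 2\zeta(2k) + \frac{2(2\pi i)^{2k}}{(2k - 1)!} \sum_{n=1}^{\infty} \sigma_{2k-1}(n) q^n.$$

For the last statement, we note that it follows that

$$G_{2k}(\gamma z) = (cz + d)^{2k} G_{2k}(z),$$

for

$$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma,$$

so $G_{2k}(z)$ is a modular form of weight $2k$. □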

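Corollary 10.1 $G_{2k}(\infty) = 2\zeta(2k)$.

Proof. We have

$$\lim_{z \to \infty} G_{2k}(z) = 2\zeta(2k) + \frac{2(2\pi i)^{2k}}{(2k - 1)!} \sum_{n=1}^{\infty} \sigma_{2k-1}(n) \lim_{z \to \infty} q^n,$$

but by Remark 10.2 on page 336, $\lim_{z \to \infty} q = 0$, which is the result. □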

Example 10.2 From Theorem 10.2 on page 337, we get

$$G_{2k}(z) = 2\zeta(2k) E_{2k}(z),$$

with

$$E_{2k}(z) = \frac{G_{2k}(z)}{2\zeta(2k)} = 1 + \alpha_k \sum_{n=1}^{\infty} \sigma_{2k-1}(n) q^n$$

where, via (5.4)–(5.5) on pages 197–198,

$$\alpha_k = (-1)^k \frac{4k}{|B_{2k}|},$$

and $B_k$ is the $k$-th Bernoulli number given in Definition 5.1 on page 192. The modular form $E_{2k}$ is called the weight $k$ Eisenstein series, which is not a cusp form.


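Thus, for $k = 2$,

$$E_4(z) = 1 + 240 \sum_{n=1}^{\infty} \sigma_3(n) q^n,$$

and for $k = 3$,

$$E_6(z) = 1 - 504 \sum_{n=1}^{\infty} \sigma_5(n) q^n.$$

A few more examples are for $k = 4$,

$$E_8(z) = 1 + 480 \sum_{n=1}^{\infty} \sigma_7(n) q^n,$$

for $k = 5$,

$$E_{10}(z) = 1 - 264 \sum_{n=1}^{\infty} \sigma_9(n) q^n,$$

and for $k = 6$,

$$E_{12}(z) = 1 + \frac{65520}{691} \sum_{n=1}^{\infty} \sigma_{11}(n) q^n.$$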
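The five leading coefficients above can be recomputed from the formula $(-1)^k 4k/|B_{2k}|$ with exact rational arithmetic. The following is a minimal sketch; it generates the Bernoulli numbers from the standard recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ for $m \ge 1$ with $B_0 = 1$ (assumed here to agree with Definition 5.1 on the even-indexed values that matter), and the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] from sum_{j<=m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        B.append(-Fraction(1, m + 1) * sum(comb(m + 1, j) * B[j] for j in range(m)))
    return B

def eisenstein_coefficient(k):
    """The constant multiplying sum sigma_{2k-1}(n) q^n in E_{2k}, for k >= 2."""
    B = bernoulli_numbers(2 * k)
    return (-1) ** k * Fraction(4 * k) / abs(B[2 * k])

for k in range(2, 7):
    print(2 * k, eisenstein_coefficient(k))
```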

Remark 10.3 The first two cases in Example 10.2 motivate a basic notion which we now develop. The weight $k$ Eisenstein series are foundational elements for the development of all modular forms in the sense that any modular form can be expressed as a polynomial in $E_4$ and $E_6$. For instance, $|\mathbb{C} : M_8(\Gamma)| = 1$, by Remark 10.4, so $M_8$ is a one-dimensional space spanned by $E_8$. Moreover, $E_4^2$ has weight 8 and constant term 1 by Example 10.2, so $E_4^2 = E_8$; see Exercise 10.15 on page 346, as well as more information in Example 10.4 on page 342.
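The identity $E_4^2 = E_8$ can be checked coefficient by coefficient on truncated $q$-expansions. The following is a minimal sketch, assuming only the expansions of $E_4$ and $E_8$ from Example 10.2; the truncation order 10 and the function names are arbitrary choices for this illustration. Agreement of finitely many coefficients is of course only evidence, not a proof.

```python
def sigma(a, n):
    """Sum of the a-th powers of the positive divisors of n."""
    return sum(d ** a for d in range(1, n + 1) if n % d == 0)

def eisenstein_expansion(coefficient, a, order):
    """Coefficients of 1 + coefficient * sum_{n>=1} sigma_a(n) q^n up to q^order."""
    return [1] + [coefficient * sigma(a, n) for n in range(1, order + 1)]

def truncated_product(f, g):
    """Multiply two power series given by coefficient lists of equal length."""
    size = len(f)
    h = [0] * size
    for i in range(size):
        for j in range(size - i):
            h[i + j] += f[i] * g[j]
    return h

ORDER = 10
E4 = eisenstein_expansion(240, 3, ORDER)
E8 = eisenstein_expansion(480, 7, ORDER)
print(truncated_product(E4, E4) == E8)
```

Comparing the coefficients of $q^n$ on both sides gives the classical relation $\sigma_7(n) = \sigma_3(n) + 120 \sum_{m=1}^{n-1} \sigma_3(m)\sigma_3(n - m)$.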

First we let

$$g_2 = 60 G_4 \quad\text{and}\quad g_3 = 140 G_6,$$

where the need for the coefficients will become clear when we link modular forms to elliptic curves in §10.3, as will the contents of the following.

Definition 10.4 Modular Discriminant Function and j-Invariant

The function $\Delta : H \to \mathbb{C}$ given by

$$\Delta = g_2^3 - 27 g_3^2$$

is called the discriminant function, and the $j$-invariant is given by

$$j(\Delta) = \frac{1728 g_2^3}{\Delta}.$$


Example 10.3 The discriminant function given in Definition 10.4 was proved by Jacobi to be of the form

$$\Delta(q) = (2\pi)^{12} q \prod_{n=1}^{\infty} (1 - q^n)^{24},$$

with $q \in \mathbb{C}$ with $|q| < 1$; see Exercise 10.16 on page 346. Indeed, the $n$-th coefficients of the cusp form $F(z) = (2\pi)^{-12} \Delta(z)$ are values of $\tau(n)$, the distinguished Ramanujan’s $\tau$-function:

$$\sum_{n=1}^{\infty} \tau(n) q^n = q \prod_{n=1}^{\infty} (1 - q^n)^{24}$$

where $\tau : \mathbb{N} \to \mathbb{Z}$. Note that since $g_2(\infty) = 120\zeta(4)$ and $g_3(\infty) = 280\zeta(6)$, then using Exercise 10.14 on page 346,

$$g_2(\infty) = \frac{4\pi^4}{3}, \quad\text{and}\quad g_3(\infty) = \frac{8\pi^6}{27}.$$

Thus,

$$\Delta(\infty) = \left(\frac{4\pi^4}{3}\right)^3 - 27 \left(\frac{8\pi^6}{27}\right)^2 = 0,$$

which means that $\Delta$ is a cusp form and by Exercise 10.16, it is of weight 12.

Another formula for the discriminant function that lends itself more readily to computations than that given above is in terms of the Dedekind $\eta$-function defined by:

$$\eta(z) = q^{1/24} \prod_{n=1}^{\infty} (1 - q^n),$$

where $q = \exp(2\pi i z)$ and $q^{1/24} = \exp(\pi i z/12)$. Thus,

$$\Delta(z) = (2\pi)^{12} \eta(z)^{24},$$

where by Exercise 10.18,

$$\eta(z + 1) = \exp(\pi i/12)\,\eta(z) \quad\text{and}\quad \eta(-z^{-1}) = (-iz)^{1/2} \eta(z), \qquad (10.14)$$

where the branch of the square root is chosen to be positive on the imaginary axis. Also, by Exercise 10.17, the $j$-invariant is a modular function of weight 0, namely a modular function, which has $q$-expansion given by

$$j(z) = \frac{1}{q} + 744 + \sum_{n=1}^{\infty} c_n q^n,$$

where $z \in H$ and $q = \exp(2\pi i z)$. It can be shown that

$$j : H/\Gamma \to \mathbb{C}$$

is an isomorphism (of Riemann surfaces) and that any modular function of weight 0 must be a rational function of $j$; see [87, Propositions 5–6, p. 89].
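The product formula for Ramanujan's $\tau$-function in Example 10.3 lends itself to direct computation. The following is a minimal sketch that expands $q\prod_{n \ge 1}(1 - q^n)^{24}$ as a truncated power series with integer coefficients; the function name `tau_values` and the in-place multiplication scheme are choices made for this illustration.

```python
def tau_values(count):
    """Return [tau(1), ..., tau(count)] from q * prod_{n>=1} (1 - q^n)^24."""
    # coeffs[i] is the coefficient of q^i in prod (1 - q^n)^24, so tau(i + 1) = coeffs[i].
    coeffs = [0] * count
    coeffs[0] = 1
    for n in range(1, count):           # factors with n >= count cannot affect degrees < count
        for _ in range(24):             # multiply by (1 - q^n) twenty-four times
            for i in range(count - 1, n - 1, -1):
                coeffs[i] -= coeffs[i - n]
    return coeffs

print(tau_values(6))
```

Running the inner loop from high degree to low degree lets each multiplication by $(1 - q^n)$ be done in place, since the entry at index $i - n$ is still the old value when index $i$ is updated.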


We now look at spaces of forms and how they fit into the picture we have been painting.

Definition 10.5 Space of Modular Forms

The set of modular forms of weight $k$ on $\Gamma$ forms a complex vector space denoted by $M_k(\Gamma)$. The subspace of cusp forms is denoted by $M_k^0(\Gamma)$.

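Remark 10.4 It can be shown that the following dimensions hold; see [87].

$$|\mathbb{C} : M_k(\Gamma)| = \begin{cases} \lfloor k/12 \rfloor + 1 & \text{if } k \not\equiv 2 \pmod{12}, \\ \lfloor k/12 \rfloor & \text{if } k \equiv 2 \pmod{12}. \end{cases}$$

Also,

$$|\mathbb{C} : M_k^0(\Gamma)| = \begin{cases} \lfloor k/12 \rfloor & \text{if } k \not\equiv 2 \pmod{12}, \\ \lfloor k/12 \rfloor - 1 & \text{if } k \equiv 2 \pmod{12}. \end{cases}$$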
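The two case distinctions in Remark 10.4 translate directly into code. The following is a minimal sketch; restricting to even weights $k \ge 4$ is an assumption of this illustration (odd weights give only the zero form by Remark 10.2), and the function names are hypothetical.

```python
def dim_modular_forms(k):
    """Dimension of M_k(Gamma) per Remark 10.4, for even k >= 4 (assumed range)."""
    if k < 4 or k % 2:
        raise ValueError("this sketch only handles even weights k >= 4")
    return k // 12 if k % 12 == 2 else k // 12 + 1

def dim_cusp_forms(k):
    """Dimension of the cusp form subspace: one less than that of M_k(Gamma)."""
    return dim_modular_forms(k) - 1

for k in (4, 6, 8, 10, 12, 14, 24):
    print(k, dim_modular_forms(k), dim_cusp_forms(k))
```

For example, the values at $k = 8$ and $k = 14$ are the one-dimensionality statements used for $E_8$ in Remark 10.3 and for $E_{14}$ in Example 10.4.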

Example 10.4 With reference to Theorem 10.2 and Remark 10.4, for $k = 14$,

$$M_{14} = \mathbb{C} E_{14}.$$

Moreover, in terms of Eisenstein series and cusp forms we have the following direct sum for even $k \ge 4$ (see [68, p. 305]),

$$M_k = \mathbb{C} E_k \oplus M_k^0.$$

Observe, by Remark 10.4, that $M_{14}^0(\Gamma) = 0$. Further, with reference to Remark 10.3 on page 340, it may be shown that the space $M_k$ has for basis the family of monomials $G_2^{\alpha} G_3^{\beta}$ for all nonnegative integers $\alpha, \beta$ with $2\alpha + 3\beta = k$; see [87, Corollary 2, p. 89]. Moreover, it can be shown that multiplication by the discriminant function $\Delta$ defines an isomorphism of $M_{k-12}$ onto $M_k^0$, which is equivalent to the following. If $M = \bigoplus_{k=0}^{\infty} M_k$, called a graded algebra, the direct sum of the $M_k$, and $h : \mathbb{C}[x, y] \to M$ is the homomorphism sending $x$ to $G_2$ and $y$ to $G_3$, then $h$ is an isomorphism; see [87, Theorem 4, p. 88ff].

Remark 10.5 In the area of algebraic geometry (see footnote 10.1), most of the interesting entities come into view when we look at arithmetically defined subgroups of finite index in $\Gamma$. One such class of groups is called Hecke congruence subgroups denoted by $\Gamma_0(n)$ for any $n \in \mathbb{N}$, defined by

$$\Gamma_0(n) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma : c \equiv 0 \pmod n \right\}.$$

10.1Algebraic geometry is a branch of mathematics combining methods in use in abstractalgebra, especially commutative algebra, with the language of geometry. It has interconnec-tions with complex analysis, topology, and number theory. At its most basic level, algebraicgeometry deals with algebraic varieties, which are geometric manifestations of solutions ofpolynomial equations. For instance, plane algebraic curves, which include circles and parabo-las for instance, comprise one of the most investigated classes of algebraic varieties.


It is known that the index of $\Gamma_0(n)$ in $\Gamma$ is given by

$$|\Gamma : \Gamma_0(n)| = n \prod_{\substack{p \mid n \\ p \text{ prime}}} \left(1 + \frac{1}{p}\right),$$

the product over distinct primes dividing n. See Exercises 10.6–10.8 on page 344 for applications of this fact.
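The index formula is easy to evaluate by machine. The following Python sketch uses trial division to find the distinct primes dividing n; the helper names `prime_divisors` and `index_gamma0` are illustrative choices, not notation from the text.

```python
def prime_divisors(n: int) -> list[int]:
    """Distinct primes dividing n, found by trial division."""
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def index_gamma0(n: int) -> int:
    """Index of Gamma_0(n) in Gamma: n times the product of (1 + 1/p) over primes p | n."""
    index = n
    for p in prime_divisors(n):
        # multiply by (p + 1)/p, keeping everything in exact integer arithmetic
        index = index // p * (p + 1)
    return index


if __name__ == "__main__":
    for n in (2, 8, 11, 12):
        print(n, index_gamma0(n))
```

For a prime p the index is p + 1, and for a prime power $p^a$ it is $p^a + p^{a-1}$, which matches the number of cosets sought in Exercises 10.6 and 10.7.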

An example of a modular form related to $\Gamma_0(n)$ is given by

$$f(z) = \eta(z)^2 \eta(11z)^2, \tag{10.15}$$

which is a cusp form of weight 2 related to the group $\Gamma_0(11)$. Here $\eta$ is the Dedekind $\eta$-function introduced in Example 10.3 on page 341.

Hecke groups defined in Remark 10.5 allow us to add another “level” to the notion of a modular form.

Definition 10.6 Levels of Modular Forms

If f is an analytic function on $\mathfrak{H}$ with

$$f(\gamma z) = (cz + d)^k f(z) \quad \text{for all } \gamma \in \Gamma_0(n),$$

and has a q-expansion

$$f(z) = \sum_{j=n_0(f)}^{\infty} a_j(f) q^j \quad \text{where } q = \exp(2\pi i z) \text{ with } n_0(f) \in \mathbb{Z}, \tag{10.16}$$

then f is called a modular function of weight k and level n. A modular function of weight k and level n is called a modular form of weight k and level n if $n_0(f) = 0$. Moreover, if $a_0(f) = 0$, we call f a cusp form of weight k and level n. When $a_1(f) = 1$ and $a_0(f) = 0$, we say that f is a normalized cusp form of weight k and level n.

Spaces of modular and cusp forms of weight k and level n are denoted by $M_k(\Gamma_0(n))$, respectively $S_k(\Gamma_0(n))$.

Example 10.5 It can be shown that $S_2(\Gamma_0(11))$ is a one-dimensional space spanned by the function in Equation (10.15) (see [88, Remark 12.17, p. 351]). This example will have significant implications for a celebrated conjecture (see Example 10.9 on page 360). Also, $S_2(\Gamma_0(2))$ is the zero space and this too will have implications for the proof of FLT (see Theorem 10.4 on page 365).

Note that Definition 10.1 on page 332 and Exercises 10.6–10.8 tell us that (10.16) implies that a modular function of weight k and level n is holomorphic at the cusps.

In §10.4, we will see that, roughly speaking, all rational elliptic curves arise from modular functions of a certain level and weight, and explore the interconnections, including critical implications for the proof of Fermat’s Last Theorem. We begin in §10.3 with linking elliptic curves and modular forms.


Biography 10.1 Erich Hecke (1887–1947) was born in Buk, Posen, Germany (now Poznań, Poland) on September 20. His university studies took him to the University of Breslau, the University of Berlin, where he studied under Landau, and finally Göttingen, where Hilbert was his supervisor (see Biographies 3.1 on page 104 and 3.5 on page 127). In 1910, he was awarded his doctorate, and remained at Göttingen as assistant to Hilbert and Klein. After a brief stint at Basel, he returned to a chair of mathematics at Göttingen, but left again, this time for a chair at Hamburg in 1919. One of the reasons for leaving was that the university at Hamburg was founded in that year and he felt he could influence its development. Indeed he did and remained there for the rest of his professional life.

Hecke is probably best remembered for his work in analytic number theory, where he proved results that simplified theorems in class field theory, a branch of algebraic number theory that deals with abelian extensions of number fields, namely those with an abelian Galois group (see [64]). He studied Riemann’s ζ-function and its generalization to any number field. He also introduced the concept of a Grössencharakter and its corresponding L-series. He then used the properties of analytic continuation he had proved for the ζ-function and extended them to his L-series. One of his most renowned results was achieved in 1936 when he introduced the algebra of what we now call Hecke operators and the Euler products associated with them.

Hecke died of cancer in Copenhagen, Denmark on February 13, 1947 in his fifty-ninth year.

Exercises

10.5. Let f be a function that is analytic on $\mathfrak{H}$. Prove that condition (b) of Definition 10.3 on page 336 is equivalent to the conditions

(1) For all $z \in \mathfrak{H}$, $f(z + 1) = f(z)$.
(2) For all $z \in \mathfrak{H}$, and some $k \in \mathbb{Z}$, $f(-1/z) = (-z)^k f(z)$.

(Hint: Prove that conditions (1)–(2) imply that the subset of $\Gamma$ generated by the elements for which (b) holds is a subgroup of $\Gamma$. Consequently, this subgroup must be all of $\Gamma$ since S and T are in this subgroup. Do this by defining

$$d(\gamma, z) = cz + d,$$

for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$. Then prove that $d(\alpha\gamma, z) = d(\alpha, \gamma z)\, d(\gamma, z)$ and $d(\gamma^{-1}, z) = (d(\gamma, \gamma^{-1}z))^{-1}$ for all $\alpha, \gamma \in \Gamma$, and $z \in \mathfrak{H}$. The converse is straightforward given Remark 10.2 on page 336.)

10.6. In Remark 10.5 on page 343, the index of the congruence subgroup $\Gamma_0(n)$ in $\Gamma$ was given. If n = p is a prime, find left coset representatives $\gamma_j$ for j = 0, 1, 2, . . . , p such that

$$\Gamma = \bigcup_{j=0}^{p} \gamma_j \Gamma_0(p).$$

10.7. With reference to Exercise 10.6, find coset representatives $\gamma_j$ for

$$\Gamma = \bigcup_{j=0}^{p^a + p^{a-1}} \gamma_j \Gamma_0(p^a),$$

where p is prime and a > 1.

10.8. With reference to Exercise 10.2 on page 335 and Remark 10.5, let

$$\Gamma = \bigcup_{j=0}^{n} \gamma_j \Gamma_0(n) \tag{10.17}$$

be a left coset decomposition of $\Gamma_0(n)$ in $\Gamma$. Then

$$D_n = \bigcup_{j=0}^{n} \gamma_j D$$

is a fundamental domain for $\Gamma_0(n)$, where D is a fundamental domain for $\Gamma$. Find the decomposition for $D_2$.

10.9. Let $\Gamma$ have a decomposition as in (10.17) above. Prove that every $\gamma_j(\infty)$ represents a cusp as given in Definition 10.1 on page 332.

10.10. With reference to Exercise 10.9, prove that if $i \ne j$ and $b \in \mathbb{Z}$, then

$$\gamma_j \gamma_i^{-1} = \pm \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$$

implies that both $\gamma_i^{-1}(\infty)$ and $\gamma_j^{-1}(\infty)$ represent the same cusp, namely for some $\alpha \in \Gamma_0(n)$, we have that $\gamma_j^{-1}(\infty) = \gamma_i^{-1}\alpha(\infty)$. Apply the condition to the case in Exercise 10.6.

10.11. Is the condition in Exercise 10.10 necessary?
(Hint: Look at the case n = 8 in Exercise 10.7.)

10.12. Prove that the function f, defined by $f(x) = \Gamma(x)\Gamma(1 - x)\sin(\pi x)$, satisfies $f(x) = f(x + 1)$, where the Gamma function is given in Definition 5.6 on page 224.
(Hint: Use Formula (5.34) on page 224.)

10.13. Prove that

$$\sin x = x \prod_{j=1}^{\infty} \left(1 - \frac{x^2}{j^2\pi^2}\right).$$

(Hint: Use Exercise 10.12. Also, you may use the formula

$$\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin(\pi x)} \tag{10.18}$$

(see [101, Formula (25), p. 697]), as well as the Weierstrass product formula for the Gamma function,

$$\Gamma(x) = e^{-\gamma x}\, \frac{1}{x} \prod_{j=1}^{\infty} \frac{e^{x/j}}{1 + x/j}, \tag{10.19}$$

where $\gamma$ is Euler’s constant given by (4.13) on page 172; see Biography 4.6 on page 179.)

10.14. Establish (5.5) on page 198.
(Hint: Use Exercise 10.13 by differentiating and compare the result with the formula

$$z \cot z = 1 - \sum_{n=1}^{\infty} \frac{B_n 2^{2n} z^{2n}}{(2n)!},$$

which follows from Definition 5.1 on page 192 by putting x = 2iz.)

10.15. With reference to Remark 10.3 on page 340, prove that

$$\sigma_7(n) = \sigma_3(n) + 120 \sum_{j=1}^{n-1} \sigma_3(j)\,\sigma_3(n - j).$$

10.16. Prove that $\Delta$ given in Example 10.3 on page 341 is a modular form of weight 12, namely that:

$$\Delta(q) = (2\pi)^{12}\, q \prod_{n=1}^{\infty} (1 - q^n)^{24}.$$

10.17. Prove that the j-invariant of Definition 10.4 on page 340 is a modular function of weight 0 with q-expansion

$$j(z) = \frac{1}{q} + 744 + \sum_{n=1}^{\infty} c_n q^n,$$

where $z \in \mathfrak{H}$ and $q = \exp(2\pi i z)$.

10.18. Establish (10.14) on page 341.


10.3 Applications to Elliptic Curves

I believe that if mathematicians on any other planet, anywhere in the universe, have sufficiently advanced knowledge of arithmetic and geometry, they will know the Pythagorean theorem, that pi is 3.14+, and that 113 is prime. Of course, they will express these truths in their own language and symbols. Within formal systems, mathematical theorems, unlike a culture’s folkways and mores, and even its laws of science, are absolutely certain and eternal.

see [22, pp. 274–275]
Martin Gardner (1914–)
American science writer specializing in recreational mathematics

In this section, we apply the knowledge gained in Chapter 9 and in §10.1–10.2 to elliptic curves to show the wealth of results emanating from our journey. We begin with a link between elliptic curves and modular functions.

Definition 10.7 Elliptic Modular Functions

If f is a function analytic on $\mathbb{C}$ such that for $n \in \mathbb{N}$ and $z \in \mathbb{C}$,

$$f(\gamma z) = f(z) \quad \text{for all } \gamma \in \Gamma(n),$$

then f is called an elliptic modular function, where

$$\Gamma(n) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma : b \equiv c \equiv 0 \pmod{n} \right\}$$

is called the principal congruence subgroup of $\Gamma$.

Note that $\Gamma(n) \subseteq \Gamma_0(n) \subseteq \Gamma$.

In general, any analytic function that is invariant under a group of linear transformations is called an automorphic function. The classic elliptic modular function has already been encountered in §10.2.

Example 10.6 The j-invariant

$$j(z) = \frac{1728\, g_2^3}{\Delta} = \frac{1}{q} + 744 + \sum_{n=1}^{\infty} c_n q^n,$$

where $z \in \mathfrak{H}$ and $q = \exp(2\pi i z)$, is an elliptic modular function.

The j-invariant is linked to elliptic curves in a natural way as follows.


Definition 10.8 Weierstrass Equations for Elliptic Curves

If F is a field of characteristic different from 2 or 3 and E(F) is an elliptic curve over F, then

$$y^2 = 4x^3 - g_2 x - g_3,$$

where $g_2, g_3 \in F$ and $\Delta = g_2^3 - 27g_3^2 \ne 0$, is called the Weierstrass equation for E.

In order to give our first example of Weierstrass equations, we need the following concept. We encountered real lattices in Definition 4.4 on page 182. We now look at a complex version. Recall for the following that, in general, a singularity of a complex function is a point at which the function is not defined. Also, an isolated singularity $z_0$ is one for which there are no other singularities of the function “close” to it, which means that there is an open disk

$$D = \{z \in \mathbb{C} : |z - z_0| < r\}, \quad r \in \mathbb{R}^+,$$

such that f is holomorphic on $D - \{z_0\}$.

Definition 10.9 Lattices in C and Elliptic Functions

A lattice in $\mathbb{C}$ is an additive subgroup of $\mathbb{C}$ which is generated by two complex numbers $\omega_1$ and $\omega_2$ that are linearly independent over $\mathbb{R}$, denoted by

$$L = [\omega_1, \omega_2].$$

Then an elliptic function for L is a function f defined on $\mathbb{C}$, except for isolated singularities, satisfying the following two conditions:

(a) f(z) is meromorphic on $\mathbb{C}$.

(b) $f(z + \omega) = f(z)$ for all $\omega \in L$.

Remark 10.6 Condition (b) in Definition 10.9 is equivalent to

$$f(z + \omega_1) = f(z + \omega_2) = f(z),$$

for all z, a property known as doubly periodic. Hence, an elliptic function for a lattice L is a doubly periodic meromorphic function and the elements of L are called periods.


Definition 10.10 Lattice Discriminant and Invariant

The j-invariant of a lattice L is the complex number

$$j(L) = \frac{1728\, g_2(L)^3}{g_2(L)^3 - 27 g_3(L)^2}, \tag{10.20}$$

where

$$g_2(L) = 60 \sum_{w \in L - \{0\}} \frac{1}{w^4},$$

and

$$g_3(L) = 140 \sum_{w \in L - \{0\}} \frac{1}{w^6}.$$

The discriminant of a lattice L is given by

$$\Delta(L) = g_2(L)^3 - 27 g_3(L)^2.$$
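The sums in Definition 10.10 converge absolutely, so they can be approximated by truncating the lattice. The following Python sketch does this for a lattice $[\omega_1, \omega_2]$ by summing over a square box of index pairs; the function name `lattice_invariants` and the cutoff parameter `bound` are assumptions made for illustration, and the results are floating-point approximations only.

```python
def lattice_invariants(w1: complex, w2: complex, bound: int = 60):
    """Approximate g2, g3, the discriminant and j for the lattice [w1, w2].

    The lattice sums are truncated to |m|, |n| <= bound, so the values
    returned are numerical approximations, not exact invariants.
    """
    sum4 = 0j
    sum6 = 0j
    for m in range(-bound, bound + 1):
        for n in range(-bound, bound + 1):
            if m == 0 and n == 0:
                continue  # the origin is excluded from the sums
            w = complex(m * w1 + n * w2)
            sum4 += w ** -4
            sum6 += w ** -6
    g2 = 60 * sum4
    g3 = 140 * sum6
    disc = g2 ** 3 - 27 * g3 ** 2
    j = 1728 * g2 ** 3 / disc
    return g2, g3, disc, j


if __name__ == "__main__":
    # Square lattice [1, i]: multiplication by i maps the lattice to itself,
    # which forces g3 = 0 and hence j = 1728.
    g2, g3, disc, j = lattice_invariants(1, 1j)
    print(g2, g3, disc, j)
```

The square lattice is a convenient check because the symmetry argument in the comment gives the exact value j = 1728 independently of the truncation.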

One of the most celebrated of elliptic functions is the following.

Definition 10.11 Weierstrass $-Functions

Given $z \in \mathbb{C}$ such that $z \notin L = [\omega_1, \omega_2]$, the function

$$\wp(z; L) = \frac{1}{z^2} + \sum_{\omega \in L - \{0\}} \left( \frac{1}{(z - \omega)^2} - \frac{1}{\omega^2} \right) \tag{10.21}$$

is called the Weierstrass $\wp$-function for the lattice L.

Remark 10.7 The Weierstrass $\wp$-function is an elliptic function for L whose singularities can be shown to be double poles at the points of L. This is done by showing that $\wp(z)$ is holomorphic on $\mathbb{C} - L$ and has a double pole at the origin. Then one may demonstrate that since

$$\wp'(z) = -2 \sum_{\omega \in L} \frac{1}{(z - \omega)^3},$$

which can be shown to converge absolutely, $\wp'(z)$ is an elliptic function for

$$L = [\omega_1, \omega_2].$$

Since $\wp'(z)$ is periodic, $\wp(z)$ and $\wp(z + \omega_j)$ have the same derivative, so they differ by a constant, which can be shown to be zero by the fact that $\wp(z)$ is an even function. This demonstrates the periodicity of $\wp(z)$, from which it follows that the poles of $\wp(z)$ are double poles and lie in L (see [18, Theorem 10.1, p. 200]).


Example 10.7 By Exercise 10.22 on page 352, the Laurent series expansion (generally one of the form $\sum_{n=-\infty}^{\infty} a_n z^n$) for $\wp(z)$ about z = 0 is given by

$$\wp(z) = \frac{1}{z^2} + \sum_{n=1}^{\infty} (2n + 1)\, G_{2n+2}(L)\, z^{2n}, \tag{10.22}$$

where for a lattice L, and an integer r > 2,

$$G_r(L) = \sum_{\omega \in L - \{0\}} \frac{1}{\omega^r}.$$

From this, by Exercise 10.23, it follows that if $x = \wp(z; L)$ and $y = \wp'(z; L)$,

$$y^2 = 4x^3 - g_2(L)x - g_3(L), \tag{10.23}$$

where $g_j(L)$ for j = 2, 3 are given in Definition 10.10 on the preceding page.
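To see where the constants 60 and 140 in $g_2(L)$ and $g_3(L)$ come from, it may help to compare the first few Laurent coefficients. What follows is only a sketch of the computation behind (10.23), writing $G_4 = G_4(L)$ and $G_6 = G_6(L)$ and taking the expansion $\wp(z) = z^{-2} + 3G_4 z^2 + 5G_6 z^4 + \cdots$ as the starting point; the complete argument, via Liouville's theorem for elliptic functions, is the subject of Exercise 10.23.

```latex
\begin{align*}
\wp(z)      &= z^{-2} + 3G_4 z^{2} + 5G_6 z^{4} + O(z^{6}),\\
\wp'(z)     &= -2z^{-3} + 6G_4 z + 20G_6 z^{3} + O(z^{5}),\\
\wp'(z)^{2} &= 4z^{-6} - 24G_4 z^{-2} - 80G_6 + O(z^{2}),\\
4\wp(z)^{3} &= 4z^{-6} + 36G_4 z^{-2} + 60G_6 + O(z^{2}),\\
\wp'(z)^{2} - 4\wp(z)^{3} + 60G_4\,\wp(z) &= -140G_6 + O(z^{2}).
\end{align*}
```

The left side of the last line is an elliptic function with no pole at the origin, hence no poles at all, so it is the constant $-140G_6$. With $g_2(L) = 60G_4$ and $g_3(L) = 140G_6$ this is the relation $y^2 = 4x^3 - g_2(L)x - g_3(L)$.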

Remark 10.8 If E is an elliptic curve over $\mathbb{C}$ given by the Weierstrass equation

$$y^2 = 4x^3 - g_2 x - g_3,$$

with $g_2, g_3 \in \mathbb{C}$ and $g_2^3 - 27g_3^2 \ne 0$, then there is a unique lattice $L \subseteq \mathbb{C}$ such that

$$g_2(L) = g_2 \quad \text{and} \quad g_3(L) = g_3$$

(see [18, Proposition 4.3, p. 309]).

The j-invariant may be used with elliptic curves as follows.

Definition 10.12 j-Invariants for Elliptic Curves

If E is an elliptic curve defined by the Weierstrass equation in Definition 10.8 on page 348, then

$$j(E) = \frac{1728\, g_2^3}{g_2^3 - 27g_3^2} = \frac{1728\, g_2^3}{\Delta} \in F$$

is called the j-invariant of E.

In Definition 10.12, $\Delta \ne 0$ and $1728 = 2^6 \cdot 3^3$. Since we are not in characteristic 2 or 3, j(E) is well defined. If $F = \mathbb{C}$, then when E is the elliptic curve defined by the lattice $L \subseteq \mathbb{C}$,

$$j(L) = j(E). \tag{10.24}$$

By Exercise 10.19 on the facing page, isomorphic elliptic curves have the same j-invariant. Also, Definition 10.12 provides a means of looking at classes of elliptic curves.


Definition 10.13 Weierstrass and Elliptic Curves

Suppose that $E_j = E_j(F)$ for j = 1, 2 are elliptic curves over F defined by Weierstrass equations

$$y^2 = 4x^3 - g_2^{(j)} x - g_3^{(j)} \quad \text{for } j = 1, 2.$$

Then $E_1$ and $E_2$ are isomorphic over F if there is a nonzero $\alpha \in F$ such that

$$g_2^{(2)} = \alpha^4 g_2^{(1)} \quad \text{and} \quad g_3^{(2)} = \alpha^6 g_3^{(1)}.$$

This is denoted by $E_1 \cong E_2$, induced by the map $(x, y) \mapsto (\alpha^2 x, \alpha^3 y)$.
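The scaling in Definition 10.13 multiplies $g_2^3$ and $g_3^2$ by the same factor $\alpha^{12}$, so the j-invariant of Definition 10.12 is unchanged. The following Python sketch illustrates this with exact rational arithmetic over $F = \mathbb{Q}$; the function names `j_invariant` and `scale` and the sample values are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def j_invariant(g2: Fraction, g3: Fraction) -> Fraction:
    """j(E) = 1728 g2^3 / (g2^3 - 27 g3^2) for y^2 = 4x^3 - g2 x - g3."""
    disc = g2 ** 3 - 27 * g3 ** 2
    if disc == 0:
        raise ValueError("discriminant is zero, so this is not an elliptic curve")
    return Fraction(1728) * g2 ** 3 / disc


def scale(g2: Fraction, g3: Fraction, alpha: Fraction):
    """Coefficients of the isomorphic curve obtained from a nonzero alpha."""
    if alpha == 0:
        raise ValueError("alpha must be nonzero")
    return alpha ** 4 * g2, alpha ** 6 * g3


if __name__ == "__main__":
    g2, g3 = Fraction(4), Fraction(1)
    h2, h3 = scale(g2, g3, Fraction(3, 2))
    print(j_invariant(g2, g3), j_invariant(h2, h3))
```

Both printed values agree, which is the content of Exercise 10.19 in this special case.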

In §10.4, we will be able to use the concepts developed thus far to state the Shimura–Taniyama–Weil conjecture that was proved in the last century and whose solution implies Fermat’s Last Theorem. The proof of this conjecture is arguably the most striking and important mathematical development of the twentieth century and it will be a fitting conclusion to the main text of this book.

Exercises

10.19. Prove that isomorphic elliptic curves have the same j-invariant.

10.20. Prove that the discriminant of a lattice L satisfies

$$\Delta(L) = 16(e_1 - e_2)^2(e_1 - e_3)^2(e_2 - e_3)^2,$$

where the $e_j$ for j = 1, 2, 3 are the roots of

$$4x^3 - g_2(L)x - g_3(L).$$

(Hint: Use Exercise 2.25 on page 96. Then compare the coefficients of

$$(g_2^3 - 27g_3^2)/16$$

with those of

$$\prod_{1 \le i < j \le 3} (e_i - e_j)^2.)$$

10.21. Prove that for |x| < 1, we have that

$$\frac{1}{(1 - x)^2} - 1 = \sum_{n=1}^{\infty} (n + 1)x^n.$$

(Hint: You may use the fact from standard geometric series that

$$\sum_{n=0}^{\infty} x^n = (1 - x)^{-1}.)$$

10.22. Establish (10.22) on page 350.
(Hint: Use Exercise 10.21 to get a series expansion for

$$(\omega - z)^{-2} - \omega^{-2},$$

then plug this into the representation for $\wp$ given in Definition 10.11 on page 349.)

10.23. Establish (10.23) on page 350.
(Hint: Use Exercise 10.22. Then employ what is called Liouville’s theorem for elliptic functions, which says: An elliptic function with no poles (or no zeros) is constant. This theorem may be found in any standard text on complex analysis. More generally, Liouville’s theorem is often stated as: A bounded entire function on $\mathbb{C}$ is constant, often called Liouville’s boundedness theorem, from which the fundamental theorem of algebra follows as a simple consequence.)

10.24. Prove that the discriminant of a lattice given in Definition 10.10 on page 349 is nonzero.
(Hint: Use Exercise 10.23 and the fact given in Remark 10.7 on page 349, that $\wp'(z)$ is an odd elliptic function.)

10.25. Two lattices $L_j$ for j = 1, 2 are called homothetic if there exists a $\lambda \in \mathbb{C}$ such that $L_1 = \lambda L_2$. Prove that if $E_j$ are elliptic curves with respect to $L_j$ for j = 1, 2, respectively, then

$E_1 \cong E_2$ if and only if $L_1$ and $L_2$ are homothetic.



10.4 Shimura–Taniyama–Weil & FLT

Casually in the middle of a conversation this friend told me that Ken Ribet had proved a link between Taniyama–Shimura and Fermat’s Last Theorem. I was electrified. I knew that moment that the course of my life was changing because this meant that to prove Fermat’s Last Theorem all I had to do was to prove the Taniyama–Shimura conjecture. . . . Nobody had any idea how to approach Taniyama–Shimura but at least it was mainstream mathematics. . . . So the romance of Fermat, which had held me all my life, was now combined with a problem that was professionally acceptable. . . . It was one morning in late May. . . . I was sitting around thinking about the last stage of the proof. . . . I forgot to go down for lunch. . . . My wife, Nada, was very surprised that I’d arrived so late. Then I told her I’d solved Fermat’s Last Theorem.

from an interview with NOVA; for the full interview see http://www.pbs.org/wgbh/nova/proof/wiles.html

Andrew Wiles (1953–), see Biography 5.5 on page 225
British mathematician living in the U.S.A.

In order to display the force of the Shimura–Taniyama–Weil (STW) conjecture, it is an important motivator to set the stage by briefly outlining the events leading to its proof and the connections with FLT. We begin with the latter. FLT would seem on the face of it to have no connections with elliptic curves since $x^n + y^n = z^n$ is not a cubic equation. However, in 1986 Gerhard Frey published [27], which associated, for a prime p > 5, the elliptic curve

$$y^2 = x(x - a^p)(x + b^p) \tag{10.25}$$

with nontrivial solutions to $a^p + b^p = c^p$. We call elliptic curves given by equation (10.25) Frey curves. It turns out that this curve is of the type mentioned in the STW conjecture. In other words, existence of a solution to the Fermat equation would give rise to elliptic curves which would contradict STW. Now we need to describe the technical details.

In general, an elliptic curve E defined over a field F may be given by the global Weierstrass equation

$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6, \tag{10.26}$$

where $a_j \in F$ for $1 \le j \le 6$. Then when F has characteristic different from 2, we may complete the square, replacing y by $(y - a_1x - a_3)/2$ to get the more familiar Weierstrass equation

$$y^2 = 4x^3 + b_2x^2 + 2b_4x + b_6 \tag{10.27}$$

with

$$b_2 = a_1^2 + 4a_2, \qquad b_4 = 2a_4 + a_1a_3, \qquad \text{and} \qquad b_6 = a_3^2 + 4a_6.$$

In this case the discriminant $\Delta(E) = \Delta$ is given by

$$\Delta(E) = -b_2^2b_8 - 8b_4^3 - 27b_6^2 + 9b_2b_4b_6, \tag{10.28}$$

where

$$b_8 = a_1^2a_6 + 4a_2a_6 - a_1a_3a_4 + a_2a_3^2 - a_4^2.$$

Also, the j-invariant is given by

$$j(E) = c_4^3/\Delta(E), \tag{10.29}$$

where

$$c_4 = b_2^2 - 24b_4 \tag{10.30}$$

and

$$j(E) = 1728 + c_6^2/\Delta \tag{10.31}$$

where

$$c_6 = -b_2^3 + 36b_2b_4 - 216b_6. \tag{10.32}$$

By Exercise 10.26 on page 366, these definitions for $\Delta(E)$ and j(E) coincide with Definition 10.12 on page 350 for the special case of the Weierstrass equation covered in §10.3. We may further simplify Equation (10.27) by replacing (x, y) with $((x - 3b_2)/36,\ y/108)$ to achieve

$$y^2 = x^3 - 27c_4x - 54c_6. \tag{10.33}$$

By Exercise 10.27,

$$\Delta(E) = \frac{c_4^3 - c_6^2}{1728}. \tag{10.34}$$
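The quantities in (10.27)–(10.34) are mechanical to compute from the coefficients of (10.26). The following Python sketch is one way to do so for integer or rational coefficients; the function name `invariants` and the dictionary layout are illustrative assumptions. The two sample curves are the ones that appear later in this section, in Example 10.8 and in (10.42).

```python
from fractions import Fraction


def invariants(a1, a2, a3, a4, a6):
    """Return b2, b4, b6, b8, c4, c6, the discriminant and j for the curve
    y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6, following (10.27)-(10.32)."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    disc = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    c4 = b2 * b2 - 24 * b4
    c6 = -b2 ** 3 + 36 * b2 * b4 - 216 * b6
    # j is only defined when the discriminant is nonzero
    j = Fraction(c4) ** 3 / Fraction(disc) if disc != 0 else None
    return {"b2": b2, "b4": b4, "b6": b6, "b8": b8,
            "c4": c4, "c6": c6, "disc": disc, "j": j}


if __name__ == "__main__":
    for coeffs in ((0, -1, 1, 0, 0), (0, -1, 1, -10, -20)):
        inv = invariants(*coeffs)
        # (10.34): 1728 * disc should equal c4^3 - c6^2
        print(coeffs, inv["disc"], inv["c4"], inv["c6"], inv["j"],
              1728 * inv["disc"] == inv["c4"] ** 3 - inv["c6"] ** 2)
```

The final column checks relation (10.34) in the form $1728\,\Delta = c_4^3 - c_6^2$, which avoids any division.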

Remark 10.9 Note, however, that if we begin with Equation (10.33), then the discriminant is

$$\Delta(E) = 2^6 \cdot 3^9 (c_4^3 - c_6^2),$$

which differs from (10.34) by a factor of $2^{12} \cdot 3^{12}$, and this is explained by the scaling introduced in the change of variables in going from (10.26) to (10.27), then to (10.33).

Remark 10.9 shows that a change of variables may “inflate” a discriminant with new factors. Thus, for our development, we need to find a “minimal discriminant.” In order to proceed with this in mind, we need the following concept.


Definition 10.14 Admissible Change of Variables

If E = E(Q) is an elliptic curve over $\mathbb{Q}$, given by (10.26) where we may assume that $a_j \in \mathbb{Z}$ for j = 1, 2, 3, 4, 6, then an admissible change of variables is one of the form

$$x = u^2X + r \quad \text{and} \quad y = u^3Y + su^2X + t,$$

where $u, r, s, t \in \mathbb{Q}$ and $u \ne 0$, with resulting equation

$$Y^2 + a_1'XY + a_3'Y = X^3 + a_2'X^2 + a_4'X + a_6' \tag{10.35}$$

where

$$a_1' = \frac{a_1 + 2s}{u}, \qquad a_2' = \frac{a_2 - sa_1 + 3r - s^2}{u^2},$$

$$a_3' = \frac{a_3 + ra_1 + 2t}{u^3}, \qquad a_4' = \frac{a_4 - sa_3 + 2ra_2 - (t + rs)a_1 + 3r^2 - 2st}{u^4},$$

and

$$a_6' = \frac{a_6 + ra_4 + r^2a_2 + r^3 - ta_3 - t^2 - rta_1}{u^6}.$$
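The transformation formulas of Definition 10.14 can be applied mechanically, and doing so makes it easy to observe how the discriminant (10.28) scales. The following Python sketch uses exact rational arithmetic; the function names `discriminant` and `change_variables` and the sample parameters are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def discriminant(a1, a2, a3, a4, a6):
    """Discriminant (10.28) of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6


def change_variables(coeffs, u, r, s, t):
    """Apply x = u^2 X + r, y = u^3 Y + s u^2 X + t and return (a1', a2', a3', a4', a6')."""
    a1, a2, a3, a4, a6 = (Fraction(c) for c in coeffs)
    u, r, s, t = Fraction(u), Fraction(r), Fraction(s), Fraction(t)
    if u == 0:
        raise ValueError("u must be nonzero")
    return (
        (a1 + 2 * s) / u,
        (a2 - s * a1 + 3 * r - s * s) / u ** 2,
        (a3 + r * a1 + 2 * t) / u ** 3,
        (a4 - s * a3 + 2 * r * a2 - (t + r * s) * a1 + 3 * r * r - 2 * s * t) / u ** 4,
        (a6 + r * a4 + r * r * a2 + r ** 3 - t * a3 - t * t - r * t * a1) / u ** 6,
    )


if __name__ == "__main__":
    curve = (0, -1, 1, -10, -20)
    u = Fraction(2)
    new_curve = change_variables(curve, u, 1, 3, Fraction(-5, 7))
    # The old and new discriminants differ by the factor u^12.
    print(discriminant(*curve), discriminant(*new_curve) * u ** 12)
```

The two printed values coincide, so in this example the parameters r, s, t have no effect on the discriminant and u contributes a factor of $u^{-12}$.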

Remark 10.10 From the projective geometry viewpoint discussed in Remark 9.1 on page 302, considering equivalence classes of points (x, y, z), an admissible change of variables fixes the point at infinity (0, 1, 0) and carries the line for which z = 0 to the same line. The original Weierstrass form (10.26) gets sent to the same curve in Weierstrass form (10.35). Modulo a constant, admissible changes of variables are the most general linear transformations satisfying these properties.

In the special case where r = s = t = 0, the admissible change of variables multiplies the $a_i$ by $u^{-i}$ for i = 1, 2, 3, 4, 6. In this case, we say $a_i$ has weight i. Indeed, Definition 10.13 on page 351 is just this special case of an admissible change of variables. In general, we may define two elliptic curves to be isomorphic if they are related by an admissible change of variables. Hence, by Exercise 10.19 on page 351, two elliptic curves over $\mathbb{Q}$ are related by an admissible change of variables if and only if they have the same j-invariant. There is another term in the literature used to describe this phenomenon as well. Two elliptic curves over $\mathbb{Q}$ having the same j-invariant are said to be twists of one another.

Since the discriminant $\Delta$ is given by (10.34) in terms of $c_4$ and $c_6$, $\Delta$ is unaffected by r, s, t in an admissible change of variables, given that the new variables for (10.35) are related by

$$c_4' = c_4/u^4 \quad \text{and} \quad c_6' = c_6/u^6.$$

Hence, the triple $(\Delta, c_4, c_6)$ is a detector for curves that are equivalent under an admissible change of variables. In fact, by the above discussion, two elliptic curves $E_1$ and $E_2$ with discriminant $\Delta_1$ and $\Delta_2$, respectively, related by an admissible change of variables, must satisfy

$$\Delta_1/\Delta_2 = u^{\pm 12}.$$

This now sets the stage for looking at elliptic curves with minimal discriminants.

For the ensuing development, the reader should be familiar with the notation and topics covered in §6.2, especially Theorem 6.1 on page 236 and the notation $\nu_p(x) \ge 0$ that characterizes the p-adic integers $x \in O_p$. Also, the notation of Definition 10.14 remains in force.

Definition 10.15 Minimal Equations for Elliptic Curves

If E = E(Q) is an elliptic curve over $\mathbb{Q}$, given by (10.26) where $a_j \in \mathbb{Z}$ for j = 1, 2, 3, 4, 6, with discriminant $\Delta$, then (10.26) is called minimal at the prime p if the power of p dividing $\Delta$ cannot be decreased by making an admissible change of variables with the property that the new coefficients $a_j' \in O_p$. If (10.26) is minimal for all primes p with $a_j \in \mathbb{Z}$ for j = 1, 2, 3, 4, 6, then it is called a global minimal Weierstrass equation.

Remark 10.11 Since an equation for E(Q) given in Definition 10.15 can be assumed, without loss of generality, to have integral coefficients, $|\Delta|_p \le 1$, where $|\cdot|_p$ is the p-adic absolute value given in Definition 6.3 on page 233. Hence, in only finitely many steps $|\Delta|_p$ can be increased and still maintain $|\Delta|_p \le 1$. It follows that in finitely many admissible changes of variables, we can get an equation minimal for E at p. In other words, there always exists a global minimal Weierstrass equation for E(Q).

Note that $|\Delta|_p = 1$ if and only if $p \nmid \Delta$.

Also, by Exercise 10.30 on page 366, if any of

(1) $|\Delta|_p > p^{-12}$,

(2) $|c_4|_p > p^{-4}$,

(3) $|c_6|_p > p^{-6}$

holds, then (10.26) is minimal for p. Moreover, if p > 3, $|\Delta|_p \le p^{-12}$, and $|c_4|_p \le p^{-4}$, then (10.26) is not minimal for p.

For the following, the reader is reminded, via Exercise 9.5 on page 315, of

$$N_p = p + 1 + \sum_{x \in \mathbb{F}_p} \left( \frac{x^3 + ax + b}{p} \right), \tag{10.36}$$

where the summand is the Legendre symbol, being the number of points on the elliptic curve $E(\mathbb{F}_p)$, including the point at infinity, over a field of p elements for a prime p.
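Formula (10.36) can be compared with a direct count of solutions. The following Python sketch does this for a curve $y^2 = x^3 + ax + b$ over $\mathbb{F}_p$ with p an odd prime, evaluating the Legendre symbol by Euler's criterion; the function names and the sample coefficients are illustrative assumptions.

```python
def legendre(v: int, p: int) -> int:
    """Legendre symbol (v/p) for an odd prime p, via Euler's criterion."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1


def count_via_legendre(a: int, b: int, p: int) -> int:
    """N_p as in (10.36): p + 1 plus the sum of Legendre symbols."""
    return p + 1 + sum(legendre(x ** 3 + a * x + b, p) for x in range(p))


def count_brute_force(a: int, b: int, p: int) -> int:
    """Count affine solutions of y^2 = x^3 + ax + b mod p, plus the point at infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y - x ** 3 - a * x - b) % p == 0
    )
    return affine + 1


if __name__ == "__main__":
    for p in (5, 7, 13, 17):
        print(p, count_via_legendre(1, 1, p), count_brute_force(1, 1, p))
```

For each x the number of y with $y^2 = x^3 + ax + b$ is one more than the Legendre symbol of the right side, which is why the two counts agree.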


Definition 10.16 The Reduction Index for Elliptic Curves

Suppose that E is an elliptic curve over $\mathbb{Q}$ given by a minimal Weierstrass equation. If $\Delta(E) \not\equiv 0 \pmod{p}$ for a prime p, then p is said to be a prime of good reduction for E. Furthermore, if $N_p$ for a prime p is given by (10.36), then let

$$a_p(E) = p + 1 - N_p.$$

If p is a prime of good reduction, then $a_p(E)$ is called the good reduction index for E at p, and the sequence $\{a_p(E)\}_p$ indexed over the primes of good reduction is called the good reduction sequence for E. Primes that are not of good reduction are called primes of bad reduction for E, and $a_p(E)$ is called the bad reduction index for E.

Note that there are only finitely many primes of bad reduction since these are the primes dividing $\Delta$. Also, by Theorem 9.5 on page 319, we know that $|a_p(E)| < 2\sqrt{p}$. There is much more of interest in the reduction index.

Example 10.8 Consider the elliptic curve given by

$$y^2 + y = x^3 - x^2.$$

Via the formulas in (10.26)–(10.34) on pages 353–354, we have

$$a_1 = 0,\ a_3 = 1,\ a_2 = -1,\ a_4 = 0 = a_6,\ b_2 = -4,\ b_4 = 0,\ b_6 = 1,\ \text{and } b_8 = -1.$$

Therefore,

$$\Delta(E) = -b_2^2b_8 - 8b_4^3 - 27b_6^2 + 9b_2b_4b_6 = -(-4)^2(-1) - 8(0)^3 - 27 \cdot 1^2 + 9(-4)(0)(1) = -11,$$

so E has good reduction at all primes $p \ne 11$. Now we compute the good reduction index for this curve at various primes $p \ne 11$, which we call a good reduction table for E.

p         2    3    5    7   13   17   19   23   29   31   37   41
N_p       5    5    5   10   10   20   20   25   30   25   35   50
a_p(E)   -2   -1    1   -2    4   -2    0   -1    0    7    3   -8

See Exercise 10.28 on page 366 for more related illustrations. Also, see Example 10.9 on page 360.
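A table of this kind can be produced by brute force for small primes. The following Python sketch counts the points of $y^2 + y = x^3 - x^2$ over $\mathbb{F}_p$ directly from the equation and then forms $a_p(E) = p + 1 - N_p$; the function names `count_points` and `reduction_index` are illustrative, and the quadratic-time loop is only sensible for small p.

```python
def count_points(p: int) -> int:
    """Number of points on y^2 + y = x^3 - x^2 over F_p, including the point at infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - x ** 3 + x * x) % p == 0
    )
    return affine + 1


def reduction_index(p: int) -> int:
    """a_p(E) = p + 1 - N_p."""
    return p + 1 - count_points(p)


if __name__ == "__main__":
    primes = [2, 3, 5, 7, 13, 17, 19, 23, 29, 31, 37, 41]
    print("p      ", primes)
    print("N_p    ", [count_points(p) for p in primes])
    print("a_p(E) ", [reduction_index(p) for p in primes])
```

Because the equation is used in its original form rather than in the short form of (10.36), the same loop also works for p = 2 and p = 3.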

Remark 10.12 To say that p is a prime of good reduction for E is to say that E is nonsingular over $\mathbb{F}_p$, meaning that $\Delta(E \pmod{p})$ is not divisible by p. We now explain this in detail. A point $P = (x_0, y_0)$ on an elliptic curve E(F) = E over a field F is called a singular point if P satisfies the equation defining E, given by

$$f(x, y) = y^2 + a_1xy + a_3y - x^3 - a_2x^2 - a_4x - a_6 = 0, \tag{10.37}$$

with the partial derivatives satisfying

$$\frac{\partial f}{\partial x}(P) = \frac{\partial f}{\partial y}(P) = 0.$$

Thus, to say that P is a singular point of E is to say that E is a singular curve at P. To say that E is nonsingular over F is to say that the curve has no singular points. By Exercise 10.31 on page 366, E is nonsingular if and only if $\Delta(E) \ne 0$. Note that E never has a singular point at infinity, as is also shown in that exercise.

Singular points may be classified as follows. By Exercise 10.32 on page 367,

E has a node if and only if $\Delta(E) = 0$ and $c_4 \ne 0$;

and

E has a cusp if and only if $\Delta(E) = 0 = c_4$.

More explicitly, we may view the Taylor expansion of f(x, y) at P via

$$f(x, y) - f(x_0, y_0) = [(y - y_0) - \alpha(x - x_0)][(y - y_0) - \beta(x - x_0)] - (x - x_0)^3.$$

Then P is a node if $\alpha \ne \beta$, having tangent lines at P given by $y - y_0 = \alpha(x - x_0)$ and $y - y_0 = \beta(x - x_0)$. An example is the curve given by $y^2 = x^3 + x^2$, for which $\Delta = 0$ and $c_4 = 16$. Here $P = (x_0, y_0) = (0, 0)$, and the two tangents are y = x and y = −x as in Figure 10.1.

Figure 10.1: $y^2 = x^3 + x^2$

Also, P is a cusp if $\alpha = \beta$, with tangent line at P given by

$$y - y_0 = \alpha(x - x_0).$$

An example is the curve given by $y^2 = x^3$, where $\Delta = c_4 = 0$. The single tangent is y = 0 at $P = (x_0, y_0) = (0, 0)$. See Figure 10.2.

Figure 10.2: $y^2 = x^3$

Remark 10.13 The good reduction index is a mechanism for representing arithmetic data about E that is captured in patterns of the good reduction sequence $\{a_p(E)\}_p$. How it does this is contained in the subtext of the Shimura–Taniyama–Weil conjecture. The pattern involves the normalized modular cusp forms of weight 2 and level $n \in \mathbb{N}$ that we introduced in Definition 10.6 on page 343.

Definition 10.17 Modular Elliptic Curves

Let E(Q) be an elliptic curve over $\mathbb{Q}$ with good reduction sequence $\{a_p(E)\}_p$. If there exists an $n \in \mathbb{N}$ and a normalized weight 2 cusp form of level n

$$f(z) = q + \sum_{j=2}^{\infty} a_j(f) q^j, \quad \text{where } q = \exp(2\pi i z),$$

such that

$$a_p(E) = a_p(f),$$

then E is called a modular elliptic curve.

Now we may state the celebrated conjecture.


Conjecture 10.1 The Shimura–Taniyama–Weil (STW) Conjecture

If E is an elliptic curve over Q, then E is modular.

Example 10.9 By Example 10.5 on page 343, the function given in (10.15) spans $S_2(\Gamma_0(11))$ and is explicitly given by

$$f(z) = \eta(z)^2\eta(11z)^2 = \sum_{n=1}^{\infty} c_n q^n = q \prod_{n=1}^{\infty} (1 - q^n)^2 (1 - q^{11n})^2 =$$

$$q - 2q^2 - q^3 + 2q^4 + q^5 + 2q^6 - 2q^7 - 2q^9 - 2q^{10} + q^{11} - 2q^{12} + 4q^{13}$$

$$+\, 4q^{14} - q^{15} - 4q^{16} - 2q^{17} + 4q^{18} + 2q^{20} + 2q^{21} - 2q^{22} - q^{23} - 4q^{25}$$

$$-\, 8q^{26} + 5q^{27} - 4q^{28} + 2q^{30} + 7q^{31} + \cdots + 3q^{37} + \cdots - 8q^{41} + \cdots.$$

The coefficients of the prime powers of q are exactly the nonzero values of the good reduction index $a_p(E)$ in Example 10.8 on page 357, thereby illustrating that E is a modular elliptic curve.
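The agreement between the two examples can be checked by machine for small primes. The following Python sketch expands the product $q\prod(1 - q^n)^2(1 - q^{11n})^2$ as a truncated power series with integer coefficients and compares the coefficient of $q^p$ with $p + 1 - N_p$ for the curve $y^2 + y = x^3 - x^2$, counted by brute force. The function names and the truncation parameter `limit` are assumptions made for illustration.

```python
def eta_product_coeffs(limit: int) -> list[int]:
    """Coefficients c[0..limit] of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2,
    truncated after q^limit, with c[n] the coefficient of q^n."""
    series = [0] * (limit + 1)
    series[0] = 1  # the product without the leading factor q

    def multiply_by_one_minus_q_power(k: int) -> None:
        # In-place multiplication by (1 - q^k); descending order keeps old values intact.
        for d in range(limit, k - 1, -1):
            series[d] -= series[d - k]

    for n in range(1, limit + 1):
        for _ in range(2):
            multiply_by_one_minus_q_power(n)
            multiply_by_one_minus_q_power(11 * n)  # no effect once 11n > limit
    return [0] + series[:limit]  # shift by one for the factor q


def reduction_index(p: int) -> int:
    """a_p(E) = p + 1 - N_p for y^2 + y = x^3 - x^2, counting points by brute force."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - x ** 3 + x * x) % p == 0
    )
    return p + 1 - (affine + 1)


if __name__ == "__main__":
    c = eta_product_coeffs(41)
    for p in (2, 3, 5, 7, 13, 17, 19, 23, 29, 31, 37, 41):
        print(p, c[p], reduction_index(p))
```

Each printed row shows the same number twice, which is the equality $a_p(E) = a_p(f)$ of Definition 10.17 for the primes tested; this is a finite check, not a proof of modularity.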

Remark 10.14 The notion of a conductor of an elliptic curve must now come into play for our discussion. The technical definition involves a cohomological description that we do not have the tools to describe. However, we can talk about it in reference to the discriminant and related prime divisors in order to understand what it means. Given an elliptic curve E(Q) = E with global minimal Weierstrass equation and discriminant $\Delta(E) = \Delta$, the conductor n divides $\Delta$ and has the same prime factors as $\Delta$. The power to which a given prime appears in n is determined as follows. The power of a prime p dividing n is 1 if and only if $E(\mathbb{F}_p)$ has a node, which is characterized by having two candidate tangents at the point, which in turn means that (10.35) has a double root. See Exercise 10.33 on page 367 for an illustration. Also, see Remark 10.12 on page 357. If p > 3, then the power of p dividing n is 2 if and only if $E(\mathbb{F}_p)$ has a cusp. In the case where p = 2 or p = 3, which we selectively have ignored for the sake of simplicity of presentation, the conductor can be computed using Tate’s algorithm, which is uncomplicated, although the process of using it can be somewhat protracted; see [94]. For $p \ne 2, 3$, the power of p dividing the conductor n is at most 2, so for our purposes, the above discussion suffices.

From the above, we conclude that the conductor of E is not divisible by any primes of good reduction, also called stable reduction. In other words, only primes of bad reduction divide the conductor. Moreover, a prime p to the first power exactly divides the conductor precisely when $E(\mathbb{F}_p)$ has a node, in which case E is said to have multiplicative or semi-stable reduction at p. Hence, E has semi-stable reduction at all primes, in which case E is called semi-stable, precisely when the conductor n is squarefree. For instance, the curve in Example 10.8 on page 357 has conductor 11, an instance of a semi-stable elliptic curve. The conductor of E is exactly divisible by $p^2$ precisely when $E(\mathbb{F}_p)$ has a cusp, in which case we say that E has additive or unstable reduction.

By Exercise 10.38 on page 367, the conductor is an “isogeny invariant,” as well. The STW conjecture implies that we have the conductor n equal to the level n in $\Gamma_0(n)$ of weight 2 cusp forms; see the reformulation of STW in terms of L-functions on page 364.

Now we illustrate the modularity theorem in different terms that will bring more of the structure and interconnections to light. To do this, we concentrate upon the example n = 11, which will be a template for the general theory.

Example 10.10 From Example 10.5 on page 343, for n = 11, the group $\Gamma_0(11)$ can be shown to be generated by

$$T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad U = \begin{pmatrix} 8 & 1 \\ -33 & -4 \end{pmatrix}, \quad V = \begin{pmatrix} 9 & 1 \\ -55 & -6 \end{pmatrix},$$

and if $\gamma \in S_2(\Gamma_0(11))$, then we map $\Gamma_0(11)$ to $\mathbb{C}$, additively, via $\Phi_\gamma(U) = \omega_1$, $\Phi_\gamma(V) = \omega_2$, and $\Phi_\gamma(T) = 0$. Hence, $L = [\omega_1, \omega_2]$ is a lattice in $\mathbb{C}$. It can be shown that $\mathbb{C}/L$, called a complex torus, is analytically isomorphic to an elliptic curve $E(\mathbb{C})$, where L is determined up to homothety by E (see Exercise 10.25). For our purposes the “analytic isomorphism”

$$\mathbb{C}/L \to E(\mathbb{C})$$

is explicitly given by

$$z \mapsto \begin{cases} (\wp(z), \wp'(z), 1) & \text{if } z \notin L,\\ (0, 1, 0) & \text{if } z \in L \end{cases}$$

(see Remark 10.10). This is a holomorphic map carrying $\mathbb{C}/L$ one-to-one onto the elliptic curve $E = E(\mathbb{C})$, where E is given by the form

$$y^2 = 4x^3 - g_2x - g_3,$$

with $g_2$ and $g_3$ given in Definition 10.10 on page 349. Altogether, we get a holomorphic map from $X_0(11)$ onto $\mathbb{C}/L$, then onto $E(\mathbb{C})$. Thus, it can be shown that this provides a holomorphic surjection

$$X_0(11) = \Gamma_0(11)\backslash\mathfrak{H}^* \to E(\mathbb{C}) \quad \text{where } \mathfrak{H}^* = \mathfrak{H} \cup \mathbb{Q} \cup \{\infty\},$$

where $X_0(11)$ is called a compact Riemann surface, which is a complex one-dimensional manifold. Think of a Riemann surface as a “deformed” complex plane, which looks like the complex plane locally near a given point, but the global topology may be different. The complex plane may be described as the most basic Riemann surface. $\mathbb{C}/L$ is also a complex manifold and the principal feature of such surfaces is that holomorphic maps can be defined between them as we have done above (see [88] for more details).


One may actually calculate the $j$-invariant via (10.20) to get

\[
j(L) = -\frac{(2^4 \cdot 31)^3}{11^5}, \tag{10.38}
\]

which demonstrates that $E$ is defined over $\mathbb{Q}$ and gives more meaning to the above mapping involving $X_0(11)$ and $E$ over $\mathbb{Q}$. However, from (10.31) on page 354 via (10.24) on page 350, we have

\[
j(L) = \frac{c_4^3}{\Delta} = 1728 + \frac{c_6^2}{\Delta}, \tag{10.39}
\]

so by Exercise 10.39 on page 367, there is an integer $k \neq 0$ such that

\[
c_4 = 2^4 \cdot 31\, k^2, \quad c_6 = 2^3 \cdot 2501\, k^3, \quad \text{and} \quad \Delta = -11^5 k^6. \tag{10.40}
\]

By Exercise 10.40, (10.40) yields a global minimal Weierstrass equation exactly when $k$ has no odd square factor, and

\[
k \equiv r \pmod{16} \quad \text{where } r \in \{1, 2, 5, 6, 9, 10, 12, 13, 14\}. \tag{10.41}
\]

We call the association of $X_0(11)$ and $E = E(\mathbb{Q})$ given by (10.41), with global minimal Weierstrass equation provided by (10.40), a $\mathbb{Q}$-structure of $E$. The simplest $\mathbb{Q}$-structure occurs when $k = 1$, in which case we get the global minimal equation given by

\[
E(\mathbb{C}) : y^2 + y = x^3 - x^2 - 10x - 20, \tag{10.42}
\]

which is the curve in Exercise 10.28 on page 366. What we have accomplished is a mapping of $X_0(11)$ onto $E(\mathbb{C})$.

Now, if we define

\[
\begin{pmatrix} \omega_2' \\ \omega_1' \end{pmatrix}
=
\begin{pmatrix} 1 & 3 \\ 0 & 5 \end{pmatrix}
\begin{pmatrix} \omega_2 \\ \omega_1 \end{pmatrix}
\]

and we let

\[
L' = [\omega_1', \omega_2'],
\]

it can be shown that

\[
j(L') = -16^3/11,
\]

so a corresponding elliptic curve $E'$ can be defined over $\mathbb{Q}$, and this curve is given by

\[
E' : y^2 + y = x^3 - x^2, \tag{10.43}
\]

which is the curve in Example 10.8 on page 357, with discriminant $-11$, and as we saw above the discriminant of (10.42) is $-11^5$. In Exercise 10.38, this was shown to be isogenous to the curve in (10.42). In Remark 10.14 on page 360, we saw that the conductor is an isogeny invariant, in this case $n = 11$.
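The numerical claims in (10.38), (10.40) with $k = 1$, and the two discriminants can be checked mechanically. The following is a minimal sketch, assuming the standard formulas for $b_2, b_4, b_6, b_8, c_4, c_6, \Delta$ attached to a Weierstrass equation $y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$ (taken here to agree with those referenced on page 354); the function name `invariants` is illustrative only.

```python
from fractions import Fraction

def invariants(a1, a2, a3, a4, a6):
    """Return (c4, c6, disc, j) for y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6.

    Uses the standard b- and c-invariant formulas (assumed to match the text's).
    """
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    c6 = -b2 ** 3 + 36 * b2 * b4 - 216 * b6
    disc = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return c4, c6, disc, Fraction(c4 ** 3, disc)

# Curve (10.42): y^2 + y = x^3 - x^2 - 10x - 20
c4, c6, disc, j = invariants(0, -1, 1, -10, -20)
print(c4, c6, disc, j)

# Curve (10.43): y^2 + y = x^3 - x^2
c4p, c6p, discp, jp = invariants(0, -1, 1, 0, 0)
print(c4p, c6p, discp, jp)
```

Exact rational arithmetic via `Fraction` keeps the $j$-invariants in lowest terms, so they can be compared directly with (10.38) and with $j(L') = -16^3/11$.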


We may reformulate the STW conjecture now in terms of the above, which we have illustrated for the case $n = 11$.

◆ STW Conjecture in Terms of Modular Parametrizations

Given an elliptic curve $E$ over $\mathbb{Q}$, there exists an $n \in \mathbb{N}$ for which there is a nonconstant surjective holomorphic map $F : X_0(n) \to E$, defined over $\mathbb{Q}$, in which case $E$ is said to have a modular parametrization modulo $n$, and $E$ is called a Weil curve.

Remark 10.15 We have illustrated the above for the case $n = 11$ in Example 10.10 on page 361, but the theory, called Eichler–Shimura theory, holds for any of the compact Riemann surfaces $X_0(n)$ where $n$ is the level of the weight 2 cusp forms, so given the aforementioned proof of STW, the above is a statement of the modularity theorem.

The phrase “defined over $\mathbb{Q}$” in the above interpretation of the STW conjecture is important in that we may have holomorphic surjections without the rationality property but for which the $L$-functions of the curves and the cusp forms do not agree. Now we must explain this comment by introducing the notions of $L$-functions for elliptic curves and forms. Note that the construction of the map from $X_0(11)$ to $E(\mathbb{C})$ in Example 10.10 on page 361 is indeed defined over $\mathbb{Q}$. In the literature, such maps are rational maps defined at every point, called morphisms; see [88].

We turn our attention to $L$-functions, a concept we introduced in §7.2, but have not yet linked with elliptic curves. Elliptic curves that are isogenous over $\mathbb{Q}$ have the same $L$-functions, which we now define and discuss.

Let $E(\mathbb{Q})$ be an elliptic curve over $\mathbb{Q}$ given by a global minimal Weierstrass equation, which is no loss of generality by Remark 10.11 on page 356. Then the $L$-function for $E$, having discriminant $\Delta$, is given by

\[
L(E, s) = \prod_{p \mid \Delta} \left(1 - a_p(E)p^{-s}\right)^{-1} \prod_{p \nmid \Delta} \left(1 - a_p(E)p^{-s} + p^{1-2s}\right)^{-1}.
\]

It can be shown that $L(E, s)$ converges for $\Re(s) > 2$, and is given by an absolutely convergent Dirichlet series; see §5.3. Thus, we may write

\[
L(E, s) = \sum_{n=1}^{\infty} \frac{c_n}{n^s}.
\]

Now by Definition 10.6 on page 343, a normalized cusp form $f \in S_2(\Gamma_0(n))$ of weight 2 and level $n$ satisfies

\[
f(z) = q + \sum_{n=2}^{\infty} a_n(f) q^n.
\]

Thus, we may define the $L$-function of $f$ by

\[
L(f, s) = \sum_{n=1}^{\infty} \frac{a_n(f)}{n^s}.
\]

Now the STW conjecture may be reformulated in terms of $L$-functions:

◆ STW Conjecture in Terms of L-Functions

For every elliptic curve $E$ defined over $\mathbb{Q}$, there exists a normalized cusp form of weight 2 and level $n$, $f \in S_2(\Gamma_0(n))$, such that

\[
L(f, s) = L(E, s),
\]

and $n$ is the conductor of $E$.
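For the level 11 example, the equality of $L$-functions can be spot-checked prime by prime. The sketch below makes two assumptions that are not stated in the text: that for primes $p \neq 11$ of good reduction $a_p(E) = p + 1 - |E(\mathbb{F}_p)|$, and the classical fact that the normalized weight 2 cusp form of level 11 has $q$-expansion $q\prod_{n \ge 1}(1 - q^n)^2(1 - q^{11n})^2$. The helper names are illustrative.

```python
def count_points(p):
    """Points of y^2 + y = x^3 - x^2 - 10x - 20 over F_p, including the point at infinity."""
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x ** 3 - x ** 2 - 10 * x - 20) % p
        for y in range(p):
            if (y * y + y - rhs) % p == 0:
                total += 1
    return total

def a_p_curve(p):
    """a_p(E) = p + 1 - #E(F_p), used here only for primes p != 11."""
    return p + 1 - count_points(p)

def eta_product_coeffs(bound):
    """Return a list a with a[n] = coefficient of q^n in
    q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2, for 1 <= n <= bound."""
    series = [0] * bound          # series[i] = coefficient of q^i in the product
    series[0] = 1
    for n in range(1, bound):
        for m in (n, n, 11 * n, 11 * n):   # two copies of each factor
            if m < bound:
                # multiply in place by (1 - q^m), highest degree first
                for i in range(bound - 1, m - 1, -1):
                    series[i] -= series[i - m]
    return [0] + series           # multiplying by q shifts every index by one

good_primes = [2, 3, 5, 7, 13, 17, 19, 23, 29]
form = eta_product_coeffs(30)
for p in good_primes:
    print(p, a_p_curve(p), form[p])
```

Brute-force point counting is quadratic in $p$, which is adequate for a spot check but not for serious computation.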

We have concentrated upon $X_0(11)$ in Example 10.10 on page 361 since it is the simplest case, namely having what is called genus one, with corresponding $S_2(\Gamma_0(11))$ having dimension one as we saw above. In general, the dimension of $S_2(\Gamma_0(n))$ is called the genus of $X_0(n)$. To see the intimate connection with FLT, we return to the discussion of Frey curves (10.25) introduced on page 353. Suppose that

\[
a^p + b^p = c^p \tag{10.44}
\]

is a counterexample to FLT for a prime $p \ge 5$. The Frey curve is given by

\[
E : y^2 = x(x - a^p)(x - c^p), \tag{10.45}
\]

for which

\[
\Delta = 16a^{2p}b^{2p}c^{2p}, \tag{10.46}
\]

and

\[
c_4 = 16(a^{2p} - a^pc^p + c^{2p}). \tag{10.47}
\]

Then when $a, b, c$ are pairwise relatively prime, by Exercise 10.41 on page 367, the conductor of $E$ is the product of all primes dividing $abc$, which tells us, by Remark 10.14, that $E$ is semi-stable.
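As a worked check of (10.46) and (10.47), write $A = a^p$, $B = b^p$, $C = c^p$, so that $B = C - A$ by (10.44). The computation below is a sketch that assumes the standard $b$-invariant formulas; it is not taken from the text.

\[
\begin{aligned}
y^2 &= x(x - A)(x - C) = x^3 - (A + C)x^2 + ACx, \qquad a_2 = -(A + C),\ a_4 = AC,\ a_1 = a_3 = a_6 = 0,\\
b_2 &= 4a_2 = -4(A + C), \qquad b_4 = 2a_4 = 2AC, \qquad b_6 = 0, \qquad b_8 = -a_4^2 = -A^2C^2,\\
c_4 &= b_2^2 - 24b_4 = 16(A + C)^2 - 48AC = 16(A^2 - AC + C^2),\\
\Delta &= -b_2^2b_8 - 8b_4^3 - 27b_6^2 + 9b_2b_4b_6 = 16A^2C^2\left[(A + C)^2 - 4AC\right] = 16A^2C^2(C - A)^2 = 16A^2B^2C^2.
\end{aligned}
\]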

Now we are in a position to return to a discussion of the STW conjecture and FLT. In 1995, Wiles and Taylor published papers [95] and [103], which proved that every semi-stable elliptic curve is modular. In 1998, Conrad, Diamond, and Taylor [16] proved the STW conjecture for all elliptic curves with conductor not divisible by 27. Then in 2001, Breuil, Conrad, Diamond, and Taylor published a proof of the full STW conjecture, which we now call the modularity theorem [11]. However, in 1990, Ribet proved the following, which via the affirmative verification of the STW conjecture, allowed a proof of FLT as follows.


Theorem 10.3 Ribet’s Theorem

Suppose that $E$ is an elliptic curve over $\mathbb{Q}$ given by a global minimal Weierstrass equation and having discriminant $\Delta = \prod_{p \mid \Delta} p^{f_p}$ and conductor $n = \prod_{p \mid \Delta} p^{g_p}$, both canonical prime factorizations. Furthermore, if $E$ has a modular parametrization of level $n$ with $f \in S_2(\Gamma_0(n))$ having normalized expansion

\[
f(z) = q + \sum_{n=2}^{\infty} a_n(f)q^n,
\]

then for a fixed prime $p_0$, set

\[
n' = \frac{n}{\displaystyle\prod_{\substack{p_0 \mid f_p \\ g_p = 1}} p}. \tag{10.48}
\]

Then there exists an $f' \in S_2(\Gamma_0(n'))$ such that $f' = \sum_{n=1}^{\infty} b_n(f')q^n$ with $b_n(f') \in \mathbb{Z}$ satisfying $a_n(f) \equiv b_n(f') \pmod{p_0}$ for all $n \in \mathbb{N}$.

Proof. See [81]. □

Now we may state our target result, which follows [45, Corollary 12.13, p. 399], where it is cited as a Frey–Serre–Ribet result.

Theorem 10.4 Proof of Fermat’s Last Theorem

The STW conjecture implies FLT.

Proof. Assume that FLT is false. Then by Theorem 10.3, the Frey curve given in (10.45) has conductor $n = \prod_{p \mid abc} p$, which when compared to the coefficients in (10.48), yields $n' = 2$. However, by Example 10.5 on page 343, $S_2(\Gamma_0(2))$ is the zero space, so $b_n(f') = 0$ for all $n \in \mathbb{N}$. Yet,

\[
b_n(f') \equiv a_n(f) \pmod{p_0}
\]

for all $n \in \mathbb{N}$. In particular,

\[
0 = b_1(f') \equiv a_1(f) = 1 \pmod{p_0},
\]

a contradiction. □

This completes the main text and demonstrates the power of the tools we developed herein. It is an appropriate juncture to leave, since the proof of FLT, to the extent we have been able to demonstrate it herein, shows the accomplishments of centuries of mathematical exploration.


Exercises

10.26. Prove that, in (10.28) and (10.29) on page 354, the definitions for discriminant and $j$-invariant agree with those given in §10.3, namely when $b_2 = 0$, $2b_4 = -g_2$, $b_6 = -g_3$, and $c_4 = 12g_2$.

10.27. With reference to (10.30) and (10.32) on page 354, prove that the discriminant of $E$ given by (10.28) is equal to $\Delta(E) = (c_4^3 - c_6^2)/1728$.

10.28. By a suitable transformation, show that $y^2 + y = x^3 - x^2 - 10x - 20$ is of the form

\[
y^2 = x^3 - 27c_4x - 54c_6 \tag{10.49}
\]

with $\Delta(E) = -11^5$. Conclude that $E$ has good reduction for all primes $p \neq 11$.
(Hint: Use Exercise 10.27.)

10.29. For the elliptic curve given in Exercise 10.28, provide a good reduction table for the same primes as given for the curve in Example 10.8.

10.30. Let $E$ be an elliptic curve given by (10.26) on page 353, where $|a_j|_p \le 1$ for $j = 1, 2, 3, 4, 6$. With reference to Remark 10.11 on page 356, prove that (10.26) is minimal for $E$ at $p$ if any of the following hold.

(1) $|\Delta|_p > p^{-12}$.
(2) $|c_4|_p > p^{-4}$.
(3) $|c_6|_p > p^{-6}$.

Moreover, if $p > 3$, prove that (10.26) is not minimal for $E$ at $p$ if both of the following hold.

(1) $|\Delta|_p \le p^{-12}$.
(2) $|c_4|_p \le p^{-4}$.

10.31. Prove that an elliptic curve $E = E(F)$ over a field $F$ is always nonsingular at infinity. Then prove that $E$ is nonsingular over $F$ if and only if $\Delta(E) \neq 0$.
(Hint: To prove that $E$ never has a singular point at infinity, consider the homogeneous equation

\[
F(X, Y, Z) = Y^2Z + a_1XYZ + a_3YZ^2 - X^3 - a_2X^2Z - a_4XZ^2 - a_6Z^3,
\]

so the point at infinity is $o = (0, 1, 0)$. Then show $\partial F/\partial Z(o) \neq 0$; see Remark 9.1 on page 302. Recall that we are always assuming characteristic not 2 or 3, which simplifies computations although the result still holds in the latter cases.)

10.32. Prove that an elliptic curve $E = E(\mathbb{F}_p)$ has a node if and only if $c_4 \neq 0$ and $\Delta(E) = 0$; and $E$ has a cusp if and only if $\Delta = 0 = c_4$.
(Hint: Prove that you may assume, without loss of generality, that the singular point occurs at the origin. Then consider (10.37) at the origin with respect to partial derivatives.)

10.33. Prove that the curve in Exercise 10.29 has a node over $\mathbb{F}_{11}$ by displaying a graph, reduced modulo 11 to display the node. This illustrates Remark 10.14 on page 360, from which we may conclude that this curve has conductor 11.
(Hint: Use Exercise 10.31.)

10.34. Prove that the curve in Example 10.8 on page 357 has conductor 11 by reducing it modulo 11 and graphing its node there using Remark 10.14.

10.35. Prove that if $p > 3$ is prime then the elliptic curve given by $y^2 = x^3 + px^2 + 1$ has good reduction at $p$.

10.36. Prove that the elliptic curve $E = E(\mathbb{F}_p)$ given by $y^2 = x^3 + x^2 + p$ for a prime $p > 3$ has a node.

10.37. Prove that the elliptic curve $E = E(\mathbb{F}_p)$ given by $y^2 = x^3 + p$ for a prime $p > 3$ has a cusp.

10.38. Given elliptic curves $E_j = E_j(\mathbb{C})$ for $j = 1, 2$, an isogeny is defined to be an analytic map $h : E_1 \to E_2$, where the identity gets mapped to the identity. Show that there is an isogeny between the curve given in Exercise 10.28, which we will call $E_2$, and the curve given in Example 10.8, which we will call $E_1$. Two curves $E_1$ and $E_2$ are said to be isogenous if there is a nonconstant isogeny $h$ between them.

10.39. Verify (10.40) on page 362.
(Hint: Use (10.38) on page 362.)

10.40. In Example 10.10, show that $k$ has no odd square factor and verify (10.41) on page 362.
(Hint: Look at the elliptic curve that one gets from the elliptic curve of the form $y^2 = x^3 + a_2x^2 + a_4x + a_6$, satisfying (10.40), by an admissible change of variables with $u = 2$.)

10.41. Prove that the Frey curve has conductor $n = \prod_{p \mid abc} p$, where $a, b, c$ are given in (10.44).
(Hint: First verify that (10.46)–(10.47) on page 364 hold. Then prove that (10.45) is minimal Weierstrass for any prime $p \mid \Delta$. Also, check the odd primes $p$ dividing $ac$ and $b$ separately, as well as $p = 2$. Then find an admissible change of variables at $p = 2$ for (10.45) so that the new equation is global minimal.)



Appendix

Sieve Methods

Work without hope draws nectar in a sieve,
And hope without an object cannot live.

from Work without Hope (1828)
Samuel Taylor Coleridge (1772–1834)
English poet, critic, and philosopher

The purpose of this appendix is to provide an overview of sieve methods used in factoring, recognizing primes, finding natural numbers in arithmetic progressions whose common difference is prime, or generally to estimate the cardinalities of various sets defined by the use of multiplicative properties. Recall that use of a sieve, or sieving, is a process whereby we find numbers via searching up to a prescribed bound and eliminate candidates as we proceed until only the desired solution set remains. In other words, sieve theory is designed to estimate the size of sifted sets of integers. For instance, sieves may be used to attack the following open problems, for which sieve methods have provided some advances.

(a) (The Twin Prime Conjecture)

There are infinitely many primes p such that p + 2 is also prime.

(b) (The Goldbach Conjecture)

Every even integer n > 2 is a sum of two primes.

(c) (The $p = n^2 + 1$ Conjecture)

There are infinitely many primes $p$ of the form $p = n^2 + 1$.

(d) (The q = 4p + 1 Conjecture)

There are infinitely many primes p such that q = 4p + 1 is also prime.

(e) (Artin’s Conjecture)

For any nonsquare integer $a \notin \{-1, 0, 1\}$, there exist infinitely many primes $p$ such that $a$ is a primitive root modulo $p$.

Indeed, in 1986, Heath-Brown [39] used sieving methods to advance the Artin conjecture to within a hair of a solution when he proved that, for a given prime $p$, with the possible exception of at most two primes, there are infinitely many primes $q$ such that $p$ is a primitive root modulo $q$. Thus, sieve methods are important to review for their practical use in number theory and the potential for solutions of outstanding problems such as the above. The methodology used to prove these results could serve as a course in itself, so we have relegated these facts to an appendix without proofs.


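The fundamental goal of sieve theory is to produce upper and lower bounds for cardinalities of sets of the type

\[
S(\mathcal{S}, \mathcal{P}, y) = \{n \in \mathcal{S} : p \mid n \text{ implies } p > y \text{ for all } p \in \mathcal{P}\}, \tag{A.1}
\]

where $\mathcal{S}$ is a finite subset of $\mathbb{N}$, $\mathcal{P}$ is a subset of $\mathbb{P}$, the set of all primes, and $y$ is a positive real number.

Example A.1 Let

\[
\mathcal{S} = \{n \in \mathbb{N} : n \le x\} \quad \text{and} \quad \sqrt{x} < y \le x.
\]

Then

\[
|S(\mathcal{S}, \mathcal{P}, y)| = \left|\{n \le x : p \mid n \text{ implies } p > y\}\right| = \pi(x) - \pi(y) + 1,
\]

one more than the number of primes between $x$ and $y$.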
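Example A.1 can be confirmed by direct enumeration. The sketch below assumes that $\mathcal{P}$ is the set of all primes (the example does not say so explicitly) and uses naive trial division; the function names are illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def prime_count(t):
    """pi(t): the number of primes not exceeding t."""
    return sum(1 for n in range(2, int(t) + 1) if is_prime(n))

def sifted_count(x, y):
    """|S(S, P, y)| for S = {1, ..., x} and P the set of all primes:
    the number of n <= x all of whose prime factors exceed y."""
    small_primes = [p for p in range(2, int(y) + 1) if is_prime(p)]
    return sum(1 for n in range(1, x + 1) if all(n % p for p in small_primes))

x = 100
for y in (11, 15, 50, 100):   # each satisfies sqrt(x) < y <= x
    print(y, sifted_count(x, y), prime_count(x) - prime_count(y) + 1)
```

The extra 1 in the formula accounts for $n = 1$, which has no prime factors at all; every other survivor is a prime in $(y, x]$ because a composite with all prime factors exceeding $\sqrt{x}$ would exceed $x$.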

To illustrate (A.1) more generally, we begin with what has been called “the oldest nontrivial algorithm that has survived to the present day.” From antiquity, we have the Sieve of Eratosthenes, which is covered in a first course in number theory (see [68, Example 1.16, p. 31, Biography 1.6, p. 32]) and which sieves to produce primes to a chosen bound. However, as discussed therein, this sieve is highly inefficient. Indeed, to determine the primes up to a bound $n \in \mathbb{N}$ with this sieve, one must check for divisibility by all primes not exceeding $\sqrt{n}$, so the sieve of Eratosthenes has complexity $O((n \log_e n)(\log_e \log_e n))$, which, even using the world’s fastest computers, is beyond hope for large integers as a method for recognizing primes. Yet there is a formulation of this sieve that fits nicely into the use of arithmetic functions, and has applications as a tool for modern sieves, so we present that here for completeness and interest’s sake.
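For reference, the classical sieve itself can be sketched in a few lines. This is the textbook formulation, not code from the source: each prime $p \le \sqrt{n}$ strikes out its multiples starting at $p^2$.

```python
def eratosthenes(n):
    """Return the list of primes not exceeding n using the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]      # 0 and 1 are not prime
    p = 2
    while p * p <= n:                   # only primes up to sqrt(n) are needed
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]

print(eratosthenes(30))
```

The table of $n + 1$ flags is what makes the method impractical for recognizing a single large prime: memory and time both grow at least linearly in $n$.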

Recall the definition of the Möbius function $\mu(d)$, given by (5.22) on page 214. Also, let $\omega(d)$ denote the number of distinct prime divisors of $d$.

Theorem A.1 Eratosthenes’ Sieve

Suppose that

\[
\mathcal{P} = \{p_1, p_2, \ldots, p_n\} \subseteq \mathbb{P}
\]

is a set of distinct primes and

\[
\mathcal{S} \subseteq \mathbb{N} \quad \text{with } |\mathcal{S}| < \infty.
\]

Denote by $S$ the number of elements of $\mathcal{S}$ not divisible by any of the $p_j$’s and by $S_d$ the number of elements of $\mathcal{S}$ divisible by a given $d \in \mathbb{N}$. Then

\[
S = \sum_{d \mid p_1p_2\cdots p_n} \mu(d)S_d.
\]

Moreover, for $m = 1, 2, \ldots, \lfloor n/2 \rfloor$, we have

\[
\sum_{\substack{d \mid p_1p_2\cdots p_n \\ \omega(d) \le 2m-1}} \mu(d)S_d \;\le\; S \;\le\; \sum_{\substack{d \mid p_1p_2\cdots p_n \\ \omega(d) \le 2m}} \mu(d)S_d, \tag{A.2}
\]

where (A.2) is called Eratosthenes’ sieve.

Proof. See [75, Corollary 2, p. 147]. □
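Both the exact identity and the truncated inequalities (A.2) are easy to test numerically on a small example. The sketch below enumerates the squarefree divisors $d$ of $p_1p_2\cdots p_n$ as products of subsets of the primes, so that $\mu(d) = (-1)^{\omega(d)}$; the function name and the sample data are illustrative.

```python
from itertools import combinations
from math import prod

def eratosthenes_sieve_sums(S, primes):
    """Return (exact, truncated) where exact is the number of elements of S
    divisible by none of the primes, and truncated[r] is the sum of
    mu(d) * S_d over d | p1...pn with omega(d) <= r, for r = 0, ..., n."""
    S = list(S)
    exact = sum(1 for s in S if all(s % p for p in primes))
    truncated = []
    running = 0
    for r in range(len(primes) + 1):
        for combo in combinations(primes, r):
            d = prod(combo)                      # squarefree, omega(d) = r
            S_d = sum(1 for s in S if s % d == 0)
            running += (-1) ** r * S_d           # mu(d) = (-1)^r
        truncated.append(running)
    return exact, truncated

exact, truncated = eratosthenes_sieve_sums(range(1, 201), [2, 3, 5, 7])
print(exact, truncated)
```

Truncating after an odd number of levels under-counts and after an even number over-counts, which is exactly the alternating pattern asserted in (A.2).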

For instance, Theorem A.1 may be used to prove the following result on the number of primes less than a certain bound, first proved in 1919 by the Norwegian mathematician Viggo Brun (1882–1978).

Theorem A.2 Brun’s Theorem

If $n \in \mathbb{N}$ and $A_{2n}(x)$ denotes the number of primes $p \le x$ for which $|p + 2n|$ is also prime, then

\[
A_{2n}(x) = O\!\left(x(\log_e \log_e x)^2 \log_e^{-2} x\right).
\]

Proof. See [75, Theorem 4.3, p. 148]. □

Theorem A.2 has, as a special case, implications for the twin prime conjecture as follows. Recall that the symbol $\ll$ is synonymous with the “big Oh” notation.

Corollary A.1 Brun’s Constant

Let $Q$ be the set of all primes $p$ such that $p + 2$ is also prime. Then

\[
\sum_{\substack{p \in Q \\ p \le x}} 1 \ll \frac{x(\log_e \log_e x)^2}{\log_e^2 x},
\]

and the series

\[
\sum_{p \in Q} \frac{1}{p} = B \tag{A.3}
\]

is convergent, where (A.3) is called Brun’s constant.

Proof. See [75, Corollary, p. 152]. □

Remark A.1 We do not know if $Q$ in Corollary A.1 is finite or not, since its infinitude would be the twin prime conjecture. We do know that the sum of the reciprocals of all primes diverges, but since the series (A.3) converges, it does not provide a proof of the conjecture: we would need divergence to get the infinitude. The behaviour of the two series does tell us that, although the twin prime conjecture may be true, the twin primes must be appreciably less dense than the entire set of primes. Brun’s result, that the sum of the reciprocals of the twin primes converges, is one of the centerpiece achievements of sieve theory.

The value of Brun’s constant is

\[
B \approx 1.9021605824,
\]

with an error within $\pm 0.000000003$, computed by Thomas R. Nicely in 1999. It is worth noting the now famous fact that, in 1995, Nicely was doing computations on Brun’s constant which led him to discover a flaw in the floating-point arithmetic of the Pentium computer chip, costing literally millions of dollars to its manufacturer Intel; see http://www.trnicely.net/twins/twins2.html.
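Partial sums give a feel for how slowly this series converges. The sketch below assumes the conventional normalization of Brun’s constant, in which both members of each twin pair contribute, that is, $\sum (1/p + 1/(p+2))$ over pairs $(p, p+2)$; this is the sum to which the quoted numerical value refers. The function name is illustrative.

```python
from math import isqrt

def twin_prime_reciprocal_sum(limit):
    """Sum of 1/p + 1/(p + 2) over twin prime pairs (p, p + 2) with p + 2 <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = 0
    return sum(1 / p + 1 / (p + 2)
               for p in range(3, limit - 1)
               if sieve[p] and sieve[p + 2])

for limit in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
    print(limit, twin_prime_reciprocal_sum(limit))
```

Even at $10^5$ the partial sum is still well short of the limiting value, which is why high-precision estimates of $B$ rely on extrapolation rather than direct summation.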

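Theorem A.1 on page 370 tells us that the sieve of Eratosthenes investigates the function

\[
|S(\mathcal{S}, \mathcal{P}, x)| = \sum_{\substack{n \in \mathcal{S} \\ \gcd(n, \Pi) = 1}} 1, \quad \text{where } \Pi = \prod_{\substack{p \in \mathcal{P} \\ p < x}} p,
\]

via the equality

\[
|S(\mathcal{S}, \mathcal{P}, x)| = \sum_{n \in \mathcal{S}} \sum_{\substack{d \mid n \\ d \mid \Pi}} \mu(d) = \sum_{d \mid \Pi} \mu(d)S_d.
\]

The general basic sieve problem emanates from this, namely find arithmetic functions $\lambda_\ell(d) : \mathbb{N} \to \mathbb{R}$ and $\lambda_u(d) : \mathbb{N} \to \mathbb{R}$ with

\[
\sum_{\substack{d \mid n \\ d \mid \Pi}} \lambda_\ell(d) \le
\begin{cases}
1 & \text{if } \gcd(n, \Pi) = 1, \\
0 & \text{if } \gcd(n, \Pi) > 1,
\end{cases}
\]

and

\[
\sum_{\substack{d \mid n \\ d \mid \Pi}} \lambda_u(d) \ge
\begin{cases}
1 & \text{if } \gcd(n, \Pi) = 1, \\
0 & \text{if } \gcd(n, \Pi) > 1,
\end{cases}
\]

such that

\[
\sum_{d \mid \Pi} \lambda_\ell(d)S_d = \sum_{n \in \mathcal{S}} \sum_{\substack{d \mid n \\ d \mid \Pi}} \lambda_\ell(d) \le |S(\mathcal{S}, \mathcal{P}, x)| \le \sum_{n \in \mathcal{S}} \sum_{\substack{d \mid n \\ d \mid \Pi}} \lambda_u(d) = \sum_{d \mid \Pi} \lambda_u(d)S_d. \tag{A.4}
\]

Now we interpret the above in terms of what Selberg did to create his famous sieve and how Theorem A.1 comes into play; see [68, Biography 1.21, p. 67]. With the notation of Theorem A.1 still in force, we add that $P$ denotes the product of the primes in $\mathcal{P}$, $|\mathcal{S}| = N$, and call the following Selberg’s condition on $\mathcal{S}$.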

There exists a multiplicative function $f(d)$ such that if $d \mid P$, then

\[
S_d = \frac{f(d)}{d}N + R(d), \tag{A.5}
\]

where $|R(d)| \le f(d)$ and $d > f(d) > 1$. With the Selberg condition plugged into the right-hand side of (A.4), we have

\[
|S(\mathcal{S}, \mathcal{P}, x)| \le \sum_{d \mid \Pi} \frac{\lambda_u(d)f(d)N}{d} + \sum_{d \mid \Pi} \lambda_u(d)R(d)
= N\sum_{d \mid \Pi} \frac{\lambda_u(d)f(d)}{d} + O\!\left(\sum_{d \mid \Pi} |\lambda_u(d)R(d)|\right). \tag{A.6}
\]

Selberg’s sieve arose from his attempts to minimize (A.6) subject to Selberg’s condition (A.5). Theorem A.1 on page 370 comes into play again in that it is used in the proof of the following, first proved by Selberg [85] in 1947. The following is considered to be the fundamental theorem concerning Selberg’s sieve, which, for the above-cited reasons, is often called Selberg’s upper bound sieve.

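Theorem A.3 Selberg’s Sieve

Let $\mathcal{P}$ be a finite set of primes, $P$ denoting their product, $\mathcal{S} \subseteq \mathbb{N}$ with $|\mathcal{S}| = N \in \mathbb{N}$, where the elements of $\mathcal{S}$ satisfy Selberg’s condition (A.5), and let

\[
S = |S(\mathcal{S}, \mathcal{P}, x)|
\]

be the number of elements of $\mathcal{S}$ not divisible by primes $p \in \mathcal{P}$ with $p \le x$, where $x > 1$. If for $p \mid P$, we have that $f(p) > 1$,

\[
g(n) = \sum_{d \mid n} \frac{\mu(n/d)\,d}{f(d)},
\]

and

\[
Q_x = \sum_{\substack{d \mid P \\ d \le x}} g^{-1}(d),
\]

then

\[
S \le \frac{N}{Q_x} + x^2 \prod_{\substack{p \in \mathcal{P} \\ p \le x}} \left(1 - \frac{f(p)}{p}\right)^{-2}.
\]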


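An application of Theorem A.3 is the following, where $\pi(x; k, \ell)$ denotes the number of primes $p \le x$ such that $p \equiv \ell \pmod{k}$. In the notation of Theorem A.3, we have that

\[
\mathcal{P} = \{p \in \mathbb{P} : p \nmid k \text{ and } p \le \sqrt{x}\}.
\]

Also,

\[
\mathcal{S} = \{y = kn + \ell : n \in \mathbb{N} \text{ and } y \le x\}.
\]

Then $N = \lfloor x/k \rfloor$,

\[
|S(\mathcal{S}, \mathcal{P}, x)| = \pi(x; k, \ell) - \pi(\sqrt{x}; k, \ell) = \pi(x; k, \ell) + O(\sqrt{x}).
\]

It follows that $f(d) = 1$, $S_d = \lfloor N/d \rfloor + R_d$ with $|R_d| \le 1$, $g(n) = \varphi(n)$, and $Q_x = \sum_{d \mid P,\ d \le x} \varphi^{-1}(d)$.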

Theorem A.4 The Brun–Titchmarsh Theorem

There exists a $C = C(\varepsilon) \in \mathbb{R}^+$ such that for $1 \le k < x$ and $\gcd(k, \ell) = 1$, we have

\[
\pi(x; k, \ell) \le \frac{Cx}{\varphi(k)\log_e(x/k)}.
\]

Proof. See [75, Corollary, p. 161] and see Biography A.2 on page 376. □

Remark A.2 Theorem A.4 is known to hold with the constant $C = 2$. Moreover, if $1 \le k \le x^{1-\varepsilon}$ for $\varepsilon > 0$, then the upper bound is at the expected order of magnitude.

Another interpretation of Theorem A.4 is that if $x, y$ are positive reals, and $k, \ell \in \mathbb{Z}$ with $y/k \to \infty$, then

\[
\pi(x + y, k, \ell) - \pi(x, k, \ell) < \frac{(2 + o(1))\,y}{\varphi(k)\log_e(y/k)}.
\]

Yet another formulation is given as follows. There exists an effective constant $k_0(\varepsilon)$ such that, for $k > k_0(\varepsilon)$,

\[
\pi(x + ky, k, \ell) - \pi(x, k, \ell) < \frac{(2 + \varepsilon)\,y}{\varphi(k)\log_e y},
\]

for all $y, x, \ell$ with $y > k$. The amazing aspect of Brun–Titchmarsh is that if we could replace 2 by $2 - \delta$ for any $\delta > 0$, then Landau–Siegel zeros cannot exist; see page 300.
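The bound with constant 2 can be observed numerically for small moduli. The following sketch counts primes in each reduced residue class modulo $k$ and compares the counts with $2x/(\varphi(k)\log_e(x/k))$; it illustrates the inequality on a finite range and proves nothing. All names are illustrative.

```python
from math import gcd, isqrt, log

def primes_up_to(n):
    """List of primes not exceeding n (simple sieve)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            for multiple in range(p * p, n + 1, p):
                sieve[multiple] = 0
    return [i for i in range(n + 1) if sieve[i]]

def euler_phi(k):
    """Euler's totient, computed directly from the definition."""
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)

def brun_titchmarsh_holds(x, k):
    """Check pi(x; k, l) <= 2x / (phi(k) log(x/k)) for every l coprime to k (requires k < x)."""
    counts = [0] * k
    for p in primes_up_to(x):
        counts[p % k] += 1
    bound = 2 * x / (euler_phi(k) * log(x / k))
    return all(counts[l % k] <= bound for l in range(1, k + 1) if gcd(l, k) == 1)

print(all(brun_titchmarsh_holds(10 ** 4, k) for k in range(1, 30)))
```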

Selberg’s sieve also has applications to some other classical problems. For instance, the twin-prime conjecture may be interpreted as follows. Suppose that $f(d)$ represents the number of elements of

\[
\{n(n + 2) : d \mid n(n + 2) \text{ where } 1 \le n \le d\}
\]

which are divisible by $d$, and for some $m \in \mathbb{N}$,

\[
\mathcal{S} = \{j(j + 2) : j = m, m + 1, \ldots, m + N - 1\}.
\]

Let $\pi_2(N)$ be the number of twin primes less than $N$, from which it follows that

\[
\pi_2(N) \le |S(\mathcal{S}, \mathcal{P}, N^{1/3})| + N^{1/3}
\]

because if $p \le N$ has a twin prime, then either $p \le N^{1/3}$ or else $p(p + 2)$ has no prime factor $\le N^{1/3}$. Thus, using Selberg’s sieve to estimate $|S(\mathcal{S}, \mathcal{P}, N^{1/3})|$, we have $f(2) = 1$ and $f(p) = 2$ for odd primes $p$. We claim that

\[
\prod_{p \le N^{1/3}} \left(1 - \frac{f(p)}{p}\right)^{-1} \ll (\log_e N)^2.
\]

This follows from the fact that for $p > 3$,

\[
\left(1 - \frac{2}{p}\right)^{-1} \le \left(1 - \frac{1}{p}\right)^{-2}\left(1 - \frac{2}{p^2}\right)^{-1}
\]

and the fact that

\[
\prod_{p \le N^{1/3}} \left(1 - \frac{1}{p}\right)^{-1} \ll \log_e N^{1/3},
\]

which, in turn, follows from Mertens’ Theorem 5.12 on page 222, keeping in mind that $\prod_{p \le N^{1/3}} (1 - 2p^{-2})^{-1}$ converges. One may also deduce a lower bound as follows,

\[
\sum_{\substack{d \le N^{1/3} \\ d \text{ odd}}} \frac{f(d)}{d} \gg (\log_e N)^2.
\]

Putting this all together via Theorem A.3 on page 373, we get the following.

Theorem A.5 Selberg’s Sieve on Twin Primes

The number $\pi_2(N)$ of twin primes less than $N$ satisfies

\[
\pi_2(N) \ll \frac{N}{(\log_e N)^2}.
\]

Remark A.3 With the above application of Selberg’s sieve, it is certainly worth mentioning another highlight of sieve theory with respect to the twin-prime conjecture, namely Chen’s Theorem, which shows that there are infinitely many primes $p$ such that $p + 2$ is either prime or a product of two primes. Again, sieve methods allowed a result that is within a hair of the affirmation of another classical conjecture.

Another of the list of conjectures from our discussion at the outset is the Goldbach conjecture. Now we look at applications of Selberg’s sieve to this classical problem. To this end, let $N = 2m$ for $m \in \mathbb{N}$, and for some $k \in \mathbb{N}$,

\[
\mathcal{S} = \{j(N - j) : j = k, k + 1, \ldots, k + N - 1\},
\]

and let $r(N)$ be the number of representations of $N$ as a sum of two primes. Also, $f(d)$ is the number of elements of

\[
\{n(N - n) : n = 1, 2, \ldots, d\}
\]

divisible by $d$. It follows that

\[
r(N) \le |S(\mathcal{S}, \mathcal{P}, N^{1/3})| + 2N^{1/3}.
\]

Thus, $f(p) = 2$ if $p \nmid N$ and $f(p) = 1$ if $p \mid N$. Applying Theorem A.3, and arguing in a similar fashion to the above, we get the following, a complete proof of which may be found in [75, Theorem 4.6, p. 162].

Theorem A.6 Selberg’s Sieve on the Goldbach Conjecture

For $N \in \mathbb{N}$,

\[
r(N) \ll \frac{N}{(\log_e N)^2} \prod_{p \mid N} \left(1 + \frac{2}{p}\right).
\]

Biography A.2 Edward Charles Titchmarsh (1899–1963) was born in Newbury, Berkshire, England on June 1, 1899. At the early age of seventeen, he won an Open Mathematical Scholarship to Balliol College, Oxford. In October of 1917, he began his studies at that college. When he turned eighteen, he was inducted into the service in World War I, becoming a dispatch rider in France. He served until after the war, and returned to his studies at Oxford in October of 1919. While there he was taught by G.H. Hardy, who had a profound influence on Titchmarsh, including their shared passion for cricket. He graduated in 1922, and, in the following year, won the Prize Fellowship at Magdalen College, Oxford. He also held a Senior Lecturer position at University College in London at the same time. Eventually, he was appointed to succeed Hardy for the Savilian chair at Oxford when Hardy left for Cambridge. All of Titchmarsh’s work was in analysis, including work on the Riemann ζ-function. Arguably, his most important, and certainly his most popular, book was published in 1932, The Theory of Functions. His work had influence on diverse areas including quantum mechanics, via his work on series expansions of eigenfunctions of differential equations. Indeed, the latter topic consumed a quarter century of his professional life. He published a significant amount of that work in Eigenfunction expansions associated with second-order differential equations in the late 1940s and 1950s. Among the honours received in his lifetime were: election to the Royal Society in 1931, being awarded the De Morgan Medal in 1953, winning the Sylvester Medal in 1955, and although he did not formally study to receive a doctorate, he was awarded an honorary one by the University of Sheffield in 1953. He died in Oxford, Oxfordshire on January 18, 1963.

We have amply illustrated the applications of Selberg’s sieve to a variety of classical problems. It is now time to look at other sieves and their contributions. One of these is due to Linnik [52], first produced in 1941; see Biography A.6 on page 384. To understand what it says, we provide a preamble that takes into account what we have learned thus far. Brun’s result, Theorem A.2 on page 371, may be interpreted as a generalization of Eratosthenes’ sieve as follows. Take $1, 2, \ldots, n$ and, for each prime $p \le \sqrt{n}$, eliminate $k$ residue classes modulo $p$. Then the number remaining does not exceed $C(k)\,n/(\log_e^k n)$, where $C(k) > 0$ depends on $k$. Linnik considered a more general situation by considering each prime $p \le \sqrt{n}$ and eliminating $f(p)$ classes modulo $p$, where $f(p)$ gets large as $p$ does. Linnik called this the large sieve. This is formalized in terms of the notation we have developed herein as follows.

Theorem A.7 The Large Sieve Inequality

Suppose that $N \in \mathbb{N}$ and for every prime $p \le \sqrt{N}$, let $f(p)$ residue classes modulo $p$ be given, where $0 \le f(p) < p$. If $I_N$ is any interval of natural numbers of length $N$, then in $I_N$ there are at most

\[
\frac{(1 + \pi)N}{\displaystyle\sum_{p \le \sqrt{N}} f(p)/(p - f(p))}
\]

integers not lying in any of the given residue classes.

Proof. See [75, Corollary 2, p. 170]. □

The large sieve can be applied to Artin’s conjecture, one of the classical problems from our list at the outset. From the large sieve Theorem A.7, we have the following.

Theorem A.8 The Large Sieve on Artin’s Conjecture

Let $I_N$ be an interval of natural numbers of length $N \in \mathbb{N}$ and let

\[
C(N) = \left|\left\{n \in I_N : n \text{ is not a primitive root modulo } p \text{ for any prime } p \le \sqrt{N}\right\}\right|.
\]

Then

\[
C(N) \ll \sqrt{N}\log_e(N).
\]

Corollary A.2 Almost every $n \in \mathbb{N}$ is a primitive root for some prime.

Using the large sieve, Bombieri [6] and Vinogradov [98] independently found a result on the distribution of primes in arithmetic progressions that is quite pleasant; see Biography A.3 on page 380. In the next result, we use the following. The (basic) Mangoldt function is given by

$\Lambda(n) = \log_e p$ if $n = p^a$ for some prime $p$ and $a \in \mathbb{N}$, and $\Lambda(n) = 0$ otherwise.

In the


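Theorem A.9 The Bombieri–Vinogradov Theorem

For any real number $A > 0$, there is a constant $B = B(A)$ such that, for $Q = \sqrt{x}(\log_e x)^{-B}$,

\[
\sum_{q \le Q} \max_{y \le x} \max_{a \in (\mathbb{Z}/q\mathbb{Z})^*} \left|\psi(y; q, a) - \frac{y}{\varphi(q)}\right| \ll \frac{x}{(\log_e x)^A}, \tag{A.7}
\]

where

\[
\psi(x; q, a) = \sum_{\substack{n \le x \\ n \equiv a \pmod{q}}} \Lambda(n).
\]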

In keeping with the above, we now show how some classical problems can be tackled with Theorem A.9. If $\tau(x)$ is the number of divisors function, and $n \in \mathbb{N}$ is fixed, then the Titchmarsh divisor problem is to compute the order of the function

\[
S(x) = \sum_{p \le x} \tau(p + n)
\]

(see page 208). Theorem A.9 can be applied to this problem to get the following; see [75, Theorem 5.11, p. 202] for a related result.

Theorem A.10 Bombieri–Vinogradov Applied to Titchmarsh

For any $n \in \mathbb{N}$, there exists a constant $c \in \mathbb{R}^+$ such that

\[
S(x) = cx + O\!\left(\frac{x \log_e \log_e x}{\log_e x}\right).
\]

This establishes more than that proved by Titchmarsh [96] in 1930, wherein he showed that $S(x) = O(x)$.

Bombieri also provided a sieve, essentially generalizing the Selberg sieve, that was highly useful in establishing another highlight of sieve theory. To describe this and the application, we need the following notions. If (A.7) holds for any $A > 0$ and any $\varepsilon > 0$ with $Q = x^{\theta - \varepsilon}$, then we say the primes have level of distribution $\theta$. Thus, according to Theorem A.9, the primes are known to have level of distribution $\theta = 1/2$. The Elliott–Halberstam conjecture says the primes have level of distribution $\theta = 1$. This remains open.

The generalized Mangoldt function is given by

\[
\Lambda_k(n) = \sum_{d \mid n} \mu(d)\log_e^k(n/d).
\]
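A small numerical experiment makes the definition concrete. The sketch below checks two standard facts that follow from the definition but are not spelled out in the text: $\Lambda_1$ coincides with the basic Mangoldt function $\Lambda$, and $\Lambda_2(n)$ vanishes when $n$ has three or more distinct prime factors. The helper names are illustrative.

```python
from math import log, isclose

def prime_factors(n):
    """Distinct prime factors of n together with a flag telling whether n is squarefree."""
    factors, squarefree, p = [], True, 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            n //= p
            if n % p == 0:
                squarefree = False
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors, squarefree

def mobius(n):
    """Moebius function mu(n)."""
    factors, squarefree = prime_factors(n)
    return (-1) ** len(factors) if squarefree else 0

def mangoldt(n):
    """Basic Mangoldt function: log p if n is a power of the prime p, else 0."""
    factors, _ = prime_factors(n)
    return log(factors[0]) if len(factors) == 1 else 0.0

def generalized_mangoldt(n, k):
    """Lambda_k(n) = sum over d | n of mu(d) * log(n/d)^k."""
    return sum(mobius(d) * log(n // d) ** k for d in range(1, n + 1) if n % d == 0)

print(generalized_mangoldt(6, 2), 2 * log(2) * log(3))   # these two values agree
print(generalized_mangoldt(30, 2))                       # numerically zero
```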

Now let $\{a_n\}_{n=1}^{\infty}$ be a sequence of positive real numbers,

\[
A(x) = \sum_{n \le x} a_n, \quad \text{and} \quad H = \prod_p (1 - f(p))(1 - 1/p)^{-1},
\]

for a multiplicative function $f$. Then the following, proved by Bombieri in 1976 (see [8]) under the assumption of the validity of the Elliott–Halberstam conjecture, is called the asymptotic sieve, where $k \ge 2$:

\[
\sum_{n \le x} a_n\Lambda_k(n) \sim kHA(x)(\log_e x)^{k-1}. \tag{A.8}
\]

The case $k = 2$ and $a_n = 1$ for all $n$ is essentially Selberg’s sieve.

The most striking application to date of (A.8) was achieved by Friedlander and Iwaniec in 1998 (see [29]–[30]) when they proved the following; see Biographies A.4 on page 381 and A.5 on page 382.

Theorem A.11 The Friedlander–Iwaniec Theorem

There are infinitely many primes of the form $a^2 + b^4$.
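To see how thin this set of primes is, one can simply list its small members. The sketch below enumerates the primes $a^2 + b^4$ with positive integers $a, b$ up to a bound; the function names are illustrative and the primality test is naive trial division.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def primes_a2_plus_b4(limit):
    """Sorted list of primes p <= limit of the form a^2 + b^4 with a, b >= 1."""
    found = set()
    b = 1
    while b ** 4 < limit:
        a = 1
        while a * a + b ** 4 <= limit:
            n = a * a + b ** 4
            if is_prime(n):
                found.add(n)
            a += 1
        b += 1
    return sorted(found)

print(primes_a2_plus_b4(100))
```

There are on the order of $x^{3/4}$ integers of this form up to $x$, so the sequence being sifted is far sparser than the integers themselves, which is what makes the theorem striking.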

We have covered an overview of some of the successes of sieve methods, but there are weaknesses. In particular, sieve methods cannot, in general, distinguish between numbers with an even number of prime factors and an odd number of prime factors, which is called the parity problem. Bombieri’s sieve clarified some of this issue in [7]–[8], by showing that his sieve implies an asymptotic formula for

\[
\sum_{n \le x} a_nF(n)
\]

precisely when a function $F$ provides what is called equal weight to integers with an even number of prime factors and those with an odd number of prime factors. It turns out that the generalized Mangoldt functions have exactly this property for $k > 1$. Of course, the parity problem remains, but the above strides and applications are indicative of the power of sieve methods.

It is worth pointing out, before we turn to another topic, that the Elliott–Halberstam conjecture implies some fascinating recent results for gaps between primes as well as implications for the twin-prime conjecture. These were found by Goldston, Pintz, and Yildirim in 2005; see [31]–[33]. For the following statement recall that the infimum of a set $S$ is the greatest lower bound of $S$ and is denoted $\inf(S)$. Also, the limit inferior, denoted by $\liminf$, is given by

\[
\liminf_{n \to \infty} a_n = \lim_{n \to \infty}\left(\inf_{m \ge n} a_m\right)
\]

for a sequence $\{a_n\}$.

The first result is unconditional.

Theorem A.12 Unconditional Goldston–Pintz–Yildirim

If $p_n$ denotes the $n$-th prime, then

\[
\liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\sqrt{\log_e p_n}\,(\log_e \log_e p_n)^2} < \infty.
\]

Also, if $\{a_n\}$ is a sequence of natural numbers satisfying

\[
|\{a_n : n \le N\}| > C(\log_e N)^{1/2}(\log_e \log_e N)^2
\]

for all sufficiently large $N$, then infinitely many of the differences of two elements of $\{a_n\}$ can be expressed as the difference of two primes.

The following is the conditional result.

Theorem A.13 The Conditional Goldston–Pintz–Yildirim Theorem

If the Elliott–Halberstam conjecture is true, then

\[
\liminf_{n \to \infty} (p_{n+1} - p_n) \le 16.
\]

Remark A.4 It is worth noting that, in joint work with S. Graham, Goldston, Pintz, and Yildirim proved that if $q_n$ is the $n$-th natural number with exactly two prime factors, then under the assumption of a generalized Elliott–Halberstam conjecture:

\[
\liminf_{n \to \infty} (q_{n+1} - q_n) \le 6
\]

(see http://www.math.boun.edu.tr/instructors/yildirim/yildirim.htm).

Biography A.3 Enrico Bombieri (1940–) was born in Milan, Italy on November 26, 1940. He achieved his doctoral degree at the University of Milan in 1963. In 1966, he was appointed to a chair in mathematics at the University of Pisa. He also taught at the Scuola Normale Superiore at Pisa. He was awarded the Fields Medal in Vancouver in 1974 for his work on the theory of functions of several complex variables, the study of primes, partial differential equations, and minimal surfaces. Bombieri’s large sieve methods improved upon the methods of Rényi, who had in turn extended the sieve method developed by Linnik; see Biography A.6 on page 384. Theorems A.8–A.10 are a few examples of the applicability of Bombieri’s large sieve method. In 1980, Bombieri was awarded the Balzan International Prize, and in 1984, he was elected as a foreign member of the French Academy of Sciences. He is also a foreign member of the Royal Swedish Academy, and the Academia Europaea. In 1996, he was elected to be a member of the National Academy of Sciences. He currently works in the U.S.A. as the IBM Von Neumann Professor of Mathematics at the Institute for Advanced Study at Princeton, New Jersey, where he has been since 1977.

Theorems A.12–A.13 are outcroppings of results on sieve methods that began with Selberg's sieve, which has been supplanted by other methods. Selberg's sieve applies to twin primes as we saw in Theorem A.5 on page 375. In 1997, Heath-Brown generalized Selberg's application to the problem of almost primes, which are natural numbers that are either prime or a product of two primes. The authors of Theorems A.12–A.13 used Heath-Brown's argument in ways that theretofore had not been applied to primes themselves and achieved these spectacular results. The details of their method are described at the end of the paper [33].

Biography A.4 John Friedlander (1941–) is a Canadian mathematician at the University of Toronto, who specializes in analytic number theory. In particular, he is considered to be a world leader in the theory of primes and Dirichlet L-functions. In 1965, he received his B.Sc. from the University of Toronto, his M.Sc. in 1966 from the University of Waterloo, and his Ph.D. in 1972 from Pennsylvania State University. He was a lecturer at M.I.T. from 1974 to 1976 and has been at the University of Toronto since 1977. He served as chair in the mathematics department from 1987 to 1991. He spent many years at the Institute for Advanced Study at Princeton and has collaborated with Bombieri among others (see Biography A.3 on the preceding page). He was elected as a member of the Royal Society in 1988. In 1997, he collaborated with Iwaniec to prove Theorem A.11 on page 379 using Bombieri's asymptotic sieve (see Biography A.5 on the next page). He has received the CRM-Fields Prize recognizing his achievements. In 1999, he was invited to give the Jeffery-Williams lecture to recognize his leadership in Canadian mathematics.

We now turn to a powerful sieve that is used to great success in factoring. The following is adapted from [64].

In 1988, John Pollard circulated a manuscript that contained the outline of a new algorithm for factoring integers, which we studied in §2.3. In 1990, the first practical version of Pollard's algorithm was given in [51], published in 1993, the authors of which dubbed it the number field sieve. Pollard had been motivated by a discrete logarithm algorithm given in 1986 by the authors of [17], which employed quadratic fields. Pollard looked at the more general scenario by outlining an idea for factoring certain large integers using number fields. The special numbers that he considered are those large composite natural numbers that are “close” to being powers, namely those $n \in \mathbb{N}$ of the form $n = r^t - s$ for small natural numbers $r$ and $|s|$, and a possibly much larger natural number $t$. Examples of such numbers, which the number field sieve had some success factoring, may be found in tables of numbers of the form

$$n = r^t \pm 1,$$

called Cunningham numbers.

However, the most noteworthy success was the factorization of the ninth Fermat number $F_9 = 2^{2^9} + 1 = 2^{512} + 1$ (having 155 decimal digits), by the Lenstra brothers, Manasse, and Pollard in 1990, the publication of which appeared in 1993 (see [50]).

To review some of the recent history preceding the number field sieve, we observe the following. Prior to 1970, a 25-digit integer was considered difficult to factor. In 1970, the power of the continued fraction method raised this to 50 digits (see [68, §5.4, pp. 240–242]). Once the algorithm was up and running in 1970, legions of 20- to 45-digit numbers were factored that could not be factored before. The first major success was the factorization of the seventh Fermat number

$$F_7 = 2^{2^7} + 1 = 2^{128} + 1,$$

a 39-digit number, which we described via Pollard's method in §2.3. By the mid 1980s, the quadratic sieve algorithm was felling 100-digit numbers. With the dawn of the number field sieve, 150-digit integers were now being tackled. The number field sieve is considered to be asymptotically faster than any known algorithm for the special class of integers of the above special form to which it applies. Furthermore, the number field sieve can be made to work for arbitrary integers. For details, see [13], where the authors refer to the number field sieve for the special number $n = r^t - s$ as the special number field sieve. The more general sieve has come to be known as the general number field sieve.

Biography A.5 Henryk Iwaniec (1947–) is a Polish-American mathematician, who was born on October 9, 1947 in Elblag, Poland. He obtained his doctorate from the University of Warsaw in 1972 under the direction of Andrzej Schinzel. He was employed at the Institute of Mathematics of the Polish Academy of Sciences until 1983, when he left Poland for the U.S.A. He held visiting positions at the Institute for Advanced Study, the University of Michigan, and the University of Colorado at Boulder. Then he went to Rutgers University, where he has been a professor since 1987. In 1997, he and John Friedlander proved Theorem A.11 on page 379 using Bombieri's asymptotic sieve (see Biographies A.3 on page 380 and A.4 on page 381). For this he was awarded the Ostrowski Prize in 2001, where the citation mentioned his “profound understanding of the difficulties of the problem.” In 2002 he was awarded the fourteenth Frank Nelson Cole Prize in number theory. He has contributed many results to analytic number theory, but in particular to modular forms on the general linear group and to sieve methods.

Much older than any of the aforementioned ideas for factoring is that attributed to Fermat, namely the writing of $n$ as a difference of two squares. However, this idea was enhanced by Maurice Kraitchik in the 1920s. Kraitchik reasoned that it might suffice to find a multiple of $n$ as a difference of squares, namely,

$$x^2 \equiv y^2 \pmod{n}, \tag{A.9}$$

so that one of $x - y$ or $x + y$ could be divisible by a factor of $n$. We say could here since we fail to get a nontrivial factor of $n$ when $x \equiv \pm y \pmod{n}$. However, it can be shown that if $n$ is divisible by at least two distinct odd primes, then for at least half of the pairs $x$ (modulo $n$) and $y$ (modulo $n$) satisfying (A.9) with $\gcd(x, y) = 1$, we will have

$$1 < \gcd(x - y, n) < n.$$
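Kraitchik's congruence can be illustrated with a small brute-force search. The following is only a minimal sketch under simplifying assumptions: the helper names `kraitchik_pairs` and `split` are hypothetical, and the search simply tries successive $x$ until $x^2 \bmod n$ happens to be a perfect square, which is far cruder than the sieving methods discussed in this section.

```python
from math import gcd, isqrt

def kraitchik_pairs(n, limit):
    """Yield (x, y) with x^2 = y^2 (mod n) and x != +-y (mod n), by brute force."""
    for x in range(isqrt(n) + 1, limit):
        r = (x * x) % n
        y = isqrt(r)
        # Keep x only if x^2 mod n is a perfect square and the pair is not trivial.
        if y * y == r and (x - y) % n != 0 and (x + y) % n != 0:
            yield x, y

def split(n, limit=10_000):
    """Return a nontrivial factorization (g, n // g) of n, or None if none is found."""
    for x, y in kraitchik_pairs(n, limit):
        g = gcd(x - y, n)
        if 1 < g < n:
            return g, n // g
    return None

print(split(8051))  # 90^2 - 8051 = 7^2, so gcd(90 - 7, 8051) = 83
```

For $n = 8051$ the first hit is $x = 90$, $y = 7$, which splits $n$ as $83 \cdot 97$.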

This classical idea of Kraitchik had seeds in the work of Gauss, but Kraitchik introduced it into a new century in the pre-dawn of the computer age. This idea is currently exploited by many algorithms via construction of these $(x, y)$-pairs. For instance, the aforementioned continued fraction and quadratic sieve algorithms use it. More recently, the number field sieve exploits the idea. To see how this is done, we give a brief overview of the methodology of the number field sieve. This will motivate the formal description of the algorithm.

For $n = r^t - s$, as above, we wish to choose a number field of degree $d$ over $\mathbb{Q}$. The following choice for $d$ is made for reasons (which we will not discuss here) that make it the optimal selection, at least theoretically. (The interested reader may consult [51, Sections 6.2–6.3, pp. 31–32] for the complexity analysis and reasoning behind these choices.) Set

$$d = \left(\frac{(3 + o(1))\log n}{2\log\log n}\right)^{1/3}. \tag{A.10}$$

Now select $k \in \mathbb{N}$, which is minimal with respect to $kd \ge t$. Therefore,

$$r^{kd} \equiv s r^{kd-t} \pmod{n}.$$

Set

$$m = r^k, \quad\text{and}\quad c = s r^{kd-t}. \tag{A.11}$$

Then

$$m^d \equiv c \pmod{n}.$$

Set

$$f(x) = x^d - c,$$

and let $\alpha \in \mathbb{C}$ be a root of $f$. Then this leads to a choice of a number field, namely $F = \mathbb{Q}(\alpha)$. Although the number field sieve can be made to work when $\mathbb{Z}[\alpha]$ is not a UFD, the assumption that it is a UFD simplifies matters greatly in the exposition of the algorithm, so we will make this assumption. Note that once made, this assumption implies that

$$\mathfrak{O}_F = \mathbb{Z}[\alpha].$$

See [51] for a description of the modifications necessary when it is not a UFD.

Now the question of the irreducibility of $f$ arises. If $f$ is reducible over $\mathbb{Z}$, we are indeed lucky, since then

$$f(x) = g(x)h(x), \text{ with } g(x), h(x) \in \mathbb{Z}[x],$$

where $0 < \deg(g) < \deg(f)$. Therefore,

$$f(m) = n = g(m)h(m)$$

is a nontrivial factorization of $n$, and we are done. Use of the number field sieve is unnecessary. However, the probability is high that $f$ is irreducible since most primitive polynomials over $\mathbb{Z}$ are irreducible. Hence, for the description of the number field sieve, we may assume that $f$ is irreducible over $\mathbb{Z}$.

Biography A.6 Yuri Vladimirovich Linnik (1915–1972) was born in Belaya Tserkov, Ukraine on January 21, 1915. His university studies began in 1932 when he entered Leningrad University, from which he graduated in 1938. He began studying for his doctorate under the guidance of Vladimir Tartakovski, and produced a thesis on quadratic forms that earned him the higher degree of D.Sc. in Mathematics and Physics. In April of 1940, the Leningrad branch of Steklov Institute for Mathematics was formed and Linnik began working there from the outset. At this time the German army was approaching Leningrad and Linnik was involved in the fighting in Kazan. When the siege of Leningrad ended in 1944, he returned to the Steklov Institute. He was also appointed as professor of mathematics at Leningrad State University, and he stayed in Leningrad for the rest of his life, working on number theory, probability, and statistics. One of his contributions to the analytic theory of quadratic forms was to introduce ergodic methods into its study. In 1941, he published a paper [52] which introduced his large sieve. He used this term to describe the method of eliminating some residue classes modulo a prime from a given set of integers where the number of classes (possibly) increased when the prime increased. He was motivated to create his sieve in order to tackle Vinogradov's hypothesis, which postulated that the size $n_p$ of the smallest quadratic nonresidue modulo a given prime $p$ is $O(p^{\varepsilon})$ for any $\varepsilon > 0$. He was able to use his sieve to show that the number of primes $p < x$ for which $n_p > p^{\varepsilon}$ is $O(\log_e\log_e x)$. Linnik's results using his sieve naturally led him to study Dirichlet L-functions, where he generalized density theorems to them. His interest in probability theory also led him to introduce the dispersion method into number theory. In 1959, he used his method to prove that any sufficiently large integer can be represented as the sum of a prime and two squares of integers (see [53]). He also solved problems in statistics and applied his methods to number-theoretic problems. He was highly talented outside of mathematics as well, speaking seven languages fluently and having interests in poetry and history. Among his honours were: election to the International Statistical Institute, the Academy of Sciences of the USSR in 1964, being awarded the State Prize in 1947, and the Lenin Prize in 1970. He was also awarded an honorary doctorate from the University of Paris. He died on June 30, 1972 in Leningrad, now St. Petersburg, Russia.

Biography A.7 Maurice Kraitchik (1882–1957) was born on April 21, 1882 in Minsk, capital of the former Belorussian Soviet Socialist Republic. From 1915 to 1948, he was an engineer in Brussels, Belgium and also held a directorship at the Mathematical Sciences section of the Mathematics Institute of Advanced Studies there. From 1941 to 1946, he was Associate Professor at the New School for Social Research in New York. He died on August 19, 1957 in Brussels.

Since $f(m) \equiv 0 \pmod{n}$, we may define the natural homomorphism,

$$\phi : \mathbb{Z}[\alpha] \to \mathbb{Z}/n\mathbb{Z},$$

given by

$$\alpha \mapsto m \in \mathbb{Z}/n\mathbb{Z}.$$

Then

$$\phi\left(\sum_j a_j\alpha^j\right) = \sum_j a_j m^j.$$
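To make the homomorphism from $\mathbb{Z}[\alpha]$ to $\mathbb{Z}/n\mathbb{Z}$ (written $\phi$ here) concrete, the sketch below represents an element $\sum_j a_j\alpha^j$ by its coefficient list and checks multiplicativity on a toy example. The toy parameters $n = 341$, $m = 7$, $f(x) = x^3 - 2$ (so $f(7) = 341$) and the function names are assumptions chosen for illustration only; they do not come from the text.

```python
def mul_mod_f(x, y, c):
    """Multiply x and y in Z[alpha], where alpha^d = c and d = len(x) = len(y).

    Elements are coefficient lists [a_0, a_1, ..., a_{d-1}].
    """
    d = len(x)
    prod = [0] * (2 * d - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            prod[i + j] += xi * yj
    # Reduce high powers using alpha^k = c * alpha^(k - d) for k >= d.
    for k in range(2 * d - 2, d - 1, -1):
        prod[k - d] += c * prod[k]
    return prod[:d]

def phi(x, m, n):
    """Image of sum_j a_j alpha^j under alpha -> m, reduced modulo n."""
    return sum(a * pow(m, j, n) for j, a in enumerate(x)) % n

# Toy setting (assumed for illustration): f(x) = x^3 - 2 and f(7) = 341 = 0 (mod 341).
n, m, c = 341, 7, 2
x = [1, 2, 0]          # 1 + 2*alpha
y = [3, 0, 1]          # 3 + alpha^2
xy = mul_mod_f(x, y, c)
print(xy, phi(xy, m, n), (phi(x, m, n) * phi(y, m, n)) % n)
```

The last two printed values agree, which is exactly the statement that the map respects multiplication once $f(m) \equiv 0 \pmod{n}$.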

Now define a set $S$ consisting of pairs of relatively prime integers $(a, b)$, satisfying the following two conditions:

$$\prod_{(a,b)\in S}(a + bm) = c^2, \quad (c \in \mathbb{Z}), \tag{A.12}$$

and

$$\prod_{(a,b)\in S}(a + b\alpha) = \beta^2, \quad (\beta \in \mathbb{Z}[\alpha]). \tag{A.13}$$

Thus,

$$\phi(\beta^2) = c^2,$$

so

$$\phi(\beta^2) \equiv c^2 \pmod{n}.$$

In other words, since $\phi(\beta^2) = \phi(\beta)^2$, then if we set $\phi(\beta) = h \in \mathbb{Z}$,

$$h^2 \equiv c^2 \pmod{n}.$$

This takes us back to Kraitchik's original idea, and we may have a nontrivial factor of $n$, namely $\gcd(h \pm c, n)$ (provided that $h \not\equiv \pm c \pmod{n}$).

The above overview of the number field sieve methodology is actually a special case of an algebraic idea, which is described as follows. Let $R$ be a ring with homomorphism

$$\psi : R \to \mathbb{Z}/n\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z},$$

together with an algorithm for computing nonzero diagonal elements $(x, x)$ for $x \in \mathbb{Z}/n\mathbb{Z}$. Then the goal is to multiplicatively combine these elements to obtain squares in $R$ whose square roots have an image under $\psi$ not lying in $(x, \pm x)$ for nonzero $x \in \mathbb{Z}/n\mathbb{Z}$. The number field sieve is the special case

$$R = \mathbb{Z} \times \mathbb{Z}[\alpha], \text{ with } \psi(z, \beta) = (z, \phi(\beta)).$$

Before setting down the details of the formal number field sieve algorithm, we discuss the crucial role played by smoothness introduced in Definition 2.21 on page 93. Recall that a smooth number is one with only “small” prime factors. In particular, $n \in \mathbb{N}$ is $B$-smooth for $B \in \mathbb{R}^+$, if $n$ has no prime factor bigger than $B$. Smooth numbers satisfy the triad of properties:

(1) They are fairly numerous (albeit sparse).

(2) They enjoy a simple multiplicative structure.

(3) They play an essential role in discrete logarithm algorithms.
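The definition of $B$-smoothness recalled above can be checked directly by trial division. This is a minimal sketch for small inputs only; the function name `is_b_smooth` is a hypothetical helper, and practical sieves never test candidates one at a time in this way.

```python
def is_b_smooth(n, bound):
    """Return True if |n| is nonzero and has no prime factor bigger than bound."""
    n = abs(n)
    if n == 0:
        return False
    p = 2
    while p <= bound and n > 1:
        # Dividing out every p <= bound in increasing order removes exactly
        # the prime factors up to the bound (composite p never divide here).
        while n % p == 0:
            n //= p
        p += 1
    return n == 1

print(is_b_smooth(720, 5))    # 720 = 2^4 * 3^2 * 5
print(is_b_smooth(77, 7))     # 77 = 7 * 11 has the prime factor 11 > 7
print(sum(is_b_smooth(k, 10) for k in range(1, 31)))  # how many k <= 30 are 10-smooth
```

The count in the last line gives a feel for property (1): smooth numbers are plentiful among small integers and thin out as the range grows relative to the bound.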

If $F = \mathbb{Q}(\alpha)$ is a number field, then by definition

an algebraic number $a + b\alpha \in \mathbb{Z}[\alpha]$ is $B$-smooth if $|N_F(a + b\alpha)|$ is $B$-smooth.

Hence, $a + b\alpha$ is $B$-smooth if and only if no prime dividing $|N_F(a + b\alpha)|$ is bigger than $B$. Thus, the idea behind the number field sieve is to look for small relatively prime numbers $a$ and $b$ such that both $a + \alpha b$ and $a + mb$ are smooth. Since $\phi(a + \alpha b) = a + mb$, each pair provides a congruence modulo $n$ between two products. Sufficiently many of these congruences can then be used to find solutions to $h^2 \equiv c^2 \pmod{n}$, which may lead to a factorization of $n$.

The above overview leaves open the demanding questions as to how we choose the degree $d$ and the integer $m$, and how the set of relatively prime integers $a$, $b$ satisfying Equations (A.12)–(A.13) can be found. These questions may now be answered in the following formal description of the algorithm.

The Number Field Sieve Algorithm

Step 1. (Selection of a Factor Base and Smoothness Bound)

There is a consensus that smoothness bounds are best chosen empirically. However, there are theoretical reasons for choosing such bounds as

$$B = \exp\left((2/3)^{2/3}(\log n)^{1/3}(\log\log n)^{2/3}\right),$$

which is considered to be optimal since it is based upon the choice for $d$ as above. See [51, Section 6.3, p. 32] for details. Furthermore, the reasons for this being called a smoothness bound will unfold in the sequel.
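To get a feel for the sizes involved, the theoretical choices of the degree in (A.10) and of the bound $B$ above can be evaluated numerically. The sketch below is only a rough estimate under an explicit assumption: the $o(1)$ term in (A.10) is dropped, so the results indicate orders of magnitude rather than the values used in practice. The function names are hypothetical.

```python
import math

def nfs_degree_estimate(n):
    """Evaluate (3 log n / (2 log log n))^(1/3), i.e. (A.10) with o(1) dropped."""
    ln = math.log(n)
    return (3 * ln / (2 * math.log(ln))) ** (1 / 3)

def nfs_smoothness_bound(n):
    """Evaluate exp((2/3)^(2/3) (log n)^(1/3) (log log n)^(2/3))."""
    ln = math.log(n)
    return math.exp((2 / 3) ** (2 / 3) * ln ** (1 / 3) * math.log(ln) ** (2 / 3))

# The 148-digit cofactor of the ninth Fermat number discussed in this section.
n = (2**512 + 1) // 2424833
print(round(nfs_degree_estimate(n), 2))
print(f"{nfs_smoothness_bound(n):.3e}")
```

For this $n$ the degree estimate falls between 4 and 5, and the bound is on the order of $10^7$; since such bounds are best chosen empirically, as noted above, these figures should be read only as a guide.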

Define a set $S = S_1 \cup S_2 \cup S_3$, where the component sets $S_j$ are given as follows.

$$S_1 = \{p \in \mathbb{Z} : p \text{ is prime and } p \le B\},$$

$$S_2 = \{u_j : j = 1, 2, \ldots, r_1 + r_2 - 1, \text{ where } u_j \text{ is a generator of } U_F\}.$$

(Here $\{r_1, r_2\}$ is the signature of $F$, and the generators $u_j$ are the generators of the infinite cyclic groups given by Dirichlet's Unit Theorem; see [64, Theorem 2.78, p. 114].) Also,

$$S_3 = \{\beta = a + b\alpha \in \mathbb{Z}[\alpha] : |N_F(\beta)| = p < B_2 \text{ where } p \text{ is prime}\},$$

where $B_2$ is chosen empirically. Now we set the factor base as

$$\mathcal{F} = \{a_j = \phi(j) \in \mathbb{Z}/n\mathbb{Z} : j \in S\}.$$

Also, we may assume $\gcd(a_j, n) = 1$ for all $j \in S$, since otherwise we have a factorization of $n$ and the algorithm terminates.

Step 2. (Collecting Relations and Finding Dependencies)

We wish to collect relations (A.12)–(A.13) such that they occur simultaneously, thereby yielding a potential factor of $n$. One searches for relatively prime pairs $(a, b)$ with $b > 0$ satisfying the following two conditions.

(i) $|a + bm|$ is $B$-smooth except for at most one additional prime factor $p_1$, with $B < p_1 < B_1$, where $B_1$ is empirically determined.

(ii) $a + b\alpha$ is $B_2$-smooth except for at most one additional prime $\beta \in \mathbb{Z}[\alpha]$ such that $|N_F(\beta)| = p_2$ with $B_2 < p_2 < B_3$, where $B_3$ is empirically chosen.

The prime $p_1$ in (i) is called the large prime, and the prime $p_2$ in (ii) is called the large prime norm. Pairs $(a, b)$ for which $p_1$ and $p_2$ do not exist (namely when we set $p_1 = p_2 = 1$) are called full relations, and are called partial relations otherwise. In the sequel, we will only describe the full relations since, although the partial relations are more complicated, they lead to relations among the factor base elements in a fashion completely similar to the ones for full relations. For details on partial relations, see [50, Section 5].

First, we show how to achieve relations in Equation (A.12), the “easy” part (relatively speaking). (This is called the rational part, whereas relations in Equation (A.13) are called the algebraic part.) Then we show how to put the two together. To do this, we need the following notion from linear algebra.

Every $n \in \mathbb{N}$ has an exponent vector $v(n)$ defined by $n = \prod_{j=1}^{\infty} p_j^{v_j}$, where $p_j$ is the $j$th prime, only finitely many of the $v_j$ are nonzero, and

$$v(n) = (v_1, v_2, \ldots) = (v_j)_{j=1}^{\infty}$$

with an infinite string of zeros after the last significant place. We observe that $n$ is a square if and only if each $v_j$ is even. Hence, for our purposes, the $v_j$ give too much information. Thus, to simplify our task, we reduce each $v_j$ modulo 2. Henceforth, then, $v_j$ means $v_j$ reduced modulo 2. We modify the notion of the exponent vector further for our purposes by letting $B_1 = \pi(B)$, where $\pi(B)$ is the number of primes no bigger than $B$. Then, with $p_0 = -1$, $a + bm = \prod_{j=0}^{B_1} p_j^{v_j}$ is the factorization of $a + bm$. Set

$$v(a + bm) = (v_0, \ldots, v_{B_1}),$$

for each pair $(a, b)$ with $a + b\alpha \in S_3$. The choice of $B$ allows us to make the assumption that $|S_3| > B_1 + 1$. Therefore, the number of vectors $v(a + bm)$ for pairs $(a, b)$ with $a + b\alpha \in S_3$ exceeds the dimension of the $\mathbb{F}_2$-vector space $\mathbb{F}_2^{B_1+1}$. In other words, we have more than $B_1 + 1$ vectors in a $(B_1 + 1)$-dimensional vector space. Therefore, there exist nontrivial linear dependence relations between vectors. This implies the existence of a subset $T$ of $S_3$ such that

$$\sum_{a+b\alpha\in T} v(a + bm) = 0 \in \mathbb{F}_2^{B_1+1},$$

so

$$\prod_{a+b\alpha\in T}(a + bm) = z^2 \quad (z \in \mathbb{Z}).$$

This solves Equation (A.12).
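The linear algebra step just described can be sketched on a toy factor base. The code below is an illustration under stated assumptions, not the procedure of [50] or [51]: the helper names are hypothetical, the factor base is the tiny set $\{2, 3, 5, 7\}$, and the dependency is found by a plain elimination over $\mathbb{F}_2$ with vectors packed into Python integers.

```python
from math import isqrt, prod

def exponent_vector_mod2(value, primes):
    """Parity vector (sign entry first, matching p_0 = -1) of value over primes.

    Returns None if value is zero or does not factor completely over primes.
    """
    if value == 0:
        return None
    vec = [1 if value < 0 else 0]
    value = abs(value)
    for p in primes:
        e = 0
        while value % p == 0:
            value //= p
            e += 1
        vec.append(e % 2)
    return vec if value == 1 else None

def find_dependency(vectors):
    """Return indices of a nonempty subset summing to zero over F_2, or None."""
    basis = {}  # pivot position -> (reduced vector as bitmask, subset as bitmask)
    for idx, vec in enumerate(vectors):
        mask = sum(bit << k for k, bit in enumerate(vec))
        combo = 1 << idx
        while mask:
            pivot = mask.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (mask, combo)
                break
            bmask, bcombo = basis[pivot]
            mask ^= bmask
            combo ^= bcombo
        else:
            # The vector reduced to zero: combo records the dependent subset.
            return [i for i in range(len(vectors)) if (combo >> i) & 1]
    return None

primes = [2, 3, 5, 7]
values = [6, 10, 15, 14, 21]          # stand-ins for smooth values of a + b*m
vectors = [exponent_vector_mod2(v, primes) for v in values]
subset = find_dependency(vectors)
square = prod(values[i] for i in subset)
print(subset, square, isqrt(square))
```

Here the first three values multiply to $900 = 30^2$, which is the toy analogue of the product over $T$ being a square.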

Now we turn to the algebraic relations in Equation (A.13). We may calculate the norm of $a + b\alpha$ by setting $x = a$ and $y = b$ in the homogeneous polynomial

$$(-y)^d f(-x/y) = x^d - c(-y)^d,$$

with $f(x) = x^d - c$. Therefore, $N_F(a + b\alpha) = (-b)^d f(-ab^{-1}) = a^d - c(-b)^d$. Let

$$R_p = \{r \in \mathbb{Z} : 0 \le r \le p - 1, \text{ and } f(r) \equiv 0 \pmod{p}\}.$$

Then for relatively prime pairs $(a, b)$, we have

$$N_F(a + b\alpha) \equiv 0 \pmod{p} \text{ if and only if } a \equiv -br \pmod{p} \text{ for some } r \in R_p,$$

and this $r$ is unique. Observe that by the relative primality of $a$ and $b$, the multiplicative inverse $b^{-1}$ of $b$ modulo $p$ is defined since, for $b \equiv 0 \pmod{p}$, there are no relatively prime pairs $(a, b)$ with $N_F(a + b\alpha) \equiv 0 \pmod{p}$.
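The norm formula and the root criterion above lend themselves to a quick numerical check. The sketch below uses $f(x) = x^5 + 8$ (that is, $d = 5$ and $c = -8$, the polynomial that appears later in the factorization of $F_9$) and verifies the equivalence by brute force over small coprime pairs; the function names and the ranges tested are assumptions made for illustration.

```python
from math import gcd

def norm(a, b, d, c):
    """N_F(a + b*alpha) = a^d - c*(-b)^d, where alpha is a root of x^d - c."""
    return a ** d - c * (-b) ** d

def roots_mod_p(d, c, p):
    """The set R_p of residues r in [0, p-1] with r^d - c = 0 (mod p)."""
    return [r for r in range(p) if (pow(r, d, p) - c) % p == 0]

def norm_divisible_via_roots(a, b, d, c, p):
    """Test a = -b*r (mod p) for some r in R_p, without computing the norm."""
    return any((a + b * r) % p == 0 for r in roots_mod_p(d, c, p))

d, c = 5, -8                      # f(x) = x^5 + 8
print(norm(1, 1, d, c), norm(2, 1, d, c), roots_mod_p(d, c, 7))

# Brute-force comparison of the two criteria on coprime pairs (a, b).
agree = all(
    (norm(a, b, d, c) % p == 0) == norm_divisible_via_roots(a, b, d, c, p)
    for p in (3, 7, 11, 31)
    for a in range(-10, 11)
    for b in range(1, 11)
    if gcd(a, b) == 1
)
print(agree)
```

The practical point is that divisibility of the norm by $p$ depends only on $a$ modulo $p$ once $b$ is fixed, which is what allows sieving over $a$ instead of factoring each norm separately.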

The above shows that there is a one-to-one correspondence between those $\beta \in \mathbb{Z}[\alpha]$ with $|N_F(\beta)| = p$, a prime, and pairs $(p, r)$ with $r \in R_p$. Note that the kernel of the natural map

$$\phi : \mathbb{Z}[\alpha] \to \mathbb{Z}/p\mathbb{Z} \text{ is } \ker(\phi) = \langle a + b\alpha\rangle,$$

the principal ideal of $\mathbb{Z}[\alpha]$ generated by $a + b\alpha$. It follows that

$$|\mathbb{Z}[\alpha] : \langle a + b\alpha\rangle| = |N_F(a + b\alpha)| = p,$$

so $\mathbb{Z}[\alpha]/\langle a + b\alpha\rangle$ is a field.

This corresponds to saying that the $\mathbb{Z}[\alpha]$-ideal $\mathcal{P} = (a + b\alpha)$ is a principal, first-degree prime $\mathbb{Z}[\alpha]$-ideal, namely one for which $N_F(\mathcal{P}) = p^1 = p$. Hence, $\mathbb{Z}[\alpha]/\mathcal{P} \cong \mathbb{F}_p$, the finite field of $p$ elements.

The above tells us that in Step 1 of the number field sieve algorithm, the set $S_3$ essentially consists of the first-degree prime $\mathbb{Z}[\alpha]$-ideals of norm $N_F(\mathcal{P}) \le B_2$. These are the smooth, degree one, prime $\mathfrak{O}_F$-ideals, namely those ideals whose prime norms are $B_2$-smooth.

In part (ii) of Step 2 of the algorithm on page 387, the additional prime element $\beta \in \mathbb{Z}[\alpha]$ such that $|N_F(\beta)| = p_2$ with $B_2 < p_2 < B_3$ corresponds to the prime $\mathfrak{O}_F$-ideal $\mathcal{P}_2$ called the large prime ideal. Moreover, $\mathcal{P}_2$ corresponds to the pair $(p_2, c \pmod{p_2})$, where $c \in \mathbb{Z}$ is such that $a \equiv -bc \pmod{p_2}$, thereby enabling us to distinguish between prime ideals of the same norm. If the large prime in Step 2 does not occur, we write $\mathcal{P}_2 = (1)$. Now, since

$$|a + bm| = \prod_{p\in S_1} p^{v_p},$$

and

$$|a + b\alpha| = \prod_{u\in S_2} u^{t_u}\prod_{s\in S_3} s^{v_s}, \tag{A.14}$$

for nonnegative $t_u, v_s \in \mathbb{Z}$, and since $\phi(a + bm) = \phi(a + b\alpha)$, then

$$\prod_{p\in S_1}\phi(p)^{v_p} = \prod_{u\in S_2}\phi(u)^{t_u}\prod_{s\in S_3}\phi(s)^{v_s},$$

in $\mathbb{Z}/n\mathbb{Z}$. Therefore, we achieve a relationship among the elements of the factor base $\mathcal{F}$, as follows

$$\prod_{u\in S_2}\phi(u)^{t_u}\prod_{s\in S_3}\phi(s)^{v_s} \equiv \prod_{p\in S_1}\phi(p)^{v_p} \pmod{n}. \tag{A.15}$$

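Furthermore, we may translate (A.14) ideal-theoretically into the ideal product

$$|a + b\alpha| = \prod_{u\in S_2} u^{t_u}\prod_{\mathcal{P}\in S_3}\pi_{\mathcal{P}}^{v_{\mathcal{P}}}, \tag{A.16}$$

where $\mathcal{P}$ ranges over all of the first-degree prime $\mathbb{Z}[\alpha]$-ideals of norm less than $B_2$, and $\pi_{\mathcal{P}}$ is a generator of $\mathcal{P}$.

Thus, (A.15) gives rise to the identity

$$\prod_{p\in S_1}\phi(p)^{v_p} = \prod_{u\in S_2}\phi(u)^{t_u}\prod_{\mathcal{P}\in S_3}\phi(\pi_{\mathcal{P}})^{v_{\mathcal{P}}}.$$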

If $|S_3| > \pi(B)$, then by applying Gaussian elimination, for instance, we can find $x(a, b) \in \{0, 1\}$ such that simultaneously

$$\prod_{a+b\alpha\in S_3}(a + b\alpha)^{x(a,b)} = \left(\left(\prod_{u\in S_2} u^{t_u}\right)\left(\prod_{s\in S_3} s^{v_s}\right)\right)^2,$$

and

$$\prod_{a+b\alpha\in S_3}(a + bm)^{x(a,b)} = \left(\prod_{p\in S_1} p^{v_p}\right)^2,$$

hold. From this a factorization of $n$ may be gleaned, by Kraitchik's method.

Practically speaking, the number field sieve tasks consist of sieving all pairs $(a, b)$ for $b = b_1, b_2, \ldots, b_n$ for short (overlapping) intervals $[b_1, b_2]$, with $|a|$ less than some given bound. All relations, full and partial, are gathered in this way until sufficiently many have been collected.

The big prize garnered by the number field sieve was the factorization of $F_9$, the ninth Fermat number, as described in [50]. In 1903, A.E. Western found the prime factor $2424833 = 37 \cdot 2^{16} + 1$ of $F_9$. Then in 1967, Brillhart determined that $F_9/2424833$ (having 148 decimal digits) is composite by showing that it fails to satisfy Fermat's Little Theorem. Thus, the authors of [50] chose

$$n = F_9/2424833 = \left(2^{512} + 1\right)/2424833.$$

Then they exploited the above algorithm as follows. If we choose $d$ as in Equation (A.10), we get that $d = 5$. The authors of [50] then observed that since $2^{512} \equiv -1 \pmod{n}$, then for $h = 2^{205}$, we get

$$h^5 \equiv 2^{1025} \equiv 2\cdot\left(2^{512}\right)^2 \equiv 2 \pmod{n}.$$

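This allowed them to choose the map

$$\phi : \mathbb{Z}[\sqrt[5]{2}] \to \mathbb{Z}/n\mathbb{Z}, \text{ given by } \phi : \sqrt[5]{2} \mapsto 2^{205}.$$

Here $\mathbb{Z}[\sqrt[5]{2}]$ is a UFD. Then they chose $m$ and $c$ as in Equation (A.11), namely since $r = 2$, $s = -1$, and $t = 512$, then the minimal $k$ with $5k = dk \ge t = 512$ is $k = 103$, and $m = 2^{103}$, so $c = -8 \equiv 2^{5\cdot 103} \pmod{n}$. This gives rise to $f(x) = x^5 + 8$ with root $\alpha = -\left(\sqrt[5]{2}\right)^3$, and $\mathbb{Z}[\alpha] \subseteq \mathbb{Z}[\sqrt[5]{2}]$. Observe that

$$8F_9 = 2^{515} + 8 = \left(2^{103}\right)^5 + 8.$$

Thus,

$$\phi(\alpha) = m = 2^{103} \equiv -2^{615} \equiv -\left(2^{205}\right)^3 \pmod{n}.$$

Notice that $2^{103}$ is small in relation to $n$, and is in fact closer to $\sqrt[5]{n}$. Since

$$\phi(a + b\alpha) = a + 2^{103}b \in \mathbb{Z}/n\mathbb{Z},$$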

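we are in a position to form relations as described in the above algorithm. Indeed, the authors of [50] actually worked only in the subring $\mathbb{Z}[\alpha]$ to find their relations. The sets they chose from Step 1 are

$$S_1 = \{p \in \mathbb{Z} : p \text{ is prime and } p \le 1295377\},$$

$$S_2 = \left\{-1,\ -1 + \sqrt[5]{2},\ -1 + \left(\sqrt[5]{2}\right)^2 - \left(\sqrt[5]{2}\right)^3 + \left(\sqrt[5]{2}\right)^4\right\},$$

for units $u_1 = -1$, $u_2 = -1 + \sqrt[5]{2}$, and $u_3 = -1 + \left(\sqrt[5]{2}\right)^2 - \left(\sqrt[5]{2}\right)^3 + \left(\sqrt[5]{2}\right)^4$, and

$$S_3 = \{\beta \in \mathbb{Z}[\alpha] : |N_F(\beta)| = p \le 1294973,\ p \text{ a prime}\}.$$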
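The arithmetic behind this choice of parameters is easy to confirm with exact integer arithmetic. The following sketch only re-checks the congruences stated above for $n = F_9/2424833$, $h = 2^{205}$, and $m = 2^{103}$; the dictionary keys are informal labels introduced here for readability and are not notation from [50].

```python
F9 = 2**512 + 1
Q7 = 2424833                 # Western's factor, 37 * 2^16 + 1
n = F9 // Q7                 # the 148-digit cofactor that was factored
h = 2**205
m = 2**103
k = -(-512 // 5)             # smallest k with 5k >= 512 (ceiling division)

checks = {
    "Western's factor has the stated form": 37 * 2**16 + 1 == Q7,
    "Q7 divides F9": F9 % Q7 == 0,
    "n has 148 digits": len(str(n)) == 148,
    "k = 103": k == 103,
    "h^5 = 2 (mod n)": pow(h, 5, n) == 2,
    "8*F9 = m^5 + 8": m**5 + 8 == 8 * F9,
    "f(m) = m^5 + 8 = 0 (mod n)": (m**5 + 8) % n == 0,
    "m = -(2^205)^3 (mod n)": (m + h**3) % n == 0,
}
for label, ok in checks.items():
    print(f"{label}: {ok}")
```

Each line should print `True`; in particular, the last check confirms that sending $\sqrt[5]{2}$ to $2^{205}$ is consistent with sending $\alpha$ to $m$.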

The authors began sieving in mid-February of 1990 on approximately thirty-five workstations at Bellcore. On the morning of June 15, 1990, the first of the dependency relations that they achieved turned out to give rise to a trivial factorization! However, an hour later their second dependency relation gave way to a 49-digit factor. This and the 99-digit cofactor were determined by A. Odlyzko to be primes, on that same day. They achieved $F_9 = q_7 \cdot q_{49} \cdot q_{99}$, where $q_j$ is a prime with $j$ decimal digits as follows:

$$q_7 = 2424833,$$

$$q_{49} = 7455602825647884208337395736200454918783366342657,$$

and

$$q_{99} = 741640062627530801524787141901937474059940781097519023905821316144415759504705008092818711693940737.$$

Fermat numbers have an important and rich history, which is intertwined with the very history of factoring itself. Euler was able to factor $F_5$. In 1880, Landry used an idea attributable to Fermat to factor $F_6$. As noted above, $F_7$ was factored by the continued fraction method. Brent and Pollard used a version of Pollard's “rho”-method to factor $F_8$ (see [68, pp. 206–208] for a detailed description with examples of the rho-method). As we have shown above, $F_9$ was factored by the number field sieve. Lenstra's elliptic curve method was used by Brent to factor $F_{10}$ and $F_{11}$ (see §9.3). Several other Fermat numbers are known to have certain small prime factors, and the smallest Fermat number for which there is no known factor is $F_{24}$. For updates on the largest prime discoveries, see the website:

http://www.utm.edu/research/primes/largest.html.

We have covered several applications of sieve methods as well as their historical development. The power of the theory is clearly paramount, but the complete proofs of the results in this section would provide the foundation for a third course in number theory. Fittingly, we close our discussion here.


Bibliography

[1] D.J. Albers, “Freeman Dyson: Mathematician, Physicist, and Writer”: Interview with Donald J. Albers, College Math. J. 25 (1994), 3–21. (Cited on page 155.)

[2] R. Alter and K.K. Kubota, The Diophantine equation $x^2 + D = p^n$, Pacific J. Math. 46 (1973), 11–16. (Cited on page 276.)

[3] A. Baker, Linear forms in logarithms of algebraic numbers, Mathematika 13 (1966), pp. 204–216; 14 (1967), pp. 102–107; and 15 (1968), pp. 204–216. (Cited on page 166.)

[4] A. Baragar, On the unicity conjecture for Markoff numbers, Canad. Math. Bull. 39 (1996), 3–9. (Cited on page 123.)

[5] S. Beatty, Problem 3173, American Math. Monthly 33 (1926), 159. (Cited on page 264.)

[6] E. Bombieri, On the large sieve, Mathematika 12 (1965), 201–225. (Cited on page 377.)

[7] E. Bombieri, On twin-almost primes, Acta Arith. 28 (1975), 177–193, 457–461. (Cited on page 379.)

[8] E. Bombieri, The asymptotic sieve, Mem. Acad. Naz. dei XL (1976), 243–269. (Cited on page 379.)

[9] E. Bombieri, Roth's theorem and the abc-conjecture, preprint ETH Zurich (1994). (Cited on page 299.)

[10] E. Bombieri, The Mordell conjecture revisited, Ann. Sc. Norm. Super. Pisa Cl. Sci. 17 (1990), 615–640. (Cited on page 299.)

[11] C. Breuil, B. Conrad, F. Diamond, and R. Taylor, On the modularity of elliptic curves over Q: Wild 3-adic exercises, J. Amer. Math. Soc. 14 (2001), 843–939. (Cited on page 364.)

[12] Y. Bugeaud and T.N. Shorey, On the number of solutions of the generalized Ramanujan-Nagell equation, J. für die Reine und Angew. Math. 539 (2001), 55–74. (Cited on page 281.)

[13] J.P. Buhler, H.W. Lenstra Jr., and C. Pomerance, Factoring integers with the number field sieve, in The Development of the Number Field Sieve, A.K. Lenstra and H.W. Lenstra Jr. (Eds.), Lecture Notes in Mathematics 1554, Springer-Verlag, Berlin, Heidelberg, New York (1993), 50–94. (Cited on page 382.)

[14] H. Chatland and H. Davenport, Euclid's algorithm in quadratic number fields, Bulletin of the American Math. Society 55 (1949), 948–953. (Cited on page 50.)

[15] D.A. Clark, A quadratic field which is Euclidean but not norm-Euclidean, Manuscripta Mathematica 83 (1994), 327–330. (Cited on page 50.)

[16] B. Conrad, F. Diamond, and R. Taylor, Modularity of certain potentially Barsotti-Tate Galois representations, J. Amer. Math. Soc. 12 (1999), 521–567. (Cited on page 364.)

[17] D. Coppersmith, A. Odlyzko, and R. Schroeppel, Discrete logarithms in GF(p), Algorithmica 1 (1986), 1–15. (Cited on page 381.)

[18] D.A. Cox, Primes of the Form $x^2 + ny^2$, Wiley, New York (1989). (Cited on pages 98, 100, 322, 325, 349–350.)

[19] R. Crandall and C. Pomerance, Prime Numbers: A Computational Perspective, Springer, New York, Berlin (2001). (Cited on page 298.)

[20] H. Darmon and A. Granville, On the equations $z^m = F(x, y)$ and $Ax^p + By^q = Cz^r$, Bull. London Math. Soc. 27 (1995), 513–543. (Cited on page 295.)

[21] H. Davenport, The Work of K.F. Roth, Proc. Int. Cong. Math. (1958), LVII–LX, Cambridge University Press, 1960. (Cited on page 160.)

[22] J. dePhillis, Mathematical Conversation Starters, M.A.A., Washington (2002). (Cited on pages 67, 347.)

[23] N.D. Elkies, ABC implies Mordell, Indagationes Math. 11 (2000), 197–200. (Cited on page 299.)

[24] P. Erdős, How many pairs of products of consecutive integers have the same prime factors?, Amer. Math. Monthly 87 (1980), 391–392. (Cited on page 297.)

[25] G. Faltings, Diophantine approximations on abelian varieties, Ann. Math. 133 (1991), 549–576. (Cited on page 299.)

[26] M. Van Frankenhuysen, The abc-conjecture implies Roth's theorem and Mordell's conjecture, Math. Contemp. 16 (1999), 45–72. (Cited on page 299.)

[27] G. Frey, Links between stable elliptic curves and certain Diophantine equations, Annales Universitatis Saraviensis, Series Mathematicae 1 (1986), 1–40. (Cited on page 353.)

[28] G. Frey and H.-G. Rück, A remark concerning m-divisibility and the discrete logarithm problem in the divisor class group of curves, Math. Comp. 62 (1994), 865–874. (Cited on page 327.)

[29] J. Friedlander and H. Iwaniec, The polynomial $X^2 + Y^4$ captures its primes, Annals of Math. 148 (1998), 945–1040. (Cited on page 379.)

[30] J. Friedlander and H. Iwaniec, Asymptotic sieve for primes, Annals of Math. 148 (1998), 1041–1065. (Cited on page 379.)

[31] D.A. Goldston, J. Pintz, and C.Y. Yildirim, Primes in tuples I (preprint (2005)-19 of http://aimath.org/preprints.html); to appear in Ann. of Math. (Cited on page 379.)

[32] D.A. Goldston, J. Pintz, and C.Y. Yildirim, Primes in tuples II (preprint, see: http://front.math.ucdavis.edu/author/D.Goldston). (Cited on page 379.)

[33] D.A. Goldston, J. Pintz, and C.Y. Yildirim, The path to recent progress on small gaps between primes, Clay Math. Proceed. 7 (2007). (Cited on pages 379, 381.)

[34] S. Goldwasser and J. Kilian, Almost all primes can be quickly certified, Proceed. eighteenth annual ACM symp. on theory of computing (STOC), Berkeley (1986), 316–329. (Cited on page 324.)

[35] A. Granville, Some conjectures related to Fermat's last theorem, in Number Theory (R.A. Mollin, ed.), Walter de Gruyter, Berlin, New York (1990), 177–192. (Cited on page 297.)

[36] A. Granville and H. Stark, abc implies no Siegel zeros for L-functions of characters with negative discriminant, Invent. Math. 139 (2000), 509–523. (Cited on page 300.)

[37] M. Hall, The Diophantine equation $x^3 - y^2 = k$, in Computers in Number Theory (A. Atkin, B. Birch, eds.), Academic Press (1971). (Cited on page 296.)

[38] R. Harris, Enigma, Arrow Books, Random House, London (2001). (Cited on page 47.)

[39] D.R. Heath-Brown, Artin's conjecture for primitive roots, Quart. J. Math. Oxford 37 (1986), 27–38. (Cited on page 369.)

[40] K. Heegner, Diophantische Analysis und Modulfunktionen, Math. Zeitschr. 56 (1952), 227–253. (Cited on page 141.)

[41] H. Heilbronn, On Euclid's algorithm in real quadratic fields, Proc. Cambridge Philos. Soc. 34 (1938), 521–526. (Cited on page 50.)

[42] M. Hindry and J.H. Silverman, Diophantine Geometry, an Introduction, Springer, New York (2000). (Cited on page 299.)

[43] N. Hofreiter, Quadratische Körper mit und ohne Euklidischen Algorithmus, Monatshefte für Mathematik und Physik 42 (1935), 397–400. (Cited on page 50.)

[44] J.P. Jones, D. Sato, H. Wada, and D. Wiens, Diophantine representation of the set of prime numbers, Amer. Math. Monthly 83 (1976), 449–464. (Cited on page 295.)

[45] A.W. Knapp, Elliptic Curves, Math. Notes 40, Princeton University Press, Princeton, N.J. (1992). (Cited on pages 330, 365.)

[46] N. Koblitz, Elliptic curve cryptosystems, Math. Comp. 48 (1987), 203–209. (Cited on page 326.)

[47] N. Koblitz, A Course in Number Theory and Cryptography, Academic Press, New York, London (1988). (Cited on pages 314, 320.)

[48] E. Landau, Über die Klassenzahl der binären quadratischen Formen von negativer Discriminante, Math. Annalen 56 (1903), 671–676. (Cited on page 102.)

[49] H.W. Lenstra, Factoring integers with elliptic curves, Annals of Math. 126 (1987), 649–673. (Cited on page 325.)

[50] A.K. Lenstra, H.W. Lenstra, M.S. Manasse, and J.M. Pollard, The factorization of the ninth Fermat number, Math. Comp. 61 (1993), 319–349. (Cited on pages 381, 387, 389–390.)

[51] A.K. Lenstra, H.W. Lenstra Jr., M.S. Manasse, and J.M. Pollard, The number field sieve, in The Development of the Number Field Sieve, A.K. Lenstra and H.W. Lenstra Jr. (Eds.), Lecture Notes in Mathematics 1554, Springer-Verlag, Berlin, Heidelberg, New York (1993), 11–42. (Cited on pages 381–383, 386.)

[52] Yu.V. Linnik, The large sieve, Dokl. AN USSR 30 (1941), 290–292. [Russian] (Cited on pages 376–384.)

[53] Yu.V. Linnik, The dispersion method in binary additive problems, Amer. Math. Soc. (1963) (Translated from Russian). (Cited on page 384.)

[54] K. Mahler, Lectures on transcendental numbers, LNM 546, Springer, Berlin, Heidelberg, New York (1976). (Cited on page 177.)


[55] Y. Matiyasevich, Enumerable sets are Diophantine, Doklady Akad. Nauk SSSR 191 (1970), 279–282. [Russian] English translation in Soviet Mathematics, Doklady 11 (1970). (Cited on page 295.)

[56] A. Menezes, T. Okamoto, and S.A. Vanstone, Reducing elliptic curve logarithms to logarithms in a finite field, IEEE Trans. Inform. Theory 39 (1993), 1639–1646. (Cited on page 327.)

[57] L. Merel, Bornes pour la torsion des courbes elliptiques sur les corps de nombres, Invent. Math. 124 (1996), 437–449. (Cited on page 312.)

[58] P. Mihailescu, Primary cyclotomic units and a proof of Catalan’s conjecture, J. Reine Angew. Math. 572 (2004), 167–195. (Cited on page 294.)

[59] V. Miller, Use of elliptic curves in cryptography in Advances in Cryptography—Crypto ’85 Proceed., Springer-Verlag, Berlin, LNCS 218 (1987), 417–426. (Cited on page 326.)

[60] R.A. Mollin, Number Theory and Applications, Proceedings of the NATO Advanced Study Institute, Banff Centre, Canada, 27 April–5 May 1988, Kluwer Academic Publishers, Dordrecht (1989). (Cited on page xiii.)

[61] R.A. Mollin, Number Theory, Proceedings of the First Conference of the Canadian Number Theory Association, Banff Centre, Canada, April 17–27, 1988, Walter de Gruyter, Berlin (1990). (Cited on page xiii.)

[62] R.A. Mollin, Quadratics, CRC Press, Boca Raton, London, Tokyo (1995). (Cited on pages 60, 65, 108, 256, 276.)

[63] R.A. Mollin, An elementary proof of the Rabinowitch-Mollin-Williams criterion for real quadratic fields, J. Math. Sci. 7 (1996), 17–27. (Cited on page 153.)

[64] R.A. Mollin, Algebraic Number Theory, Chapman and Hall/CRC Press, Boca Raton, London, Tokyo (1999). (Cited on pages 30, 63, 182, 189, 286, 291, 301, 344, 381, 386.)

[65] R.A. Mollin, Fundamental Number Theory with Applications, First Edition, CRC, Boca Raton, London, New York (1998). (Cited on pages 60, 153, 276.)

[66] R.A. Mollin, An Introduction to Cryptography, First Edition (2001). (Cited on page 326.)

[67] R.A. Mollin, Codes: The Guide to Secrecy from Ancient to Modern Times, CRC, Taylor & Francis Group, Boca Raton, London, New York (2008). (Cited on pages 205, 327.)


[68] R.A. Mollin, Fundamental Number Theory with Applications, Second Edition, CRC, Taylor & Francis Group, Boca Raton, London, New York (2008). (Cited on pages ix, 1, 10–12, 13, 15, 19, 21, 26–28, 40–41, 43, 47, 53, 55, 60, 63, 67, 79, 84, 88, 97–98, 102, 130, 132–133, 140, 152, 156, 159–160, 166, 167–168, 178, 182, 191, 198, 209, 213–214, 215, 221–222, 228–231, 236, 249, 260, 266, 271–272, 282, 291, 294, 324–327, 329, 338, 342, 370, 372, 381, 391, 429, 435.)

[69] R.A. Mollin, A note on the Diophantine equation $D_1x^2 + D_2 = ak^n$, Acta Math. Acad. Paed. Nyireg. 21 (2005), 21–24. (Cited on page 281.)

[70] R.A. Mollin, Characterization of $D = P^2 + Q^2$ when $\gcd(P, Q) = 1$ and $x^2 - Dy^2 = -1$ has no integer solutions, Far East J. Math. Sci. 32 (2009), 285–294. (Cited on page 121.)

[71] R.A. Mollin and P.G. Walsh, A note on powerful numbers, quadratic fields, and the Pellian, C.R. Math. Rep. Acad. Sci. Canada 8 (1986), 109–114. (Cited on page 297.)

[72] L.J. Mordell, Reminiscences of an octogenarian mathematician, Amer. Math. Monthly 78 (1971), 952–961. (Cited on page 154.)

[73] L.J. Mordell, Diophantine Equations, Academic Press, London and New York (1969). (Cited on page 285.)

[74] C.J. Moreno and S.S. Wagstaff, Jr., Sums of Squares of Integers. (Cited on page 218.)

[75] W. Narkiewicz, Number Theory, World Scientific Publishers, Singapore (1983). (Cited on pages 371, 373–374, 376–377.)

[76] A. Oppenheim, Quadratic fields with and without Euclid’s algorithm, Math. Ann. 109 (1934), 349–352. (Cited on page 50.)

[77] O. Perron, Quadratische Zahlkörper mit Euklidischen Algorithmus, Math. Ann. 107 (1932), 489–495. (Cited on page 50.)

[78] J.M. Pollard, Factoring with Cubic Integers in The Development of the Number Field Sieve, A.K. Lenstra and H.W. Lenstra Jr. (Eds.), in LNM, Springer-Verlag, Berlin, Heidelberg, New York 1554 (1993), 4–10. (Cited on page 92.)

[79] G. Rabinowitsch, Eindeutigkeit der Zerlegung in Primzahlfactoren in quadratischen Zahlkörpern, J. Reine Angew. Math. 142 (1913), 153–164. (Cited on pages 153–154.)

[80] R. Remak, Über den Euklidischen Algorithmus in reelquadratischen Zahlkörpern, Jber. Deutschen Math. Verein 44 (1934), 238–250. (Cited on page 50.)


[81] K.A. Ribet, On modular representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ arising from modular forms, Invent. Math. 100 (1990), 431–476. (Cited on page 365.)

[82] J.P. Robertson and K.R. Matthews, A Continued Fraction Approach to a Result of Feit, American Math. Monthly 115 (2008), 346–349. (Cited on page 121.)

[83] T. Satoh and K. Araki, Fermat quotients and the polynomial time discrete logarithm for anomalous elliptic curves, Comment. Math. Univ. St. Paul, (1998), 81–92. (Cited on page 327.)

[84] M.R. Schroeder, Number Theory in Science and Communication, Springer (1999). (Cited on page 220.)

[85] A. Selberg, On an elementary method in the theory of primes, Norske Vid. Selsk. Forh. Trondhjem 19 (1947), 64–67. (Cited on page 373.)

[86] I. Semaev, Evaluation of discrete logarithms in a group of p-torsion points of an elliptic curve in characteristic p, Math. Comp. 67 (1998), 353–356. (Cited on page 327.)

[87] J.-P. Serre, A Course in Arithmetic, Springer-Verlag, New York, Heidelberg, Berlin (1973). (Cited on pages 341–342.)

[88] J.H. Silverman, The Arithmetic of Elliptic Curves, Springer, New York, Berlin, Heidelberg (1985). (Cited on pages 310, 312, 327, 343, 361, 363.)

[89] N. Smart, The discrete logarithm problem on elliptic curves of trace one, J. Cryptology 12 (1999), 193–196. (Cited on page 327.)

[90] J. Solinas, Standard specifications for public key cryptography, Annex A: Number-theoretic background. IEEE P1363 Draft (1998). (Cited on page 327.)

[91] B.K. Spearman and K.S. Williams, Representing primes by binary quadratic forms, American Math. Monthly 99 (1992), 423–426. (Cited on page 141.)

[92] A. Srinavasan, Markoff Numbers and Ambiguous Classes, preprint. (Cited on page 125.)

[93] J. Steuding, Diophantine Analysis, Chapman and Hall/CRC Press, Boca Raton, London, Tokyo (2005). (Cited on page 172.)

[94] J. Tate, Algorithm for determining the type of a singular fiber in an elliptic pencil in Modular Functions of One Variable IV, LNM 476, Springer-Verlag, (1975), 33–52. (Cited on page 360.)

[95] R. Taylor and A. Wiles, Ring-theoretic properties of certain Hecke algebras, Ann. of Math. 141 (1995), 553–572. (Cited on page 364.)


[96] E.C. Titchmarsh, A divisor problem, Rend. Circ. Mat. Palermo 54 (1930), 414–429. (Cited on page 378.)

[97] G.R. Veldekamp, Remark on Euclidean rings, Nieuw, Tid. Wisk, 48 (1960/61), 268–270 (Dutch). (Cited on page 34.)

[98] A.I. Vinogradov, On the denseness hypothesis for Dirichlet L-series, Izv. AN SSSR, Ser. Matem. 29 (1965), 903–934. [Russian] (Cited on page 377.)

[99] P. Vojta, Diophantine Approximation and Value Distribution Theory, LNM 1239, Springer, Berlin, 1987. (Cited on pages 300, 386.)

[100] M. Waldschmidt, Open Diophantine problems, Moscow Math. J. 4 (2004), 245–305. (Cited on page 179.)

[101] E.W. Weisstein, CRC Concise Encyclopedia of Mathematics, CRC Press, Boca Raton, London, New York (1999). (Cited on pages 227, 338, 346.)

[102] H. Weyl, A half-century of mathematics, American Math. Monthly 58 (1951), 523–553. (Cited on page 18.)

[103] A. Wiles, Modular elliptic curves and Fermat’s last theorem, Ann. of Math. (1995), 443–551. (Cited on page 364.)

[104] A. Wintner, The Theory of Measure in Arithmetical Semi-Groups, Waverly Press, Baltimore (1944). (Cited on page 216.)

[105] P. Wolfskehl, Beweis, dass der zweite Factor der Klassenzahl für die aus den elften und dreizehnten Einheitswurzeln gebildeten Zahlen gleich Eins ist, J. Reine Angew. Math. 99 (1886), 173–178. (Cited on page 224.)

[106] G. Zukav, The Dancing Wu Li Masters: An Overview of the New Physics, Bantam Books, New York (1979). (Cited on page 317.)


Solutions to Odd-Numbered Exercises

Section 1.1

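1.1 Since $a, b \in \mathbb{Q}$, then $a\alpha + b \in \mathbb{Q}(\alpha)$, so $\mathbb{Q}(a\alpha + b) \subseteq \mathbb{Q}(\alpha)$. However, $a \neq 0$, so $a$ has an inverse $a^{-1}$ in $\mathbb{Q}$, and $\alpha = a^{-1}(a\alpha + b) - ba^{-1} \in \mathbb{Q}(a\alpha + b)$, so $\mathbb{Q}(\alpha) \subseteq \mathbb{Q}(a\alpha + b)$. Hence, we have equality.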

1.3 Since $(x^p - 1)/(x - 1) = x^{p-1} + x^{p-2} + \cdots + x + 1$ and $\zeta_p$ is a primitive $p$th root of unity, then this is the minimal polynomial $m_{\alpha,\mathbb{Q}}(x)$.

1.5 By Proposition 1.1 on page 13, $\alpha \in U_F$ if and only if $m_{\alpha,F}(0) = \pm 1$. However, since

$$m_{\alpha,F}(x) = \prod_{j=1}^{d} (x - \alpha_j),$$

then this occurs if and only if $\prod_{j=1}^{d} \alpha_j = \pm 1$. Hence, all $\alpha_j$ are units and the last statement is proved as well.

1.7 Since

$$x^n - 1 = \prod_{j=0}^{n-1} (x - \zeta_n^j) = \prod_{d \mid n} \; \prod_{\gcd(j,n)=d} (x - \zeta_n^j),$$

then it suffices to show that

$$\prod_{\gcd(j,n)=d} (x - \zeta_n^j) = \Phi_{n/d}(x),$$

since $n/d$ runs over all divisors of $n$ as $d$ does. For $\gcd(j, n) = d$, let $j = dk$. Then $\zeta_n^j = \zeta_n^{dk} = \zeta_{n/d}^k$. Also, $\gcd(k, n/d) = 1$, so

$$\prod_{\gcd(j,n)=d} (x - \zeta_n^j) = \prod_{\gcd(k,n/d)=1} (x - \zeta_{n/d}^k) = \Phi_{n/d}(x).$$
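The factorization of $x^n - 1$ into cyclotomic polynomials in Solution 1.7 also gives a way to compute $\Phi_n(x)$ recursively, by dividing $x^n - 1$ by $\Phi_d(x)$ for every proper divisor $d$ of $n$. The following is a minimal sketch of that idea in Python; the helper names `poly_divide`, `cyclotomic`, and `euler_phi` are illustrative choices, not notation from the text, and polynomials are stored as coefficient sequences with the constant term first.

```python
from functools import lru_cache
from math import gcd

def poly_divide(num, den):
    """Divide integer polynomials exactly (lowest degree first); den must be monic."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for shift in range(len(quot) - 1, -1, -1):
        coeff = num[shift + len(den) - 1]
        quot[shift] = coeff
        for i, d in enumerate(den):
            num[shift + i] -= coeff * d
    assert not any(num), "division left a remainder"
    return quot

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n(x), from x^n - 1 divided by Phi_d(x) over all proper divisors d of n."""
    poly = [-1] + [0] * (n - 1) + [1]          # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_divide(poly, cyclotomic(d))
    return tuple(poly)

def euler_phi(n):
    """Number of k in 1..n with gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

# For a prime p the result is 1 + x + ... + x^(p-1), as in Solution 1.3.
print(cyclotomic(5))
print(cyclotomic(12))
```

The degree of each computed polynomial can be compared with `euler_phi(n)`, which is the number of $j$ with $\gcd(j, n) = 1$ appearing in the product above.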

Section 1.2

1.9 This is immediate from Corollary 1.3 since $m_{\alpha,F}(x)$ is irreducible over $F$.

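1.11 Since

$$m_{\alpha,\mathbb{Q}}(x) = \prod_{j=1}^{d} (x - \alpha_j) = x^d + a_{d-1}x^{d-1} + \cdots + a_1 x + a_0 \in \mathbb{Q}[x],$$

then the coefficients of $m_{\alpha,\mathbb{Q}}(x)$ are sums of products of the $\alpha_j$, so by Exercise 1.10, $a_j \in \mathbb{A}$ for all $j = 0, 1, 2, \ldots, d - 1$ if and only if $\alpha \in \mathbb{A}$. Hence,

$$m_{\alpha,\mathbb{Q}}(x) \in (\mathbb{Q} \cap \mathbb{A})[x] \text{ if and only if } \alpha \in \mathbb{A}.$$

However, by Corollary 1.2 on page 4, $\mathbb{Q} \cap \mathbb{A} = \mathbb{Z}$, which proves the result.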


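1.13 $\alpha = \beta\sigma + \delta$ where:

(a) $\sigma = 1 + i$, $\delta = 1$. (b) $\sigma = 7 + i$, $\delta = -37i$. (c) $\sigma = 1 - 2i$, $\delta = 6i$. (d) $\sigma = 2 + i$, $\delta = -18i$.

1.15 $\alpha = 4x - 5y + (5x + 4y)i$ for any $x, y \in \mathbb{Z}$, since

$$\alpha = (4 + 5i)(x + yi) = \beta\sigma.$$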

1.17 Suppose that $\gamma$ is a greatest common divisor of $\alpha$ and $\beta$. If $\gamma_j$ are associates of $\gamma$ for $j = 1, 2$, then there are $u_j \in U_F$ for $j = 1, 2$ such that $\gamma = u_j\gamma_j$. Thus, $\gamma_j \mid \gamma$, which implies that $\gamma_j$ divides both $\alpha$ and $\beta$ for $j = 1, 2$. Now if $\delta$ divides both $\alpha$ and $\beta$, then $\delta \mid \gamma$ by Definition 1.14 on page 21. Therefore, since $\gamma_j = u_j^{-1}\gamma$, then $\delta \mid \gamma_j$ for $j = 1, 2$. Hence, by Definition 1.14, $\gamma_j$ is a greatest common divisor of $\alpha$ and $\beta$ for $j = 1, 2$. Conversely, if all associates of $\gamma$ are greatest common divisors of $\alpha$ and $\beta$, then in particular $\gamma$ is one.

For the last statement, if $\gamma_j$ are gcds for $j = 1, 2$, then $\gamma_1 \mid \gamma_2$ and $\gamma_2 \mid \gamma_1$, so the result follows.

1.19 Let $\alpha = a + b\sqrt{D} \in \mathcal{O}_F$. If $\alpha \in U_F$, then $1 \sim \alpha$ so $\alpha v = 1$ for some $v \in U_F$. Therefore, $N_F(\alpha v) = N_F(\alpha)N_F(v) = 1$, so $N_F(\alpha) = \pm 1$. Conversely, if $N_F(\alpha) = \pm 1$, then $(a + b\sqrt{D})(a - b\sqrt{D}) = \pm 1$ so by Definition 1.3 on page 2, $\alpha \in U_F$.

1.21 Since $\alpha \sim \beta$, then there exists $u \in U_F$ such that $\alpha = u\beta$ so

$$|N_F(\alpha)| = |N_F(u\beta)| = |N_F(u)||N_F(\beta)| = |N_F(\beta)|,$$

by Exercise 1.19.

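1.23 Let $\alpha = 2 + i$ and $\beta = 2 - i$. Then $\gcd(N_F(\alpha), N_F(\beta)) = 5$. However, if $\delta \in \mathbb{Z}[i]$ such that $\delta \mid \alpha$ and $\delta \mid \beta$, then there exist $\sigma_1, \sigma_2 \in \mathbb{Z}[i]$ such that $\alpha = \delta\sigma_1$ and $\beta = \delta\sigma_2$. Thus, $N_F(\delta)N_F(\sigma_2) = N_F(\beta) = 5 = N_F(\alpha) = N_F(\delta)N_F(\sigma_1)$. Therefore, either $N_F(\delta) = 1$, in which case we have our counterexample since then $2 + i$ and $2 - i$ are relatively prime, or $N_F(\delta) = 5$ which implies $N_F(\sigma_1) = N_F(\sigma_2) = 1$. In the latter case, $\sigma_j \in \{\pm 1, \pm i\}$ for $j = 1, 2$, so $\alpha\sigma_1^{-1} = \beta\sigma_2^{-1}$, which implies $\alpha \sim \beta$ since $\alpha = \beta\sigma_1\sigma_2^{-1} = \beta u$, where $u \in \{\pm 1, \pm i\}$. However, all solutions of $\alpha = u\beta$ lead to contradictions. Hence, $N_F(\delta) = 1$ and we have our counterexample.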

1.25 (a) $1 + 2i$ where $12 + 9i = (6 - 3i)(1 + 2i)$ and $2 + 69i = (28 + 13i)(1 + 2i)$. (b) $1 + i$ where $2 + 8i = (5 + 3i)(1 + i)$ and $21 + 9i = (15 - 6i)(1 + i)$.

1.27 (a) $3 + 2i$ where $17 + 7i = (5 - i)(3 + 2i)$ and $71 + 4i = (17 - 10i)(3 + 2i)$. (b) $1$.
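The gcd values quoted in Solutions 1.25 and 1.27(a) can be checked with the Euclidean algorithm in $\mathbb{Z}[i]$. The sketch below is one possible implementation, not the book's own procedure: Gaussian integers are represented as pairs `(a, b)` meaning $a + bi$, the function names are hypothetical, and the quotient is obtained by rounding the exact quotient to the nearest Gaussian integer. The result is only determined up to a unit, so it is compared against all four associates.

```python
def g_mul(x, y):
    """Product of Gaussian integers given as (real, imag) pairs."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

def g_norm(x):
    return x[0] * x[0] + x[1] * x[1]

def g_divmod(x, y):
    """Return (q, r) with x = q*y + r and N(r) <= N(y)/2."""
    n = g_norm(y)
    # x / y = x * conj(y) / N(y); round each coordinate to the nearest integer
    re, im = g_mul(x, (y[0], -y[1]))
    q = ((2 * re + n) // (2 * n), (2 * im + n) // (2 * n))
    qy = g_mul(q, y)
    return q, (x[0] - qy[0], x[1] - qy[1])

def g_gcd(x, y):
    """A greatest common divisor of x and y, determined up to a unit."""
    while y != (0, 0):
        x, y = y, g_divmod(x, y)[1]
    return x

def associates(x):
    """The four associates x, ix, -x, -ix."""
    out = [x]
    for _ in range(3):
        out.append(g_mul(out[-1], (0, 1)))
    return out

print(g_gcd((12, 9), (2, 69)) in associates((1, 2)))   # Solution 1.25(a)
print(g_gcd((17, 7), (71, 4)) in associates((3, 2)))   # Solution 1.27(a)
```

The same routine applied to $2 + i$ and $2 - i$ returns a unit, in line with the counterexample of Solution 1.23.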

1.29 If $\alpha$ and $\beta$ are relatively prime, then by Theorem 1.10 on page 21, there exist $\sigma, \tau \in \mathbb{Z}[i]$ such that

$$1 = \alpha\sigma + \beta\tau.$$

Thus, by taking conjugates over this equation, we get

$$1 = \alpha'\sigma' + \beta'\tau',$$

which implies that $\alpha'$ and $\beta'$ are relatively prime since any common divisor of them must divide 1.

Conversely, if $\alpha'$ and $\beta'$ are relatively prime, then as above, there exist $\sigma_1, \tau_1 \in \mathbb{Z}[i]$ such that $1 = \alpha'\sigma_1 + \beta'\tau_1$. Taking conjugates over this equation, we get

$$1 = (\alpha')'\sigma_1' + (\beta')'\tau_1' = \alpha\sigma_1' + \beta\tau_1',$$

so $\alpha$ and $\beta$ are relatively prime.


1.31 If $a + bi$ is primary, then $a + b \equiv 1 \pmod 4$ where $a$ is odd and $b$ is even. Thus,

$$a + bi = 1 + \left(\frac{-1 + a + b}{4} + \frac{1 - a + b}{4}\, i\right)(2 + 2i) \equiv 1 \pmod{2 + 2i},$$

in $\mathbb{Z}[i]$.

Conversely, if $a + bi \equiv 1 \pmod{2 + 2i}$ in $\mathbb{Z}[i]$, then there exist $c, d \in \mathbb{Z}$ such that

$$a + bi = 1 + (c + di)(2 + 2i) = 1 + 2c - 2d + (2c + 2d)i.$$

By comparing coefficients,

$$a = 1 + 2c - 2d \equiv 1 \pmod 2, \qquad b = 2c + 2d \equiv 0 \pmod 2,$$

and

$$a + b = 1 + 4c \equiv 1 \pmod 4.$$

1.33 If $\alpha = a + bi$ is an odd Gaussian integer that is not primary, then one of the following holds: (a) $a$ is even; (b) $b$ is odd; or (c) $a + b \not\equiv 1 \pmod 4$. It remains to show that exactly one of its associates $-\alpha$, $i\alpha$, or $-i\alpha$ is primary.

If (a) holds, then $-\alpha$ cannot be primary since $-a$ is even. Also, $i\alpha = ai - b$. If $b$ is even, then

$$\alpha = 2\left(\frac{a}{2} + \frac{b}{2}\, i\right) = (1 + i)(1 - i)\left(\frac{a}{2} + \frac{b}{2}\, i\right),$$

which implies that $(1 + i) \mid \alpha$ in $\mathbb{Z}[i]$, contradicting that $\alpha$ is odd. Thus, $b$ is odd.

If $a - b \equiv 1 \pmod 4$, then $i\alpha$ is primary. However, $-i\alpha = b - ai$ is not primary since $b - a \equiv -1 \pmod 4$, so in this case exactly one of the associates, $i\alpha$, of $\alpha$ is primary, so we may assume that $a - b \not\equiv 1 \pmod 4$. Since $a - b$ is odd, then $a - b \equiv -1 \pmod 4$ which makes $-i\alpha$ the only primary associate. This takes care of case (a).

If (b) holds, and $a$ is odd, then there exist $c, d \in \mathbb{Z}$ such that

$$\alpha = a + bi = 2c + 1 + (2d + 1)i = 2(c + di) + 1 + i = (1 + i)[c - ci + di + d + 1],$$

so $(1 + i) \mid \alpha$, contradicting that $\alpha$ is odd. Hence, $a$ is even. However, this puts us back in case (a), with which we have already dealt.

If (c) holds, then given that we have already dealt with the cases where $a$ is even and $b$ is odd, we must have that $a$ is odd and $b$ is even. Since $a + b$ is odd, then $a + b \equiv -1 \pmod 4$, which makes $-\alpha = -a - bi$ primary, and neither of the other associates are primary.

This completes the analysis of the result for $\alpha$ not, itself, primary. If $\alpha$ is primary, then $-\alpha = -a - bi$ cannot be since $-a - b \equiv -1 \pmod 4$. Also,

$$\pm i\alpha = \pm ai \mp b$$

cannot be primary since $b$ is even.
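The two facts established in Solutions 1.31 and 1.33 lend themselves to a brute-force check over a small box of Gaussian integers. The sketch below takes "primary" to mean $a$ odd, $b$ even, and $a + b \equiv 1 \pmod 4$, as used in those solutions, and takes "odd" to mean not divisible by $1 + i$, which for $a + bi$ amounts to $a + b$ being odd; the function names are illustrative only.

```python
def is_primary(a, b):
    """a + bi is primary: a odd, b even, a + b = 1 (mod 4)."""
    return a % 2 == 1 and b % 2 == 0 and (a + b) % 4 == 1

def is_one_mod_2_plus_2i(a, b):
    """Test whether (a - 1 + bi) is divisible by 2 + 2i in Z[i]."""
    # (a - 1 + bi) / (2 + 2i) = (a - 1 + bi)(2 - 2i) / 8
    re = 2 * (a - 1) + 2 * b
    im = 2 * b - 2 * (a - 1)
    return re % 8 == 0 and im % 8 == 0

def associates(a, b):
    """alpha, i*alpha, -alpha, -i*alpha for alpha = a + bi."""
    return [(a, b), (-b, a), (-a, -b), (b, -a)]

box = range(-20, 21)

# Solution 1.31: primary is the same as being congruent to 1 modulo 2 + 2i.
same_notion = all(is_primary(a, b) == is_one_mod_2_plus_2i(a, b)
                  for a in box for b in box)

# Solution 1.33: an odd Gaussian integer has exactly one primary associate.
one_primary = all(sum(is_primary(*z) for z in associates(a, b)) == 1
                  for a in box for b in box if (a + b) % 2 == 1)

print(same_notion, one_primary)
```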

1.35 (a) $(1 + i)(2 + 5i)^2(3 - 2i)^3$

(b) $(2 - i)(1 - 4i)^2(1 - 2i)^3$

(c) $(5 + 2i)(4 + 5i)^2(3 - 2i)^3$

(d) $(1 + i)(2 + 7i)^2(3 - 8i)$


1.37 If $p = (a + bi)(c + di)$ for $a, b, c, d \in \mathbb{Z}$ and neither right-hand factor is a unit, then $N_F(a + bi) = a^2 + b^2 = p = N_F(c + di) = c^2 + d^2$, since $N_F(p) = p^2$. However, as noted in Example 1.15 on page 28, it is not possible for a prime $p \equiv 3 \pmod 4$ to be a sum of two integer squares. Hence, one of the aforementioned factors must be a unit, so $p$ is a Gaussian prime.

Section 1.3

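1.39 Let $\alpha, \beta \in R$ be nonzero elements and set

$$S = \{\gamma \in R : \gamma = \lambda\alpha + \rho\beta, \text{ for some } \lambda, \rho \in R\}.$$

Since $1_R\alpha + 0 \in S$ and $0 + 1_R\beta \in S$, then $S$ consists of more than just the zero element. If $f$ is the Euclidean function on $R$, we may choose an element $\gamma_0 = \lambda_0\alpha + \rho_0\beta \in S$ with $f(\gamma_0)$ as a minimum. Now let $\gamma = \lambda\alpha + \rho\beta \in S$ be arbitrary. By condition (b) of Euclidean domains in Definition 1.17 on page 32, there are $\sigma, \delta \in R$ such that

$$\gamma = \sigma\gamma_0 + \delta, \text{ with either } \delta = 0, \text{ or } f(\delta) < f(\gamma_0).$$

Since

$$\delta = \gamma - \sigma\gamma_0 = \lambda\alpha + \rho\beta - \sigma(\lambda_0\alpha + \rho_0\beta) = (\lambda - \sigma\lambda_0)\alpha + (\rho - \sigma\rho_0)\beta \in S,$$

then if $\delta \neq 0$, condition (b) of Euclidean domains tells us that

$$f(\delta) = f((\lambda - \sigma\lambda_0)\alpha + (\rho - \sigma\rho_0)\beta) < f(\gamma_0),$$

a contradiction to the minimality of $f(\gamma_0)$. Thus, $\delta = 0$, and so $\gamma = \sigma\gamma_0$. In other words, $\gamma_0 \mid \gamma$ for all $\gamma \in S$. In particular $\gamma_0 \mid \alpha$ and $\gamma_0 \mid \beta$. It remains to show that any common divisor of $\alpha$ and $\beta$ in $R$ must divide $\gamma_0$. Let $\gamma_1 \mid \alpha$ and $\gamma_1 \mid \beta$. Therefore, $\gamma_1 \mid (\lambda_0\alpha + \rho_0\beta) = \gamma_0$. Hence, $\gamma_0$ is a gcd of $\alpha$ and $\beta$ as required.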

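1.41 If the condition in the exercise holds and $\alpha\beta \neq 0$ for $\alpha, \beta \in R$, then $\alpha \mid \alpha\beta$ so $f(\alpha) \leq f(\alpha\beta)$, which is condition (a) in Definition 1.17. Conversely, if (a) holds and $\alpha \mid \beta$, then $\beta = \alpha\gamma$ for some $\gamma \in R$. Therefore, by (a), $f(\alpha) \leq f(\alpha\gamma) = f(\beta)$.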

1.43 If $\alpha \in R$ is a unit, there exists a $u \in R$ such that $u\alpha = 1_R$. Thus, by Exercise 1.42 and condition (a) of Definition 1.17,

$$f(1_R) \leq f(\alpha) \leq f(u\alpha) = f(1_R),$$

so $f(\alpha) = f(1_R)$. Conversely, if $f(\alpha) = f(1_R)$, then for any $\beta \in R$, $\beta = \alpha\gamma + \delta$ for some $\gamma, \delta \in R$. If $\delta \neq 0$, then $f(\delta) < f(\alpha) = f(1_R) \leq f(\delta)$, a contradiction. Hence, for each $\beta \in R$, $\alpha \mid \beta$. In particular, $\alpha \mid 1_R$, which makes it a unit in $R$.

1.45 Since $a + b\sqrt{D} \in \mathbb{Z}[\sqrt{D}] \subseteq \mathcal{O}_F$ for any quadratic field, and since $x^2 - Dy^2 = 1$ has infinitely many solutions for $D > 0$ by Pell’s solutions, then we have our result.
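To make the infinitude of units in Solution 1.45 concrete, one can generate solutions of $x^2 - Dy^2 = 1$ by taking powers of the smallest one. The following is a small sketch under the assumption that $D$ is a positive nonsquare integer with a small fundamental solution; the brute-force search in `fundamental_pell` is only for illustration (it becomes impractical for values such as $D = 61$), and both function names are hypothetical.

```python
from math import isqrt

def fundamental_pell(D):
    """Smallest (x, y) with y >= 1 and x^2 - D*y^2 = 1, by brute force on y."""
    y = 1
    while True:
        x2 = D * y * y + 1
        x = isqrt(x2)
        if x * x == x2:
            return x, y
        y += 1

def pell_solutions(D, count):
    """First `count` positive solutions, as powers of the fundamental one."""
    x1, y1 = fundamental_pell(D)
    x, y = x1, y1
    out = []
    for _ in range(count):
        out.append((x, y))
        # (x + y*sqrt(D)) * (x1 + y1*sqrt(D))
        x, y = x1 * x + D * y1 * y, x1 * y + y1 * x
    return out

for x, y in pell_solutions(2, 5):
    # each x + y*sqrt(2) is a unit of norm 1 in Z[sqrt(2)]
    print(x, y, x * x - 2 * y * y)
```

Each pair $(x, y)$ produced this way gives a distinct unit $x + y\sqrt{D}$ of norm 1.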


1.47 If $2 + i = (a + bi)(c + di)$ for $a, b, c, d \in \mathbb{Z}$, then

$$N_F(2 + i) = 5 = N_F(a + bi)N_F(c + di) = (a^2 + b^2)(c^2 + d^2),$$

so either $a^2 + b^2 = 1$ or $c^2 + d^2 = 1$. Therefore by Exercise 1.19 on page 29, one of them is a unit. The argument for $2 - i$ is the same. Thus, $2 + i$ and $2 - i$ are irreducible.

1.49 If $\delta_1$ and $\delta_2$ are least common multiples of $\alpha$ and $\beta$, then by property (b) of Definition 1.21 on page 40, $\delta_1 \mid \delta_2$ and $\delta_2 \mid \delta_1$, so $\delta_1 \sim \delta_2$ by Exercise 1.16 on page 29.

1.51 The converse is false since 2 is irreducible in $\mathbb{Z}[\sqrt{10}]$ by Example 1.17, but $N_F(2) = 4$.

1.53 The converse is false since $\mathbb{Z}[i]$ is a UFD by Theorem 1.15 on page 34, and 3 is a Gaussian prime by Exercise 1.37 on page 30, but $N_F(3) = 9$.

Section 1.4

1.55 We may factor in the Gaussian integers $\mathbb{Z}[i]$ as follows.

$$(y + i)(y - i) = x^3.$$

By the same method as in the proof of Theorems 1.19–1.20 on pages 47 and 48 we have that $y + i$ and $y - i$ are relatively prime. Thus, by unique factorization ensured for the Gaussian integers, there is a $\beta = a + bi \in \mathbb{Z}[i]$ such that

$$y + i = \beta^3 = (a + bi)^3,$$

and

$$y - i = (a - bi)^3.$$

Subtracting the two equations and dividing by $2i$ we get

$$1 = b(3a^2 - b^2).$$

Therefore, $b = \pm 1$. However, $b = 1$ implies that $2 = 3a^2$, which is impossible, so $b = -1$. This forces $1 = -(3a^2 - 1)$. Thus, $a = 0$, so $y = 0$. Hence, $x = (-i)i = 1$, which secures the result.
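The conclusion of Solution 1.55, that $y^2 + 1 = x^3$ has only the integer solution $(x, y) = (1, 0)$, can be sanity-checked numerically. The search below is only a finite check up to an arbitrarily chosen bound and is no substitute for the proof; the function name `solutions` and the bound are illustrative assumptions.

```python
from math import isqrt

def solutions(bound):
    """All integer (x, y) with y^2 + 1 = x^3 and 1 <= x <= bound."""
    found = set()
    for x in range(1, bound + 1):      # x >= 1 because x^3 = y^2 + 1 >= 1
        y2 = x ** 3 - 1
        y = isqrt(y2)
        if y * y == y2:
            found.add((x, y))
            found.add((x, -y))
    return sorted(found)

print(solutions(100000))
```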

1.57 Since $\alpha \mid N_F(\alpha) = \alpha\alpha' \in \mathbb{Z}$, then there is a least element $n \in \mathbb{N}$ such that $\alpha \mid n$. If $n = n_1n_2$ for $n_j \in \mathbb{N}$, $j = 1, 2$, with $0 < n_1 \leq n_2 < n$, then $\alpha \mid (n_1n_2)$, so $\alpha \mid n_1$ or $\alpha \mid n_2$, contradicting the minimality of $n$ in this regard. Hence, $n$ is a rational prime, say $n = p$. If $\alpha \mid q$ where $q$ is a rational prime with $q \neq p$, then by the Euclidean algorithm for rational integers, there exist $a, b \in \mathbb{Z}$ such that $1 = ap + bq$. Since $\alpha \mid p$ and $\alpha \mid q$, then $\alpha \mid 1$, a contradiction, so $p$ is the only rational prime divisible by $\alpha$.

1.59 By Exercises 1.56–1.57, there is a unique rational prime $p$ such that $N_F(\alpha) = \pm p$.

1.61 If 2 is not prime in $\mathcal{O}_F$, then by Exercise 1.56, $2 = \alpha\alpha'$ where $\alpha = (a + b\sqrt{D})/2 \in \mathcal{O}_F$. Thus,

$$\pm 2 = N_F(\alpha) = \alpha\alpha' = \frac{a^2 - b^2D}{4},$$

where $a, b$ have the same parity. If both are odd, then

$$\pm 8 = a^2 - b^2D \equiv 1 - D \equiv -4 \pmod 8,$$

a contradiction. If both are even, then $\pm 2 = (a/2)^2 - (b/2)^2D$, so both $a/2$ and $b/2$ have to be odd. Therefore, $\pm 2 \equiv 1 - D \equiv 4 \pmod 8$, a contradiction. Hence, 2 is prime in $\mathcal{O}_F$ as required.

1.63 If $p \mid D$, then $|D| = pn$ for some $n \in \mathbb{N}$. If $n = 1$, then $p = \pm\sqrt{D} \cdot \sqrt{D}$, where $\sqrt{D}$ is a prime in $\mathcal{O}_F$ by Exercise 1.52 on page 46, since $N_F(\sqrt{D}) = \pm p$. Thus, $p \sim (\sqrt{D})^2$. If $n > 1$, then

$$D = p(D/p) = \sqrt{D} \cdot \sqrt{D}. \tag{S1}$$

However, $p$ does not divide $\sqrt{D}$, since to do so would mean that $\sqrt{D} = p(a + b\sqrt{D})/2$ where $a, b \in \mathbb{Z}$ have the same parity, by Theorem 1.3 on page 6. However, this means that $a = 0$ and $pb/2 = 1$, where $b$ must be even, a contradiction. Thus, $p$ is not a prime in $\mathcal{O}_F$. Therefore, by Exercise 1.56, $\alpha \mid p$ where $\alpha$ is prime in $\mathcal{O}_F$ and $N_F(\alpha) = \pm p$. Now, by (S1), $\alpha \mid \sqrt{D}$, so $\alpha^2 \mid D$, which implies that $\alpha^2 \mid p$ since $\alpha \nmid D/p$. Thus, $p = \alpha^2\beta$ where $\beta \in \mathcal{O}_F$. However, $N_F(p) = p^2 = N_F(\alpha^2)N_F(\beta) = N_F(\alpha)^2N_F(\beta) = p^2N_F(\beta)$, so $N_F(\beta) = 1$, which means that $\beta \in U_F$. Therefore, $p \sim \alpha^2$.

Section 2.1

2.1 Let $M$ be a $\mathbb{Z}$-module. If $r \in \mathbb{Z}$, and $m \in M$, then

$$r \cdot m = \underbrace{m + \cdots + m}_{r},$$

so the properties of an additive abelian group are inherited from this action. Conversely, if $M$ is an additive abelian group, then the addition within the group gives the $\mathbb{Z}$-module action as above.

2.3 We only prove this for $\sigma = 1$ since the other case is similar.

Suppose that $I$ is an ideal. Therefore, $a\sqrt{D} \in I$, so $c \mid a$ by the minimality of $c$. We have

$$\sqrt{D}(b + c\sqrt{D}) = b\sqrt{D} + cD \in I,$$

so $c \mid b$. Moreover, since

$$\left(\frac{b}{c} - \sqrt{D}\right)(b + c\sqrt{D}) = \frac{b^2 - c^2D}{c} \in I,$$

then

$$a \mid (b^2 - c^2D)/c.$$

In other words,

$$ac \mid (b^2 - c^2D).$$

Conversely, assume that $I$ satisfies the conditions. To verify that $I$ is an ideal, we need to show that $a\sqrt{D} \in I$ and $(b + c\sqrt{D})\sqrt{D} \in I$. This is a consequence of the following identities, the details of which we leave to the reader for verification:

$$a\sqrt{D} = -(b/c)a + (a/c)(b + c\sqrt{D}),$$

and

$$b\sqrt{D} + cD = -(b^2 - c^2D)/c + b(b + c\sqrt{D})/c,$$

so $I$ is an ideal.

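2.5 If $[\alpha, \beta] = [\gamma, \delta]$, there are integers $x, x_0, y, y_0, z, z_0, w, w_0$ such that

$$\alpha = x\gamma + y\delta, \qquad \beta = w\gamma + z\delta,$$

and

$$\gamma = x_0\alpha + y_0\beta, \qquad \delta = w_0\alpha + z_0\beta.$$

These two sets of equations translate into two matrix equations as follows.

$$\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = X \begin{pmatrix} \gamma \\ \delta \end{pmatrix}, \quad \text{where} \quad X = \begin{pmatrix} x & y \\ w & z \end{pmatrix},$$

and

$$\begin{pmatrix} \gamma \\ \delta \end{pmatrix} = X_0 \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \quad \text{where} \quad X_0 = \begin{pmatrix} x_0 & y_0 \\ w_0 & z_0 \end{pmatrix}.$$

Hence,

$$\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = XX_0 \begin{pmatrix} \alpha \\ \beta \end{pmatrix}.$$

Therefore, the determinants of $X$ and $X_0$ are $\pm 1$, so the result follows.

Conversely, assume that the matrix equation holds as given in the exercise. Then clearly

$$[\alpha, \beta] \subseteq [\gamma, \delta].$$

Since the determinant of $X$ is $\pm 1$, we can multiply both sides of the matrix equation by the inverse of $X$ to get that $\gamma$ and $\delta$ are linear combinations of $\alpha$ and $\beta$. Thus,

$$[\gamma, \delta] \subseteq [\alpha, \beta].$$

The result is now proved.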

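2.7 Let

$$J_i = (a_i, (b_i + \sqrt{\Delta})/2) \text{ for } i = 1, 2$$

be $\mathcal{O}_F$-ideals such that $J_1J_2 \subseteq \mathcal{P}$. Then by the multiplication formulas given on page 59, $J_1J_2 = (a_3, (b_3 + \sqrt{\Delta})/2)$ where $a_3 = a_1a_2/g \equiv 0 \pmod p$ with $g = \gcd(a_1, a_2, (b_1 + b_2)/2)$. If $p \nmid a_2$ (which means that $J_2 \not\subseteq \mathcal{P}$), then $p \mid a_1$ since $p$ cannot divide $g$ given that it does not divide $a_2$. Thus, to show that $J_1 \subseteq \mathcal{P}$, it remains to show that $b_1 = 2pn + b$ for some $n \in \mathbb{Z}$, by Exercise 2.6. Now, by Exercise 2.4,

$$b_1^2 \equiv \Delta \pmod{4a_1} \quad \text{and} \quad b^2 \equiv \Delta \pmod{4p},$$

so $b_1^2 \equiv b^2 \pmod{4p}$. Since $p$ is prime, then $b_1 \equiv \pm b \pmod{2p}$. If

$$b_1 \equiv -b \pmod{2p}, \text{ then } J_1 \subseteq \mathcal{P}' = (p, (-b + \sqrt{\Delta})/2),$$

so if $(-b + \sqrt{\Delta})/2 \in \mathcal{P}$, then $J_1 \subseteq \mathcal{P}$ so we are done by Theorem 2.2 on page 57. If $(-b + \sqrt{\Delta})/2 \notin \mathcal{P}$, then $\mathcal{P} \cap \mathcal{P}' = (p)$, so $a_3 = 1$, and this forces $p \mid 1$, a contradiction. The remaining case is $b_1 \equiv b \pmod{2p}$, so $b_1 = 2pn + b$ for some $n \in \mathbb{Z}$, as required.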

Section 2.2

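2.9 Let $\mathcal{P}_j$ be distinct prime $R$-ideals with

$$I = \prod_{j=1}^{r} \mathcal{P}_j^{a_j} \quad \text{and} \quad J = \prod_{j=1}^{r} \mathcal{P}_j^{b_j},$$

where $a_j, b_j \geq 0$. Choose $\alpha_j \in \mathcal{P}_j^{a_j} \setminus \mathcal{P}_j^{a_j+1}$ for $j = 1, 2, \ldots, r$. By Theorem 2.18 on page 84, there exists an $\alpha \in R$ such that

$$\alpha - \alpha_j \in \mathcal{P}_j^{a_j+1} \text{ for all } j = 1, \ldots, r.$$

Thus,

$$\alpha \in \mathcal{P}_j^{a_j} \text{ and } \alpha \notin \mathcal{P}_j^{a_j+1} \text{ for } 1 \leq j \leq r.$$

Therefore,

$$\alpha \in \bigcap_{j=1}^{r} \mathcal{P}_j^{a_j} \subseteq I.$$

Therefore, by Remark 2.10 on page 81,

$$I \subseteq \gcd((\alpha), IJ) = (\alpha) + IJ \subseteq I,$$

so $\gcd((\alpha), IJ) = I$, as required.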

2.11 By Exercise 2.11, there is an $\alpha \in I$ such that

$$(\alpha) + IJ = I. \tag{S2}$$

Since $(\alpha) \subseteq I$, then $I \mid (\alpha)$ by Corollary 2.5 on page 76, so there exists an $R$-ideal $H$ such that $(\alpha) = HI$. Substituting this into (S2), we get

$$I = IH + IJ = I(H + J),$$

by Exercise 2.10. Hence, by Corollary 2.7 on page 77, $R = H + J = \gcd(H, J)$.

2.13 If $R$ does not satisfy the DCC, there exists an infinite nonterminating descending sequence of ideals $\{I_j\}$, so there can exist no minimal element in this set. Conversely, if $R$ satisfies the DCC, then any nonempty collection $S$ of ideals has an element $I$. If $I$ is not minimal, then it contains an element $I_1$. If $I_1$ is not minimal, then it contains an ideal $I_2$, and so on. Eventually, due to DCC, the process terminates, so the set contains a minimal element.


2.15 Since $s$ is integral over $R$, there exists a monic polynomial $f(x) = \sum_{j=0}^{d} r_jx^j \in R[x]$ such that $f(s) = 0$. Thus,

$$s^d = -\sum_{j=0}^{d-1} r_js^j,$$

so $s^{d+k}$ for any nonnegative integer $k$ can be expressed as an $R$-linear combination of $s^i$ for $i = 0, 1, \ldots, d - 1$. Hence,

$$R[s] = R + Rs + \cdots + Rs^{d-1},$$

which means that $R[s]$ is finitely generated as an $R$-module.

2.17 First we show that $I^{-1}$ is unique for an invertible fractional $R$-ideal in the sense that if $IJ = R$ for some $J \in G$, then $J = I^{-1}$.

If $I \in G$ is invertible, and $J \in G$ with $IJ = R$, then

$$I^{-1} = I^{-1}R = I^{-1}(IJ) = (I^{-1}I)J = RJ = J,$$

so that $I^{-1} = J$.

If (a) holds, then, by the above, every nonzero fractional ideal $I$ has a unique inverse given by $I^{-1}$. Since $I^{-1}H \in G$ for any $I, H \in G$ with $I$ nonzero, this shows that $G$ is a multiplicative group.

Conversely, if (b) holds, then every nonzero $I \in G$ has a unique inverse $J$, namely $IJ = R$. As above, $J = I^{-1}$, so $I$ is invertible.

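2.19 (a) By Theorem 2.12 on page 77,

$$I = \prod_{\mathcal{P}} \mathcal{P}^{\mathrm{ord}_{\mathcal{P}}(I)} \quad \text{and} \quad J = \prod_{\mathcal{P}} \mathcal{P}^{\mathrm{ord}_{\mathcal{P}}(J)},$$

where there are only finitely many nonzero exponents. Moreover,

$$\prod_{\mathcal{P}} \mathcal{P}^{\mathrm{ord}_{\mathcal{P}}(IJ)} = IJ = \prod_{\mathcal{P}} \mathcal{P}^{\mathrm{ord}_{\mathcal{P}}(I) + \mathrm{ord}_{\mathcal{P}}(J)},$$

so via the uniqueness guaranteed by Theorem 2.12,

$$\mathrm{ord}_{\mathcal{P}}(IJ) = \mathrm{ord}_{\mathcal{P}}(I) + \mathrm{ord}_{\mathcal{P}}(J).$$

(b) Let $H = I + J$. Then it follows from Exercise 2.10 that

$$IH^{-1} + JH^{-1} = (I + J)H^{-1} = HH^{-1} = R,$$

where the last equality comes from Theorem 2.11 on page 76. Thus, we have that

$$\text{both } IH^{-1} \subseteq IH^{-1} + JH^{-1} = R \text{ and } JH^{-1} \subseteq IH^{-1} + JH^{-1} = R,$$

so both $IH^{-1}$ and $JH^{-1}$ are integral $R$-ideals. If both $IH^{-1} \subseteq \mathcal{P}$ and $JH^{-1} \subseteq \mathcal{P}$, then

$$R = IH^{-1} + JH^{-1} \subseteq \mathcal{P} + \mathcal{P} = \mathcal{P},$$

contradicting that $\mathcal{P}$ is prime. Thus, either $IH^{-1} \not\subseteq \mathcal{P}$ or $JH^{-1} \not\subseteq \mathcal{P}$. Therefore, by Corollary 2.5 on page 76, either $\mathcal{P} \nmid IH^{-1}$ or $\mathcal{P} \nmid JH^{-1}$. Thus,

$$\min(\mathrm{ord}_{\mathcal{P}}(IH^{-1}), \mathrm{ord}_{\mathcal{P}}(JH^{-1})) = 0. \tag{S3}$$

Also, by part (a),

$$\mathrm{ord}_{\mathcal{P}}(I) = \mathrm{ord}_{\mathcal{P}}(IH^{-1}H) = \mathrm{ord}_{\mathcal{P}}(IH^{-1}) + \mathrm{ord}_{\mathcal{P}}(H),$$

and

$$\mathrm{ord}_{\mathcal{P}}(J) = \mathrm{ord}_{\mathcal{P}}(JH^{-1}H) = \mathrm{ord}_{\mathcal{P}}(JH^{-1}) + \mathrm{ord}_{\mathcal{P}}(H).$$

Therefore, by (S3),

$$\min(\mathrm{ord}_{\mathcal{P}}(I), \mathrm{ord}_{\mathcal{P}}(J)) = \mathrm{ord}_{\mathcal{P}}(H) = \mathrm{ord}_{\mathcal{P}}(I + J).$$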

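(c) Suppose that $\mathrm{ord}_{\mathcal{P}}(I) = a$. Select an element $\alpha_{\mathcal{P}} \in \mathcal{P}^a \setminus \mathcal{P}^{a+1}$. Then $\mathrm{ord}_{\mathcal{P}}((\alpha_{\mathcal{P}})) = a = \mathrm{ord}_{\mathcal{P}}(I)$. Then by induction, for any prime $R$-ideal $\mathcal{Q}$ dividing $I$, there exists an element

$$\alpha_{\mathcal{Q}} \in \mathcal{Q}^{\mathrm{ord}_{\mathcal{Q}}(I)} \prod_{\substack{\mathcal{P} \mid I \\ \mathcal{P} \neq \mathcal{Q}}} \mathcal{P}^{\mathrm{ord}_{\mathcal{P}}(I)+1} \setminus \mathcal{Q}^{\mathrm{ord}_{\mathcal{Q}}(I)+1} \prod_{\substack{\mathcal{P} \mid I \\ \mathcal{P} \neq \mathcal{Q}}} \mathcal{P}^{\mathrm{ord}_{\mathcal{P}}(I)+1},$$

so $\mathrm{ord}_{\mathcal{Q}}((\alpha_{\mathcal{Q}})) = \mathrm{ord}_{\mathcal{Q}}(I)$. Hence, by selecting $\alpha = \sum_{\mathcal{Q} \mid I} \alpha_{\mathcal{Q}} \in F$, we have, by inductively extrapolating from part (b), that if $\mathcal{P}_j \mid I$ for $j = 1, 2, \ldots, n$ are all the distinct prime $R$-ideals dividing $I$, then

$$\mathrm{ord}_{\mathcal{P}_1}((\alpha)) = \mathrm{ord}_{\mathcal{P}_1}\left(\sum_{j=1}^{n} (\alpha_{\mathcal{P}_j})\right) = \min\left(\mathrm{ord}_{\mathcal{P}_1}((\alpha_{\mathcal{P}_1})), \mathrm{ord}_{\mathcal{P}_1}((\alpha_{\mathcal{P}_2})), \ldots, \mathrm{ord}_{\mathcal{P}_1}((\alpha_{\mathcal{P}_n}))\right) = \mathrm{ord}_{\mathcal{P}_1}((\alpha_{\mathcal{P}_1})),$$

namely $\mathrm{ord}_{\mathcal{P}_1}((\alpha)) = \mathrm{ord}_{\mathcal{P}_1}((\alpha_{\mathcal{P}_1}))$, as required.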

Section 2.3

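2.21 Let $F = \mathbb{Q}(\alpha)$ where $\alpha = \sqrt[3]{-2}$ for which

$$m_{\alpha,\mathbb{Q}}(x) = x^3 + 2 \quad \text{and} \quad m_{\alpha^2,\mathbb{Q}}(x) = x^3 - 4.$$

Hence, $\alpha, \alpha^2 \in \mathcal{O}_F$. Since

$$\deg(m_{\alpha,\mathbb{Q}}) = \deg(m_{\alpha^2,\mathbb{Q}}) = |F : \mathbb{Q}| = 3,$$

then $\{1, \alpha, \alpha^2\}$ provides a $\mathbb{Z}$-basis for $\mathbb{Z}[\sqrt[3]{-2}]$. Since $\alpha \in \mathbb{A} \cap F = \mathcal{O}_F$, then $\mathbb{Z}[\alpha] \subseteq \mathcal{O}_F$. It remains to show equality.

Since $|F : \mathbb{Q}| = 3$, then $|\mathcal{O}_F : \mathbb{Z}| = 3$. However, $|\mathcal{O}_F : \mathbb{Z}| = |\mathcal{O}_F : \mathbb{Z}[\alpha]| \cdot |\mathbb{Z}[\alpha] : \mathbb{Z}|$, so either $|\mathcal{O}_F : \mathbb{Z}[\alpha]| = 1$ or $|\mathcal{O}_F : \mathbb{Z}[\alpha]| = 3$. In the former case, we are done since then $\mathcal{O}_F = \mathbb{Z}[\alpha]$. In the latter case, $|\mathbb{Z}[\alpha] : \mathbb{Z}| = 1$ is forced and this means that $\mathbb{Z}[\alpha] = \mathbb{Z}$ so $\alpha \in \mathbb{Z}$ which is false.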

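2.23 Let $\{\alpha_i\}_{i \in I}$ and $\{\beta_j\}_{j \in J}$ be bases for $K$ over $F$ and $E$ over $K$, respectively, where $I$ and $J$ are indexing sets, possibly infinite. We now show that the set of products

$$\{\alpha_i\beta_j\}_{(i,j) \in I \times J}$$

is a basis for $E$ over $F$. If $\alpha \in E$, then it has a unique representation

$$\alpha = \sum_{j \in J} \gamma_j\beta_j, \text{ where } \gamma_j \in K \text{ for } j \in J.$$

Also, for each $j \in J$, there is a unique representation

$$\gamma_j = \sum_{i \in I} \delta_i\alpha_i, \text{ where } \delta_i \in F \text{ for } i \in I.$$

Hence, we have a unique representation

$$\alpha = \sum_{j \in J} \gamma_j\beta_j = \sum_{j \in J} \beta_j \sum_{i \in I} \delta_i\alpha_i = \sum_{j \in J} \sum_{i \in I} \delta_i\alpha_i\beta_j,$$

which yields the result.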

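2.25 By examining coefficients, we have

$$f_F(\beta) = \prod_{j=1}^{n} (x - \beta_j) = x^n - T_F(\beta)x^{n-1} + \cdots + (-1)^nN_F(\beta),$$

so by Exercise 2.24, $N_F(\beta), T_F(\beta) \in \mathbb{Q}$. If $\alpha \in \mathcal{O}_F$, then by Corollary 1.4 on page 11, $m_{\alpha,\mathbb{Q}}(x) \in \mathbb{Z}[x]$, so by Exercise 2.24 again, $N_F(\beta), T_F(\beta) \in \mathbb{Z}$.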

2.27 Since, for a primitive cube root of unity $\zeta_3$, we have

$$N_F(\beta) = (a + b\alpha + c\alpha^2)(a + b\zeta_3\alpha + c\zeta_3^2\alpha^2)(a + b\zeta_3^2\alpha + c\zeta_3^4\alpha^2),$$

then using the fact that $\sum_{j=0}^{2} \zeta_3^j = 0$ we get

$$N_F(\beta) = (a + b\alpha + c\alpha^2)((a^2 + 2bc) - (ab + 2c^2)\alpha + (b^2 - ac)\alpha^2),$$

so, by simplifying,

$$N_F(\beta) = a^3 - 2b^3 + 4c^3 + 6abc.$$
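The closed form for the norm in Solution 2.27 can be cross-checked against a different description of the norm: the determinant of the multiplication-by-$\beta$ map on the basis $\{1, \alpha, \alpha^2\}$. The sketch below assumes, as in Solution 2.21, that $\alpha^3 = -2$, which is the relation the closed form is consistent with; the function names are illustrative.

```python
def norm_by_determinant(a, b, c):
    """Determinant of multiplication by a + b*alpha + c*alpha^2, alpha^3 = -2."""
    # Rows hold the coordinates of beta*1, beta*alpha, beta*alpha^2
    # with respect to the basis 1, alpha, alpha^2.
    m = [[a, b, c],
         [-2 * c, a, b],
         [-2 * b, -2 * c, a]]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def norm_by_formula(a, b, c):
    """Closed form from Solution 2.27."""
    return a**3 - 2 * b**3 + 4 * c**3 + 6 * a * b * c

box = range(-6, 7)
agree = all(norm_by_determinant(a, b, c) == norm_by_formula(a, b, c)
            for a in box for b in box for c in box)
print(agree)
```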

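2.29 Since $\beta \mid \gamma$, then there is a $\delta \in \mathcal{O}_F$ such that $\gamma = \beta\delta$, so by Exercise 2.28,

$$N_F(\gamma) = N_F(\beta\delta) = N_F(\beta)N_F(\delta),$$

so $N_F(\beta) \mid N_F(\gamma)$.

2.31 Since $(5^7 - 1) \mid (5^{77} - 1)$ and $4 \mid (5^7 - 1)$, then $(5^7 - 1)/4 = 19531 \mid (5^{77} - 1)$.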

2.33 Since $3(3^{239} - 1) = 3^{240} - 3 = x^3 - 3$, where $x = 3^{80}$, and $N_F(a + b\sqrt[3]{3}) = a^3 + 3b^3$, for $F = \mathbb{Q}(\sqrt[3]{3})$, then $N_F(x - \sqrt[3]{3}) = x^3 - 3$. An initial run shows that $\gcd(3^{240} - 3, a^3 + 3b^3) = 479$, for $a = 14$, and $b = 185$, so $479 \mid (3^{239} - 1)$.
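The arithmetic in Solutions 2.31 and 2.33 is easy to reproduce with arbitrary-precision integers. The snippet below is only a numerical confirmation of the stated divisibilities, using the values $a = 14$ and $b = 185$ quoted in Solution 2.33; the variable names are illustrative.

```python
from math import gcd

# Solution 2.31: (5^7 - 1)/4 divides 5^77 - 1.
q = (5**7 - 1) // 4
divides_231 = (5**77 - 1) % q == 0

# Solution 2.33: compare 3^240 - 3 with the norm a^3 + 3*b^3 of a + b*cbrt(3).
a, b = 14, 185
norm = a**3 + 3 * b**3
g = gcd(3**240 - 3, norm)

# Modular exponentiation confirms 479 | 3^239 - 1 directly.
divides_233 = pow(3, 239, 479) == 1

print(q, divides_231)
print(g, divides_233)
```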

Section 3.1

3.1 Clearly, since $f(x, y) = g(X, Y)$ for

$$X = px + qy \tag{S4}$$

and

$$Y = rx + sy, \tag{S5}$$

then equivalent forms represent the same integers by definition. Since $ps - qr = \pm 1$ and from (S4)–(S5), $x = \pm(sX - qY)$ and $y = \pm(rX - pY)$, so $\gcd(x, y) = 1$ if and only if $\gcd(X, Y) = 1$.


3.3 Suppose that $f(x, y) = g(X, Y)$ where $X = px + qy$, $Y = rx + sy$, and $ps - qr = 1$. If we set $x = X$ and $Y = y$, namely $p = s = 1$ and $q = r = 0$, then $f(x, y) = g(x, y)$ and we have the reflexive property. Also, since

$$g(X_1, Y_1) = f(x, y),$$

where $X_1 = sx - qy$ and $Y_1 = py - rx$, then we have the symmetry property.

Lastly, for transitivity, assume that

$$g(X, Y) = h(PX + QY, RX + SY),$$

where $PS - QR = 1$. Then since

$$PX + QY = P(px + qy) + Q(rx + sy) = (Pp + Qr)x + (Pq + Qs)y = P_1x + Q_1y$$

and

$$RX + SY = R(px + qy) + S(rx + sy) = (Rp + Sr)x + (Rq + Ss)y = R_1x + S_1y$$

we have

$$P_1S_1 - Q_1R_1 = (Pp + Qr)(Rq + Ss) - (Pq + Qs)(Rp + Sr)$$

$$= PRpq + QRrq + PpSs + QrSs - PqRp - PqSr - QsRp - QsSr$$

$$= QR(rq - sp) + PS(ps - qr) = -QR + PS = 1,$$

so

$$f(x, y) = h(P_1x + Q_1y, R_1x + S_1y),$$

with $P_1S_1 - Q_1R_1 = 1$, which is the transitive property.

3.5 If $f \sim g$, $f = (a, b, c)$, $g = (a_1, b_1, c_1)$ with $f$ primitive, then

$$ax^2 + bxy + cy^2 = a_1(px + qy)^2 + b_1(px + qy)(rx + sy) + c_1(rx + sy)^2$$

$$= (a_1p^2 + b_1pr + c_1r^2)x^2 + (2pqa_1 + (ps + rq)b_1 + 2rsc_1)xy + (q^2a_1 + qsb_1 + c_1s^2)y^2,$$

so if $\gcd(a_1, b_1, c_1) = g$, then $g \mid \gcd(a, b, c) = 1$, and the result is secured.

3.7 Applying the substitution $x = pX + qY$ and $y = rX + sY$ to the form

$$f(x, y) = ax^2 + bxy + cy^2,$$

we get the form $AX^2 + BXY + CY^2$, where

$$A = ap^2 + bpr + cr^2,$$

$$B = 2apq + b(ps + qr) + 2crs,$$

$$C = aq^2 + bqs + cs^2.$$

A straightforward calculation shows that

$$B^2 - 4AC = (b^2 - 4ac)(ps - qr)^2,$$

which yields the result.
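The "straightforward calculation" in Solution 3.7 can be spot-checked by applying the substitution to many integer forms and comparing discriminants. The sketch below uses the expressions for $A$, $B$, $C$ given in that solution; the names `transform` and `disc` are illustrative, and the check covers only a finite grid of coefficients.

```python
from itertools import product

def transform(form, p, q, r, s):
    """Coefficients (A, B, C) after x = pX + qY, y = rX + sY."""
    a, b, c = form
    A = a * p * p + b * p * r + c * r * r
    B = 2 * a * p * q + b * (p * s + q * r) + 2 * c * r * s
    C = a * q * q + b * q * s + c * s * s
    return A, B, C

def disc(form):
    a, b, c = form
    return b * b - 4 * a * c

small = range(-2, 3)
identity_holds = all(
    disc(transform((a, b, c), p, q, r, s)) == disc((a, b, c)) * (p * s - q * r) ** 2
    for a, b, c in product(small, repeat=3)
    for p, q, r, s in product(small, repeat=4)
)
print(identity_holds)

# A substitution with ps - qr = 1 leaves the discriminant unchanged.
print(transform((1, 1, 2), 2, 1, 1, 1), disc((1, 1, 2)))
```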


3.9 If the primitive form $f(x, y)$ properly represents $n \in \mathbb{Z}$, then

$$f(x, y) = nx^2 + bxy + cy^2$$

may be assumed by Exercise 3.2. Therefore, $D = b^2 - 4nc$. Thus, $D$ is a quadratic residue modulo $n$. If $n$ is even, then $D \equiv b^2 \pmod 8$ where $b$ is necessarily odd, so $D \equiv 1 \pmod 8$. Conversely, if $D \equiv b^2 \pmod{|n|}$, where $n$ is odd, we may assume that $D$ and $b$ have the same parity by replacing $b$ by $b + n$, if necessary. Therefore, since $D \equiv 0, 1 \pmod 4$, then $D \equiv b^2 \pmod{4|n|}$, which implies that there exists an integer $m$ such that $D = b^2 - 4mn$. Hence, $nx^2 + bxy + my^2$ properly represents $n$ and has discriminant $D$. Lastly, since $\gcd(D, n) = 1$, then $\gcd(n, b, m) = 1$, so $nx^2 + bxy + my^2$ is primitive. If $n$ is even and $D \equiv b^2 \pmod{4|n|}$, then there exists an integer $m$ such that $D = b^2 - 4mn$ and we proceed as above.

3.11 Let $f(x, y) = ax^2 + bxy + cy^2$ be a reduced form of discriminant $D < 0$. Thus, $b^2 \leq a^2$ and $a \leq c$. Therefore,

$$-D = 4ac - b^2 \geq 4a^2 - a^2 = 3a^2,$$

whence,

$$a \leq \sqrt{(-D)/3}.$$

For $D$ fixed, $|b| \leq a$. This together with the latter inequality imply that there are only finitely many choices for $a$ and $b$. However, since $b^2 - 4ac = D$, then there are only finitely many choices for $c$. We have shown that there are only finitely many reduced forms of discriminant $D$. By Theorem 3.1 on page 100, the number of equivalence classes of such forms is finite, which is the required result.

3.13 Since a reduced form has coefficients satisfying $b^2 \le a^2 \le ac$ and $b^2 - 4ac = D$, then

\[ D = b^2 - 4ac \le -3ac, \]

so $ac \le -D/3$. When $D = -4n$, this means that

\[ ac \le 4n/3. \tag{S6} \]

We use (S6) to test all values up to the bound, which proves the result. Note that when $D = -4n$, $b$ is even and $ac = (b/2)^2 + n$.

When $n = 1$, this means that $ac \le 4/3$, so $a = c = 1$ is forced and $b = 0$. Hence, the only reduced form of discriminant $-4$ is $x^2 + y^2$. If $n = 2$, then $ac \le 8/3$, so $c = 2$ and $a = 1$ is forced given that $ac = (b/2)^2 + 2 \ge 2$ since $b^2 - 4ac = -8$. Therefore, $b = 0$, and the only reduced form of discriminant $-8$ is $x^2 + 2y^2$. If $n = 3$, then $ac \le 4$. Again, since $ac = (b/2)^2 + 3$, $c \ge a$, and $\gcd(a, b, c) = 1$, then $c = 3$, $a = 1$, and $b = 0$ is forced. Thus $x^2 + 3y^2$ is the only primitive reduced form of discriminant $-12$. (There is one imprimitive form, namely $2x^2 + 2xy + 2y^2$, which we do not count.) If $n = 4$, then $ac \le 16/3 < 6$. With the caveats as above, we must have $c = 4$, $a = 1$, $b = 0$, so $x^2 + 4y^2$ is the only primitive reduced form of discriminant $-16$. (There is one imprimitive form, namely $2x^2 + 2y^2$, which we do not count.)

Lastly, if $n = 7$, then $ac \le 28/3 < 10$, and $(b/2)^2 + 7 = ac$, so the only possibility is $c = 7$, $a = 1$, and $b = 0$, so $x^2 + 7y^2$ is the only primitive reduced form of discriminant $-28$. (There is one imprimitive form, namely $2x^2 + 2xy + 4y^2$, which we do not count.)
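The case analysis above can be checked mechanically. The following is a minimal sketch, not the book's method: it assumes the usual reduction conditions $|b| \le a \le c$, with $b \ge 0$ whenever $|b| = a$ or $a = c$ (Definition 3.4 is not reproduced in this section, so that convention is an assumption), and the function name `reduced_forms` is hypothetical.

```python
from math import gcd, isqrt

def reduced_forms(D, primitive_only=True):
    """List reduced forms (a, b, c) with b*b - 4*a*c == D, for D < 0.

    Assumed convention: |b| <= a <= c, and b >= 0 if |b| == a or a == c.
    """
    forms = []
    a_max = isqrt(-D // 3)  # from a <= sqrt(-D/3), as in Exercise 3.11
    for a in range(1, a_max + 1):
        for b in range(-a, a + 1):
            num = b * b - D          # equals 4*a*c
            if num % (4 * a):
                continue
            c = num // (4 * a)
            if c < a:
                continue
            if b < 0 and (-b == a or a == c):
                continue             # not reduced under the assumed convention
            if primitive_only and gcd(gcd(a, b), c) != 1:
                continue
            forms.append((a, b, c))
    return forms

for n in (1, 2, 3, 4, 7):
    print(-4 * n, reduced_forms(-4 * n), reduced_forms(-4 * n, primitive_only=False))
```

With `primitive_only=False` the imprimitive forms mentioned in the solution show up as well, for example `(2, 2, 2)` for discriminant $-12$.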


Section 3.2

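3.15 If $f \sim -f$, then there exist $p, q, r, s \in \mathbb{Z}$ such that $ps - qr = 1$ and, in the case where $\Delta_F \equiv 0 \pmod 4$,

\[ x^2 - \frac{\Delta_F}{4} y^2 = -(px + qy)^2 + \frac{\Delta_F}{4}(rx + sy)^2. \]

By comparing the coefficients of $x^2$, we get

\[ p^2 - \frac{\Delta_F}{4} r^2 = -1, \]

so $p + r\sqrt{\Delta_F/4}$ is a unit of norm $-1$ in $O_F = \mathbb{Z}[\sqrt{\Delta_F/4}]$.

When $\Delta_F \equiv 1 \pmod 4$, then

\[ x^2 + xy + \frac{1 - \Delta_F}{4} y^2 = -(px + qy)^2 - (px + qy)(rx + sy) - \frac{1 - \Delta_F}{4}(rx + sy)^2. \]

By comparing the coefficients of $x^2$ we get that

\[ (2p + r)^2 - \Delta_F r^2 = -4, \]

so

\[ p + \frac{1 + \sqrt{\Delta_F}}{2}\, r \]

is a unit of norm $-1$ in $O_F = \mathbb{Z}[(1 + \sqrt{\Delta_F})/2]$.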

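3.17 Since we have that

\[ C^+_{O_F} = \frac{I_{\Delta_F}}{P^+_{\Delta_F}} \cong \frac{I_{\Delta_F}}{P_{\Delta_F}} \cdot \frac{P_{\Delta_F}}{P^+_{\Delta_F}}, \]

then, when $F$ is real, by Exercise 3.15, $C^+_{O_F} = C_{O_F}$ if and only if $O_F$ has a unit of norm $-1$. When $F$ is complex, then $P_{\Delta_F} = P^+_{\Delta_F}$ since all norms are positive, so $C^+_{O_F} = C_{O_F}$. This proves the assertion.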

Section 3.3

3.19 Using the multiplication formulas as suggested in the hint, g = a, a3 = 1, b3 = b,% = 1, and µ = * = 0, so

II # = (a)

„1,

b +&

#F

2

«' (a) ' (1),

so I # ' I"1 in COF .

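3.21 Set $\alpha = 1 + u$ if $u \ne -1$, and $\alpha = \sqrt{\Delta_F}$ if $u = -1$. If $u \ne -1$, then

\[ (1 + u')u = u + uu' = u + N_F(u) = u + 1. \]

Therefore,

\[ \frac{\alpha}{\alpha'} = \frac{u + 1}{u' + 1} = u. \]

If $u = -1$, then

\[ \frac{\alpha}{\alpha'} = \frac{\sqrt{\Delta_F}}{-\sqrt{\Delta_F}} = -1 = u, \]

as required.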


Section 3.4

3.23 By Theorem 3.2 on page 102, we know that $h(D) = 1$ for

\[ D \in \{-4, -8, -12, -16, -28\}, \]

and indeed these are the only ones of the form $D = -4n$ with $h(D) = 1$. We now look at the remainder of the form $D \equiv 1 \pmod 4$. By the argument in the solution of Exercise 3.13 on page 413, a form $ax^2 + bxy + cy^2$ of discriminant $D = b^2 - 4ac$ must satisfy $ac \le -D/3$ and must satisfy the inequalities in Definition 3.4 on page 100. For $D = -7$ this says $ac \le 7/3$ and the only values that satisfy these restrictions are $(a, b, c) = (1, 1, 2)$. For $D = -11$, $ac \le 11/3$ and the only values satisfying our criteria are $(a, b, c) = (1, 1, 3)$. For $D = -19$, $ac \le 19/3$, for which only $(a, b, c) = (1, 1, 5)$ works. Lastly, for $D = -43$, only $(a, b, c) = (1, 1, 11)$ fits the inequalities. This completes the solution.

3.25 By Corollary 3.8 on page 138, $p = x^2 + 14y^2$ if and only if $p \equiv z^2 \pmod{56}$ or $p \equiv z^2 + 14 \pmod{56}$ for some integer $z$, and this holds if and only if $p \equiv 1, 9, 15, 23, 25, 39 \pmod{56}$, where the values correspond to $z = 1, 3, 5$ in each case. Moreover, it is straightforward to check that $2x^2 + 7y^2$ represents the same congruence classes in $(\mathbb{Z}/56\mathbb{Z})^*$. Thus, they are in the same genus.
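The "straightforward check" can be done by brute force. The sketch below enumerates the unit residues modulo 56 taken by each diagonal form $ax^2 + cy^2$; the helper name `represented_units` is hypothetical and the approach is simple enumeration, not a technique from the text.

```python
from math import gcd

def represented_units(a, c, m=56):
    """Residues r in (Z/mZ)* of the form a*x^2 + c*y^2 (mod m)."""
    values = set()
    for x in range(m):
        for y in range(m):
            r = (a * x * x + c * y * y) % m
            if gcd(r, m) == 1:
                values.add(r)
    return sorted(values)

print(represented_units(1, 14))  # classes hit by x^2 + 14y^2
print(represented_units(2, 7))   # classes hit by 2x^2 + 7y^2
```

Both calls produce the same list of six classes, which is what places the two forms in the same genus.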

3.27 By Theorem 3.14 on page 142, the number of forms in each genus is $h_{\Delta_F}/2^{r-1}$. Thus, there is a single class of forms in each genus if and only if $h_{\Delta_F}/2^{r-1} = 1$.

3.29 Using the same argument as in the solution of Exercise 3.23, any reduced form $ax^2 + bxy + cy^2$ must satisfy

\[ ac \le -D/3 = 56/3 < 19. \]

Testing for this inequality together with the inequalities in Definition 3.4, the only solutions are for

\[ (a, b, c) \in \{(1, 0, 14), (2, 0, 7), (3, 2, 5), (3, -2, 5)\}. \]

Thus, $h(-56) = 4$.

3.31 Using the hint, we see that when $b^2 - 4ac = \Delta_F \equiv 0 \pmod 4$, then $b$ is even, so

\[ acx^2 + bxy + y^2 = (bx/2 + y)^2 - \frac{\Delta_F}{4} x^2, \]

since comparing the coefficients of $x^2$, we get $b^2/4 - \Delta_F/4 = ac$, comparing the coefficients of $xy$ we get $b = (b/2) \cdot 2$, and the coefficients of $y^2$ are both 1. When $\Delta_F \equiv 1 \pmod 4$, then $b$ is odd, so

\[ acx^2 + bxy + y^2 = \left(-\frac{b + 1}{2}x - y\right)^2 + \left(-\frac{b + 1}{2}x - y\right)x + \frac{1 - \Delta_F}{4} x^2, \]

since comparing the coefficients of $x^2$ we get

\[ \left(\frac{b + 1}{2}\right)^2 - \frac{b + 1}{2} + \frac{1 - \Delta_F}{4} = \frac{b^2 + 2b + 1 - 2b - 2 + 1 - b^2 + 4ac}{4} = ac, \]

and comparing the coefficients of $xy$ we get

\[ 2 \cdot \frac{b + 1}{2} - 1 = b, \]

and the coefficients of $y^2$ are both 1.


3.33 If (a) holds, then $C^2_{\Delta_F} = \{1\}$ by the hint, so every element in $C_{\Delta_F}$ has order 1 or 2. Thus, by Exercise 3.32, (b) holds. Conversely, if (b) holds, then by Exercise 3.32, every element in $C_{\Delta_F}$ has order 1 or 2, so the principal genus is $C^2_{\Delta_F} = \{1\}$, a single class. However, every genus has the same number of classes of forms, so we have the result.

3.35 They are $f = (a, b, c)$ for the values $(1, 0, 20)$, $(3, 2, 7)$, $(3, -2, 7)$, and $(4, 0, 5)$.

3.37 For each of the following values of $z$ and primes $p$ we have

\[ p \equiv z^2 + z - 57 \pmod{229}. \]

For $p = 643949$ we have $z = -803$ and $p = 803^2 - 803 - 57$. For $p = 17863$ we have $z = 113$ and $113^2 + 113 - 57 = 17863 - 22 \cdot 229$. For $p = 24733$ we have $z = 113$ and $113^2 + 113 - 57 = 24733 - 52 \cdot 229$.

Section 3.5

3.39 We have that $(-7/p) = (-1/p)(7/p) = 1$ if and only if $(-1/p) = (7/p) = -1$ or $(-1/p) = (7/p) = 1$. Thus, $(-7/p) = 1$ if and only if either $p \equiv -1 \pmod 4$ and $p \equiv 1, 2, 4 \pmod 7$, or else $p \equiv 1 \pmod 4$ and $p \equiv 1, 2, 4 \pmod 7$. In other words, $(-7/p) = 1$ if and only if either $p \equiv 11, 15, 23 \pmod{28}$ or $p \equiv 1, 9, 25 \pmod{28}$, which is to say if and only if $p \equiv 1, 9, 11, 15, 23, 25 \pmod{28}$.

3.41 Since $(-19/p) = (-1/p)(19/p) = 1$ if and only if $(-1/p) = (19/p) = -1$ or $(-1/p) = (19/p) = 1$, then $(-19/p) = 1$ if and only if either $p \equiv -1 \pmod 4$ and

\[ p \equiv 1, 4, 5, 6, 7, 9, 11, 16, 17 \pmod{19}, \]

or else $p \equiv 1 \pmod 4$ and

\[ p \equiv 1, 4, 5, 6, 7, 9, 11, 16, 17 \pmod{19}. \]

This means that $(-19/p) = 1$ if and only if either

\[ p \equiv 7, 11, 23, 35, 39, 43, 47, 55, 63 \pmod{76}, \]

or

\[ p \equiv 1, 5, 9, 17, 25, 45, 49, 61, 73 \pmod{76}, \]

namely if and only if

\[ p \equiv 1, 5, 7, 9, 11, 17, 23, 25, 35, 39, 43, 45, 47, 49, 55, 61, 63, 73 \pmod{76}. \]

By Example 3.10, Theorem 1.3 on page 6, and (3.6), we have that $h_{-19} = h_{\mathbb{Z}[(1+\sqrt{-19})/2]} = 1$. Thus, by Theorem 3.15, if $(\Delta_F/p) = (-19/p) = 1$, then $p = a^2 + ab + 5b^2$ for some integers $a, b$. Also $19 = 1^2 - 1 \cdot 2 + 5 \cdot 2^2$. Conversely, by Exercise 3.9 on page 104, if $p \ne 19$ and $p = a^2 + ab + 5b^2$, then $(-19/p) = 1$.
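The residue-class lists in Exercises 3.39 and 3.41 can be regenerated with a short script. The sketch below relies on the standard consequence of quadratic reciprocity that $(-q/p) = (p/q)$ for a prime $q \equiv 3 \pmod 4$ and an odd prime $p \ne q$, so the admissible classes modulo $4q$ are the odd residues that are squares modulo $q$. The function names are hypothetical, and `cross_check` simply compares against Euler's criterion on actual primes.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def classes_for(q):
    """Odd residues r mod 4q with (r/q) = 1, for a prime q = 3 (mod 4)."""
    return [r for r in range(1, 4 * q, 2) if legendre(r, q) == 1]

def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def cross_check(q, bound=2000):
    """Confirm (-q/p) = 1 exactly when p mod 4q lies in classes_for(q)."""
    good = set(classes_for(q))
    return all((legendre(-q, p) == 1) == (p % (4 * q) in good)
               for p in range(3, bound, 2) if is_prime(p) and p != q)

print(classes_for(7))
print(classes_for(19))
print(len(classes_for(67)), cross_check(67))
```

For $q = 67$ the same function yields 66 classes modulo 268, matching the count of the combined list in Exercise 3.43.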

3.43 By the same methodology as in Exercise 3.41, we get that $(-67/p) = 1$ if and only if either

\[
\begin{aligned}
p \equiv{} & 15, 19, 23, 35, 39, 47, 55, 59, 71, 83, 91, 103, 107, 123, 127, 131, 135, 143, 151, 155, \\
& 159, 163, 167, 171, 183, 199, 207, 211, 215, 223, 227, 255, 263 \pmod{268},
\end{aligned}
\tag{S7}
\]

or

\[
\begin{aligned}
p \equiv{} & 1, 9, 17, 21, 25, 29, 33, 37, 49, 65, 73, 77, 81, 89, 93, 121, 129, 149, 153, 157, 169, \\
& 173, 181, 189, 193, 205, 217, 225, 237, 241, 257, 261, 265 \pmod{268}.
\end{aligned}
\tag{S8}
\]

Lastly, (S7)–(S8) hold if and only if

\[
\begin{aligned}
p \equiv{} & 1, 9, 15, 17, 19, 21, 23, 25, 29, 33, 35, 37, 39, 47, 49, 55, 59, 65, 71, 73, 77, 81, 83, \\
& 89, 91, 93, 103, 107, 121, 123, 127, 129, 131, 135, 143, 149, 151, 153, 155, 157, 159, \\
& 163, 167, 169, 171, 173, 181, 183, 189, 193, 199, 205, 207, 211, 215, 217, 223, 225, \\
& 227, 237, 241, 255, 257, 261, 263, 265 \pmod{268}.
\end{aligned}
\]

Now the result is established exactly as in Exercise 3.41.

3.45 The following are all of the prime values or 1 for each discriminant.

\[
\begin{array}{lll}
\Delta_F & x^2 + x + (1 - \Delta_F)/4 & \text{absolute values for } x = 1, 2, \ldots, \lfloor(\sqrt{\Delta_F} - 1)/2\rfloor \\
17 & x^2 + x - 4 & 2 \\
21 & x^2 + x - 5 & 3 \\
29 & x^2 + x - 7 & 5, 1 \\
37 & x^2 + x - 9 & 7, 3 \\
53 & x^2 + x - 13 & 11, 7, 1 \\
77 & x^2 + x - 19 & 17, 13, 7 \\
101 & x^2 + x - 25 & 23, 19, 13, 5 \\
173 & x^2 + x - 43 & 41, 37, 31, 23, 13, 1 \\
197 & x^2 + x - 49 & 47, 43, 37, 29, 19, 7 \\
293 & x^2 + x - 73 & 71, 67, 61, 53, 43, 31, 17, 1 \\
437 & x^2 + x - 109 & 107, 103, 97, 89, 79, 67, 53, 37, 19 \\
677 & x^2 + x - 169 & 167, 163, 157, 149, 139, 127, 113, 97, 79, 59, 37, 13
\end{array}
\]
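The table entries can be recomputed directly. The sketch below evaluates $|x^2 + x + (1 - \Delta_F)/4|$ over the stated range of $x$ and confirms that every value is 1 or a prime; the names `table_row` and `DISCRIMINANTS` are hypothetical, and trial division is used only because the numbers are tiny.

```python
from math import isqrt

def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def table_row(delta):
    """Values |x^2 + x + (1 - delta)/4| for x = 1..floor((sqrt(delta) - 1)/2)."""
    c = (1 - delta) // 4              # exact, since delta = 1 (mod 4)
    top = (isqrt(delta) - 1) // 2     # floor((sqrt(delta) - 1)/2) for nonsquare delta
    return [abs(x * x + x + c) for x in range(1, top + 1)]

DISCRIMINANTS = [17, 21, 29, 37, 53, 77, 101, 173, 197, 293, 437, 677]
for d in DISCRIMINANTS:
    row = table_row(d)
    assert all(v == 1 or is_prime(v) for v in row)
    print(d, row)
```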

Section 3.6

3.47 If $(1, 0, -\Delta_F) \sim (1, 0, -1)$, then there is a transformation $x = rX + sY$ and $y = tX + uY$ such that

\[ p \nmid (ru - st) \tag{S9} \]

and

\[ (rX + sY)^2 - \Delta_F (tX + uY)^2 \equiv X^2 - Y^2 \pmod p. \]

It follows that

\[ r^2 - t^2\Delta_F \equiv 1 \pmod p, \tag{S10} \]

\[ s^2 - u^2\Delta_F \equiv -1 \pmod p, \tag{S11} \]

and

\[ rs \equiv tu\Delta_F \pmod p. \tag{S12} \]

Multiplying (S10) by $u^2$, we get

\[ r^2u^2 - t^2u^2\Delta_F \equiv u^2 \pmod p. \tag{S13} \]

Multiplying (S11) by $t^2$, we get

\[ t^2s^2 - t^2u^2\Delta_F \equiv -t^2 \pmod p. \tag{S14} \]

Now if $p \nmid ru$ and $p \nmid ts$, then we may multiply (S13) by $(ru)^{-1}$ modulo $p$ and by employing (S12), we get

\[ ru - ts \equiv ur^{-1} \pmod p. \tag{S15} \]

Similarly, multiplying (S14) by $-(ts)^{-1}$, and using (S12),

\[ ru - ts \equiv ts^{-1} \pmod p. \tag{S16} \]

From (S15)–(S16), we get

\[ ur^{-1} \equiv ts^{-1} \pmod p, \]

which implies that

\[ tr \equiv us \pmod p. \tag{S17} \]

Multiplying (S12) by $tu$ and employing (S17), we get

\[ t^2u^2\Delta_F \equiv trus \equiv (us)^2 \pmod p, \]

contradicting that $\Delta_F$ is a quadratic nonresidue modulo $p$. Hence, either $p \mid (ru)$ or $p \mid (ts)$, but not both due to (S9). If $p \mid (ur)$ and $p \nmid (ts)$, then either $p \mid u$ or $p \mid r$. If $p \mid u$, then $p \mid (tr)$ by (S17). Since $p \nmid t$, then $p \mid r$. Thus by (S10),

\[ t^2\Delta_F \equiv -1 \pmod p, \]

which implies that $p \equiv 3 \pmod 4$ since $\Delta_F$ is a quadratic nonresidue modulo $p$. However, by (S11), $s^2 \equiv -1 \pmod p$, contradicting that $p \equiv 3 \pmod 4$. We have shown that $p \nmid u$. If $p \mid r$, then $p \mid (us)$ as above, but $p \nmid s$. We have shown that $p$ cannot divide $ur$. Thus, $p \mid (ts)$, so $p \mid t$ or $p \mid s$. If $p \mid t$, then by (S17), $p$ must divide $s$ since it cannot divide $u$. Thus, by (S11), $u^2\Delta_F \equiv 1 \pmod p$, contradicting that $\Delta_F$ is a quadratic nonresidue modulo $p$. This completes the proof that $(1, 0, -\Delta_F) \not\sim (1, 0, -1)$.

3.49 If $(0, 1, 0) \sim (1, 1, 1) \pmod 2$, then there is a transformation $x = rX + sY$ and $y = tX + uY$ with $ru - st$ odd, such that

\[ (rX + sY)(tX + uY) \equiv X^2 + XY + Y^2 \pmod 2. \]

This implies that

\[ rt = 1, \]

\[ ru + st = 1, \]

and

\[ su = 1. \]

However, the first and last equations imply that $r = t = 1$ or $r = t = -1$, and $s = u = 1$ or $s = u = -1$, and these do not solve the middle equation.
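Since only residues modulo 2 matter here, the nonexistence claim can also be confirmed by exhausting all sixteen candidate matrices. This is a small sanity check under that reading of the congruence, with hypothetical helper names, not a replacement for the argument above.

```python
from itertools import product

def transformed(r, s, t, u):
    """Coefficients mod 2 of (rX + sY)(tX + uY) = rt X^2 + (ru + st) XY + su Y^2."""
    return (r * t % 2, (r * u + s * t) % 2, s * u % 2)

# Keep only transformations with odd determinant that send xy to X^2 + XY + Y^2.
solutions = [m for m in product((0, 1), repeat=4)
             if (m[0] * m[3] - m[1] * m[2]) % 2 == 1
             and transformed(*m) == (1, 1, 1)]
print(solutions)
```

The list comes back empty: matching the $X^2$ and $Y^2$ coefficients forces all four entries to be odd, which makes the $XY$ coefficient even.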


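3.51 The existence of integers $n_j$ with $\gcd(n_j, \Delta_{F_j}) = 1$ for $j = 1, 2$ is guaranteed by Lemma 3.1. Since $\gcd(\Delta_{F_j}, n_j) = 1$ and $p \mid \Delta_{F_j}$ for $j = 1, 2$, then $\gcd(n_j, p) = 1$. Also, there are integers $x_j, y_j$ such that $n_j = a_jx_j^2 + b_jx_jy_j + c_jy_j^2$. Therefore,

\[ \left(\frac{n_j}{p}\right) = \left(\frac{a_jx_j^2 + b_jx_jy_j + c_jy_j^2}{p}\right). \tag{S18} \]

However, since

\[ 4a_j(a_jx_j^2 + b_jx_jy_j + c_jy_j^2) \equiv (2a_jx_j + b_jy_j)^2 \pmod p, \]

given that

\[ 4a_jc_j \equiv b_j^2 \pmod p, \]

because $p \mid \Delta_{F_j}$ for $j = 1, 2$, this implies that

\[ \left(\frac{4a_j}{p}\right)\left(\frac{a_jx_j^2 + b_jx_jy_j + c_jy_j^2}{p}\right) = 1, \]

so by (S18),

\[ \left(\frac{n_j}{p}\right) = \left(\frac{a_j}{p}\right) \quad \text{for } j = 1, 2. \tag{S19} \]

Suppose that $(a_1, b_1, c_1) \sim (a_2, b_2, c_2) \pmod p$. Then

\[ \left(\frac{n_1}{p}\right) = \left(\frac{a_1}{p}\right) = \left(\frac{a_1x_1^2 + b_1x_1y_1 + c_1y_1^2}{p}\right) = \left(\frac{a_2x_2^2 + b_2x_2y_2 + c_2y_2^2}{p}\right) = \left(\frac{a_2}{p}\right) = \left(\frac{n_2}{p}\right). \]

Conversely, if

\[ \left(\frac{n_1}{p}\right) = \left(\frac{n_2}{p}\right), \]

then by (S18)–(S19),

\[ \left(\frac{a_1}{p}\right) = \left(\frac{a_2}{p}\right). \]

Hence, there exists a $z \in \mathbb{Z}$ such that $a_1 \equiv z^2a_2 \pmod p$. This implies

\[ (a_1, b_1, c_1) \sim (n_1, 0, 0) \sim (a_1, 0, 0) \sim (a_2, 0, 0) \sim (n_2, 0, 0) \sim (a_2, b_2, c_2) \pmod p. \]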

Section 4.1

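4.1 Since we know from the hint that

\[ \alpha - \frac{A_j}{B_j} = \frac{(-1)^j}{B_j(\alpha_{j+1}B_j + B_{j-1})}, \]

then

\[ \left|\alpha - \frac{A_j}{B_j}\right| = \left|\frac{1}{B_j((q_{j+1} + 1/\alpha_{j+2})B_j + B_{j-1})}\right| \le \frac{1}{q_{j+1}B_j^2}. \]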
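As a numerical illustration of this bound, the sketch below takes $\alpha = \sqrt{2} = [1; 2, 2, 2, \ldots]$ (an example chosen here, not one from the exercise) and checks the inequality for the first few convergents. It assumes the standard recurrences $A_j = q_jA_{j-1} + A_{j-2}$ and $B_j = q_jB_{j-1} + B_{j-2}$ with $A_{-1} = 1$, $B_{-1} = 0$, and uses floating point, so it is only meaningful for small $j$.

```python
from math import sqrt

def sqrt2_convergents(count):
    """Triples (A_j, B_j, q_{j+1}) for sqrt(2) = [1; 2, 2, 2, ...], j = 0..count-1."""
    q = [1] + [2] * count          # partial quotients q_0, q_1, ...
    a_prev, b_prev = 1, 0          # A_{-1}, B_{-1}
    a, b = q[0], 1                 # A_0, B_0
    out = [(a, b, q[1])]
    for j in range(1, count):
        a, a_prev = q[j] * a + a_prev, a
        b, b_prev = q[j] * b + b_prev, b
        out.append((a, b, q[j + 1]))
    return out

def bound_holds(count=10):
    """Check |alpha - A_j/B_j| <= 1/(q_{j+1} B_j^2) for the first `count` convergents."""
    alpha = sqrt(2)
    return all(abs(alpha - a / b) <= 1 / (q_next * b * b)
               for a, b, q_next in sqrt2_convergents(count))

print(sqrt2_convergents(4), bound_holds())
```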


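4.3 For the first part, with $j = 1, 2$, let $d_j \ge 0$ with $f_j(x) = \sum_{i=0}^{d_j} a_i^{(j)} x^i$. Thus,

\[
\begin{aligned}
\gcd(f_1(x)f_2(x)) &= \gcd\left(\sum_{i=0}^{d_1} a_i^{(1)} x^i \sum_{k=0}^{d_2} a_k^{(2)} x^k\right) = \gcd\left(\sum_{i=0}^{d_1} a_i^{(1)} \sum_{k=0}^{d_2} a_k^{(2)} x^{i+k}\right) \\
&= \gcd\{a_i^{(1)} a_k^{(2)}\}_{0 \le i \le d_1,\ 0 \le k \le d_2} \\
&= \gcd\left(\sum_{j=0}^{d_1} a_j^{(1)} x^j\right) \gcd\left(\sum_{k=0}^{d_2} a_k^{(2)} x^k\right) = \gcd(f_1(x)) \gcd(f_2(x)).
\end{aligned}
\]

Now, if $f(x) \in \mathbb{Z}[x]$, then we may assume, without loss of generality, that $\gcd(f(x)) = 1$ since we may otherwise just look at $F(x) = f(x)/\gcd(f(x)) \in \mathbb{Z}[x]$. If $f(x) = g(x)h(x)$ where $g(x), h(x) \in \mathbb{Q}[x]$, then we may find rational numbers $\ell_g$ and $\ell_h$ such that $\ell_g g(x) \in \mathbb{Z}[x]$, $\ell_h h(x) \in \mathbb{Z}[x]$, and $\gcd(\ell_g g(x)) = 1 = \gcd(\ell_h h(x))$, so from the above

\[ \gcd(\ell_g \ell_h f(x)) = \gcd(\ell_g g(x)) \gcd(\ell_h h(x)) = 1. \]

Hence, $\ell_g \ell_h = \pm 1$. Set

\[ H(x) = \operatorname{sign}(\ell_h)\,\ell_h h(x) \quad \text{and} \quad G(x) = \operatorname{sign}(\ell_g)\,\ell_g g(x), \]

where $\operatorname{sign}(\ell_g) = 1$ if $\ell_g > 0$, and $\operatorname{sign}(\ell_g) = -1$ if $\ell_g < 0$, and similarly for $\operatorname{sign}(\ell_h)$. Since $|\ell_g \ell_h| = 1$, we have that

\[ f(x) = G(x)H(x), \]

as required.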

4.5 Since the base-a expansion of the number is (.100100001 . . .)a, which is infinitelynonrepeating, then we know that it is irrational.

Section 4.2

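4.7 An easy check shows that

\[ 0 < (-1)^{n+1}\beta_n = \sum_{j=1}^{\infty} \frac{(-1)^{j+1}}{(n + j)!} < \frac{1}{(n + 1)!}. \]

Thus,

\[ 0 < n!\,\beta_n(-1)^{n+1} < \frac{1}{n + 1} < 1, \]

which implies that

\[ n!\,e^{-1} = n!\,\alpha_n + n!\,\beta_n(-1)^{n+1} \notin \mathbb{Z}. \]

We have shown that $e^{-1} \notin \mathbb{Q}$ since $n!\,\alpha_n \in \mathbb{Z}$, so $e \notin \mathbb{Q}$.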


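4.9 Since $\pi = a/b \in \mathbb{Q}$ and

\[ f(x) = f^{(0)}(x) = \frac{x^n(a - bx)^n}{n!}, \]

then we may set

\[ G(x) = \sum_{j=0}^{n} (-1)^j f^{(2j)}(x). \]

Since $f^{(2j)}(0)$ and $f^{(2j)}(\pi)$ are integers for all $j = 0, 1, \ldots, n$, then $G(0), G(\pi) \in \mathbb{Z}$. Also, since

\[
\begin{aligned}
\frac{d}{dx}\bigl(G'(x)\sin(x) - G(x)\cos(x)\bigr) &= \bigl(G''(x) + G(x)\bigr)\sin(x) \\
&= \left(f^{(0)}(x) + \sum_{j=0}^{n-1} (-1)^j f^{(2(j+1))}(x) + \sum_{k=0}^{n-1} (-1)^{k+1} f^{(2(k+1))}(x)\right)\sin(x) = f(x)\sin(x),
\end{aligned}
\]

then

\[ \int_0^{\pi} f(x)\sin(x)\,dx = G(\pi) + G(0) \in \mathbb{Z}. \tag{S20} \]

However, by selecting $n$ large enough, we must have

\[ 0 < f(x)\sin(x) < \frac{\pi^n a^n}{n!} < \frac{1}{\pi}, \]

so

\[ 0 < \int_0^{\pi} f(x)\sin(x)\,dx < 1, \]

contradicting (S20).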

Section 4.3

4.11 Since every subgroup of a free abelian group of rank $n$ is a free abelian group of rank at most $n$, set the rank of $H$ to be $m \le n$. Then $G/H$ has $n - m$ infinite cyclic factors. Hence, $G/H$ is finite if and only if $m = n$. If $L$ is a lattice with free abelian subgroup $H$ of rank $n$, then $H$ is a full lattice in $\mathbb{R}^n$.

Section 5.1

5.1 Since $f(x) = x/(e^x - 1) + x/2$ is an even function, namely, $f(x) = f(-x)$, then $B_n = (-1)^n B_n$ for any $n > 1$, so for odd $n$, $B_n = 0$.

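5.3 According to the hint, suppose that $\sum_{j=1}^{\infty} (1/j) = d \in \mathbb{R}$. Then there is an $N \in \mathbb{N}$ such that $N \le d < N + 1$. Also, note that

\[ \sum_{j=1}^{\infty} \frac{1}{j} = 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \cdots > 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \cdots, \]

so each block has a sum of at least $1/2$. Let $M \in \mathbb{N}$ be chosen such that the number $M$ of such blocks satisfies $M \ge 2N$. Then

\[ d = \sum_{j=1}^{\infty} \frac{1}{j} > 1 + \frac{M}{2} \ge N + 1, \]

a contradiction.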


5.5 Since $F(s, x) - F(s, x - 1) = se^{s(x-1)}$, then

\[ \frac{B_{n+1}(x) - B_{n+1}(x - 1)}{n + 1} = (x - 1)^n. \tag{S21} \]

Adding (S21) for $x = 1, 2, \ldots, k$, we get the result.

5.7 Since we know from Exercise 5.6 that $B'_{n+1}(x) = (n + 1)B_n(x)$, then

\[ \int_a^b B_n(t)\,dt = \frac{1}{n + 1}\int_a^b B'_{n+1}(t)\,dt = \frac{1}{n + 1}\bigl(B_{n+1}(b) - B_{n+1}(a)\bigr). \]

Section 5.2

5.9 By Theorem 5.9 on page 214 and the hint,

\[ \frac{\sum_{j=1}^{n} \phi(j)}{n(n + 1)/2} \sim \frac{6n^2}{\pi^2 n^2} = \frac{6}{\pi^2}. \]

5.11 By the definition of the Möbius function and Theorem 5.9, we have simply a restatement, namely

\[ \sum_{n \le x} |\mu(n)| = \frac{6x}{\pi^2} + O(\sqrt{x}), \]

from which it follows that the mean value of $\mu^2$ is $6/\pi^2$.

Section 5.3

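5.13 Suppose that $|f(p)| \ge 1$ for some prime $p$. Then

\[ \sum_{n=1}^{\infty} |f(n)| \ge \sum_{j=0}^{\infty} |f(p^j)| = \sum_{j=0}^{\infty} |f(p)|^j \]

and the latter series clearly diverges. This shows that $|f(p)| < 1$ for each prime $p$, so

\[
\begin{aligned}
\sum_{j=0}^{\infty} f(p)^j - \sum_{j=0}^{\infty} f(p)^{j+1} &= \lim_{n \to \infty}\left(\sum_{j=0}^{n} f(p)^j - \sum_{j=0}^{n} f(p)^{j+1}\right) \\
&= \lim_{n \to \infty}\left(\sum_{j=0}^{n} \bigl(f(p)^j - f(p)^{j+1}\bigr)\right) = \lim_{n \to \infty}\bigl(1 - f(p)^{n+1}\bigr) = 1,
\end{aligned}
\]

so

\[ \sum_{j=0}^{\infty} f(p^j) = \sum_{j=0}^{\infty} f(p)^j = \frac{1}{1 - f(p)}. \]

The result now follows.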


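5.15 Let $n = k + 1$ and $s = -k$ in Theorem 5.10. Thus,

\[
\begin{aligned}
\zeta(-k) &= \frac{1}{-k - 1} + \frac{1}{2} + \sum_{j=2}^{k+1} \frac{B_j}{j!}(-k)(-k + 1)\cdots(-k + j - 2) \\
&= \frac{-1}{k + 1}\left(1 + \left(\frac{-1}{2}\right)(k + 1) + \sum_{j=2}^{k+1} \binom{k + 1}{j} B_j\right) \\
&= \frac{-1}{k + 1}\sum_{j=0}^{k+1} \binom{k + 1}{j} B_j = -\frac{B_{k+1}}{k + 1},
\end{aligned}
\]

if $k$ is odd and equals 0 if $k$ is even, by Exercise 5.5.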
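The closing identity can be checked with exact rational arithmetic. The sketch below assumes the convention $B_1 = -1/2$ (consistent with Exercise 5.1, where $x/(e^x - 1) + x/2$ is even) and generates the Bernoulli numbers from the standard recurrence $\sum_{j=0}^{n} \binom{n+1}{j} B_j = 0$ for $n \ge 1$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] with B_1 = -1/2 (assumed convention)."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        s = sum(comb(n + 1, j) * B[j] for j in range(n))
        B.append(-s / (n + 1))
    return B

def zeta_at_negative(k, B):
    """zeta(-k) = -B_{k+1}/(k+1), the formula derived in Exercise 5.15."""
    return -B[k + 1] / (k + 1)

B = bernoulli_numbers(12)
# The binomial identity used in the last step: sum_{j=0}^{n} C(n, j) B_j = B_n for n >= 2.
for n in range(2, 13):
    assert sum(comb(n, j) * B[j] for j in range(n + 1)) == B[n]
print([str(zeta_at_negative(k, B)) for k in range(1, 6)])
```

The odd-index values $B_3, B_5, \ldots$ come out as zero, which is why $\zeta(-k)$ vanishes for even $k$.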

5.17 This is an immediate consequence of the answer provided in the solution ofExercise 5.15.

5.19 This is immediate from Theorem 5.10 on page 219.

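5.21 The integral

\[ \int_0^1 B_3(t - \lfloor t \rfloor)\,t^{-s-3}\,dt \]

is convergent for $\Re(s) < -1$. Using Exercise 5.6 on page 206 and integration by parts (three times) we get

\[
\begin{aligned}
\int_0^1 B_3(t - \lfloor t \rfloor)\,t^{-s-3}\,dt
&= -\frac{1}{s + 2}\int_0^1 B_3(t - \lfloor t \rfloor)\,dt^{-s-2} \\
&= -\frac{1}{s + 2} B_3(t - \lfloor t \rfloor)\,t^{-s-2}\Big|_0^1 + \frac{1}{s + 2}\int_0^1 t^{-s-2}\,dB_3(t - \lfloor t \rfloor) \\
&= \frac{3}{s + 2}\int_0^1 t^{-s-2} B_2(t - \lfloor t \rfloor)\,dt \\
&= -\frac{3}{(s + 1)(s + 2)}\int_0^1 B_2(t - \lfloor t \rfloor)\,dt^{-s-1} \\
&= -\frac{3}{(s + 1)(s + 2)} B_2(t - \lfloor t \rfloor)\,t^{-s-1}\Big|_0^1 + \frac{3}{(s + 1)(s + 2)}\int_0^1 t^{-s-1}\,dB_2(t - \lfloor t \rfloor) \\
&= -\frac{1}{2(s + 1)(s + 2)} + \frac{6}{(s + 1)(s + 2)}\int_0^1 t^{-s-1} B_1(t - \lfloor t \rfloor)\,dt \\
&= -\frac{1}{2(s + 1)(s + 2)} - \frac{6}{s(s + 1)(s + 2)}\int_0^1 B_1(t - \lfloor t \rfloor)\,dt^{-s} \\
&= -\frac{1}{2(s + 1)(s + 2)} - \frac{6}{s(s + 1)(s + 2)}\bigl(t^{-s}B_1(t - \lfloor t \rfloor)\bigr)\Big|_0^1 + \frac{6}{s(s + 1)(s + 2)}\int_0^1 t^{-s}\,dB_1(t - \lfloor t \rfloor) \\
&= -\frac{1}{2(s + 1)(s + 2)} - \frac{3}{s(s + 1)(s + 2)} + \frac{6}{s(s + 1)(s + 2)}\int_0^1 t^{-s} B_0(t - \lfloor t \rfloor)\,dt \\
&= -\frac{1}{2(s + 1)(s + 2)} - \frac{3}{s(s + 1)(s + 2)} + \frac{6}{s(s + 1)(s + 2)}\int_0^1 t^{-s}\,dt \\
&= -\frac{1}{2(s + 1)(s + 2)} - \frac{3}{s(s + 1)(s + 2)} - \frac{6}{(s - 1)s(s + 1)(s + 2)}\,t^{-s+1}\Big|_0^1 \\
&= -\frac{1}{2(s + 1)(s + 2)} - \frac{3}{s(s + 1)(s + 2)} - \frac{6}{(s - 1)s(s + 1)(s + 2)} \\
&= -\frac{s + 3}{2s(s - 1)(s + 1)}.
\end{aligned}
\]

Hence,

\[ \frac{s(s + 1)(s + 2)}{6}\int_0^1 B_3(t - \lfloor t \rfloor)\,t^{-s-3}\,dt = -\frac{(s + 2)(s + 3)}{12(s - 1)} = -\frac{s}{12} - \frac{1}{2} - \frac{1}{s - 1} = -\frac{B_2 s}{2} - \frac{1}{2} - \frac{1}{s - 1}, \]

as required.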

5.23 By Exercise 5.22, the result is immediate: letting $x - 1 = -s$, we get

\[ \Gamma(1 - s) = \Gamma(x) = (x - 1)\Gamma(x - 1) = (-s)\Gamma(-s). \]

5.25 By the hint,

\[ \Gamma(n) = (n - 1)\Gamma(n - 1) = (n - 1)(n - 2)\Gamma(n - 2) = \cdots = (n - 1)!. \]

Section 6.1

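6.1 $x \equiv 20 \pmod{7^2}$.

6.3 $x \equiv 239, 1958, 2196 \pmod{13^3}$.

6.5 $5 + 3 \cdot 7 + 3 \cdot 7^2 + 3 \cdot 7^3 + 3 \cdot 7^4 + 3 \cdot 7^5 + 3 \cdot 7^6 + 3 \cdot 7^7 + 3 \cdot 7^8 + \cdots$.

6.7 $3 + 2 \cdot 5 + 2 \cdot 5^3 + 2 \cdot 5^4 + 4 \cdot 5^5 + 5^6 + 3 \cdot 5^7 + 5^8 + 5^9 + 2 \cdot 5^{11} + 2 \cdot 5^{12} + 2 \cdot 5^{15} + 5^{16} + 3 \cdot 5^{18} + 3 \cdot 5^{19} + 5^{21} + 5^{24} + 5^{25} + 4 \cdot 5^{28} + 3 \cdot 5^{29} + \cdots$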
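Expansions of this kind can be produced digit by digit for any rational number whose denominator is prime to $p$. The sketch below is an illustration only: the exercise statements are not reproduced in this section, so the observation that the digits $5, 3, 3, 3, \ldots$ of Exercise 6.5 agree with the 7-adic expansion of $3/2$ is an inference from the displayed answer, and the function name `padic_digits` is hypothetical. It uses Python's built-in modular inverse `pow(b, -1, p)`, which needs Python 3.8 or later.

```python
from fractions import Fraction

def padic_digits(x, p, count):
    """First `count` digits c_0, c_1, ... of the p-adic integer x = a/b (p not dividing b)."""
    x = Fraction(x)
    digits = []
    for _ in range(count):
        a, b = x.numerator, x.denominator
        c = (a * pow(b, -1, p)) % p   # c = x mod p
        digits.append(c)
        x = (x - c) / p               # shift to the next digit
    return digits

print(padic_digits(Fraction(3, 2), 7, 9))
```

The same geometric-series reasoning gives $5 + 3 \cdot 7/(1 - 7) = 3/2$ in the 7-adic sense.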

Section 6.2

6.9 First, parts (a)–(b) of Definition 6.2 follow immediately from Definition 6.3. Part (c) is, for any $x, y \in \mathbb{Q}$, that

\[ |x + y|_p = p^{-\nu_p(x + y)} \le p^{-\nu_p(x)} + p^{-\nu_p(y)}, \]

since

\[ \nu_p(x + y) \ge \min\{\nu_p(x), \nu_p(y)\}, \]

from which the non-Archimedean property follows.

6.11 By definition, for any $\varepsilon > 0$, there is an integer $n = n(\varepsilon)$ such that

\[ |q_j - q_k|_p < \varepsilon \quad \text{for all } j, k > n. \]

Thus,

\[ |q_j|_p - |q_k|_p \le |q_j - q_k|_p < \varepsilon \quad \text{for all } j, k > n. \]

By taking $k = n + 1$ and adding $|q_{n+1}|_p$ to both sides,

\[ |q_j|_p < |q_{n+1}|_p + \varepsilon. \]

Hence, for all $j \in \mathbb{N}$,

\[ |q_j|_p \le \max\{|q_1|_p, \ldots, |q_n|_p, |q_{n+1}|_p + \varepsilon\}. \]

By setting $M = \max\{|q_1|_p, \ldots, |q_n|_p, |q_{n+1}|_p + \varepsilon\}$, we have our result.


6.13 The reflexive property is clear. Also, since

($)

limj*)

(qj $ q#j) = 0 if and only if($)

limj*)

(q#j $ qj) = 0,

then {qj} = {q#j} implies {q#j} = {qj}, which is symmetry. Lastly, if {qj} = {q#j}and {q#j} = {q##j }, then

($)

limj*)

(qj $ q#j) = 0 and($)

limj*)

(q#j $ q##j ) = 0,

so by symmetry,($)

limj*)

(q##j $ q#j) = 0.

Therefore,($)

limj*)

(qj $ q#j $ (q##j $ q#j)) =($)

limj*)

(qj $ q##j ) = 0,

so {qj} = {q##j }, which establishes transitivity. Hence, Cauchy sequences arepartitioned into classes as an equivalence relation.

6.15 If $\lim_{j \to \infty} q_j = L \in \mathbb{R}$, then given $\varepsilon > 0$, select $N \in \mathbb{N}$ such that

\[ |q_j - L| < \varepsilon \quad \text{for } j > N. \]

Then, if $j, k > N$, we have

\[ |q_j - q_k| = |(q_j - L) - (q_k - L)| \le |q_j - L| + |q_k - L| < 2\varepsilon, \]

from which it follows that the sequence is Cauchy.

6.17 Let $x = 5/4$ and $y = 5$. Then $x + y = 25/4$, so

\[ |x + y|_5 = 5^{-2} < \max\{|x|_5, |y|_5\} = 5^{-1}. \]
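The numbers in this example are easy to reproduce. The following sketch computes the $p$-adic valuation and absolute value of a rational number with exact fractions; the names `vp` and `abs_p` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a nonzero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("valuation of 0 is +infinity")
    a, b = x.numerator, x.denominator
    v = 0
    while a % p == 0:
        a //= p
        v += 1
    while b % p == 0:
        b //= p
        v -= 1
    return v

def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-v_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-vp(x, p))

x, y = Fraction(5, 4), Fraction(5)
print(abs_p(x + y, 5), max(abs_p(x, 5), abs_p(y, 5)))
```

Here the strict inequality occurs because $|x|_5 = |y|_5$; when the two absolute values differ, the maximum is attained.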

6.19 Given three points x, y, z of a triangle, we have that

|x$ y|p + |y $ z|p = |x$ z|p,

so if |x$ y|p #= |y $ z|p, then by Exercise 6.16,

|x$ z|p = |(x$ y) + (y $ z)|p = max{|x$ y|p, |y $ z|p},

so two of the sides must be equal.

Section 6.3

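6.21 Let $k + \ell = m > j$. Then by (6.8) on page 233,

\[
\begin{aligned}
|q_m - q_j|_p &= |q_{k+\ell} - q_{k+\ell-1} + q_{k+\ell-1} - q_{k+\ell-2} \pm \cdots \pm q_{j+1} - q_j|_p \\
&\le \max\{|q_{k+\ell} - q_{k+\ell-1}|_p, \ldots, |q_{j+1} - q_j|_p\},
\end{aligned}
\]

which yields the result.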


Section 6.4

6.23 By Theorem 6.4 on page 244,

! = a/b = p"%

mX

j=0

cjpj

!+

)X

j=0

pm+1+jnC

!,

where

C =m+nX

j=m+1

cjpj"m"1.

Thus, |!|p + 0 if and only if + = 0, namely p ! b.

6.25 If we let ! ! Op, then the polynomial f(x) = !x$ 1 has a root if and only if !is a unit in Op and !"1 is its other root. Thus, f(x) ( 0(mod P) is solvable ifand only if ! is a unit, but

f #(x) = ! #( 0 (mod P)

since no element of P can be invertible. By Lemma there exists a p-adic integer!"1 such that f(!"1) = 0 so !!"1 = 1. This shows that a p-adic integer isinvertible in Op if and only if ! ! Op/P. By Theorem 2.7 on page 68, P is amaximal ideal.

6.27 If $|\alpha|_p = p^{-n}$, then $u = \alpha p^{-n} \in U_p$, so $\alpha = up^n$. If

\[ \alpha = up^n = vp^m, \]

where $u, v \in U_p$, then

\[ |\alpha|_p = p^{-n} = p^{-m}, \]

so $m = n$ and $u = v$.

Section 7.1

7.1 Since n #( 0, 1(mod D), then there is a prime p dividing D such that

n #( 1 (mod pa) where pa˛˛

D.

Also, as in the proof of Theorem 7.1 on page 249, there is a character /pawith

/pa(n) #= 1 which is possible since there exist -(pa) distinct characters modulo

pa, and-(pa) = pa"1(p$ 1) > 1,

since D = pa = 2 is not possible given the existence of n #( 0, 1(mod D). (Forinstance, if p is odd choose a primitive root g modulo pa and the character/pa

(g) = g. Since n ( gi #( 1(mod pa) for some i with 1 * i < -(pa), then/(n) = /(gi) = gi #= 1.) If D #= pa, then select

/D/pa

= /D/pa

0 ,

then the product of these characters is a character / for which /(n) #= 1.


Section 7.2

7.3 By using Exercises 5.12–5.13, with f(n) = /(n)n"s, the result is an immediateconsequence in view of the absolute convergence given by Exercise 7.2.

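7.5 We have for $0 < \Re(s) < 1$,

\[ \Gamma(s)\Gamma(1 - s) = \int_0^{\infty} e^{-t}t^{s-1}\,dt \int_0^{\infty} e^{-x}x^{-s}\,dx, \]

and by letting $t = xu$, we get that the latter equals

\[ \int_0^{\infty}\int_0^{\infty} e^{-xu}(xu)^{s-1}x\,du\; e^{-x}x^{-s}\,dx = \int_0^{\infty}\left(\int_0^{\infty} e^{-xu-x}\,dx\right)u^{s-1}\,du, \]

and now by letting $y = x(u + 1)$, the latter equals

\[ \int_0^{\infty}\left(\int_0^{\infty} \frac{e^{-y}}{u + 1}\,dy\right)u^{s-1}\,du = \int_0^{\infty} \bigl[-e^{-y}\bigr]_0^{\infty}\left(\frac{u^{s-1}}{u + 1}\right)du = \int_0^{\infty} \frac{u^{s-1}}{u + 1}\,du, \]

and by the hint, this gives us the result. The last equality in the exercise follows from the formula from elementary calculus that

\[ \sin(2\theta) = 2\sin\theta\cos\theta. \]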

7.7 By Theorem 7.3,

L(s, /) =Y

p=prime

(1$ /(p)p"s)"1.

By taking logs we get

loge L(s, /) = $X

p=prime

loge(1$ /(p)p"s) =X

p=prime

)X

m=1

/(pm)mpms

.

Since the latter is absolutely convergent for /(s) > 1, then we may interchangethe order of summation to get that it equals

)X

m=1

X

p=prime

/(pm)mpms

=X

p=prime

/(p)ps

+ R(s, /),

where

|R(s, /)| =

˛˛˛

)X

m=2

X

p=prime

/(pm)mpms

˛˛˛ *

X

p=prime

)X

m=2

1

mpm+(s)* 1

2

X

p=prime

)X

m=2

1

pm+(s).

However, since we have the known geometric series

)X

m=2

1

pm+(s)=

1

p+(s)(p+(s) $ 1)* 2

p2+(s),

then it follows that

|R(s, /)| *X

p=prime

1

p2+(s).


Also, for /(s) > 1, we have

X

p=prime

1

p2+(s)<

X

p=prime

1p2

<)X

m=2

1m2

=,2

6$ 1 < 1,

where the last equality comes from Remark 5.9 on page 220. We have shownthat

loge L(s, /) =X

p=prime

/(p)ps

+ O(1). (S22)

Now if a ! Z with gcd(a, D) = 1, then by part (b) of Corollary 7.2, via (S22),

X

&%GDchar

/(a) loge L(s, /) =X

&%GDchar

X

p=prime

/(a)/(p)ps

+ O(-(D))

= -(D)X

p&a (mod D)

1ps

+ O (-(D)) . (S23)

But we also haveX

&%GDchar

/(a) loge L(s, /) = loge L(s, /0) +X

'#GDchar

'%='0

/(a) loge L(s, /), (S24)

so by equating (S23)–(S24), we get

loge L(s, /0) +X

'#GDchar

'%='0

/(a) loge L(s, /) = -(D)X

p&a (mod D)

1ps

+ O (-(D)) ,

as required.

7.9 Assuming that s > 1, let S(p) =P)

j=1 f(pj)p"js. Thus,

S(p) < Kp"s)X

j=0

p"js = Kp"s(1$ p"s)"1,

which implies that S(p) < 2Kp"s. For a fixed bound N ! N,X

p(N

S(p) < 2KX

p

p"s = B, (S25)

say. Since f is multiplicative, then

NX

n=1

f(n)n"s =NX

n=1

Y

p(N

f(pj)p"js <Y

p(N

)X

n=1

f(pj)p"js

=Y

p(N

S(p) <Y

p(N

(1 + S(p)) <Y

p(N

exp(S(p)) = exp

0

@X

p(N

S(p)

1

A ,

where the last inequality follows from the fact that for any x ! R+, 1+x < exp x.Therefore, from (S25) it follows that

NX

n=1

f(n)n"s < exp B


for all N . Since f is nonnegative, this shows thatP)

n=1 f(n)n"s converges. Thelast statement now follows immediately from Exercise 5.14.

Section 7.3

7.11 This follows from the definitions since the numerator is finite and the denomi-nator goes to 0.

7.13 Parts (a)–(b) are proved in the same way as given in Remark 7.1 on page 248.For part (c), we have

1 = /(1p) = /(a · a"1) = /(a"1)/(a),

which implies /(a"1) = /(a)"1. Lastly, /(a)"1 = /(a) follows from the factthat /(a) ! C and |/(a)| = |"j

p"1| = 1 by part (b).

7.15 That /& and /"1 are characters follows from the definition of the individualcharacters / and &. Therefore, if /, & ! G, the set of multiplicative characterson Fp, then /&"1 ! G, which makes G into a group.

Now since F!p is cyclic—see [68, Theorem A.6, p. 300], let g be a generatorof F!p. Thus, if a ! F!p, then a = gj for some j = 0, 1, 2, . . . , p $ 1. Therefore,/(a) = /(gj) = /(g)j , so the value of /(g) determines all other values. By part(b) of Exercise 7.13, /(g) is a (p $ 1)-st root of unity. Hence, the order of thecharacter group has order at most p$1. If we define for any j = 0, 1, 2, . . . , p$1,

!(gj) = "jp"1

for a primitive p$1-st root of unity "p"1, then ! is clearly a multiplicative char-acter on F!p. Suppose that !k = /0. Therefore, !k(g) = /0(g) = 1. However,

1 = !k(g) = !(g)k = "kp"1,

and since "p"1 is a primitive p$ 1-st root of unity, then (p$ 1)˛k. Moreover,

since!p"1(a) = !(ap"1) = !(1) = 1,

then !p"1 = /0. This shows that !j for j = 0, 1, 2, . . . , p $ 2 are distinct.However, the order of G is at most p$1 from the above, so we have demonstratedthat |G| = p$ 1 and G has generator !.

7.17 Let ! and g be as in the solution of Exercise 7.15 above. Let / = !(p"1)/m.Therefore,

/(g) = !(p"1)/m(g) = !(g)(p"1)/m = "m.

In other words, /(g) is a primitive m-th root of unity. Since a = gj for some jand since xm #= a for any x ! F!p, then m ! j. Hence,

/(a) = /(g)j = "jm #= 1.

Lastly, /m = !p"1 = /0.


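7.19 If $a = x^2$ is solvable for $a \in \mathbb{F}_p$ with $a \ne 0$, then

\[ N(2, a) = 1 + \left(\frac{a}{p}\right) = 1 + 1 = 2, \]

and if $a = 0$, then

\[ N(2, a) = 1 + \left(\frac{a}{p}\right) = 1 + 0 = 1. \]

On the other hand, if $a = x^2$ is not solvable, then

\[ N(2, a) = 1 + \left(\frac{a}{p}\right) = 1 - 1 = 0. \]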
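The three cases can be confirmed by counting roots directly for small primes. In the sketch below, `count_square_roots` plays the role of $N(2, a)$ and the Legendre symbol is computed by Euler's criterion; both function names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def count_square_roots(a, p):
    """Number of x in F_p with x^2 = a, i.e. N(2, a)."""
    return sum(1 for x in range(p) if (x * x - a) % p == 0)

for p in (3, 5, 7, 11, 13):
    for a in range(p):
        assert count_square_roots(a, p) == 1 + legendre(a, p)
print("N(2, a) = 1 + (a/p) verified for p in (3, 5, 7, 11, 13)")
```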

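7.21 Since $a \ne 0$, then $\zeta_p^a \ne 1$ and

\[ \sum_{j \in \mathbb{F}_p} \zeta_p^{aj} = \frac{\zeta_p^{ap} - 1}{\zeta_p^a - 1} = 0. \]

If $a = 0$, then $\zeta_p^a = 1$, so

\[ \sum_{j \in \mathbb{F}_p} \zeta_p^{aj} = p. \]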

7.23 This is virtually immediate from Exercise 7.21, since

\[ p^{-1}\sum_{j \in \mathbb{F}_p} \zeta_p^{j(a - b)} = p^{-1}p = 1 \]

if $a = b$ and is zero otherwise.
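A quick floating-point check of the sums in Exercises 7.21 and 7.23 is sketched below, taking $\zeta_p = e^{2\pi i/p}$ (one concrete choice of primitive $p$-th root of unity, assumed here). Because the arithmetic is approximate, results are compared up to a small tolerance rather than exactly; the function names are hypothetical.

```python
import cmath

def additive_sum(a, p):
    """Sum over j in F_p of zeta_p^(a*j), with zeta_p = exp(2*pi*i/p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(zeta ** (a * j) for j in range(p))

def delta(a, b, p):
    """(1/p) * sum_j zeta_p^(j*(a-b)): approximately 1 if a = b in F_p, else 0."""
    return additive_sum((a - b) % p, p) / p

p = 11
print([round(abs(additive_sum(a, p)), 9) for a in range(p)])
print(round(abs(delta(3, 3, p)), 9), round(abs(delta(3, 5, p)), 9))
```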

Section 8.1

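8.1 By the quadratic formula, the solutions to Equation 8.1 on page 271 are

\[ x = (\sqrt{R} \pm \sqrt{R - 4Q})/2. \]

Therefore,

\[ \alpha + \beta = (\sqrt{R} + \sqrt{R - 4Q})/2 + (\sqrt{R} - \sqrt{R - 4Q})/2 = \sqrt{R}, \]

and

\[ \alpha\beta = (\sqrt{R} + \sqrt{R - 4Q})(\sqrt{R} - \sqrt{R - 4Q})/4 = (R - (R - 4Q))/4 = Q. \]

Also,

\[ \alpha - \beta = (\sqrt{R} + \sqrt{R - 4Q})/2 - (\sqrt{R} - \sqrt{R - 4Q})/2 = \sqrt{R - 4Q}. \]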


8.3 (a)–(b) We use induction on $n$. The induction step is $U_1 = 1 \in \mathbb{Z}$, $U_2 = \sqrt{R}$. The induction hypothesis is

\[ U_{2i+1} \in \mathbb{Z}, \text{ and } U_{2i} \text{ is an integer multiple of } \sqrt{R} \text{ for all } i < n. \]

Therefore, by part (a) of Theorem 8.1,

\[ U_{2n} = \sqrt{R}\,U_{2n-1} - QU_{2n-2} \]

is an integer multiple of $\sqrt{R}$ by the induction hypothesis, which also implies that

\[ U_{2n+1} = \sqrt{R}\,U_{2n} - QU_{2n-1} \in \mathbb{Z}. \]

The argument for the $V_i$'s is similar.

8.5 We use induction on n.

Induction Step: For n = 1,

2n"1Un = 1 =

,(n+1)/2-X

k=1

n

2k $ 1

!V n"2k+1

1 #k"1,

and

2n"1Vn = V1 =

,n/2-X

k=0

n2k

!V n"2k

1 #k.

Induction hypothesis:

2n"2Un"1 =

,n/2-X

k=1

n$ 12k $ 1

!V n"2k

1 #k"1,

and

2n"2Vn"1 =

,(n"1)/2-X

k=0

n$ 12k

!V n"2k"1

1 #k.

We may assume that n is even since the other case is similar. By part (f) ofTheorem 8.1, 2Vn = V1Vn"1 + #Un"1U1, and by the induction hypothesis,

2n"1Vn =

n/2"1X

k=0

n$ 12k

!V n"2k

1 #k +

n/2X

k=1

n$ 12k $ 1

!V n"2k

1 #k =

V n1 + #n/2 +

n/2"1X

k=1

n$ 12k

!+

n$ 12k $ 1

!!V n"2k

1 #k =

n/2X

k=0

n2k

!V n"2k

1 #k.

Now we turn to the proof for Un.

2Vn+1 = V1Vn + #UnU1, by part (f) of Theorem 8.1. Thus, from what we havejust proved we get


Un =1#

(2Vn+1 $ V1Vn) =

22n#

n/2X

k=0

n + 12k

!V n+1"2k

1 #k $ V1

2n"1#

n/2X

k=0

n2k

!V n"2k

1 #k.

Therefore,

2n"1Un =

n/2X

k=0

n + 12k

!$

n2k

!!V n"2k+1

1 #k"1 =

n/2X

k=1

n

2k $ 1

!V n"2k+1

1 #k"1,

as required.

8.7 We may assume that Q is odd by Exercise 8.6. Also, by part (d) of Theorem 8.1,Un is even if and only if Vn is even.

(a) In this case,&

R ( 0(mod 2), which by definition means that R ( 0(mod 4).By part (a) of Theorem 8.1, Un+2 ( Un (mod 2). Since U0 = 0, U1 = 1, then2|Un if and only if n is even.

(b) Define: U #2n = U2n/

&R, and U #

2n+1 = U2n+1. By part (a) of Theorem 8.1,U #

2n+2 ( U #2n+1 + U #

2n (mod 2), with U #0 = 0, U #

1 = 1. Thus, U #n ( 0(mod 2) if

and only if n ( 0(mod 4).

(c) Since, U #n+2 ( U #

n+1 + U #n (mod 2), with U #

1 = U #2 = 1, then U #

n ( 0(mod 2)if and only if n ( 0(mod 3).

8.9 Let n = mm1. Then

Un/Um = (!n $ #n)/(!m $ #m) = (!mm1 $ #mm1)/(!m $ #m) =

!m(m1"1) + !m(m1"2)#m + !m(m1"3)#2m + · · · + !m#m(m1"2) + #m(m1"1) =

Vm(m1"1) + Vm(m1"3)Qm + Vm(m1"5)Q

2m + · · · + T,

where T = Qm(m1"2)/2Vm if m1 is even, and T = Qm(m1"1)/2 if m1 is odd. Ineither case, Un/Um is an integral multiple of

&R. Hence, Um|Un.

8.11 Let $d = \gcd(U_m, U_n)$. By Exercise 8.9, $U_g \mid U_m$ and $U_g \mid U_n$, so $U_g \mid d$. It remains to show that $d \mid U_g$. By Exercise 8.4,

\[ 2Q^mU_{n-m} = U_nV_m - V_nU_m \tag{S26} \]

and, by Exercise 8.6, $\gcd(U_m, Q) = 1 = \gcd(U_n, Q)$, so $d \mid 2U_{n-m}$. If $2 \mid d$, then $V_m$ and $V_n$ are even, so (S26) may be written $Q^mU_{n-m} = U_n(V_m/2) - (V_n/2)U_m$. Hence, $d \mid U_{n-m}$. By a reduction process that mimics the Euclidean algorithm, this shows that $d \mid U_g$.

Section 8.2

8.13 The equation has no solutions $x, d \in \mathbb{N}$ since $a^2 - D = 2^2 + 43 = 47$, but $D \ne -3a^2 \pm 1$.

8.15 The equation has no solution since $8^2 + 225 = 17^2$, so $a = 8$, but $D \ne -3a^2 \pm 1$.

8.17 $2^2 + 161047 = 11^5$.

Section 8.3

8.19 Since $I^{h_{O_F}} \sim 1$, $I^n \sim 1$, and $\gcd(h_{O_F}, n) = 1$, then there exist integers $x, y$ such that $nx + h_{O_F}y = 1$. Therefore,

\[ I = I^{nx + h_{O_F}y} = (I^n)^x(I^y)^{h_{O_F}} \sim 1, \]

as we sought to prove.

8.21 In Theorem 8.4, let $k = -13 = -1 - 3u^2$ with $u = 2$, for which $x = p^m = 4u^2 + 1 = 17$ with $m = 1$ and $y = \pm 2(3 + 8 \cdot 2^2) = \pm 70$. Thus, $p = 2^2 + 13$, and $70^2 = 17^3 - 13$. Thus, $(x, y) = (17, \pm 70)$.

8.23 By Theorem 8.4 there can be no solutions since $k = -47 \ne -3u^2 \pm 1$ for any integer $u$.

8.25 As per the hint, a solution (x, y) to (8.15) implies that

y +&

k = w(u + v&

k)3 (S27)

for a unit w ! OF and some u, v ! Z. Then w = ±.zk for some z ! Z. Since

we may write z = 3z1 + r where r ! {0,±1,±2}, then we may absorb (±.z1k )3

into the cube (u + v&

k)3, so we may assume, without loss of generality, thatw = .r

k, where r ! {0,±1,±2}. Given the definition of . and the fact that(T + U

&k)"1 = T $ U

&k, then we may assume w ! {.j

k : j = 0, 1,$1} if .k

has norm 1 and w ! {.jk : j = 0, 2,$2} if .k has norm $1. In either case,

w ! {.j : j = 0, 1,$1}.

Case S.1 w = 1

From (S27),

y +&

k = (u3 + 3uv2k) + (3u2v + v3k)&

k,

so by comparing coe!cients of&

k, we have that

1 = 3u2v + v3k = v(3u2 + v2k), (S28)

so v = ±1. Hence, multiplying (S28) by v yields

±1 = v = 3u2v2 + v4k + k > 1,

a contradiction.

Case S.2 w ! {T ± U&

k}

From (S27) we have

y+&

k = (T ±U&

k)(u+v&

k)3 = (T ±U&

k)“(u3 + 3uv2k) + (3u2v + v3k)

&k”

= (T (u3 + 3uv2k) ± (Uk(3u2 + v3k)) + (T (3u2v + v3k) ± U(u3 + 3uv2k))&

k.


Therefore, by comparing coe!cients of&

k again yields

1 = T (3u2v + v3k) ± U(u3 + 3uv2k). (S29)

Since k ( 4(mod 9) and U ( 0(mod 9), then 1 = T 2 $ kU2 implies that

T ( ±1 (mod 81).

Hence, by (S29),1 ( !(3u2 + 4v2)v (mod 9), (S30)

where ! ( ±1(mod 9).

From (S30), !v ( ±1(mod 9), so

3u2 + 4 ( !v ( ±1 (mod 9).

Thus,3u2 ( 4, 6 (mod 9),

which are impossible. This completes all cases.

Section 8.4

8.27 By Exercise 2.24,

|Q("n) : Q| = -(n) = deg(m'n,Q(x)),

and by Theorem 1.7, "n(x) = m'n,Q(x), so by Definition 1.9, the result follows.

8.29 By Exercise 8.28 with I = P and J = Pm"1, where m ! N and P0 = OF , we get

OF

P'=

Pm"1

Pm,

and for any n ! N,„

OF

P

«n

'=OF

P1 P

P21 · · ·1 Pn"1

Pn'=

OF

Pn,

so ˛˛OF

P

˛˛n

=

˛˛OF

Pn

˛˛ ,

which is what we sought to show.

8.31 The principal fact to establish is that the multiplication is well defined, namely that if $a + I = a' + I$ and $b + I = b' + I$, then $ab + I = a'b' + I$. Since $a' \in a' + I = a + I$, then $a' = a + j$ for some $j \in I$. Similarly, $b' = b + k$ for some $k \in I$. Thus,

\[ a'b' = (a + j)(b + k) = ab + jb + ak + jk. \]

Therefore,

\[ a'b' - ab = jb + ak + jk \in I, \]

since $I$ is an ideal. However, a fundamental fact is that cosets are either equal, or have a trivial intersection. Thus,

\[ ab + I = a'b' + I. \]

It now follows that $R/I$ is a ring with the properties inherited by the well-defined operation of multiplication, and $1_R + I$ is the identity of $R/I$, where $1_R$ is the multiplicative identity of $R$.


8.33 Suppose that & runs through a system of N(I) elements of R which are incon-gruent modulo I. Since

!&1 ( !&2 (mod I) for &1, &2 ! R,

implies thatI˛!(&1 $ &2),

then the relative primality of ! and I implies that I˛(&1 $ &2), namely

&1 ( &2 (mod I).

Hence, !& runs through all residue classes modulo I as & runs over its sys-tem. Therefore, among the !&, there exists one residue class in which # sits.Moreover, it is clearly uniquely determined modulo I.

Now we prove the last assertion. Set gcd(!, I) = G. Assume first that thereis a solution to the congruence !& ( # (mod I). Then there exists a % ! I suchthat !& = # + %. Hence, G

˛I˛(%). However, G

˛(!), so G

˛(#) = (!& $ %).

Conversely, if G˛

(#), then (#) " (!) + I = gcd((!), I), so # = !& + % forsome & ! R and % ! I. Thus, # ( !& (mod (%)), so since I

˛(%), then # ( !&

(mod I).

8.35 This is immediate from Exercise 8.34.

8.37 If we are given !, # ! OF both relatively prime to I, then !# + I is a classcompletely determined by ! and # modulo I, and !# is relatively prime to I.Thus, the group is an abelian group. By definition, the order of the group is!(I). Moreover, if I is a prime OF -ideal, then the group is isomorphic to themultiplicative subgroup of nonzero elements of the field OF /I, and we are done,since it is known that the multiplicative subgroup of all nonzero elements in afield is cyclic—see [68, Theorem A.6, p. 300].

8.39 The classes of the group defined in Exercise 8.37, represented by a rational integer, form a subgroup thereof. These are the classes of the representatives $1, 2, \ldots, p - 1$. Suppose that one of these integers $z$ is not relatively prime to $P$. Then since there exist $u, v \in \mathbb{Z}$ with

\[ up + vz = 1, \]

and $p \in P$, we would have $1 \in P$, a contradiction. Hence, all of these representatives are relatively prime to $P$, and they are distinct. Therefore, for any such class $z$, we must have that $z^{p-1} = 1$, the identity class of the group. By Exercise 8.37, the group is cyclic, so there are no more than $p - 1$ classes $z$ for which $z^{p-1} = 1$. Thus, the subgroup of classes represented by a rational integer is identical with the group of classes whose elements raised to the power $p - 1$ give the class 1. This yields the result.

Section 8.5

8.41 Applying the ABC-conjecture with

\[ a = m(m + 2), \quad b = 1, \quad \text{and} \quad c = (m + 1)^2 \]

yields that with finitely many exceptions, for any $\kappa > 1$,

\[ (m + 1)^2 < S(m(m + 1)^2(m + 2))^{\kappa}. \tag{S31} \]

Now we assume that $m > n$ and prove that for $k = 3$, there are only finitely many such $m$ for which

\[ S(m) = S(n), \quad S(m + 1) = S(n + 1), \quad \text{and} \quad S(m + 2) = S(n + 2). \tag{S32} \]

Now (S32) implies

\[ m - n = (m + j) - (n + j) \equiv 0 \pmod{S(m + j)} \]

for $0 \le j \le 2$. Given that

\[ \gcd(S(m), S(m + 1), S(m + 2)) \mid 2, \]

then

\[ S(m(m + 1)^2(m + 2)) \mid 2(m - n). \]

Using this in (S31) yields that with finitely many exceptions,

\[ m^2 < (m + 1)^2 < S(m(m + 1)^2(m + 2))^{\kappa} < (2m)^{\kappa}, \]

which, for $1 < \kappa < 2$, implies

\[ m < 2^{\kappa/(2 - \kappa)} \]

with finitely many exceptions. Hence, $m$ is bounded by a constant. We have shown that the Erdős–Woods conjecture holds for $k = 3$ with finitely many exceptions, assuming the ABC-conjecture.

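8.43 If $n$ is powerful, then any prime $p \mid n$ has exponent $n(p) \ge 2$ in the canonical prime factorization of $n$. Let $S$ denote the set of primes dividing $n$ that appear to an odd exponent $n(p) \ge 3$. Then

\[ n = \prod_{p \mid n,\ p \notin S} p^{n(p)} \prod_{p \in S} p^{n(p)} = \prod_{p \mid n,\ p \notin S} p^{n(p)} \prod_{p \in S} p^{n(p) - 3} \prod_{p \in S} p^3. \]

Letting

\[ x = \prod_{p \mid n,\ p \notin S} p^{n(p)/2} \prod_{p \in S} p^{(n(p) - 3)/2} \quad \text{and} \quad y = \prod_{p \in S} p \]

yields $n = x^2y^3$.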
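The construction in 8.43 can be carried out mechanically. The following is a minimal sketch, not part of the original solution; the helper names `factorize` and `powerful_decomposition` are illustrative, and trial division is only sensible for small $n$.

```python
def factorize(n):
    """Return {prime: exponent} for an integer n >= 1, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def powerful_decomposition(n):
    """Return (x, y) with n == x**2 * y**3, following the solution to 8.43.

    Primes with even exponent go entirely into x; a prime with odd
    exponent e >= 3 contributes p**((e - 3) // 2) to x and p to y.
    """
    x, y = 1, 1
    for p, e in factorize(n).items():
        if e < 2:
            raise ValueError(f"{n} is not powerful: {p} has exponent 1")
        if e % 2 == 0:
            x *= p ** (e // 2)
        else:
            x *= p ** ((e - 3) // 2)
            y *= p
    return x, y
```

For instance, $72 = 2^3 \cdot 3^2$ gives $x = 3$ and $y = 2$.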

8.45 Let $a$ be even and set $n = a^m$ in Exercise 8.42. Then there are only finitely many values such that $m > 1$ with $a^{2m} - 1$ being powerful. Hence, there cannot be infinitely many such values, which is what we sought to prove.


Section 9.1

9.1 If $x^3 + y^3 = z^3$ has nonzero integer solutions, then for

\[ X = 12z/(x + y) \quad \text{and} \quad Y = 36(x - y)/(x + y), \]

we get

\[ Y^2 = X^3 - 432. \]

Since $xyz \neq 0$, then $|Y| \neq 36$. Conversely, assume that

\[ Y^2 = X^3 - 432, \tag{S33} \]

for some $Y = A/B$ and $X = C/D$ with $A, B, C, D \in \mathbb{Z}$, $AD \neq 0$, and set

\[ x = (36B + A)D, \quad y = (36B - A)D, \quad \text{and} \quad z = 6BC. \]

By (S33), $(36 + Y)^3 + (36 - Y)^3 = (6X)^3$. Therefore,

\[ x^3 + y^3 = D^3[(36B + A)^3 + (36B - A)^3] = D^3(6XB)^3 = D^3(6BC/D)^3 = (6BC)^3 = z^3. \]

Since $|Y| \neq 36$, then $xyz \neq 0$.
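The substitution in 9.1 can be checked with exact rational arithmetic. The sketch below is illustrative and not from the text: the function names are hypothetical, and the closed form in `expected_defect` is an identity obtained here by expanding the substitution, namely $X^3 - 432 - Y^2 = 1728(z^3 - x^3 - y^3)/(x + y)^3$, so that the curve equation holds exactly when $x^3 + y^3 = z^3$.

```python
from fractions import Fraction


def weierstrass_point(x, y, z):
    """Map (x, y, z) with x + y != 0 to (X, Y) as in the solution to 9.1."""
    s = Fraction(x + y)
    return Fraction(12 * z) / s, Fraction(36 * (x - y)) / s


def defect(x, y, z):
    """Return X**3 - 432 - Y**2, which vanishes when (X, Y) is on the curve."""
    X, Y = weierstrass_point(x, y, z)
    return X ** 3 - 432 - Y ** 2


def expected_defect(x, y, z):
    """Closed form of the defect (derived by expansion; an assumption here)."""
    return Fraction(1728 * (z ** 3 - x ** 3 - y ** 3), (x + y) ** 3)


def fermat_triple(A, B, C, D):
    """Converse map: from Y = A/B and X = C/D to (x, y, z)."""
    return (36 * B + A) * D, (36 * B - A) * D, 6 * B * C
```

The trivial solution $(x, y, z) = (1, 0, 1)$ maps to $(X, Y) = (12, 36)$, which is why $|Y| \neq 36$ is needed to exclude $xyz = 0$.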

Section 9.2

9.3 Since $y^2 = x^3 + 1$, then $(y - 1)(y + 1) = x^3$. It is easy to see that

\[ g = \gcd(y - 1, y + 1) \mid 2. \]

If $g = 1$, then there are $z_1, z_2 \in \mathbb{Z}$ such that $y - 1 = z_1^3$ and $y + 1 = z_2^3$. By subtracting, we get

\[ 2 = z_2^3 - z_1^3 = (z_2 - z_1)(z_2^2 + z_1z_2 + z_1^2). \]

Since $z_1$ and $z_2$ must have the same parity, $z_2 - z_1 = 2$ and $z_2^2 + z_1z_2 + z_1^2 = 1$, which forces $z_1 = -1$ and $z_2 = 1$, namely $(x, y) = (-1, 0)$. If $g = 2$, then

\[ \left(\frac{y - 1}{2}\right)\left(\frac{y + 1}{2}\right) = 2\left(\frac{x}{2}\right)^3. \]

Hence, one of $(y + 1)/2$ or $(y - 1)/2$ is of the form $z_1^3$, for some $z_1 \in \mathbb{Z}$, and the other is of the form $2z_2^3$, for some $z_2 \in \mathbb{Z}$. Thus,

\[ \pm 1 = z_1^3 - 2z_2^3. \]

One readily verifies that the only integer solutions to this last equation are $z_1 = \pm 1 = z_2$, and $z_2 = 0$ with $z_1 = \pm 1$. Together with the case $g = 1$, these solutions yield

\[ (x, y) \in \{(2, \pm 3), (0, \pm 1), (-1, 0)\}. \]

The result now is a consequence of the Nagell-Lutz Theorem.
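As a sanity check on the list of integer points in 9.3, one can search a finite window by brute force. This is only a sketch with a hypothetical function name; a finite search cannot replace the argument above, it merely confirms that no small point has been missed.

```python
from math import isqrt


def integer_points(bound):
    """Return all integer (x, y) with y**2 == x**3 + 1 and |x| <= bound."""
    points = []
    for x in range(-bound, bound + 1):
        rhs = x ** 3 + 1
        if rhs < 0:
            continue  # a negative right-hand side cannot be a square
        y = isqrt(rhs)
        if y * y == rhs:
            points.append((x, y))
            if y != 0:
                points.append((x, -y))
    return sorted(points)
```

Calling `integer_points(1000)` returns exactly the five affine points listed in the solution.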


9.5 The number of incongruent solutions of

\[ z^2 \equiv y \pmod{p^k} \]

is $1 + \chi(y)$, so the number of solutions of $y^2 = x^3 + ax + b$, counting the point at infinity, is

\[ 1 + \sum_{x \in \mathbb{F}_{p^k}} \left(1 + \chi(x^3 + ax + b)\right) = p^k + 1 + \sum_{x \in \mathbb{F}_{p^k}} \chi(x^3 + ax + b). \]
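For the prime field case $k = 1$ with $p$ odd, the count in 9.5 is easy to test numerically. The sketch below assumes the character is the Legendre symbol modulo $p$, computed by Euler's criterion; the function names are illustrative only.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def count_by_character_sum(a, b, p):
    """Count of 9.5: p + 1 + sum of (x**3 + a*x + b / p) over x in F_p."""
    return p + 1 + sum(legendre(x ** 3 + a * x + b, p) for x in range(p))


def count_by_brute_force(a, b, p):
    """Directly count affine solutions of y**2 = x**3 + a*x + b, plus infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y - (x ** 3 + a * x + b)) % p == 0
    )
    return affine + 1
```

Both functions agree for every odd prime $p$ and every pair $(a, b)$, since the identity only uses the number of square roots of each value.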

9.7 The discriminant is given by $\Delta(E(\mathbb{Q})) = 5$.

Also, $Q = (1, 0)$ is clearly a point, as is $P = (0, 1)$. Since $2P = (1, 0) = Q$, then $P$ is a point of order 4. Moreover, $(0, -1) = -P$, so $(0, -1) = 3P$, and this implies that $E(\mathbb{Q})_t$ is generated by $P$, namely

\[ E(\mathbb{Q})_t \cong \mathbb{Z}/4\mathbb{Z}. \]

9.9 The discriminant is $\Delta(E(\mathbb{Q})) = -9$. Also, all of the points

\[ \{\mathfrak{o}, (-1, 0), (0, \pm 1), (2, \pm 3)\} \]

are of finite order. Moreover, $(2, 3)$ is of order 6 and $E(\mathbb{Q})$ can be shown to be cyclic of order 6, so

\[ E(\mathbb{Q}) \cong \mathbb{Z}/6\mathbb{Z}. \]

Section 9.3

9.11 (a) 97 · 167 (b) 89 · 149 (c) 97 · 547 (d) 101 · 103

9.13 (a)–(c) are prime and 26869 = 97 · 277

Section 9.4

9.15 Let $y = mx + b$ be the tangent line to $E$ at $x_1$. Thus, since

\[ 2(x_1, y_1) = (x_2, y_2), \]

both $(x_1, y_1)$ and $(x_2, -y_2)$ are on $y = mx + b$. Hence, both points satisfy

\[ \prod_{j=1}^{3} (x - \alpha_j) = y^2 = (mx + b)^2. \]

Also, since $y = mx + b$ is tangent to $E$ at $x_1$, then the three roots of

\[ \prod_{j=1}^{3} (x - \alpha_j) - (mx + b)^2 = 0 \]

are $x_2$ and $x_1$ repeated. In other words,

\[ \prod_{j=1}^{3} (x - \alpha_j) - (mx + b)^2 = (x - x_2)(x - x_1)^2. \]

By setting $x = \alpha_j$ for each of $j = 1, 2, 3$ and observing that $x_1 \neq \alpha_j$ since $(x_2, y_2) \neq \mathfrak{o}$, then

\[ x_2 - \alpha_j = \left(\frac{m\alpha_j + b}{\alpha_j - x_1}\right)^2, \]

which is the square of a rational number for $j = 1, 2, 3$.

9.17 It is a trivial exercise to verify that

\[ x^2 + ny^2 = z^2 \quad \text{and} \quad x^2 - ny^2 = t^2 \tag{S34} \]

has a solution in integers with $y \neq 0$ if and only if it has a solution in rational numbers with $y \neq 0$.

Suppose first that $n$ is a congruent number. Then by part (1) of Exercise 9.14,

\[ b = 2n/a. \]

Thus,

\[ c^2 = a^2 + b^2 = a^2 + 4n^2/a^2, \]

or via division by 4:

\[ \left(\frac{c}{2}\right)^2 = \left(\frac{a}{2}\right)^2 + \left(\frac{n}{a}\right)^2. \tag{S35} \]

Adding $\pm n$ to each side of (S35) (to essentially complete the square), we get:

\[ \left(\frac{c}{2}\right)^2 \pm n = \left(\frac{a}{2} \pm \frac{n}{a}\right)^2. \]

Setting

\[ x = c/2, \quad y = 1, \quad z = a/2 + n/a, \quad \text{and} \quad t = a/2 - n/a \]

yields a rational solution of (S34), so by the initial comment at the outset of this solution, it has an integral solution. This shows that (1) implies (2).
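The first direction of 9.17 can be illustrated with exact arithmetic. The sketch below is not from the text: the function name is hypothetical, and it takes $z = a/2 + n/a$ and $t = a/2 - n/a$, which is what completing the square in (S35) produces.

```python
from fractions import Fraction


def rational_solution(a, b, c):
    """From a right triangle (a, b, c), return n = ab/2 and (x, y, z, t)
    with x**2 + n*y**2 == z**2 and x**2 - n*y**2 == t**2."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a * a + b * b != c * c:
        raise ValueError("(a, b, c) is not a right triangle")
    n = a * b / 2
    x, y = c / 2, Fraction(1)
    z, t = a / 2 + n / a, a / 2 - n / a
    return n, (x, y, z, t)
```

For the triangle $(3, 4, 5)$ this gives $n = 6$ and $(x, y, z, t) = (5/2, 1, 7/2, -1/2)$; clearing denominators gives the integral solution $(5, 2, 7, 1)$ up to sign.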

Now we assume that (2) holds. Without loss of generality, we may assume that $x, y, z, t \in \mathbb{N}$ and that these integers are pairwise relatively prime. If $y = 1$, then by adding the equations in (S34), we get that

\[ 2x^2 = z^2 + t^2. \]

Thus, both $z$ and $t$ have the same parity. If they are both even, then $x$ is even, contradicting the relative primality in pairs. Therefore, they are both odd. By subtracting the two equations in (S34), we get that

\[ 2n = z^2 - t^2 \equiv 0 \pmod{8} \]

since

\[ z^2 \equiv 1 \equiv t^2 \pmod{8}. \]

Thus, 4 divides $n$, contradicting the squarefreeness of $n$. We have shown that $y \neq 1$. By multiplying the equations in (S34), we get:

\[ \left(\frac{xtz}{y^3}\right)^2 = \left(\frac{x^2}{y^2}\right)^3 - n^2\,\frac{x^2}{y^2}. \]

This shows that

\[ P = (X, Y) = (x^2/y^2,\ xtz/y^3) \]

is a rational (but not integral) point on $E$, so $P$ has infinite order. Thus,

\[ 2P = (x_2, y_2) \neq \mathfrak{o}. \]

From Exercises 9.14–9.15, $n$ is a congruent number. This shows that (2) implies (1) and we are done.

Section 10.1

10.1 We have, for $\bar{z} = e - fi$ being the complex conjugate of $z = e + fi$, that

\[ \Im(\alpha z) = \Im\left(\frac{az + b}{cz + d}\right) = \Im\left(\frac{(az + b)(c\bar{z} + d)}{(cz + d)(c\bar{z} + d)}\right) = \Im\left(\frac{(az + b)(c\bar{z} + d)}{|cz + d|^2}\right), \]

where the denominator of the last equality comes from the fact that

\[ (cz + d)(c\bar{z} + d) = c^2(e^2 + f^2) + 2cde + d^2 = (ce + d)^2 + c^2f^2 = |cz + d|^2. \]

Hence,

\[ \Im(\alpha z) = \frac{\Im[(az + b)(c\bar{z} + d)]}{|cz + d|^2}, \]

so it remains to show that $\Im[(az + b)(c\bar{z} + d)] = \Im(z)$. However, this follows from the fact that $ad - bc = 1$ since

\[ \Im[(az + b)(c\bar{z} + d)] = \Im[ac(e^2 + f^2) + ade + bce + bd + f(ad - bc)i] = \Im(f(ad - bc)i) = f = \Im(z). \]
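The identity of 10.1, that the imaginary part of $(az + b)/(cz + d)$ equals $\Im(z)/|cz + d|^2$ when $ad - bc = 1$, is easy to spot-check in floating point. The function names below are illustrative, and agreement is only up to rounding error.

```python
def mobius(a, b, c, d, z):
    """Apply the fractional linear transformation z -> (a*z + b)/(c*z + d)."""
    return (a * z + b) / (c * z + d)


def imaginary_parts(a, b, c, d, z):
    """Return (Im of the image of z, Im(z)/|c*z + d|**2) for comparison."""
    if a * d - b * c != 1:
        raise ValueError("matrix must have determinant 1")
    lhs = mobius(a, b, c, d, z).imag
    rhs = z.imag / abs(c * z + d) ** 2
    return lhs, rhs
```

With $(a, b, c, d) = (2, 1, 1, 1)$ and $z = 0.3 + 1.7i$, both values equal $1.7/4.58$ up to rounding.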

10.3 Assume that $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ such that $\alpha z \in D$. If $\Im(\alpha z) < \Im(z)$, then we may replace $z$ by $\alpha z$ and $\alpha$ by $\alpha^{-1}$, which tells us that $\Im(\alpha z) \ge \Im(z)$ may be assumed without loss of generality. Therefore, by (10.1) on page 332,

\[ |cz + d|^2 = \frac{\Im(z)}{\Im(\alpha z)} \le 1, \tag{S36} \]

so $|cz + d| \le 1$. This means that $|c| \le 1$, namely $c \in \{0, \pm 1\}$. If $c = 0$, then $d = \pm 1$, and $\alpha = \begin{pmatrix} \pm 1 & b \\ 0 & \pm 1 \end{pmatrix} \in \Gamma$, namely $\alpha z = z \pm b$. Given that $z, \alpha z \in D$, then $|b| = |z - \alpha z| \le 1$, namely $b \in \{0, \pm 1\}$. If $b = 0$, then $\alpha$ is the identity, which contradicts the hypothesis. If $b = \pm 1$, then $\alpha z = z \pm 1$. Also, $|\Re(z)| \le 1/2$ and $|\Re(z \pm 1)| \le 1/2$, so $\Re(z) = \pm 1/2$ is forced.

Now consider the case where $c = \pm 1$. Then (S36) tells us that $|z + d| \le 1$, which forces $d = 0$ unless $z = \zeta_3 = (-1 + \sqrt{-3})/2$ or $z = 1 + \zeta_3$, since this is the case where $\alpha z = z + 1 = -z^2$, $\Re(z) = -1/2$, and $d = 1$ (respectively $\alpha z = z - 1 = -(z - 1)^2 - 1$, $\Re(z) = 1/2$, and $d = -1$); see Exercise 1.54 on page 46.

When $d = 0$, since $ad - bc = 1$ and $bc = -1$, then either $b = 1 = -c$ or $c = 1 = -b$. Thus,

\[ \alpha z = \pm a - 1/z = \pm a - \bar{z}, \tag{S37} \]

where $\bar{z}$ is the complex conjugate of $z$. If $a = 0$, then $\alpha z = -\bar{z} = -1/z$. Also, since $z \in D$, then $|z| \ge 1$, and since $|z + d| = |z| \le 1$, then $|z| = 1$.

Now assume that $a \neq 0$. Since $z, \alpha z \in D$, then

\[ |a| = |\Re(\alpha z) + \Re(z)| \le 1, \tag{S38} \]

so $a = \pm 1$. Thus, by (S37), $\Re(\alpha z) = \Re(\pm 1 - \bar{z}) = \pm 1 - \Re(z)$, but $\Re(z) \le 1/2$ and $\Re(\alpha z) \le 1/2$, so it follows that if $a = -1$, then $\Re(z) = \Re(\alpha z) = -1/2$, while if $a = 1$, then $\Re(z) = \Re(\alpha z) = 1/2$. However, by (S37), $\Im(\alpha z) = \Im(z)$, so $\alpha z = z$, which forces $\alpha$ to be the identity, contradicting the hypothesis. This completes all cases.

Section 10.2

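10.5 If condition (b) is satisfied, then by (10.5) on page 337, $f(z + 1) = f(z)$. Also, since $S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ is a generator of $\Gamma$ by Theorem 10.1 on page 333, then taking $\gamma = S$,

\[ f(\gamma z) = f(-1/z) = (-z)^k f(z). \]

Conversely, assume that conditions (1)–(2) hold. Given

\[ \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma, \tag{S39} \]

define

\[ d(\gamma, z) = cz + d. \]

Now we show that for $\alpha, \gamma \in \Gamma$, we have

\[ d(\alpha\gamma, z) = d(\alpha, \gamma z)\,d(\gamma, z). \tag{S40} \]

Let $\gamma$ be given by (S39), and let $\alpha = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}$. Then

\[ \alpha\gamma = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} aa' + b'c & a'b + b'd \\ c'a + d'c & c'b + d'd \end{pmatrix}, \]

so

\[ d(\alpha\gamma, z) = (c'a + d'c)z + c'b + d'd, \]

\[ d(\alpha, \gamma z) = d\left(\alpha, \frac{az + b}{cz + d}\right) = c'\left(\frac{az + b}{cz + d}\right) + d', \]

and

\[ d(\gamma, z) = cz + d. \]

Hence,

\[ d(\alpha, \gamma z)\,d(\gamma, z) = \left[c'\left(\frac{az + b}{cz + d}\right) + d'\right] \cdot [cz + d] = c'(az + b) + d'(cz + d) = (c'a + d'c)z + c'b + d'd = d(\alpha\gamma, z), \]

which establishes (S40).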
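The relation (S40) can be spot-checked numerically. The sketch below represents a matrix as a pair of row tuples; the helper names are hypothetical, and the floating-point comparison is only approximate.

```python
def mat_mul(m, n):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def act(m, z):
    """Fractional linear action of m on a complex number z."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)


def automorphy_factor(m, z):
    """The factor d(m, z) = c*z + d read off from the bottom row of m."""
    (_, _), (c, d) = m
    return c * z + d


def cocycle_gap(alpha, gamma, z):
    """Return |d(alpha*gamma, z) - d(alpha, gamma z) * d(gamma, z)|."""
    lhs = automorphy_factor(mat_mul(alpha, gamma), z)
    rhs = automorphy_factor(alpha, act(gamma, z)) * automorphy_factor(gamma, z)
    return abs(lhs - rhs)
```

Taking $\alpha$ and $\gamma$ to be words in $S$ and $T$ and any $z$ in the upper half-plane, `cocycle_gap` should return a value at the level of rounding error.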

Now we establish that

\[ d(\gamma^{-1}, z) = (d(\gamma, \gamma^{-1}z))^{-1}. \tag{S41} \]

Since $\gamma^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$, then

\[ d(\gamma^{-1}, z) = -cz + a, \]

and

\[ d(\gamma, \gamma^{-1}z) = d\left(\gamma, \frac{dz - b}{-cz + a}\right) = c\left(\frac{dz - b}{-cz + a}\right) + d = \frac{c(dz - b) + d(-cz + a)}{-cz + a} = \frac{1}{-cz + a} = (d(\gamma^{-1}, z))^{-1}, \]

which is (S41).

Now assume that

\[ f(\delta z) = d(\delta, z)^k f(z), \tag{S42} \]

where $z \in \mathfrak{H}$ and $\delta \in \Gamma$. Then (S40) tells us that (S42) holds for $\delta = \alpha\gamma$ and (S41) tells us that (S42) holds for $\delta = \gamma^{-1}$. Hence, the subset of $\Gamma$ for which (S42) holds is a subgroup. However, conditions (1)–(2) tell us that this subgroup contains $S$ and $T$, which generate all of $\Gamma$ by Theorem 10.1 on page 333. Hence, (1)–(2) imply that (b) holds.

10.7 We have |$ :$ 0(n)| = pa + pa"1. For 0 * + * pa"1$ 1 set &% =

„1 0p+ 1

«and

for 0 * m * pa $ 1 set &m =

„m 1$1 0

«. Therefore we have that

3pa+pa"1

m=0 &m$0(pa) " $,

so we merely have to show that these &m represent distinct cosets. If 0 * + *pa"1 $ 1 and 0 * m * pa $ 1,

&"1% &m =

„1 0$p+ 1

«„m 1$1 0

«=

„m 1

$p+m$ 1 $p+

«#! $0(p

a).

If 0 * +, m * pa"1 $ 1, then

&"1% &m =

„1 0$p+ 1

«„1 0

pm 1

«=

„1 0

p(m$ +) 1

«,

which is in $0(pa) if and only if + = m. Lastly, if 0 * +, m * pa $ 1, then

&"1% &m =

„0 $11 +

«„m 1$1 0

«=

„1 0

(m$ +) 1

«,

which is in $0(pa) if and only if + = m. Hence, all left cosets are distinct.

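10.9 Let $a/c \in \mathbb{Q}$ with $\gcd(a, c) = 1$. Then by the Euclidean algorithm there exist $b, d \in \mathbb{Z}$ such that $ad - bc = 1$. Thus, $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$. Select $\alpha \in \Gamma_0(n)$, as well as any $\gamma_j$, and set $\gamma = \gamma_j\alpha$. Thus, $\gamma_j\alpha(\infty) = \gamma(\infty) = a/c$. Hence, $a/c$ and $\gamma_j(\infty)$ represent the same cusp.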

10.11 If n = 8 in Exercise 10.10, then taking

&j = &%=1 =

„1 02 1

«, &i = &%=3 =

„1 06 1

«, and ! =

„1 14 5

«


yields

&"1i ! =

„1 1$2 $1

«.

Therefore, &"1i (0) = $1/2, and since

&"1j =

„1 0$2 1

«,

then &"1j (0) = $1/2. Hence, both &"1

i (0) and &"1j (0) represent the same

cusp $1/2. However,

&j&"1i =

„1 0$4 1

«,

which is not of the upper triangular form in the su!cient condition.

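10.13 By Exercise 10.12 and (10.18) of the hint, we have that $f(x) = \pi$ for all $x$. Thus, since $f(x) = \Gamma(x)\Gamma(1 - x)\sin(\pi x)$, then we deduce that

\[ \Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin(\pi x)}. \tag{S43} \]

Also, by (5.34) on page 224, $\Gamma(x + 1) = x\Gamma(x)$, so (S43) may be rewritten as

\[ \sin(\pi x) = \frac{\pi}{-x\Gamma(x)\Gamma(-x)}. \]

Now (10.19) from the hint allows us to replace the gamma function to achieve

\[ -x\Gamma(x)\Gamma(-x) = -x\left(\frac{e^{-\gamma x}}{x}\prod_{j=1}^{\infty}\frac{e^{x/j}}{1 + x/j}\right)\left(\frac{e^{\gamma x}}{-x}\prod_{j=1}^{\infty}\frac{e^{-x/j}}{1 - x/j}\right) = \frac{1}{x}\prod_{j=1}^{\infty}\frac{1}{1 - x^2/j^2}. \]

Hence,

\[ \sin(\pi x) = \pi x\prod_{j=1}^{\infty}\left(1 - \frac{x^2}{j^2}\right), \]

so by letting $z = \pi x$, we get the result,

\[ \sin(z) = z\prod_{j=1}^{\infty}\left(1 - \frac{z^2}{\pi^2j^2}\right). \]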
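The product formula for $\sin(z)$ converges slowly, which a short numerical experiment makes visible. The function name below is illustrative; the error of a truncation after $N$ factors is roughly of relative size $z^2/(\pi^2 N)$, so only a few digits should be expected.

```python
import math


def sine_partial_product(z, terms):
    """Return z * prod_{j=1}^{terms} (1 - z**2 / (pi**2 * j**2))."""
    product = z
    for j in range(1, terms + 1):
        product *= 1.0 - (z * z) / (math.pi ** 2 * j * j)
    return product
```

Comparing `sine_partial_product(1.0, 100000)` with `math.sin(1.0)` shows agreement to about five or six decimal places.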

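10.15 By Remark 10.3, $E_4^2 = E_8$, so by Example 10.2 on page 339,

\[ 1 + 480\sum_{n=1}^{\infty}\sigma_7(n)q^n = \left(1 + 240\sum_{n=1}^{\infty}\sigma_3(n)q^n\right)^2, \]

which implies that

\[ 1 + 480\sum_{n=1}^{\infty}\sigma_7(n)q^n = 1 + 480\sum_{n=1}^{\infty}\sigma_3(n)q^n + 240^2\left(\sum_{n=1}^{\infty}\sigma_3(n)q^n\right)^2, \]

so

\[ \sum_{n=1}^{\infty}\sigma_7(n)q^n = \sum_{n=1}^{\infty}\sigma_3(n)q^n + 120\sum_{n=1}^{\infty}\sum_{j=1}^{n-1}\sigma_3(j)\sigma_3(n - j)\,q^n. \]

Therefore,

\[ \sigma_7(n) = \sigma_3(n) + 120\sum_{j=1}^{n-1}\sigma_3(j)\sigma_3(n - j), \]

as required.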
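The identity of 10.15 can be confirmed for small $n$ by direct computation. In the sketch below the coefficient of $q^n$ in the squared series is taken as the Cauchy product $\sum_{j=1}^{n-1}\sigma_3(j)\sigma_3(n - j)$; the function names are illustrative.

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)


def convolution_side(n):
    """sigma_3(n) + 120 * sum_{j=1}^{n-1} sigma_3(j) * sigma_3(n - j)."""
    return sigma(3, n) + 120 * sum(
        sigma(3, j) * sigma(3, n - j) for j in range(1, n)
    )
```

For $n = 2$ both sides equal $129 = 9 + 120 \cdot 1 \cdot 1$.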

10.17 By Definition 10.4,

\[ j(z) = \frac{1728 \cdot 60^3 G_4(z)^3}{\Delta(z)} = \frac{1 + 720q + 179280q^2 + 16954560q^3 + \cdots}{q - 24q^2 + 252q^3 - 1472q^4 + \cdots} = \frac{1}{q} + 744 + 196884q + 21493760q^2 + \cdots, \]

which shows that $j$ is a modular form of weight 0, but not a cusp form.

Section 10.3

10.19 If we have isomorphic elliptic curves E1'= E2, then by Definition 10.12 on

page 350,

j(E2) =1728(g(2)

2 )3

(g(2)2 )3 $ 27(g(2)

3 )2=

1728!12(g(1)2 )3

!12(g(1)2 )3 $ 27!12(g(1)

3 )2

=1728(g(1)

2 )3

(g(1)2 )3 $ 27(g(1)

3 )2= j(E1).

10.21 From the hint,

\[ \sum_{n=0}^{\infty} x^n = (1 - x)^{-1}. \]

Differentiating with respect to $x$, we get

\[ (1 - x)^{-2} = \sum_{n=1}^{\infty} nx^{n-1} = \sum_{n=0}^{\infty} (n + 1)x^n = 1 + \sum_{n=1}^{\infty} (n + 1)x^n. \]

10.23 Using the expansion in Exercise 10.22,

\[ \wp'(z)^2 = 4z^{-6} - 24G_4z^{-2} - 80G_6 + \cdots, \]

\[ \wp(z)^3 = z^{-6} + 9G_4z^{-2} + 15G_6 + \cdots, \]

and

\[ \wp(z) = z^{-2} + 3G_4z^2 + \cdots, \]

so

\[ f(z) = \wp'(z)^2 - 4\wp(z)^3 + 60G_4\wp(z) + 140G_6 \]

is analytic around $z = 0$. Since $f(z)$ is an elliptic function with respect to $L$ and $\wp$ is analytic on $\mathbb{C} - L$ by Remark 10.7 on page 349, then $f$ is an analytic elliptic function, so by Liouville's theorem given in the hint, $f$ is constant. But since $f(0) = 0$, then $f$ is identically zero, so

\[ \wp'(z)^2 = 4\wp(z)^3 - 60G_4\wp(z) - 140G_6 = 4\wp(z)^3 - g_2(L)\wp(z) - g_3(L), \]

as required.


10.25 From the definitions, if L1 and L2 are homothetic,

g2(L1) = g2(3L2) = 60X

)(#)L2"{0}

1w4

= 3"460X

(#L2"{0}

1w4

= 3"4g2(L2),

and similarly,g3(L1) = 3"6g3(L2).

Therefore, by Equation (10.20) on page 349,

j(L1) =1728g2(L1)

3

g2(L1)3 $ 27g3(L1)2=

17283"12g2(L2)3

3"12g2(L2)3 $ 273"12g3(L2)2

=1728g2(L2)

3

g2(L2)3 $ 27g3(L2)2= j(L2),

and by (10.24) on page 350, j(E1) = j(E2), so by Exercise 10.19, E1'= E2.

Conversely, assume that E1'= E2, so by Exercise 10.19, j(E1) = J(E2), so by

(10.24), j(L1) = j(L2). If

g2(L2) #= 0 #= g3(L2),

then let

34 =g2(L1)g2(L2)

.

Since

j(L1) =1728g2(L1)

3

g2(L1)3 $ 27g3(L1)2=

1728g2(L2)3

g2(L2)3 $ 27g3(L2)2= j(L2),

so by cross multiplying we get that

g2(L1)3(g2(L2)

3 $ 27g3(L2)2) = g2(L2)

3(g2(L1)3 $ 27g3(L1)

2).

However, sinceg2(L1)

3 = 312g2(L2)3,

it follows that

312(g2(L2)3 $ 27g3(L2)

2) = g2(L1)3 $ 27g3(L1)

2 = 312g2(L2)3 $ 27g3(L1)

2,

so312g3(L2)

2 = g3(L1)2,

or by rewriting

312 =

„g3(L1)g3(L2)

«2

.

Taking square roots, we get

36 = ±g3(L1)g3(L2)

.

If the minus sign occurs, then replace 3 by&$13, so without loss of generality,

we may assume that the plus sign occurs. We have demonstrated that wheng2(L2) and g3(L2) are nonzero, then there is a nonzero 3 ! C such that

g2(L2) = 3"4g2(L1) = g2(3L1) and g3(L2) = 3"6g3(L1) = g3(3L1). (S44)


Now from (10.22) on page 350, since (S44) holds, then

2(z; L2) = 2(z; 3L),

namely, they have the same Laurent expansions about z = 0. Since they agreeon a disk about z = 0, then

2(z; L2) = 2(z; 3L)

for all z ! C. Since the underlying lattice is the set of poles of 2, then L2 = 3L1.We have proved the result for all cases except where g2(L2) = 0 or g3(L2) = 0.Note that by Exercise 10.24, it is not possible to have

g2(L2) = 0 = g3(L2).

Suppose thatg2(L2) #= 0 = g3(L2),

then as above, we getg2(L2) = g2(3L1),

and since g3(L2) = 0, then

0 = g3(L2) = g3(3L1).

The other case is similar.

Section 10.4

10.27 We have from (10.30) and (10.32) that

\[ \frac{c_4^3 - c_6^2}{1728} = \frac{1}{1728}\left[(b_2^2 - 24b_4)^3 - (-b_2^3 + 36b_2b_4 - 216b_6)^2\right] = \frac{b_2^2b_4^2 - b_2^3b_6}{4} - 8b_4^3 + 9b_2b_4b_6 - 27b_6^2. \tag{S45} \]

Now since

\[ b_2^2b_4^2 - b_2^3b_6 = -b_2^2(b_2b_6 - b_4^2) = -b_2^2\left[(a_1^2 + 4a_2)(a_3^2 + 4a_6) - (2a_4 + a_1a_3)^2\right] \]

\[ = -4b_2^2(a_1^2a_6 + 4a_2a_6 - a_1a_3a_4 + a_2a_3^2 - a_4^2) = -4b_2^2b_8, \]

then plugging this into (S45), we get

\[ \frac{c_4^3 - c_6^2}{1728} = -b_2^2b_8 - 8b_4^3 + 9b_2b_4b_6 - 27b_6^2 = \Delta(E), \]

as we sought to prove.
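The relation $1728\,\Delta = c_4^3 - c_6^2$ is a polynomial identity in $a_1, \ldots, a_6$, so it can be tested on integer inputs. The sketch below uses the expressions for $b_2, b_4, b_6, b_8, c_4, c_6$ that appear in the displays above; the function names are illustrative.

```python
def b_invariants(a1, a2, a3, a4, a6):
    """Return (b2, b4, b6, b8) for the Weierstrass coefficients a1..a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return b2, b4, b6, b8


def discriminant(a1, a2, a3, a4, a6):
    """Delta = -b2^2 b8 - 8 b4^3 - 27 b6^2 + 9 b2 b4 b6."""
    b2, b4, b6, b8 = b_invariants(a1, a2, a3, a4, a6)
    return -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6


def c_invariants(a1, a2, a3, a4, a6):
    """Return (c4, c6) with c4 = b2^2 - 24 b4, c6 = -b2^3 + 36 b2 b4 - 216 b6."""
    b2, b4, b6, _ = b_invariants(a1, a2, a3, a4, a6)
    return b2 * b2 - 24 * b4, -b2 ** 3 + 36 * b2 * b4 - 216 * b6
```

For the curve $y^2 + y = x^3 - x^2 - 10x - 20$, that is $(a_1, a_2, a_3, a_4, a_6) = (0, -1, 1, -10, -20)$, `discriminant` returns $-161051 = -11^5$.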

10.29 By Exercise 10.28, the elliptic curve $E$ given by $y^2 + y = x^3 - x^2 - 10x - 20$ has good reduction for all primes $p \neq 11$. Here is the good reduction table, where $N_p$ counts the points of $E$ over $\mathbb{F}_p$ including the point at infinity, and $a_p(E) = p + 1 - N_p$.

p        2   3   5   7  13  17  19  23  29  31  37  41  43
N_p      5   5   5  10  10  20  20  25  30  25  35  50  50
a_p(E)  -2  -1   1  -2   4  -2   0  -1   0   7   3  -8  -6
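Entries of such a table can be recomputed by counting points directly. The sketch below is a naive $O(p^2)$ count for the curve $y^2 + y = x^3 - x^2 - 10x - 20$, with $N_p$ taken to include the point at infinity and $a_p = p + 1 - N_p$; the function names are illustrative.

```python
def count_points(p):
    """Number of points on y^2 + y = x^3 - x^2 - 10x - 20 over F_p,
    including the point at infinity."""
    affine = 0
    for x in range(p):
        rhs = (x ** 3 - x ** 2 - 10 * x - 20) % p
        for y in range(p):
            if (y * y + y - rhs) % p == 0:
                affine += 1
    return affine + 1


def trace_of_frobenius(p):
    """a_p = p + 1 - N_p."""
    return p + 1 - count_points(p)
```

For example, `count_points(7)` returns 10, so `trace_of_frobenius(7)` is $-2$.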


10.31 According to the hint, we look at $F(0, 1, 0) = Z$ for which $\partial F/\partial Z(\mathfrak{o}) = 1 \neq 0$. Hence, no elliptic curve can be singular at infinity.

We know that $E$ given by (10.26) is singular if and only if $E$ given by (10.27) on page 353 is singular, the latter given by

\[ f(x, y) = -y^2 + 4x^3 + b_2x^2 + 2b_4x + b_6 = 0. \]

Given that $E$ is singular if and only if there is a point $P = (x_0, y_0)$ with

\[ \partial f/\partial x(P) = 0 = \partial f/\partial y(P), \]

then

\[ 2y_0 = 12x_0^2 + 2b_2x_0 + 2b_4 = 0, \]

so $y_0 = 0$, and $P = (x_0, 0)$, which is therefore a repeated root of $4x^3 + b_2x^2 + 2b_4x + b_6 = 0$, namely

\[ 4x^3 + b_2x^2 + 2b_4x + b_6 = (x - \alpha)^2(4x - \beta) = 4x^3 - (8\alpha + \beta)x^2 + 2(\alpha\beta + 2\alpha^2)x - \alpha^2\beta = 0. \]

This implies that

\[ b_2 = -8\alpha - \beta, \quad b_4 = \alpha\beta + 2\alpha^2, \quad b_6 = -\alpha^2\beta, \]

so

\[ c_4 = b_2^2 - 24b_4 = 16\alpha^2 - 8\alpha\beta + \beta^2, \]

and

\[ c_6 = -64\alpha^3 + 48\alpha^2\beta - 12\alpha\beta^2 + \beta^3. \]

Therefore,

\[ c_4^3 = 4096\alpha^6 - 6144\alpha^5\beta + 3840\alpha^4\beta^2 - 1280\alpha^3\beta^3 + 240\alpha^2\beta^4 - 24\alpha\beta^5 + \beta^6, \]

and

\[ c_6^2 = 4096\alpha^6 - 6144\alpha^5\beta + 3840\alpha^4\beta^2 - 1280\alpha^3\beta^3 + 240\alpha^2\beta^4 - 24\alpha\beta^5 + \beta^6. \]

Thus,

\[ \Delta = (c_4^3 - c_6^2)/1728 = 0. \]

We have shown, by contrapositive, that $E$ is nonsingular if and only if $\Delta \neq 0$.

10.33 By using the solution of Exercise 10.31 above, we see that we can transform $y^2 + y = x^3 - x^2$ via replacing $y$ with $(y - 1)/2$ to get

\[ y^2 = 4x^3 - 4x^2 + 1, \]

which reduces, modulo 11, to

\[ y^2 = 4x^3 - 4x^2 + 1 = (x - 8)^2(4x - 6), \]

where in the notation used above, $\alpha = 8$ and $\beta = 6$, observing that $\Delta(E(\mathbb{F}_{11})) = -11 \equiv 0 \pmod{11}$. Graphing the right-hand side gives the accompanying figure (not reproduced here).


10.35 Since $a_1 = a_3 = a_4 = 0$, $a_2 = p$, and $a_6 = 1$, then $b_2 = 4p$, $b_4 = 0$, $b_6 = 4$, and $b_8 = 4p$. Therefore,

\[ \Delta = -b_2^2b_8 - 8b_4^3 - 27b_6^2 + 9b_2b_4b_6 = -16(4p^3 + 27), \]

so since $p > 3$, then $p \nmid \Delta$, so $E$ has good reduction at $p$ by Exercise 10.31.

10.37 Reduced modulo p, we get y2 = x3 which is Figure 10.2 on page 359, so it hasa cusp over Fp.

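10.39 From (10.38)–(10.39),

\[ \frac{c_4^3}{\Delta} = -\frac{(2^4 \cdot 31)^3}{11^5}, \]

so $11^5 \mid \Delta$, $2^4 \cdot 31 \mid c_4$, and $c_4 = 2^4 \cdot 31 \cdot \sqrt[3]{-\Delta/11^5}$. Also,

\[ -\frac{(2^4 \cdot 31)^3}{11^5} = 1728 + \frac{c_6^2}{\Delta}, \]

so

\[ c_6^2 = \Delta\left(-\frac{(2^4 \cdot 31)^3}{11^5} - 1728\right) = \frac{-\Delta}{11^5} \cdot 2^6 \cdot 41^2 \cdot 61^2, \]

which implies that $c_6 = 2^3 \cdot 41 \cdot 61 \cdot \sqrt{-\Delta/11^5}$. Hence, there is an integer $k \neq 0$ such that

\[ \Delta = -11^5k^6, \quad c_6 = 2^3 \cdot 41 \cdot 61\,k^3, \quad \text{and} \quad c_4 = 2^4 \cdot 31\,k^2, \]

as required.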

10.41 For the Frey curve, in the notation of (10.26)–(10.34) on pages 353–354, wehave that a1 = a3 = 0, a2 = ap + cp, a4 = apcp, a6 = 0, b2 = 4(ap + cp),b4 = 2apcp, and b6 = 0, so

c4 = 16(ap+cp)2$48apcp = 16a2p+32apcp+16c2p$48apcp = 16(a2p$apcp+c2p),


and

# = 16(ap + cp)2a2pc2p $ 8 · 23a3pc3p = 16a4pc2p $ 32a3pc3p + 16a2pc4p

= 16(a2pb2pc2p)

„a2p $ 2apcp + c2p

b2p

«= 16(a2pb2pc2p)

„(ap $ cp)2

b2p

«

= 16(a2pb2pc2p)

„(bp)2

b2p

«= 16(a2pb2pc2p),

which verifies (10.46)–(10.47).

If p˛#, then p|abc. Since a, b, c are pairwise relatively prime, p ! c4, so by Ex-

ercise 10.32, (10.45) is minimal at p. Also, we see that if p|ac, then E((mod p))has a node at (0, 0), whereas if p|b, E((mod p)) has a node at (ap, 0). Therefore,

n = 2*Y

p|abc

p

for some nonnegative integer %, so it remains to check for p = 2. Given that anadmissible change of variables “uses up” powers of 4 and 24||c4, then we mayreduce only once. Without loss of generality, assume that c is even. Then

cp ( 0 (mod 32), (S46)

since p + 5. Also, we may assume, without loss of generality, that

ap ( $1 (mod 4), (S47)

since if not then we interchange a and b to get (S47) given that bp ( $ap (mod 4)and p > 2. Now, by setting x = 4X and y = 8Y + 4X as an admissible changeof variables in (10.45), we get

Y 2 + XY = X3 $ 1 + ap + cp

4X2 +

apcp

16X, (S48)

where the coe!cients are integers via (S46)–(S47). Hence, (S48) is global mini-mal. Reducing modulo 2, the right-hand side of (S48) is either X3 or X3 + X2.The sole singular point is at (0, 0), then it must be a node since neither Y 2+XYnor Y 2 +XY +X2 is a square–see Remark 10.14 on page 360. This proves that% = 0, so

n =Y

p|abc

p,

as we sought to show.


Index

SymbolsD(L)

Discriminant of a lattice, 182E(F )

Elliptic curve over F , 302GD

char

Group of Dirichlet characters, 248G2k(z)

Eisenstein series, 337G!F

Genus group, 142L(E, s)

L-function for an elliptic curve, 363M(x)

Merten’s function, 222Mk($)

Space of modular forms of weight k,342

Mk($0(n))Space of modular forms of weight k

and level n, 343M0

k ($)Space of cusp forms of weight k, 342

N(I)Norm of an ideal I, 118, 292

NF (!)Norm from F , 18

Sk($0(n))Space of cusp forms of weight k and

level n, 343Un,Vn

Lucas functions, 272X0(n)

Compact Riemann surface, 363#(E(F ))

Discriminant of an elliptic curve, 302#(L)

Discriminant of a lattice, 349#(z)

Discriminant function, 340

#F

Discriminant of a quadratic field, 7Up

p-adic units, 246F!p

Multiplicative group of Fp, 268$0(n)

Hecke subgroup of $, 342$

Modular group, 332$(n)

Principal congruence subgroup of $,347

$(s)Gamma function, 224

2(s)Imaginary part of s ! C, 332

&(n)Mangoldt function, 377

&k(n)Generalized Mangoldt function, 378

"n(x)Cyclotomic polynomial, 12

QRational number field, 2

Q("p)Prime cyclotomic field, 286

Qp

Field of p-adic numbers, 235/(s)

Real part of s ! C, 218Z

Rational integers, 1!˛#Division in OF , 8

! ' #Associates, 19

)(z)Dedekind-eta function, 341

A



Ring of all algebraic integers, 3P

Set of all primes, 370C+

OF

Narrow ideal class group, 110COF

Ideal class group, 109!(I)

Number of idealresidue classes, 293

D(S)Dirichlet density, 263

Op

Ring of p-adic integers, 236H!

h 3Q 3 {0}, 361H

Upper half complex plane, 332OF

Ring of integers in F , 4UR

group of units in R, 2li(x)

Logarithmic integral, 223o

Point at infinity, 302µ(n)

Mobius function, 214m

Residue class of m, 1325(d)

Number of distinct prime divisorsfunction, 370

QField of all algebraic numbers, 2

-(n)The Euler totient, 214

,(x)Number of primes * x, 221

,(x; k, +)Number of primes p ( +(mod k)

with p * x , 373exp(x)

ex, 261$(n)

Sum of divisors function, 212'(n)

Number of divisors function, 208C

Riemann sphere, 331

.D

Fundamental unit of a real quadraticfield, 259

|S| < 0Finite cardinality, 183

||Proper division, 102

2Weierstrass 2-function, 349

6(s)Completed zeta function, 226

"F (s)Dedekind-zeta function, 256

"(s)Riemann’s zeta function, 218

"n

Primitive root of unity, 2f(n) ' g(n)

Asymptotically equal, 200f = (a, b, c)

Binary quadratic form, 97f = O(g)

Big O notation, 209f 4 g

Dirichlet composition, 107h+

OF

Narrow ideal class number, 113hD

Number of positive definite forms ofdiscriminant D, 102

hOF

Wide ideal class number, 113j(z)

j-invariant, 340m!,F (x)

Minimal polynomial, 10o(x)

Little oh function, 163SL(2, Z)

Special linear group, 98GL(n, Z)

General linear group, 66PSL(2, R)

Projective special linear group overR, 332

PSL(2, Z)Modular group, 332


Subject

AABC conjecture, 296

implies Erdos–Woods, 297implies Fermat–Catalan, 297implies FLT, 296implies generalized Tijdeman, 296implies Hall’s conjecture, 296implies infinitely many Wieferich

primes, 297implies Thue–Siegel–Roth, 296implies weak Erdos–Molllin–Walsh,

299Absolute convergence, 218Absolute value

Archimedean, 233identical, 234non-Archimedean, 233on a field, 233trivial, 234unity, 234valuation, 233

ACC, 71Additive reduction, 361Admissible change of variables, 355A!ne plane, 302Albers, D.J., 393Algebraic

closure, 9conjugate, 7geometry, 342independence, 177integer, 1, 2

associate, 19relatively prime, 22

number, 2height, 162

number field, 2over a field, 9varieties, 342

Algorithmelliptic curve, 391number field sieve, 386Pollard’s rho, 391

Almost primes, 380Alter, R., 393Ambiguous

classof forms, 119

form, 119ideal, 118

Analyticcontinuation, 219

L-functions, 254density, 263function, 218

Apery’s constant, 172Araki, K., 399Archimedean

absolute value, 233Archimedes, 233Arendt, Hannah, 310, 347Arithmetic function

average order, 208mean value, 216multiplicative, 226

Artin conjectureon "-functions, 259on primitive roots, 369

Artin, Emil, 87Artinian rings, 85, 87Ascending chain condition, 71Assigned values of characters, 131Associate, 19Asymptotic density, 263Asymptotic sieve, 379Asymptotically equal, 200Atkin, A., 395Automorphic function, 347Average order, 208

Euler’s totient, 214number of divisors, 210sum of divisors function, 212

BBabbage, Charles, 229, 247Bachet’s equation, 47, 282Bacon, Francis, 252, 263Bad reduction index, 357Baker, A., 393Ball in Rn, 183Beatty’s Theorem, 264Beatty, Samuel, 393Bernoulli

equation, 207lemniscate, 207numbers, 192

and the Riemann "-function, 198recursion formula, 206


polynomial, 192derivative, 206Fourier series, 196

Bernoulli, Jacob, 207Bernoulli, Johann, 207Bernoulli, Nicolaus and Margaretha, 207Beta function, 260Beukers, F., 298Big O notation, 209Binary quadratic form, 97

equivalence modulo prime, 155inverse, 108one class per genus, 143opposite, 108

Birch, B., 395Blake, William, 218Blanschke’s theorem, 186Bolzano–Wierstrass theorem, 238Bombieri’s asymptotic sieve, 379Bombieri, E., 380, 393Bombieri–Vinogradov theorem, 378Bounded

function on C, 352sequence, 238set, 186

Brauer, Richard, 73Breuil, C., 393Brun’s constant, 371Brun’s Theorem, 371Brun, Viggo, 371Brun–Titchmarsh theorem, 374Bugeaud, Yann, 393

CCantor, Georg, 165Cardano’s formula, 303Cardano, Girolamo, 304Cardinal number, 163Catalan’s constant, 178Catalan, Charles, 294Cauchy

sequence, 234p-adic, 234equivalent, 235null, 235

Cauchy, Augustine-Louis, 239Chan, Raymond, 203Character

Dirichletmodulo N , 247

number of, 249orthogonality, 249

genericform, 131

group, 268multiplicative, 268principal, 248, 268

Chatland, H., 50, 394Chen’s theorem, 375Chinese Remainder Theorem, 84Clark, D.A., 394Class number

form, 113ideal

narrow, 113wide, 113

Clemens, Samuel Langhorne, 159Closed set, 219Coleridge, Samuel Taylor, 369Collinear points, 309Common divisor in OF , 8Compact Riemann surface, 361Complementary sequences, 265Completed zeta function, 226Completely multiplicative function, 227Completing Q, 236Complex lattice, 348Complex torus, 361Conductor as isogeny invariant, 361Congruence

modulo an ideal, 292residue class, 292

modulo and ideal, 85Congruent number, 329Conjugate

field, 91ideal, 63of an element, 91over F , 91over a number field, 91

Conrad, B., 393, 394Convenient numbers, 145Convergent series, 218Convex set, 186Countable set, 163Cox, D.A., 394Crandall, Richard, 394Cromwell, Oliver, 205Cunningham numbers, 381Cusp, 332, 358


Cusp form, 336Cyclotomic

field, 286integers, 2polynomial, 11

irreducibility, 12

DDarmon, H., 394Davenport, H., 50, 54, 160, 394DCC, 85Dedekind

)-function, 341cuts, 46domain, 71zeta-function, 256

Dedekind, J.W.R., 46Dense set, 219Density

analytic, 263asymptotic, 263Dirichlet, 263natural, 263

dePillis, John, 67, 394Descartes, Rene, 207Descending chain condition, 85Deuring, Max, 141, 146Diamond, F., 393, 394Dictionary ordering, 180Diophantine

approximation, 159equation

Ramanujan–Nagell, 13set, 295

primes, 295Diophantine equations

of the formxp + yq = zr, 295x2 + 2209 = 17n, 281x2 + 225 = 173m, 281x2 + 2 = p3m, 281x2 + 43 = 473m, 281x2 + 49 = 533m, 281x2 + 5 = p3m, 281x2 $D = pn, 13, 276x2 $Dy2 = pd, 279xn + yn = zn, 353xp + yp + zp = 0, 286y2 = x3 + ax + b, 302y2 = x3 + k, 282

y2 = x3 $ 432, 303Diophantine sets, 295Direct sum, 183Dirichlet

L-functions, 252convergence, 252

character, 247number, 249orthogonality, 249

class number formula, 141composition, 107density, 263divisor problem, 212series, 227theorem

primes and density, 265primes in arithmetic progression,

258Dirichlet, Peter, 141Discrete log

elliptic curve, 326MOV attack, 327nonsupersingular, 327ordinary, 327

Discrete sets, 183Discriminant

elliptic curve, 302form, 99function

and Dedekind-), 341for modular forms, 340

lattice, 349of a lattice, 182quadratic field, 7

radicand, 121quadratic polynomial, 7

Divergent series, 218Domain

p-adic, 236ACC, 71factorization, 37fundamental, 182integral

Dedekind, 71Noetherian, 71

unique factorization, 37Doubly periodic functions, 348Doyle, Sir Arthur Conan, 55, 129Dumas fils, 276, 282Dumas Pere, 276


Dumas, Alexandre, 276Dyson, Freeman, 155

EECC

Menezes-Vanstone, 327MOV threshold, 327

Eddington, Arthur, 282Eichler–Shimura theory, 363Einstein, Albert, 271Eisenstein

series, 337weight k, 339

Elementary symmetric polynomial, 180Elkies, N.D., 394Elliott–Halberstam conjecture, 378Elliptic curve, 302

L-function, 363X0(n)

genus, 364Q-structure, 362j-invariant, 350additive reduction, 361admissible change of variables, 355and modular forms, 347complex torus, 361conductor

isogeny invariant, 361cusp, 358discrete log, 326

MOV reduction, 327discriminant, 302ECC

Menezes-Vanstone, 327Eichler–Shimura theory, 363Frey curve, 353global Weierstrass equation, 353good reduction index, 357good reduction sequence, 357good reduction table, 357Hasse’s bounds, 319integer point, 312isogenous, 367isogeny, 367minimal equation, 356modular, 359

parametrization, 363Eichler–Shimura theory, 363

MOV attack, 327MOV threshold, 327

multiplicative reduction, 360node, 358nonsingular, 357nonsupersingular, 327order

of a point, 310ordinary, 327point

infinite order, 310point at infinity, 302primality test, 322primes of bad reduction, 357primes of good reduction, 357rank, 310reduction modulo p, 313Ribet’s theorem, 365semi-stable reduction, 360Shimura–Taniyama–Weil Conjec-

ture, 360singular point, 357stable reduction, 360STW

in terms of L-functions, 364in terms of modular parametriza-

tions, 363supersingular, 327torsion point, 310

trivial, 310torsion subgroup, 311twist, 355unstable reduction, 361Weierstrass equations, 348Weil curve, 363Weil Pairing, 327

Elliptic function, 347doubly periodic, 348lattice over C, 348Liouville’s theorem, 352period, 348

Emerson, Ralph Waldo, 301Enumerable set, 163Equipotent sets, 163Equivalence

class of forms, 98relation, 238

Equivalentforms, 98valuations, 236

Eratosthenes’ sieve, 370Erdos, Pal, 394


Erdos–Mollin–Walsh conjecture, 297Erdos–Woods conjecture, 297ERH, 255Euclidean

algorithmGaussian integers, 23

domain, 32function, 32norm, 34

Eulerconstant, 172convenient numbers, 145generalization of Fermat, 248ideal theorem, 293product

and L-functions, 252totient, 214

average order, 214Euler’s identity for e, 175Euler–Maclaurin summation formula, 193Euler–Mascheroni constant, 203

FFactoring, 88

elliptic curve, 317of F5, 92of F6, 96of F7, 95of F8, 391of F9, 391of F10 and F11, 391Pollard’s algorithm, 94using cubic integers, 92using number field sieve, 386

Factorization domain, 37Faltings, G., 394Fermat

equation, 41general, 41prime, 286

last theorem, 41first case, 286proof for p = 3, 41

little theoremfor ideals, 293

Fermat’s last theoremproof, 365

Fermat–Catalan conjecturefrom ABC, 297

Ferrari, Ludovico, 304

Ferro, S. del, 304Field

p-adic, 236p-adic numbers, 235absolute value, 233algebraic closure, 9algebraic integer extension, 10conjugate, 91discriminant, 7number, 2of quotients, 70polynomial, 91prime cyclotomic, 286simple extension, 3

Flaubert, Gustave, 182FLT, 41

proof, 365regular primes

case I, 291case II, 291

Form
  ambiguous, 119
  and ideals, 107
  assigned values, 131
  binary quadratic
    last coefficient, 97
    leading coefficient, 97
    middle coefficient, 97
  character, 131
  class number
    finiteness, 116
  composition
    Dirichlet, 107
  discriminant, 99
  equivalence class, 98
  equivalence modulo a prime, 155
  generic characters of, 131
  improper equivalence, 98
  indefinite, 99
  modular, 336
    q-expansion, 337
    and Eisenstein series, 337
    and elliptic curves, 347
    cusp, 336
    functional equation, 336
    of weight k, 336
    parabolic, 336
    space, 342
    unrestricted, 336
  negative definite, 99
  positive definite, 99
  primitive, 97
  principal, 100
  proper equivalence, 98
  proper representation, 98
  quadratic
    equivalent, 98
  reduced, 100
  representation, 98
  united, 107

Fourier
  series, 194
    Bernoulli polynomials, 196
    history, 197
Fourier, J.B.J., 197, 228
Fractional linear transformation, 332
Fractional ideal, 75
  inverse, 76
  invertible, 76
Free abelian group, 189
Frey curve, 353
Frey, G., 353, 395
Friedlander, John, 381, 395
Friedlander–Iwaniec theorem, 379
Frobenius, Georg, 104
Full lattice, 182
Function
  analytic, 218
  arithmetic, 191
    completely multiplicative, 227
    multiplicative, 226
  asymptotically equal, 200
  automorphic, 347
  beta, 260
  Dedekind-η, 341
  doubly periodic, 348
  elliptic
    lattice over C, 348
    modular, 347
    period, 348
  Euler totient, 214
  holomorphic, 218
  identically zero, 337
  little oh, 163
  Möbius, 214
  meromorphic, 219
  modular, 336
    weakly, 336
  number of divisors, 208
  Riemann zeta, 218
    real zeros, 227
    trivial zeros, 227
  singularity, 348
  sum of divisors, 212
  Weierstrass ℘-functions, 349
  zeta

    completed, 226
Functional equation for Γ(s), 225
Fundamental
  domain, 182, 335
  parallelotope, 182
    volume, 182
  unit, 259

G
Galileo Galilei, 240, 243
Galois, E., 126
Gamma function, 224
  functional equation, 225
  Legendre’s duplication formula, 226
Gauss
  lemma on polynomials, 168
  sum, 269
Gaussian
  integer
    gcd, 21
    odd, 25
    parity, 25
    primary, 30
    quotient, 20
    remainder, 20
  prime, 19
GCD
  algebraic integers, 21
  ideals, 79
Gel’fond constant, 178
Gel’fond–Schneider constant, 178
Gel’fond–Schneider theorem, 166
General linear group, 66
Generalized Mangoldt function, 378
Generalized Riemann hypothesis, 255
Genus, 132, 364
  duplication theorem, 143
  group, 142
  of a coset, 135
  of forms, 132
  principal, 132
  squaring theorem, 143
Geometry of numbers, 182
Gilbert, W.S., 171


Goethe, Johann Wolfgang von, 247
Goldbach conjecture, 369
  Selberg sieve, 376
Goldston, D.A., 395
Goldwasser, S., 395
Goldwasser–Kilian primality proving algorithm, 324
Good reduction index, 357
Good reduction sequence, 357
Graded algebra, 342
Granville, Andrew, 394, 395
Granville–Langevin conjecture, 297
Greatest common divisor
  Euclidean domain, 33
  ideals, 79
Group
  basis, 182
  free abelian
    direct sum, 183
  generator, 182
  ideal
    narrow class, 110
    strict class, 110
  modular, 332
  presentation, 334

H
Hall’s conjecture, 296
Hall, Marshall, 296, 395
Harmonic analysis, 194
Harris, Robert, 47, 395
Hasse, Helmut, 73, 146
Heath-Brown, D., 395
Hebraeus, Leo, 294
Hecke congruence subgroups, 342
Hecke, Erich, 344
Heegner, Kurt, 141, 395
Height of algebraic numbers, 162
Heilbronn, H., 50, 141, 396
Hensel’s lemma, 230
Hepburn, Audrey, 88
Hermite normal form of ideals, 118
Hermite’s formula, 208
Hermite, Charles, 126, 128
Hilbert’s tenth problem, 295
Hilbert’s Theorem 90, 127
Hilbert, David, 31, 104, 105, 127, 146
Hobbes, Thomas, 205
Hofreiter, N., 50, 396
Holomorphic
  at 0, 337
  function, 218
Homogeneous polynomial, 330
Homothetic lattice, 352
Hurwitz ζ-function, 261
Huxley, M.N., 212
Huygens, Christian, 175

I
Ideal, 55
  gcd, 79
  lcm, 79
  ambiguous, 118
  Chinese remainder theorem, 84
  class group, 109
  class number
    finiteness, 117
    wide, 113
  congruence, 85, 292
    residue class, 292
  conjugate, 63
  descending chain condition, 85
  divide, 57
  Euler’s theorem, 293
  Fermat’s little theorem, 293
  finitely generated, 70
  first-degree, 388
  fractional, 75
    inverse, 76
    multiplicative group, 86
  greatest common divisor, 79
  group
    principal class, 109
  Hermite normal form, 118
  invertible, 276
    fractional, 76
  irreducible, 81
  least common multiple, 79
  maximal, 67
    fields, 68
    quotients, 68
  narrow class number, 113
  norm, 256
  order with respect to a prime, 86
  PID, 82
  prime, 57
  primitive, 65
  principal, 56
  products, 57


  proper, 56
  quadratic
    multiplication formulas, 59
    norm, 118
  relatively prime, 79
  smooth, 388
  unique factorization theorem, 77
Identical absolute value, 234
Identically zero function, 337
Improper equivalence of forms, 98
Indefinite form, 99
Infimum, 379
Integer
  p-adic, 236
  algebraic, 2
  cyclotomic, 2
  point, 312
  smooth, 93
Integral
  closure, 70
  domain, 9
    ascending chain condition, 71
    Dedekind, 71
    Noetherian, 71
  over a domain, 9
  polynomial, 1
Integration by parts, 172
Invariant for elliptic curves, 350
Invertible fractional ideal, 76
Invertible ideal, 276
Irreducible, 37
  elements, 37
  ideal, 81
Irregular prime, 291
Isogenous curves, 367
Isogeny, 367
Isogeny invariant, 361
Iwaniec, H., 382, 395

J
Jacobi, Carl, 140
Jacobites, 204
Jeanneret, Charles-Edouard, 252
Jones, J.P., 396

K
Kilian, J., 395
Klein, Felix, 104, 128
Knapp, A.W., 396
Koblitz, N., 396
Kraitchik, Maurice, 384
Kronecker’s lemma, 217
Kubota, A.A., 393
Kummer, Eduard, 124

L
L-Function, 252
  analytic continuation, 254
  and Euler products, 252
  complex
    nonvanishing, 256
  convergence, 252
  elliptic curve, 363
  real
    nonvanishing, 257
Landau, Edmund, 50, 104, 396
Landau–Siegel zeros, 300
Lang, Serge, 87
Large sieve, 377
  Artin’s conjecture, 377
Last coefficient, 97
Lattice, 182
  discriminant, 182, 349
  elliptic function
    period, 348
  full, 182
  homothetic, 352
  in C, 348
    period, 348
  invariant, 349
Le Corbusier, 252
Leading coefficient, 97
Least common multiple
  ideals, 79
  in UFDs, 40
Lebesgue measure, 183
Legendre’s duplication formula, 226
Lehmer, D.H., 272
Leibniz’ formula, 172
Leibniz, G.W. von, 175
Lemniscate of Bernoulli, 207
Lenstra, A.K., 396
Lenstra, H.W., 318, 396
Level of a modular form, 343
Lindemann, Carl, 127
Lindemann–Weierstrass theorem, 177
Linearly independent, 166
Linfoot, E.H., 141
Linnik, Yu V., 384, 396


Liouville
  boundedness theorem, 352
  number, 171
  theorem, 160
  theorem on elliptic functions, 352
Liouville, Joseph, 168
Liouvillian number, 168
Little oh function, 163
Logarithmic integral, 223
Lord Byron, 229, 247, 252
Lovelace, Ada, 229
Lucas functions, 272
  properties, 272
Lucas, E., 272

M
Möbius function, 214
Möbius transformation, 332
Mahler, Kurt, 181, 396
Malebranche doctrine, 207
Malebranche, Nicholas, 207
Manasse, M.S., 396
Markov
  conjecture, 123
  number, 123
Markov, Andrei, 126
Mascheroni, L., 203
Masser, David, 296
Mathieu, Claude, 168
Matiyasevich, Y., 397
Matrix
  general linear group, 66
  unimodular, 66
Maximal order, 59
Mazur’s theorem, 312
Mazur–Kamienny conjecture, 312
MCBT, 182
Mean-value theorem, 160
Menezes, A., 397
Menezes–Vanstone ECC, 327
Merel, L., 397
Meromorphic function, 219
Mertens’ constant, 222
Mertens’ function, 222
Mertens’ Theorem, 222
Middle coefficient, 97
Mihăilescu’s theorem, 294
Mihăilescu, Preda, 294
Miller, V., 397

Milton, John, 191
Minimal equation for elliptic curves, 356
Minimal polynomial, 9, 10
Minkowski, H., 190
Minkowski, Hermann, 104
Modular
  elliptic curves, 359
  form, 336
    j-invariant, 340
    q-expansion, 337
    and Eisenstein series, 337
    applications to elliptic curves, 347
    cusp, 336
    discriminant function, 340
    functional equation, 336
    level, 343
    normalized, 343
    of weight k, 336
    space of, 342
    unrestricted, 336
  function, 336
    elliptic, 347
    weakly, 336
  group, 332
    fundamental domain, 335
    principal congruence subgroup, 347
  parametrization, 363
Module, 65
Mollin, R.A., xiii, 397, 398
Mordell, L.J., 398
Mordell, Louis Joel, 315, 398
Mordell–Weil theorem, 310
Moreno, C.J., 398
MOV
  attack, elliptic curve, 327
  reduction, discrete log, 327
  threshold, 327
Multiplicative reduction, 360

N
Nagell–Lutz theorem, 311, 312
Narrow ideal class group, 110
Narrow ideal class number, 113
Natural density, 263
Nearest integer function, 19
Nicely, Thomas R., 372
Node, 358
Noether’s Theorem, 73
Noether, Emmy Amalie, 46, 73
Noetherian domain, 71
Non-Archimedean absolute value, 233
Nonsupersingular elliptic curve, 327
Norm
  element
    number field, 91
  ideal, 256
  of an ideal, 118
  quadratic, 18
  quadratic ideal, 66
Norm-Euclidean, 34
Normalized cusp form, 343
Null sequence, 235
Number
  algebraic, 2
  congruent, 329
  field, 2
    conjugate, 91
    norm, 91
  field sieve
    general, 382
    special, 382
  p-adic, 235
  real, 235
Number of distinct prime divisors, 370
Number of divisors
  average order, 210
  function, 208

O
Oesterlé, Joseph, 296
Okamoto, T., 397
Oppenheim, A., 398
Opposite, binary quadratic form, 108
Ordinary elliptic curve, 327
Ordinate, 312
Ostrowski’s Theorem, 240
Ostrowski, Alexander Markowich, 242

P
p-adic
  absolute value
    strong triangle inequality, 238
  Cauchy sequence, 234
  convergence, 235
  fields, 236
  integers, 236
  number, 231, 235
  periodic representation, 244
  representation as power series, 244
  solution, 231
  valuation, 234
Parabolic form, 336
Parallelepiped, 182
Parallelotope
  fundamental, 182
  volume, 182
Parity problem, 379
Pascal, Blaise, 294
Pasteur, Louis, 286
Pentium chip flaw, 372
Period of elliptic function, 348
Periodic p-adic representation, 244
Perron, O., 50, 53, 128, 398
PID, 82
Pintz, J., 395
Poincaré, Charles, 128
Poincaré, Henri Jules, 97, 147
Point
  at infinity, 302
  finite order
    elliptic curve, 310
  infinite order
    elliptic curve, 310
  order
    elliptic curve, 310
  torsion
    elliptic curve, 310
Pollard’s algorithm, 94
Pollard, J.M., 396
Polynomial
  cyclotomic
    irreducibility, 12
  elementary
    fundamental theorem, 180
  elementary symmetric, 180
  field, 91
  homogeneous, 330
  integral, 1
  minimal
    height, 162
  symmetric, 180
Pomerance, C., 394
Positive definite form, 99
Powerful number, 297
Presentation of a group, 334
Primality proving algorithm, 324
Primality tests
  elliptic curve, 322


  Goldwasser–Kilian, 324
  primality proving, 324
Primary integer, 30
Prime, 37
  almost, 380
  as sums of two squares, 26
  Brun’s theorem, 371
  cyclotomic fields, 286
  Diophantine set, 295
  Dirichlet density, 265
  element
    in a number field, 37
  Gaussian, 19
  ideal, 57
    first-degree, 388
  in arithmetic progression, 258
  inert
    in quadratic fields, 63
  irregular, 291
  number less than a bound, 371
  number theorem, 221
  of bad reduction, 357
  of good reduction, 357
  ramified
    in quadratic fields, 63
  regular, 286
  split
    in quadratic fields, 63
  the p = n² + 1 conjecture, 369
  the q = 4p + 1 conjecture, 369
  Wieferich, 297
Primitive
  form, 97
  ideal, 65
  root
    Artin’s conjecture, 369
  root of unity, 2
Principal character, 248, 268
Principal congruence subgroup, 347
Principal forms, 100
Principal ideal domain, 82
Probability
  law of large numbers, 207
  relative primality, 215
Projective
  space
    point at infinity, 302
Projective geometry, 302
Proper representation, 98
Properly equivalent forms, 98
Prouhet, P., 178
Prouhet–Thue–Morse constant, 178
Pythagorean triple, 329
  primitive, 329

Q
Quadratic
  field
    fundamental unit, 259
    norm-Euclidean, 34
    radicand, 121
    regulator, 259
  form
    integral binary, 97
  integers
    conjugate, 18
    norm, 18
Quotient field, 70

R
Rück, H.-G., 395
Rabinowitsch criterion, 153
Rabinowitsch, G., 154
Rabinowitsch–Mollin–Williams criterion, 153
Radical of n, 296
Radicand, 121
Ramanujan
  τ-function, 341
Ramanujan–Nagell equation, 13
  generalized, 13
Rank
  elliptic curve, 310
Real number, 235
Reduced forms, 100
Regular prime, 286
Regulator, 259
Relatively prime ideals, 79
Remak, R., 50, 53, 398
Representation problem, 98
Residue class modulo an ideal, 292
Ribet’s theorem, 365
Ribet, K.A., 399
Richard, Louis, 126
Riemann
  ζ-function, 198
  ζ-function, 218
    real zeros, 227
  hypothesis, 223
    generalized, 255
  sphere, 331
  zeta function
    trivial zeros, 227
Riemann, B., 146
Ring
  isomorphism theorem, 292
  of algebraic integers, 3
  of integers, 4
Roth’s Theorem, 160
Roth, K.F., 160, 394
Russell, Bertrand, 148

S
Sagan, Carl, 32
Saint Augustine, 207
Santayana, George, 208
Sato, D., 396
Satoh, T., 399
Schanuel’s conjecture, 178
Schinzel, Andrzej, 382
Selberg
  condition, 372
  sieve, 373
    Brun–Titchmarsh theorem, 374
Selberg, A., 373, 399
Semaev, I., 399
Semi-stable reduction, 360
Sequence
  p-adic
    Cauchy, 234
  p-adic convergence, 235
  Cauchy, 234
    equivalent, 235
  complementary, 265
  convergence
    uniform, 261
  null, 235
Series
  L-functions, 252
    and Euler products, 252
  p-adic representation, 244
  absolutely convergent, 218
  convergence
    uniform, 261
  convergent, 218
  Dirichlet, 227
  divergent, 218
  sum, 218
Serre, Jean-Pierre, 399
Set
  bounded, 186
  closed, 219
  convex, 186
  countable, 163
  dense, 219
  Diophantine, 295
  discrete, 183
  enumerable, 163
  equipotent, 163
  infimum, 379
  translate, 186
  uncountable, 163

Shakespeare, William, 118
Shimura–Taniyama conjecture, 225
Shimura–Taniyama–Weil, 360
Shorey, T.N., 393
Siegel zero, 300
Siegel’s theorem, 313
Siegel, Carl Ludwig, 170
Sieve
  almost primes, 380
  Bombieri
    asymptotic sieve, 379
  Bombieri–Vinogradov theorem, 378
  Brun’s constant, 371
  Elliott–Halberstam conjecture, 378
  Eratosthenes, 370
  Friedlander–Iwaniec theorem, 379
  large, 377
    Artin’s conjecture, 377
    Bombieri–Vinogradov, 378
    Linnik, 377
  methods, 369
  number field
    general, 382
    special, 382
  parity problem, 379
  Pentium chip flaw, 372
  Selberg, 373
    Brun–Titchmarsh theorem, 374
    Goldbach conjecture, 376
    twin-prime conjecture, 375
    upper bound, 373
  Selberg’s condition, 372
  twin primes
    Chen’s theorem, 375
Sieving, 369
Silverman, J.H., 399
Simple extension field, 3
Singular point, 357


Singularity, 348
Smart, N., 399
Smooth integers, 93, 385
Solinas, J., 399
Spearman, B.K., 399
Sphere in Rⁿ, 183
Spinoza, Baruch, 318, 331
Squarefree kernel of n, 296
Squares
  sums of two, 26
Squaring the circle, 178
Stable reduction, 360
Stark, H., 395
Stendhal, Henri Beyle, 1
Steuding, J., 399
Stirling’s formula, 201
Stirling, James, 204
Strict ideal class group, 110
Strong triangle inequality, 238
STW
  L-functions, 364
  conjecture, 353
  modular parametrizations, 363
Sum
  Gauss, 269
  of divisors function, 212
Supersingular elliptic curve, 327
Symmetric
  polynomial, 180
  set, 186
Szemerédi’s theorem, 160

T
Tartaglia, Niccolò, 304
Tate, J., 87, 399
Taylor, R., 393, 394, 399
Theorem 90 by Hilbert, 127
Thue, A., 160, 169, 170
Thue–Siegel–Roth Theorem, 160
Thue–Morse sequence, 178
Tijdeman’s theorem, 296
Tijdeman, R., 294
Titchmarsh divisor problem, 378
Titchmarsh, E.C., 376, 400
Torsion
  group
    elliptic curve, 311
  point
    elliptic curve, 310
    trivial, 310
Trace
  element
    number field, 91
Transcendence, 171
  of π, 175
  of e, 172
Transcendental number, 162
Translate of a set, 186
Trivial absolute value, 234
Trivial zeros of zeta function, 227
Twain, Mark, 159
Twin prime conjecture, 369
Twin-prime conjecture
  Selberg sieve, 375
Twist of elliptic curves, 355

U
UBC, uniform boundedness conjecture, 312
UFD, 37
  least common multiple, 40
Uncountable set, 163
Uniform convergence, 261
Unimodular transformation, 98
Unique factorization domain, 37
Unitary absolute value, 234
United form, 107
Units, 2
Unstable reduction, 361

V
Valuation
  p-adic, 234
  absolute value, 233
  equivalent, 236
  over Q, 233
van der Waerden, B.L., 73, 146
Van Frankenhuysen, M., 394
Van Heemstra, Edda, 88
Veldekamp, G.R., 34, 400
Vinogradov, A.I., 400
Vojta, P., 400
von Neumann, John, 317

W
Wada, H., 396
Wagstaff, S.S. Jr., 398
Waldschmidt, M., 400
Wallis, John, 205
Walsh, Peter Garth, 398
Weierstrass
  ℘-functions, 349
  equation, 348
    global, 353
    minimal, 356
  product formula, 346
Weierstrass, Karl, 179
Weil curve, 363
Weil Pairing, 327
Weil, André, 146, 316
Wiener, Norbert, 126
Weyl, Hermann, 31, 400
Wide class number, 113
Wieferich prime, 297
Wiens, D., 396
Wiles, Andrew John, 225, 353, 399, 400
Wiles, proof of FLT, 303
Williams, Hugh Cowie, 154
Williams, K.S., 399
Wintner, Aurel, 400
Wolfskehl equation, 224
Wolfskehl, P., 400

Y
Yee, Alexander, 203
Yıldırım, C.Y., 395

Z
Zagier, D., 298
Zeta-function, 218
  completed, 226
  Dedekind, 256
  Hurwitz, 261
Zukav, Gary, 400
