44
arXiv:2103.03942v2 [math.NT] 3 May 2021 Biases in Moments of the Dirichlet Coefficients in One- and Two-Parameter Families of Elliptic Curves S. J. Miller, Y. Weng May 4, 2021 Abstract - Elliptic curves arise in many important areas of modern number theory. One way to study them is take local data, the number of solutions modulo p, and create an L-function. The behavior of this global object is related to two of the seven Clay Millenial Problems: the Birch and Swinnerton-Dyer Conjecture and the Generalized Riemann Hypothesis. We study one-parameter families over Q(T ), which are of the form y 2 = x 3 + A(T )x + B(T ), with non-constant j -invariant. We define the r th moment of an elliptic curve to be Ar,E(p) := 1 p t mod p at(p) r , where at(p) is p minus the number of solutions to y 2 = x 3 + A(t)x + B(t) mod p. Rosen and Silverman showed biases in the first moment equal the rank of the Mordell-Weil group of rational solutions. Michel proved that pA2,E(p)= p 2 + O(p 3/2 ). Based on several special families where computa- tions can be done in closed form, Miller in his thesis conjectured that the largest lower-order term in the second moment that does not average to 0 is on average negative. He further showed that such a negative bias has implications in the distribution of zeros of the elliptic curve L-function near the central point. To date, evidence for this conjecture is limited to special families. In this paper, we explore the first and second moments of some one- and two-parameter families of elliptic curves, looking to see if the biases persist and exploring the consequence these have on fundamental properties of elliptic curves. We observe that in all of the one- and two-parameter families where we can compute in closed form that the first term that does not average to zero in the second- moment expansion of the Dirichlet coefficients has a negative average. In addition to studying some additional families where the calculations can be done in closed form, we also systematically inves- tigate families of various rank. These are the first general tests of the conjecture; while we cannot in general obtain closed form solutions, we discuss computations which support or contradict the conjecture. We then generalize to higher moments, and see evidence that the bias continues in the even moments. Contents 1 Introduction 2 1.1 Rational Points on a Quadratic Equation ...................... 2 1.2 Introduction to Elliptic Curves ............................ 4 1.3 From Random Matrix Theory to the Birch and Swinnerton-Dyer Conjecture . . . 7 1.4 The Bias Conjecture .................................. 9 1.5 Our results ....................................... 11 2 Tools for Calculating Biases 12 3 Representative one-parameter Family 14 4 Numerical computations for biases in second moments 17 1

S. J. Miller,Y. Weng arXiv:2103.03942v2 [math.NT] 3 May 2021

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

arX

iv:2

103.

0394

2v2

[m

ath.

NT

] 3

May

202

1

Biases in Moments of the Dirichlet Coefficients in One- andTwo-Parameter Families of Elliptic Curves

S. J. Miller, Y. Weng

May 4, 2021

Abstract - Elliptic curves arise in many important areas of modern number theory. One way tostudy them is take local data, the number of solutions modulo p, and create an L-function. Thebehavior of this global object is related to two of the seven Clay Millenial Problems: the Birch andSwinnerton-Dyer Conjecture and the Generalized Riemann Hypothesis. We study one-parameterfamilies over Q(T ), which are of the form y2 = x3 + A(T )x+B(T ), with non-constant j-invariant.We define the rth moment of an elliptic curve to be Ar,E(p) := 1

p

∑t mod p at(p)

r, where at(p) is

p minus the number of solutions to y2 = x3 + A(t)x + B(t) mod p. Rosen and Silverman showedbiases in the first moment equal the rank of the Mordell-Weil group of rational solutions.

Michel proved that pA2,E(p) = p2 +O(p3/2). Based on several special families where computa-tions can be done in closed form, Miller in his thesis conjectured that the largest lower-order termin the second moment that does not average to 0 is on average negative. He further showed thatsuch a negative bias has implications in the distribution of zeros of the elliptic curve L-functionnear the central point. To date, evidence for this conjecture is limited to special families. In thispaper, we explore the first and second moments of some one- and two-parameter families of ellipticcurves, looking to see if the biases persist and exploring the consequence these have on fundamentalproperties of elliptic curves. We observe that in all of the one- and two-parameter families wherewe can compute in closed form that the first term that does not average to zero in the second-moment expansion of the Dirichlet coefficients has a negative average. In addition to studying someadditional families where the calculations can be done in closed form, we also systematically inves-tigate families of various rank. These are the first general tests of the conjecture; while we cannotin general obtain closed form solutions, we discuss computations which support or contradict theconjecture. We then generalize to higher moments, and see evidence that the bias continues in theeven moments.

Contents

1 Introduction 21.1 Rational Points on a Quadratic Equation . . . . . . . . . . . . . . . . . . . . . . 21.2 Introduction to Elliptic Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.3 From Random Matrix Theory to the Birch and Swinnerton-Dyer Conjecture . . . 71.4 The Bias Conjecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.5 Our results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2 Tools for Calculating Biases 12

3 Representative one-parameter Family 14

4 Numerical computations for biases in second moments 17

1

5 Biases in higher moments 225.1 Biases in fourth moment sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225.2 Biases in sixth moment sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235.3 Biases in the Third, Fifth, and Seventh Moments . . . . . . . . . . . . . . . . . . 25

6 Conclusion and Future Work 28

7 Acknowledgement 29

A Representative one-parameter and two-parameter families (By Steven J. Millerand Jiefei Wu) 29A.1 One-Parameter Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29A.2 Two-Parameter Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

B Motivation Behind Studying L-Functions 37

C Forms of 4th and 6th moments sums 38C.1 Tools for higher moments calculations . . . . . . . . . . . . . . . . . . . . . . . . 38C.2 Form of 4th moments sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39C.3 Form of 6th moments sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

Keywords : Elliptic curves, Dirichlet coefficients, L-functions, biases

Mathematics Subject Classification (2020) : 60B10, 11B39, 11B05 (primary) 65Q30 (sec-ondary)

1 Introduction

The distribution of rational points on elliptic curves are not just of theoretical interest, butalso have applications in encryption schemes. While it is often difficult to study one particularcurve, frequently great progress can be made by looking at families of curves and computingaverages. One powerful tool for such calculations are the associated L-functions. In particular,negative biases in the first moment of their Dirichlet coefficients are known in many cases (andconjectured in general) to be related to the rank. Recent investigations suggest that the secondmoment has similar biases, and these have applications to the distribution of the zeros of theirL-functions.

We report on some of these calculations, as well as extensions to higher moments. Wedeliberately take a leisurely approach to make this paper reasonably self-contained, motivatingthe history and background material as problems of this nature are accessible with minimalpre-requisites (for more details on elementary number theory, see for example [Da1, Da2, MT-B,NZM]). Our hope is to encourage others to continue these investigations in related families. Forthose interested in the code, email the authors.

1.1 Rational Points on a Quadratic Equation

For thousands of years, there has been interest in finding integer solutions to equations or systemsof equations with integer coefficients. These are called Diophantine equations’ perhaps the mostfamous is the Pythagorean theorem.

2

Theorem 1.1 (Pythagorean Theorem) If a and b are the sides of a right triangle with hy-potenuse c, then

a2 + b2 = c2. (1)

However, it is not immediately clear that there are any rational solutions, though a searchquickly finds many. It distressed the Greeks that the right triangle with sides of integer length1 and 1 has a hypotenuse of irrational length

√2. By rescaling a rational triple we may assume

that the sides are integral and relatively prime; we call such primitive Pythagorean triples, andcan write down an explicit formula to generate all Pythagorean triples.

Lemma 1.2 (Pythagorean Triples) Given any Pythagorean triple there exist positive inte-gers k, m and n with m > n such that

a = k · (m2 − n2), b = k · (2mn), c = k · (m2 + n2), (2)

where m and n are coprime and not both odd.

a

b

c

Figure 1: A right triangle with sidelength of a, b and c

x

y

(−1, 0) (1, 0)

(0,−1)

(0, 1)

(x, y)(0, t)

Figure 2: A rational parametrization of the cir-cle x2 + y2 = 1

Proof. Finding integer Pythagorean triples (see Figure 1) is equivalent to finding rationalpoints on the unit circle x2 + y2 = 1; just let

x =a

cand y =

b

c. (3)

We now find all rational points on the unit circle. Let (x, y) denote an arbitrary point onthe circle. In Figure 2, we know one rational solution1, (−1, 0). The line through (x, y) withslope t is given by the equation

y = t(1 + x). (4)

Hence, the other point of intersection of the line with the unit circle is

1− x2 = y2 = t2(1 + x)2. (5)

1There are three other obvious rational solutions which we could have used; the standard convention is to usethis one.

3

Dividing each side by the root (1 + x), corresponding to the root x = −1, we get

1− x = t2(1 + x). (6)

Using the above relation, we find

x =1− t2

1 + t2y =

2t

1 + t2. (7)

Thus if x and y are rational numbers, then the slope t = y/(1 + x) is also rational. Conversely,if t is rational then x and y are rational. Hence, by letting t range over the rational numbersnumbers, we generate all the rational pairs on the circle (except (−1, 0) as in this case t isinfinite). �

Since we are able to generate the rational points on a quadratic equation, it is natural tostudy how to generate the rational points on a cubic equation, such as an elliptic curve.

1.2 Introduction to Elliptic Curves

We motivate studying elliptic curves in general by investigating first special cases, arising fromright triangles with area 1. We have the following equation:

1 =1

2ab. (8)

We substitute a = xc and b = yc from (3) and obtain

1 =1

2c2xy. (9)

Plugging in our results from (7) gives

1 =1

2c2(1− t2

t2 + 1

)(2t

t2 + 1

)

=c2

(t2 + 1)2(t− t3). (10)

Divided both sides by c2/(t2 + 1)2, we get

(t2 + 1

c

)2

= t− t3. (11)

If we let Y = (t2 + 1)/c and X = −t we obtain

Y 2 = X3 −X, (12)

which is an equation of an elliptic curve, which we formally define; see [ST] for more details.

An elliptic curve in standard (or Weierstrass) form is the set of points (x, y) satisfying thecubic equation

y2 = x3 + ax+ b, (13)

4

where a, b ∈ Q and the discriminant 4a3 + 27b2 is not zero. This last condition is to avoiddegenerate cases. For example, we do not want y2 = x2(x− 1) to be an elliptic curve; when wesend y to xy we get y2 = x− 1, a parabola. More generally, it is of the form

y2 + a1xy + a3y = x3 + a2x2 + a4x+ a6. (14)

We study two kinds of families of elliptic curves: one-parameter and two-parameter. While wecan change variables to put our curves in standard form, for convenience we often have an x2

term. In the definitions below if we specialize the variables to integers we obtain an ellipticcurve (provided of course that the discriminant is non-zero).

Definition 1.3 (One-Parameter Family of Elliptic Curves) A one-parameter family is ofthe form

E : y2 = x3 +A(T )x+B(T ), (15)

with A(T ), B(T ) ∈ Q[T ], which are polynomials of finite degree and rational coefficients.

Definition 1.4 (Two-Parameter Family of Elliptic Curves) A two-parameter family is ofthe form

y2 = x3 +A(T, S)x+B(T, S), (16)

with A(T, S), B(T, S) ∈ Q[T, S].

One of the reasons that there is such interest in elliptic curves is the following result.

Theorem 1.5 (Mordell’s Theorem) The set of rational points on an elliptic curve is a finitelygenerated group.

Figure 3: Points within the range |x| ≤20 on Rank 0 Elliptic Curve E : y2 =x3 + x2 − 165x + 1427.

Figure 4: Points within the range |x| ≤20 on Rank 1 Elliptic Curve E : y2 =x3 − 4x+ 4.

Figures 3 and 4 demonstrate the addition law for an elliptic curve of rank 0 and an ellipticcurve of rank 1; the “point at infinity” acts as the identity element for addition. As the rank ofthe elliptic curve increases, there are typically more points within a certain range of x.

RSA cryptography, which is based on groups arising from primes or the product of twodistinct primes p and q, (Z/pqZ), was the gold standard in cryptography for years. However, itwas also well-known that if we are able to factor a large number, then we can easily break RSA.Hence, it led to a search for other interesting groups with more complicated structure. Ellipticcurves became the natural candidate because they have a group structure. Two points generate

5

a third, but note that for the Pythagorean triples we only needed to find one point to generatethem all. See [RG] for more details.

Next, we define a characteristic of elliptic curves that is relevant to our paper. Often onecan gain an understanding of a global object by studying a local one. In particular, for a primep we can look at how often we have pairs (x, y) satisfying y2 = x3 + ax + b mod p. As half ofthe non-zero elements of Fp = Z/pZ = {0, 1, 2, . . . , p − 1} are non-zero squares modulo p andthe other half are not squares, it is reasonable to expect that for a randomly chosen x that halfthe time it will generate two solutions modulo p and half the time it will generate zero. Thuswe expect the number of pairs to be of size p, and it is valuable to look at fluctuations aboutthis expected number.

Definition 1.6 (Dirichlet Coefficients) For E an elliptic curve y2 = x3+ax+b and a primep, we define the Dirichlet coefficients aE(p) by

aE(p) := p− |E(Fp)|, (17)

where |E(Fp)| is the number of solutions (x, y) to y2 = x3 + ax+ b mod p with x, y ∈ Fp. Theseare used in constructing the associated L-function to the elliptic curve, L(E, s) =

∑n aE(n)/n

s,which generalizes the Riemann zeta function ζ(s) =

∑n 1/n

s.

There is a very useful formula for aE(p) (if the curve E is clear we often suppress the subscriptadn write a(p) or ap). The Legendre symbol

(ap

)is zero if a is zero modulo p, 1 if a is a non-zero

square modulo p, and −1 otherwise. Thus 1 +(x3+ax+b

p

)is the number of solutions modulo p

for a fixed x. If we sum this over all x modulo p we obtain |E(Fp)|, and thus

aE(p) = −∑

x mod p

(x3 + ax+ b

p

). (18)

Much is known about the a(p)’s. We focus on their size and average behavior, though recentbreakthroughs have determined much more about their distribution.

Theorem 1.7 (Hasse, 1931) The Riemann Hypothesis for finite fields holds if E is an ellipticcurve and p a prime; we have

|aE(p)| ≤ 2√p. (19)

Hasse’s theorem is very similar to the Central Limit Theorem, which itself is an example ofthe philosophy of square-root cancelation: if we have N objects of size 1 with random signs,then frequently the sum is of size 0 with fluctuations on the order of

√N .2 In our setting, we

expect half of the time(x3+A(t)x+B(t)

p

)equals 1, and the other half time it is -1. As we have p

terms of size 1 with random signs, the results should be of size√p, which is Hasse’s theorem.

We often use big-Oh notation to denote sizes of the quantities we study.

Definition 1.8 (Big-Oh Notation) We say f = O(g(x)), read f is big-Oh of g, if there existsan x0 and a B > 0 such that for all x ≥ x0 we have |f(x)| ≤ Bg(x).

2Results such as these are correct up to the power of N but can miss logarithms;

6

Last but not least, we define some other important characteristics of elliptic curves. As theaEt(p) only depend on t mod p by (18), for a fixed prime p we only need to study specializationsof T modulo p.

Definition 1.9 (Moment of a One-Parameter Family) Let E be a one parameter familyof elliptic curves y2 = x3 + A(T )x + B(T ) over Q(T ), with Et the specialized curves. For eachpositive integer r, we define the rth moment:

AE,r(p) :=1

p

t mod p

aEt(p)r. (20)

There is a natural extension to two-parameter families, where we sum over s and t modulop.

We conclude with one final concept and result, the rank.

Definition 1.10 (Geometric Rank of E) The group of rational solutions of an elliptic curveE, denoted E(Q), can be written as r copies of Z plus a finite torsion part:

E(Q) ∼= Zr × E(Q)tors; (21)

r is the geometric rank of E.

The analytic rank of E is the order of vanishing of the associated L-function at the centralpoint. Similar to many other problems in mathematics, frequently one of these objects is easierto study than the other, and the hope is that there is a connection between them. This is truefor elliptic curves; we turn to this next, and see the key role played by the aE(p)’s.

1.3 From Random Matrix Theory to the Birch and Swinnerton-Dyer Conjecture

The Birch and Swinnerton-Dyer conjecture is one of the seven Clay Millenial Problems; thesewere formulated in the spirit of Hilbert’s successful list from the start of the twentieth century,and are meant to inspire and highlight important mathematics. It is based on a L-functionof an elliptic curve, connecting analysis to geometry, two great different fields of mathematics.Before stating it, we first describe some of the problems and methods of modern number theoryto motivate both why we care about this conjecture, as well as the main topic of this paper. Formore on this story, see [BFMT-B, FM].

Given the inability to theoretically describe the energy levels of atoms more complicatedthan hydrogen, due to the complexities of the mathematics, physicists developed statisticalapproaches. Based on extensive numerical data, Wigner proposed that one could model nuclearphysics through random matrices; that the behavior of eigenvalues in these matrix ensemblesdescribed the behavior of energy levels. Briefly, from quantum mechanics we can write downthe equation

Hψn = Enψn, (22)

where H is the Hamiltonian, ψn are the energy eigenstates and En the energy levels. Unfor-tunately the nuclear forces leading to the operator H are too complicated to allow us to writeit down and analyze it explicitly. Wigner’s great insight was to instead consider a probabilitydistribution on N × N matrices, calculate averages over these ensembles, scale them appropri-ately so that limiting behavior exists as N → ∞, and the appeal to a central limit theorem type

7

of law to show that a typical operator will have behavior close to the system average. Thesepredictions were confirmed experimentally in studies of the energy levels of heavy nuclei.

Amazingly, similar results were found in the spacings between the zeroes of the Riemann Zetafunction, which connects integers to primes and helps us understand the mysterious distributionof primes, seem to follow the RMT prediction too.

Definition 1.11 (Riemann Zeta Function) For Re(s) > 1

ζ(s) :=∞∑

n=1

1

ns=

p prime

(1− 1

ps

)−1

. (23)

The zeta function is defined as the sum over integers above, but its utility comes fromthe product expansion (which follows immediately from the geometric series formula and thefundamental theorem of arithmetic, which states each integer can be written uniquely as aproduct of prime powers in increasing order). Initially defined only for Re(s) > 1, the zetafunction can be analytically continued to the entire complex plane with a simple pole of residue1 at s = 1:

ξ(s) := Γ(s2

)π−

s2 ζ(s) = ξ(1− s), (24)

where the line Re(s) = 1/2 is the critical line, and s = 1/2 the central point. The Riemannhypothesis states all the non-trivial zeros of ζ(s) have real part equal to 1/2 (due to the presenceof the Gamma factor, ζ(s) vanishes at the negative even integers). By doing a contour integralof the logarithmic derivative of ζ(s) and shifting contours, one obtains the Explicit Formula,which relates a sum over zeros to a sum over prime. Figures 5 and 6 show similar behavior inthe spacings between energy levels of heavy nuclei and spacings between zeros of ζ(s).

Figure 5: A Wigner distribution fit-ted to the spacing distribution of 932s-wave resonances in the interactionUranian + n at energies up to 20 keV.

Figure 6: 70 million spacings betweenadjacent zeros of ζ(s), starting at the1020th zero. The solid curve is theRMT prediction for the GUE ensem-ble, and the dots are the zeta zeros(from Odlyzko).

We now move on to discuss other L-functions; for more on these see for example [Kn]. Withthe normalization below, the critical strip is 0 < Re(s) < 2, and the functional equation of thecompleted elliptic curve L-function relates values at s to those at 2− s.

8

Definition 1.12 (L-function) The Hasse-Weil L-function of an elliptic curve E : y2+a1xy+a3y = x3 + a2x

2 + a4x+ a6 with coefficient aE(p) and discriminant ∆,

∆ := −b22b8 − 8b43 − 27b6

2 + 9b2b4b6, (25)

where b2 = a12 + 4a4, b4 = 2a4 + a1a3 and b6 = a3

2 + 4a6, is defined as

L(s,E) :=∏

p|∆

1

1− app−s

p∤∆

1

1− app−s + p1−2s. (26)

Similar to the zeta function, these L-functions take local data and create a global object,from which much can be deduced; in Appendix B we give an example of how we can piecetogether local information to construct a global object which, if a closed form expression exists,can allow us to deduce information about the objects of interest. The most important of theseinferences is the famous Birch and Swinnerton-Dyer conjecture.

Conjecture 1.13 (Birch and Swinnerton-Dyer Conjecture) The order of vanishing ofL(E, s) at the central point s = 1 is equal to the rank of the group of rational points E(Q).

In other words, Birch and Swinnerton-Dyer conjectured that the geometric rank of an ellipticcurve equals its analytic rank.

Unfortunately, it is not known what values of rank r are possible for an elliptic curve. In1938, Billing found an elliptic curve with rank 3. The largest known rank increased over thenext few decades. The largest is due to Elkies in 2006, and is rank at least 28. Interestingly,there are not examples of elliptic curves for each rank smaller than 28 (see [Du] for a morecomprehensive historical data on elliptic curve records). While originally it was thought thatthe ranks are unbounded, now some conjecture that this is not the case.

1930 1950 1970 1990 20190

5

10

15

20

25

30

Year

Ran

k>=

Elliptic Curve Records

1.4 The Bias Conjecture

We are now ready to state our main object of study, the bias conjecture. The original motivationfor it comes from the distribution of low-lying zeros in families of L-functions; this is part of then-level densities introduced by Katz and Sarnak [KS1, KS2]. The next few paragraphs are thusmore technical and assume some familiarity of the subject, and may be safely skipped.

9

Similar to using the Riemann Zeta function to understand the distribution of primes, weuse the Explicit Formula, which relates sums over primes of the Dirichlet coefficients aE(p) anda2E(p) to sums of test functions over zeros, to deduce information about the zeros. We look ata one-parameter family E : y2 = x3 + A(t)x + B(t), with t ∈ [N, 2N ], and where φ is an evenSchwartz-class function that decays rapidly (this means φ, and all of its derivatives, decay fasterthan 1/(1 + |x|)A for any A > 0), logR is the average log conductor (and tells us how to scalethe zeros near the central point s = 1), and 1 + iγ are the non-trivial zeros of the L-function:

1

N

2N∑

t=N

γt

φ

(γtlogR

)= φ̂(0) + φ(0)− 2

N

2N∑

t=N

p

log p

logR

1

pφ̂

(log p

logR

)at(p)

− 2

N

2N∑

t=N

p

log p

logR

1

p2φ̂

(2 log p

logR

)at(p)

2 +O

(log logR

logR

); (27)

the result above comes from integrating the logarithmic derivative of the L-function against theSchwartz test function φ and then shifting contours. If the generalized Riemann Hypothesis istrue then γ ∈ R.

Note that if the test function is non-negative, then dropping the contributions of φ at all thezeros that are not at the central point removes a non-negative amount from the left hand side.The right hand side then becomes an upper bound for the average rank of the elliptic curves inthe family:

1

N

2N∑

t=N

γt=0

φ (0) ≤ φ̂(0) + φ(0)− 2

N

2N∑

t=N

p

log p

logR

1

pφ̂

(log p

logR

)at(p)

− 2

N

2N∑

t=N

p

log p

logR

1

p2φ̂

(2 log p

logR

)at(p)

2 +O

(log logR

logR

), (28)

which means that

φ (0) ∗ AverageRank(N) ≤ φ̂(0) + φ(0) − 2

N

2N∑

t=N

p

log p

logR

1

pφ̂

(log p

logR

)at(p)

− 2

N

2N∑

t=N

p

log p

logR

1

p2φ̂

(2 log p

logR

)at(p)

2 +O

(log logR

logR

). (29)

Thus when φ is non-negative, we obtain a bound for the average rank in the family byrestricting the sum to be only over zeros at the central point. The error O (log logR/ logR)comes from trivial estimation and ignores probable cancelation, and we expect O (1/ logR) orsmaller to be the correct magnitude. For most one-parameter families of elliptic curves we havelogR ∼ logNa for some integer a, where t ∈ [N, 2N ].

The main term of the first and second moments of the at(p) give φ (0)∗AverageRank(N) and−1

2φ(0); this is a standard application of the prime number theorem to evaluate the resultingsums; for details see the appendices on prime sums in [Mi1]. This is reminiscent of the CentralLimit Theorem, where so long as some weak conditions are satisfied for independent, identicallydistributed random variables, their normalized sum converges to the standard normal. In thatsetting, if the moments are finite we can always adjust the distribution to have mean zero and

10

variance one, and it is only these moments that enter the limiting analysis. The higher momentsdo have an impact, but it is only through the lower order terms, which control the rate ofconvergence.

We have a similar situation here. First, the higher moments of the Dirichlet coefficientscontribute in the big-Oh terms O (1/ logR). Second, the lower order terms in the first andsecond moments can contribute, but not to the main term in the expansions above. Explicitly,assume the second moment of at(p)

2 is p2 −mEp+O(1), mE > 0. We have already handled thecontribution from p2, and −mEp contributes

S2 ∼ −2

N

p

log p

logRφ̂

(2log p

logR

)1

p2N

p(−mEp)

=2mElogR

p

φ̂

(2log p

logR

)log p

p2. (30)

We thus have a prime sum which converges, and this sum is bounded by∑

p log p/p2. Thus,

S2 converges and there is a contribution of size 1/ log(R). This is the motivation behind whythe Bias conjecture, which S. J. Miller conjectured in his thesis [Mi1], matters, as a bias has animpact in our estimates on the rank and the behavior of zeros near the central point.

Conjecture 1.14 (Second Moment Elliptic Curve Bias Conjecture) Consider a familyof elliptic curves. Then the largest lower term in the second moment expansion of a family whichdoes not average to 0 is on average negative.

If the Bias conjecture holds, then when we estimate the rank of a family, there is always anextra term that slightly increases the upper bound for the average rank. This amount decreasesas logR grows, and thus in the limit plays no role; however, it does lead to a small but noticeablecontribution for small and modest sized conductors.

1.5 Our results

We report on our results. Much is known about the first moment of the Dirichlet coefficients ofelliptic curves. Work of Nagao [Na1, Na2] and Rosen and Silverman [RoSi] shows that the firstmoment in families is related to the rank of the family over Q(T ); specifically, a small negativebias results in rank. This was used by Arms, Lozano-Robledo and Miller [ALM] to constructone-parameter families of elliptic curves with moderate rank, and later generalized to ellipticcurves over number fields [MMRSY] and hyper-elliptic curves [HKLL-RM].

It is thus natural to ask if there is a bias in the second moments, and if so what are theconsequences. We have already seen that a negative bias here is related to some of the observedexcess rank and repulsion of zeros of elliptic curve L-functions near the central point for finiteconductors.

We start with a result from Michel [Mic] on the main term of the second moments, and thesize of the fluctuations, in one-parameter families.

Theorem 1.15 For a one-parameter family E : y2 = x3+A(T )x+B(T ) with non-constant j(T )-

invariant j(T ) = 1728 4A(T )3

4A(T )3+27B(T )2, the second moment of the Dirichlet coefficients equals

pA2,E(p) = p2 +O(p3/2), (31)

11

with the lower order terms of size p3/2, p, p1/2 and 1 having important cohomological interpre-tations.

It is possible to have terms of size p3/2; see for example [Mi3].

Theorem 1.16 (Birch’s Theorem) For the family E : y2 = x3 + ax+ b of all elliptic curves,the second moment of the Dirichlet coefficients equals

pA2,F (p) =∑

a,b mod p

aE(p) = p3 − p2. (32)

See [Bi, Mi1, Mi3, Mic].We provide detailed calculations in §3 and §A for some families to illustrate the techniques.

See [MWe] and [Wu] for the comprehensive calculations for all of the one-parameter familieswhere we were able to obtain closed form expressions. We then turn to families where we cannotobtain closed form expressions, higher rank families, and higher moments. We provide somedetails from representative families here, and refer the reader to [MWe] for more. The followingsummary, taken from the appendix by Miller and Weng in [ACFKKLMMWWYY], summarizesthese results.

• All the rank 0 and rank 1 families studied have data consistent with a negative bias intheir second moment sums. However, for higher rank families (rank at least 4) the datasuggests that there is instead a positive bias.

• For the fourth moment, we also believe that the rank 0 and rank 1 families have negativebiases. We see this in some families; in others we see the presence of terms of size p5/2 whosebehavior is consistent with their averaging to zero, but its presence makes it impossible todetect the lower order terms. Interestingly, again for higher rank families (rank at least 4)the data is consistent with a positive bias.

• The sixth moment results are similar to the fourth moment. The results are consistentwith either a negative bias, or a leading term (now of size p7/2) averaging to zero for lowerrank families. Our data also suggests that higher rank families have positive biases.

• For the odd moments, the coefficients of the leading term vary with the primes. Our datasuggests that the average value of the main term for the (2k+1)st moment is −Ck+1rp

k+1,where Cn = 1

n+1

(2nn

)is the nth Catalan number.

2 Tools for Calculating Biases

In this section we explain why the first moment is often related to the rank, and then introducethe linear and quadratic Legendre sums, the Jacobi symbol as well as the Gauss Sum Expansion,which can be used to compute biases in elliptic curves. See more details from [RoSi, BEW, BAU,Mi1].

Theorem 2.1 (Rosen-Silverman) For an elliptic surface (a one-parameter family), if Tate’sconjecture holds, the first moment is related to the rank of the family over Q(T ):

limx→∞

1

x

p≤x

A1, E(p) log pp

= rankE(Q(T)). (33)

12

Conjecture 2.2 (Tate’s Conjecture for Elliptic Surfaces [ST]) Let E/Q be an elliptic sur-face and L2(E , s) be the L-series attached to H2

et(E/Q,Ql). Then L2(E , s) has a meromorphiccontinuation to C and satisfies

−ords=2L2(E , s) = rankNS(E/Q), (34)

where NS(E/Q) is the Q-rational part of the Neron-Severi group of E . Further, L2(E , s) doesnot vanish on the line Re(s) = 2.

Tate’s conjecture is known to hold for rational surfaces: An elliptic curve y2 = x3+A(T )x+B(T )is rational if and only if one of the following is true:

1. 0 < max(3 degA, 2 degB) < 12,

2. 3 degA = 2degB = 12 and ordT=0T12∆(T−1) = 0.

All of the one-parameter families we compute are rational surfaces; A representative case isdone in §3. However, for two-parameter families, we cannot use the Rosen-Silverman theorem,and the ranks are conjectural. Checking their ranks is beyond the scope of this paper, but itcan be done; see [WAZ] for more details. As our interest is in the biases of the second moments,we do not need to know these ranks for our purposes.

The key to our analysis in the families below are closed form expressions for linear andquadratic Legendre sums.

Lemma 2.3 Let a, b, c be positive integers and a 6≡ 0 mod p. Then

x mod p

(ax+ b

p

)= 0, if p ∤ a, (35)

and

x mod p

(ax2 + bx+ c

p

)=

{−(ap

), if p ∤ b2 − 4ac

(p− 1)(ap

), if p | b2 − 4ac.

(36)

The proofs are standard and follow from elementary manipulations of the sums, exploitingchanges of variables modulo p. For details see Appendix C [Mi1] (available online). For examplethe first is proved by sending x to a−1(x− b), which yields

x mod p

(ax+ b

p

)=

x mod p

(x

p

), (37)

which is zero as there are as many non-zero squares as non-squares modulo p.In many families we end up with terms such as

(−1p

). By Dirichlet’s theorem for primes in

arithmetic progression, to first order as N tends to infinity there are the same number of primesp ≤ N congruent to 1 mod 4 as there are congruent to 3 mod 4. Thus, up to lower order termstending to zero as N goes to infinity, the average is controlled by the following:

(−1

p

)=

{1, if p ≡ 1 mod 4

−1, if p ≡ 3 mod 4.(38)

See [Var] for more details.For some families, an alternative expansion for the Dirichlet coefficients is useful.

13

Lemma 2.4 (Quadratic Formula mod p) For a quadratic ax2 + bx+ c ≡ 0 mod p, a 6≡ 0,there are two distinct roots if b2 − 4ac equals to a non-zero square, one root if b2 − 4ac ≡ 0, andzero roots if b2 − 4ac is not a square.

3 Representative one-parameter Family

Lemma 3.1 The first moment of the family y2 = 4x3 + ax2 + bx+ c+ dt is 0.

Proof. For all p > 4d, send t to 4d−1t: Thus

t(p)

(dt

p

)=∑

t(p)

(4t

p

). (39)

Therefore,

A1,ε(p) = −∑

t(p)

x(p)

(4x3 + ax2 + bx+ 4t+ c

p

)

= −∑

x(p)

t(p)

(4t+ 4x3 + ax2 + bx+ c

p

). (40)

As p 6 |4 when p 6= 2, the t-sum vanishes by linear sum theorem, and

A1,ε(p) = 0. (41)

By the Rosen-Silverman Theorem, this is a rank 0 family. �

Lemma 3.2 The second moment of the family y2 = 4x3 + ax2 + bx+ c+ dt is

A2,E(p) =

{p2 − p− p ·

(−48p

)− p ·

(a2−12b

p

)if a2 − 12b 6= 0

p2 − p+ p(p− 1)(−48

p

)otherwise.

(42)

Proof. We have

A2,E(p) =∑

t(p)

x(p)

y(p)

(4x3 + ax2 + bx+ 4t+ c

p

)(4y3 + ay2 + by + 4t+ c

p

)

m(x) = 4x3 + ax2 + bx+ c

n(y) = 4y3 + ay2 + by + c

A2,E(p) =∑

t(p)

x(p)

y(p)

(16t2 + 4(m+ n)t+mn

p

). (43)

The discriminant of 16t2 + 4(m+ n)t+mn is

∆t(x, y) = 16(m + n)2 − 64mn

= 16(m − n)2

δ2 = ∆t(x, y)

δ = 4(m− n)

14

= 4(4x3 + ax2 + bx+ c− 4y3 − ay2 − by − c)

= 4(x− y)(4x2 + 4xy + 4y2 + ax+ ay + b). (44)

If p|δ, then p|x − y or p|4x2 + 4xy + 4y2 + ax + ay + b. x, y range from 0 to p − 1, sop|x − y exactly p times. By the quadratic formula mod p, 4x2 + 4xy + 4y2 + ax + ay + b ≡4y2 + (4x + a)y + 4x2 + ax + b ≡ 0 (mod p) when y =

−4x−a±√

∆y

8 (∆y is the discriminant ofthe polynomial 4y2 + (4x+ a)y + 4x2 + ax+ b in terms of y):

∆y = (4x+ a)2 − 4 · 4(4x2 + ax+ b)

= −48x2 − 8ax+ a2 − 16b. (45)

If ∆y is a non-zero square mod p, there are two solutions. If ∆y is 0 mod p, there is one solution.If ∆y is not a square mod p, there is no solution.The number of pairs of x, y such that p|4x2 + 4xy + 4y2 + ax+ ay + b is

x(p)

1 +

(−48x2 − 8ax+ a2 − 16b

p

)= p+

x(p)

(−48x2 − 8ax+ a2 − 16b

p

). (46)

The discriminant of −48x2 − 8ax+ a2 − 16b is

∆x = (8a)2 − 4 · (−48)(a2 − 16b)

= 256a2 − 3072b

= 256(a2 − 12b). (47)

We break into cases, depending on the value of the discriminant.Case 1: a2 − 12b 6= 0: By the Quadratic Legendre Sum Theorem, if p 6 |256(a2 − 12b),

x(p)

(−48x2 − 8ax+ a2 − 16b

p

)= −

(−48

p

). (48)

The number of pairs of x, y such that p|4x2 + 4xy + 4y2 + ax+ ay + b is

p+∑

x(p)

(−48x2 − 8ax+ a2 − 16b

p

)= p−

(−48

p

). (49)

The cases that we double count x = y and p|4x2 + 4xy + 4y2 + ax+ ay + b is

4x2 + 4xy + 4y2 + ax+ ay + b ≡ 12y2 + 2ay + b ≡ 0 (mod p). (50)

The discriminant of 12y2 + 2ay + b is

∆y = (2a)2 − 4 · 12b = 22 · (a2 − 12b). (51)

By the quadratic formula mod p, the number of solutions is computable, depending ona2 − 12b.

15

The number of solutions to (50) is

1 +

(22 · (a2 − 12b)

p

)= 1 +

(a2 − 12b

p

). (52)

Therefore, the total number of times that p|(x− y)(4x2+4xy+4y2+ax+ay+ b) is the numberof times p|x− y plus the number of times p|4x2 + 4xy + 4y2 + ax+ ay + b minus the cases thatwe double count.The number of times that p|(x− y)(4x2 + 4xy + 4y2 + ax+ ay + b) is

p+ p−(−48

p

)− 1−

(a2 − 12b

p

)= 2p− 1−

(−48

p

)−(a2 − 12b

p

). (53)

The number of times that p 6 |(x− y)(4x2 + 4xy + 4y2 + x+ y − 4) is

p2 −(2p− 1−

(−48

p

)−(a2 − 12b

p

))= p2 − 2p + 1 +

(−48

p

)+

(a2 − 12b

p

). (54)

By Quadratic Legendre Sum Theorem,

A2,E(p) = (p − 1)

[2p− 1−

(−48

p

)−(a2 − 12b

p

)]−[p2 − 2p+ 1 +

(−48

p

)+

(a2 − 12b

p

)]

= p2 − p− p ·(−48

p

)− p ·

(a2 − 12b

p

). (55)

Case 2: a2 − 12b = 0: By the Quadratic Legendre Sum Theorem, since p|0, we have

x(p)

(−48x2 − 8ax+ a2 − 16b

p

)= (p− 1)

(−48

p

). (56)

The number of pairs of x, y such that p|4x2 + 4xy + 4y2 + ax+ ay + b is

p+∑

x(p)

(−48x2 − 8ax+ a2 − 16b

p

)= p+ (p − 1)

(−48

p

). (57)

The cases that we double count x = y and p|4x2 + 4xy + 4y2 + ax+ ay + b is

4x2 + 4xy + 4y2 + ax+ ay + b ≡ 12y2 + 2ay + b ≡ 0 (mod p). (58)

The discriminant of 12y2 + 2ay + b is

∆y = (2a)2 − 4 · 12b = 22 · (a2 − 12b). (59)

By the quadratic formula mod p, the number of solutions is computable, depending ona2 − 12b.

16

The number of solutions to (58) is

1 +

(22 · (a2 − 12b)

p

)= 1. (60)

Therefore, the total number of times that p|(x− y)(4x2+4xy+4y2+ax+ay+ b) is the numberof times p|x− y plus the number of times p|4x2 + 4xy + 4y2 + ax+ ay + b minus the cases thatwe double count.The number of times that p|(x− y)(4x2 + 4xy + 4y2 + ax+ ay + b) is

p+ p+ (p − 1)

(−48

p

)− 1 = 2p− 1 + (p− 1)

(−48

p

). (61)

The number of times that p 6 |(x− y)(4x2 + 4xy + 4y2 + x+ y − 4) is

p2 −(2p − 1 + (p − 1)

(−48

p

))= p2 − 2p + 1− (p− 1)

(−48

p

). (62)

A2,E(p) = (p− 1)

[2p− 1 + (p− 1)

(−48

p

)]−[p2 − 2p + 1− (p − 1)

(−48

p

)]

= p2 − p+ p(p− 1)

(−48

p

). (63)

Thus we have shown

A2,E(p) =

{p2 − p− p ·

(−48p

)− p ·

(a2−12b

p

)if a2 − 12b 6= 0

p2 − p+ p(p− 1)(−48

p

)otherwise.

(64)

4 Numerical computations for biases in second moments

Unfortunately, for most families we cannot obtain a closed form for the first or second moments.We thus report on some numerical investigation of one-parameter families; we explore severaldifferent ranks to see if that has any impact. For some families, we are able to conjecture aformula for the second moment by separating primes into different congruence classes, whichsuggests that there is often a closed-form polynomial expression. We are then able to prove theresults mathematically in some cases. The following table summarizes the numerical results forthe second moments’ expansions of the families we studied; see Figure 7.

We summarize the first and second moments for some families where we are able to proveclosed form expressions for the first two moments. The arguments are representative of the onesneeded for all the families. The proofs are similar to the ones in Section 3

Family: y2 = 4x3 + ax2 + bx+ c+ dt:

• First moment: A1,ε(p) = 0.

17

Figure 7: Systematic investigation for second moments sums.

• Second moment:

A2,ε(p) =

{p2 − p− p ·

(−48p

)− p ·

(a2−12b

p

)if a2 − 12b 6= 0

p2 − p+ p(p− 1)(−48

p

)otherwise.

(65)

Family: y2 = 4x3 + (4m+ 1)x2 + n · tx:

• First moment: A1,ε(p) = 0.

• Second moment:

A2,ε(p) =

{p2 − 3p if p = 4k + 1

p2 − p if p = 4k + 3.(66)

Family: y2 = x3 − t2x+ t4:

18

• First moment: A1,ε(p) = −2p.

• Second moment:

A2,ε(p) = p2 − p− p ·(−3

p

)− p ·

(12

p

)−∑

x(p)

y(p)

(x3 − x

p

)(y3 − y

p

). (67)

For families that we are not able to find closed-form expressions, we calculated the averagebias of the second moment sums for the first 1000 primes; see Figure 8.

Figure 8: Numerical data for the average biases of 2nd, 4th and 6th moments sums.

By Michel’s theorem, we know that the main term of the sum is p2, and lower order termshave size p3/2, p, p1/2 or 1. From the data we have, we can tell if it is likely that the second

moment has a p3/2 term. If the value of second moment−p2

p converges or stays bounded as theprime grows, then it is likely that the largest lower order term of the second moment sum is p,as if there were a p3/2 term we would have fluctuations of size p1/2.

By subtracting the main term p2 from the sum and then dividing by the largest lower term(p3/2 or p), we calculated the average bias; see Figure 9.

The data shows that all the families where we do not believe there is a p3/2 term clearly havenegative biases (around −1). When the p3/2 exists, the bias unfortunately becomes impossibleto see. The reason is that the p3/2 term drowns it out; we now have to divide by p3/2. If thatterm averages to zero, then the term of size p, once we divide by p3/2, is of size 1/p1/2.

Let’s investigate further the consequence of having a term of size p3/2. We divide the differ-ence of the observed second moment minus p2 (the expected value) by p3/2. We now have signed

19

Figure 9: Numerical data for the average biases of second moments sums.

summands of size 1. By the Philosophy of Square-Root Cancelation, if we sum N such signedterms we expect a sum of size

√N . As we are computing the average of these second moments,

we divide by N and have an expected value of order 1/√N . In other words, if the p3/2 term

is present and averages to zero, we expect sums over ranges of primes to be about 1/√N . If

N = 1000 this means we expect sums on the order of .0316. Looking at the data in Figure 9,what we see is consistent with this analysis. Thus, while we cannot determine if the first lowerorder term that does not average to zero has a negative bias, we can at least show that the datais consistent with the p3/2 term averaging to zero for lower rank families.

The table suggests a lot more. From the data, we can see that all the rank 0 and rank 1families have negative biases. However, all four rank 4 families have shown positive biases fromthe first 1000 primes. Thus, we look further to see if it is likely the result of fluctuations, or ifperhaps it is evidence against the bias conjecture.

We now list the results for a few representative families.

We divide the 1000 primes into 20 groups of 50 for further analysis. If the p3/2 term averagesto zero, we would expect each of these groups to be positive and negative equally likely, and wecan compare counts. We now expect each group to be on the order of 1/

√50 ≈ .14. Thus we

shouldn’t be surprised if it is a few times .14 (positive or negative); remember we do not knowthe constant factor in the p3/2 term and are just doing estimates.

20

For the rank 2 family a1 = 1, a2 = t, a3 = −19, a4 = −t − 1, a6 = 0, 12 of the 20 groupsof primes have shown positive biases. Figure 10 is a histogram plot of the distribution of theaverage biases among the 20 groups.

Figure 10: Distribution of average biases in the first 1000 primes for family a1 = 1, a2 = t,a3 = −19, a4 = −t− 1, a6 = 0.

We now further analyze our data by dividing the 1000 primes into 100 groups of 10 for rank 6family a1 = 0, a2 = 2(16660111104t) + 811365140824616222208, a3 = 0,a4 = [2(−1603174809600)t − 26497490347321493520384](t2 + 2t− 8916100448256000000 + 1),a6 = [2(2149908480000)t + 343107594345448813363200](t2 + 2t − 8916100448256000000 + 1)2.Our data suggests that there may be a positive bias in this family; in other words, the biasconjecture may fail if the family rank is sufficiently large. The average bias of second momentssums for the first 1000 primes is 0.246759. Figure 11 is a histogram plot of the distribution ofthe average biases among the 100 groups of 10 primes. Note 78 of the 100 groups of primeshave positive biases, which suggests that it is likely that the second moment of this family hasa positive p3/2 term.

Figure 11: Distribution of average biases in the first 1000 primes for a rank 6 family.

From all the data we’ve collected, the rank 0 and rank 1 families have negative biases more

21

frequently, but we are working with small data sets and must be careful in how much weight weassign such results. Our data for the rank 4 and rank 6 families has shown that it is likely thathigher rank families (rank(E(Q)) ≥ 4) have positive biases.

5 Biases in higher moments

We now explore, for the first time, the higher moments of the Dirichlet coefficients of the ellipticcurve L-functions to see if biases we found in the first and second moments persist. Unfortunatelyexisting techniques on analyzing the second moment sums do not apply to the higher moments,even if we choose nice families. If we switch orders of the moments’ sums and sum over t, weare going to get a cubic or higher degrees polynomials. Therefore, we can only try to predictor observe the biases through numerical evidence. We calculated the 4th and 6th moment sumsfor the first 1000 primes. From Section C.2, we know that the main term of the fourth momentsum is 2p3, and the largest possible lower order terms have size p5/2. From Section C.3, we knowthat the main term of the sixth moment sum is 5p4, and the largest possible lower order termshave size p7/2. From the data we have gathered, all the 4th moments of these families have p5/2

terms, and all the 6th moments have p7/2 terms. By subtracting the main term 2p3 from thefourth moment sum and then dividing by the size of the largest lower term p5/2, we calculatedthe average bias for the fourth moment of the first 1000 primes. Similarly, we subtracted 5p4

from the sixth moment sum and then divided by p7/2 to calculate the average bias for the sixthmoment of the first 1000 primes; See Figure 12.

5.1 Biases in fourth moment sums

From the data, we can see that all the biases for lower rank families in the fourth moment arerelatively small (smaller than 0.2), which indicates that the p5/2 term likely averages to 0. Bythe Philosophy of Square-Root Cancellation, we expect the order of the size of fluctuation to bearound

√1000/1000 ≈ 0.03. Therefore, if the bias is between −0.2 and 0.2, we would expect p2

to be the largest lower order term.Note that for 30 out of 31 families, the bias in fourth moments appear to be similar to the

bias in second moments (families that have negative bias in second moments also seem to havenegative bias in fourth moments, and vice versa), though much smaller magnitudes likely dueto the presence of a p5/2 term that is averaging to zero. We now explore a few representativefamilies whose 4-th moment biases have different scales in magnitudes.

For the rank 1 family a1 = 1, a2 = t, a3 = −1, a4 = −t − 1, a6 = 0, we analyze our databy dividing the 1000 primes into 100 groups of 10. As shown in Figure 13, 63 of the 100 groupsof primes have shown negative biases. The probability of having 17 or more negatives thanpositives (or 17 or more positives than negatives) in 100 tosses of a fair coin (so heads is positiveand tails is negative) is about 1.2%. While unlikely, this is not exceptionally unlikely.

For the rank 3 family a1 = 0, a2 = 5, a3 = 0, a4 = −16t2, a6 = 64t2, 11 of the 20 groupsof primes have shown negative biases. Figure 14 is a histogram plot of the distribution of theaverage biases among the 20 groups.

Despite the fluctuations, all the rank 0 and rank 1 families seem to have negative biasesmore frequently in the first 1000 primes, which suggests that it is possible that negative biasexists in the fourth moments of all rank 0 and rank 1 families. Similar to the second momentsums, the fourth moment sums of families with larger rank appear to have positive biases for

22

Figure 12: Numerical data for the average biases of 2nd, 4th and 6th moments sums.

the first 1000 primes, but this might due to the fluctuations of the p5/2 term as we are workingwith small data set.

To further examine the biases in families with larger ranks, we investigate the rank 6 familya1 = 0, a2 = 2(16660111104t) + 811365140824616222208, a3 = 0, a4 = [2(−1603174809600)t −26497490347321493520384](t2 + 2t − 8916100448256000000 + 1), a6 = [2(2149908480000)t +343107594345448813363200](t2 +2t− 8916100448256000000+1)2 . Our data suggests that thereis a positive bias in this family. The average bias of the fourth moments sums for the first 1000primes is 0.753285. Figure 15 is a histogram plot of the distribution of the average biases amongthe 100 groups of 10 primes. 75 of the 100 groups of primes have positive biases, which suggeststhat it is likely that the fourth moment of this family has a positive p5/2 term.

To sum up, the fourth moments of all lower rank families have first lower order terms thataverage to 0 or negative. Similar to the second moment, our data for the rank 4 and rank 6families suggests that higher rank families might have positive biases.

5.2 Biases in sixth moment sums

We now explore the 6th moment biases for these families.For the rank 0 family a1 = 1, a2 = 1, a3 = 1, a4 = 1, a6 = t, 13 of the 20 groups of primes

have shown negative biases. Figure 16 is a histogram plot of the distribution of the averagebiases among the 20 groups.

23

Figure 13: Distribution of average biases in the first 1000 primes for family a1 = 1, a2 = t,a3 = −1, a4 = −t− 1, a6 = 0.

Figure 14: Distribution of average biases in the first 1000 primes for family a1 = 0, a2 = 5,a3 = 0, a4 = −16t2, a6 = 64t2.

For the rank 1 family a1 = 1, a2 = t, a3 = −1, a4 = −t − 1, a6 = 0, 14 of the 20 groupsof primes have shown negative biases. Figure 17 is a histogram plot of the distribution of theaverage biases among the 20 groups.

For the rank 3 family a1 = 0, a2 = 5, a3 = 0, a4 = −16t2, a6 = 64t2, 10 of the 20 groupsof primes have shown negative biases. Figure 18 is a histogram plot of the distribution of theaverage biases among the 20 groups.

As shown from the data, for most families, it is inconclusive whether the 6th moments havenegative biases. Our data strongly suggests that the p7/2 term averages to 0 for most families ,and the p3 term is drowned out by the fluctuations of p7/2 term.

For the rank 6 family a1 = 0, a2 = 2(16660111104t) + 811365140824616222208, a3 = 0,a4 = [2(−1603174809600)t − 26497490347321493520384](t2 + 2t − 8916100448256000000 + 1),a6 = [2(2149908480000)t + 343107594345448813363200](t2 + 2t − 8916100448256000000 + 1)2,the average bias of the sixth moments sums for the first 1000 primes is 2.26. Figure 19 is ahistogram plot of the distribution of the average biases among the 100 groups of 10 primes. 69

24

Figure 15: Distribution of average biases in the first 1000 primes for a rank 6 family.

Figure 16: Distribution of average biases in the first 1000 primes for family a1 = 1, a2 = 1,a3 = 1, a4 = 1, a6 = t.

of the 100 groups of primes have positive biases, which suggests that it is likely that the sixthmoment of this family has a positive p7/2 term.

To sum up, while we are not able to tell if the negative bias exists in the higher even moments,the data is at least consistent with the first lower order term averaging to zero or negative forfamilies with lower ranks. Thus our numerics support a weaker form of the bias conjecture: thefirst lower order term does not have a positive bias for lower rank families. For families withrank at least 4, the negative bias conjecture may not hold.

5.3 Biases in the Third, Fifth, and Seventh Moments

We now explore the third, fifth, and seventh moments of the Dirichlet coefficients of ellipticcurve L-functions. By the Philosophy of Square-Root Cancellation, p times the third moment, ptimes the fifth moment, and p times the seventh moment should have size p2, p3, p4 respectively(we are multiplying by p to remove the 1/p averaging). For example, the third moment is a sumof p terms, each of size

√p3. Thus as these are signed quantities, we expect the size to be on

the order of√p · p3/2.

25

Figure 17: Distribution of average biases in the first 1000 primes for family a1 = 1, a2 = t,a3 = −1, a4 = −t− 1, a6 = 0.

Figure 18: Distribution of average biases in the first 1000 primes for family a1 = 0, a2 = 5,a3 = 0, a4 = −16t2, a6 = 64t2.

We believe that there are bounded functions c3,E(p), c5,E(p), and c7,E (p) such that

A3,E(p) = c3,E (p)p2 +O(p3/2), A5,E(p) = c5,E(p)p

3 +O(p5/2), A7,E(p) = c7,E(p)p4 +O(p7/2);

(68)our data supports these conjectures. Unlike the second, fourth, and sixth moments, the coef-ficient of the leading term can vary with the prime in the third, fifth, and seventh moments.We calculated the average values of c3,E(p), c5,E (p), and c7,E(p) for each elliptic curve family bydividing the size of the main term (p2 for third moment, p3 for fifth moment, and p4 for theseventh moment); see Figure 20.

26

-5 0 5 10 150

5

10

15

20

Figure 19: Distribution of average biases in the first 1000 primes for a rank 6 family.

Figure 20: Numerical data for the average constant for the main term of 3rd and 5th momentssums.

Our data suggests an interesting relationship between the average constant value for themain term and the rank of elliptic families for these odd moments.

Conjecture 5.1 Consider a one-parameter family of elliptic curves of rank r. The averagevalue of the main term of the 3rd moment is −2rp2.

27

Conjecture 5.2 Consider a one-parameter family of elliptic curves of rank r. The averagevalue of the main term of the 5th moment is −5rp3.

Figure 21: Numerical data for the average constant for the main term of 7th moments sums.

Conjecture 5.3 Consider a one-parameter family of elliptic curves of rank r. The averagevalue of the main term of the 7th moment is −14rp4.

Conjecture 5.4 Consider a one-parameter family of elliptic curves of rank r. Let Cn be thenth Catalan number, 1

n+1

(2nn

). For k ∈ Z+, the average value of the main term of the 2k+ 1 th

moment is −Ck+1rpk+1.

We can try to analyze the third, fifth and seventh moments the same way as we did thefourth and sixth. In doing so, we would obtain expansions that do have terms related to thefirst moment (and hence by the Rosen-Silverman theorem the rank of the group of rationalsolutions); unfortunately there are other terms that arise now, due to the odd degree, that arenot present in the even moments and which we cannot control as easily. We thus leave a furtherstudy of these odd moments as a future project.

6 Conclusion and Future Work

Natural future questions are to continue investigating the second moment bias conjecture inmore and more families, theoretically if possible, numerically otherwise. Since the bias in thesecond moments doesn’t imply biases in higher moments, we can also explore whether there is acorresponding negative bias conjecture for the higher even moments. As these will involve quarticor higher in t Legendre sums, it is unlikely that we will be able to compute these in closed form,and thus will have to resort to analyzing data, or a new approach through algebraic geometryand cohomology theory (Michel proved that the lower order terms are related to cohomologicalquantities associated to the elliptic curve).

Any numerical exploration will unfortunately be quite difficult in general, as there is oftena term of size p3/2 which we believe averages to zero for some families, but as it is

√p larger

than the next lower order term, it completely drowns out that term and makes it hard to seethe bias.

28

For the odd moments, our numerical explorations suggest that the bias in the first moment,which is responsible for the rank of the elliptic curve over Q(T ), persists. A natural futureproject is to try to extend Michel’s work to prove our conjectured main term formulas for theodd moments.

Another area we want to focus on in the future is getting to know the two-parameter familiesbetter. What are the implications of the negative bias of the two-parameter families? How dothey behave differently from one-parameter families or other families and why?

7 Acknowledgement

We thank the authors of [ACFKKLMMWWYY] for comments on related problems. We alsothank Jiefei Wu for helpful conversations, and allowing us to expand upon her introduction from[Wu] for this paper, and to include some of her results in the first appendix.

A Representative one-parameter and two-parameter families (By Steven J.Miller and Jiefei Wu)

A.1 One-Parameter Family

Lemma A.1 The first moment of the one-parameter family y2 = x3 − tx2 + (x − 1)t2 is 0 onaverage, and the family’s rank is 0. More explicitly, for primes greater than 3 we have −p timesthe first moment is 2p if p ≡ 1 mod 12, it is −2p if p ≡ 7 mod 12, and 0 otherwise, thus

− pA1,F(p) =

{2p if p ≡ 1 mod 12

−2p if p ≡ 7 mod 12(69)

and zero for all other primes greater than 3.

Proof. For p > 3 we compute the first moment:

−pA1,F (p) = −∑

t(p)

at(p) =∑

t(p)

x(p)

(x3 − tx2 + (x− 1)t2

p

)

=∑

t(p)

x(p)

(x3 − tx2 + xt2 − t2

p

)

=

p−1∑

t=1

x(p)

(t3x3 − t3x2 + t3x− t2

p

)

=∑

x(p)

p−1∑

t=1

(t2

p

)(tx3 − tx2 + tx− 1

p

)

=∑

x(p)

p−1∑

t=0

(t(x3 − x2 + x)− 1

p

)−∑

x(p)

(−1

p

)

=∑

t(p)

x=0

(−1

p

)+∑

t(p)

x(p);x 6=0

(t(x3 − x2 + x)− 1

p

)−∑

x(p)

(−1

p

)

29

=∑

t(p)

x(p);x 6=0

(t(x3 − x2 + x)− 1

p

), (70)

where the first and third sums cancel in the second to last line above. For the remaining sum, ifx3−x2+x = x(x2−x+1) is not zero modulo p, we can send t to (x3 −x2+x)−1t and thus thesum is zero. Thus the only possible contribution from this term is when x2−x+1 ≡ 0 mod p (wehave already removed the x = 0 term, so we can drop that factor). By the Quadratic Formulamod 2 (Lemma 2.4) the roots are (1±

√−3)/2. A simple calculation using quadratic reciprocity

shows that if p ≡ 1 mod 3 then −3 is a square, and thus the two roots exist, while if p ≡ 2 mod 3it is not a square and there are no roots. It is worth to note that all three roots are distinctwhen p exceeds 3. In particular, 1 ±

√3 cannot be zero, and these two roots are not equal to

each other as that would mean√3 is zero.

So, if p ≡ 2 mod 3 the sum is zero, while if p ≡ 1 mod 3 the sum is 2p(−1

p

). If p ≡ 1 mod 4

then(−1

p

)= 1 while if p ≡ 3 mod 4 then

(−1p

)= −1. By Dirichlet’s Theorem for Primes in

Arithmetic Progressions, we see that to first order, half the time the sum is not zero it is 2pwhile the other half of the time it is −2p. Thus on average it is zero, and since it is a rationalsurface, by the Rosen-Silverman theorem the family’s rank is 0 (we have a positive contributionwhen p ≡ 1 mod 12 balanced by an equal negative contribution when p ≡ 7 mod 12; the otherprimes greater than three contribute zero).

Lemma A.2 The second moment of the one-parameter family y2 = x3 − tx2 + (x − 1)t2 hasa negative bias, which supports the bias conjecture, explicitly p2 − 2pδ2,3(p)− 2p

(−3p

)− p(−2

p

)−

[∑x(p)

(x3−x2+x

p

)]2, where δa,b(p) is 1 if p is a mod b and 0 otherwise.

Proof. We have

pA2,F (p) =∑

t(p)

at2(p)

=∑

t(p)

x(p)

y(p)

(x3 − tx2 + xt2 − t2

p

)(y3 − ty2 + yt2 − t2

p

)

=

p−1∑

t=1

x,y(p)

(t3x3 − t3x2 + t3x− t2

p

)(t3y3 − t3y2 + t3y − t2

p

)

=

p−1∑

t=1

x,y(p)

(t4

p

)(t(x3 − x2 + x)− 1

p

)(t(y3 − y2 + y)− 1

p

)

=

p−1∑

t=0

x,y(p)

(t(x3 − x2 + x)− 1

p

)(t(y3 − y2 + y)− 1

p

)−∑

x,y(p)

(−1

p

)(−1

p

)

=∑

t(p)

x,y(p)

(t(x3 − x2 + x)− 1

p

)(t(y3 − y2 + y)− 1

p

)− p2. (71)

We have a triple sum involving x, y and t. If we fix x and y we have a quadratic in t. We candetermine its contribution by using Lemma 2.3; to do so we need to compute the discriminant

30

of the quadratic in t, denoted by δ(x, y):

a = (x3 − x2 + x)(y3 − y2 + y) = xy(x2 − x+ 1)(y2 − y + 1)

b = −[(x3 − x2 + x) + (y3 − y2 + y)]

c = 1

δ(x, y) = b2 − 4ac = [(x3 − x2 + x)− (y3 − y2 + y)]2

=((x− y)(x2 + xy − x+ y2 − y + 1)

)2= (δ1(x, y)δ2(x, y))

2 (72)

Let the vanishing sum be V (p), the contribution when either x3 − x2 + x or y3 − y2 + y orboth vanish. We have

pA2,F (p) =∑

t(p)

x,y(p)

(t(x3 − x2 + x)− 1

p

)(t(y3 − y2 + y)− 1

p

)− p2

=∑

t(p)

x3−x2+x 6≡0,

y3−y2+y 6=0

(t(x3 − x2 + x)− 1

p

)(t(y3 − y2 + y)− 1

p

)+ V (p)− p2

=∑

t(p)

x3−x2+x,y3−y2+y 6≡0δ=0

(t(x3 − x2 + x)− 1

p

)(t(y3 − y2 + y)− 1

p

)

+∑

t(p)

x3−x2+x,y3−y2+y 6≡=0,δ 6=0

(t(x3 − x2 + x)− 1

p

)(t(y3 − y2 + y)− 1

p

)

+ V (p)− p2

= (p− 1)∑

x3−x2+x,y3−y2+y 6≡=0,δ=0

((x3 − x2 + x)(y3 − y2 + y)

p

)

−∑

x3−x2+x,y3−y2+y 6≡=0,δ 6=0

((x3 − x2 + x)(y3 − y2 + y)

p

)+ V (p)− p2

= p∑

x3−x2+x,y3−y2+y 6≡=0,δ=0

((x3 − x2 + x)(y3 − y2 + y)

p

)

−∑

x3−x2+x 6≡0,

y3−y2+y 6=0

((x3 − x2 + x)(y3 − y2 + y)

p

)+ V (p)− p2. (73)

We first calculate V (p); we use inclusion/exclusion on x3 −x2+x and y3 − y2+ y vanishing.Assume first that x3−x2+x equals to zero (there are three solutions when p is 1 mod 3: x = 0,and x2−x+1 has two roots; there is one solution when p is 2 mod 3: x = 0, and x2−x+1 has no

root mod p). Then we have(t∗0−1

p

)∑t(p)

∑y(p)

(t(y3−y2+y)−1p

)=(−1

p

)∑t(p)

∑y(p)

(t(y3−y2+y)−1p

),

which is(−1

p

)2 · 3p = 3p if p ≡ 1 mod 3 and(−1

p

)2 · 1p = 1p if p ≡ 2 mod 3 from our A1,F (p)computation, giving 3 · 3p = 9p if p ≡ 1 mod 3 and 1 · 1p = 1p if p ≡ 2 mod 3.

Now we assume y3 − y2 + y ≡ 0 mod 3, which similarly gives us 9p if p is 1 mod 3, and p if pis 2 mod 3. We subtract the doubly-counted x3 − x2 + x ≡ y3 − y2 + y ≡ 0, which equals to 9p

31

if p ≡ 1 mod 3 and p if p ≡ 2 mod 3. Hence, the contribution from at least one of x3 − x2 + xand y3 − y2 + y vanishing is 9p if p ≡ 1 mod 3 and p if p ≡ 2 mod 3.

Next, we are going to calculate the contribution of∑

x3−x2+x 6≡0,

y3−y2+y 6=0,δ=0

((x3−x2+x)(y3−y2+y)p

). We

break into two cases, when each factor of δ(x, y) is zero modulo p, and then deal with the doublycounted pairs.

Case 1: δ1(x, y) ≡ 0 mod p: As δ1(x, y) = x− y, we have x = y, and the Legendre symbol is1 as it is a square, so the double-sum is p− 3 if p ≡ 1 mod 3 and is p− 1 if p ≡ 2 mod 3.

Case 2: δ2(x, y) ≡ 0 mod p: When is δ2(x, y) = x2 + xy − x+ y2 − y + 1 = y2 + (x− 1)y +(x2 − x+ 1) ≡ 0(p)? Using Lemma 2.4, we have

y =−(x− 1)±

√(x− 1)2 − 4(x2 − x+ 1)

2

=−(x− 1)±

√−3x2 + 2x− 3

2, (74)

which reduces to finding when −3x2 + 2x − 3 is a square mod p. We get two values of y if itis equivalent to a non-zero square, one value if it is equivalent to zero, and no value if it is not

equivalent to a square. When solving δ2(x, y) ≡ 0(p), we make sure such y 6∈ {0, 1±√−3

2 }, as weare assuming y3 − y2 + y 6≡ 0(p). If y = 0, we have x2 − x + 1 ≡ 0(p) and the solutions are

x = 1±√−3

2 . We have excluded this case so we do not need to worry about y = 0. By symmetry if

y is 1±√−3

2 , we have x = 0 and we can exclude this case too. When we calculate the contribution

from δ2(x, y) ≡ 0 mod p we encounter the following sum:∑

x2−x+1≡0 mod p

(−3x2+2x−3p

), and since

x2−x+1 ≡ 0 mod p we will just have∑

x2−x+1≡0 mod p

(−3(x2−x+1)−xp

)=∑

x2−x+1≡0 mod p

(−xp

).

We show that this sum is 2 when p is 1 mod 3; it is zero when p is 2 mod 3 as for those primesthere are no roots to x2−x+1 ≡ 0 mod p. Note −x ≡ x2−x+1−x mod p, as x2−x+1 ≡ 0 mod p.Thus −x ≡ x2 − 2x+ 1 = (x− 1)2 mod p, and

(−xp

)= 1 for x such that x2 − x+ 1 ≡ 0 mod p.

If p ≡ 1 mod 3, the contribution in case 2 is

x3−x2+x 6≡0(p)

[1 +

(−3x2 + 2x− 3

p

)]= p− 3 +

x3−x2+x 6≡0(p)

(−3x2 + 2x− 3

p

)

= p− 3 +∑

x(p)

(−3x2 + 2x− 3

p

)

−∑

x=0

(−3x2 + 2x− 3

p

)−

x2−x+1≡0(p)

(−xp

)

= p− 3 +∑

x(p)

(−3x2 + 2x− 3

p

)−(−3

p

)

−∑

x2−x+1≡0(p)

(−xp

)

= p− 3 +∑

x(p)

(−3x2 + 2x− 3

p

)−(−3

p

)− 2

32

= p− 5−(−3

p

)+∑

x(p)

(−3x2 + 2x− 3

p

). (75)

For such p ≡ 2 mod 3 we do not have solutions to x2 − x+ 1 = 0, so we have fewer terms,and the contribution is

x3−x2+x 6≡0(p)

[1 +

(−3x2 + 2x− 3

p

)]= p− 1 +

x3−x2+x 6≡0(p)

(−3x2 + 2x− 3

p

)

= p− 1 +∑

x(p)

(−3x2 + 2x− 3

p

)

−∑

x=0

(−3x2 + 2x− 3

p

)

= p− 1 +∑

x(p)

(−3x2 + 2x− 3

p

)−(−3

p

). (76)

We use Lemma 2.4 again. The discriminant now is 22−4(−3)(−3) = −32. Hence, for p ≥ 5,

p does not divide the discriminant, and∑

x(p)

(−3x2+2x−3p

)is −

(−3p

).

Thus, if p ≡ 1 mod 3, for x 6= 0, 1±√−3

2 , the number of solutions with x2+xy−x+y2−y ≡ −1is p− 5− 2

(−3p

); the number of solutions with x− y ≡ 0 is p− 3. If p ≡ 2 mod 3, for x 6= 0, the

number of solutions with x2 + xy − x+ y2 − y ≡ −1 is p − 1 − 2(−3

p

); the number of solutions

with x − y ≡ 0 is p − 1. To count how many solutions there are of 3x2 − 2x + 1, we use thequadratic formula modulo p, and note the discriminant is 4 − 12 = −8 = 4(−2). So if −2 is asquare there are two roots, if −2 is zero there are no roots and if −2 is not a square there areno roots. Thus the number of doubly counted solutions is 1 +

(−2p

). If p ≡ 1 mod 3, the total

number of pairs is

p− 3 + p− 5− 2

(−3

p

)− 1−

(−2

p

)= 2p − 9− 2

(−3

p

)−(−2

p

), (77)

and if p ≡ 2 mod 3, the total number of pairs is

p− 1 + p− 1− 2

(−3

p

)− 1−

(−2

p

)= 2p − 3− 2

(−3

p

)−(−2

p

). (78)

We will later see that all the Legendre coefficients multiplying the number of solutions arealways 1, so we just need to count the number of solutions. When x = y (and x3 − x2 + x 6≡ 0,

y3 − y2 + y 6≡ 0), clearly((x3−x2+x)(y3−y2+y)

p

)= 1 and these terms each contribute 1.

Consider x 6= y and x2 + xy − x + y2 − y + 1 ≡ 0 (and x3 − x2 + x 6≡ 0, y3 − y2 + y 6≡ 0).Then x, y 6≡ 0 and x2 − x+ 1 ≡ y(−y + 1− x) and y2 − y + 1 ≡ x(−x+ 1− y) and

((x3 − x2 + x)(y3 − y2 + y)

p

)=

(x2y2(−x+ 1− y)2

p

). (79)

We can see that as long as x+y 6= 1, all pairs have their Legendre factor +1. If y = 1−x, then wewould have δ2(x, y) = y2+(x−1)y+x2−x+1 = (1−x)2+(x−1)(1−x)+x2−x+1 = x2−x+1 ≡ 0,

33

which is impossible as we are assuming in this case that x3 − x2 + x is not 0 mod p. Therefore,putting all the pieces together, if p ≡ 1 mod 3, we have the following form

pA2,F (p) = p

(2p− 9− 2

(−3

p

)−(−2

p

))−

x,y 6≡0, 1±√

−32

((x3 − x2 + x)(y3 − y2 + y)

p

)+ 9p

− p2

= p2 − 2p

(−3

p

)− p

(−2

p

)−

x(p)

(x3 − x2 + x

p

)2

, (80)

and if p ≡ 2 mod 3, we have the following form

pA2,F (p) = p

(2p − 3− 2

(−3

p

)−(−2

p

))−∑

x,y 6≡0

((x3 − x2 + x)(y3 − y2 + y)

p

)+ p− p2

= p2 − 2p− 2p

(−3

p

)− p

(−2

p

)−

x(p)

(x3 − x2 + x

p

)2

. (81)

A.2 Two-Parameter Family

In the case of our two-parameter family, we unfortunately are going to have cubic and higherdegree of Legendre sums; however, by cleverly writing s in terms of t, we get linear and quadraticLegendre sums and can calculate them in closed form. Then, we count the number of ways thediscriminant of this new closed form vanishes, or equals 0, because these pairs contribute aLegendre factor of +1.

Lemma A.3 The first moment of the two-parameter family y2 = x3 + t2x+ st4 is 0.

Proof. We have

−p2A1,F (p) = −∑

t(p)

s(p)

at,s(p) =∑

t(p)

x(p)

s(p)

(x3 + t2x+ st4

p

)

=

p−1∑

t=1

x(p)

s(p)

(t3x3 + t3x+ st4

p

)

=

p−1∑

t=1

x(p)

s(p)

(t3

p

)(x3 + x+ st

p

)

=∑

x(p)

s(p)

t(p)

(t

p

)(st+ (x3 + x)

p

)

=∑

x(p)

s(p)

t(p)

(t

p

)(t−1st+ (x3 + x)

p

)

34

=∑

x(p)

s(p)

t(p)

(t

p

)(s+ (x3 + x)

p

). (82)

In the proof we used that, at one point, t was not zero. We sent s to t−1s, and looked at theresulting s sum, which equals to zero. �

Lemma A.4 The second moment times p2 of the two-parameter family y2 = x3 + t2x + st4 isp3 − 2p2 + p− 2(p2 − p)

(−3p

), which supports the bias conjecture.

Proof. We have

p2A2,F (p) =∑

t,s(p)

at,s2(p)

=∑

t(p)

s(p)

x,y(p)

(x3 + t2x+ st4

p

)(y3 + t2y + st4

p

)

=

p−1∑

t=1

s(p)

x,y(p)

(t3x3 + t3x+ st4

p

)(t3y3 + t3y + st4

p

)

=

p−1∑

t=1

s(p)

x,y(p)

(t6

p

)(x3 + x+ st

p

)(y3 + y + st

p

)

=

p−1∑

t=0

s(p)

x,y(p)

(x3 + x+ st

p

)(y3 + y + st

p

)−∑

s(p)

x,y(p)

(x3 + x

p

)(y3 + y

p

)

=∑

x,y(p)

s(p)

t(p)

(st+ (x3 + x)

p

)(st+ (y3 + y)

p

)− p

x(p)

(x3 + x

p

)2

=∑

s=0

t(p)

x(p)

(x3 + x

p

)2

+∑

x,y(p)

s 6=0

t(p)

(st+ (x3 + x)

p

)(st+ (y3 + y)

p

)

− p

x(p)

(x3 + x

p

)2

=∑

x,y(p)

s 6=0

t(p)

(st+ (x3 + x)

p

)(st+ (y3 + y)

p

)

=∑

x,y(p)

s 6=0

t(p)

(ss−1t+ (x3 + x)

p

)(ss−1t+ (y3 + y)

p

)

=∑

x,y(p)

s 6=0

t(p)

(t+ (x3 + x)

p

)(t+ (y3 + y)

p

)

= (p− 1)∑

x,y(p)

t(p)

(t+ (x3 + x)

p

)(t+ (y3 + y)

p

), (83)

35

where in passing from the second to the third line we sent x and y modulo p to tx and ty,which is valid so long as t is not zero; to keep the sum over all t we need to subtract the t = 0contribution. We can also see that when s = 0, since the t-sum is p and there is no t dependence,the contribution from s = 0 and t = 0 cancel out each other. Note that now as s is non-zero,we can send t to s−1t, and we get a nice quadratic sum in t.

We use Lemma 2.3. The discriminant of the quadratic in t, δ(x, y), equals

a = 1

b = (x3 + x) + (y3 + y)

c = (x3 + x)(y3 + y)

δ(x, y) = b2 − 4ac = [(x3 + x)− (y3 + y)]2

= [(x− y)(y2 + xy + (1 + x2))]2, (84)

and we are going to count the number of ways it vanishes. Therefore,

p2A2,F (p) = (p − 1)

x,y mod pδ(x,y)≡0(p)

t(p)

(t+ (x3 + x)

p

)(t+ (y3 + y)

p

)

+∑

x,y mod pδ(x,y) 6≡0(p)

t(p)

(t+ (x3 + x)

p

)(t+ (y3 + y)

p

)

= (p− 1)

x,y mod pδ(x,y)≡0(p)

(p − 1) +∑

x,y mod pδ(x,y) 6≡0(p)

(−1)

= (p− 1)

p

x,y mod pδ(x,y)≡0(p)

+p2(−1)

. (85)

We have three cases for δ(x, y) ≡ 0(p).Case 1: We need to count the number of solutions of δ1(x, y) = x− y ≡ 0, which happens p

times when x = y.

Case 2: We need to count the number of solutions of δ2(x, y) = y2 + xy + (1 + x2) ≡ 0. Bythe Quadratic Formula mod p, we have

y =−x±

√−3x2 − 4

2, (86)

which reduced to finding when −3x2 − 4 is a square. Thus, summing over x for p > 2 yields

x(p)

[1 +

(−3x2 − 4

p

)]= p+

x(p)

(−3x2 − 4

p

)

= p−(−3

p

), (87)

36

which follows from Lemma 2.3. The discriminant now is 02 − 4 · (−3) · (−4). For p ≥ 5, p doesnot divide the discriminant, hence this sum is p−

(−3p

).

Case 3: We need to be careful and remove the contribution from doubly counted tuples. Thedouble counted pairs satisfy both x = y and y2 + xy + (1 + x2) ≡ 0(p), which means that theysatisfy 3x2 + 1 ≡ 0(p), or −3x2 ≡ 1. Thus, there is a double-counted solution if and only if(−3

p

)= 1, and the number of double-counted pairs is 1 +

(−3p

).

Therefore, the total number of pairs for δ(x, y) ≡ 0(p) is

δ1(x,y)≡0

+∑

δ2(x,y)≡0

−∑

δ1(x,y)≡0;δ2(x,y)≡0

= p+ p−(−3

p

)− 1−

(−3

p

)

= 2p− 1− 2

(−3

p

). (88)

Hence, the second moment times p2 of the family equals

p2A2,F (p) = (p− 1)

[p

(2p− 1− 2

(−3

p

))+ p2(−1)

]

= p(p− 1)

(p− 1− 2

(−3

p

))

= p3 − 2p2 + p− 2(p2 − p)

(−3

p

). (89)

B Motivation Behind Studying L-Functions

As seen in the previous sections, our research revolves around L-functions. This is a commontheme in mathematics: we can take local data and make a global object, and then deducebehaviors about the local data. As an example, let’s look at the famous Fibonacci sequence;this section is included as a brief motivation for the power of generating functions by buildingon an example hopefully familiar to most readers.

The recurrence relation between Fibonacci numbers is

Fn+1 = Fn + Fn−1, (90)

and the sequence starts with

F0 = 0, F1 = 1. (91)

Once we have the recurrence relation and the initial conditions, we can in principle computeevery Fibonacci number. However, it is time consuming: to find Fn, we must first find Fi for alli < n. Binet’s Formula allows us to generate any Fibonacci number, and we can derive Binet’sFormula using the following generating function

g(x) =∑

n>0

Fnxn. (92)

37

After some algebraic manipulations, we get∑

n≥2

Fn+1xn+1 =

n≥2

Fnxn+1 +

n≥2

Fn−1xn+1

n≥3

Fnxn =

n≥2

Fnxn+1 +

n≥1

Fnxn+2

n≥3

Fnxn = x

n≥2

Fnxn + x2

n≥1

Fnxn

g(x)− F1x− F2x2 = x(g(x)− F1x) + x2g(x)

g(x) =x

1− x− x2. (93)

Although we can expand the above equation using the geometric series formula, that is a poorapproach as we would have to then expand (x + x2)n, and as the two terms are of differentdegrees in x, it would be hard to identify the coefficient of x to a given power. Instead it isbetter to use the partial fraction expansion obtained by factoring the denominator,

g(x) =x

1− x− x2=

1√5

(1+

√5

2 x

1− 1+√5

2 x−

−1+√5

2 x

1− −1+√5

2 x

). (94)

Then, using the geometric series formula, we obtain Binet’s Formula:

Fn =1√5

[(1 +

√5

2

)n

−(−1 +

√5

2

)n], (95)

which allows us to immediately jump to any Fibonacci number.

C Forms of 4th and 6th moments sums

C.1 Tools for higher moments calculations

The Dirichlet Coefficients of the elliptic curve L-function can be written as

at(p) =√p(eiθt(p) + e−iθt(p)

)= 2

√p cos(θt(p)), (96)

with θt(p) real; this expansion exists by Hasse’s theorem, which states |at(p)| ≤ 2√p. Define

symk(θ) :=sin((k + 1)θ)

sin θ. (97)

By the angle addition formula for sine,

symk(θ) = symk−1(θ) cos θ + cos(kθ). (98)

When k = 1, we havesym1(θ) = 2 cos θ. (99)

Michel [Mic] proved that ∑

t(p)

symk(θt(p)) = O(√p), (100)

where the big-Oh constant depends only on the elliptic curve and k; thus while we should havea k subscript in the implied constant, as k is fixed in our investigations we omit it for notationalsimplicity.

38

C.2 Form of 4th moments sums

A lot is known about the moments of the at(p) for a fixed elliptic curve Et. However, as weare only concerned with averages over one-parameter families, we do not need to appeal to anyresults towards the Sato-Tate distribution, and instead we can directly prove convergence of themoments on average to the moments of the semicircle. In particular, the average of the 2m-thmoments has main term 1

m+1

(2mm

)pm−1. The coefficients 1

m+1

(2mm

)are the Catalan numbers, and

the first few main terms of the even moments are p, 2p2, 5p3 and 14p4.

Lemma C.1 The average fourth moment of an elliptic surface with j(T ) non-constant has mainterm 2p2: ∑

t(p)

at4(p) = 2p3 +O(p

52 ). (101)

Proof. We have to computeat

4(p) = 16p2 cos4 θt(p). (102)

We first collect some useful trigonometry identities:

cos(2θ) = 2 cos2(θ)− 1

cos2(θ) =1

2cos(2θ) +

1

2. (103)

We use these to re-write cos4 θ in terms of quantities we can compute:

cos4(θ) =1

4cos2(2θ) +

1

2cos(2θ) +

1

4

=1

8cos(4θ) +

1

2cos(2θ) +

3

8

=1

8[sym4(θ)− sym3(θ) cos θ] +

1

2cos(2θ) +

3

8. (104)

The following expression will arise in our expansion, so we analyze it first:

−1

8sym3(θ) cos θ = −1

8

sin(4θ)

sin θcos θ

= −1

8

2 sin(2θ) cos(2θ)

sin θcos θ

= −1

8

2 · 2 sin θ cos θ cos(2θ)sin θ

cos θ

= −1

2cos2 θ cos(2θ)

= −1

2cos2 θ(2 cos2 θ − 1)

= − cos4 θ +1

2cos2 θ

16p2 ·(−1

8sym3(θ) cos θ

)= −16p2 cos4 θ + 8p2 cos2 θ

= −16p2 cos4 θ + 2p · at2(p). (105)

39

Thus

16p2 cos4 θ = 2p2sym4θ − 16p2 cos4 θ + 2p · at2(p) + 4p · at2(p)− 2p2

2 · (16p2 cos4 θ) = 2p2sym4θ + 6p · at2(p)− 2p2∑

t(p)

(16p2 cos4 θ) = p2∑

t(p)

sym4θ + 3p∑

t(p)

at2(p)− p3

t(p)

at4(p) = p2 ·O(

√p) + 3p(p2 +O(p

32 ))− p3

= 2p3 +O(p52 ), (106)

as claimed. �

C.3 Form of 6th moments sums

Lemma C.2 The average sixth moment of an elliptic surface with j(T ) non-constant has mainterm 5p3: ∑

t(p)

at6(p) = 5p4 +O(p

72 ). (107)

Proof. We have

at6(p) = 64p3 cos6 θt(p)

cos(3θ) = 4 cos3 θ − 3 cos θ

cos3 θ =cos(3θ) + 3 cos θ

4. (108)

We first expand cos6 θ:

cos6 θ =cos2 (3θ) + 9 cos2 θ + 6cos θ cos (3θ)

16

=12 cos (6θ) +

12 + 9[12 cos (2θ) +

12 ] + 6 cos θ[4 cos3 θ − 3 cos θ]

16

=10 + cos (6θ) + 9 cos (2θ) + 48 cos4 θ − 36 cos2 θ

32

=10 + cos (6θ) + 9 cos (2θ) + 48[18 cos (4θ) +

12 cos (2θ) +

38 ]− 36 cos2 θ

32

=10 + cos (6θ) + 9 cos (2θ) + 48[18 cos (4θ) +

12 cos (2θ) +

38 ]− 18 cos (2θ)− 18

32

=10 + cos (6θ) + 6 cos (4θ) + 15 cos (2θ)

32

=cos (6θ)

32+

10 + 6 cos (4θ) + 15 cos (2θ)

32

=sym6 (θ)− sym5 (θ) cos θ

32+

10 + 6 cos (4θ) + 15 cos (2θ)

32. (109)

Next we find a formula for the symmetric function that will appear:

− 1

32sym5 (θ) cos θ = − 1

32

(sin (6θ)

sin θ

)cos θ

40

= − 1

32cos θ

(3 sin (2θ)− 4 sin3 (2θ)

sin θ

)

= − 1

32cos θ

(6 sin θ cos θ − 32 sin3 θ cos3 θ

sin θ

)

= − 1

32cos θ[6 cos θ − 32 sin2 θ cos3 θ]

= − 3

16cos2 θ +

(1− cos2 θ

)cos4 θ

= − 3

16cos2 θ + cos4 θ − cos6 θ

64p3(− 1

32sym5 (θ) cos θ

)= −12p3 cos2 θ + 64p3 cos4 θ − 64p3 cos6 θ

= −64p3 cos6 θ + 4pat4 (p)− 3p2at

2 (p) . (110)

Thus

64p3 cos6 θ = 2p3sym6 (θ)− 64p3 cos6 θ + 4pat4 (p)− 3p2at

2 (p)

+ 12p3 cos (4θ) + 30p3 cos (2θ) + 20p3

64p3 cos6 θ = p3sym6 (θ) + 2pat4 (p)− 3

2p2at

2 (p) + 6p3 cos (4θ) + 15p3 cos (2θ) + 10p3.(111)

We can re-express some of the terms above in a more convenient form:

15p3 cos (2θ) = 15p3(2 cos2 θ − 1

)

= 30p3 cos2 θ − 15p3

=15

2p2at

2 (p)− 15p3 (112)

and

6p3 cos (4θ) = 6p3[sym4 (θ)− sym3 (θ) cos θ]

= 6p3[sym4 (θ)− 8 cos4 θ + 4cos2 θ]

= 6p3sym4 (θ)− 3pat4 (p) + 6p2at

2 (p) . (113)

Thus

64p3 cos6 θ = p3sym6 (θ) + 2pat4 (p)− 3

2p2at

2 (p) + 6p3sym4 (θ)− 3pat4 (p) + 6p2at

2 (p)

+15

2p2at

2 (p)− 15p3 + 10p3

t(p)

64p3 cos6 θ =∑

t(p)

[p3sym6 (θ) + 2pat4 (p)− 3

2p2at

2 (p) + 6p3sym4 (θ)− 3pat4 (p) + 6p2at

2 (p)

+15

2p2at

2 (p)− 15p3 + 10p3]. (114)

Therefore∑

t(p)

at6 (p) = p3

t(p)

sym6 (θ) + 6p3∑

t(p)

sym4 (θ)− p∑

t(p)

at4 (p) + 12p2

t(p)

at2 (p)− 5p4

41

= p3O (√p) + 6p3O (

√p)− p

(2p3 +O

(p

52

))+ 12p2

(p2 +O

(p

32

))− 5p4

= 5p4 +O(p

72

), (115)

completing the proof. �

References

[ALM] S. Arms, S. J. Miller and A. Lozano-Robledo, Constructing elliptic curves over Q(T)with moderate rank, Journal of Number Theory 123 (2007), no. 2, 388-402.

[ACFKKLMMWWYY] M. Asada, R. Chen, E. Fourakis, Y. Kim, A. Kwon, J. Lichtman, B. Mackall, S.J. Miller, E. Winsor, K. Winsor, J. Yang, and Kevin Yang, Lower-Order Biases

Second Moments of Dirichlet Coefficients in Families of L-Functions, preprint.https://web.williams.edu/Mathematics/sjmiller/public_html/math/papers/

BiasesinEllipticCurvesPaperFinal1027.pdf.

[BFMT-B] O. Barrett, F. W. K. Firk, S. J. Miller and C. Turnage-butterbaugh, From Quantum

Systems to L-Functions: Pair Correlation Statistics and Beyond, in Open Problems inMathematics (editors John Nash Jr. and Michael Th. Rassias), Springer-Verlag, 2016,pages 123–171.

[BAU] L. Bauer, Weierstrass equations: Seminar on elliptic curves and the Weil conjectures,to appear in the 4th talk in the seminar on elliptic curves and the Weil conjecturessupervised by Prof. Dr. Moritz Kerz in the summer term at the University of Regensburg(2016), http://www.mathematik.uni-regensburg.de/kerz/ss16/ausarb/bauer.pdf.

[BEW] B. Berndt, R. Evans, and K. Williams, Gauss and Jacobi Sums, Canadian MathematicalSociety Series of Monographs and Advanced Texts, Vol. 21, 1998.

[Bi] B. J. Birch, How the number of points of an elliptic curve over a fixed prime field varies,J. London Math, Soc. 43 (1968), 57-60.

[Da1] H. Davenport, The Higher Arithmetic: An Introduction to the Theory of Numbers, 7thedition, Cambridge University Press, Cambridge, 1999.

[Da2] H. Davenport, Multiplicative Number Theory, 2nd edition, revised by H. Montgomery,Graduate Texts in Mathematics, Vol. 74, Springer-Verlag, New York, 1980.

[Du] A. Dujella, History of elliptic curves rank records, https://web.math.pmf.unizg.hr/

~duje/tors/rankhist.html.

[FM] F. W. K. Firk and S. J. Miller, Nuclei, Primes and the Random Ma-

trix Connection, Symmetry 1 (2009), 64–105; doi:10.3390/sym1010064.http://www.mdpi.com/2073-8994/1/1/64.

[HKLL-RM] T. Hammonds, S. Kim, B. Logsdon, A. Lozano-Robledo and S. J. Miller, Rank and Bias

in Families of Hyperelliptic Curves via Nagao’s Conjecture, to appear in the Journal ofNumber Theory.

[KS2] N. Katz and P. Sarnak, Zeros of zeta functions and symmetries, Bull. AMS 36 (1999),1–26.

[KN1] M. Kazalicki and B. Naskrecki, Second moments and the bias conjecture for the family

of cubic pencils (2020), preprint, https://arxiv.org/pdf/2012.11306.

[KN2] M. Kazalicki and B. Naskrecki, Diophantine triples and K3 surfaces (2021), preprint,https://arxiv.org/pdf/2101.11705.

[KS1] N. Katz and P. Sarnak, Random Matrices, Frobenius Eigenvalues and Monodromy, AMSColloquium Publications, Vol. 45, AMS, Providence, RI, 1999.

[Kn] A. Knapp, Elliptic Curves, Princeton University Press, Princeton, NJ, 1992.

[MMRW] B. Mackall, S. J. Miller, C. Rapti and K. Winsor, Lower-Order Biases in Elliptic Curve

Fourier Coefficients in Families, to appear in the Conference Proceedings of the Work-shop on Frobenius distributions of curves at CIRM in February 2014.

42

[MMRSY] D. Mehrle, S. J. Miller, T. Reiter, J. Stahl and D. Yott, Constructing families of

moderate-rank elliptic curves over number fields, Minnesota Journal of UndergraduateMathematics 2 (2016–2017), 11 pages.

[Mic] P. Michel, Rang moyen de famille de courbes elliptiques et lois de Sato-Tate, Monatsheftefur Mathematik 120 (1995), 127–136.

[Mi1] S. J. Miller, 1- and 2-level densities for families of elliptic curves: evidence

for the underlying group symmetries, Princeton University, PhD thesis (2002).http://web.williams.edu/Mathematics/sjmiller/public_html/math/thesis/

SJMthesis_Rev2005.pdf.

[Mi2] S. J. Miller, 1- and 2-level densities for families of elliptic curves: evidence for the

underlying group symmetries, Compositio Mathematica 140 (2004), no. 4, 952–992.

[Mi3] S. J. Miller, Variation in the number of points on elliptic curves and applications to

excess rank, C. R. Math. Rep. Acad. Sci. Canada 27 (2005), no. 4, 111–120.

[MT-B] S. J. Miller and R. Takloo-Bighash, An Invitation to Modern Number Theory, PrincetonUniversity Press, 2006.

[MWe] S. J. Miller and Y. Weng, Biases in Moments of Dirichlet Coefficients of Elliptic Curve

Families (2021), http://arxiv.org/abs/2102.02702.

[Na1] K. Nagao, Construction of high-rank elliptic curves, Kobe J. Math. 11 (1994), 211–219.

[Na2] K. Nagao, Q(T )-rank of elliptic curves and certain limit coming from the local points,Manuscr. Math. 92 (1997), 13–32.

[NZM] I. Niven, H. Zuckerman, and H. Montgomery, An Introduction to the Theory of Numbers,5th edition, John Wiley & Sons, New York, 1991.

[RG] R. Rivest as the lecturer and D. Ghosh as the scribe, 6.857 Computer and Network Se-

curity, Lecture 8, http://web.mit.edu/6.857/OldStuff/Fall97/lectures/lecture8.pdf.

[RoSi] M. Rosen and J. Silverman, On the rank of an elliptic surface, Invent. Math. 133 (1998),43–67.

[Si0] J. Silverman, An Introduction to the Theory of Elliptic Curves, to appear in the SummerSchool on Computational Number Theory and Applications to Cryptography at Univer-sity of Wyoming in July 2006. https://www.math.brown.edu/~jhs/Presentations/

WyomingEllipticCurve.pdf.

[ST] J. Silverman and J. Tate, Rational Points on Elliptic Curves, Springer-Verlag, NewYork, 1992.

[Su] A. Sutherland, Point Counting, https://ocw.mit.edu/courses/mathematics/

18-783-elliptic-curves-spring-2015/lecture-notes/MIT18_783S15_lec8.pdf.

[Var] A. Varilly, Dirichlet’s Theorem on Arithmetic Progressions, https://math.rice.edu/

~av15/Files/Dirichlet.pdf.

[WAZ] R. Wazir, Arithmetic on elliptic threefolds, Composito Mathematica 140 (2004), 567-580.

[Wu] J. Wu, Biases in First and Second Moments of the Dirichlet Coefficients in One-

and Two-Parameter Families of Elliptic Curves (student research project advised byS. J. Miller), https://web.williams.edu/Mathematics/sjmiller/public_html/math/papers/WuBiasesinEllipticCurvesPaperFinal81.pdf.

Steven J. Miller

Williams College880 Main StWilliamstown, MA 01267E-mail: [email protected], [email protected]

43

Yan Weng

Peddie School201 S Main StHightstown, NJ 08520E-mail: [email protected]

Received: April 31, 2017 Accepted: June 31, 2017Communicated by Some Editor

44