Integer Sequences Related to Chemistry
N. J. A. Sloane (a) and Parthasarathy Nambi (b)
(a) AT&T Shannon Lab, Florham Park, NJ 07932 USA
(b) Bellevue Community College, Bellevue, WA 98004 USA
ABSTRACT
The aim of this poster is to inform ACS members about the On-
Line Encyclopedia of Integer Sequences (or OEIS) [1], which is a
freely accessible database containing information about 120,000 number
sequences. This poster will describe just a few of the many entries that
are of interest to chemists.
1. The OEIS
Entries in the On-Line Encyclopedia of Integer Sequences (or OEIS) give
the first 100 (or sometimes 10,000) terms of the sequences, their definitions, formulas,
computer programs to generate them, references to the literature and to the Internet,
graphs and other illustrations. Each sequence has a unique identification number:
e.g. A000055 gives the number of trees on n nodes.
If you come across a sequence in your work (1, 3, 17, 40, 102, . . . , say) and wish to
identify it, the OEIS is the place to look.
The OEIS is maintained by N.J.A.S. and has been in existence in various forms for
over 40 years. The web site is used by thousands of people each day. New sequences
are added at a rate of over 12,000 each year.
Only a selection of some sequences of interest to chemists will be mentioned here.
Our convention is that a(n) usually denotes the nth term of the sequence under
discussion, and we will display the first ten or so terms
a(1), a(2), a(3), . . . ,
although of course the OEIS gives many more terms.
2. Three Basic Sequences
A000055: the number of trees on n nodes (see Fig. 1):
1, 1, 1, 2, 3, 6, 11, 23, 47, 106, 235, 551, 1301, 3159, . . . .
A000088: the number of graphs on n nodes (see Fig. 2):
1, 2, 4, 11, 34, 156, 1044, 12346, 274668, 12005168, . . . .
(The OEIS contains literally hundreds of sequences related to those two.)
A000108: the Catalan numbers, $C(n) = \frac{1}{n+1}\binom{2n}{n}$:
1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786, . . . .
This is the single most frequently looked-up sequence in the OEIS, with over a
hundred different interpretations. For example, C(n) is the number of ways to insert
n pairs of parentheses in a string of n+1 letters. E.g. for n = 3 there are five ways:
((AB)(CD)), (((AB)C)D), ((A(BC))D), (A((BC)D)), (A(B(CD))).
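As a small check on the formula and on the parenthesization example, the following Python sketch evaluates C(n) from the binomial expression and lists the bracketings of a short string. The function names are chosen here for illustration and do not come from the OEIS entry.

```python
from math import comb

def catalan(n):
    """C(n) = binom(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)

def parenthesizations(letters):
    """All ways to fully bracket a product of the given letters."""
    if len(letters) == 1:
        return [letters]
    results = []
    # Split the string into a left and a right factor at every position.
    for split in range(1, len(letters)):
        for left in parenthesizations(letters[:split]):
            for right in parenthesizations(letters[split:]):
                results.append("(" + left + right + ")")
    return results

print([catalan(n) for n in range(1, 12)])
print(parenthesizations("ABCD"))
```

For four letters the second call returns the five bracketings listed above, in a different order.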
Figure 1: The three trees with 5 nodes: illustrating a(5) = 3 in A000055
Figure 2: The four graphs with 3 nodes: illustrating a(3) = 4 in A000088
3. Sequences from the Beginning of the Graphical Notation
The graphical notation for chemical structures began in the middle of the nineteenth
century with the work of Crum Brown. This led Arthur Cayley to initiate
graph theory as a part of mathematics. The following four sequences are typical of
those studied by Cayley around 1874 [2], [3].
In [2](b), Cayley’s aim was to determine the number of alkanes with structure
C_nH_{2n+2}, ignoring stereoisomers. If the hydrogen atoms are ignored we get an n-node
unlabeled tree in which every node has degree at most 4. Ignoring stereoisomers means
that the children of a node are unordered. Cayley divided these trees into two classes,
those with a unique central node (“centered” trees) and those with two equally central
nodes (“bicentered” trees). The corresponding sequences are A000022, A000200
and A000602 (see Fig. 3):
n            1   2   3   4   5   6   7   8   9   10   11   12   13    14    15  ...
centered     1   0   1   1   2   2   6   9  20   37   86  181  422   943  2223  ...
bicentered   0   1   0   1   1   3   3   9  15   38   73  174  380   915  2124  ...
alkanes      1   1   1   2   3   5   9  18  35   75  159  355  802  1858  4347  ...
In fact Cayley’s calculations were incorrect above n=11, and the above values are
taken from [3], where Polya theory was used to determine exact formulas for these
sequences.
Figure 3: Illustration of initial terms of A000022, A000200 and A000602, from
Henry Bottomley
In [2](a), Cayley attempted to determine the number of alkyl radicals with structure
C_nH_{2n+1}, again ignoring stereoisomers. This is sequence A000598:
1, 1, 2, 4, 8, 17, 39, 89, 211, 507, 1238, 3057, 7639, 19241, . . . .
(Again, Cayley’s results were slightly incorrect.)
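The alkyl radical counts can be reproduced with a short computation. The Python sketch below does not follow Cayley's own method. It assumes the standard functional equation for rooted trees whose root carries three unordered branches, A(x) = 1 + x (A(x)^3 + 3 A(x) A(x^2) + 2 A(x^3)) / 6, which comes from the cycle index of the symmetric group on three objects. The function name is hypothetical.

```python
def alkyl_counts(n_max):
    """Return [a(0), ..., a(n_max)], where a(n) counts alkyl radicals C_nH_{2n+1}
    (stereoisomers ignored). a(0) = 1 stands for a bare hydrogen atom."""
    a = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        m = n - 1  # the root carbon uses one x; its three branches share m carbons
        # Coefficient of x^m in A(x)^3: ordered triples (i, j, k) with i + j + k = m.
        cube = sum(a[i] * a[j] * a[m - i - j]
                   for i in range(m + 1) for j in range(m + 1 - i))
        # Coefficient of x^m in A(x) * A(x^2): pairs with i + 2j = m.
        mixed = sum(a[m - 2 * j] * a[j] for j in range(m // 2 + 1))
        # Coefficient of x^m in A(x^3): nonzero only when 3 divides m.
        third = a[m // 3] if m % 3 == 0 else 0
        a[n] = (cube + 3 * mixed + 2 * third) // 6
    return a

print(alkyl_counts(14)[1:])
```

Dropping the a(0) term gives the values displayed for A000598 above.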
4. The 1930’s: Henze and Blair
In the 1930’s, Henry Henze and Charles Blair, chemists at the Univ. of Texas,
wrote a long series of papers in J. Amer. Chem. Soc. [4], [5], [6] in which they
found recurrence relations for the numbers of isomers in various families of chemical
compounds. For example, they gave recurrence relations for Cayley’s sequences
A000602 and A000598 mentioned above, as well as for the analogous sequences
when stereoisomers are taken into account. Thus we have A000628 (alkanes C_nH_{2n+2},
taking stereoisomers into account):
1, 1, 1, 2, 3, 5, 11, 24, 55, 136, 345, 900, 2412, 6563, . . . ,
and A000625 (alkyl radicals C_nH_{2n+1}, taking stereoisomers into account):
1, 1, 2, 5, 11, 28, 74, 199, 551, 1553, 4436, 12832, . . . .
Henze and Blair’s recurrences are usually extremely complicated, and it has recently
been discovered that there are occasional errors in their tables. Of course their
calculations were carried out by hand. Polya’s approach, described in the next section,
proved to be a much simpler and more powerful method once the necessary
mathematical machinery had been developed.
5. Polya’s Enumeration Theory
A major breakthrough occurred around 1936 when the mathematician George
Polya published two papers [7] in which he developed a general theory of enumeration,
applicable to a wide range of problems in chemistry and combinatorics. His work had
been anticipated in part by Redfield [9], but nevertheless this method is known today
as “Polya’s Enumeration Theory”.
The idea is to use the representation theory of permutation groups and their “cycle
indices” to obtain a far-reaching generalization of “Burnside’s Lemma”. This lemma
states that if a set S of objects is permuted by a group G of permutations, then
the number of equivalence classes (“orbits”) of objects in S is equal to the average
number of objects that are fixed under each permutation in G.
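To make the lemma concrete, here is a minimal Python sketch for one small case: S is the set of two-colorings of n beads on a ring and G is the group of rotations. The example and all function names are chosen here for illustration and are not taken from Polya's papers. The orbits are counted twice, once by listing them directly and once by averaging fixed colorings.

```python
from itertools import product

def rotations(n):
    """The cyclic group of order n, as permutations of bead positions 0..n-1."""
    return [tuple((i + shift) % n for i in range(n)) for shift in range(n)]

def count_orbits_directly(n, colors=2):
    """Count equivalence classes by visiting every coloring and marking its orbit."""
    group = rotations(n)
    seen, orbits = set(), 0
    for coloring in product(range(colors), repeat=n):
        if coloring in seen:
            continue
        orbits += 1
        for g in group:
            seen.add(tuple(coloring[g[i]] for i in range(n)))
    return orbits

def count_orbits_burnside(n, colors=2):
    """Average, over the group, the number of colorings fixed by each permutation."""
    group = rotations(n)
    fixed_total = 0
    for g in group:
        fixed_total += sum(
            1 for coloring in product(range(colors), repeat=n)
            if all(coloring[g[i]] == coloring[i] for i in range(n))
        )
    return fixed_total // len(group)

print([count_orbits_burnside(n) for n in range(1, 9)])
```

Both functions give the same counts, and for two colors these agree with the terms of A000031 for n = 1 to 8.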
For an excellent introduction to Polya theory, see the article by de Bruijn [10].
In his two papers, Polya gives a large number of examples from chemistry. In
particular, he gives generating functions for several of the sequences mentioned above.
To illustrate, here is the beginning of Table I from [7](a), with the names translated
from German (and of course with the sequence numbers added). It is interesting to
note that he thanks his nephew, the chemical engineer J. Polya, for collaborating in
producing the tables.
Table 1: Number of structural isomers for homology series and alkyl-derivatives, from
Polya, Zeit. f. Kristall., 93 (1936), 415–443

n                   1    2    3    4    5    6   Name                        Sequence
C_nH_{2n+2}         1    1    1    2    3    5   Paraffins                   A000602
C_nH_{2n+1}X        1    1    2    4    8   17   Alkyls                      A000598
C_nH_{2n}XY         1    2    5   12   31   80   Di-substituted paraffins    A000635
C_nH_{2n}X_2        1    2    4    9   21   52   Di-substituted paraffins    A000636
C_nH_{2n-1}XYZ      1    4   13   42  131  402   Tri-substituted paraffins   A000640
C_nH_{2n-1}X_2Y     1    3    9   27   81  240   Tri-substituted paraffins   A022014
C_nH_{2n-1}X_3      1    2    5   14   39  109   Tri-substituted paraffins   A000641
...
6. Magic Numbers I: Polyhedral Clusters
In cluster science it is well-known that clusters containing certain special numbers
of atoms occur more frequently than others. These special numbers are often called
magic numbers, and a search on the Internet for “magic number” and “chemistry”
will produce over 100,000 references.
Many of these sequences of magic numbers are not well-defined mathematically.
Probably the most famous example comes from physics: this is the sequence A018226
2, 8, 20, 28, 50, 82, 126.
Atoms with one of these numbers of protons or neutrons in their nuclei are considered
to be stable.
In the 1980’s Boon Teo and N.J.A.S. carried out a systematic study of magic
numbers from a geometrical point of view [11], [12], [13].
In [11](a) we analyze clusters that have the shape of one of the Platonic or
Archimedean solids (extending earlier work by Buckminster Fuller, Coxeter and others).
Figure 4: Cluster in shape of truncated octahedron with each edge subdivided into
two equal segments. This cluster contains G_2 = 201 atoms, of which S_2 = 122 lie on
the surface (or skin). From Teo and Sloane, Inorg. Chem., 24 (1985), 4545–4558
We provide explicit formulas for the analogous magic numbers when the edges
of the polygons or polyhedra are subdivided into n equal parts. For the truncated
octahedron the formulas for S_n, the number of surface points, and G_n, the total
number of points (the magic numbers), are:
$S_n = 30n^2 + 2 \quad (n \ge 1), \qquad G_n = 16n^3 + 15n^2 + 6n + 1.$
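The two formulas for the truncated octahedron are easy to evaluate. The Python sketch below uses function names chosen here for illustration and prints the first few values for comparison with the cluster in Figure 4.

```python
def surface_points(n):
    """S_n = 30 n^2 + 2 for n >= 1; n = 0 is the single-atom cluster."""
    return 1 if n == 0 else 30 * n * n + 2

def total_points(n):
    """G_n = 16 n^3 + 15 n^2 + 6 n + 1, the magic numbers."""
    return 16 * n**3 + 15 * n**2 + 6 * n + 1

for n in range(0, 5):
    print(n, surface_points(n), total_points(n))
```

For n = 2 this gives S_2 = 122 and G_2 = 201, the values quoted in the caption of Figure 4.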
Reference [11](a) contains a large number of such cluster sequences. The following
table shows a small sample.
Table 2: Number of Surface Points S_n and Total Number of Points G_n for Various
Archimedean and Other Figures (from Teo and Sloane, Inorg. Chem., 24 (1985),
4545–4558)

Polyhedron              n     0    1    2    3     4     5     6     7     8      9     10   Sequence
Truncated tetrahedron   S_n   1   16   58  128   226   352   506   688   898   1136   1402   A005905
                        G_n   1   16   68  180   375   676  1106  1688  2445   3400   4576   A005906
Cuboctahedron           S_n   1   12   42   92   162   252   362   492   642    812   1002   A005901
                        G_n   1   13   55  147   309   561   923  1415  2057   2869   3871   A005902
Truncated octahedron    S_n   1   32  122  272   482   752  1082  1472  1922   2432   3002   A005903
                        G_n   1   38  201  586  1289  2406  4033  6266  9201  12934  17561   A005910
...
7. Magic Numbers II: Close-Packed Spherical Clusters
In [11](b), [12], [13] Teo and N.J.A.S. give a similar analysis for the sizes of spherical
clusters that can be found in the hexagonal lattice in the plane, the simple cubic,
face-centered cubic (fcc), and body-centered cubic (bcc) lattices in three dimensions,
as well as the hexagonal close-packing (hcp) and diamond structures. Generating
functions for these cluster series are simply the appropriate “theta series” of the lattice
with respect to the particular point used as the center of the cluster. Again some
of these results were already known.
For example, consider clusters centered at a lattice point in the fcc lattice. Let
G_n (the magic numbers) denote the total number of atoms in the cluster of radius
√(2n), and let S_n denote the number of atoms on the surface of the cluster. Then we
obtain the cluster series (A004015, A119869):
S_n: 1, 12, 6, 24, 12, 24, 8, 48, 6, 36, 24, 24, 24, 72, 0, 48, . . . ,
G_n: 1, 13, 19, 43, 55, 79, 87, 135, 141, 177, 201, 225, 249, . . . .
On the other hand, if the clusters are centered at an octahedral-shaped hole in the
fcc lattice, the analogous cluster series are (A005887, A119874):
S_n: 6, 8, 24, 0, 30, 24, 24, 0, 48, 24, 48, 0, 30, 32, 72, 0, 48, . . . ,
G_n: 6, 14, 38, 38, 68, 92, 116, 116, 164, 188, 236, 236, 266, . . . .
The three references contain a large number of cluster series of this type.
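The fcc cluster series above can be recovered by counting lattice points directly. The Python sketch below is a brute-force count, not the theta-series method of the references. It assumes that the fcc lattice is modelled as the integer points (x, y, z) with x + y + z even, so that nearest neighbours are at distance √2. In that model an octahedral hole sits at (1, 0, 0), and centering on it amounts to counting the points with x + y + z odd. The function name and its parameters are chosen here for illustration.

```python
from itertools import product

def cluster_series(terms, parity):
    """Shell sizes S_n and running totals G_n for a cluster cut from the fcc lattice.

    parity = 0: cluster centered at a lattice point (shells at x^2+y^2+z^2 = 2n).
    parity = 1: cluster centered at an octahedral hole (shells at x^2+y^2+z^2 = 2n+1).
    """
    max_norm = 2 * (terms - 1) + parity
    reach = int(max_norm ** 0.5) + 1      # no coordinate can exceed sqrt(max_norm)
    shells = [0] * terms
    for x, y, z in product(range(-reach, reach + 1), repeat=3):
        norm = x * x + y * y + z * z
        if (x + y + z) % 2 == parity and norm <= max_norm:
            shells[(norm - parity) // 2] += 1
    totals, running = [], 0
    for size in shells:
        running += size
        totals.append(running)
    return shells, totals

print(cluster_series(16, 0))
print(cluster_series(17, 1))
```

The first call reproduces the series for clusters centered at a lattice point, and the second the series for clusters centered at an octahedral hole.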
8. Coordination Sequences
The cluster series described in the previous section give the numbers of atoms in
balls of successive radii 0,√2,√4,√6,√8, . . . (say) around the central point.
Alternatively, one may classify atoms according to the number of bonds in the
shortest path to the central atom. The zeroth shell consists of a single atom, the
number in the next shell is the conventional coordination number, and in general
the nth shell consists of those atoms that are bonded to atoms in shell n−1 (and which
have not already been counted). The sequence that gives the numbers of atoms in the
successive shells is the coordination sequence of the structure, a term introduced
in 1971 by Brunner and Laves [14].
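The shell-by-shell definition translates directly into a breadth-first search. The Python sketch below is only an illustration: the two nets used (the planar square net and the simple cubic lattice) are chosen here because their bonds are easy to write down, and the function names are hypothetical. A zeolite framework would need its own neighbor function.

```python
def coordination_sequence(neighbors, start, shells):
    """Sizes of shells 0 .. shells-1 around `start`, found by breadth-first search.

    `neighbors(atom)` must return the atoms bonded to `atom`."""
    seen = {start}
    frontier = [start]
    sizes = []
    for _ in range(shells):
        sizes.append(len(frontier))
        next_frontier = []
        for atom in frontier:
            for other in neighbors(atom):
                if other not in seen:      # atoms already counted are skipped
                    seen.add(other)
                    next_frontier.append(other)
        frontier = next_frontier
    return sizes

def square_net_neighbors(atom):
    x, y = atom
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def simple_cubic_neighbors(atom):
    x, y, z = atom
    return [(x + 1, y, z), (x - 1, y, z), (x, y + 1, z),
            (x, y - 1, z), (x, y, z + 1), (x, y, z - 1)]

print(coordination_sequence(square_net_neighbors, (0, 0), 6))
print(coordination_sequence(simple_cubic_neighbors, (0, 0, 0), 6))
```

In both cases the second term is the conventional coordination number (4 and 6 respectively).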
Coordination sequences were originally used to investigate the topology of frameworks
and to help specify the positions of atoms. They are now routinely used to
characterize crystallographic structures.
In [15], Ralf Grosse-Kunstleve, G. O. Brunner and N.J.A.S. computed the coordination
sequences for all the zeolites in the Meier-Olson Atlas of Zeolite Structure
Types. There are almost 400 sequences, all of which are now in the OEIS, often
with recurrences and explicit formulas.
The following is a more typical, three-dimensional example. This coordination
sequence arises in the zeolite structures AFG, CAN, LIO and LOS (sequence
A008013):
1, 4, 10, 20, 34, 54, 78, 104, 134, 168, 210, 256, 302, 352, . . . .
The nth term, a(n), is given for n ≥ 1 by five different expressions (with a(0) = 1):
a(5m) = 52m^2 + 2,
a(5m+1) = 52m^2 + 22m + 4,
a(5m+2) = 52m^2 + 42m + 10,
a(5m+3) = 52m^2 + 62m + 20,
a(5m+4) = 52m^2 + 82m + 34.
This kind of formula is quite typical for coordination sequences of zeolites, although
many of the formulas are much more complicated than this one. (Technically, formulas
like this, where a(n) depends on the residue of n modulo some number, are called
PORC functions, which stands for “Polynomial On Residue Classes”!)
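A PORC function is simple to implement: select a polynomial according to the residue of n, then evaluate it. The Python sketch below does this for the five expressions given for A008013. The table and function names are chosen here for illustration, and n = 0 is treated as a special case because the central atom is not covered by the polynomials.

```python
# (quadratic, linear, constant) coefficients in m, indexed by the residue r = n mod 5
PORC_COEFFS = {
    0: (52, 0, 2),
    1: (52, 22, 4),
    2: (52, 42, 10),
    3: (52, 62, 20),
    4: (52, 82, 34),
}

def coordination_term(n):
    """a(n) for A008013, writing n = 5m + r with 0 <= r < 5."""
    if n == 0:
        return 1                      # the zeroth shell is the central atom
    m, r = divmod(n, 5)
    quadratic, linear, constant = PORC_COEFFS[r]
    return quadratic * m * m + linear * m + constant

print([coordination_term(n) for n in range(14)])
```

The printed values match the terms of the sequence listed above.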
Figure 5: Example: Coordination sequence of a two-dimensional net, with successive
shells indicated by different colors. From R. W. Grosse-Kunstleve et al., Poster,
Sixteenth European Crystallographic Meeting, Lund, Sweden, August 6–11, 1995
9. Number of Periodic Close Packings; Stacking Sequences
Maximally dense sphere packings can be constructed by stacking layers, where
each layer consists of an infinite sheet of balls in the hexagonal lattice arrangement.
Once one layer is in position (call it a type A layer), each of the remaining layers
can occupy one of three positions (call them A, B, C), where adjacent layers must
occupy different positions. Such packings are generally called
Barlow packings. If the layers alternate . . .A, B, C, A, B, C, A,. . . with period
three, we obtain the fcc lattice; whereas if they alternate . . .A, B, A, B, A, B, A,. . .
with period two, we obtain the hcp structure.
Many authors have studied enumeration problems arising from such packings.
For example, T. J. McLarnan [16] used Polya theory to find the number of Barlow
packings in which the layers alternate with period exactly n (A011768):
0, 1, 1, 1, 2, 3, 6, 7, 16, 21, 43, 63, 129, 203, 404, 685, 1343, . . . .
The entries a(2) = a(3) = 1 correspond to the hcp and fcc, respectively.
In the same paper McLarnan enumerates many related structures. For example,
the number of ZnS polytypes with period n (A011957) is:
0, 1, 1, 1, 1, 2, 3, 6, 10, 18, 31, 59, 105, 198, 365, 688, 1285, . . . .
The entry a(3) = 1 corresponds to the cubic sphalerite structure.
For other recent work on enumerating stacking sequences see Thompson and
Downs [17], Lord et al. [18], Estevez-Rams et al. [19]. Stacking sequences are also
studied extensively in metallurgy.
10. Benzenoids, Polyhexes, Catafusenes
All the sequences mentioned so far have known generating functions, which make
it possible to compute as many terms as one wishes. In contrast, the sequences here
are typical of really hard combinatorial problems, where one cannot do much better
than a brute force enumeration.
A typical example of this type of problem is the enumeration of organic compounds
built up from benzene rings. In mathematical terms we are asking for the
number of different planar figures that can be built up from n hexagons. These are
called hexagonal animals or polyhexes. With 1, 2 or 3 hexagons the numbers are
respectively 1, 1 and 3, as shown in Fig. 6.
The sequence of numbers of polyhexes (A000228) is:
1, 1, 3, 7, 22, 82, 333, 1448, 6572, 30490, 143552, 683101, . . . .
The initial enumeration to n = 8 was made by David Klarner. This sequence has been
extended by several authors, most recently by Joseph Myers [20], who has computed
the first 20 terms.
The analogous sequence with squares instead of hexagons is much older (and just
as hard). This is the question of counting square animals or polyominoes, the
obvious generalization of dominoes. The sequence A000105 begins
1, 1, 2, 5, 12, 35, 108, 369, 1285, 4655, 17073, 63600, . . . .
It has been computed to 28 terms by Tomas Oliveira e Silva [21].
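A brute-force enumeration of this kind can be sketched in a few lines of Python. The code below is a naive approach chosen here for illustration, not the method used in [20] or [21]: it grows each shape by one square at a time and keeps one canonical representative of every shape under rotation and reflection. It becomes slow after about ten squares, which gives some feeling for why the long computations cited above are hard.

```python
def normalize(cells):
    """Translate a set of cells so that its smallest x and y are both 0."""
    min_x = min(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    return frozenset((x - min_x, y - min_y) for x, y in cells)

def canonical(cells):
    """Choose one representative among the 8 rotations and reflections of a shape."""
    variants = []
    current = list(cells)
    for _ in range(4):
        current = [(y, -x) for x, y in current]                      # quarter turn
        variants.append(normalize(current))
        variants.append(normalize([(-x, y) for x, y in current]))    # mirror image
    return min(variants, key=sorted)

def count_polyominoes(n_max):
    """Return [a(1), ..., a(n_max)], the numbers of polyominoes with 1..n_max squares."""
    shapes = {canonical([(0, 0)])}
    counts = [len(shapes)]
    for _ in range(n_max - 1):
        grown = set()
        for shape in shapes:
            for x, y in shape:
                for cell in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if cell not in shape:
                        grown.add(canonical(shape | {cell}))
        shapes = grown
        counts.append(len(shapes))
    return counts

print(count_polyominoes(7))
```

The output can be compared with the initial terms of A000105 given above. The same idea would work for polyhexes with a hexagonal coordinate system and its twelve symmetries.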
Figure 6: Polyhexes with 1, 2 or 3 hexagons, corresponding to the benzene, naphthalene,
anthracene, phenalene and phenanthrene structures, respectively.
Figure 7: Polyominoes with 1 through 5 squares, illustrating the terms
a(1) = a(2) = 1, a(3) = 2, a(4) = 5, a(5) = 12 of A000105 (from Eric Weisstein’s
World of Mathematics: Polyomino)
11. Other Chemical Sequences
The above examples are just a few of the 120,000 sequences currently in the OEIS.
We apologize to all those people (Balaban, Harary, Lederberg, Losanitsch, ...) whose
names we did not mention. If we had had more space we would also have included
the following topics.
R. W. Robinson’s work on enumerating graphs using extensions of Polya’s
theory (86 entries in MathSciNet).
Sven Cyvin, Børg Cyvin, Jon Brunvoll and others: A very long series
of papers (e.g. [22]) on the enumeration of benzenoid hydrocarbons, catacondensed
hydrocarbons, and other topologies of molecular graphs.
Fullerenes: Enumerated by P. W. Fowler and D. E. Manolopoulos [23], G.
Brinkmann and A. W. M. Dress [24] and others.
Self-Avoiding Walks on Lattices. A major area of research in physics, with an
enormous number of papers and with many sequences in the OEIS. See B. D. Hughes
[25] for an overview.
Dissections and Tilings. Typical questions are: how many ways are there to
dissect an n-sided polygon into triangles (A000207), or how many ways are there to
tile an n × n square with dominoes (A004003)?
Necklaces: How many n-bead necklaces can be made using beads of two colors?
Sequences A000013, A000029, A000031 give typical answers (depending on what
symmetries are allowed).
12. Conclusion
The goal of the OEIS is to include information about all interesting number
sequences. At the present time over 1,000 of the 120,000 entries have their origin in
chemistry, and chemists will find much material of interest.
To find the OEIS, simply Google “sequences”.
If you come across a number sequence that is not in the OEIS, especially from
the chemical literature, please send it in to the database. There is a special web page
for contributing a new sequence or a comment. The reasons for doing this are that
the next person who comes across the sequence will be grateful for the reference, and
your name will be preserved in the OEIS as the person who contributed the sequence.
(You need not be the author of the sequence to send it in.)
13. Postscript: Puzzles
Another use for the OEIS is to help people do well on quizzes. Can you find the
next term in the following? You know where to find the answers!
(1): 61,21,82,43,3,64,24,? (A087409)
(2): 1,3,7,12,18,26,35,45,56,69,83,98,114,131,150,? (A005228)
(3): 4,6,7,9,10,11,12,14,15,16,17,18,19,20,22,23,24,25,? (A001690)
(4): 1,2,4,8,16,22,26,38,62,74,102,104,108,116,122,126,? (A063108)
(5): 679,378,168,48,? (A121105)
(6): 2,4,6,30,32,34,36,40,42,44,46,50,52,54,56,60,62,64,66,2000,?
(A006933, the eban numbers).
(7): 2,12,1112,3112,132112,1113122112,311311222112,? (A006751)
(8): 2,3,3,5,10,13,39,43,172,177,885,? (A019460)
(9): 1,2,2,3,3,4,4,4,5,5,5,6,6,6,6,7,7,7,7,8,8,8,8,9,? (A001462)
References
[1] N. J. A. Sloane, The On-Line Encyclopedia of Integer Sequences, published electronically
at www.research.att.com/~njas/sequences/, 1996–2006.
[2] A. Cayley, (a) Phil. Mag., 67 (1874), 444–447; (b) Chem. Ber., 8 (1875), 1056–1059;
(c) Rep. Brit. Assoc. Adv. Sci., 45 (1875), 257–305.
[3] E. M. Rains and N. J. A. Sloane, J. Integer Sequences, 2 (1999), #99.1.1.
[4] C. M. Blair and H. R. Henze, J. Amer. Chem. Soc., 54 (1932), 1098–1105, 1538–1545.
[5] D. D. Coffman, C. M. Blair and H. R. Henze, J. Amer. Chem. Soc., 55 (1933), 252–253.
[6] H. R. Henze and C. M. Blair, J. Amer. Chem. Soc., 53 (1931), 3042–3046, 3077–3085;
55 (1933), 680–686; 56 (1934), 157.
[7] G. Polya, (a) Zeit. f. Kristall., 93 (1936), 415–443; (b) Acta Math., 68 (1937), 145–254
[English translation in [8]].
[8] G. Polya and R. C. Read, Combinatorial Enumeration of Groups, Graphs and Chemical
Compounds, Springer, 1987.
[9] J. H. Redfield, Amer. J. Math., 49 (1927), 433–455.
[10] N. G. de Bruijn, in Applied Combinatorial Mathematics, ed. E. F. Beckenbach, Wiley,
1964.
[11] B. K. Teo and N. J. A. Sloane, (a) Inorg. Chem., 24 (1985), 4545–4558; (b) 25 (1986),
2315–2322.
[12] N. J. A. Sloane and B. K. Teo, J. Chem. Phys., 83 (1985), 6520–6534.
[13] N. J. A. Sloane, J. Math. Phys., 28 (1987), 1653–1657.
[14] G. O. Brunner and F. Laves, Wiss. Zeitschr. Techn. Univ. Dresden, 20 (1971), 387–390.
[15] R. W. Grosse-Kunstleve, G. O. Brunner and N. J. A. Sloane, Acta Cryst., A52 (1996),
879–889.
[16] T. J. McLarnan, Zeitschr. f. Kristall., 155 (1981), 269–291.
[17] R. M. Thompson and R. T. Downs, Acta Cryst., B57 (2001), 761–771; B58 (2002),
153.
[18] E. A. Lord et al., Phil. Mag., A82 (2002), 255–268.
[19] E. Estevez-Rams et al., Acta Cryst., A61 (2005), 201–208.
[20] Joseph Myers, Polyomino, Polyhex and Polyiamond Tiling, published electronically at
www.srcf.ucam.org/~jsm28/tiling/, 2005 (the entries in [1] are more up-to-date).
[21] Tomas Oliveira e Silva, Animal Enumerations on the {4,4} Euclidean Tiling, published
electronically at http://www.ieeta.pt/~tos/animals/a44.html.
[22] J. Brunvoll, S. J. Cyvin and B. N. Cyvin, J. Math. Chem., 21 (1997), 193–196.
[23] P. W. Fowler and D. E. Manolopoulos, An Atlas of Fullerenes, Cambridge Univ. Press,
1995.
[24] G. Brinkmann and A. W. M. Dress, J. Algorithms, 23 (1997), 345–358.
[25] B. D. Hughes, Random Walks and Random Environments, Oxford, 2 vols., 1995–1996.