13
Netlib and NA-Net: Building A Scientific Computing Community Jack Dongarra Gene H. Golub Eric Grosse Cleve Moler IEEE Annals of the History of Computing, Volume 30, Number 2, April-June 2008, pp. 30-41 (Article) Published by IEEE Computer Society For additional information about this article Access Provided by University of Tennessee @ Knoxville at 06/21/11 2:06PM GMT More http://muse.jhu.edu/journals/ahc/summary/v030/30.2.dongarra.html

Netlib and NA-Net: Building A Scientific Computing … and NA-Net: Building A Scientific Computing ... special function routines (e.g., Bessel ... symbolic mathematics system and one

Embed Size (px)

Citation preview

Netlib and NA-Net: Building A Scientific Computing Community

Jack DongarraGene H. GolubEric GrosseCleve Moler

IEEE Annals of the History of Computing, Volume 30, Number2, April-June 2008, pp. 30-41 (Article)

Published by IEEE Computer Society

For additional information about this article

Access Provided by University of Tennessee @ Knoxville at 06/21/11 2:06PM GMT

More

http://muse.jhu.edu/journals/ahc/summary/v030/30.2.dongarra.html

Netlib and NA-Net: Building AScientific Computing Community

Jack DongarraUniversity of Tennessee, ORNL, and University of Manchester

Gene H. Golub(Deceased 16 November 2007)

Eric GrosseGoogle

Cleve MolerMathWorks

Keith MooreUniversity of Tennessee

Two resources evolved in the early 1980s to serve the needs of thescientific computing community. These resources were Netlib, asoftware repository that facilitated distribution of public-domainsoftware, and NA-Net, a file of analysts’ contact information thateventually supported an online directory and newsletter digest. Theauthors who created Netlib and NA-Net explain the history of theseresources and their continuing impact.

With today’s effortless access to opensource software and data via a high-speedInternet and responsive search engines, it ishard to remember just how different the lifeof computational scientists was in the 1970sand early 1980s. At the time, computinglargely took place in a mainframe world,typically supported by one of the fewcommercial numerical libraries installed bycomputer center staff. Since this situationwas imperfect (the existing software did notsolve all of scientific computing’s needs),scientists were forced to write or borrowadditional software. There were a few exem-plary libraries such as EISPACK in circulation,and a larger body of Fortran programs ofvariable quality. Obtaining them was abothersome process, however, that involvedleveraging personal contacts, governmentbureaucracies, negotiated legal agreements,and the expensive and unreliable shipping of9-track magnetic tapes or punch card decks.There had to be a better way. That better waymanifested itself as two different, separately

developed resources filling different func-tions but which were both aimed at servingthe scientific computing community. Theseresources, still active today, were Netlib andNA-Net.

Netlib repositories—collections of mathe-matical software, papers, and databases—wereinitially established in 1984 by Eric Grosse atBell Labs and Jack Dongarra at ArgonneNational Laboratory. Based on a suggestionfrom Gene Golub to build a repository ofresearch software, Dongarra and Grosse de-signed the first Netlib repository. Each repos-itory made a collection of high-quality math-ematical software available via electronicmail—the Bell Labs server provided access viauucp (Unix to Unix CoPy) protocols, while theArgonne server made the collection accessiblevia IP (Internet Protocol). The Netlib collectionquickly spread to mirror servers around theworld. As the Internet became ubiquitous,other access methods were added. Althoughthe electronic mail interface is still supported,most access today is via the World Wide Web.

30 IEEE Annals of the History of Computing Published by the IEEE Computer Society 1058-6180/08/$25.00 G 2008 IEEE

Prior to Netlib (short for ‘‘network library’’),software codes had been shared by interestgroups such as SHARE since the mid-1950s, butsuch groups tended to be limited by member-ship or personal relationships. By wideningthe distribution of public-domain numericalsoftware, Netlib gave greater leverage to earlyopen source efforts and helped establish thetrend of software distribution via the network.Netlib remains popular today.

In a separate development that was alsoaimed at the scientific computing communi-ty, by the time Netlib was established, an ‘‘nalist’’ (na for ‘‘numerical analysis’’) of emailaddresses had already been maintained forseveral years by Gene Golub, then chair of theComputer Science Department at StanfordUniversity. By 1983, this list was being used toprovide an electronic mail forwarding serviceto numerical analysis specialists. Mail to na.lastname@su-score would be forwarded to thelist member with that last name. An emailbroadcast facility was also provided: mail sentto na@su-score would be forwarded to every-one on the list. By February 1987, thisbroadcast facility had evolved into a moder-ated email digest, which soon became aweekly electronic newsletter. A ‘‘white pages’’database and a World Wide Web interfacewere eventually added, resulting in the set ofservices provided by NA-Net today. The NA-Net remains a widely used and valuableresource for the numerical analysis commu-nity. The NA-Digest is one of the oldestelectronic periodicals, and its popularitycontinues to grow.

In this article, which describes the earlyhistory of these two resources and the ratio-nale behind the system architectures chosen,we assess what parts worked well and whichdid not.

NetlibThe concept behind Netlib was to facili-

tate quick, easy, and efficient on-demandaccess to useful public-domain computation-al software of interest to the scientificcomputing community. The mechanismchosen for distribution of this software waselectronic mail. Initially, there were tworepositories: one at Bell Labs and the otherat Argonne National Laboratory. The BellLabs server provided access via uucp proto-cols, while the Argonne server made thecollection accessible via Internet mail. Torequest files from either server, a user wouldsend one or more commands as message textto that server’s email address. For instance, a

message consisting of the text ‘‘send dqagfrom quadpack’’ sent to either research!netlib(uucp) or netlib@anl-mcs (Internet) wouldresult in a reply consisting of the ‘‘dqag’’subroutine from the ‘‘quadpack’’ package,along with any additional routines needed touse that subroutine.

At the time Netlib was introduced, public-domain software was chiefly distributed byphysical media such as magnetic tapes beingsent through the postal service. This wasdifficult not only because of the time andexpense involved with handling physicalmedia, but also because of the lack of anywidely used standards for writing magnetictapes. Each computing platform wrote tapesin its native format; and because of differ-ences in word size, record size, and organi-zation of the data on tape, reading foreigntapes could be quite difficult and laborintensive. Fortran programs were formattedwith sequence numbers in columns 73–80and blank fill, which was painfully ineffi-cient in days of dial-up lines before compres-sion was prevalent.

Of course, the Arpanet had been in exis-tence for several years by that time, andsoftware was commonly shared over theArpanet’s File Transfer Protocol (FTP), but theArpanet served a limited community, and theInternet protocols adopted by the limited-commercial-use Arpanet in 1981 were onlybeginning to see wider use. Software could alsobe uploaded to or downloaded from ‘‘bulletinboard systems’’ or BBSs, but this could entailsignificant long-distance telephone charges toperuse distant servers. A picture of the techni-cal environment of the early Internet, uucp,Bitnet, and separate systems like AOL withgateway size limits due to dial-up and restart-from-beginning error recovery has been docu-mented elsewhere.1,2

By contrast, Arpanet electronic mail sys-tems standardized very early on a commonmessage format to be used across all platforms,and uucp mail used a similar format. Bitnet hadsome different conventions, but adequategateways existed. Costs of email were absorbedinto budget overheads, so experimental usecould flourish without formalities. Collective-ly, these conditions made electronic mail asuperior medium for the exchange of sourcecode, at least for small files, and Netlib madeeffective use of it. Unknown to us at the time,silicon chip designs were being sent off forfabrication by email in a service known asMOSIS (Metal Oxide Semiconductor Imple-mentation Service).3 Another early informa-

April–June 2008 31

tion distribution service by email was atCSnet.4 Clearly, then, by the beginning ofthe 1980s, a critical mass of email users in thescientific community had arrived, and email-based servers were inevitable.

Criteria collection and contents

Netlib limited its scope to mathematicalsoftware and related information of interest toscientific computing. Netlib’s developers madean effort to limit the collection to software ofdemonstrated high quality. The fact that high-quality public-domain software packages suchas LINPACK and EISPACK were available forNetlib’s initial collection helped to set highstandards for future additions to the collec-tion. More generally, the numerical analysiscommunity’s tradition of producing robustsoftware to address clearly defined problems

made it somewhat easier to establish a high-quality library for scientific computing thanfor some other areas.

Another way in which Netlib maintainedits quality was by having all contributionsreviewed by a ‘‘chapter editor’’ who providedsome assurance of the quality, stability, andnovelty of the software, and who resolvedauthorship disputes. However, Netlib neverattempted to do serious testing, as asked ofthe reviewers for ACM’s Transactions onMathematical Software (TOMS). Chapter edi-tors typically made a quick judgment callbased on publication of the underlyingalgorithm in a top journal, reputation of thesoftware in the community, and an instinctdeveloped from personal computing experi-ence in the area for what was novel andworthwhile.

Definition of TermsAMS—The American Mathematical Society (AMS) is

an association of professional mathematicians dedicatedto the interests of mathematical research and scholar-ship through various publications and conferences.

BLAS—Basic Linear Algebra Subprograms (BLAS) arestandardized application programming interfaces forsubroutines to perform basic linear algebra operationssuch as vector and matrix multiplication.

EISPACK—A software library for numerical computa-tion of eigenvalues and eigenvectors of matrices,written in Fortran.

FNLIB—Portable special function routines (e.g., Besselfunctions, the error function, and so on).

IMSL—The IMSL (International Mathematics andStatistics Library) Numerical Libraries are softwarelibraries of numerical analysis functionality that areimplemented in widely used computer programminglanguages of C, Java, C#.NET, and Fortran. Softwaredevelopers will typically embed algorithms from theselibraries into their software applications, using theirpreferred programming language. The IMSL Librariesare provided by Visual Numerics Inc.

LINPACK—A software library for performing numericallinear algebra on digital computers.

Macsyma—A computer algebra system that wasoriginally developed from 1968 to 1982 at MIT as partof Project MAC and later marketed commercially. It wasthe first comprehensive symbolic mathematics systemand one of the earliest knowledge-based systems; manyof its ideas were later adopted by Mathematica, Maple,and other systems.

Matlab—A numerical computing environment andprogramming language. Created by The MathWorks,Matlab allows easy matrix manipulation, plotting offunctions and data, implementation of algorithms,

creation of user interfaces, and interfacing with pro-grams in other languages.

MOSIS—MOSIS (Metal Oxide Semiconductor Im-plementation Service) is probably the oldest (1981)integrated circuit (IC) foundry service and one of thefirst Internet services other than supercomputingservices and basic infrastructure such as email or FTP.

MINPACK—A library of Fortran subroutines for thesolving of systems of nonlinear equations, or the least-squares minimization of the residual of a set of linear ornonlinear equations.

NAG—NAG Numerical Libraries is a software prod-uct sold by The Numerical Algorithms Group Ltd(originally the Nottingham Algorithms Group). Theproduct is a software library of numerical analysisroutines.

PORT—The PORT Mathematical Subroutine Libraryfrom Bell Labs is a collection of Fortran 77 routines thataddress many traditional areas of mathematical soft-ware, including approximation, ordinary and partialdifferential equations, linear algebra and eigensystems,optimization, quadrature, root finding, special func-tions, and Fourier transforms, but excluding statisticalcalculations. PORT stands for Portable, Outstanding,Reliable, and Tested.

QUADPACK—A Fortran subroutine package for the numer-ical computation of definite one-dimensional integrals.

SHARE—A volunteer-run user group for IBM main-frame computers that was founded in 1955 by LosAngeles-area IBM 701 users. It evolved into a forum forexchanging technical information about programminglanguages, operating systems, database systems, anduser experiences for enterprise users of small, medium,and large-scale IBM computers such as IBM S/360, IBMS/370, zSeries, pSeries, and xSeries.

Netlib and NA-Net: Building A Scientific Computing Community

32 IEEE Annals of the History of Computing

The initial Netlib collection containedLINPACK; EISPACK; MINPACK; FNLIB; routines fromthe book Computer Methods for MathematicalComputations by Forsythe, Malcolm, and Mo-ler; and QUADPACK, as well as a collection of‘‘golden oldies.’’ Soon, additional codes wereadded from TOMS as well as some benchmark-ing codes, a biharmonic solver, multiprecisionarithmetic package, BLAS (Basic Linear AlgebraSubprograms), and so on. Substantial effortwas expended in collecting these, especially byTOMS, in a unified form suitable for networkdistribution. There was never any doubt,however, that the real intellectual effort andcredit belonged to the software componentauthors and not to Netlib. For the periodconsidered in this article (1984 to approxi-mately 1994), essentially all scientific softwarewas written in Fortran or C (as opposed toMacsyma or Excel or Matlab, say). Scientificsoftware remains the focus of Netlib.

The dominant commercial packages at thetime were from Harwell Software Libraries,IMSL Numerical Libraries, and NAG NumericalLibraries, with important but smaller rolesplayed by the PORT Mathematical SubroutineLibrary, and the libraries of Boeing and theDepartment of Energy. Although these pack-ages were of high quality and had a financialfoundation for user support, they were diffi-cult to adapt in the transition that was thenunder way, from mainframes in computercenters to departmental servers and personalcomputers. At least as important as thedifficulty of adapting commercial packageswas user dissatisfaction with the delay ingetting new algorithms into the commercialreleases. With Netlib, conversely, since thesoftware was downloaded over the network itwas available in ‘‘real’’ time—there were nodelays in ordering and delivering the software.The user’s ability to change the softwaredirectly when needed is a common theme injustifying open source today.

Another important source of software,outside of Netlib, came from small snippetsof code that users would take from Netlib andpaste directly into their own programs. Thebest and most prominent of these was Numer-ical Recipes,5 which was published after Netlibwas well along but which had at least as widean impact. An enduring role of the software inNetlib, by comparison, is to handle those moredifficult numerical problems that are out ofreach of small algorithms.

Unlike the commercial libraries and effortssuch as Numerical Recipes, Netlib has alsoprovided a home for the numerical research

community to share a historical collection ofmany competing algorithms designed to solvethe same problem. This can confuse theengineer who wants digested advice on thesingle best method to use but enhances theresearch community by making it feasible torun fairer comparisons of new algorithmsagainst old. Extending that role further, Netlibhosts collections of standard data sets orfunctions for such algorithm comparisons.

Finally, Netlib has material that does not fitthe broader pattern, such as the polyhedradatabase. These materials were added at a timewhen file distribution was such a burden thatauthors were desperate for help and seized theNetlib opportunity.

Server locations and mirroring

The Bell Labs Netlib server went online 3January 1984, and was used for softwaredistribution within Bell Labs during thatspring. Both the Bell Labs and Argonne serverswere in public use by July of that year.

The Argonne server was physically movedto Oak Ridge National Laboratory (ORNL) inOctober 1989, when Jack Dongarra moved tothe University of Tennessee. As the Netlibcollection became more popular, it becameadvantageous to mirror the collection outsideof North America; however, the number ofofficial mirrors was deliberately kept smallwhile we learned how to minimize thedifferences between the servers. A Europeanmirror was established in Oslo, Norway, inDecember 1989, and an Australian mirror alsoexisted by 1989. In May 1993, the originalSequent server that had been in service atArgonne was replaced by two SparcStation 2machines—one at ORNL and the other at theUniversity of Tennessee in Knoxville (UTK).

With Netlib, since soft-

ware was downloaded

over the network, it was

available in ‘‘real’’ time—

there were no delays

in ordering and

delivering the software.

April–June 2008 33

These servers were closely synchronized to oneanother, and both of the servers were config-ured to accept email requests sent to [email protected] domain.

The ORNL server was retired in 1997. Theofficial servers today consist of netlib.bell-labs.com in Murray Hill, New Jersey (whichhas moved to netlib.sandia.gov); Netlib.org, atthe University of Tennessee; the UK MirrorService in Kent, England; and Netlib.no inBergen, Norway. These servers are synchro-nized to one another via a process describedelsewhere.6 Since the introduction of anony-mous FTP access to the primary Netlib servers,many more unofficial mirrors have beencreated; there is a great deal of variationamong these mirrors with regard to theorganization and currency of their collections.

What is at least as important as the networkperformance benefit from mirroring is thecultural aspect, namely in the critical roleplayed by the chapter editors. Since this is avolunteer effort by busy people working at thetop of their field, it’s imperative that Netlibmake as few demands on their life as possible.Its principal contribution, therefore, is inallowing them to make their changes directlyin their own file trees, and mirror the changesto the other Netlib sites. Setting up a large,reasonably coherent repository that is genu-inely shared across multiple organizations in apeer fashion has been, as far as we know,another Netlib innovation.

Netlib features

At the time Netlib was created there was noubiquitous Internet, but rather an admixtureof incompatible networks, each with differenthosts, services, and means of addressing. Inmany cases the networks were interconnectedwith mail gateways, but these could bedifficult to use (because of the need toexplicitly route a message through the gate-ways) and unreliable besides. Because text-based electronic mail was the only servicecommon to all of those networks, this was themeans Netlib initially utilized to provideaccess to items in its collection, but as Internetprotocols began to be more widely deployed,other, easier-to-use methods became feasible.A server for the Internet’s FTP was provided onthe Bell Labs Netlib server in August 1991 andsoon afterward at the ORNL server. ThexNetlib browser was first deployed in Decem-ber 1991, allowing users to peruse and down-load items from the ORNL server via agraphical user interface for the X WindowSystem.7

In response to the growing popularity ofthe World Wide Web, HTTP Netlib serverswere established in October 1993 at UTK andORNL; Gopher was enabled in November1993. A Web server interface was implementedat Bell Labs one week after Mosaic arrivedthere; from our perspective it was the cross-platform convenience of that browser, muchmore than HTML or HTTP, that was revolu-tionary. Although the email interface is stillsupported, HTTP—and to a lesser extent,FTP—quickly eclipsed all other sources ofNetlib traffic. Netlib dabbled in software-as-a-service by providing email-delivered transla-tion of Fortran into C, but sister systems suchas Network-Enabled Optimization System(NEOS) were much more influential in thatdirection for the numerical community.

One of Netlib’s unsuccessful features was itsuse of digital signatures. Since large software isdownloaded and compiled without inspectionand may be distributed via mirror sites ofunknown security, we thought people wouldappreciate having PGP-signed MD5 check-sums. We also built mechanisms that wouldallow authors to update their own files, subjectto proper signatures. Neither feature got muchuse, presumably because we did not makethem sufficiently automatic and because therehave been no known instances of securityabuse involving Netlib.

Another security-related feature was popu-lar briefly but faded. Before the days ofpersonal homepages, it could be inconvenientto locate someone’s contact information. TheNA-Net database went far to solve this for thenumerical analysis community, but was neveras complete as journal editors would likewhen, for example, looking for referees. Forits part, Netlib introduced a way to search theSIAM (Society of Industrial and Applied Math-ematics) membership list. Since this wasduring an era when network connectivitywas unreliable, the method Netlib adoptedwas to distribute an encrypted version of thedatabase along with a program to allowedrestricted searches to protect privacy and deterspam.8 Eventually this was phased out in favorof the AMS (American Mathematical Society)Combined Membership List online.

Two innovative features designed for Netlibworked well but were eventually casualties ofbeing tied to a research operating system thatno longer hosts Netlib. The first of these was away of asking for a bundle, specifically in theUnix tar format, for a minimal set of files withno unresolved dependencies. The second was a‘‘historic Netlib’’ server that was a kind of time

Netlib and NA-Net: Building A Scientific Computing Community

34 IEEE Annals of the History of Computing

machine allowing one to see all the changes ina program, much as version control systemsdo. It was pleasant to be able to provide suchfeatures without introducing any new accessprotocols, using the power of the Plan 9operating system’s generalized vision of a file.9

Usage patterns

At least at the UTK and ORNL servers,Netlib usage rose steadily for 1985–2000 (seeFigure 1). Traffic levels have varied in recentyears, possibly in part because of an increaseduse of crawlers and mirrors. Approximately 80percent of the current traffic appears to befrom individual users, 7 percent from mirrors,and 12 percent from Web crawlers, which arepresumably building indices for use by searchengines.

Community impact

A lot of mathematical software is available,either commercial or free. However, not allthat software is of high quality. It can also bedifficult to locate the appropriate software byusing web search engines, since the descrip-tions available for searching may be lacking ormay not match the vocabulary used by the

searcher. A good solution to these problems isto have experts in the field of numericalanalysis—as is the case with Netlib—maintaina moderated collection of high-quality soft-ware that is organized and catalogued withappropriate metadata to enable easy searching.

Netlib was not the first collection of suchsoftware, however; a few of the best-fundedand well-run laboratories had local librariestended by numerical consultants who grew toknow the local needs especially well. As onerelevant example, the Stanford Linear Acceler-ator Center collaborated with Stanford Uni-versity’s Computer Science Department totrain graduate students in numerical analysiswhile contributing to state-of-the-art numeri-cal processing for physics calculations. Stu-dents from that program went on to be activecontributors to Netlib.

The main impact of Netlib was to democ-ratize and reduce the duplication of librariessuch as Stanford’s by making them availableworldwide, promoting the best practice ofreuse to a broader audience. A secondaryunintended impact may have been to weakenthe case for employing local consultants. Wehave no data on the extent of this effect, but

Figure 1. Netlib usage at the University of Tennessee–Knoxville and Oak Ridge National Laboratory.

April–June 2008 35

would argue an experienced consultant isvastly better than any catalog. Whatever harmmay have been done on this score has beenmore than balanced by the increased profes-sional credit earned by authors of widely usedcodes. It is not uncommon in grant proposalsfor authors to cite Netlib download statistics asevidence of their reputation and practicalimpact in the field.

The experience gained from Netlib has beentransferred to other projects designed topromote software collections managementand software reuse. The National HPCC Soft-ware Exchange (NHSE) strove to apply thetechniques and technologies of Netlib to moreloosely coupled software repositories. TheNHSE also made software repository creationand maintenance easier through the Reposi-tory In a Box (RIB) toolkit.

Netlib pioneered network delivery of soft-ware during a period of explosive growth innetworking, requiring the tracking of newdelivery technologies as mentioned above.Additionally, Netlib has shown that it ispossible to manage such collections in adistributed manner, both in terms of infra-structure and administration. The Netlib col-lection itself has stopped growing as rapidly asin its first decade, as algorithm writers found itincreasingly easy to publish their softwaredirectly. However, there is a definite risk thatindividual sites will eventually disappear, withthe contents lost to mankind. There is noguarantee that Netlib will exist forever, but thedistributed administration has already sur-vived several changes in employers that wouldhave spelled disaster for a smaller collection.

Related projects

Netlib has had a number of spin-off efforts,summarized here.

N XNetlib began in 1990, predating the Weband tools like Netscape. XNetlib was a toolfor ‘‘Web based’’ software distribution.Whereas Netlib originally used e-mail asthe user interface to the collection ofpublic-domain mathematical software,XNetlib used its own specialized X-Windowinterface and Unix socket-based communi-cation.

N NHSE. For 1994–2004, the NHSE (NationalHPCC [High-Performance ComputingCouncil] Software Exchange) existed as adistributed collection of software, docu-ments, data, and information of interestto the high-performance and parallel-com-

puting community. The significance of thecollaborative effort is evident through themany useful reports and tools generated aswell as the many repositories that havebeen (and still are being) created, with theRepository-In-a-Box (RIB) toolkit developedin 1996. However, continued operation ofthe site without funding has becomeimpractical.

N RIB provides a toolkit for building andmaintaining metadata repositories. RIBwas developed by the NHSE TechnicalTeam at the University of Tennessee,Knoxville. Initially, RIB only providedtools for the creation of software reposito-ries. The recent release of RIB v2.0 allowsRIB to create general metadata reposito-ries. The creation of software metadatarepositories remains the primary applica-tion of RIB.10

N NetBuild is a tool that automates the processof selecting, locating, downloading, con-figuring, and installing computational soft-ware libraries from over the Internet.Additional tools aid in the constructionand cataloging of libraries in the formatused by NetBuild.11

NA-NetThe Numerical Analysis Net (NA-Net) is a

collection of several services designed to fosterinformation and a sense of community amongnumerical analysts. It currently consists of anemail forwarding service, the NA-Digest, and awhite-pages database.

One of the primary

benefits of NA-Net’s

forwarding facility was

that it provided all of its

subscribers with a

uniform address. Before

the Internet, email

traveled a hodgepodge of

dissimilar networks.

Netlib and NA-Net: Building A Scientific Computing Community

36 IEEE Annals of the History of Computing

Software and hosting

NA-Net traces its origins to a list of emailaddresses maintained by Gene Golub atStanford University and distributed to otherswithin the numerical analysis community. Atsome point an electronic mail system wasspecially configured so that mail to na.lastname@su-score (later, score.stanford.edu)would be forwarded to the person on the listwith that last name. This was originallyimplemented by manually copying entriesfrom Golub’s list into a system alias file(requiring interaction by the system adminis-trator each time the list was updated). Even-tually software was written by Mark Kent andRay Tuminaro to allow the forwarding aliasesand address list (for human perusal) to begenerated from a common database.12

At an early stage, Golub collected addressesof numerical analysts. Jim Wilkinson (a leaderin the field of numerical linear algebra) onceasked, ‘‘Whom can I contact on the Net?’’ andGolub passed along his file of analyst contacts.It then occurred to Golub that we ought to haveuniversal addresses. So at Stanford a system wasset up where a user could address a person asna.,lastname.@score. We felt that the NA-Netwould enable our community to communicatemore easily with one another, especially whenNet addresses were—at the time—so complicat-ed. (Some users commented that they couldonly communicate via NA-Net!)

One of the primary benefits of NA-Net’sforwarding facility in those days was that itprovided all of its subscribers with a uniformaddress. Before Internet access became ubiqui-tous, email traveled over a hodgepodge ofdissimilar networks, each with its own ad-dressing scheme. When sending a messagebetween dissimilar networks, it was oftennecessary to ‘‘source route’’ the messagethrough a gateway by embedding the recipi-ent’s address inside the address of the gateway.Because each network had its own addressingconvention, an address that worked to reach arecipient from one location of the networkwould not necessarily work from another. Butwhen sending mail to any NA-Net subscriber itwas only needed to understand how to sendmail to one na.lastname address; the samepattern would work for any other NA-Netsubscriber. This eliminated some of the guess-work of mailing between networks, at leastwhen mailing to other NA-Net subscribers.

A broadcast facility was also set up so that amessage sent to [email protected] would beforwarded to everyone on the list. Eventuallytraffic volume and accidental misuse became

significant enough that some sort of modera-tion was required for the broadcast facility, andso it was converted to an email ‘‘digest.’’ Amoderator would review messages sent to [email protected], and the selected messageswould then be sent out to everyone. The firstissue of the digest was 13 February 1987. At firstthe digest was sent out at irregular intervals butsoon settled into a weekly publication.

In December 1990, the NA-Net was movedto Oak Ridge National Laboratory, using newsoftware written by Bill Rosener. The newsoftware preserved the na.lastname forwardingand digest functions, but also allowed individ-uals to add themselves to the NA-Net, removethemselves, or change addresses, unlike previ-ous versions that required manual mainte-nance of the subscriber list.

To subscribe, a user sent an email messageto [email protected] with the followingfields in the message body:

Firstname: user’s first name

Lastname: user’s last name

E-mail: user’s email address

The NA-Net server would then reply with amessage indicating whether the user wassuccessfully added. To unsubscribe, a userwould send a similar message to the [email protected]; to change an ad-dress the message was sent to [email protected]. Each function had its ownemail address and its own requirements forthe format of the data to be supplied.13

In May 1991, a ‘‘white pages’’ facility wasadded to NA-Net. Members could store infor-mation about themselves in the white pagesdatabase, such as their interests and home andwork addresses. This information would thenbe made available in response to queries sentby email to [email protected].

By June 1993, the service had become sopopular that the server was having difficultyhandling the email traffic. At this time theentire NA-Net software package was entirelyrewritten by Keith Moore to improve scalability(especially of email distribution) and robust-ness, but the user interface remained the sameas before. This software remains in use today,with only minor changes. In November 1994, aweb interface was added to rid users of theburden of having to submit requests in text-based email with rigidly defined syntax.

NA-Digest history and content

In the early days the NA-Digest came aboutbecause Gene Golub’s secretary was sending

April–June 2008 37

the digest to the whole NA-Net address list.Gene was the original editor of the NA-Digestuntil July 1987, when he began a sabbaticalleave from Stanford, and at that time CleveMoler of MathWorks began editing the digest.With only occasional absences, Cleve contin-ued to edit the digest until September 2005.Currently, the digest is edited by Tamara Koldaof Sandia National Labs.

As the name suggests, the intent of the NA-Digest has been to have short announcementssummarizing more extensive material availableelsewhere. Today, almost all of the digestcontributions have URLs pointing to morecomplete announcements available on the Web.

The digest has generally contained any-thing of interest to the numerical analysis andmathematical software community. This hasincluded both technical discussions and in-formation about community members. Ex-amples of material commonly appearing inthe digest include announcements of confer-ences, workshops, software releases, change ofaddresses, and new books; advertisements ofjobs; and notices about community members:awards, significant achievements, and deaths.

Reasonably complete archives of the digestexist, hosted at http://www.Netlib.org/na-digest-html/. The archives contain a great deal ofmaterial of interest to anyone studying thehistory of numerical analysis software.

Usage patterns

Based on early reports in the NA-Digest andon log files maintained since 1993, thenumber of NA-Net subscribers has increased

steadily from 821 subscribers in May 1997 to11,295 subscribers today (see Figure 2). (Weare missing some data, as the result of lost logfiles, in the interval 1991–1993. The disconti-nuity in 2000 was caused by removal of manyaddresses that were no longer reachable, suchas Bitnet addresses.)

Several thousand messages per day areforwarded through NA-Net’s email forwardingservice. From a perusal of log file entries, manyof these unfortunately appear to be spam. Formany years the NA-Digest subscriber list wasmade available for download over Netlib, aholdover from the days when the network wassmall and most people with network accesscould be assumed to act reasonably. The listwas also available to anyone who sent mail [email protected]. Even after the listwas removed from Netlib and the sendlistaddress was disabled, however, subscriberaddresses were occasionally found to havebeen ‘‘leaked’’ by various means.

Community impact

The international numerical analysis com-munity is small enough that it still has acohesive, ‘‘family’’ feeling. We believe the NAdigest has helped maintain that feeling;moreover, the fact that the digest is stilldistributed via low-tech, simple text emailmeans that it is accessible to anyone withaccess to a computer and a network connec-tion. This is particularly important to subscrib-ers around the world who are unable to travelto many meetings. Cleve Moler reports that hehas several times met people who recognize

Figure 2. NA-Net subscriber growth.

Netlib and NA-Net: Building A Scientific Computing Community

38 IEEE Annals of the History of Computing

his name primarily as the ‘‘guy who sends meemail every week.’’

We are indebted for the feedback wereceived of a referee’s comment (made anon-ymously for a paper’s review in July 2007) forobserving that

NA-Net’s existing context was probably SIGNUM.This ACM Special Interest Group for NumericalComputation built a community around thequarterly SIGNUM Newsletter, which was mailedto members. (Within the ACM, some SIGs areheld together by strong conferences that definethem, while others are held together by theexchange of information in a newsletter.SIGNUM never had a strong recurring confer-ence—SIAM served that role better—so it wasprimarily a ‘‘newsletter SIG.’’) NA Digest provideda much better forum for this type of informationexchange, delivering it more quickly, eventuallyreaching a much wider community, and provid-ing other valuable services. SIGNUM foldedaround the year 2000 after being in existencefor more than 25 years. The success of NA-Netwas probably a strong reason for its demise.

Lessons for funding modelsOne reason to consider the history of

successful projects such as Netlib and NA-Netis to try to generalize and see what decisionmakers today might learn for encouragingother successes. In this section, we speculateon the critical issue of motivation and support.

The idea that computational modeling andsimulation represents a new branch of scien-tific methodology, alongside theory and ex-

perimentation, was introduced about threedecades ago. It has since come to symbolizethe enthusiasm and sense of importance thatpeople in our community feel for the workthey are doing. But when we try to assess howmuch progress we have made and where thingsstand along the developmental path for thisnew ‘‘third pillar of science,’’ recalling somehistory about the development of the otherpillars can help keep things in perspective. Itseems clear that while computational sciencehas had many remarkable youthful successes, itis still at a very early stage of growth.

Many of us today who want to hasten thatgrowth believe that the most progressive stepsin that direction require much more commu-nity focus and funding on the vital core ofcomputational science: software and themathematical models and algorithms it en-codes. But when it comes to advancing thecause of computational modeling and simula-tion as a new part of the scientific method,there is no doubt that the complex software‘‘ecosystem’’ it requires must take center stage.

At the application level, the science must becaptured in mathematical models, which inturn are expressed algorithmically and ulti-mately encoded as software. Accordingly, ontypical scientific projects, the majority offunding supports this translation process,beginning with scientific ideas and endingwith executable software, and which over itscourse requires intimate collaboration amongdomain scientists, computer scientists, andapplied mathematicians. This process alsorelies on a large infrastructure of mathematicallibraries, protocols, and system software builtup over many years and which must bemaintained, ported, and enhanced for manymore years to come if the value of theapplication codes that depend on it is to bepreserved and extended. The software thatencapsulates all this time, energy, andthought, routinely outlasts (usually by years,sometimes by decades) the hardware it wasoriginally designed to run on, as well as theindividuals who designed and developed it.

The life of computational science, there-fore, revolves around a multifaceted softwareecosystem. In the early days, Netlib and NA-Net supplemented commercial libraries, con-ferences, and journals in building such anecosystem. But today there is (and should be)a real concern that this ecosystem of compu-tational science, with all its complexities, isnot ready for the major challenges that willsoon confront the field. Domain scientistsnow want to create much larger, multidimen-

Many of us who want to

hasten growth in

computational science

believe that to do so

requires much more

community focus and

funding on its vital core:

software and the

mathematical models and

algorithms it encodes.

April–June 2008 39

sional applications in which a variety ofpreviously independent models are coupledtogether, or even fully integrated. They hopeto be able to run these applications onpetascale systems with tens of thousands ofprocessors, to extract a good fraction of theperformance these platforms can deliver, torecover automatically from the processorfailures that regularly occur at this scale, andto do all this without sacrificing good pro-grammability. This vision of what computa-tional science wants to become containsnumerous unsolved and exciting problemsfor the software research community. Unfor-tunately, it also highlights aspects of thecurrent software environment that are eitherimmature, underfunded, or both.

The efforts of Netlib and the NA-Net’s NA-Digest, for the most part, have been throughvolunteers. These services have been viewed bythe community as important and successful,which by itself has been motivation enoughfor us to devote substantial time to the effort,and enough goodwill for our employers toallow that effort. Advancing to the next stageof growth for computational simulation andmodeling will require the solution of hard,basic research problems in computer scienceand applied mathematics as well as creatingand promulgating a new paradigm for thedevelopment of scientific software. So theobvious questions arise: How should this kindof activity be supported? Where should it bedone? universities? companies like Math-Works? We’ve been told many times by thefunding agencies that ‘‘it isn’t research.’’ Theseare difficult, but important, questions goingforward. With less industrial research supportfor broad ecosystems efforts than has beenavailable in past decades, we believe progresswill require a greater level of sustained fundingfrom governmental sources.

SummarySoftware distributed by Netlib comes with

the disclaimer that ‘‘anything free comes withno guarantee.’’ In contrast to commercialvendors like the Numerical Algorithm Group(NAG) and IMSL Numerical Libraries, Netliboffers no support beyond whatever documen-tation contributing authors choose to providewith their code. On the other hand, Netlibprovides free, easy access to a large body ofhigh-quality code, and the phenomenalgrowth of Netlib attests to the value of thisservice. We hope that Netlib, by making high-quality code even more accessible, has encour-

aged software developers to make their sourcecodes freely available and will continue tomake good programming still easier for thescientific computing community.

Both the NA-Digest and NA-Net haveprovided a singular resource for the advance-ment of science and education by helping tocreate a community—and, at the same time, asense of community—of those interested inscientific computation. It has allowed forcross-pollination of ideas and techniquesamong scientists, engineers, and numericalanalysts, both from academia and industry.

AcknowledgmentsThe authors thank Don Fike for assistancewith Netlib statistics, and Mark Crispin andMark Kent for information about early NA-Netimplementation. Mel Ciment, who was at theNational Science Foundation, provided seedfunding for Netlib and NA-Net when they firststarted. Sandy Fraser gave crucial executivesupport at Bell Labs, not only in budget but byaccepting the legal uncertainties of networksoftware distribution in the early days.

References and notes1. D. Frey and R. Adams, !%@ A Directory of Electronic

Mail Addressing and Networks, 3rd ed., O’Reilly &

Associates, 1993.

2. E. Krol, The Whole Internet User’s Guide & Catalog,

2nd ed., O’Reilly & Associates, 1994.

3. C. Tomovich, ‘‘MOSIS-A Gateway to Silicon,’’ IEEE

Circuits and Devices Magazine, vol. 4, no. 2, 1988,

pp. 22-23.

4. C. Partridge, C. Mooers, and M. Laubach, ‘‘The

CSNET Information Server: Automatic Document

Distribution Using Electronic Mail,’’ SIGCOMM

Comput. Commun. Rev., vol. 17, no. 4, 1987,

pp. 3-10.

5. W.H. Press et al., Numerical Recipes in C: The Art of

Scientific Computing, 2nd ed., Cambridge Univ.

Press, 1992.

6. E. Grosse, ‘‘Repository Mirroring,’’ Transactions on

Mathematical Software, vol. 21, no. 1, 1995,

pp. 89-97.

7. J. Dongarra, T. Rowan, and R. Wade, ‘‘Software

Distribution Using XNETLIB,’’ Transactions on Mathe-

matical Software, vol. 21, no. 1, 1995, pp. 79-88.

8. J. Feigenbaum, E. Grosse, and J.A. Reeds,

‘‘Cryptographic Protection of Membership Lists,’’

Newsletter of the International Association for

Cryptologic Research, vol. 9, no. 1, 1992, pp. 16-20;

ftp://cm.bell-labs.com/cm/cs/doc/91/4-12.ps.gz.

9. R. Pike et al., ‘‘Plan 9 from Bell Labs,’’ Proc. Summer

1990 UKUUG Conf., London, July 1990, pp. i-9.

10. S. Browne, P. McMahan, and S. Wells, Repository

In a Box Toolkit for Software and Resource Sharing,

Netlib and NA-Net: Building A Scientific Computing Community

40 IEEE Annals of the History of Computing

tech. report UT-CS-99-424, Dept. of Computer

Science, Univ. of Tennessee, Knoxville, 1999;

http://icl.cs.utk.edu/publications/tech_reports/

1999/ut-cs-99-424.ps.

11. K. Moore and J. Dongarra, NetBuild (version 0.02),

tech. report UT-CS-01-461, Dept. of Computer

Science, Univ. of Tennessee, Knoxville, 2001;

http://www.cs.utk.edu/,library/TechReports/

2001/ut-cs-01-461.ps.Z.

12. M. Kent, The Numerical Analysis Net (NA-NET),

Tech. Report 85, Institut fur Informatik,

Eidgenossische Technische Hochschule Zurich,

Jan. 1988.

13. J. Dongarra and B. Rosener, NA-Net / Numerical

Analysis Net, Tech. Report ORNL/TM-11986, Oak

Ridge Nat’l Laboratories, Dec. 1991.

Jack Dongarra is a University

Distinguished Professor of

Computer Science at the Uni-

versity of Tennessee and

holds the title of Distin-

guished Research Staff in the

Computer Science and Math-

ematics Division at Oak Ridge

National Laboratory. He is also an adjunct profes-

sor in the Computer Science Department at Rice

University, and Turing Fellow at the University of

Manchester. He is a Fellow of the AAAS, the ACM,

and the IEEE and a member of the National

Academy of Engineering. Dongarra has an MS in

computer science from the Illinois Institute of

Technology and a PhD in applied mathematics

from the University of New Mexico.

Gene H. Golub was born in

Chicago in 1932 and died 16

November 2007 in Stanford,

California. The Fletcher Jones

Professor of Computer Science

at Stanford University, Golub

was a member of the National

Academy of Engineering, the

National Academy of Sciences, and the American

Academy of Arts and Sciences, and an AAAS fellow.

His honors include the B. Bolzano Gold Medal for

Merits in the Field of Mathematical Sciences. One

of his best-known books is Matrix Computations

(Johns Hopkins Univ. Press, 3rd ed., 1996) co-

authored with Charles F. Van Loan. He received

a BS, an MA, and a PhD (all in mathematics) from

the University of Illinois at Urbana-Champaign.

Eric Grosse is an engineering

director for security at Goo-

gle. Earlier, he was at Alcatel-

Lucent’s Bell Laboratories

working on products and re-

search in network security,

systems, algorithms for nu-

merical approximation, and

visualization and scientific computing software.

Grosse received a PhD in computer science from

Stanford University. He is a member of the IEEE,

the SIAM, and the ACM.

Cleve Moler is chairman and

chief scientist at the Math-

Works. He has been a profes-

sor at the University of Michi-

gan, Stanford University,

University of New Mexico,

and the University of Califor-

nia, Santa Barbara. He is the

original author of Matlab, and of three textbooks

on numerical methods. Moler received a BS from

the California Institute of Technology, and an MS

and a PhD from Stanford, all in mathematics.

Keith Moore was a research

associate in the Computer

Science Department, Universi-

ty of Tennessee, from 1991

until 2007. He has participated

in several Internet Engineering

Task Force standardization

working groups since 1990,

which produced the MIME format for electronic

mail messages; extensions to the SMTP protocol for

negotiation of the message size and for delivery

reporting options; standard formats for reporting

electronic mail delivery successes, failures, delays,

and receipt notifications; definition and resolution

mechanisms for Uniform Resource Names; and

transition mechanisms for Internet Protocol ver-

sion 6. Moore received a BS in electrical engineering

at Tennessee Technological University and an MS

in computer science at the University of Tennessee.

Readers may contact Jack Dongarra about this

article at [email protected].

For further information on this or any other

computing topic, please visit our Digital Library

at http://computer.org/csdl.

April–June 2008 41