Digital Humanities 2010

“It’s Volatile”: Standards-Based Research & Research-Based Standards Development

Walsh, John A.
jawalsh@indiana.edu
Indiana University

Hooper, Wally
Indiana University

    You even have
    my field guide. It's you I love.
    I have believed so long
    in the magic of names and poems.
    I hadn't thought them bodiless
    at all. Tall Buttercup. Wild Vetch.
    "Often I am permitted to return
    to a meadow." It all seemed real to me
    last week. Words. You are the body
    of my world, root and flower, the
    brightness and surprise of birds.
    I miss you, love. Tell Leif
    you're the names of things.

    Robert Hass, “Letter”

    It's volatile because anciently painted
    with wings in this manner whence came
    this character for mercury.

    Sir Isaac Newton, “Praxis,” Babson Collection (Burndy Library Collection) MS. 420, Huntington Library

Digital humanities scholarship often integrates humanities scholarship (literary studies, historical studies, and so on) with technological research and development. Some of this technological work takes the form of standards development. The most noteworthy example of such standards development in the digital humanities community is the Text Encoding Initiative (TEI). The TEI provides Guidelines for encoding texts for scholarly and general use. The TEI is pervasive in digital humanities and digital library contexts. It is a de facto standard developed and evolved over the past twenty some years through the efforts of a number of dedicated scholars, librarians, and technologists, and with input from the larger community of TEI users.

Another standard of significance to the digital humanities community is Unicode. Our paper presents a case-study of a successful effort to have included in the Unicode standard dozens of characters required by the Chymistry of Isaac Newton, an ongoing digital humanities project to digitize and edit, study and analyze the alchemical works of Isaac Newton and to develop various scholarly tools around the collection.
Unicode has become the universal character encoding standard. Unicode is nothing more, as it is certainly nothing less, than a massive mapping of characters to numbers, a mapping that seeks to accommodate all the world’s languages and writing systems, including symbols of all sorts: mathematical symbols and operators, astronomical and astrological symbols, Zapf Dingbats, and many more. Operating systems, and the applications built upon them (databases, word processors and text editors, browsers, graphics software, and games), depend on such mappings, or encodings, to reliably reference, store, input, output, and display textual data. The Unicode Consortium’s “What is Unicode” page (http://unicode.org/standard/WhatIsUnicode.html) accurately reports the standard’s significance: "Unicode is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc., and is the official way to implement ISO/IEC 10646. It is supported in many operating systems, all modern browsers, and many other products. The emergence of the Unicode Standard, and the availability of tools supporting it, are among the most significant recent global software technology trends."

In spite of Unicode’s impressive comprehensiveness, it does not include every character ever used. It does not at present, for instance, include many of the alchemical symbols found in Isaac Newton’s alchemical writings. Unicode provides a “private use area,” a series of reserved code points (the numbers assigned to characters) for projects and products to use “privately” for mapping to characters not represented in Unicode. A project like the Chymistry of Isaac Newton can make use of this private use area to map to characters that are not already described in the standard. A pitfall of the Private Use Area is that it is meant to be used privately; it is not suitable for easily interchangeable or interoperable data.
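
The idea of a mapping from characters to numbers, and the special status of private use code points, can be illustrated with a minimal Python sketch. It relies only on the standard library's `unicodedata` module; the helper names `describe` and `is_private_use` are illustrative assumptions, not part of any project's tooling. The example uses the mercury sign (U+263F), which the standard already names, and U+E000, the first code point of the Private Use Area in the Basic Multilingual Plane (U+E000 to U+F8FF).

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return the code point, standard name, and general category of one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        # name() returns the supplied default (None) when the standard assigns no name
        "name": unicodedata.name(ch, None),
        "category": unicodedata.category(ch),
    }


def is_private_use(ch: str) -> bool:
    """True when the character's general category is Co (private use)."""
    return unicodedata.category(ch) == "Co"


mercury = "\u263F"  # a symbol that is part of the public standard
private = "\uE000"  # a private use code point with no standard meaning

print(describe(mercury))
print(describe(private))
print(is_private_use(private))
```

The private use code point has a category but no name: the standard reserves the number without saying what it stands for, which is why its meaning depends entirely on a private agreement.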


One project’s implementation of the Private Use Area could conflict with another project’s. And fonts would not typically include characters for Private Use Area code points, since by their nature these code points are not assigned permanently to any one character but are perpetually open for private assignment, not as part of the public standard.
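
The kind of conflict described above can be sketched in a few lines of Python. The two assignment tables are entirely hypothetical: they do not reproduce the Chymistry of Isaac Newton project's actual private assignments or any other project's, and the function name `pua_conflicts` is an assumption made for illustration.

```python
# Hypothetical private assignments made independently by two projects.
PROJECT_A = {
    0xE000: "alchemical symbol (project A's private meaning)",
    0xE001: "second alchemical symbol (project A's private meaning)",
}
PROJECT_B = {
    0xE000: "scribal abbreviation (project B's private meaning)",
    0xE010: "editorial mark (project B's private meaning)",
}


def pua_conflicts(a: dict, b: dict) -> dict:
    """Return code points that both tables assign, but with different meanings."""
    return {
        cp: (a[cp], b[cp])
        for cp in sorted(a.keys() & b.keys())
        if a[cp] != b[cp]
    }


for cp, (meaning_a, meaning_b) in pua_conflicts(PROJECT_A, PROJECT_B).items():
    print(f"U+{cp:04X}: {meaning_a!r} vs. {meaning_b!r}")
```

A text that carries U+E000 gives a reader no way to tell which of the two meanings was intended unless the private table travels with it, which is the interchange problem that a public assignment in the standard removes.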

So when a project stumbles upon a rich collection of important characters and symbols that are relevant and useful beyond the interior confines of one’s own project, one can make a significant scholarly contribution by documenting and describing these characters and proposing them for inclusion in the Unicode encoding standard. The alchemical symbols so common in Isaac Newton’s chymical manuscripts are common also throughout manuscript and print alchemical literature. The graphically and semantically rich symbols also have potential utility in design, computer art, and even gaming applications. Even the few symbols that are potentially unique to Newton are worthy of consideration in the Unicode standard, given Newton’s stature as one of the giants of science and the vast wealth of scientific, historical, biographical, and popular literature related to Newton.

Figure 1. Basil Valentine, “A Table of Chymicall & Philosophicall Charecters with their signs,” The Last Will and Testament of Basil Valentine, 1671. These and other symbols are commonly found in Newton.

The process by which one moves a Unicode proposal through the development, review, and approval process is formal and rigorous. It is very rewarding in fostering a better understanding of one’s source material and in pointing the way to undiscovered or avoided basic research questions. To encode and identify characters and symbols, one must name the things, and naming is indeed a very difficult and powerful task, a task often challenged and enriched by puzzling ambiguity and obscurity. The process is very rewarding also because it is very much peer-reviewed. Our proposal greatly benefited from an iterative review and excellent advice, challenging questions, and constructive criticism from a number of very smart, helpful, interested experts serving on the Unicode Technical Committee (UTC).


Our paper provides a case-study of one project’s navigation through the Unicode proposal, review, and approval process. We also provide a more theoretical discussion, illustration, and examination of the mutually beneficial relationship between technical standards development and basic humanities research.

References

Unicode Consortium (15 June 2009). What is Unicode. http://unicode.org/standard/WhatIsUnicode.html (accessed 15 Nov. 2009).

Newman, William R. (ed.) (9 May 2008). The Chymistry of Isaac Newton. http://www.chymistry.org (accessed 15 Nov. 2009).
