Advancing the International Plant Names Index (IPNI) Nicky Nicolson, Alan Paton, Jim Croft, James Macklin, Paul Morris, Greg Whitbread, Kanchi Gandhi

Page 1: Advancing the International Plant Names Index (IPNI)

Advancing the International Plant Names Index (IPNI)

Nicky Nicolson, Alan Paton, Jim Croft, James Macklin, Paul Morris, Greg Whitbread, Kanchi Gandhi

Page 2: Advancing the International Plant Names Index (IPNI)

Advancing IPNI

• Current - where IPNI is now
• Issues
• Future - where we’d like to go and how to get there

Page 3: Advancing the International Plant Names Index (IPNI)

What data?

• What data types:
  – ICBN governed nomenclatural acts
  – Standardised author list
  – Publications
• Which groups:
  – Vascular plants
• Which ranks:
  – Family and below

Page 13: Advancing the International Plant Names Index (IPNI)

How is data entered?

• Data entry:
  – From literature scanning, journals received by library at Kew, Harvard, Canberra (2 years - 95%)
  – User reports of missing nomenclatural acts, usually accompanied by a link to digitised literature page (BHL)
• How many?
  – About 7400 names entered in average year
  – About 6100 nomenclatural acts published / year
  – … of these about 2800 are tax. novs.

Page 14: Advancing the International Plant Names Index (IPNI)

How is data managed?

• Full audit history on core objects – names / authors / publications.
• Average 300,000 edits on name records / year
• Standardisation effort ongoing:
  – Epithet
  – Author citation
  – Publication title
  – Collation
  – Year

Page 15: Advancing the International Plant Names Index (IPNI)

Standardisation – author and title

[Chart: Author and Title standardization. Series: standardized author citations, standardized publication title. Axis ticks from 30% to 90%.]

Page 16: Advancing the International Plant Names Index (IPNI)

Standardisation – epithet updates

[Chart: epithet updates over time. Time axis from 2006-01 to 2011-05; value axis from 0 to 5000.]

Page 17: Advancing the International Plant Names Index (IPNI)

Standardisation of epithets

• Why important
  – Main search criterion
  – Improving epithets enables other improvements in dataset e.g.:
    • basionym linkage
    • de-duplication
  – Errors propagate

Page 18: Advancing the International Plant Names Index (IPNI)

Rhus keamcyi was an OCR error for Rhus kearneyi but the incorrect value persists in datasets derived from IPNI

Page 19: Advancing the International Plant Names Index (IPNI)

Statistics

• Dataset can be used for trends analysis:
  – Publication rates
  – Combination rates
  – Author collaborations
• Audit history used to determine changes in data-set over time

http://www.ipni.org/stats.html

Page 20: Advancing the International Plant Names Index (IPNI)

http://www.ipni.org/stats.html

Page 21: Advancing the International Plant Names Index (IPNI)

As well as the data…

• IPNI editors respond to user queries about the data, dealing with c. 50 cases / month

• Includes an expert service re interpretation of ICBN

• Can provide worked examples illustrating particular articles of the code

Page 22: Advancing the International Plant Names Index (IPNI)

Why should anyone care?

• c. 55,000 searches / day

BUT

• dataset is not being used to full advantage
• inputs not being handled efficiently:
  – limited to partnership
  – missing out on community input
• expertise is hidden

Page 23: Advancing the International Plant Names Index (IPNI)

Future

• Increase efficiency of input
  – provision of core data
  – annotating and linking existing data
  – solving nomenclatural problems
• Increase output
  – usage of IPNI data
  – benefit from on-going curation effort
  – benefit from nomenclatural expertise

Page 24: Advancing the International Plant Names Index (IPNI)

Data in - contributor services

• Pre-publication data entry
• Batch submission of datasets
• Annotation
• Addition of links within dataset
• Facilitate interpretation of nomenclatural issues
• Accreditation – credit for helping improve the data

Page 25: Advancing the International Plant Names Index (IPNI)

Pre-publication data entry

• Workflow currently being trialled
  – Author or publisher submits data to IPNI once article has been accepted for publication
  – Generated record suppressed until publication effective under the code
  – But this is not yet automated!
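
The suppression step in this workflow could be sketched roughly as below. This is only an illustration under assumptions: the class name, field names and the example record are hypothetical and do not describe IPNI's actual data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PrePublicationRecord:
    """Hypothetical record submitted once an article has been accepted."""
    name: str
    accepted_on: date
    # Unknown until the publisher confirms effective publication.
    effective_publication: Optional[date] = None

    def is_visible(self, today: date) -> bool:
        # The record stays suppressed until publication is effective.
        return (
            self.effective_publication is not None
            and self.effective_publication <= today
        )


# Placeholder name and dates, invented purely for illustration.
record = PrePublicationRecord("Examplea hypothetica", accepted_on=date(2011, 1, 10))
suppressed = record.is_visible(date(2011, 2, 1))   # no publication date yet
record.effective_publication = date(2011, 3, 1)
visible = record.is_visible(date(2011, 3, 2))      # publication now effective
```

In the trial described above, setting the publication date is still a manual editorial step; a sketch like this only shows where an automated check could sit.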

Page 26: Advancing the International Plant Names Index (IPNI)

Electronic Publication Example - Phytokeys

A nomenclator of Pacific oceanic island Phyllanthus (Phyllanthaceae), including Glochidion

Warren L. Wagner, David H. Lorence

• 5. Phyllanthus atalotrichus (A.C. Sm.) W.L. Wagner & Lorence, comb. nov.

urn:lsid:ipni.org:names:77112693-1

PhytoKeys 4: 67–94 (2011)
doi: 10.3897/phytokeys.4.1581
www.phytokeys.com
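
The identifier on this slide follows the general LSID layout (urn:lsid:authority:namespace:object, with an optional revision part). A minimal sketch of splitting it into its parts, assuming that layout; the `Lsid` type and `parse_lsid` function are illustrative names, not part of any IPNI tool:

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> Lsid:
    """Split an LSID string into authority, namespace, object and revision."""
    parts = text.strip().split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"LSID has an empty component: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(authority, namespace, object_id, revision)


lsid = parse_lsid("urn:lsid:ipni.org:names:77112693-1")
```

Here `lsid.authority` is `ipni.org`, `lsid.namespace` is `names` and `lsid.object_id` is `77112693-1`.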

Page 27: Advancing the International Plant Names Index (IPNI)

Pre-publication issues

• Name squatting – mitigated by only entering names which are in papers accepted for publication
• Curation of record throughout publication process
• Electronic and effective publication – before this the record will not be visible
• IPNI editors provide visible expert service re validity of name

Page 28: Advancing the International Plant Names Index (IPNI)

Where IPNI data are placed

Any name occurrence: e.g. specimens, reports, literature citation

concepts

Standard form of name

Page 29: Advancing the International Plant Names Index (IPNI)

Data out - links

• To concept layer:
  – embed IPNI identifiers
  – storage of factual concepts / links to concept layer
• To name occurrence layer:
  – seed lexical reconciliation projects (e.g. GNI)
• To allied information:
  – literature
  – types

Page 30: Advancing the International Plant Names Index (IPNI)

Links to concept layer

Embed IPNI identifiers in externally held names lists

• IPNI holds curated name data, labelled with persistent identifiers.
• Need a tool to seed IPNI identifiers into datasets (in prototype)
• Can devolve curation of name elements in other systems to IPNI

Benefit from on-going curation:

• 300,000 edits per year

Report on changes in name list since date
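
A "changes since date" report over an externally held list might look something like the sketch below. Everything here is assumed for illustration: the `Edit` shape, the identifiers and the dates are invented and do not reflect IPNI's audit tables; only the corrected epithet pair is taken from elsewhere in this talk.

```python
from datetime import date
from typing import Iterable, List, NamedTuple, Set


class Edit(NamedTuple):
    """One hypothetical audit-history entry for a name record."""
    name_id: str
    edited_on: date
    field: str
    old_value: str
    new_value: str


def changes_since(audit_log: Iterable[Edit], held_ids: Set[str], since: date) -> List[Edit]:
    """Return edits on or after `since` that touch identifiers in the external list."""
    return [
        edit for edit in audit_log
        if edit.name_id in held_ids and edit.edited_on >= since
    ]


# Made-up identifiers and dates; the field values are placeholders.
audit_log = [
    Edit("id-A", date(2010, 5, 1), "epithet", "keamcyi", "kearneyi"),
    Edit("id-B", date(2011, 2, 3), "publication_year", "1901", "1902"),
    Edit("id-C", date(2011, 4, 9), "author", "Sm.", "A.C. Sm."),
]
report = changes_since(audit_log, held_ids={"id-A", "id-B"}, since=date(2011, 1, 1))
```

Only the edit to `id-B` is reported: `id-A` was changed before the cut-off date and `id-C` is not in the external list.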

Page 31: Advancing the International Plant Names Index (IPNI)

Links to the Concept Layer

Example: The Plant List

Page 32: Advancing the International Plant Names Index (IPNI)

Link to name occurrence layer

• IPNI’s version history can be used to seed lexical reconciliation projects (GNI), e.g.:
  – Plectranthus macrophylius -> Plectranthus macrophyllus
• These editorialised translations are of higher value than programmatically derived operations of the same edit distance, e.g.:
  – Plectranthus microphyllus -> Plectranthus macrophyllus
• Standardisation tools and techniques opened up for use in allied projects
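
To make the edit-distance point concrete, the sketch below computes the Levenshtein distance for both pairs on this slide and contrasts it with a lookup seeded from editorial corrections. The `CURATED_CORRECTIONS` table and `reconcile` function are hypothetical stand-ins for corrections drawn from a version history, not an existing IPNI or GNI interface.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Hypothetical table of corrections recorded by editors.
CURATED_CORRECTIONS = {
    "Plectranthus macrophylius": "Plectranthus macrophyllus",
}


def reconcile(name: str) -> str:
    """Apply a curated correction if one exists; otherwise leave the name alone."""
    return CURATED_CORRECTIONS.get(name, name)


typo_distance = levenshtein("Plectranthus macrophylius", "Plectranthus macrophyllus")
other_distance = levenshtein("Plectranthus microphyllus", "Plectranthus macrophyllus")
```

Both distances are 1, so distance alone cannot tell the misspelling from a different epithet. The curated lookup corrects `macrophylius` and leaves `microphyllus` untouched.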

Page 33: Advancing the International Plant Names Index (IPNI)

Conclusion

• Facilitate electronic publication - pilot registration

• Foster larger community to support the data and automate workflows

• Stronger links between:
  – the people who produce names
  – the places where they are published
  – the downstream users

• Technical redevelopment
