Virtual Biodiversity ViBRANT

Literature Mining and Mark-up
ViBRANT’s text processing tools

David Morse, The Open University, UK, [email protected]
Dauvit King, The Open University, UK, [email protected]

ViBRANT/BeBOL/JEMU workshop, RBINS, 11 June 2013


Literature Mining

Mining for Names and Concepts

ViBRANT is for taxonomists, so we look for:
• Taxon names
• Authors
• Locations

Also interested in:
• Citations
• Relationships
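As a rough illustration of what mining for taxon names involves, the sketch below finds candidate Latin binomials with a regular expression. It is not ViBRANT's method: the pattern, the stop-word list, and the function name `candidate_taxon_names` are all assumptions made for this example, and real tools check candidates against name authorities.

```python
import re

# Assumed pattern: a capitalised genus followed by a lower-case species epithet.
BINOMIAL = re.compile(r"\b([A-Z][a-z]{2,})\s+([a-z]{3,})\b")

# Assumed, deliberately tiny stop list of sentence-initial words that mimic a genus.
STOP_WORDS = {"The", "This", "These", "Those", "There"}

def candidate_taxon_names(text):
    """Return (genus, epithet) pairs that look like Latin binomials."""
    return [
        (genus, epithet)
        for genus, epithet in BINOMIAL.findall(text)
        if genus not in STOP_WORDS
    ]

sample = "Apis mellifera was collected near Oxford. The hive was large."
print(candidate_taxon_names(sample))  # [('Apis', 'mellifera')]
```

Without the stop list, "The hive" would also be returned, which is one reason a pattern alone only yields candidates.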

Literature Mining – harder than you think

BRITISH MUSEUM

(NATURAL HiSi

26JU

PRESENTED GENERAL UC.-lARY

Bulletin ofthe

BritishMuseum (Natural History)

The ichneumon-fly genus Banchus in the OldWorld

(Hymenoptera)

M. G. Fitton series

Entomology Vol51 Nol 25 July 1985
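One hedged way to cope with OCR noise like the sample above is approximate matching against a list of expected strings. The sketch below uses Python's standard `difflib`; the vocabulary `KNOWN_TERMS`, the function name `best_match`, and the 0.8 cutoff are assumptions chosen for illustration, not part of any ViBRANT tool.

```python
import difflib

# Assumed vocabulary of strings we expect on a title page like this one.
KNOWN_TERMS = [
    "British Museum (Natural History)",
    "Entomology",
    "Hymenoptera",
]

def best_match(ocr_line, vocabulary=KNOWN_TERMS, cutoff=0.8):
    """Return the closest known term for a noisy OCR line, or None."""
    matches = difflib.get_close_matches(ocr_line, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# A run-together line is recovered; a meaningless fragment is rejected.
print(best_match("BritishMuseum (Natural History)"))
print(best_match("26JU"))
```

Fragments that are heavily truncated, such as "(NATURAL HiSi", may fall below the cutoff, so a matcher like this reduces the clean-up work rather than removing it.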


GoldenGATE

Sautter, G., Agosti, D., and Böhm, K. (2007) Semi-Automated XML Markup of Biosystematics Legacy Literature with the GoldenGATE Editor. In Proceedings of PSB 2007, Wailea, HI, USA, 2007

Downloadable from http://psb.stanford.edu/psb-online/proceedings/psb07/sautter.pdf


GoldenGATE in OBOE


Visualising mark up


Taxonomic XML schemas

Lyubomir Penev, Christopher Lyal, Anna Weitzman, David Morse, David King, Guido Sautter, Teodor Georgiev, Robert Morris, Terry Catapano, and Donat Agosti. (2011) XML schemas and mark-up practices of taxonomic literature. ZooKeys 150: 89-116.

Downloadable from http://dx.doi.org/10.3897/zookeys.150.2213
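To give a feel for what inline taxonomic mark-up looks like, the sketch below wraps candidate binomials in an XML element using Python's standard library. The element names `p` and `taxonName`, the regular expression, and the function `mark_up` are hypothetical and chosen only for illustration; they are not taken from any of the schemas the paper compares.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern for a candidate binomial: capitalised genus plus lower-case epithet.
BINOMIAL = re.compile(r"\b[A-Z][a-z]{2,}\s+[a-z]{3,}\b")

def mark_up(sentence):
    """Wrap each candidate binomial in a hypothetical <taxonName> element."""
    para = ET.Element("p")
    last = None  # most recently added child element, if any
    pos = 0

    def append_text(chunk):
        # Text before the first child goes in .text; later text goes in the child's .tail.
        if last is None:
            para.text = (para.text or "") + chunk
        else:
            last.tail = (last.tail or "") + chunk

    for match in BINOMIAL.finditer(sentence):
        append_text(sentence[pos:match.start()])
        last = ET.SubElement(para, "taxonName")
        last.text = match.group(0)
        pos = match.end()
    append_text(sentence[pos:])
    return ET.tostring(para, encoding="unicode")

print(mark_up("We examined Apis mellifera in detail."))
# <p>We examined <taxonName>Apis mellifera</taxonName> in detail.</p>
```

A real schema would also record rank, authority, and identifiers, which this sketch leaves out.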


Linked Open Data


Other tools

• KEA: Keyphrase Extraction Algorithm
• GNRD: Global Names Recognition and Discovery
• Linnaeus: used for molecular data


Conclusion

Developing Literature Mining services deployed through OBOE.

Initially aimed at ViBRANT’s core audience.

Setting up workflow integrated with Scratchpads.

Yet still permitting large, slow jobs.