
Asking the scientific literature to tell us about metabolism

Page 1

Peter Murray-Rust, Reader Emeritus, Dept of Chemistry, Univ Cambridge

and Founder TheContentMine

Lhasa, Leeds, UK, 2017-01-12

contentmine.org is supported by a grant to PMR as a

Page 2

Thousands of scientists have to re-type the literature.

Machines should be doing it!

Treat them as friends.

100 clinical trials a day, 5000 articles a day

Page 3

Software and Special Thanks

Molecular Informatics, Cambridge
Peter Corbett, OSCAR (chemical entities)
Andy Howlett, “OSIRIS” (graphical chemistry)
Daniel Lowe, OPSIN (name to structure)
Lezan Hawizy, ChemicalTagger (recipes)
Mark Williamson, integration and deployment

ContentMine
Rik Smith-Unna, getpapers, quickscrape (discovery)
Tom Arrow, WikiFactMine (Wikimedia semantics)
PM-R, norma, AMI (platform), CML (semantics)

ALL SOFTWARE IS OPEN (Apache2)

Page 4

AMI! Tell me what YOU know about monoxidine?

Page 5

Wikipedia

Page 6

Wikidata for monoxidine

Page 7

Wikidata for moxonidine

Page 8

Entity extraction

OPSIN says this name is wrong!

OSIRIS will interpret this structure, including the annotation

Page 9

Reaction Schemes

Page 10

Tables

Page 11

Tables

Page 12

Graphs

Page 15

Entities

Page 16

Plot

Plot

Page 17

Maths?

Page 18

Models?

Page 19

What’s the title?

Page 20

Some demos

Page 21

What is “Content”?

http://www.plosone.org/article/fetchObject.action?uri=info:doi/10.1371/journal.pone.0111303&representation=PDF CC-BY

SECTIONS

MAPS

TABLES

CHEMISTRY

TEXT

MATH

contentmine.org tackles these

Page 22

http://chemicaltagger.ch.cam.ac.uk/

• Typical chemical synthesis

Page 23

Automatic semantic markup of chemistry

Could be used for analytical, crystallization, etc.

Page 24

AMI https://bitbucket.org/petermr/xhtml2stm/wiki/Home

Example reaction scheme, taken from MDPI Metabolites 2012, 2, 100-133; page 8, CC-BY:

AMI reads the complete diagram, recognizes the paths and generates the molecules. Then she creates a stop-frame animation showing how the 12 reactions lead into each other.

CLICK HERE FOR ANIMATION

(may be browser dependent)

Page 25

UNITS

TICKS

QUANTITY

SCALE

TITLES

DATA!! 2000+ points

VECTOR PDF
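
One way to picture how ticks and scale turn plotted points back into data: calibrate each axis from two tick marks, then map every point linearly. The sketch below is illustrative only and is not taken from AMI; the function names (`axis_calibration`, `to_data`) and all coordinates are hypothetical, and it assumes linear (not logarithmic) axes.

```python
from typing import List, Tuple

# Hypothetical helper names; this is not AMI's code.
Tick = Tuple[float, float, float, float]  # (pixel_a, value_a, pixel_b, value_b)


def axis_calibration(pix_a: float, val_a: float,
                     pix_b: float, val_b: float) -> Tuple[float, float]:
    """Return (scale, offset) such that value = scale * pixel + offset."""
    if pix_a == pix_b:
        raise ValueError("tick marks must sit at different pixel positions")
    scale = (val_b - val_a) / (pix_b - pix_a)
    offset = val_a - scale * pix_a
    return scale, offset


def to_data(points: List[Tuple[float, float]],
            x_ticks: Tick, y_ticks: Tick) -> List[Tuple[float, float]]:
    """Map page coordinates to data coordinates, assuming linear axes."""
    sx, ox = axis_calibration(*x_ticks)
    sy, oy = axis_calibration(*y_ticks)
    return [(sx * px + ox, sy * py + oy) for px, py in points]


# Made-up calibration: two labelled ticks per axis.
x_ticks = (100.0, 0.0, 500.0, 10.0)  # pixel 100 is x = 0, pixel 500 is x = 10
y_ticks = (400.0, 0.0, 200.0, 1.0)   # pixel 400 is y = 0, pixel 200 is y = 1
points = [(100.0, 400.0), (300.0, 300.0), (500.0, 200.0)]

data = to_data(points, x_ticks, y_ticks)
for x, y in data:
    print(f"{x:.3f},{y:.3f}")  # one CSV row per recovered point
```

The y scale comes out negative in this example because the made-up page coordinates grow downward while the data values grow upward.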

Page 26

Dumb PDF

CSV

Semantic Spectrum

2nd Derivative

Smoothing Gaussian Filter

Automatic extraction
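
The smoothing and second-derivative steps named on this slide can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pipeline's actual code: it assumes evenly spaced points, clamps at the edges, and uses made-up intensities; `gaussian_kernel`, `smooth` and `second_derivative` are hypothetical names.

```python
import math
from typing import List


def gaussian_kernel(sigma: float, radius: int) -> List[float]:
    """Normalised Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-0.5 * (i / sigma) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(values: List[float], sigma: float = 1.0) -> List[float]:
    """Gaussian smoothing; indices past either end are clamped to the edge."""
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out


def second_derivative(values: List[float], step: float = 1.0) -> List[float]:
    """Central second difference; result[i] belongs to values[i + 1]."""
    return [(values[i - 1] - 2 * values[i] + values[i + 1]) / step ** 2
            for i in range(1, len(values) - 1)]


# Made-up intensities with a single peak at index 4.
intensities = [0.0, 0.0, 1.0, 4.0, 9.0, 4.0, 1.0, 0.0, 0.0]
smoothed = smooth(intensities, sigma=1.0)
curvature = second_derivative(smoothed)

# The most negative curvature marks the peak centre.
peak_index = curvature.index(min(curvature)) + 1
print("peak at index", peak_index)
```

Smoothing before differentiating matters because a second difference amplifies point-to-point noise; the Gaussian width `sigma` is a tuning choice, not a value taken from the slides.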

Page 29

Search on publicly accessible papers on “Zika”

https://rawgit.com/ContentMine/amidemos/master/zika/full.dataTables.html

Page 32

C) What’s the problem with this spectrum?

Org. Lett., 2011, 13 (15), pp 4084–4087

Original thanks to ChemBark

Page 33

After AMI2 processing…

… AMI2 has detected a square
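
How AMI2 finds the square is not described on the slide. As a purely illustrative sketch, assuming the vector graphic yields closed paths as lists of vertices, a path can be tested for squareness by checking that its four sides are equal and its two diagonals are equal. The function name `is_square` and the tolerance are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def is_square(vertices: List[Point], tol: float = 1e-6) -> bool:
    """True if the path is a square: four equal sides and equal diagonals.

    Accepts a closed path whose last vertex repeats the first.
    """
    if len(vertices) == 5 and vertices[0] == vertices[-1]:
        vertices = vertices[:-1]
    if len(vertices) != 4:
        return False

    def dist(p: Point, q: Point) -> float:
        return math.hypot(q[0] - p[0], q[1] - p[1])

    sides = [dist(vertices[i], vertices[(i + 1) % 4]) for i in range(4)]
    if min(sides) <= tol:
        return False  # degenerate path
    diag_a = dist(vertices[0], vertices[2])
    diag_b = dist(vertices[1], vertices[3])
    equal_sides = max(sides) - min(sides) <= tol
    equal_diagonals = abs(diag_a - diag_b) <= tol
    return equal_sides and equal_diagonals


# Made-up paths of the kind a vector graphic might contain.
box = [(10.0, 10.0), (20.0, 10.0), (20.0, 20.0), (10.0, 20.0), (10.0, 10.0)]
strip = [(0.0, 0.0), (30.0, 0.0), (30.0, 5.0), (0.0, 5.0)]

print(is_square(box), is_square(strip))
```

The test is rotation-independent: a rhombus passes the equal-sides check but fails on the diagonals, and a rectangle fails on the sides.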

Page 35

“… simulated by 21cmFAST is in principle independent”

“it is a feature of the 21cmFAST code, and is explained in §3.1.”

SciCodes[1]: Searching for software in arXiv[2]

[1] Proposal to LJ Arnold Foundation (Alice Allen ASCL and PMR)

Using the semi-numerical simulation, 21cmFAST,

[2] arxiv.org: the physics/maths/astronomy… preprint server

The language identifies the software!

arXiv has >500 mentions of “21cmFAST”
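
The idea that the surrounding language identifies the software can be sketched with a few context patterns. The three patterns below are assumptions modelled on the phrases quoted on this slide; they are not the SciCodes method, and `find_software_mentions` is a hypothetical name.

```python
import re
from collections import Counter

# Hypothetical context patterns, modelled on the quoted phrases.
CONTEXT_PATTERNS = [
    re.compile(r"simulated by (?P<name>[A-Za-z0-9_\-]+)"),
    re.compile(r"feature of the (?P<name>[A-Za-z0-9_\-]+) code"),
    re.compile(r"simulation, (?P<name>[A-Za-z0-9_\-]+),"),
]


def find_software_mentions(text: str) -> Counter:
    """Count candidate software names that appear in a known context."""
    counts: Counter = Counter()
    for pattern in CONTEXT_PATTERNS:
        for match in pattern.finditer(text):
            counts[match.group("name")] += 1
    return counts


# The three snippets quoted on the slide, joined into one text.
sample = (
    "... simulated by 21cmFAST is in principle independent. "
    "it is a feature of the 21cmFAST code, and is explained in §3.1. "
    "Using the semi-numerical simulation, 21cmFAST, "
)

mentions = find_software_mentions(sample)
print(mentions.most_common())
```

Patterns like these trade recall for precision: they only fire where the wording signals software, so they will miss bare mentions but avoid matching ordinary words.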