Model Management for Systems Biology Projects
Dagmar Waltemath (University of Rostock)
1st RSGLux congress. Belval, Luxembourg. November 2016
Disclaimer
All comic-style graphics in this presentation were done either by Anna Zhukova or by Martin Peters. Thank you very much!
Who I am and what I do
Projects. SEMS | de.NBI data management for German Bioinformatics network | SBGN-ED+
Community work. Standard development | COMBINE coordinator | SBML editor
Research interests. Reproducibility of modeling results | Sustainability of scientific outcomes
Other things. Education of young scientists | Open Access & open data | Gender equality in science
SEMS@University of Rostock, Germany (2015)
Model management. Or: How I got into this reproducibility topic...
Reproduce simulations
Ship & archive modeling results
Detect differences
Understand model evolution
Develop management strategies for models
(Timeline axis: 2008, 2012, 2014, 2016)
Why many want data managed
I need support in organising the data for my thesis.
Funders say I must make all project data available for the next 10 years.
I need to share parts of my data with collaborators and want to keep track.
These are only some examples.
...and why they still don’t do it.
This takes time.
The software does not support the format I need for my data.
I do not want to share my data. I want full control.
These are only some examples – there are many, many more.
But why they should …
50+ % of research studies are not reproducible*!
*Study performed by Bayer (2011) to check replicability of 67 results in cancer studies. More in: Waltemath & Wolkenhauer (2016) IEEE TBME
Problem: Many data items
Characteristics of the data
– Heterogeneous
– Big
– Distributed
– Complex
Requirements of the field
– Long-term availability
– Thorough documentation/trust
– High data quality
– Interoperability & reusability
How do we manage the data … once we have it?
science sucks - sterni4ever
Use & follow a data management plan
Data management
● procedures and actions that help to store, preserve, organize and control the data generated during a (research) project.
Examples & resources
● Data management plans provided by funders, e.g. NIH
● Checklist for a data management plan
Use & follow a data management plan
Key principles
● Avoid re-collection of data
● Keep control of data at all steps of the data life cycle
● Justify data collection
● Specify the collected data
● Perform data audit
● Archive the data
Is the data archived properly? What are the planned destruction mechanisms?
What kind of data is collected? How was it processed?
Is the data fit for purpose and held securely?
Is the data useful and the data collection effective?
Use a dedicated model management system
Benefits
– Your data is organised and documented.
– Your data is kept safe (backup) and secure.
– User and sharing management for small and large projects, and for work groups.
– Management functionality comes for free, e.g. interlinks to other databases, version control, search!
Use a dedicated model management system
Example: FAIRDOMHub
– Data & model management for Systems Biology
– Follows the FAIR principles (Wilkinson et al 2016)
– User support, PALs meetings, online tutorials
– Project-based instances, ISA-Tab, but flexible
More information at: https://www.fairdomhub.org/
[Figure panels: Version 2, Version 4]
Use standards for data sharing and interoperability
Fig.: Mosaic of standards, adapted from Chelliah et al (2009) DILS
Guidelines, ontologies and standards for modeling & simulation of biological systems.
Use standards for data sharing and interoperability
Figure: Draeger and Palsson (2014). More on COMBINE at: http://co.mbine.org
– Help develop standards
– Access to all specifications
– Tutorials, forums, mailing lists
– Events
Publish, share & archive your study in a model repository
Curated
Open
Standard formats
Repositories: BiGG, BioModels, JWS Online Model Database, Physiome Model Repository
Care for your models’ quality
● MIASE and MIRIAM Guidelines → read, understand, implement.
● COMBINE annotations (RDF / OWL / Bio-ontologies)
– To annotate models: COPASI, libSBML
– To annotate simulations: SED-ML Web Tools, JWS Online Simulator
– Specifically: Add SBO terms wherever possible to improve later conversion between standards*
*Format converters for COMBINE standards: Rodriguez et al (2016)
Semantic annotations to bio-ontologies
Quality enhancer
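As an illustration of the advice to add SBO terms, here is a minimal sketch using only the Python standard library. In practice a tool such as libSBML or COPASI would do this. The toy model, its ids, and the id-to-term mapping are made up for this example, the fragment is not a complete valid SBML file, and the SBO identifiers should be checked against the SBO itself before use.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"
ET.register_namespace("", SBML_NS)

# Hypothetical toy fragment; ids are invented for illustration only.
TOY_SBML = f"""<sbml xmlns="{SBML_NS}" level="3" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="glucose" compartment="cell"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="r1"/>
    </listOfReactions>
  </model>
</sbml>"""

# Illustrative mapping from element id to SBO term (assumed, verify before use).
SBO_BY_ID = {"glucose": "SBO:0000247", "r1": "SBO:0000176"}


def add_sbo_terms(sbml_text, sbo_by_id):
    """Set the sboTerm attribute on every element whose id is in the mapping.

    Elements that already carry an sboTerm are left unchanged.
    """
    root = ET.fromstring(sbml_text)
    for element in root.iter():
        element_id = element.get("id")
        if element_id in sbo_by_id and element.get("sboTerm") is None:
            element.set("sboTerm", sbo_by_id[element_id])
    return ET.tostring(root, encoding="unicode")


annotated = add_sbo_terms(TOY_SBML, SBO_BY_ID)
```

Keeping the mapping separate from the model makes it easy to review the chosen terms before they are written into the file.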
Care for your models’ quality
● Open publication in model repositories, e.g.: in BioModels, JWS Online Model Database, Physiome Model Repository
● Full documentation of provenance, e.g.: Research Object framework
● Export and publish study as COMBINE Archive, e.g.: using COMBINE Archive Web, JWS Online, SED-ML Web Tools
Documented, reproducible simulation study
Quality enhancer
Link: JWS Online Simulation Database. Peters et al (2016, under revision)
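To make the COMBINE Archive export more concrete: an archive is a ZIP file with a manifest.xml that lists each file and its format. The sketch below builds one with the Python standard library. It is a simplified illustration, not what the tools named above do internally; the file names and contents are placeholders, file names are not XML-escaped, and the format URIs should be checked against the COMBINE Archive specification.

```python
import io
import zipfile

OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
SBML_FORMAT = "http://identifiers.org/combine.specifications/sbml"
SEDML_FORMAT = "http://identifiers.org/combine.specifications/sed-ml"


def build_combine_archive(files):
    """Return the bytes of a ZIP holding the given files plus a manifest.xml.

    files maps a location (file name) to (format URI, content bytes, is_master).
    """
    entries = [
        '  <content location="." '
        'format="http://identifiers.org/combine.specifications/omex"/>',
        f'  <content location="./manifest.xml" format="{OMEX_NS}"/>',
    ]
    for location, (fmt, _content, master) in files.items():
        master_attr = ' master="true"' if master else ""
        entries.append(
            f'  <content location="./{location}" format="{fmt}"{master_attr}/>'
        )
    manifest = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<omexManifest xmlns="{OMEX_NS}">\n'
        + "\n".join(entries)
        + "\n</omexManifest>\n"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", manifest)
        for location, (_fmt, content, _master) in files.items():
            archive.writestr(location, content)
    return buffer.getvalue()


# Placeholder contents; a real study would ship the actual model and SED-ML files.
omex_bytes = build_combine_archive({
    "model.xml": (SBML_FORMAT, b"<sbml/>", False),
    "simulation.sedml": (SEDML_FORMAT, b"<sedML/>", True),
})
```

Writing `omex_bytes` to a file with the `.omex` extension gives something the archive tools can open, provided the placeholder contents are replaced by real files.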
Care for your models’ quality
● Functional curation (testing models under a range of perturbations), e.g.: in the Cardiac Electrophysiology Web Lab
● Documentation of origin for all parameter values
● Linking model – simulation studies – experimental data – conditions – simulation data – publication
Validation of model behavior
Quality enhancer
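Functional curation, as described on this slide, means running a model under a range of perturbations and checking that it behaves as expected. The toy sketch below shows the idea for a one-variable decay model. It is not the protocol language of the Cardiac Electrophysiology Web Lab; the model, the perturbation values, and the two expected behaviours are assumptions chosen for illustration.

```python
def simulate_decay(x0, k, t_end=10.0, dt=0.001):
    """Forward-Euler integration of dx/dt = -k * x; returns x at t_end."""
    x = x0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x += dt * (-k * x)
    return x


def functional_check(model, rate_constants):
    """Run the model for each rate constant (given in increasing order).

    Two expected behaviours are tested:
    1. the final value stays between 0 and the initial value 1.0;
    2. a larger rate constant gives a strictly smaller final value.
    """
    finals = [model(1.0, k) for k in rate_constants]
    stays_bounded = all(0.0 <= x <= 1.0 for x in finals)
    faster_decay_gives_less = all(a > b for a, b in zip(finals, finals[1:]))
    return stays_bounded and faster_decay_gives_less


result = functional_check(simulate_decay, [0.1, 0.5, 1.0, 2.0])
```

A model that violates either expectation, for example one that grows instead of decays, fails the check, which is the kind of signal functional curation is meant to give.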
Care for your models’ quality
Figures: Electrophysiology Web Lab, Cooper et al (2016)
In summary: Make your study valuable & sustainable
Check reproducibility prior to publication!
Steps towards making a study reproducible: Henkel et al (2013), Springer – closed access :(
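One simple way to check reproducibility before publication is to re-run the simulation and compare the output with stored reference values, within a tolerance. The sketch below assumes both results are plain lists of numbers sampled at the same time points; the values and tolerances are hypothetical.

```python
import math


def reproduces(reference, rerun, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if both result series have equal length and agree point by point."""
    if len(reference) != len(rerun):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(reference, rerun)
    )


# Hypothetical numbers standing in for a published curve and a fresh re-run.
published = [1.0, 0.6065307, 0.3678794]
fresh_run = [1.0, 0.6065307, 0.3678795]
matches = reproduces(published, fresh_run)
```

The tolerance has to be chosen per study, since different solvers and platforms can legitimately differ in the last digits.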
If your work is available, documented, and open, we can index it, so it can be retrieved by others.
Collecting & integrating modeling data
MASYMOS: Store models
Figure (left): Visualising database content for 6 BioModels & versions (courtesy M. Peters), Figure (right): Henkel et al (2013) DATABASE
Collecting & integrating modeling data
JWS Online: Find simulations
Figure: Peters et al (2016) under revision
Provenance – who changed what when where and why?
BiVeS: Keep track of changes in a model
More information in: Scharm et al (2015) BIOINFORMATICS, https://sems.uni-rostock.de/projects/bives/
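To give a feel for what tracking changes between model versions involves, here is a deliberately naive sketch that compares two XML-encoded model versions by element id. This is not the BiVeS algorithm, which is described in Scharm et al (2015); the two toy versions and the inserted/deleted/updated classification are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def elements_by_id(model_text):
    """Map each element id to (local tag name, attributes)."""
    root = ET.fromstring(model_text)
    found = {}
    for element in root.iter():
        element_id = element.get("id")
        if element_id is not None:
            local_tag = element.tag.split("}")[-1]
            found[element_id] = (local_tag, dict(element.attrib))
    return found


def diff_models(old_text, new_text):
    """Classify ids as inserted, deleted, or updated between two versions."""
    old, new = elements_by_id(old_text), elements_by_id(new_text)
    return {
        "inserted": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "updated": sorted(i for i in old.keys() & new.keys() if old[i] != new[i]),
    }


# Hypothetical toy versions of one model.
NS = "http://www.sbml.org/sbml/level3/version1/core"
V1 = f"""<sbml xmlns="{NS}"><model id="m"><listOfParameters>
<parameter id="k1" value="0.1"/><parameter id="k2" value="2"/>
</listOfParameters></model></sbml>"""
V2 = f"""<sbml xmlns="{NS}"><model id="m"><listOfParameters>
<parameter id="k1" value="0.3"/><parameter id="k3" value="5"/>
</listOfParameters></model></sbml>"""

changes = diff_models(V1, V2)
```

Matching by id misses renamed or moved elements and ignores element content such as kinetic laws, which is one reason a dedicated tool is needed for real models.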
Provenance – who changed what when where and why?
MOST: Keep track of changes in public model repositories
Figure: courtesy V. Touré, Scharm et al (in preparation), http://most.sems.uni-rostock.de
[Figure labels: version 3 (05-06-2006), version 4 (03-10-2006), version 5 (05-01-2007), with BiVeS diffs 3-4 and 4-5; version 13 (26-01-2010), version 14 (30-09-2010), version 15 (15-04-2011), with BiVeS diffs 13-14 and 14-15]
Reusable models
Fully featured COMBINE archive
Example of a complete COMBINE archive (BIOM 144).
Recon 2
Reconstruction of human metabolism reuses existing networks.
Whole cell model
Based on >170 publications. All model-related data & code available.
These are only some examples. Much to explore on BioModels, FAIRDOMHub, biosharing, ...
Thank you!
Contact me if you want:
• help with our tools
• help with COMBINE standards
• to set up a FAIRDOMHub project
• to get involved in all the exciting efforts.
Ron Henkel: MASYMOS
Martin Peters: M2CAT, JWS, MASYMOS
Martin Scharm: BiVeS, Web Lab
Tom Gebhardt: MOST
Vasundra Touré: SBGN-ED
Mariam Nassar: Ranking, MASYMOS
@dagmarwaltemath
ORCID: 0000-0002-5886-5563