Ilya Kupershmidt
Cofounder, VP Products
The Data Problem
Public Data
- Thousands of studies, and growing quickly
- GEO, ArrayExpress (AE), caBIG, dbGaP, etc.
Internal Data
- Thousands of datasets across silos
- Data sitting on users’ desktops across the company
Ideal Scenario
Instant access to all data
- Seamlessly interrogate studies across therapeutic areas, projects, and groups
- Connect results across data types and organisms
- Automatically update your previous findings based on new data
Query data in real time
- What experiments have been done that are similar to my own experiment?
- Which compounds (approved/proprietary) inhibit my pathway of interest?
- What tissues/organs will be affected by my drug in animal models or in humans?
- Has the experiment I am planning already been done?
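The "similar to my own experiment" query can be pictured as ranking stored study signatures against a query signature. The sketch below is only an illustration: the slides do not say how similarity is scored, so Jaccard overlap of gene sets is used as a stand-in, and the names `jaccard`, `most_similar`, and `STUDIES` (with its contents) are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(
    query: Set[str], studies: Dict[str, Set[str]], top_n: int = 3
) -> List[Tuple[str, float]]:
    """Rank studies by overlap with the query gene set, best first."""
    scored = [(name, jaccard(query, genes)) for name, genes in studies.items()]
    # Sort by descending score; break ties by study name for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Hypothetical study signatures, for illustration only (not real results).
STUDIES = {
    "study_a": {"TP53", "MYC", "EGFR"},
    "study_b": {"TP53", "BRCA1"},
    "study_c": {"GAPDH", "ACTB"},
}

if __name__ == "__main__":
    for name, score in most_similar({"TP53", "MYC"}, STUDIES):
        print(f"{name}: {score:.2f}")
```

A production system would likely use a rank-aware statistic rather than plain set overlap; the point here is only the shape of the query, one signature in and a ranked list of studies out.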
Why connecting data is complex
Many Data Sources
- Disconnected repositories
- Non-standardized annotations
Data Is Heterogeneous
- Diverse organisms and platforms
- Orthogonal data types
Requires Scalable Systems
- Large quantities of data can’t be analyzed in real time
- Need to pre-compute important associations
- Number of pre-computations grows as N²
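The quadratic growth can be made concrete with a small sketch. It assumes, purely for illustration, that one score is pre-computed for every unordered pair of N items, which gives N(N-1)/2 scores; the slides do not define N or the scoring function, and `pair_count` and `precompute_pairs` are hypothetical names.

```python
from itertools import combinations
from typing import Callable, Dict, Hashable, Iterable, Tuple


def pair_count(n: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def precompute_pairs(
    items: Iterable[Hashable],
    score: Callable[[Hashable, Hashable], float],
) -> Dict[Tuple[Hashable, Hashable], float]:
    """Score every unordered pair once and cache the result for later lookup."""
    return {(a, b): score(a, b) for a, b in combinations(items, 2)}


if __name__ == "__main__":
    # Doubling the number of items roughly quadruples the work.
    for n in (1_000, 2_000, 4_000):
        print(n, pair_count(n))
```

Under this assumption, doubling the number of items roughly quadruples the number of scores to compute and store, which is why the associations are computed ahead of time rather than at query time.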
NextBio Framework
Data Pre-Processing
Data Correlation and Meta-Analysis
Total Pre-computed Scores: > 15 Billion
NextBio Enterprise
Data Integration & Access
Case Study
Leveraging Existing Data
Einstein developed the theory of relativity without running a single new experiment!
In many cases, the most relevant experiments have already been done…
Jane Su
Suman Sundaresh
Anoop Grewal
Satnam Alag
James Flynn
NextBio Team
Special Thanks
Thank You