Ilya Kupershmidt speaks at the Molecular Medicine Tri-Conference

Ilya Kupershmidt

Cofounder, VP Products

Public Data

- Thousands of studies and quickly growing

- GEO, ArrayExpress (AE), caBIG, dbGaP, etc.

The Data Problem

Internal Data

- Thousands of datasets across silos

- Data sitting on users' desktops across the company

Instant access to all data

- Seamlessly interrogate studies across therapeutic areas, projects, groups

- Connect results across data types and organisms

- Automatically update your previous findings based on new data

Query data in real time

- What experiments have been done that are similar to my own experiment?

- Which compounds (approved/proprietary) inhibit my pathway of interest?

- What tissues/organs will be affected by my drug in animal models or in humans?

- Check whether the experiment you are planning has already been done

Ideal Scenario

Many Data Sources

- Disconnected repositories

- Non-standardized annotations

Data Is Heterogeneous

- Diverse organisms and platforms

- Orthogonal data types

Requires Scalable Systems

- Large quantities of data can’t be analyzed in real time

- Need to pre-compute important associations

- Number of pre-computations grows as N^2
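
The N-squared growth in pre-computations listed above can be illustrated with a minimal sketch. It assumes that N is the number of datasets and that each pre-computation is one comparison between a pair of distinct datasets; the slides do not spell this out, and the function name `pairwise_precomputations` and the study labels are hypothetical.

```python
from itertools import combinations


def pairwise_precomputations(n_datasets: int) -> int:
    """Count unordered pairs among n_datasets items: n * (n - 1) / 2.

    Assumption (not from the slides): one pre-computed score per pair
    of distinct datasets.
    """
    if n_datasets < 0:
        raise ValueError("n_datasets must be non-negative")
    return n_datasets * (n_datasets - 1) // 2


# Cross-check the formula against explicit enumeration on a small,
# made-up collection of study labels.
example = ["study_a", "study_b", "study_c", "study_d"]
assert pairwise_precomputations(len(example)) == len(list(combinations(example, 2)))

# Show how the pair count grows as the collection grows.
for n in (10, 100, 1000):
    print(n, pairwise_precomputations(n))
```

Under this assumption, doubling the number of datasets roughly quadruples the number of scores to pre-compute, which is why on-demand analysis stops being practical at scale.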

Why connecting data is complex

Data Pre-Processing

Total Pre-computed Scores: > 15 Billion

Data Correlation and Meta-Analysis

NextBio Framework

NextBio Enterprise

Data Integration & Access

Case Study

Einstein discovered the theory of relativity without running a single new experiment!

In many cases, the most relevant experiments have already been done…

Leveraging Existing Data

Jane Su

Suman Sundaresh

Anoop Grewal

Satnam Alag

James Flynn

NextBio Team

Special Thanks

Thank You

Technology
