Mark Kaganovich, SolveBio // Data Infrastructure for Genomics

Precision Medicine

We have a lot of data and don’t know what to do with it yet... medicine

Precision medicine?

Books you don’t want to see at your doctor’s office.

Not quite there yet...

Are we there yet?

We have the technology...

Illumina

$28 billion market cap

More data!

Even more data!

The horrawful truth!

Good luck with that...

Known unknowns

20 billion new variants will be observed in 5 years

150,000,000

VARIANTS OBSERVED

2015

VARIANTS WE UNDERSTAND

Challenge accepted!

BIOINFORMATICS EXPERT

Rare disease go-to-guy

Center for Rare Jewish Genetic Disorders

Brooklyn, NY

OUTCOMES

Variants: Unclassified

Diagnosis: Unknown

Family: Unsatisfied

Hospital: Job complete

ONE YEAR LATER

Different family

Different hospital

Same story

ClinVar

The government’s solution.

Yet another FTP site.

Submitting to ClinVar

Super painful process.

You’ll never want to submit again.

Data infrastructure for genomics

CLINICAL REPORT

DNA

MiSeq

SolveBio Beta

solve.bio/signup

ClinVar on SolveBio

Dataset.retrieve('ClinVar/3.1.0-2015-01-13/Variants').query()

Variant Explorer

GRCh37:chr7:117199644-117199647>A

Date Generated - 2012 / 12 / 08 12:01:45PM EST

Rare Variant

CLINICAL EVIDENCE

Reported Pathogenic

POPULATION GENETICS

<1% GMAF

EFFECT PREDICTION

Inframe deletion

VARIANT IDENTIFICATION

CHR: 7

TYPE: Deletion

SIZE: 3bp

START: 117,199,644

STOP: 117,199,647

REF: ATCT

ALT: A
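The type and size shown on this slide follow directly from the REF and ALT alleles. Below is a minimal sketch of that derivation, assuming VCF-style alleles that share a leading anchor base. The function names `classify_variant` and `is_inframe` are hypothetical and are not part of SolveBio's API.

```python
from typing import Tuple


def classify_variant(ref: str, alt: str) -> Tuple[str, int]:
    """Return (type, size in bp) for VCF-style REF/ALT alleles.

    Hypothetical helper for illustration only.
    """
    if len(ref) == len(alt):
        # Same length: a single-base or multi-base substitution.
        kind = "SNV" if len(ref) == 1 else "MNV"
        return kind, len(ref)
    if len(ref) > len(alt):
        # Reference is longer: bases were deleted.
        return "Deletion", len(ref) - len(alt)
    # Alternate is longer: bases were inserted.
    return "Insertion", len(alt) - len(ref)


def is_inframe(size_bp: int) -> bool:
    """An indel in coding sequence keeps the reading frame if its size is a multiple of 3."""
    return size_bp % 3 == 0


kind, size = classify_variant("ATCT", "A")
print(kind, size, is_inframe(size))  # Deletion 3 True
```

For the variant on the slide, ATCT to A removes three bases, which is consistent with the "Deletion", "3bp", and "Inframe deletion" labels.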

GENOMIC

NG_016465.3:g.98809_98811delCTT

NC_000007.13:g.117199646_117199648delCTT

NC_000007.14:g.117559592_117559594delCTT

NG_016465.1:g.84630_84632delCTT

CODING DNA

NM_000492.3:c.1521_1523delCTT

XM_006715842.1:c.1845_1847delCTT

PROTEIN

NP_000483.3:p.Phe508del

NP_000483.3:p.Phe508delPhe

XP_006715905.1:p.Phe616del

HGVS NM_000492.3:c.1521_1523delCTT
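HGVS names like the one above pack an accession, a coordinate type, a range, and the deleted bases into a single string. Here is a rough sketch of pulling those parts out with a regular expression. It assumes only the simple `start_end` + `del` + bases form seen on this slide, not the full HGVS grammar, and `parse_hgvs_deletion` is a hypothetical name.

```python
import re

# Matches only simple range deletions such as NM_000492.3:c.1521_1523delCTT.
# This is an illustrative subset, not a complete HGVS parser.
HGVS_DEL = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+\.\d+):"  # e.g. NM_000492.3
    r"(?P<kind>[cg])\."                    # c = coding DNA, g = genomic
    r"(?P<start>\d+)_(?P<end>\d+)"         # inclusive coordinate range
    r"del(?P<deleted>[ACGT]*)$"            # deleted bases (may be omitted)
)


def parse_hgvs_deletion(hgvs: str) -> dict:
    """Split a simple HGVS deletion name into its parts (hypothetical helper)."""
    match = HGVS_DEL.match(hgvs)
    if match is None:
        raise ValueError(f"Not a simple HGVS range deletion: {hgvs!r}")
    parts = match.groupdict()
    parts["start"] = int(parts["start"])
    parts["end"] = int(parts["end"])
    # HGVS ranges are inclusive on both ends.
    parts["size"] = parts["end"] - parts["start"] + 1
    return parts


print(parse_hgvs_deletion("NM_000492.3:c.1521_1523delCTT"))
```

The same function accepts the genomic form, for example `NC_000007.13:g.117199646_117199648delCTT`, and reports a size of 3 in both cases.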

[Figure: 5’ alignment and 3’ alignment of the deletion; positions marked at 117,199,624, 117,199,644, 117,199,647, and 117,199,667]

Better way to explore the genome