DNA Storage Karin Strauss and the DNA Storage Team

Karin Strauss - DNA Storage, July 2016



DNA Storage

Karin Strauss and the DNA Storage Team

Next generation of big data

7/29/2016 2

Video + sensors

Genomics data

© Microsoft Research. All rights reserved.

The digital universe is growing


[Chart: Digital Universe, Installed Capacity, and Capacity Shipped, in petabytes (axis from 0 to 50,000,000). Source: IDC]


Dense, really dense

Vs.

Cold Storage: 1EB, Size: Two Walmart Supercenters


It’s here!

3-4 orders of magnitude denser than tape


Durable


(Illustration: Philipp Stössel/ETH Zurich)

DNA synthetic fossils survive:

Time               Temperature
1 week             70˚C
2,000 years        10˚C
2,000,000 years    -18˚C

Source: Grass et al., "Robust Chemical Preservation of Digital Information on DNA in Silica with Error-Correcting Codes"

And readers never become obsolete


A DNA storage primer


bases: A, C, G, T

[Diagram: the bits 0101000101011100 are mapped to a strand of bases that carries an address and data.]

Encoding, Synthesis (write)

Preservation

Random Access

Sequencing, Decoding (read)
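
The encoding and decoding steps can be illustrated with a minimal sketch. The slide does not state which code the team uses, so the fixed 2-bits-per-base table below (`BITS_TO_BASE`) and the function names `encode` and `decode` are assumptions chosen only for illustration. Practical schemes typically add redundancy such as the error-correcting codes mentioned in the Grass et al. citation.

```python
# Illustrative mapping only: 2 bits per base. This table is an assumption,
# not the encoding used by the DNA Storage Team.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(bits: str) -> str:
    """Map a bit string to a strand of bases, two bits at a time."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> str:
    """Invert encode(): map each base back to its two bits."""
    return "".join(BASE_TO_BITS[base] for base in strand)


payload = "0101000101011100"   # the bit string shown on the slide
strand = encode(payload)
print(strand)
print(decode(strand) == payload)
```

In this sketch a 16-bit payload becomes an 8-base strand, and decoding recovers the original bits exactly.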

Copying DNA


[Diagram: a strand carrying an address, with a primer target at each end. The primer (C T G A) pairs with its primer target (G A C T), and the strand is copied; four identical copies are shown.]
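
As a rough sketch of the copying step: a primer pairs base by base with its target (A with T, C with G), and in the idealized case each copying cycle doubles the number of matching strands. The helper names `complement` and `copies_after` are hypothetical, the perfect doubling is a simplifying assumption, and strand direction is ignored, as in the slide's drawing.

```python
# Base pairing rules: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a sequence."""
    return "".join(COMPLEMENT[base] for base in seq)


def copies_after(cycles: int, start: int = 1) -> int:
    """Idealized copy count: every cycle doubles the strands (assumption)."""
    return start * 2 ** cycles


primer = "CTGA"                 # primer as read from the slide
print(complement(primer))       # the primer target it pairs with
print(copies_after(cycles=10))  # copies of one strand after 10 ideal cycles
```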


Random Access with DNA


0101000101011100

[Diagram: two stored items, each a strand with its own address and primer targets. The primer (C T G A) matches the primer targets of only one item, and only that item's strand is copied; four identical copies are shown.]

selecting one item out of two
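
Selecting one item out of two can be modeled as a key lookup in which the primer acts as the key. The sketch below is a toy model under stated assumptions: the `Strand` record, the `select` function, and all sequences in `pool` are hypothetical, and the physical process amplifies matching strands rather than filtering a list.

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a sequence."""
    return "".join(COMPLEMENT[base] for base in seq)


@dataclass(frozen=True)
class Strand:
    primer_target: str  # identifies which item the strand belongs to
    address: str        # position of this strand within the item
    data: str           # payload bases


def select(pool: list[Strand], primer: str) -> list[Strand]:
    """Keep only strands whose primer target pairs with the given primer."""
    wanted = complement(primer)
    return [s for s in pool if s.primer_target == wanted]


# Hypothetical pool holding two items, mixed together.
pool = [
    Strand(primer_target="GACT", address="C", data="GGTA"),
    Strand(primer_target="AACG", address="A", data="TTGA"),
    Strand(primer_target="GACT", address="A", data="CCAC"),
]

hits = select(pool, primer="CTGA")
# Reassemble the selected item by ordering its strands by address.
item = "".join(s.data for s in sorted(hits, key=lambda s: s.address))
print(item)
```

The address field is what lets the reader put the selected strands back in order, since strands in a pool have no physical ordering.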

DNA storage works


Encoding, Synthesis (write): 200MB, latency: ~day

Random Access

Sequencing, Decoding (read): 200MB, latency: ~hours

Archival Storage

Improvements by biotechnology industry


Source: Robert Carlson

[Chart: Carlson's Curves compared with Moore's Law, log scale from 1.00E+02 to 1.00E+12, by year from 1965 to 2020. Series: Transistors per chip (Moore's Law), Reading DNA in bases/person/day (DNA Reads), Writing DNA in bases/person/day (DNA Writes).]


DNA: ultimate storage


Dense

Durable

Never obsolete

How would you use it?

Questions?
