Global Genomic Resources for Biodiversity Research
Jonathan Coddington, Global Genome Initiative, Smithsonian Institution
2019-04-16

Page 1

Global Genomic Resources for Biodiversity Research

Jonathan Coddington, Global Genome Initiative, Smithsonian Institution

Page 2

Global Genome Initiative

Before: Hard-to-find tissues of ambiguous quality, ambiguously owned by individual PIs
After: Discoverable genomic samples in institutional biorepositories (best practices and international treaties)

Before: “Boutique” sequencing of a few genomes
After: Affordable, coordinated sequencing of a thoughtful synopsis of all of Life

Before: Phenotypic, expert-limited taxonomy
After: Approximate “mesoscale” IDs of most organisms anywhere

Before: Dispersed environmental biology, evolution, conservation, ecology, biotech
After: Precise, scalable, cheap tools

Page 3

Four Ways to Be a Genomic Sample

Warm Living (best): Wildlands (!!), parks (!), zoos, botanical gardens, aquaria

Cold Living (good): Cell cultures, seed banks, biobanks, biorepositories

Warm Dead (ok…): Museums, herbaria, other collections

Cold Dead (less good): Biobanks, biorepositories, research freezers, etc.

LIFE ON ICE!

Page 4

Eukaryotes

Global Genome diversity = Σ (branch lengths)?

Just one Genome!
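The idea that global genome diversity might be measured as a sum of branch lengths can be illustrated with a small sketch. The toy tree, its taxon names, and its branch lengths are invented for illustration and do not come from the talk; the functions simply add up every branch in a nested-dictionary tree, and show how little of the total is captured when only one lineage is kept.

```python
# Toy rooted tree: each entry maps a node name to (branch_length, subtree).
# All names and lengths are hypothetical, chosen only to illustrate the sum.
toy_tree = {
    "Eukaryotes": (1.0, {
        "Animals": (0.5, {}),
        "Fungi": (0.4, {}),
        "Plants": (0.7, {}),
    }),
    "Bacteria": (1.2, {}),
}


def total_branch_length(tree):
    """Return the sum of all branch lengths in the tree."""
    total = 0.0
    for length, subtree in tree.values():
        total += length + total_branch_length(subtree)
    return total


def pruned(tree, keep):
    """Keep only the leaves named in `keep` and the branches leading to them."""
    result = {}
    for name, (length, subtree) in tree.items():
        if subtree:
            kept = pruned(subtree, keep)
            if kept:
                result[name] = (length, kept)
        elif name in keep:
            result[name] = (length, {})
    return result


whole = total_branch_length(toy_tree)
one_genome = total_branch_length(pruned(toy_tree, {"Animals"}))
print(f"whole tree: {whole:.1f}, one sampled genome: {one_genome:.1f}")
```

In this toy case the whole tree sums to 3.8 while a single sampled genome preserves 1.5, which is the sense in which one genome covers only a fraction of the total.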

Page 5

Discovery of “Families”

[Figure: Discovery of Major Lineages (Families). X-axis: Earliest Description (1750 to 2025). Y-axis: Percent of 2020 Total (0 to 1.00). Series: Angiosperms, Chordata, Animalia, Foraminifera, Fungi.]

Page 6

Big Data for 9,911 families

Source   Total Records    % Families   Missing   Max
GBIF       955,459,561    0.93           710     51,162,821
BHL          3,603,706    0.79         2,064         31,017
NCBI       219,978,814    0.77         2,288     17,150,286
OTOL*        1,903,704    0.76         2,407         39,969
BOLD         6,373,906    0.57         4,282        424,231
EOL             99,886    0.38         6,132          2,707
GGBN         1,632,440    0.36         6,334        165,650
Total    1,187,419,577                   196

*OTOL = Open Tree of Life

Only 196 families absent from all 7 DBs

High Priority for IBOL2: 4,282 families?
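The relationship between the "% Families" and "Missing" columns can be checked with a few lines of arithmetic. This is a sketch under one assumption that the slide does not state outright: that "% Families" is the fraction of the 9,911 families with at least one record in a source, i.e. 1 minus missing/9,911. The variable and function names are illustrative.

```python
TOTAL_FAMILIES = 9_911  # family count stated in the slide title

# Families missing from each source, as listed in the table.
missing = {
    "GBIF": 710,
    "BHL": 2_064,
    "NCBI": 2_288,
    "OTOL": 2_407,
    "BOLD": 4_282,
    "EOL": 6_132,
    "GGBN": 6_334,
}


def coverage(missing_count, total=TOTAL_FAMILIES):
    """Fraction of families with at least one record (assumed definition)."""
    return 1 - missing_count / total


coverage_by_source = {
    source: round(coverage(count), 2) for source, count in missing.items()
}

for source, fraction in coverage_by_source.items():
    print(f"{source:5s} {fraction:.2f}")
```

Under that assumption the computed fractions reproduce the table's values to two decimal places, from 0.93 for GBIF down to 0.36 for GGBN.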

Page 7

All Phyla

[Figure: Venn diagram of phylum coverage across four sources. Set sizes: Catalogue of Life (101); Global Genome Biodiversity Network Tissues/DNA (68); Global Catalogue of Microorganisms Cultures (57); NCBI GenBank DNA Barcodes (37). Region counts: 3, 16, 10, 20, 33, 4, 10, 5.]

Last updated 03 October 2018

Results exclude names from GCM that did not match to CoL. Mismatch rate for GCM was 2%.

Animalia: Cycliophora, Dicyemida, Entoprocta, Gastrotricha, Gnathostomulida, Micrognathozoa, Myxozoa, Nematomorpha, Onychophora, Orthonectida, Placozoa
Chromista: Acavomonidia, Picozoa, Radiozoa
Plantae: Anthocerotophyta
Protozoa: Calcitarcha, Choanozoa, Metamonada, Microsporidia

Page 8

All Families

[Figure: Venn diagram of family coverage across four sources. Set sizes: Catalogue of Life (9,858); Global Genome Biodiversity Network Tissues/DNA (3,457); NCBI GenBank DNA Barcodes (3,248); Global Catalogue of Microorganisms Cultures (1,221). Region counts: 787; 4,942; 186; 2,118; 363157; 515; 790.]

Last updated 03 October 2018

Results exclude names that did not match to CoL. Mismatch rates: GGBN 7%, GenBank 8%, GCM 7%.

Page 9

All Genera

[Figure: Venn diagram of genus coverage across four sources. Set sizes: Catalogue of Life (160,938); NCBI GenBank DNA Barcodes (24,375); Global Genome Biodiversity Network Tissues/DNA (17,783); Global Catalogue of Microorganisms Cultures (6,402). Region counts: 11,722; 126,141; 344; 11,730; 766579; 4,713; 4,943.]

Last updated 03 October 2018

Results exclude names that did not match to CoL. Mismatch rates: GGBN 6%, GenBank 10%, GCM 7%.
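The coverage comparisons on these slides rest on matching each source's taxon names against the Catalogue of Life, setting aside names that do not match, and counting what remains uncovered. A minimal sketch of that bookkeeping with Python sets follows. The function name, the tiny name lists, and the deliberately misspelled entry are all illustrative assumptions, not data or code from the talk.

```python
def gap_analysis(reference_names, source_names):
    """Compare a source's taxon names against a reference checklist.

    Returns the matched names, the source names that fail to match,
    the reference names the source does not cover (the gap), and the
    share of source names that failed to match (the mismatch rate).
    """
    reference = set(reference_names)
    source = set(source_names)
    matched = source & reference
    unmatched = source - reference
    gap = reference - matched
    mismatch_rate = len(unmatched) / len(source) if source else 0.0
    return {
        "matched": matched,
        "unmatched": unmatched,
        "gap": gap,
        "mismatch_rate": mismatch_rate,
    }


# Hypothetical inputs: a five-family checklist and a repository listing
# that contains one misspelled name ("Felidea").
checklist_families = ["Felidae", "Canidae", "Hominidae", "Ursidae", "Fagaceae"]
repository_families = ["Felidae", "Canidae", "Fagaceae", "Felidea"]

result = gap_analysis(checklist_families, repository_families)
print(sorted(result["gap"]), result["mismatch_rate"])
```

Here the gap is Hominidae and Ursidae, and one of the four repository names fails to match, giving a mismatch rate of 0.25. Exact string matching is the simplest choice; real name reconciliation would also need to handle synonyms and spelling variants.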

Page 10

GGBN Solves a Problem: data model for tissues, DNAs, etc.

GBIF: specimens & vouchers

NCBI, BOLD: sequences

GGBN: tissues, DNAs, RNAs, physical genetic resources (The Missing Link)

Page 11

83 members, 30 countries, 2M samples, 3,899 families, 20K genera, 45K species

Page 12

GGBN DarwinCore Extension (156 fields, 9 mandatory)

http://terms.tdwg.org/wiki/GGBN_Data_Standard

Page 13

GGBN Progress as of May 2018

[Figure: Bar chart of names in the Catalogue of Life by taxonomic rank, on a log scale from 1 to 10,000,000, with each bar split into "In GGBN" and "Not in GGBN". Ranks: Kingdoms, Phyla, Classes, Orders, Families, Genera, Species. Values labeled on the chart, in rank order: 8; 96; 313; 1,365; 9,446; 156,056; 1,885,450.]

Page 14

Eukaryotic Genome Quality (n=3311)

[Figure: Bar chart of number of genomes (0 to 500) by genome quality code. Series: Animals, Plants, Fungi, Protists, Other. Quality codes shown: 1.2.2, 1.4.4, 1.6.6, 2.2.2, 2.3.3, 2.3.5, 2.3.7, 2.4.5, 2.4.7, 2.5.5, 2.5.7, 2.6.6, 2.6.8, 2.7.8, 3.4.5, 3.4.7, 3.5.5, 3.5.7, 3.5.9, 3.6.7, 3.7.7, 3.7.9, 4.6.6.]

X-axis key: first digit = genome level (1 = contig, 2 = scaffold, 3 = chromosome, 4 = complete); second digit = log(contig N50); third digit = log(scaffold N50).

After Lewin et al. 2018, Earth BioGenome Project: Sequencing life for the future of life. www.pnas.org/cgi/doi/10.1073/pnas.1720115115
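The three-part quality code on the x-axis can be sketched in code. This is an illustration under assumptions the slide does not spell out: that the logarithm is base 10, that N50 values are in base pairs, and that each logarithm is truncated to a whole number. The function names and the example lengths are hypothetical.

```python
import math

# Genome level digits as given in the axis key.
LEVELS = {"contig": 1, "scaffold": 2, "chromosome": 3, "complete": 4}


def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must not be empty")


def quality_code(level, contig_lengths, scaffold_lengths):
    """Build a 'level.contig.scaffold' code (assumed: log10, truncated)."""
    contig_digit = int(math.log10(n50(contig_lengths)))
    scaffold_digit = int(math.log10(n50(scaffold_lengths)))
    return f"{LEVELS[level]}.{contig_digit}.{scaffold_digit}"


# Hypothetical assembly: contig N50 of 120 kb, scaffold N50 of 30 Mb.
code = quality_code(
    "chromosome",
    contig_lengths=[150_000, 120_000, 80_000],
    scaffold_lengths=[30_000_000, 20_000_000],
)
print(code)
```

With these made-up lengths the result is 3.5.7, one of the codes on the chart. If the original analysis rounded rather than truncated, or used a different length unit, the digits would shift accordingly.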

Page 15

[Figure: Three views of Solemya velum genomic DNA (lanes marked *): 0.7X agarose gel; PFGE, 1X agarose gel (200 ng); Femto Pulse (min 10 fg) with Femto Pulse 165 kb ladder. Annotation: "Pipette shearing?"]

Page 16

Conclusions and Thanks!

  • Big Data: OK!
  • Biodiversity Discovery: OK!
  • Preserving the genome = Σ branch lengths

Gap Analysis (known unknowns):
  – Especially useful to set priorities
  – Quantitative metrics possible
  – Environmental management, conservation
  – Build community, bridges to biodiversity genomics