Metagenomics and the microbiome

What is metagenomics?

Looking at microorganisms via genomic sequencing rather than culturing

Environmental use case: ag, biofuels, pollution monitoring

Health use case: The human microbiome

Why care about microbiome?

You = 10^13 of your cells + 10^14 bacterial cells

More actionable genomics

Source: http://www.med-health.net/Best-Time-To-Take-Probiotics.html
http://www.mayo.edu/research/labs/gut-microbiome/projects/fecal-microbiota-transplant-c-diff-colitis

Why care about microbiome?

Diagnostic or modulatory implications in:

Obesity, Diabetes, Fatigue, Pain disorders

Anxiety, Depression, Autism

Antibiotic resistant bacteria

IBD and other gut disorders

Cardiac function, cancer

Diseases and the microbiome

Source: The human microbiome: at the interface of health and disease. Nature Reviews Genetics

Why care about microbiome?

Publications containing ‘microbiome’ by date on ScienceDirect

Goal 1: Composition

Source: The human microbiome: at the interface of health and disease. Nature Reviews Genetics

http://huttenhower.sph.harvard.edu/metaphlan

Diversity measures

Alpha diversity: how diverse is this population? Simpson’s index, Shannon’s index, etc

Difference in alpha diversity before and after antibiotics

Beta diversity: taxonomic similarity between two samples

Finding compositional associations between disease cohort and microbial makeup
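
A minimal sketch of the diversity measures named above, assuming per-taxon read counts are already available for each sample. The counts are made up for illustration, Simpson's index is taken in its 1 - sum(p^2) form, and Bray-Curtis dissimilarity stands in as one common choice of beta diversity measure; none of these specifics come from the slides.

```python
import math


def shannon_index(counts):
    """Shannon's index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Simpson's diversity in the 1 - sum(p_i^2) form (higher = more diverse)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples listed over the same taxa.

    0 means identical composition, 1 means no shared taxa.
    """
    shared = sum(min(a, b) for a, b in zip(counts_a, counts_b))
    return 1.0 - 2.0 * shared / (sum(counts_a) + sum(counts_b))


# Hypothetical read counts for four taxa, before and after antibiotics.
before = [40, 30, 20, 10]
after = [85, 10, 5, 0]

print("alpha (Shannon) before:", shannon_index(before))
print("alpha (Shannon) after: ", shannon_index(after))
print("beta (Bray-Curtis):    ", bray_curtis(before, after))
```

In this toy data the post-antibiotic sample is dominated by one taxon, so both alpha diversity indices drop, which is the kind of before/after difference the slide refers to.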

Sequencing for diversity

Pyrosequencing the 16S ribosomal RNA gene

< 10 taxa appear in > 95% of people in HMP

Recall the implicated diseases. The pattern resembles GWAS: common disease with small effect sizes, plus common disease with rare variants

Goal 2: Functional profiling

Source: The human microbiome: at the interface of health and disease. Nature Reviews Genetics

Functional profiling

Current: Which genes are present and are being transcribed

In development: proteomics, metabolomics

Sequencing for function

Whole microbiome sequencing

Avoids primer biases and is more kingdom agnostic

Assembly is hard, especially where reference genomes don’t exist

Two big problems

Can’t understand the body without understanding the microbiome

Can’t understand the microbiome by only looking at bacteria

Read fragment assembly is very very hard in metagenomics

Kingdom-Agnostic Metagenomics

The players in your body

Your cells

Metabolites

Bacteria

Bacteriophages

Other viruses

Fungi

That’s not complexity

Source: A comprehensive map of the Toll-like receptor signaling network. Molecular Systems Biology

Prokaryotic virome: bacteriophages

Infect bacteria (prokaryotes)

Transfer genetic material among bacteria

Rapidly evolving

Put constant selection pressure on bacterial microbiome

Bacteriophages: deep sequencing results

60% of sequences dissimilar from all sequence databases

More than 80% come from 3 families

Little intrapersonal variation

Large interpersonal variation, even among relatives

Diet affects community structure

Antibiotic resistance genes found in viral material

Bacteriophages and function

Cross the intestinal barrier possibly affecting systemic immune response

Adhere to mucin glycoproteins potentially causing immune response in gut epithelium

IBD/Crohn’s: relative increase in Caudovirales bacteriophages

Affect bacterial composition and/or host directly

Eukaryotic virome

Fecal samples from healthy children show a complex community of typically pathogenic viruses

Includes plant RNA viruses from food

Anelloviruses and circoviruses present in nearly 100% by age 5, likely from industrial ag

Eukaryotic viruses and function

Simian immunodeficiency virus (SIV) experiment showed enteric virome expansion

Increased gut permeability and caused intestinal lining inflammation

Acute diarrhea subjects showed novel viruses and highly divergent viruses with less than 35% similarity to catalogued viruses at amino acid level

Meiofauna

Fungi, protozoa, and helminths (worms)

No experiments conducted with sampling to saturation, much more work to be done

18S sequencing showed 66 genera of fungi in gut and fungi were found in 100% of samples

Most subjects had less than 10 genera

But high fungal diversity is bad: increases in IBD, increases with antibiotic usage

But it’s very hard

Amplicon-based approaches don’t work well for viruses, which have no universal marker gene comparable to 16S

Heterogeneous sample-prep is required

Large differences in genome sizes, from a few kb in viruses to 100+ Mb in fungi

Small genomes + divergence require lots of coverage to get contigs

Getting the whole picture

Source: Meta'omic Analytic Techniques for Studying the Intestinal Microbiome. Gastroenterology.

The assembly problem

Isn’t assembly easy?

Recall: 500-1000 species of bacteria in the gut, but about 30 of them make up 99% of composition

33% of bacterial microbiome not well-represented in reference databases, > 60% for bacteriophages

Coverage

Coverage: mean number of reads per base, C = N × L / G

L = read length, N = number of reads, G = genome size

Problem: with 2nd-gen WMS technologies, L is small and G is astronomical or unknown

Thus, “full or sometimes even adequate coverage may be unattainable”

Source: A primer on metagenomics
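
A short worked example of the coverage definition, assuming the standard relation coverage = N × L / G with the symbols defined above. The read counts, read length, and community size are hypothetical numbers chosen only to show how quickly coverage collapses when G is a whole community rather than one genome.

```python
import math


def coverage(read_length, num_reads, genome_size):
    """Mean number of reads covering each base: N * L / G."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Number of reads required to reach a target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical run: 10 million reads of 100 bp each.
L, N = 100, 10_000_000

single_genome = 5_000_000           # one 5 Mb bacterial genome
community = 500 * single_genome     # 500 such genomes at equal abundance

print("single genome coverage:", coverage(L, N, single_genome))
print("community coverage:    ", coverage(L, N, community))
print("reads for 20x community:", reads_needed(20, L, community))
```

Real communities are far from equal abundance, so rare members get even less coverage than this average suggests.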

Sequence length and discovery

Source: A primer on metagenomics

All is not lost

Can use rarefaction curves to estimate our coverage
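
A minimal sketch of a rarefaction curve, assuming each read has already been assigned a taxon label. Reads are subsampled at increasing depths and the number of distinct taxa seen is averaged over repeated draws; a curve that has flattened suggests sampling is close to saturation. The sample composition, depths, and function name are hypothetical.

```python
import random


def rarefaction_curve(read_labels, depths, trials=20, seed=0):
    """Mean number of distinct taxa observed when subsampling reads.

    read_labels: list with one taxon label per read.
    depths: subsample sizes, each no larger than len(read_labels).
    """
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        observed = [
            len(set(rng.sample(read_labels, depth))) for _ in range(trials)
        ]
        curve.append(sum(observed) / trials)
    return curve


# Hypothetical sample: five taxa with very uneven abundance.
reads = ["A"] * 500 + ["B"] * 300 + ["C"] * 150 + ["D"] * 45 + ["E"] * 5
depths = [10, 50, 100, 500, 1000]

for depth, richness in zip(depths, rarefaction_curve(reads, depths)):
    print(depth, richness)
```

The rare taxon "E" is usually missed at shallow depths, which is why the curve keeps rising until most of the reads are sampled.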

All is not lost

For composition analysis the phylogenetic marker regions (18S, 16S) work pretty well

For functional analysis: ORFs can still be found fairly reliably and aligned to homologs in databases

Barring this, clustering and motif-finding yield some information
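
To illustrate why ORFs remain findable even without an assembly, here is a deliberately simple ORF scan. It assumes the standard start codon ATG and stop codons TAA/TAG/TGA, looks only at the three forward-strand frames, and ignores partial ORFs that run off the end of a read, so it is a sketch of the idea rather than a real gene caller. The function name and example sequence are made up.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    end is exclusive and includes the stop codon; min_codons counts the
    codons from ATG up to (not including) the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Hypothetical read containing one short ORF.
read = "CCATGAAACCCGGGTAACC"
for start, end, frame in find_orfs(read):
    print(frame, read[start:end])
```

The extracted ORFs would then be translated and searched against protein databases to find homologs.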

Different sequencing approaches?

Single-cell microfluidics in the future

Now: hybrid long/short read approaches. “finishing” with Sanger sequencing

Pacific Biosciences SMRT approach

SMRT errors are random, unbiased

De novo assembly is 99.999% concordant with reference genomes

HGAP: the SMRT assembly algorithm

1) Select longest reads as seeds

2) Use seed reads to recruit short reads

3) Assemble using off the shelf assembly tools

4) Refine assembly using sequencer metadata

Source: Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nature Methods

Seed selection

Order reads according to length

Consider reads above a length cutoff L of ~6 kb

Rough end-pair align reads until ~20x coverage is reached

17.7k seed reads, averaging 7.2kb in length, already at 86.9% accuracy compared to reference
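
One way to read the seed selection step is as picking the longest reads until they alone would cover the genome about 20 times. The sketch below implements that reading; it is an interpretation of the slide, not HGAP's actual code, and the function name, read lengths, and genome size are hypothetical.

```python
def select_seeds(read_lengths, genome_size, target_coverage=20):
    """Pick the longest reads until their total bases reach the target.

    Returns (seed_lengths, length_cutoff), where length_cutoff is the
    length of the shortest read that was kept as a seed.
    """
    needed = target_coverage * genome_size
    seeds, total = [], 0
    for length in sorted(read_lengths, reverse=True):
        if total >= needed:
            break
        seeds.append(length)
        total += length
    cutoff = seeds[-1] if seeds else 0
    return seeds, cutoff


# Hypothetical toy data: a 1 kb "genome" and six reads.
lengths = [2000, 9000, 1000, 7000, 3000, 8000]
seeds, cutoff = select_seeds(lengths, genome_size=1000)
print("seeds:", seeds, "cutoff:", cutoff)
```

With real data the resulting cutoff would play the role of the ~6 kb threshold mentioned above.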

Recruiting short reads

Align all reads to the seed reads

Each read can be mapped to multiple seed reads, controlled by the -bestn parameter

-bestn must be chosen so that the coverage of seeds + short aligned reads is about equal to the expected coverage of the sequenced genome

Use MSA and consensus to error correct long reads

Result is 17.2k reads of length 5.7kb with 99.9% accuracy
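
The error-correction idea can be sketched as a per-column majority vote over reads that have already been aligned to a seed. This assumes the multiple sequence alignment is given as equal-length rows with "-" for gaps; the alignment step itself, and whatever consensus method the real pipeline uses, are not shown. The rows below are invented for illustration.

```python
from collections import Counter


def column_consensus(aligned_rows):
    """Majority-vote consensus over equal-length aligned rows ('-' = gap)."""
    if len({len(row) for row in aligned_rows}) > 1:
        raise ValueError("all aligned rows must have the same length")
    consensus = []
    for column in zip(*aligned_rows):
        base, _ = Counter(column).most_common(1)[0]
        if base != "-":  # a majority gap means the column is dropped
            consensus.append(base)
    return "".join(consensus)


# Hypothetical alignment: the seed (first row) carries a spurious extra T,
# and one recruited read carries a substitution (C instead of G).
rows = [
    "ACGTTAGC",
    "ACGT-AGC",
    "ACCT-AGC",
    "ACGT-AGC",
]
print(column_consensus(rows))
```

Because the errors are random rather than systematic, independent reads rarely agree on the same mistake, so the vote recovers the underlying sequence.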

Overlap layout consensus assembly

Source: Overview of Genome Assembly Algorithms. Ntino Krampis. http://www.slideshare.net/agbiotec/overview-of-genome-assembly-algorithms
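
A toy illustration of the overlap and layout ideas, assuming error-free reads from a single strand. Overlaps are exact suffix-prefix matches, and a greedy "merge the best-overlapping pair" loop stands in for a real layout step; no consensus step is needed because the toy reads contain no errors. Production assemblers build an overlap graph and tolerate mismatches, which this sketch does not attempt.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical reads drawn from the sequence ATGGCGTACGGAT.
contigs = greedy_assemble(["ATGGCGT", "GCGTACG", "TACGGAT"])
print(contigs)
```

Greedy merging can misassemble repeats, which is one reason long reads that span repeats make assembly so much easier.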

Refinement

Use Quiver algorithm which looks at raw physical data from sequencer

Uses an HMM and the observed data to classify base calls as genuine or spurious

Do a final consensus alignment, conditioned on Quiver’s probabilities

Final result: 17.2k reads, length of 5.7kb, accuracy of 99.999506%

Summary

Most of the cells in your body aren’t yours

But looking at bacteria alone is insufficient

Expanding our view means looking for needles in haystacks, which is beyond most conventional approaches

Motif-finding and hybrid approaches will work until 3rd gen sequencing arrives

References

Cho, Ilseung, and Martin J. Blaser. "The human microbiome: at the interface of health and disease." Nature Reviews Genetics 13.4 (2012): 260-270.

Wooley, John C., Adam Godzik, and Iddo Friedberg. "A primer on metagenomics." PLoS Computational Biology 6.2 (2010): e1000667.

Chin, Chen-Shan, et al. "Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data." Nature Methods 10.6 (2013): 563-569.

Human Microbiome Project Consortium. "Structure, function and diversity of the healthy human microbiome." Nature 486.7402 (2012): 207-214.

Norman, Jason M., Scott A. Handley, and Herbert W. Virgin. "Kingdom-agnostic metagenomics and the importance of complete characterization of enteric microbial communities." Gastroenterology 146.6 (2014): 1459-1469.

Morgan, X. C., and C. Huttenhower. "Meta'omic Analytic Techniques for Studying the Intestinal Microbiome." Gastroenterology (2014).