SNPs and Human Diseases XV
November 14th, 2018
Microbiome
Robert Kraaij, PhD
Erasmus MC, Internal Medicine
Metagenomics - terminology
the study of metagenomes
genetic material recovered from environmental samples
ecological community of microorganisms
symbiosis
commensal
mutual
parasitic
OPTION 1:
microbiota community of microorganisms
microbiome genomes of the microbiota
OPTION 2:
microbiota collection of microorganisms
metagenome genomes of the microbiota
microbiome community of microorganisms and host
Microbiota: more than just bacteria…
Archaea
Bacteria
Protozoa
Viruses
human viruses
bacteriophages
Fungi
molds
yeasts
The human gut microbiota
- the forgotten organ
10^13 bacterial cells = 10^13 body cells
~10^6 bacterial genes vs ~20,000 human genes
many unique functions
involved in health and disease!
Density of microbiota increases along GI tract
stool (10^11 cells/ml)
Walter and Ley (2011) Annu Rev Microbiol.
Stool as ‘proxy’ of gut (distal colon) microbiota
collection
storage
profiling
metadata - Bristol stool scale (type 1 to type 7)
- Rotterdam Study RS-IV
- n = 836
Human microbiota: more than just the gut…
urine stool
nose tooth
eye
skin
PHENOTYPE
GENOTYPE
ENVIRONMENT
DIET
LIFE-STYLE
MICROBIOME
Microbiota
Gut microbiome and disease associations
obesity
Crohn’s disease
ulcerative colitis
eczema
asthma
diabetes
depression
etc
hype cycle
Overview
Microbiota profiling
Data analysis
Microbiota profiling
WHO ARE THEY?
WHAT DO THEY DO?
culture-based techniques
culturomics
16S rRNA marker gene
arrays (hitChip)
ISpro
sequencing
microbiome array
shotgun sequencing (metagenomics)
Microbiota profiling
IS-proTM profiling
16S-23S interspace (IS region)
taxonomy based on size differences
prokaryotic
rRNA operon 16S 23S 5S
Bacteroidetes
Firmicutes, Actinobacteria,
Fusobacteria, Verrucomicrobia
Proteobacteria
IS region
FAFV
NCBI database - 16S – 23S rRNA
- 8990 entries
Budding et al. (2010) FASEB J.
[Figure: IS-pro profile; x-axis: fragment size (nt), y-axis: abundance]
16S ribosomal RNA gene amplicon
highly conserved in bacteria and archaea
species-independent PCR amplification
variable regions
taxonomic classification
16S rRNA
16S rRNA amplicons
prokaryotic
rRNA operon 16S 23S 5S
1500bp Oxford Nanopore long read sequencing
IS region
~400bp Illumina MiSeq short read sequencing
16S rRNA analysis pipeline
DNA isolation
(NorDiag Arrow)
16S rRNA
amplicon
Analysis
(QIIME)
Sequencing
(Illumina MiSeq)
16S rRNA amplicon and sequencing
Fadrosh et al. (2014)
Illumina MiSeq
QIIME-based analysis pipeline
SILVA database, version 128 (September 2016), 8,430,487 entries
Max Planck Institute for Marine Microbiology and Jacobs University, Bremen, Germany
Read-pair merging
Q-score > 19
Chimera filtering
Sample QC
Reads > mean – 2 SD
OTU calling
Taxonomy
Phylogeny
OTU table (anonymous)
BIOM table (taxonomy)
Phylogenetic tree
OTU abundance filtering
> 0.005% of total reads
Caporaso et al. (2010)
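The three numeric thresholds in this pipeline (Q-score > 19, reads > mean - 2 SD, OTU abundance > 0.005% of total reads) can be illustrated with a minimal sketch. This is not the QIIME implementation: the function names, the sample and OTU counts, and the use of the sample standard deviation are all assumptions made for illustration.

```python
from statistics import mean, stdev

def phred_error_probability(q):
    """Probability that a base call with Phred quality score q is wrong."""
    return 10 ** (-q / 10)

def passing_samples(read_counts):
    """Sample QC: keep samples with more reads than mean - 2 SD.
    Assumption: the sample standard deviation is meant."""
    counts = list(read_counts.values())
    threshold = mean(counts) - 2 * stdev(counts)
    return {s: n for s, n in read_counts.items() if n > threshold}

def filter_rare_otus(otu_totals, min_fraction=0.005 / 100):
    """Abundance filter: keep OTUs holding more than 0.005% of all reads."""
    total = sum(otu_totals.values())
    return {o: n for o, n in otu_totals.items() if n / total > min_fraction}

# Hypothetical read counts per sample; s8 is a failed library.
read_counts = {"s1": 5000, "s2": 5200, "s3": 4800, "s4": 5100,
               "s5": 4900, "s6": 5050, "s7": 4950, "s8": 200}
# Hypothetical read totals per OTU across all samples.
otu_totals = {"OTU_a": 60000, "OTU_b": 39997, "OTU_c": 3}

kept_samples = passing_samples(read_counts)   # s8 falls below the threshold
kept_otus = filter_rare_otus(otu_totals)      # OTU_c holds 0.003% of reads
```

A Q-score of 20 corresponds to an expected error rate of 1 in 100 base calls, which is why "Q-score > 19" is a common cut-off.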
Read-pair merging
Chimera filtering
- chimeras are PCR artifacts
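[Figure: chimera detection. The query read is split into 4 chunks and each chunk is searched against the reference database (Ref DB). For a normal read all hits point to one parent (A); for a chimera the hits point to two parents (A and B).]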
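The chunk-based idea can be sketched in a few lines. This is a toy illustration, not the algorithm of any specific chimera checker: it assumes the query and the references are already aligned and of equal length, and the sequences and function names are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def chunk_parents(query, references, n_chunks=4):
    """Best-matching reference name for each chunk of the query."""
    size = len(query) // n_chunks
    parents = []
    for i in range(n_chunks):
        start = i * size
        end = len(query) if i == n_chunks - 1 else start + size
        chunk = query[start:end]
        # Compare the chunk with the same region of every reference.
        best = max(references,
                   key=lambda name: identity(chunk, references[name][start:end]))
        parents.append(best)
    return parents

def looks_chimeric(query, references, n_chunks=4):
    """Flag the query if its chunks point to more than one parent."""
    return len(set(chunk_parents(query, references, n_chunks))) > 1

# Hypothetical 16-nt reference sequences.
references = {"A": "ACGTACGTACGTACGT", "B": "TTGCATGCAAGCTTGA"}
# Artificial chimera: first half from A, second half from B.
chimera = references["A"][:8] + references["B"][8:]

normal_parents = chunk_parents(references["A"], references)
chimera_parents = chunk_parents(chimera, references)
```

Real tools work on unaligned reads and score candidate parents statistically; the sketch only shows why splitting the query exposes reads stitched together from two templates.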
OTU clustering
Operational taxonomic units (OTUs)
clustering on basis of homology of the reads (97%)
OTUs can be aligned to reference databases
unknown OTUs can still be used in analyses
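Clustering at 97% can be sketched as a greedy centroid procedure. The sketch assumes equal-length, pre-aligned reads and a simple position-by-position identity; production OTU pickers use proper alignment and abundance-sorted input, and the reads below are made up.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cluster_otus(reads, threshold=0.97):
    """Greedy clustering: a read joins the first centroid it matches at
    >= threshold; otherwise it becomes the centroid of a new OTU."""
    centroids = []
    otus = []
    for read in reads:
        for idx, centroid in enumerate(centroids):
            if identity(read, centroid) >= threshold:
                otus[idx].append(read)
                break
        else:
            centroids.append(read)
            otus.append([read])
    return otus

# Hypothetical 100-nt reads.
base = "ACGT" * 25
near = "TT" + base[2:]        # 2 mismatches -> 98% identity with base
far = "T" * 10 + base[10:]    # 8 mismatches -> 92% identity with base

otus = cluster_otus([base, near, far])
otu_sizes = [len(members) for members in otus]
```

Here `near` joins the OTU founded by `base`, while `far` founds its own OTU. Neither OTU needs a name in a reference database to be counted, which is why unknown OTUs remain usable.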
Closed reference calling
Each read is compared directly to the database
Database determines phylogenetic tree
Standardized taxonomy > allows for collaboration
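In contrast to clustering reads against each other, closed-reference calling can be sketched as a direct lookup. The 97% cut-off, the discarding of reads without a hit, the database entries, and the function names are assumptions for illustration; sequences are again taken as pre-aligned and of equal length.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def closed_reference_assign(reads, database, threshold=0.97):
    """Count reads per database entry. Reads without a hit at or above
    the threshold are not assigned (assumed behaviour)."""
    counts = {}
    unassigned = 0
    for read in reads:
        best = max(database, key=lambda name: identity(read, database[name]))
        if identity(read, database[best]) >= threshold:
            counts[best] = counts.get(best, 0) + 1
        else:
            unassigned += 1
    return counts, unassigned

# Hypothetical reference database with two 100-nt entries.
database = {"ref_1": "ACGT" * 25, "ref_2": "GGCA" * 25}
reads = [
    database["ref_1"],             # exact match to ref_1
    "T" + database["ref_1"][1:],   # 1 mismatch, 99% identity to ref_1
    database["ref_2"],             # exact match to ref_2
    "T" * 100,                     # no close reference
]

counts, unassigned = closed_reference_assign(reads, database)
```

Because every study that uses the same database ends up with the same set of OTU identifiers, tables from different cohorts can be combined directly.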
Affymetrix Axiom Microbiome array
Shotgun metagenomics
Flaws of 16S rRNA profiling
selection introduced by PCR amplification
no eukaryotic species such as fungi
phylotyping will not give insights into
the gene functions of unknown species
Shotgun metagenomics
Direct sequencing of DNA
High output sequencing
2 x 100 bp
reads are too short for proper annotation
de novo assembly is preferred
need for compute power
2 x 100 bp
paired-reads de novo assembly ~1 kbp contigs
Metagenomics technology push
MetaHIT
European FP7 project
Human Microbiome Project (HMP)
NIH-sponsored project
Profiling of shotgun data
phylotyping databases
metagenomic species (MGS)
~7000 MGS specified
gene catalogue
8.1 million genes from 760 samples
functional databases
Phylotyping of shotgun data
Arumugam et al., 2011 MetaHIT
Functional analysis of shotgun data
Arumugam et al., 2011 MetaHIT
Taxonomic vs functional profiling
large taxonomic differences are not reflected in functional profiles
The Human Microbiome Project Consortium (2012)
Samples ordered by taxonomic profiles
Samples ordered by functional profiles
Profiling the gut microbiome
WHO ARE THEY?
16S TAXONOMY
METAGENOMICS
WHAT CAN THEY DO?
METAGENOMICS
WHAT ARE THEY DOING?
METATRANSCRIPTOMICS
METAPROTEOMICS
WHAT HAVE THEY DONE?
METABOLOMICS
Data analysis
Complex multi-dimensional data
no normal or mean profile
enterotypes?
sparse data
many zero abundances
limited by technique
count data
dependent on technique
how to normalize?
compositional data
relative abundances add up to 1
Diversities
α-diversity
diversity within a sample
biological metric
number of species * evenness
β-diversity
diversity (distance or dissimilarity) between samples
UniFrac distances
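Both kinds of diversity can be computed from plain count vectors. The sketch below uses the Shannon index for α-diversity and shows how it factors into richness and evenness. For β-diversity it uses Bray-Curtis dissimilarity as a stand-in, because UniFrac additionally needs a phylogenetic tree. The count vectors are hypothetical.

```python
from math import log

def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Evenness J = H / ln(S), with S the number of observed taxa,
    so that H = ln(S) * J."""
    richness = sum(1 for c in counts if c > 0)
    return shannon(counts) / log(richness) if richness > 1 else 0.0

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples (0 = identical,
    1 = no shared taxa). Used here instead of UniFrac."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))

even_sample = [10, 10, 10, 10]     # four taxa, perfectly even
skewed_sample = [37, 1, 1, 1]      # same richness, one dominant taxon

h_even = shannon(even_sample)
h_skewed = shannon(skewed_sample)
distance = bray_curtis(even_sample, skewed_sample)
```

Both samples contain four taxa, yet the skewed sample has a lower Shannon index: richness is equal and only evenness differs.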
OTU table
OTU id    sample_01  sample_02  sample_03  sample_04  …
OTU_12            3          0        456        343
OTU_318          34         45          3          2
OTU_37          567       2134        478        675
…                 …          …          …          …  …
Total         5,975      4,952      6,735      5,374
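Because the total number of reads differs per sample, raw counts are usually converted to relative abundances before samples are compared. A minimal sketch using the three OTU rows and the sample totals shown in the table (variable and function names are illustrative):

```python
# Counts and per-sample totals copied from the OTU table.
otu_counts = {
    "OTU_12": [3, 0, 456, 343],
    "OTU_318": [34, 45, 3, 2],
    "OTU_37": [567, 2134, 478, 675],
}
sample_totals = [5975, 4952, 6735, 5374]

def relative_abundance(counts, totals):
    """Divide each count by the total read count of its sample."""
    return {otu: [c / t for c, t in zip(row, totals)]
            for otu, row in counts.items()}

rel = relative_abundance(otu_counts, sample_totals)

# Fraction of zero entries among the rows shown (sparsity).
values = [c for row in otu_counts.values() for c in row]
zero_fraction = sum(1 for c in values if c == 0) / len(values)
```

The three rows shown do not add up to 1 per sample because the remaining OTUs are elided. Over the full table each column of relative abundances sums to 1, which is what makes the data compositional.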
Rotterdam 16S rRNA datasets
Domain Phylum Class Order Family Genus OTUs
(2) (11) (18) (24) (43) (183) (777)
Domain Phylum Class Order Family Genus OTUs
(1) (7) (15) (19) (36) (152) (661)
[Figure panels: Shannon Diversity Index]
5 major phyla
N=2,111
N=156
N=1,427
N=1,135
N=1,106
Generation R Study
9-11 year-olds
Rotterdam Study
adults
Radjabzadeh et al. (2018) in preparation
Children vs adults - Generation R Study vs Rotterdam Study
GenR (N=2,111) vs RS (N=1,427)
[Figure: Shannon diversity index (y-axis, scale 2 to 8) for GenR and RS; difference marked ***]
average phylum-level profiles
Radjabzadeh et al. (2018) in preparation
Shannon alpha diversity
MiBioGen consortium
Meta-analyses of gut microbiome GWAS
> 20 cohorts (more still being included)
> 20,000 samples
16S rRNA profiling (Illumina)
226 genera
8M HRC1.1 imputed SNPs
NGRC
Traits
Shannon alpha-diversity
Binary trait (presence/absence)
Quantitative trait (abundance)
Beta-diversity
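The binary and quantitative traits can both be derived from the same count vector of one taxon. The sketch below is an assumption about how such traits are typically encoded, not the consortium's protocol: the natural-log transform and the exclusion of non-carriers from the quantitative trait are illustrative choices, and the counts are hypothetical.

```python
from math import log

def binary_trait(counts):
    """Presence/absence: 1 if the taxon was observed in the sample, else 0."""
    return [1 if c > 0 else 0 for c in counts]

def quantitative_trait(counts):
    """Log abundance among carriers; None where the taxon is absent
    (assumed handling of zeros)."""
    return [log(c) if c > 0 else None for c in counts]

# Hypothetical counts of one genus in four samples.
genus_counts = [3, 0, 456, 343]

presence = binary_trait(genus_counts)
abundance = quantitative_trait(genus_counts)
```

Splitting the analysis this way is one response to the many zero abundances: the zeros carry the presence/absence signal, while the non-zero values carry the abundance signal.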
MiBioGen consortium
# | Cohort name | Population | 16S domain | Genotyping method | N | Description
1 | LLD | Netherlands (Caucasian) | V4 | Illumina Immunochip, Cytochip | 1089 | Representative of population
2 | NGRC | Netherlands (Caucasian) | V1-V2 | PsychChip (Broad Institute, Boston, USA) | 153 | Healthy group + ADHD group
3 | RS | Netherlands (Caucasian) | V3-V4 | Illumina 550k | 1427 | Representative of population
4 | GENR | Netherlands (multi-ethnic) | V3-V4 | Illumina 610k | 2111 | Representative of population
5 | NTR | Netherlands (Caucasian) | V4 | Affymetrix 6.0 | 499 | Twins
6 | MIBS_Co | Netherlands (Caucasian) | V4 | Illumina OmniExpressExome | 111 | Healthy volunteers
7 | FGFP | Belgium (Caucasian) | V4 | Illumina OmniExpress | 2482 | Representative of population
8 | SHIP | Germany (Caucasian) | V1-V2 | Affymetrix 6.0, Illumina OmniExpressExome, Exomechip | 1904 | Representative of population
9 | SHIP-TREND | Germany (Caucasian) | V1-V2 | Affymetrix 6.0, Illumina OmniExpressExome, Exomechip | (-) | Representative of population
10 | FOCUS | Germany (Caucasian) | V1-V2 | Illumina Immunochip, Exome | 1555 | Representative of population
11 | BSPSPC | Germany (Caucasian) | V1-V2 | Illumina 550K, Immunochip, Metabochip, Affymetrix 6.0, Axiom | 912 | Representative of population
12 | TwinsUK | UK (Caucasian) | V4 | HumanHap300, Hap610Q, 1M-Duo, 1.2M-Duo | 1793 | Twins
13 | CHRIS | Italy (Caucasian) | ? | ? | ? | ?
14 | COPSAC | Denmark (Caucasian) | V4 | Illumina OmniExpress | 424 | Representative of population
15 | POPCOL | Sweden (Caucasian) | V1-V2 | Illumina MiSeq | 250 | Representative of population
16 | METSIM | Finland (Caucasian) | V4 | Illumina OmniExpressExome | 531 | Representative of population
17 | PNP | Israel (Israeli) | V3-V4 | Metabolochip | 1066 | Healthy volunteers
18 | GEM_HCE_v12 | Canada, USA, Israel (Caucasian, Israeli) | V4 | Illumina HumanCoreExome, Immunochip | (-) | Healthy individuals
19 | GEM_HCE_v24 | Canada, USA, Israel (Caucasian, Israeli) | V4 | Illumina HumanCoreExome, Immunochip | (-) | Healthy individuals
20 | GEM_ICHIP_HCE | Canada, USA, Israel (Caucasian, Israeli) | V4 | Illumina HumanCoreExome, Immunochip | 1543 | Healthy individuals
21 | CARDIA | USA (Caucasian and African-American) | V3-V4 | Illumina Exome, Affymetrix 6.0 | 282 | Representative of population
22 | HCHS/SOL | USA (Hispanics/Latinos) | V4 | Illumina Omni Chip + custom array | 1778 | Representative of population
23 | KSCS | Korea (Asian) | V3-V4 | Illumina HumanCore BeadChips 12v | 833 | Representative of population
23 cohorts, >21,000 samples
MiBioGen consortium
55 bacterial taxa (1,232 SNPs)
GWAS quantitative trait
226 genera
8M SNPs
Meta-analysis
MiBioGen consortium (2018) unpublished
LCT locus
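The slides do not say which meta-analysis model was used. As a generic illustration of how per-cohort GWAS results for one SNP-taxon pair can be combined, here is a sketch of a fixed-effect, inverse-variance weighted meta-analysis; the effect sizes and standard errors are hypothetical.

```python
from math import sqrt

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.
    Returns the pooled effect, its standard error, and the z-score."""
    weights = [1 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = sqrt(1 / sum(weights))
    return beta, se, beta / se

# Hypothetical per-cohort effect sizes and standard errors for one SNP.
cohort_betas = [0.10, 0.20]
cohort_ses = [0.05, 0.05]

pooled_beta, pooled_se, z_score = fixed_effect_meta(cohort_betas, cohort_ses)
```

Cohorts with smaller standard errors, typically the larger ones, receive more weight, and the pooled standard error is smaller than that of any single cohort.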
Questions…