Metagenomics of the Human Intestinal Tract
http://www.metahit.eu
ISAPP 2010, Barcelona, Spain
S. Dusko Ehrlich
A metagenome can be defined as the ensemble of genes & genomes of the microbes from a given ecological niche.
Metagenomics is a powerful approach to characterize the composition, properties and dynamics of the microbiome via its metagenome.
The human intestinal microbiota is a forgotten organ…
• 100 trillion microorganisms; 10-fold more cells than the human body; 2 kg of mass!
• Interface between food and epithelium
• In contact with the 1st pool of immune cells and the 2nd pool of neural cells of the body
…with a major role in health & disease !
Frailty in seniors: Van Tongeren et al., 2005
Crohn's disease: Seksik et al., 2003; Sokol et al., 2006, 2008, 2009
Ulcerative colitis: Sokol et al., 2008; Martinez et al., 2008
Pouchitis: Lim et al., 2009; Kühbacher et al., 2006
Obesity: Ley et al., 2007; Kalliomäki et al., 2008
Type-2 diabetes: Cani and Delzenne, 2009
Type-1 diabetes: Dessein et al., 2009; Wen et al., 2008
Coeliac disease: Nadal et al., 2007; Collado et al., 2009
Allergy: Kirjavainen et al., 2002; Björkstén, 2009
Autism: Finegold et al., 2002; Paracho et al., 2005
Colorectal cancer: Mai et al., 2007; Scanlan et al., 2008
HIV infection: Gori et al., 2008
Perturbation of intestinal microbiota as a possible disease factor
* Indications from animal models, effects of antibiotics or probiotics, clinical studies; courtesy of Joël Doré
EU: MetaHIT, 20 M €
Ireland: ElderMET, 4 M €
Japan: Human Metagenome Consortium, 5 M $
Sino-French cooperation: Micro-Obese, 4 M €
USA: NIH Human Microbiome Project, 115 M $
The Crohn's & Colitis Foundation of America Gut Microbiome Initiative: Washington U., St. Louis & U. of Denver, Boulder
Worldwide movement to characterise human microbiome
Canada: Microbiome initiative, 14 M $ (August 10, 2010)
MSU wins $7.3M gut microbiome grant (GenomeWeb Daily News, August 20, 2010)
International Human Microbiome Consortium
Full Members
• Canadian Institutes of Health Research (Canada)
• European Commission (Europe)
• Institut National de la Recherche Agronomique (France)
• Japan Science & Technology Agency, JST (Japan)
• National Institutes of Health (United States of America)
• Commonwealth Scientific and Industrial Research Organisation (Australia)
• Medical Research Council (Gambia)
Observing Members
• Genome Canada (Canada)
• Ministry of Education, Culture, Sports, Science and Technology, MEXT (Japan)
• National Health and Medical Research Council (Australia)
• Korea Research Institute of Bioscience and Biotechnology (Republic of Korea)
• ELDERMET (Ireland)
Current co-chairs: G. Weinstock (WashU) & J. Peterson (NIH)
The MetaHIT approach to the relation between microbes & us
Establish a reference gene set by metagenomic & genomic sequencing of the human GI tract microbes
Develop generic tools for profiling the GI tract microbiota genes: arrays and high-throughput DNA sequencing
Use the profiling tools to search for associations between microbial genes and disease in Inflammatory Bowel Diseases and Obesity
Bioinformatics overlays all activities
Carry out function analysis to go from associations to mechanisms
The partners
Most important – the people, a stellar team!
• Thirteen European, one Chinese Institutions
• Eight countries, two continents
• Nine public and four private Institutions
MetaHIT budget: 21.2 M €
EC contribution: 11.4 M €
100 scientists
Starting date: March 2008
Duration: 4 years
Illumina‐based intestinal bacteria reference gene set
Qin et al., 2010
Illumina sequencing
Samples: 124 individuals (85 Danes, 39 Spaniards)
Library type: 200 bp (15 samples); 140 bp and 350 bp (109 samples)
Sequencing type: paired-end (PE) sequencing
Read length: 45 b (15 samples); 75 b (109 samples)
Sequences per sample: 31 million ± 0.5 million
In total, ~0.58 terabase of sequence
Contigs
Genes
The pipeline for the human intestinal microbial gene catalog
BGI, Wang Jun et al.
The contig set
• SOAPdenovo (de Bruijn graph-based tool)
• Removal of short contigs (<500 bp)
• Removal of redundancy
Total size    Number         N50 size    N90 size    Max. length
10.3 Gb       6.6 million    2.2 kb      0.7 kb      237.6 kb
• Assembly error rate low: 14/Mb
• Comparable to 454 (Newbler): 20/Mb
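For readers unfamiliar with the N50 and N90 sizes quoted for the contig set, the following is a minimal sketch of how such statistics are commonly computed from a list of contig lengths. The function name `nx_size` and the toy lengths are illustrative assumptions; this is not the consortium's own code.

```python
def nx_size(contig_lengths, fraction):
    """Return the Nx size: the length L such that contigs of length >= L
    together cover at least `fraction` of the total assembly size."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    return min(contig_lengths)  # not reached for 0 < fraction <= 1


# Toy contig lengths in bp (made up for illustration, not MetaHIT data).
toy_contigs = [500, 700, 900, 1200, 2200, 4500]
n50 = nx_size(toy_contigs, 0.5)
n90 = nx_size(toy_contigs, 0.9)
print(n50, n90)
```

With these toy lengths, half of the 10 kb total is reached within the two longest contigs, so the N50 is the length of the second longest one.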
BGI, Wang Jun et al.
Representation of the human gut microbiome in the contig set
Sequences from three studies were mapped on the contig set
• 124 Europeans (0.58 Tb, Illumina)
• 18 US (1.83 Gb, 454 Roche)
• 13 Japanese (0.79 Gb, Sanger)
The contig set represents well the whole human metagenome
BGI, Wang Jun et al.
The gene set
Metagene prediction on the contigs:
• 14 million Open Reading Frames (ORFs)
Removal of redundancy:
≥ 95 % nucleotide identity over at least 90 % of the length of the shorter ORF
• 3.3 million ORFs
150 times the human gene complement
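The redundancy criterion (≥ 95 % nucleotide identity over at least 90 % of the shorter ORF) can be sketched as a simple filter. The greedy longest-first strategy, the function names and the toy alignments below are assumptions made for illustration; the slide does not say how the pairwise comparisons were organised.

```python
def is_redundant(identity_pct, aligned_len, len_a, len_b,
                 min_identity=95.0, min_coverage=0.90):
    """Apply the slide's rule: identity >= 95 % over >= 90 % of the shorter ORF."""
    shorter = min(len_a, len_b)
    return identity_pct >= min_identity and aligned_len >= min_coverage * shorter


def non_redundant(orf_lengths, alignments):
    """Greedy pass (an assumed strategy): visit ORFs longest-first and keep
    an ORF unless it is redundant with one that was already kept."""
    hits = {}
    for a, b, identity, aligned in alignments:
        hits[frozenset((a, b))] = (identity, aligned)
    kept = []
    for orf in sorted(orf_lengths, key=lambda name: (-orf_lengths[name], name)):
        redundant = False
        for rep in kept:
            hit = hits.get(frozenset((orf, rep)))
            if hit and is_redundant(hit[0], hit[1],
                                    orf_lengths[orf], orf_lengths[rep]):
                redundant = True
                break
        if not redundant:
            kept.append(orf)
    return kept


# Hypothetical ORFs (lengths in bp) and pairwise alignments:
# (ORF 1, ORF 2, percent identity, aligned length in bp).
orf_lengths = {"orfA": 1000, "orfB": 600, "orfC": 900}
alignments = [("orfA", "orfB", 97.0, 560), ("orfA", "orfC", 96.0, 500)]
kept = non_redundant(orf_lengths, alignments)
print(kept)
```

Here `orfB` is dropped because its alignment to `orfA` covers more than 90 % of its 600 bp at 97 % identity, while `orfC` is kept because its alignment covers too little of its length.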
The gene set is almost complete
>85% of prevalent genes of the cohort are included in the reference set (by the incidence-based coverage richness estimator, ICE)
Genes are included if present at a frequency of >7 × 10⁻⁷
Human intestinal microbial genes are largely shared in the cohort
Each individual has ~540 000 prevalent genes, on average
40 % of an individual’s genes are shared with at least 50 % of individuals of the cohort
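The sharing statistic above can be illustrated on a gene presence table. This is a minimal sketch under stated assumptions: the dictionary layout, the function name `shared_fraction` and the toy cohort are hypothetical, and the actual analysis worked from read mapping rather than from ready-made gene sets.

```python
def shared_fraction(presence, individual, min_share=0.5):
    """Fraction of `individual`'s genes that are found in at least
    `min_share` of all individuals in the cohort (the individual included)."""
    n = len(presence)
    own = presence[individual]
    if not own:
        return 0.0
    # Count in how many individuals each gene occurs.
    counts = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    shared = sum(1 for gene in own if counts[gene] >= min_share * n)
    return shared / len(own)


# Toy cohort of four individuals with made-up gene identifiers.
toy_cohort = {
    "ind1": {"g1", "g2", "g3", "g4", "g5"},
    "ind2": {"g1", "g2", "g6"},
    "ind3": {"g1", "g7"},
    "ind4": {"g2", "g8"},
}
fraction_ind1 = shared_fraction(toy_cohort, "ind1")
print(fraction_ind1)
```

In this toy cohort only `g1` and `g2` occur in at least half of the individuals, so two of the five genes of `ind1` count as shared.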
Deeper sequencing reveals more genes
4.5 Gb
The half 'n' half rule!
Bacterial species are also largely shared
• An overwhelming fraction of the 3.3 million genes, 99%, is of bacterial origin
• A bacterial genome encodes 3,364 non-redundant genes, on average
• The cohort carries at least 1000-1150 prevalent bacterial species
• Each individual carries at least 167 prevalent bacterial species
• Prevalent bacterial species must largely be shared in the cohort
• They are likely to be shared in the human population
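A rough order-of-magnitude check of the species figures is to divide gene counts by the average number of genes per bacterial genome. This sketch assumes, for simplicity, that no gene is shared between species. It lands somewhat below the slide's figures of 1000-1150 and 167, which presumably come from a more detailed analysis that is not reproduced here.

```python
# Figures quoted on the slides.
CATALOG_GENES = 3_300_000        # non-redundant genes in the reference set
BACTERIAL_FRACTION = 0.99        # share of genes of bacterial origin
GENES_PER_GENOME = 3_364         # average non-redundant genes per bacterial genome
GENES_PER_INDIVIDUAL = 540_000   # prevalent genes per individual, on average


def genome_equivalents(gene_count, genes_per_genome=GENES_PER_GENOME):
    """Number of average-sized bacterial genomes needed to hold `gene_count`
    genes, assuming (simplification) that species share no genes."""
    return gene_count / genes_per_genome


cohort_equiv = genome_equivalents(CATALOG_GENES * BACTERIAL_FRACTION)
individual_equiv = genome_equivalents(GENES_PER_INDIVIDUAL)
print(round(cohort_equiv), round(individual_equiv))
```

Both values should be read as lower bounds on the species counts under the stated simplification, not as a reconstruction of the published estimates.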
Many sequenced species are shared
Illumina reads were used to identify bacterial species and measure their abundance in different individuals of the cohort
Individuals    Species
All 124        18
>90%           57
>50%           75
Abundance of a bacterial species varies 12-2200 fold between individuals
We are all similar!
Functions encoded by the reference gene set
EMBL, Bork et al.
[Figure panels: Well characterised; Well & poorly characterised; All]
5000 new functions (≥20 proteins)
EMBL, Bork et al.
Minimal metagenome – functions required in the gut ecosystem
Overall view of the minimal genome metabolic pathways (~1200 functions)
The minimal genome functions in sequenced genomes
House-keeping
Gut-specific?
iPath tool, Letunic et al., TIBS, 2008
Bacterial genes/species/communities associated with a disease?
Take-home messages
• 3.3 million prevalent human intestinal bacterial genes were identified in a cohort of 124 individuals, 150 times more than the human gene complement
• The gene catalog includes most of the genes identified in the studies over three continents
• Combinations of species (i.e. bacterial communities!) are associated with disease
After the human genome
the human metagenome!!!
Diagnostic & prognostic tests - soon
- arrays, sequencing, Q-PCR; immunomarkers (?)
Better treatments - next
- personalized medicine: responders/non-responders
Novel treatments - last
- modulation of microbiota: promoters (pro- & prebiotics), inhibitors (AB-like??)
- transplantation of microbiota?
Where do these studies lead to and when?
MetaHIT Consortium
P. Bork, J. Doré, J. Raes, F. Guarner, M. Arumugam, J. Parkhill, O. Pedersen, Wang Jun, J. Weissenbach, Qin Junjie, Li Ruiquiang
INRA, ex GM
N. Pons, J.M. Batto, E. Le Chatelier, M. Almeida, P. Renault
Acknowledgments
Thank you for your attention