63
An introduction to computational methods for studying microbial communities as a complex system اراﺋﻪ دﻫﻨﺪه: ﮐﺎوه ﮐﺎوﺳﯽ ﻋﻀﻮ ﻫﯿﺎت ﻋﻠﻤﯽ ﻣﺮﮐﺰ ﺗﺤﻘﯿﻘﺎت ﺑﯿﻮﺷﯿﻤﯽ ﺑﯿﻮﻓﯿﺰﯾﮏ(IBB) آزﻣﺎﯾﺸﮕﺎه ﺳﯿﺴﺘﻢ ﻫﺎي زﯾﺴﺘﯽ ﭘﯿﭽﯿﺪه و ﺑﯿﻮاﻧﻔﻮرﻣﺎﺗﯿﮏ(CBB) داﻧﺸﮕﺎه ﺗﻬﺮانInstitute of Biochemistry and Biophysics (IBB)

An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

An introduction to computational methodsfor studying

microbial communities as a complex systemکاوه کاوسی: ارائه دهنده

(IBB)عضو هیات علمی مرکز تحقیقات بیوشیمی بیوفیزیک (CBB)و بیوانفورماتیک آزمایشگاه سیستم هاي زیستی پیچیده

دانشگاه تهران

Institute of Biochemistry and Biophysics (IBB)

Page 2: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

• Network Science & Complexity Modeling

• Biological Networks

• Microbiota

• Metagenome & Its Analysis• Controlling Microbial Communities

2

Topics

Page 3: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Network Science: Introduction 2012

Network Science

Page 4: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

[adj., v. kuh m-pleks, kom-pleks; n. kom-pleks]

–adjective

1.

composed of many interconnected parts; compound;composite: a complex highway system.

2.

characterized by a very complicated or involvedarrangement of parts, units, etc.: complex machinery.

3.

so complicated or intricate as to be hard tounderstand or deal with: a complex problem.

Source:Dictionary.com

COMPLEX SYSTEMS

Complexity, a scientific theory whichasserts that some systems displaybehavioral phenomena that are completelyinexplicable by any conventional analysisof the systems’ constituent parts. Thesephenomena, commonly referred to asemergent behaviour, seem to occur inmany complex systems involving livingorganisms, such as a stock market or thehuman brain.

Source: John L. Casti, Encyclopædia Britannica

Network Science: Introduction 2012

Page 5: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

No blueprint

No “master-mind”

Self-organization

Evolution

Adaptation

COMPLEX SYSTEMS

Network Science: Introduction 2012

Page 6: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Stephen HawkingJanuary 23, 2000`

Network Science: Introduction 2012

Page 7: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Network Science: Introduction 2012

Page 8: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

THE ROLE OF NETWORKS

Behind each complexsystem there is a network,that defines the interactionsbetween the component.

2

Page 9: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Keith Shepherd's "Sunday Best”. http://baseballart.com/2010/07/shades-of-greatness-a-story-that-needed-to-be-told/

The “Social Graph” behind Facebook

SOCIETY Factoid:

Network Science: Introduction 2012

Page 10: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

: departments

: consultants

: external experts

www.orgnet.com

STRUCTURE OF AN ORGANIZATION

Network Science: Introduction 2012

Page 11: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Brain

Human Brainhas between10-100 billionneurons.

BRAIN Factoid:

Network Science: Introduction 2012

Page 12: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

The subtle financial networks

Network Science: Introduction 2012

Page 13: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

The not so subtle financial networks: 2011

Network Science: Introduction 2012

Page 14: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

INTERNET

domain2

domain1

domain3

router

Network Science: Introduction 2012

Page 15: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

M. A. Yildirim et al. Nature Biotech. 25, 1119-1126 (2007)

Drug – Target NETWORK

Drug–target network (DT network). The DT network is generated by using the known associations between FDA-approved drugs and their target proteins. Circles and rectangles correspond todrugs and target proteins, respectively. A link is placed between a drug node and a target node if the protein is a known target of that drug. The area of the drug (protein) node is proportional tothe number of targets that the drug has (the number of drugs targeting the protein). Color codes are given in the legend. Drug nodes (circles) are colored according to their AnatomicalTherapeutic Chemical Classification, and the target proteins (rectangular boxes) are colored according to their cellular component obtained from the Gene Ontology database.

5

Page 16: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

THE ROLE OF NETWORKS

Behind each system studied in complexity there is an intricate wiringdiagram, or a network, that defines the interactions between thecomponent.

We will never understandcomplex system unless wemap out and understandthe networks behind them.

Network Science: Introduction 2012

Page 17: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

NETWORK SCIENCE The science of the 21st century

Network Science: Introduction 2012

3The 2nd International Personalized Medicine Congress

Page 18: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

An introduction to BiologicalNetworks

Amitabh Sharma, PhD

Postdoctoral Fellow at CCNR, BarabasiLabNortheastern University & Dana Farber Cancer Institute

Page 19: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Gene Protein Cell structureand function

Tissue structureand function

Organ structureand function

~25,000 genes ~100,000 proteins 210 + cell types 4 tissue types 12 organ systems 1 body

Environment and populations

Organism

OrgansTissues

Cells

Physiome

Signaling pathwaysMetabolic pathways

Cell cycle,contraction,adesion, secretion, sensory,transport..

Transcriptome, Metabolome

Proteins

Proteome

Genome

ATPLigand binding,phosphorylation

Net

wor

k th

eory

Net

wor

k th

eory

Page 20: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

The paradigm of cell as a complexsystem

Insights into the logic of cellular organization can be achieved when we view thecell as a complex network in which the components are connected by functionallinks.

Oltvai & Barabasi, Life’s Complexity Pyramid. Science,2002:298,763-764

Page 21: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Complex Biological system andinteraction

Complexity emerges at all levels of the hierarchy of life

In his Nobel lecture in 1968, Jaques Monod pointed out theimportance of interactions in biological systems: “... anyphenomenon, any event, or for that matter, any ‘knowledge,’ anytransfer of information implies an interaction”

Vidal M, Cusick ME, Barabási AL. Cell. 2011, 144:986-98.

Page 22: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

22

Different Types of Biological Networks

Page 23: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Different type of Biological networkIntra-Cellular Networks

Protein interaction networks Metabolic networks

Gene Regulatory networks

Inter-Cellular NetworksNeural networks

Disease Networks

Protein

Interaction

Enzyme

MetabolitesUp-regulation

Down-regulation

Disease-1Disease-2Disease-3Disease-4

Disease-4Disease-5

Disease-6

None of these networks areindependent, instead they form a

‘network of networks’

Signaling networkActivation

Inhibition

Page 24: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

• Nodes are proteins, metabolites, lipids,second messengers, or peptides

• Interactions designate information flow, canbe activation or inhibition, and are direct andphysical

Gi/o Pathway

Cell Signaling Pathways

Ma'ayan A, et al. Sci Signal. 2:cm1 (2009)

Page 25: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Example of Gene Regulation Networks

25MacArthur et al., PLoS ONE 3: e3086 (2008)

Stem cell differentiation regulation

• Nodes are genes and transcription factors• Interactions can be directional or bidirectional• Interactions can be activation or inhibition

Page 26: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Gene Regulatory NetworkRegulatory Network in Embryonic Stem CellRegulatory Network in developing beta-cells

Nat Immunol. 2010 Jul;11(7):635-43.

Page 27: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Metabolic Networks

27

• Two types of nodes: enzymes and substrates• Reactions can be directional or bidirectional• Bipartite graph, reactions are not connectedand substrates are not connected

Bourqui et al. BMC Systems Biology 1:29 (2007)

Glycolysis

Berg et al. BiochemistryNew York: W. H. Freeman and Co.; c2002

Page 28: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Protein interaction networkA polar map showing the largest connectedcomponent of the human interactome basedon the 1307 proteins and 2441 interactions

J. R. Soc. Interface. 2009 :6 , 881-896

Page 29: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

https://draxe.com/microbiome/29

Microbiota

Page 30: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

30

Functions of Gut Microbiome

Page 31: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

http://learn.genetics.utah.edu/content/microbiome/disease/

31

Importance of Microbiome Studies

Page 32: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Metagenomics is the studyof genomes recovered fromenvironmental or othermicrobiota samples withoutthe need for culturing themand the relationship betweenthem.

Bioinformatics andComputational Biology helpsto process the huge datagathered from microbiome

5/14/2018 Workflow for Metagenomics Project 32

What is Metagenome

Page 33: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

5/14/2018 Workflow for Metagenomics Project 33

Random genomesfragmentation

Genomes assemblyusing overlaps

One genome

Multiple genomes in a single gram of soil, there can be up to 18000

different types of organisms, each with its owngenome.

Random genomefragmentation

Genome assemblyusing overlaps

Whole Gene Shotgun Sequencing for Metagenomics

Page 34: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

How to get real data?

34

Cross Sectional&

Longitudinal

High Throughput DNA Sequencing

How to get real data?

Page 35: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

35

Human Microbiome ProjectNature 2012.DOI:10.1038/nature11234

Different Body Sites, Different Ecosystems

Page 36: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

• Identifying:• Organisms (species, strains, ..)• The abundance of organisms• Genes• Functions

• Analyzing:• Comparative study of metagenome in different conditions

• Treatment• Different phenotypes• Time series• Different food• …

5/14/2018 Workflow for Metagenomics Project 36

Analysis of Metagenome

Page 37: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

• Infectious Disease Diagnosis• Gut Microbe Characterization• Biofuel• Environmental remediation• Biotechnology• Agriculture• Ecology

5/14/2018 Workflow for Metagenomics Project 37

Applications of Metagenomics analysis

Page 38: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

38

A Pipeline for Metagenomic Analysis

Page 39: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

39

A Pipeline for Metagenomic Analysis

First Phase:• Extract high quality data

Page 40: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

5/14/2018 Workflow for Metagenomics Project 40

SRA Toolkit

• The SRA Toolkit, and the source-code SRA SystemDevelopment Kit (SDK), will allow you to programmaticallyaccess data housed within SRA and convert it from the SRAformat to the following formats:

• ABI SOLiD native (colorspace fasta / qual)• fasta• fastq• sff• sam (human-readable bam, aligned or unaligned)• Illumina native

Page 41: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

5/14/2018 Workflow for Metagenomics Project 41

Quality Control and Trimming: FastQC & BBMap

• Control the sequence based on:• Length• Quality

• Trim and/or Remove the low quality or short sequencebased on:

• Length• Quality• Has Adapter or not?• Remove index?• …

Page 42: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

42

A Pipeline for Metagenomic Analysis

Second Phase:• Constructing Gene Catalogue

Page 43: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

5/14/2018 Workflow for Metagenomics Project 43

Assembly

Assembling to larger DNA sequences (contigs and scaffolds)Uses de-Brujin or overlap graphsMight need manual parameter settingBottleneck of the analyzingvery laborious taskrequesting avidly memory processing power resources

various performance issues stemming from it.Approaches:mapping reads to a template genome --> single cell sequencingde novo assembly --> metagenome sequencing

Page 44: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

5/14/2018 Workflow for Metagenomics Project 44

Assembly Tools

• MetaVelvet• MetaSPAdes• SOAPdenovo• MetaSim• Omega• IDBA-UD• …

Page 45: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Assembly

5/14/2018 Workflow for Metagenomics Project 45

Why Not Assembly !!?

• Error in assembling reads to extract contigs / scaffolds• Merge two reads from two different similar taxonomy then

extract false contigs• So we have false genes• So we have false catalogue• So we have false mapping• So we have false network• So we have false results!!!!

• Broadcasting error in pipeline!!!!!!

Page 46: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

5/14/2018 Workflow for Metagenomics Project 46

Binning

• Estimates microbial relative abundances by mappingmetagenomic reads against a catalog of clade-specific markersequences currently spanning the bacterial and archaealphylogenies.

• Clades are groups of genomes (organisms) that can be asspecific as species or as broad as phyla.

• Clade-specific markers are coding sequences (CDSs) that satisfythe conditions of

(i) being strongly conserved within the clade’s genomes and(ii) not possessing substantial local similarity with any sequence outside

the clade.

Page 47: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

5/14/2018 Workflow for Metagenomics Project 47

Binning Tools

• Metaphlan• mOTUs• MEGAN• PaPaRa• MicrobeGPS• Pathoscope• Kraken• PhymmBL• RAIphy• …

Page 48: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

5/14/2018 Workflow for Metagenomics Project 48

Constructing Gene Catalogue

Annotated genes of 390rumen microbiome of

Hungate project

Predicted genes ofMetaPhlAn output

Make a few modification tokeep the right format

Construct a non-redundantversion of genes

Linux terminalcommands

CD-HIT

Page 49: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

49

A Pipeline for Metagenomic Analysis

Third Phase:• Constructing co-abundance network

Page 50: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

13th Conference on Biophysical Chemistry -Ardabil 50

Reasons for co-occurrence• Why would two taxa consistently occur together or avoid

each other across samples?

A Pipeline for Metagenomic Analysis

Page 51: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

5/14/2018 Workflow for Metagenomics Project 51

Similarity based network inference

Page 52: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

52

• Consider X and Y as twovectors with n elements.Then the Pearsoncorrelation of X and Y willbe calculated using thisformula:

Pearson Correlation

cov( , )( , )var( ) var( )

X Yr X Y

X Y

1

2 2

1 1

( )( )

( ) ( )

n

i ii

n n

i ii i

X X Y Yr

X X Y Y

Page 53: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

53

Co-Abundance Network

Page 54: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

54

Co-Abundance Network

Page 55: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

ControllingMicrobialCommunities

55

A Concept From Contol Theory

Page 56: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

56

• Probiotics• Prebiotics• Bacteriotherapy• Antibiotics

https://doi.org/10.1016/j.mib.2013.06.009

How to Maintain/Restore our Healthy Microbiota?

Page 57: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

57

• a direct secretion of substances such as bacteriocins

• ecological competition between the microbes

• immune system modulation

Levy & Borenstein, PNAS (2013), DOI: 10.1073/pnas.1300926110

Inter-species Interactions

Page 58: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

58

Dynamic Bayesian Network(Nature, 2016)

DOI: 10.1038/srep20359

ODE (GLV/ Mechanistic)eLife, 2017

Agent Based Flux BalancePLOS CB, 2017https://doi.org/10.1371/journal.pcbi.1005544

Microbial Communities (MC) modeling approaches

Page 59: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

59

Faust et.al., Current opinions in microbiology (2015),DOI: http://dx.doi.org/10.1016/j.mib.2015.04.004

Methods to predict theonset of multi stability (bifurcation)

Is the microbiome stable?

Page 60: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

The 2nd International Personalized Medicine Congress 60

Interaction Between Microorganisms

Page 61: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

• Identification of minimal sets of driverspecies through which we can achievefeasible control of the entire community.

• Using the proposed method on some realdata (Core microbiota of sea sponge, gutmicrobiota of gnotobiotic mice infectedwith C.difficile)

61

Controllability of Complex Networks

Page 62: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

62

Controllability of Complex Networks

Page 63: An introduction to computational methods for studying ... · An introduction to computational methods for studying microbial communities as a complex system ... The paradigm of cell

Thanks for yourattention

38