In silico Footprinting and Genomic Signature Analysis for Encephalic and Hemorrhagic Viruses
Willy A. Valdivia-Granda
Orion Integrated Biosciences, Inc.
Outline of this talk
Binary Clustering Analysis of Genomic Signatures. Examples: Flavivirus and Filoviruses
Encephalitic and Hemorrhagic Viruses: Arenaviridae, Bunyaviridae, Flaviviridae, Filoviridae
In silico Genomic Footprinting: Genomic Signature Detection
Directions in the Use of Genomic Signatures: Flavi-chip, Arena-chip, Bio-detection, Multimeric Vaccines
BioScience, May 2003: One million tons of dust may contain 10 quadrillion microbes (USGS).
Dissemination of Infectious Diseases
Encephalic and Hemorrhagic Viruses: NIAID Cat. A-C BSL4
Flaviviridae: Hepatitis C, Dengue, Yellow fever, Japanese encephalitis, and West Nile viruses
Arenaviridae: Argentine, Bolivian, and Venezuelan hemorrhagic fevers; Lassa fever
Bunyaviridae: Hantavirus, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus
Filoviridae: Ebola and Marburg viruses
West Nile Virus Summer 2003
Ecological Genomics and Biocomplexity of Viruses
[Figure: virus particle images at 100,000X and 123,000X magnification, with scale labels of 25 Å and 4 Å]
Flaviviridae Family
[Figure: genome organization of the three genera, each drawn from the 5' UTR (cap or IRES) to the 3' UTR. Gene blocks are labeled C, PreM, E, E1, E2, NS1, NS2, NS2-3, NS2A, NS2B, NS3, NS4, NS4A, NS4B, NS5A, and NS5B.]
Hepaciviruses (~9.4 kb)
Pestiviruses (~12 kb)
Flavivirus (~11 kb): 70 species transmitted by mosquitoes, ticks, or no known vector
Family, Genus, Species, Sub-species, Strain
Molecular Detection Methods for Viruses
DNA sequencing
DNA-DNA reassociation
RT-PCR
Degenerate PCR
Immunology
Microarray
Viral Sequence Recovery Using DNA Microarrays
Viral sequences were physically scraped, amplified, cloned, and sequenced.
Wang and DeRisi et al. 2003. PLoS Biology, Volume 1, Issue 2. http://biology.plosjournals.org
Prototypical Coronavirus Genome Structure
Murine Hepatitis Virus (MHV): 52/157 AA
Murine Hepatitis Virus (MHV): 33/37 AA
Infectious Bronchitis Virus (IBV): 32/32 nt
[Figure: plot along the Flavivirus genome (~11 kb), covering the structural and non-structural proteins between the 5' UTR and 3' UTR; x-axis 1 to 225, y-axis 0 to 20]
Nucleotide substitution: 10^-3 per site per year
Why amino acids? Information content 3:1
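One reading of the 3:1 figure is that three nucleotides encode one amino acid, so a translated sequence is one third the length of its coding sequence. The sketch below illustrates that reduction using the standard genetic code. The function name `translate` and the example sequence are illustrative assumptions, not part of the original analysis.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate a coding sequence in frame 0; unknown codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")  # accept RNA input as well
    usable = len(seq) - len(seq) % 3             # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Hypothetical 9-nucleotide example: 9 nucleotides give 3 residues (3:1).
coding = "ATGGCCAAG"
protein = translate(coding)
print(len(coding), len(protein), protein)
```

Working at the amino acid level also hides synonymous nucleotide changes, which may be one reason to prefer it for fast-evolving RNA viruses.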
[Figure: phylogenetic tree of flaviviruses with branch support values between 21 and 100 and a scale bar of 0.2. Leaf labels: Louping, Tick-borne, Omsk, Langat, Powassan, Deer, Dengue, Murray, Japanese, West, Yellow, Alkhurma, Montana, Rio, Apoi, Modoc, Tamana, Kamiti, Cell]
Ungapped Whole Genomic Footprinting
[Diagram: Flavivirus genome (~11 kb) with workflow boxes: 20 Genomes, Target Generation, Profile Generation, Profile Comparison, Genomic Footprinting]
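The slides do not spell out the footprinting algorithm, so the following is only a minimal sketch of one ungapped approach: collect every fixed-length window from each sequence (a profile), then keep the windows shared by several sequences together with their positions (a footprint). The window length of 12 follows the element length quoted in the conclusions; the function names and the toy sequences are assumptions for illustration.

```python
from collections import defaultdict


def kmer_positions(seq, k=12):
    """Profile generation: map each ungapped k-mer to its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def footprint(genomes, k=12, min_genomes=2):
    """Profile comparison: k-mers found in at least min_genomes sequences.

    Returns {kmer: {genome_name: [start positions]}}.
    """
    profiles = {name: kmer_positions(seq, k) for name, seq in genomes.items()}
    counts = defaultdict(int)
    for profile in profiles.values():
        for kmer in profile:
            counts[kmer] += 1
    shared = {}
    for kmer, n in counts.items():
        if n >= min_genomes:
            shared[kmer] = {
                name: profile[kmer]
                for name, profile in profiles.items()
                if kmer in profile
            }
    return shared


# Toy sequences with k=4 so the shared window is easy to see.
toy = {"a": "MKTLLVAG", "b": "QQTLLVPP"}
print(footprint(toy, k=4))
```

Exact matching is the strictest possible comparison; a real pipeline would probably tolerate some mismatches, which this sketch does not attempt.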
Ungapped Whole Genomic Footprinting
[Figure: footprint map for the Flavivirus genome (~11 kb). Each row is one virus (Japanese, West Nile, Yellow, Tick-borne, Louping, Murray, Deer, Modoc, Rio, Apoi, Powassan, Langat, Montana, Alkhurma, Dengue, Tamana, Cell), and numbered genomic signatures (for example 11, 19, 31, 33, 41, 45) are plotted along an x-axis running from 1 to 275. Rows are grouped by vector: mosquito, ticks, no vector.]
Ungapped Whole Genomic Footprinting
[Figure: footprints grouped as tick-borne, mosquito-borne, and non-vector, annotated with a core genome, an adaptation region, and a cyclic region]
Flavivirus Genomic Signature for NS5
[Figure: tree of NS5 genomic signatures across Flavivirus isolates, including dengue (DEN1 to DEN4 and vaccine constructs), West Nile and Kunjin, Powassan, Tembusu, yellow fever, Japanese encephalitis, Langat, Kamiti, and other tick-borne and mosquito-borne isolates; scale bar 0.1]
59 Species of the Flavivirus Genus
Binary classification of Flaviviruses
Rows are viruses; columns are the signature filters FGS-n.filter (1 = signature present, 0 = absent).
FGS filter            1 10 11 12 13 14 15 16 17 18 19  2 20 21 22 23 24 25 26 27
DENV                  1  1  1  1  1  1  1  1  1  1  1  1  1  0  1  1  1  1  1  0
YOKV                  1  1  1  0  0  1  1  1  1  1  0  1  0  0  1  1  1  0  0  0
JEV                   1  1  1  1  1  1  1  1  1  1  1  1  1  0  1  1  1  1  1  0
MVEV                  1  1  1  1  1  1  1  1  1  1  1  1  1  0  1  1  1  1  1  0
WNV                   1  1  1  1  1  1  1  1  1  1  1  1  1  0  1  1  1  1  1  0
APOIV                 1  1  1  1  1  1  1  1  1  1  1  1  1  0  1  1  1  1  1  1
MODV                  1  1  1  0  1  1  1  1  1  1  1  1  1  0  1  1  0  1  1  1
MMLV                  1  1  1  0  1  1  1  1  1  1  1  1  1  0  1  1  1  1  1  1
RBV                   1  1  1  0  1  0  1  1  1  1  1  1  1  0  1  1  1  1  1  1
CFAV                  1  1  0  0  1  0  1  1  1  1  0  1  0  0  1  0  0  0  0  0
TABV                  0  0  0  0  0  0  1  1  1  1  0  0  0  0  1  0  0  0  0  0
LGTV                  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
LIV                   1  1  1  1  1  1  0  1  1  1  1  1  1  1  1  1  1  1  1  1
OHFV                  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
POWV                  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
TBEV                  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
YFV                   1  1  1  1  1  1  1  1  1  1  0  1  1  0  1  1  1  0  1  0
Alkhurma virus        1  1  1  1  1  1  1  1  1  1  1  1  1  0  1  1  1  1  1  1
Deer tick virus       1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
Kamiti River virus    1  1  0  0  1  0  1  1  1  1  0  1  0  0  0  0  0  0  0  0
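The slides do not say how the binary profiles were clustered. As a minimal sketch, assuming a Hamming distance between presence/absence rows and a simple single-linkage grouping, the rows for seven of the viruses above can be compared as follows. The profile strings are copied from the table in its column order; the threshold of 2 and the short label KRV for Kamiti River virus are illustrative choices.

```python
def hamming(a, b):
    """Number of signature filters on which two binary profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def single_linkage_groups(profiles, max_distance):
    """Group names whose profiles are linked by distances <= max_distance."""
    groups = []
    for name in profiles:
        merged = [name]
        remaining = []
        for group in groups:
            if any(hamming(profiles[name], profiles[m]) <= max_distance
                   for m in group):
                merged = group + merged   # absorb every group this row links to
            else:
                remaining.append(group)
        groups = remaining + [merged]
    return groups


# Presence/absence rows taken from the table (20 filters each).
PROFILES = {
    "DENV": "11111111111110111110",
    "JEV":  "11111111111110111110",
    "LGTV": "11111111111111111111",
    "TBEV": "11111111111111111111",
    "CFAV": "11001011110100100000",
    "KRV":  "11001011110100000000",
    "TABV": "00000011110000100000",
}

print(hamming(PROFILES["DENV"], PROFILES["TABV"]))
print(single_linkage_groups(PROFILES, max_distance=2))
```

With this threshold the mosquito-borne and tick-borne rows sampled here fall into one group, the two insect-specific viruses into another, and TABV stays alone; a different distance or threshold would change the grouping.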
>NP_476520.1|Deer tick virus|ctb30|8160|NS3
>NP_476520.1|Deer tick virus|ctb30|9651|NS5
Arising Biological Questions About Genomic Signatures
Are genomic signatures relevant for pathogen replication?
Silencing of host genes. MHC?
Competitive advantage over other viral serotypes
Role in virus recombination and the generation of new variability
Do some genomic signature duplications define host range, and are they related to vector transmission?
Binary classification of Mosquito Borne Flaviviruses
[Figure: tree with scale bar 0.05. Leaf labels, given as virus|accession|strain:]
Powassan|AF310944.1|64-7062
Powassan|AF310945.1|64-7062
Powassan|AF310943.1|T18-23-81
Powassan|AF310942|1427-62
Powassan|AF310941|M1409
Powassan|AF310940|M1409
Powassan|AF310939.1|1982-64
Powassan|AF310937.1|M11665
Powassan|NC_003687.1
Powassan|AF310938.1|SPO/B10
Powassan|AF310950.1|791A-52
Powassan|AF310946.1|CTB30
Powassan|AF310947.1|IP5001
Powassan|AF310948.1|R59266
Powassan|AF310949.1|12542
Tick-borne|U27496.1|RK1424
Tick-borne|U27492.1|TEU27492
Louping|Y07863.1|LIVGEN
Tick-borne Louping|NC_001809.1
Tick-borne|L40361.3|L40361
Langat|AF253419.1|TP21
Langat|AF253420.1|attenuated
Langat|NC_003690.1
Tick-borne|U27493.1|TEU27493
Tick-borne|NC_001672.1
Tick-borne|U27490.1
Homologous Recombination Regions
- Natural selection
- Mechanistic/ecological
- Genome segment reassortment
NS5 Genomic Signature Phylogenetic Incongruence
J. Virol., April 1, 2004; 78(7): 3319-3324.
J. Virol., February 15, 2004; 78(4): 2114-2120.
Genomic Footprinting Hemorrhagic Viruses
Ebola Virus
[Figure: footprint map of numbered genomic signatures (x-axis 1 to 275) for Ebola-Reston, Ebola-Zaire, Marburg, and Marburg Lake Victoria]
[Figure: tree of filovirus isolates with support values 89, 68, and 67 and a scale from 0.00 to 0.10. Leaf labels: Ebola-Zaire-Mayinga-subtype-Zaire, Ebola-Zaire-Mayinga-C, Ebola-Zaire-Mayinga-B, Ebola-Zaire-Mayinga-A, Ebola-Zaire-1995, Ebola-EBLPROTG-Zaire, Ebola-Zaire-Mayinga, Ebola-Reston, Ebola-Reston-Pennsylvania, Ebola-Reston-A, Ebola-Maleo, Marburg-1975/Ozolin, Marburg-MRVMBGL, Marburg-Lake.Victoria-pp3, Marburg-MVREPCYC, Marburg-MAVSPAB, Marburg-MVVIRPR, Marburg-NC_001608.2, Marburg-Lake.Victoria-pp4]
Phylogenomic Analysis of Flavivirus Genomic Signatures
Valdivia-Granda et al. 2002.
[Figure: excerpt of the Flavivirus genomic signature tree; visible labels include DEN4r2Adel30, WNV-hISR2000, WNV-VLG-4, Tembusu-MM1775, YFV-TN-96, and Saumarez; scale bar 0.1]
Oracle, DB2, Sybase, Flat File, XML, Other
(Metadatabase layer) JDBC
Data adapters are specific implementations of data drivers for different genomic databases.
Data Mapping Function
The data mapping function maps objects and their attributes to specific databases.
Process Flow
a. The authorization process begins at the client and is passed to the Security and Administration API. This process selects the services API.
b. The Administration API selects the Data Analysis API, and the data request is passed to the Data Abstraction Layer (DAL).
c. The data mapping function is invoked; the specified application and URI are referenced and the proper driver is selected.
d. The various data drivers implement the Metadatabase layer (MDL) and produce common requests; result sets are selectively cached in the Data Abstraction Server.
e. Once data is delivered from one of the databases, it may be sent to the analysis application.
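Steps c through e can be sketched in a few lines. This is a toy model only: the class `DataAbstractionLayer`, its methods, and the `flat_file_driver` stand-in are hypothetical names chosen for illustration, and the real system described on the slide sits on JDBC rather than Python callables.

```python
from typing import Callable, Dict, List, Tuple

ResultSet = List[dict]
Driver = Callable[[str], ResultSet]  # a driver turns a request into a result set


class DataAbstractionLayer:
    """Toy DAL: routes a request to a driver via a mapping and caches results."""

    def __init__(self) -> None:
        self._drivers: Dict[str, Driver] = {}
        self._mapping: Dict[str, str] = {}
        self._cache: Dict[Tuple[str, str], ResultSet] = {}

    def register_driver(self, source: str, driver: Driver) -> None:
        self._drivers[source] = driver

    def map_object(self, object_type: str, source: str) -> None:
        """Data mapping function: bind an object type to a specific database."""
        self._mapping[object_type] = source

    def request(self, object_type: str, query: str, cache: bool = True) -> ResultSet:
        source = self._mapping[object_type]       # step c: pick the proper driver
        key = (source, query)
        if cache and key in self._cache:          # step d: selective caching
            return self._cache[key]
        result = self._drivers[source](query)
        if cache:
            self._cache[key] = result
        return result                             # step e: hand data to the application


calls = []  # records how often the stand-in driver is actually invoked


def flat_file_driver(query: str) -> ResultSet:
    calls.append(query)
    return [{"source": "flat_file", "query": query}]


dal = DataAbstractionLayer()
dal.register_driver("flat_file", flat_file_driver)
dal.map_object("sequence", "flat_file")
first = dal.request("sequence", "NS5")
second = dal.request("sequence", "NS5")  # served from the cache
```

Keeping the mapping separate from the drivers is what lets an analysis application ask for a "sequence" without knowing which database holds it.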
Global Schema
Security and Administration Application
Sequence Analysis Application
Microarray Analysis Application
Proteomic Analysis Application
2D and 3D Sequence Visualization
Cytogenomic Map Visualization
Transcriptional Network Visualization
Client
Firewall
Development of New Detection Devices
[Figure: bar chart of the number of published papers (y-axis 0 to 1400) for Malaria, HIV, Viruses, Dengue, and Cancer; data labels shown: 15840, 16, 1, and 1320]
Collaborators
UC Berkeley
Centers for Disease Control
University of Zurich, Switzerland
MIT
Walter Reed Army Institute of Research
San Diego Supercomputing Center
UMass Med School
Pasteur Institute, France
Pasteur Institute, Senegal
Sandia National Laboratories
August 2003
December 2002
KDD-cup
(A) (B) (T)
(T+A+B) (T:A:B)
(2T)
Arising Biological Questions About Genomic Signatures
Genomic signatures and the origin of life?
Repeated self-replication and simple evolvability: http://www.eastman.ucl.ac.uk/~thutton/Evolution/
Hutton, T. J. (2002) Evolvable Self-Replicating Molecules in an Artificial Chemistry. Artificial Life 8(4):341-356.
Lee, D. H.; Granja, J. R.; Martinez, J. A.; Severin, K.; Ghadiri, M. R. "A Self-Replicating Peptide". Nature 1996, 382, 525-528.
Nowak, M. A.; Sigmund, K. Phage-lift for game theory. Nature 398, 367-368.
Turner, P. E.; Chao, L. Prisoner's dilemma in an RNA virus. Nature 398, 441-443.
[Figure: footprint map of numbered genomic signatures (x-axis 1 to 275) for Seoul, Sin Nombre, and Dobrava viruses]
Ungapped Whole Genomic Footprinting
[Figure: tree of Ebola and Marburg isolates with support values 89, 68, and 67 and a scale from 0.00 to 0.10]
Phylogenomic Analysis of Hemorrhagic Fevers
[Figure: axes labeled Lethality, Dispersion, and Time; annotations: Viral Isolation, Viral Extinction threshold]
Risk for dengue fever (DF) among travelers to Thailand, 2002.
Christina Frank,* Irene Schöneberg,* Gérard Krause,* Hermann Claus,* Andrea Ammon,* and Klaus Stark* *Robert Koch Institute, Berlin, Germany
http://www.cdc.gov/ncidod/EID/vol10no5/03-0495-G2.htm
The Central Dogma
• Gene finding algorithms
• Mutation
• Alternative splicing
• Folding dynamics
Pathways
• Directionality
• Association accuracy
Functional Modules
• Module directionality
• Visualization
Large Scale Organization
• Evolutionary perspective
• Ecological level?
Viral Life Cycle
Viral Adaptation
Viral Structural Changes
Sequence Space
Selection
Evolutionary Dynamics
90% of the Zairian cases and 50% of the Sudanese cases resulted in death.
Mortality for Marburg hemorrhagic fever is between 23% and 25%.
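[Figure: genomic signature count (y-axis 0 to 5) for OIB FGS1 across Flavivirus groups. Group labels are truncated in the source: Dengue v, Entebbe, Japanese, Kokobera, Modoc vi, Mosquito, Ntaya vi, Rio Brav, Seabird, Spondwei, Tentativ, Tick-bor, Yellow f, zunclass]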
HLA-A1 HLA-A68.1
qvpfcsnhftel Lupoid hepatitis restricted hepatitis B core antigen
Conclusions
A comprehensive genomic analysis of the genus Flavivirus suggests the existence of a core viral genome composed of 47 elements, each 12 amino acids long. For 7 of the viral genomes there is at least one copy of each element. However, several genomic signatures are duplicated up to three times. It remains unclear whether the generation of genomic signatures is a cyclic event.
Our analysis shows that duplication and mutation of genomic signatures are still relevant processes in viral genome evolution and could be involved in viral recombination and self-interaction.
As mutation pressures selected fit individuals, new species with novel characteristics emerged. However, there is a tradeoff between viral pathogenesis and dispersal.
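The two counts in these conclusions (genomes holding at least one copy of every element, and elements present in several copies) can be computed from a copy-number table. The sketch below assumes exact string matching and uses toy 4-residue elements instead of the 47 real 12-residue elements, which are not listed on the slides; all names and sequences are hypothetical.

```python
def count_occurrences(sequence, element):
    """Count occurrences of element in sequence, allowing overlaps."""
    count = 0
    start = sequence.find(element)
    while start != -1:
        count += 1
        start = sequence.find(element, start + 1)
    return count


def copy_number_table(genomes, elements):
    """Return {genome: {element: copies}} for every genome and element."""
    return {
        name: {e: count_occurrences(seq, e) for e in elements}
        for name, seq in genomes.items()
    }


def genomes_with_all_elements(table):
    """Genomes that carry at least one copy of every element."""
    return [name for name, counts in table.items()
            if all(c >= 1 for c in counts.values())]


def duplicated_elements(table):
    """Elements found more than once in at least one genome."""
    return sorted({e for counts in table.values()
                   for e, c in counts.items() if c > 1})


# Toy data: two 4-residue elements and two short sequences.
elements = ["MKTL", "GAVL"]
genomes = {"g1": "MKTLPPGAVLPPMKTL", "g2": "MKTLPP"}
table = copy_number_table(genomes, elements)
print(genomes_with_all_elements(table), duplicated_elements(table))
```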
Each of these particles is about 4.5 microns long--about one-twentieth the diameter of a human hair, which is about 100 microns. Alternating gold and silver stripes create the "barcode" pattern on these tiny particles. When viewed in blue light under a microscope, silver is much more reflective than gold, making different-patterned particles easy to identify.
4.5 microns
Amino Acid Usage in Flavivirus and Filovirus
[Figure: amino acid usage (%) compared between Filovirus and Flavivirus]
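Amino acid usage of this kind is a percentage composition: the count of each residue divided by the sequence length. A minimal sketch follows, assuming one-letter protein sequences as input; the function name and the short example sequence are illustrative only.

```python
from collections import Counter


def amino_acid_usage(protein):
    """Return {residue: percent of sequence}, ignoring non-letter characters."""
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    total = len(residues)
    if total == 0:
        return {}
    counts = Counter(residues)
    return {aa: 100.0 * n / total for aa, n in sorted(counts.items())}


# Hypothetical 4-residue example.
usage = amino_acid_usage("AAGC")
print(usage)
```

Comparing two families then amounts to computing this dictionary over the concatenated proteins of each family and plotting the two sets of percentages side by side.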
Genomic Footprinting and Biological Complexity
Microarray Detection Analysis