Bioinformatics and Natural Computing
DISCo Departmental Workshop, 2010-06-03
DISCo UNIMIB Departmental Workshop
Outline
• BIMIB: BIoinformatics MIlano Bicocca
• Research areas and new directions
http://bimib.disco.unimib.it
• People
• Cooperations
Research areas and directions
• Sequence Analysis
  – Motif Finding, SNP classification, Haplotyping, Alternative Splicing Prediction
• Statistical Analysis of Biological Experiments
  – Association Studies, Microarray Analysis, Clustering, Redescriptions
• Algorithmics
  – Approximation Algorithms for Combinatorial Problems in Computational Biology (MAST, LCS, Fingerprint clustering …)
• Biomedical Ontologies
  – Collaborative Association Studies, Phenotype Ontology Development
• Natural Computing
  – Theory and applications of Membrane Systems
  – Splicing Systems and Formal Languages
  – DNA Word Design
  – Evolutionary computing
• Systems Biology
  – Models of biological systems
  – Stochastic Simulation of Biochemical Processes
  – Data Mining
Natural Computing
Natural computing
• The work conducted in this area concerns the study of models of computation that are inspired by nature
• The most important research lines that the BIMIB group is pursuing are centered on
  – DNA computing
  – Membrane systems
  – Evolutionary and Genetic computing
Natural Computing: basic research
• Much of the research done in these areas can be characterized as theoretical computer science, where questions of decidability, computational complexity and expressive power are paramount
• In particular:
  – Relations with languages in the usual Chomsky hierarchy
  – Comparison with other computational models
  – Complexity aspects related to time and space resources
  – Application of the model to the solution of computationally hard problems
  – Fitness-driven Importance Sampling techniques for evolutionary algorithms
  – Operators-Driven Distance Measures
Natural Computing: applications
• Some applications include:
  – Description of cellular phenomena or cellular structures (e.g., mechanosensitive channels, sodium-potassium pump, …)
  – Analysis of the behaviour of complex systems by means of stochastic models
  – Design of software simulators that return meaningful information to biologists
  – Automatic assessment of systems biology parameters
  – Automatic mining of microarray datasets
Bioinformatics
Bioinformatics: sequence analysis applications
• One of the major applications of informatics to molecular biology is the use of string analysis algorithms to study nucleic acid and protein sequences
Bioinformatics: sequence analysis applications
• Alternative splicing prediction
  – Alternative splicing (AS) is considered one of the main mechanisms able to explain the large gap between the number of predicted genes and the high complexity of the human proteome.
  – The main goal is the development of fast and reliable computational tools for analyzing and predicting AS from Expressed Sequence Tags (ESTs) and other genomic data
  – ASPIC (Alternative Splicing PredICtion) tool
Bioinformatics: sequence analysis applications
• Approximate Pattern Discovery
  – Given a set of nucleotide or protein sequences, find all the motifs or conserved patterns, i.e.:
    • All patterns that occur (with a maximum allowed number of mutations, insertions or deletions) in every sequence of the set
    • All patterns that occur (as above) in a “surprisingly” high number of sequences
    • The pattern “closest” to the sequences under some distance measure
  – Pattern discovery: The WeederWeb System
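The first variant of the problem above (patterns occurring in every sequence with a bounded number of mutations) can be illustrated with a brute-force sketch. It is not the algorithm used by WeederWeb: it only counts substitutions (Hamming distance), ignores insertions and deletions, and enumerates all candidate patterns of length k, which is feasible only for small k. The function names `hamming`, `occurs` and `common_motifs` are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs(pattern, sequence, max_mismatches):
    """True if some window of the sequence is within max_mismatches of the pattern."""
    k = len(pattern)
    return any(
        hamming(pattern, sequence[i:i + k]) <= max_mismatches
        for i in range(len(sequence) - k + 1)
    )


def common_motifs(sequences, k, max_mismatches, alphabet="ACGT"):
    """All length-k patterns that occur approximately in every sequence."""
    motifs = []
    for letters in product(alphabet, repeat=k):  # exhaustive: |alphabet|**k candidates
        candidate = "".join(letters)
        if all(occurs(candidate, s, max_mismatches) for s in sequences):
            motifs.append(candidate)
    return motifs


if __name__ == "__main__":
    seqs = ["ACGTACGT", "TTACGTTT", "GGACGAGG"]  # toy data, not from the slides
    print(common_motifs(seqs, k=4, max_mismatches=1))
```

The exhaustive enumeration keeps the sketch short; practical tools prune the candidate space instead of testing every pattern.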
Bioinformatics: sequence analysis applications
• Phylogenetic Reconstruction and Comparison
  – Computational complexity and algorithmic solution of optimization problems derived from specific instances of the more general problem of comparing phylogenies (or evolutionary networks) in order to combine them into a single representation (i.e., an evolutionary tree or network).
  – A basic problem we investigate in comparative phylogenetics is the reconciliation (or inference) of a species tree from gene trees
Bioinformatics: sequence analysis applications
• Haplotype Inference (HI) and Genetic Variation Analysis
  – Design and experimentation of algorithms for solving combinatorial problems related to haplotype inference and genetic variation analysis.
  – Specific computational problems of interest are:
    • inferring the complete information on haplotypes from (incomplete or partial) haplotypes or genotypes
    • efficient reconstruction of the perfect phylogeny describing the evolutionary history of Single Nucleotide Polymorphism (SNP) data in the presence of recurrent mutations
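To make the haplotype inference problem concrete, the sketch below enumerates every pair of haplotypes that could explain a single genotype. It assumes a common textbook encoding that the slides do not specify: each site is `0` or `1` when homozygous and `2` when heterozygous. The function name `resolutions` is hypothetical. A genotype with h heterozygous sites has 2^(h-1) candidate pairs, which is why inference methods need an additional criterion (such as parsimony or a perfect phylogeny) to choose among them.

```python
from itertools import product


def resolutions(genotype):
    """Return every unordered haplotype pair (h1, h2) compatible with the genotype.

    Assumed encoding: '0' or '1' = homozygous site, '2' = heterozygous site.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == "2"]
    if not het_sites:
        return [(genotype, genotype)]  # fully homozygous: a single resolution

    first, rest = het_sites[0], het_sites[1:]
    pairs = []
    for bits in product("01", repeat=len(rest)):
        h1 = list(genotype)
        h2 = list(genotype)
        # Fix the first heterozygous site so mirrored pairs are not listed twice.
        h1[first], h2[first] = "0", "1"
        for site, bit in zip(rest, bits):
            h1[site] = bit
            h2[site] = "1" if bit == "0" else "0"
        pairs.append(("".join(h1), "".join(h2)))
    return pairs


if __name__ == "__main__":
    for pair in resolutions("0212"):
        print(pair)
```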
Statistical Data Analysis of High Throughput Data
Statistical Data Analysis of Biological Experiments
• The amount of data generated by high-throughput (non-sequencing) biotechnology apparatuses is huge
  – Microarray
  – microRNA
  – Proteomic machinery (cf. mass spectrometry)
Statistical Data Analysis of Biological Experiments
• Statistical methods of various kinds are necessary to validate hypotheses and perform data mining operations
• The research pursued by the group in this area has concentrated on
  – Time course data analysis with kernel methods and evaluation of ontological “enrichments”
  – Multiple data sources integration for mass-spectrometry data with mutual information scoring
  – Application of Evolutionary and Genetic computing for the assessment of features (biological markers and combinations of biological markers) in gene assays
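As an illustration of the mutual information scoring mentioned above, the sketch below computes the plug-in (empirical) estimate of mutual information, in bits, between two discrete variables. How the group discretized or scored its mass-spectrometry data is not stated in the slides, so the function name `mutual_information` and the toy inputs are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete observations."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


if __name__ == "__main__":
    # Hypothetical discretized intensity levels from two data sources.
    source_a = ["low", "low", "high", "high", "high", "low"]
    source_b = ["low", "low", "high", "high", "low", "low"]
    print(mutual_information(source_a, source_b))
```

The plug-in estimate is biased upward on small samples, so scores from short series are best compared against a permutation baseline.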
Biomedical Ontologies Engineering
Biomedical Ontologies
• The need for common vocabularies and “ontologies” used to label and/or model data has been recognized as a cornerstone of community research by biologists and physicians
• The BIMIB group worked on using ontologies for two applications
  – Enrichment studies (cf. statistical analysis)
  – Definition of new ontologies for clinical applications and genotype-phenotype associations
Biomedical Ontologies: NeuroWEB
• The NeuroWEB project was concluded in 2009
  – The aim of the NEUROWEB project is to support association studies in the field of neurovascular medicine, with a special commitment to genotype-phenotype relations
  – In particular, in the NEUROWEB project, the phenotype is formulated on the basis of the patients’ clinical data, eventually leading to the comprehensive assessment of the patients’ pathological state
Biomedical Ontologies: NeuroWEB
• Three main ontological layers (10 Top Phenotypes - ~200 Low Phenotypes - ~300 Core Data Set elements) are organized in taxonomies
• A set of ontological relations (17 object properties) to:
  – Connect the leaves of the three layers
  – Enable complex phenotype construction
• Accessory layers (anatomical parts, quantitative/qualitative attributes, …)
Biomedical Ontologies: NeuroWEB
[Figure: the TOP PHENOTYPE, LOW PHENOTYPE and CDS layers, linked by OntoRelations]
Systems Biology: Simulation and Analysis
Simulation of biological systems
• Systems biology is the study of the emergent properties of a biological system once it is modeled (and simulated) as a set of interacting parts
• Different kinds of simulations are possible
  – Deterministic (differential equations)
  – Stochastic (Gillespie’s algorithm, a Monte Carlo method)
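The stochastic option listed above can be sketched with Gillespie's direct method: draw an exponentially distributed waiting time from the total propensity, then pick the reaction to fire with probability proportional to its propensity. The example system (a reversible isomerization A ⇌ B with rate constants 1.0 and 0.5) and the names `gillespie` and `ISOMERIZATION` are illustrative assumptions, not taken from the slides.

```python
import random


def gillespie(state, reactions, t_end, seed=0):
    """Direct-method stochastic simulation.

    reactions: list of (propensity_function, change) pairs, where change maps
    a species name to the integer amount added when the reaction fires.
    Returns a list of (time, state_snapshot) tuples.
    """
    rng = random.Random(seed)
    state = dict(state)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        propensities = [propensity(state) for propensity, _ in reactions]
        total = sum(propensities)
        if total <= 0.0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # waiting time ~ Exponential(total)
        if t > t_end:
            break
        # Choose one reaction with probability propensity / total.
        _, change = rng.choices(reactions, weights=propensities)[0]
        for species, delta in change.items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical example system: A -> B at rate 1.0 * A, B -> A at rate 0.5 * B.
ISOMERIZATION = [
    (lambda s: 1.0 * s["A"], {"A": -1, "B": +1}),
    (lambda s: 0.5 * s["B"], {"A": +1, "B": -1}),
]

if __name__ == "__main__":
    run = gillespie({"A": 100, "B": 0}, ISOMERIZATION, t_end=5.0)
    print(len(run), run[-1])
```

Each iteration fires exactly one reaction, so the method is exact but slow when propensities are large.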
Stochastic Simulation
• The modeling formalism:
  – Membrane (P) systems
• The simulator
  – C language
  – Desktop PC
  – Cluster DISCo and CINECA with MPI implementation
  – Algorithm: modified Gillespie’s algorithm with τ-leaping
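The idea behind τ-leaping can be sketched as follows: instead of simulating one reaction event at a time, advance time by a step τ and fire each reaction a Poisson-distributed number of times with mean propensity × τ. This is a basic fixed-step sketch in Python, not the group's modified algorithm or its C/MPI simulator, and it ignores membrane structure. The step-halving guard against negative populations is a crude stand-in for the step-size selection used by published τ-leaping variants. All names (`poisson`, `tau_leap`, `ISOMERIZATION`) and the example system are hypothetical.

```python
import math
import random


def poisson(mean, rng):
    """Poisson sample by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def tau_leap(state, reactions, t_end, tau, seed=0):
    """Fixed-step tau-leaping; returns a list of (time, state_snapshot) tuples."""
    rng = random.Random(seed)
    state = dict(state)
    t = 0.0
    trajectory = [(t, dict(state))]
    while t < t_end:
        step = min(tau, t_end - t)
        while True:
            proposal = dict(state)
            for propensity, change in reactions:
                # Propensities are frozen at the state from the start of the leap.
                firings = poisson(propensity(state) * step, rng)
                for species, delta in change.items():
                    proposal[species] += firings * delta
            if min(proposal.values()) >= 0:
                break
            step /= 2.0  # crude guard: retry with a smaller leap
        state = proposal
        t += step
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical example system: A -> B at rate 1.0 * A, B -> A at rate 0.5 * B.
ISOMERIZATION = [
    (lambda s: 1.0 * s["A"], {"A": -1, "B": +1}),
    (lambda s: 0.5 * s["B"], {"A": +1, "B": -1}),
]

if __name__ == "__main__":
    run = tau_leap({"A": 100, "B": 0}, ISOMERIZATION, t_end=5.0, tau=0.01)
    print(len(run), run[-1])
```

Larger values of τ reduce the number of steps at the cost of accuracy, because propensities are held constant during each leap.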
Studying stochasticity in biological systems
• Two kinds of noise:
  – intrinsic noise - due to the inherent nature of the biochemical interactions
  – extrinsic noise - due to the external environmental conditions
• Complex systems such as biological ones are non-linear and often exhibit multiple steady states, bifurcations or chaotic behavior
Stochastic simulations: applications
• Molecular and cellular scale:
  – transport proteins
    • Na+/K+ pump, Ca2+ channels, mechanosensitive channels
  – chemical reactions
    • Belousov-Zhabotinsky, Michaelis-Menten
  – cellular signaling pathways
    • EGFR, Ras/cAMP/PKA
  – bacterial colonies
    • Vibrio fischeri, Pseudomonas aeruginosa
Biological systems simulations: Colon Rectal Crypts
Three-dimensional schematic of a crypt in the mouse small intestine. The positions of the individual cells show how things might look in a typical crypt. The Paneth cells tend toward the bottom, where they contribute to innate immunity by responding to bacterial infection (Ayabe et al. 2000). The numbers on the cells show the transit cell generation i, as in the Ti of Figure 12.6. The stem cells vary in actual cellular position in the range 3–7, but on average appear to be around cell position 4 when numbered from the bottom. The figure only shows the bottom 7 cell positions of the approximately 15 positions. CSC abbreviates "clonogenic stem cell" (see Figure 12.6). Redrawn from Marshman et al. (2002). Copied from NCBI Frank’s online book
People BIMIB DISCo
• Marco Antoniotti
• Paola Bonizzoni
• Claudio Ferretti
• Alberto Leporati
• Giancarlo Mauri
• Raffaella Rizzi
• Leonardo Vanneschi
• Claudio Zandron
• Italo Zoppis
• Roslyn Sagaya Mary Antonath
• Stefano Beretta
• Mauro Castelli
• Paolo Cazzaniga
• Gianluca Colombo
• Antonella Farinaccio
• Luca Manzoni
• Dario Pescini
• Yuri Pirola
• Antonio Enrico Porreca
• Andrea Valsecchi
Other People
• Francesco Archetti, DISCo
• Enza Messina, DISCo
• Enzo Martegani, BtBs
• Marco Vanoni, BtBs
• Riccardo Dondi, Un. Bergamo
• Gianluca Della Vedova, Statistica, UNIMIB
• Daniela Besozzi, Un. Milano
• Giulio Pavesi, Un. Milano
• Graziano Pesole, Un. Bari
• Mario Giacobini, Un. Torino
• Paolo Provero, Un. Torino
• Manuela Gariboldi, IFOM-IEO
• James Reid, IFOM-IEO
• Luciano Milanesi, ITB CNR
• Marco Pierotti, Istituto Nazionale dei Tumori
• Giovanna Castoldi, Medicina, UNIMIB
• Fulvio Magni, Medicina, UNIMIB
Other People International
• Daniele Merico – Un. Toronto, Toronto, Canada
• Gary Bader – Un. Toronto, Toronto, Canada
• Bud Mishra – NYU, New York, USA
• Naren Ramakrishnan – Virginia Tech, Blacksburg, VA, USA
• Victor Moreno – ICOncologia, Barcelona, Spain
• Miguel-Angel Pujana – ICOncologia, Barcelona, Spain
• Laura Slaughter – National Technical University of Norway (NTNU), Norway
• Aristotelis Chatzioannou – EIE, Athens, Greece
• Viktor Malyshkyn – Center for Supercomputing, Russian Academy of Sciences, Novosibirsk, Russia
Conferences and Workshops
• Signs Symptoms and Findings Workshop 2009, September 2009, Milan, Italy
International cooperation
• BIMIB DISCo is the institutional contact point for all initiatives concerning the EC Virtual Physiological Human Network of Excellence (www.vph-noe.eu)
Funding
• Ongoing
  – FAR
  – EnviGP - Improving Genetic Programming for the Environment and Other Applications, Programa Operacional Factores de Competitividade, Fundação para a Ciência e a Tecnologia (FCT), Portugal (PTDC/EIA-CCO/103363/2008)
  – ProteomeNet - Rete Nazionale per lo studio della proteomica umana, FIRB
• Pending
  – EU FP7 ICT Virtual Physiological Human
    • CRControl (coordinator)
    • BioBridge (partner)
  – Regione Lombardia, Programma ASTIL
  – Regione Lombardia, Programma Quadro/Università
  – PRIN 2009
Publications
• All publications authored by BIMIB affiliates and collaborators are listed on the group web site and on the digidisco platform
http://bimib.disco.unimib.it/index.php/Special:Publications/en
THANK YOU