01-05/12/2014 Computational Metagenomics Workshop University of Mauritius
Fundamentals of Bioinformatics: computation, biology, computational biology
Vasilis J. Promponas
Bioinformatics Research Laboratory
Department of Biological Sciences
University of Cyprus
A short self-introduction
● Vasilis, pronounced: “Vass`ilis”
● A Frog Physicist turned into a Biologist
– Coincidence: it all happened around 1995-96
– Computational approach
● PhD in Biology (2004, Computational Biology/Bioinformatics), University of Athens, Greece
● 2005 – Moved to Cyprus
Cyprus?
Source: http://maps.google.com
University of Cyprus
Overview
● Introduction
– Some definitions and concepts from (Molecular) Biology
– The rapid growth of Biological Data
● The advent of the Genome Era (a paradigm shift in Biology?)
● Bioinformatics and Computational Biology: Fundamental Problems – Concepts – Applications
● Discussion
Introduction
Bio::revenge
● Biology IS the science of the 21st century
– Used to be a QUALITATIVE scientific domain
● Exceptions have been the Rule
– Turning into QUANTITATIVE
– An Information Rich field
● Impact in every aspect of (human) lives
– Food production and Quality Control
– Environment (e.g. Ecology, Monitoring, Management)
– Human activities/welfare (e.g. sports, cosmetics, health)
– ...
Introduction
Key actors
● Genome(s)
– But what about some viral genomes?
– Contains heritable(?) information
● Chromosome(s)
✔ Tightly packaged (DNA/RNA/proteins)
✔ 3D-structural organization
✔ Contains “functional” regions (a.k.a. genes) and regions of (yet) unknown function
● Gene(s)
✔ A chromosomal region encoding mRNAs, tRNAs, etc.
✔ Useful keyword: Transcription
✔ mRNAs: non-terminal
● Protein(s)
✔ Main components of the (cellular) toolkit
✔ Linear polymers
✔ Interact with other biological molecules
✔ Useful keyword: Translation
Source: http://www.google.com
Information Flow in Biological Systems
The “Central Dogma”
DNA → RNA → PROTEIN
Replication (DNA → DNA)
Transcription (DNA → RNA)
Translation (RNA → PROTEIN)
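The transcription and translation steps of the diagram can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production tool: the function names `transcribe` and `translate` and the example sequence are invented here, the input DNA is assumed to be the coding (sense) strand read 5'→3', and the standard genetic code is written in its usual compact TCAG-ordered form.

```python
from itertools import product

# Standard genetic code in compact form: codons enumerated with bases in
# T, C, A, G order (first base varies slowest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """DNA -> RNA: the mRNA reads like the coding strand with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """RNA -> protein: read codons from position 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # the table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example sequence (not taken from the slides).
mrna = transcribe("ATGGCCTGGTAA")
print(mrna)             # AUGGCCUGGUAA
print(translate(mrna))  # MAW
```

Real genes add the complications hinted at later in the deck (reading frames, gene fine structure, post-translational modifications), which this sketch deliberately ignores.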
Coupled with a “Universal” genetic code
The Genetic Code is Degenerate!
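"Degenerate" here means that 64 codons map onto only 20 amino acids plus the stop signal, so most amino acids are encoded by several synonymous codons. The short Python sketch below counts them; it assumes the standard genetic code in its usual compact TCAG-ordered form, and the variable names are illustrative only.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' = stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Group the 64 codons by the amino acid (or stop signal) they encode.
codons_for = defaultdict(list)
for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS):
    codons_for[amino_acid].append("".join(codon))

# Print the most degenerate entries first.
for amino_acid in sorted(codons_for, key=lambda aa: -len(codons_for[aa])):
    codons = codons_for[amino_acid]
    print(amino_acid, len(codons), " ".join(codons))
```

In this table leucine, serine and arginine each have six codons, while methionine and tryptophan have exactly one.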
… some more complexity ...
[Figure labels: Genome; Chromosomes; DNA Sequence; Gene (fine) Structure; Amino Acid Sequence; ORF1 ORF2 ORF3 ORF4; E1 E2 E3]
...ACTGTCTGACCGGCAGCA...
...TGACAGACTGGCCGTCGT...
DNA stores information ...
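The two strands shown above are base-paired (A with T, C with G), so either strand can be computed from the other. A minimal Python sketch follows; the helper names `complement` and `reverse_complement` are illustrative, and only the four unambiguous DNA letters are handled.

```python
# Watson-Crick pairing: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Return the base-paired strand, written in the same left-to-right order."""
    return strand.upper().translate(COMPLEMENT)


def reverse_complement(strand):
    """Return the paired strand read in its own 5'->3' direction."""
    return complement(strand)[::-1]


top = "ACTGTCTGACCGGCAGCA"     # upper strand from the slide
bottom = "TGACAGACTGGCCGTCGT"  # lower strand from the slide

print(complement(top) == bottom)  # True
print(reverse_complement(top))
```

Because the strands are antiparallel, sequence databases conventionally report the second strand as the reverse complement rather than as it is drawn on the slide.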
Proteins get the dirty job done …
Proteins
● Assembled from one or more polypeptide chains (homo-/hetero-polymers)
● The functional “toolkit”
– Enzymes
– Transport-Storage
– Motion
– Binding
– Molecular Recognition
– Signal Transduction
– Structural Proteins
– Energy Production
– Cell Regulation and Differentiation
– ... (...)
Yet, some more complexity (PTMs)
Pig Insulin Dimer (PDB_ID: 4INS)
Chain A: GIVEQCCTSICSLYQLENYCN
Chain B: FVNQHLCGSHLVEALYLVCGERGFFYTPKA
[Figure: chains A and B connected by disulfide (SS) bonds]
Pig Insulin Precursor
MALWTRLLPLLALLALWAPAPAQAFVNQHLCGSHLVEALYLVCGERGFFYTPKARREAENPQAGAVELGGGLGGLQALALEGPPQKRGIVEQCCTSICSLYQLENYCN
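The relation between the precursor and the two mature chains can be checked directly by locating each chain inside the precursor sequence. The Python sketch below does that; the function name `split_precursor` and the segment labels (`leader`, `connecting`) are chosen here for illustration, and the chains are written as they occur within the precursor shown on the slide.

```python
# Sequences as shown on the slide (pig insulin).
PRECURSOR = (
    "MALWTRLLPLLALLALWAPAPAQAFVNQHLCGSHLVEALYLVCGERGFFYTPKA"
    "RREAENPQAGAVELGGGLGGLQALALEGPPQKRGIVEQCCTSICSLYQLENYCN"
)
CHAIN_B = "FVNQHLCGSHLVEALYLVCGERGFFYTPKA"
CHAIN_A = "GIVEQCCTSICSLYQLENYCN"


def split_precursor(precursor, chain_b, chain_a):
    """Cut the precursor into the parts before, between and after the two chains."""
    b_start = precursor.index(chain_b)
    b_end = b_start + len(chain_b)
    a_start = precursor.index(chain_a, b_end)  # chain A lies downstream of chain B
    a_end = a_start + len(chain_a)
    return {
        "leader": precursor[:b_start],           # removed during maturation
        "chain_b": precursor[b_start:b_end],
        "connecting": precursor[b_end:a_start],  # removed during maturation
        "chain_a": precursor[a_start:a_end],
        "tail": precursor[a_end:],               # empty if chain A ends the precursor
    }


parts = split_precursor(PRECURSOR, CHAIN_B, CHAIN_A)
for name, segment in parts.items():
    print(f"{name:10s} {len(segment):3d} {segment}")

# Six cysteines in total, enough for three disulfide (SS) bonds.
print(CHAIN_A.count("C") + CHAIN_B.count("C"))  # 6
```

The point of the exercise is that the mature protein cannot be read off the gene alone: two stretches of the translated precursor are cut out, and the remaining chains are held together by disulfide bonds.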
Back to the “Central dogma”
For (almost) all proteins
Sequence determines 3D-structure; 3D-structure determines Function
Sequence: ..VEQCCTSICSLYQL..
Function:
• Glucose Uptake Pathway
• Glycogen Synthesis Pathway
• Formation of triglycerides
Again, this “genetic code” is redundant
But, where is the computation in biology??
Bioinformatics
[Figure: fields contributing to Bioinformatics: Biology; Statistics, Mathematics; Physics, Chemistry, Engineering, Linguistics, ...; Informatics, Computer Science]
Bio – related fields
● Computational Molecular Biology
● Bioinformatics
● Theoretical Biology
● Biomedical Informatics
● ...
● Where are the limits?
A (fuzzy?) definition of Bioinformatics
● Bioinformatics is the “computational handling and processing of genetic information”
Ouzounis & Valencia, 2003
Handling Genetic Information
● Apply existing (or develop custom) efficient methods for
– Describing and Visualizing
– Storing
– Retrieving
– Integrating
● Large volumes of complex and inhomogeneous data*
*some still call it “Designing and Building Biological Databases”
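As a toy illustration of the "storing" and "retrieving" steps, the sketch below keeps two sequence records in an in-memory SQLite database using Python's standard `sqlite3` module. The table layout, column names and `demo_` accession identifiers are hypothetical choices made for this example; real biological databases use far richer schemas and annotation.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sequences (
        accession TEXT PRIMARY KEY,
        organism  TEXT NOT NULL,
        molecule  TEXT NOT NULL CHECK (molecule IN ('DNA', 'RNA', 'protein')),
        residues  TEXT NOT NULL
    )
    """
)

# Storing: the identifiers below are made up for illustration.
records = [
    ("demo_0001", "Sus scrofa", "protein", "GIVEQCCTSICSLYQLENYCN"),
    ("demo_0002", "unknown", "DNA", "ACTGTCTGACCGGCAGCA"),
]
conn.executemany("INSERT INTO sequences VALUES (?, ?, ?, ?)", records)

# Retrieving: all protein entries with their lengths.
rows = conn.execute(
    "SELECT accession, LENGTH(residues) FROM sequences "
    "WHERE molecule = ? ORDER BY accession",
    ("protein",),
).fetchall()
print(rows)  # [('demo_0001', 21)]
```

Even this small schema shows where the difficulties start: "organism" and "molecule" need controlled vocabularies, and integrating with other resources needs stable, shared identifiers.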
Handling Genetic Information (part II)
● Particular attention:
– Origin and Quality of Biological Data
– Data Annotation [Expert-based, (semi-)automatic]
– Interconnectivity
– Friendly to the end-user
Processing Genetic Information
● Analysing biological data
– AIM I: ADDRESSING BIOLOGICAL questions.
● What makes Frodo Baggins (the Hobbit) differ from Spiderman? (consider that Spiderman's kitsch costume is not a valid answer)
● Does molecule A interact with molecule B?
● What is the 3D structure adopted by X?
● How does the 3D structure of a molecule specify its function?
– AIM II: ADDRESSING other SCIENTIFIC or TECHNICAL questions
Processing Genetic Information (part II)
● Other questions???
– Which is the optimal way to store genome data in a database?
– How can I represent sequences belonging in a family with a statistical model?
– How can I obtain the optimal pairwise DNA/RNA/Protein sequence alignment?
– Is there any statistical measure for indicating the significance of a sequence comparison score?
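The pairwise alignment question has a classic dynamic-programming answer for global alignment (the Needleman-Wunsch approach). The Python sketch below is one minimal version of it; the scoring scheme (match +1, mismatch -1, linear gap penalty -1) is an assumption chosen for readability, whereas practical tools use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment by dynamic programming.

    Returns (score, aligned_a, aligned_b); '-' marks a gap.
    """
    n, m = len(a), len(b)

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                     # gap in b
                score[i][j - 1] + gap,                     # gap in a
            )

    # Traceback from the bottom-right corner recovers one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


best, top, bottom = needleman_wunsch("ACGT", "ACT")
print(best)    # 2
print(top)
print(bottom)
```

The table has (n+1) x (m+1) cells, so time and memory grow with the product of the two sequence lengths. The significance question from the same list is commonly approached by comparing the observed score with scores obtained for shuffled sequences, which is not shown here.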
A parenthesis (...) for solving a common misunderstanding
● Traditional biologists often see Bioinformatics as a “Black box”
– i.e., predict, then go back in the lab to confirm with experiment …
● However,
– the computational approach to addressing biological problems is an experimental field on its own
– a single difference: experiments are performed in silico.
And finally ... what do you mean by “Genetic Information”?
● It can be quite generic
– Nucleotide and amino acid sequences
– Three dimensional molecular structures (proteins, DNA, RNA, sugars, drugs, ...)
– Gene expression data
– Molecular interaction networks
– Complex biological systems (cells, tissues, organisms, ...)
– ... even text in the biomedical literature ...
OMICS
● GenOMICS
● TranscriptOMICS
● ProteOMICS
● MetabolOMICS
● KinOMICS
● PhylogenOMICS
● EpitOMICS
even more ...
– BibliOMICS
– DegradOMICS
??? cOMICS ???
Also be aware:
A comprehensive list may be found at the URL http://www.genomicglossaries.com/content/omes.asp
Importantly ...
● Freely available data
● Accessible software [free/open software]
شكر · thanks · આભાર · merci · शुक्रिया · ευχαριστώ