43
1 Molecular Medicine in Collaboration with Bioinformatics M. Kamran Azim, Ph.D. International Center for Chemical and Biological Sciences H.E.J. Research Institute of Chemistry, Dr. Panjwani Center for Molecular Medicine and Drug Research University of Karachi

Pcmd bioinformatics-lecture i

Embed Size (px)

Citation preview

Page 1: Pcmd bioinformatics-lecture i

1

Molecular Medicine in Collaboration with Bioinformatics

M. Kamran Azim, Ph.D.International Center for Chemical and Biological Sciences

H.E.J. Research Institute of Chemistry, Dr. Panjwani Center for Molecular Medicine and Drug Research

University of Karachi

Page 2: Pcmd bioinformatics-lecture i

2

What is Bioinformatics?

Bioinformatics is the science of storing, retrieving and analyzing large amounts of biological information. It cuts across many disciplines, including biology, computer science and mathematics. (as defined by EBI)

Page 3: Pcmd bioinformatics-lecture i

3

Application of Bioinformatics in Molecular Medicine

Molecular basis of pathogenicity; e.g. Amyloid protein in neurodegenerative diseases

Novel targets of therapeutic intervention; e.g. Caspase inhibitors in diseases characterized by tissue degradation

Molecular Diagnostics; e.g. Bird Flu

Host-pathogen interaction; e.g. Bacterial adherence factors

Novel Research tools; e.g. GFP-based techniques

Page 4: Pcmd bioinformatics-lecture i

4

How Bioinformatics can support Molecular Medicine?

Genome-level sequence analysis of medically important organisms in order to;

gain comprehensive knowledge for their life cycle,characterization of disease causing factors,identify new targets for therapeutic intervention

Development of Bioinformatics such as novel algorithms, specialized databases and java-based tools for application in genomics and proteomics.

Page 5: Pcmd bioinformatics-lecture i

5

Catalysts for Bioinformatics

Large-scale DNA/genome sequencing projects have led to an explosion of information concerning the DNA and protein sequence data.

Development in the field of computer technology including the use of computerized databases for storing, retrieving and comparing sequences; computer graphics for displaying and manipulating three-dimensional structures.

Page 6: Pcmd bioinformatics-lecture i

6http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html

The explosion in sequence information

billions of bases from over 100,000 species

Page 7: Pcmd bioinformatics-lecture i

7

Bioinformaticsand

Molecular basis of life

Page 8: Pcmd bioinformatics-lecture i

8

Central paradigms of Molecular Biology and

Bioinformatics

DNARNA

ProteinFunction

Genetic InformationProtein

FunctionCell

TissuesOrganismPopulation

Page 9: Pcmd bioinformatics-lecture i

9

Page 10: Pcmd bioinformatics-lecture i

10

DNA for Information

Protein for Execution

Bioinformatics as the Science of Sequence

Page 11: Pcmd bioinformatics-lecture i

11

Molecular Biology in Urdu poetry

Page 12: Pcmd bioinformatics-lecture i

12

Page 13: Pcmd bioinformatics-lecture i

13

Page 14: Pcmd bioinformatics-lecture i

14

Page 15: Pcmd bioinformatics-lecture i

15

Page 16: Pcmd bioinformatics-lecture i

16

Page 17: Pcmd bioinformatics-lecture i

17

Page 18: Pcmd bioinformatics-lecture i

18

Page 19: Pcmd bioinformatics-lecture i

19

Page 20: Pcmd bioinformatics-lecture i

20

Page 21: Pcmd bioinformatics-lecture i

21

Frederick Sanger and the Science of Sequence

at MRC, Cambridge University First Nobel Prize (1958)

was awarded for developing methods to determine the order (sequence) of the building blocks of the protein, insulin.

Second Nobel Prize (1980) for developing several and ever-improving methods to sequence nucleic acids (DNA and RNA).

Page 22: Pcmd bioinformatics-lecture i

22

Prof. Zafar H. Zaidi and Bioinformatics

Pioneered Protein Chemistry;Protein Sequencing;Sequence analysis(1975-2001)

Initiated Bioinformatics;Protein Structure Prediction,Homology modeling(1991-2001)

Page 23: Pcmd bioinformatics-lecture i

23

Scope of topics Biological databases (utilization, development and

integration etc.) Analyses of nucleotide and protein sequence

information Analyses of 3D structural data of macromolecules. Assessment of how small molecules interact with

macromolecules in biological systems. Studies on networks of protein-protein

interactions Simulation of biological processes More

Page 24: Pcmd bioinformatics-lecture i

24

Scope of topics Biological databases (utilization, development and

integration etc.) Analyses of nucleotide and protein sequence

information Analyses of 3D structural data of macromolecules. Assessment of how small molecules interact with

macromolecules in biological systems. Studies on networks of protein-protein

interactions Simulation of biological processes More

Page 25: Pcmd bioinformatics-lecture i

25

Bioinformatics Resources

Sequence Databases 1960s; The first sequences to be collected were those of proteins by Margaret Dayhoff at the NBRF, Washington, USA. [Protein sequence atlas; PIR]

1970s; First DNA sequences databases were (a) the GenBank at Los Alamos National Labotaroy, New Maxico, USA (b) EMBL at the European Molecular Biology Laboratory at Heidelberg, Germany.

Page 26: Pcmd bioinformatics-lecture i

26

Primary Bioinformatics Databases

DNA sequence databasesGenBank, EMBL and DDBJ

Genome Centers databasesSanger Center, TIGR

Protein sequence DatabasesSwissProt, PIR, UniProt

Protein 3D structure databasesPDB, SCOP, CATH

Specialized databasesMEROPS, Protein Kinase Resource

Page 27: Pcmd bioinformatics-lecture i

27

Accessing Bioinformatics Databases

ENTREZ; a window-based program with a web-based interface developed at the NCBI, USA.

SRS; similar service at the EBI, UK.

Page 28: Pcmd bioinformatics-lecture i

28

Page 29: Pcmd bioinformatics-lecture i

29

Specialized databases useful in Molecular Medicine

OMIM- Online Mendelian Inheritance in Man. This database is a catalog of human genes and genetic disorders.

ENSEMBL- is designed to allow free access to all the genetic information available about the Human Genome.

Human Gene Mutation DB- contains sequences and phenotypes of human disease-causing mutations.

KEGG- to computerize knowledge of molecular interactions namely metabolic pathways, regulatory pathways and molecular assemblies.

dbSNP- Single Nucleotide Polymorphisms DB GeneCards- an integrated DB of human genes that

includes automatically-mined genomic, proteomic and transcriptomic information, as well as orthologies, disease relationships, SNPs, gene expression, gene function etc.

Page 30: Pcmd bioinformatics-lecture i

30

Scope of topics Biological databases (utilization, development and

integration etc.) Analyses of nucleotide and protein sequence

information Analyses of 3D structural data of macromolecules. Assessment of how small molecules interact with

macromolecules in biological systems. Studies on networks of protein-protein

interactions Simulation of biological processes More

Page 31: Pcmd bioinformatics-lecture i

31

Sequence Analysis

Sequence Analysis Programs As more DNA sequences became available

in the late 1970s, interest also increased in developing computer programs to analyze the sequences.

In early 1980s, the Genetics Computer Group (GCG) was started at the University of Wisconsin, USA, offering a set of programs for sequence analysis.

Page 32: Pcmd bioinformatics-lecture i

32

Sequence Analysis

Methods for Comparing Sequences

The Dot Matrix method (DOTPLOT, COMPARE) Dynamic programming matrices Word or k-tuple methods (FASTA, BLAST)

Page 33: Pcmd bioinformatics-lecture i

33

Sequence analysis by DotPlotsK A M R A N

K *A * *M *R *A * *N *

KAMRANKAMRANAlignment

K A M R A N

K *EM *R *A * *N *

KAMRANKEMRAN

Substitution

K A M R A A N

K *EM *R *A * *N *

KAMRAANKEMRA-N

Insertion/deletion

Page 34: Pcmd bioinformatics-lecture i

34

DotDlot analysis; repetitive sequencesK A M R A N K A M R A N

K * *E

M * *R * *A * * * *N * *K * *E

M * *R * *A * * * *N * *

Page 35: Pcmd bioinformatics-lecture i

35

Dynamic Programming for sequence alignment

identity and substitution scoring, gap penalty

Page 36: Pcmd bioinformatics-lecture i

36

Sequence Analysis

Sequence comparison and alignmentPairwise sequence alignment

FASTA; BLASTMultiple sequence alignment

PILEUP; ClustalW Pattern search; PROSITE Phylogenetic analysis; Phylip Genome-level sequence analysis

Page 37: Pcmd bioinformatics-lecture i

37

Pairwise sequence alignment of(a) human and chicken cathepsin B and (b) human and hookworm cathepsin B.

Identical residues are indicated as dark blocks.

Page 38: Pcmd bioinformatics-lecture i

38

Multiple sequence alignment

Page 39: Pcmd bioinformatics-lecture i

39

Multiple sequence alignmentof the family of kunitz-type proteinase inhibitors

Page 40: Pcmd bioinformatics-lecture i

40

Phylogenetic Analysis of kunitz-type proteinase inhibitors based on multiple sequence alignment

Page 41: Pcmd bioinformatics-lecture i

41

Scope of topics Biological databases (utilization, development and

integration etc.) Analyses of nucleotide and protein sequence

information Analyses of 3D structural data of macromolecules.

three dimensional strutures and Structural Bioinformatics

Assessment of how small molecules interact with macromolecules in biological systems.

Studies on networks of protein-protein interactions Simulation of biological processes More

Page 42: Pcmd bioinformatics-lecture i

42

End Note Bioinformatics is the body of Knowledge; A wealth of data on sequences and structures.

Key Resource is KNOWLEDGE

And the key technology is INFORMATION HANDLING

Page 43: Pcmd bioinformatics-lecture i

43

Leading Bioinformatics InstitutionsEuropean Bioinformatics Institute, Cambridge, UKNational Center for Biotechnology Information, USANational Human Genome Research Institute, USAEMBL, Heidelberg, GermanyJ. Craig Ventor Institute, USA[formerly The Institute of Genome Research (TIGR)]The Sanger Institute, UK

Bioinformatics Journals and BooksBioinformaticsGenome ResearchNucleic Acid ResearchBioinformatics by D.W. MountIntroduction to Bioinformatics by AttwoodStructural Bioinformatics by P.E. BourneBioinformatics; A beginner’s Guide by ClaverieBioinformatics Computing by B. Bergeron

Bioinformatics SocietiesInternational Society for Computational Biology (ICSB)Asia Pacific Bioinformatics Network (APBioNet)European Conference on Computational Biology (ECCB)