BIO-TRAC 25 (Proteomics: Principles and Methods)
October 10, 2003
NIH, Bethesda, MD

Zhang-Zhi Hu, M.D.
Senior Bioinformatics Scientist, Protein Information Resource
National Biomedical Research Foundation, GUMC

Tutorial: Bioinformatics Resources

What is Bioinformatics?

NIH Biomedical Information Science and Technology Initiative (BISTI) Working Definition (2002): Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.

Bioinformatics is the application of information technology to the analysis, organization and distribution of biological data in order to answer complex biological questions.

Bioinformatics Resources

The Molecular Biology Database Collection: An Online Compilation of Relevant Database Resources
  2003 update: http://www3.oup.co.uk/nar/database/

Nucleic Acids Research Database Issues (annually in January)
  2003: http://nar.oupjournals.org/content/vol31/issue1/

DBcat: A Catalog of > 500 Biological Databases
  http://www.infobiogen.fr/services/dbcat/

Molecular Biology Database Collection
(http://nar.oupjournals.org/cgi/content/full/31/1/1#GKG120TB1)

The Molecular Biology Database Collection: 2003 update (Baxevanis, A.D.)

An online resource of 386 key databases in 18 categories:

  Major Sequence Repositories
  Comparative Genomics
  Gene Expression
  Gene Identification and Structure
  Genetic and Physical Maps
  Genomic Databases
  Intermolecular Interactions
  Metabolic Pathways and Cellular Regulation
  Mutation Databases
  Pathology
  Protein Sequence Motifs
  Proteome Resources
  Retrieval Systems and Database Structure
  RNA Sequences
  Structure
  Transgenics
  Varied Biomedical Content

Overview

Protein Sequence Analysis
  I. Sequence Similarity Search and Alignment
  II. Family Classification Methods
  III. Structure Prediction Methods

Molecular Biology Databases
  IV. Protein Family Databases
  V. Database of Protein Functions
  VI. Databases of Protein Structures

Proteomic Resources
  VII. 2D-gel databases
  VIII. Proteomic analyses

I. Sequence Similarity Search

Find a protein sequence: text search

Based on Pair-Wise Comparisons
  BLOSUM scoring matrix
  PAM scoring matrix

Dynamic Programming Algorithms
  Global Similarity: Needleman-Wunsch (GAP)
  Local Similarity: Smith-Waterman (BestFit, SSEARCH)

Heuristic Algorithms (Sequence Database Searching)
  FASTA: Based on K-Tuples (2-Amino Acid)
  BLAST: Triples of Conserved Amino Acids
  Gapped-BLAST: Allow Gaps in Segment Pairs (NREF)
  PHI-BLAST: Pattern-Hit Initiated Search (NCBI)
  PSI-BLAST: Iterative Search (NCBI)

Sequence Search by Text or Unique ID

Entrez (http://www.ncbi.nlm.nih.gov/Entrez/)
(http://pir.georgetown.edu/pirwww/search/textsearch.html)

Pair-Wise Comparisons

Scoring matrix
Global and local similarity: Dynamic Programming (Needleman-Wunsch, Smith-Waterman)
(http://www.ebi.ac.uk/emboss/align/)
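The dynamic-programming idea behind both algorithms can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by GAP, BestFit, SSEARCH, or the EMBOSS tools: it uses a flat match/mismatch score and a linear gap penalty in place of a BLOSUM or PAM matrix, the function names and default scores are hypothetical, and it returns only the optimal score without a traceback.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score: both sequences are aligned end to end."""
    # Row 0: aligning a prefix of b against nothing costs one gap per residue.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against nothing
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Local alignment score: best-scoring pair of subsequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

The only difference between the two functions is the boundary condition and the clamp at zero: the global version pays for every unaligned residue, while the local version ignores poorly matching flanks.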

FASTA Search

(http://www.ebi.ac.uk/fasta33/)
(http://pir.georgetown.edu/pirwww/search/fasta.html)
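The k-tuple idea behind FASTA (k = 2 for amino acids, as noted in the Sequence Similarity Search outline) can be illustrated with a short Python sketch. It covers only a simplified version of the first stage, finding shared k-tuples and counting them per diagonal; the rescoring and gapped-alignment stages of the real program are not shown, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def ktuple_index(seq, k=2):
    """Map every k-tuple in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def diagonal_hits(query, target, k=2):
    """Count shared k-tuples per diagonal (query position minus target position)."""
    index = ktuple_index(query, k)
    diagonals = Counter()
    for j in range(len(target) - k + 1):
        # .get avoids creating empty entries in the defaultdict.
        for i in index.get(target[j:j + k], ()):
            diagonals[i - j] += 1
    return diagonals
```

A diagonal with many hits marks a region where the two sequences share a run of identical words, which is a cheap signal for where a full alignment is worth computing.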

Gapped-BLAST Search

(http://pir.georgetown.edu/pirwww/search/pirnref.shtml)
(http://www.ncbi.nlm.nih.gov/BLAST/)

A BLAST Result

PSI-BLAST Iterative Search

(http://www.ncbi.nlm.nih.gov/BLAST/)

PSI-BLAST

II. Family Classification Methods

Multiple Sequence Alignment and Phylogenetic Analysis
  ClustalW Multiple Sequence Alignment
  Alignment Editor & Phylogenetic Trees

Searches Based on Family Information
  PROSITE Pattern Search
  Motif and Profile Search
  Hidden Markov Models (HMMs)

Multiple Sequence Alignment

ClustalW (http://pir.georgetown.edu/pirwww/search/multaln.html)

Alignment Editor (Jalview)

(http://www.ebi.ac.uk/clustalw/)

Alignment Editor (GeneDoc)

(http://www.psc.edu/biomed/genedoc/)

Phylogenetic Analysis

Tree Programs: (http://evolution.genetics.washington.edu/phylip.html)
Tree Searches: (http://pauling.mbu.iisc.ernet.in/~pali/index.html)

Phylogenetic Trees (IGFBP Superfamily)

(Radial Tree)
(Phylogram)

PROSITE Pattern Search

(http://pir.georgetown.edu/pirwww/search/patmatch.html)
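A PROSITE pattern search amounts to matching a small pattern language against a sequence. The Python sketch below shows one way to translate the common pattern elements into a regular expression and scan a sequence with it. It is an illustration only, not the code behind the PIR or ExPASy servers; the function names are hypothetical, and rarer syntax (such as an anchor inside square brackets) is not handled. The example pattern is the N-glycosylation site, N-{P}-[ST]-{P}.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    regex = pattern.rstrip(".").replace("-", "")       # drop terminator and separators
    regex = regex.replace("{", "[^").replace("}", "]")  # {P}  -> [^P]  (any residue but P)
    regex = regex.replace("(", "{").replace(")", "}")   # x(2,4) -> x{2,4} (repeat counts)
    regex = regex.replace("x", ".")                     # x    -> any residue
    return regex.replace("<", "^").replace(">", "$")    # N- and C-terminal anchors


def find_sites(pattern, sequence):
    """Return (1-based start, matched text) for every match, overlaps included."""
    regex = prosite_to_regex(pattern)
    # The lookahead makes overlapping matches visible to finditer.
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(f"(?=({regex}))", sequence)]
```

For example, `find_sites("N-{P}-[ST]-{P}.", "MNGTA")` reports one site starting at residue 2, while the same pattern finds nothing in "MNPTA" because proline follows the asparagine.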

Profile Search

(http://bmerc-www.bu.edu/bioinformatics/profile_request.html)

Hidden Markov Model Search

(http://www.sanger.ac.uk/Software/Pfam/search.shtml)
(http://smart.embl-heidelberg.de)

III. Structural Prediction Methods

Signal Peptide: SIGFIND, SignalP

Transmembrane Helix: TMHMM, TMAP

2D Prediction (α-helix, β-sheet, Coiled-coils): PHD, JPred

3D Modeling: Homology Modeling (Modeller, SWISS-MODEL), Threading, Ab-initio Prediction

Structure Prediction: A Guide

(http://speedy.embl-heidelberg.de/gtsp/flowchart2.html)

Protein Prediction Server

(http://www.cbs.dtu.dk/services/)

Signal Peptide Prediction

(http://www.stepc.gr/~synaptic/sigfind.html)
(http://www.cbs.dtu.dk/services/SignalP-2.0)

Transmembrane Helix

(http://www.cbs.dtu.dk/services/TMHMM/)

Protein Structure Prediction

(http://cmgm.stanford.edu/WWW/www_predict.html)
(http://restools.sdsc.edu/biotools/biotools9.html)

Structure Prediction Server

(http://cubic.bioc.columbia.edu/predictprotein/)
(http://www.compbio.dundee.ac.uk/WWW_Servers/JPred/jpred.html)

3D-Modelling

(http://www.salilab.org/modeller/modeller.html)
(http://www.expasy.ch/swissmod/SWISS-MODEL.html)

IV. Protein Family Databases

Whole Proteins
  PIR: Superfamilies and Families
  COG (Clusters of Orthologous Groups) of Complete Genomes
  ProtoNet: Automated Hierarchical Classification of Proteins

Protein Domains
  Pfam: Alignments and HMM Models of Protein Domains
  SMART: Protein Domain Families

Protein Motifs
  PROSITE: Protein Patterns and Profiles
  BLOCKS: Protein Sequence Motifs and Alignments
  PRINTS: Protein Sequence Motifs and Signatures

Integrated Family Databases
  iProClass: Superfamilies/Families, Domains, Motifs, Rich Links
  InterPro: Integrates Pfam, PRINTS, PROSITE, ProDom, SMART

Protein Clustering

(http://www.ncbi.nlm.nih.gov/COG/)

Protein Domains

Pfam (http://www.sanger.ac.uk/Software/Pfam/)
SMART (http://smart.embl-heidelberg.de/smart/show_motifs.pl)

Protein Motifs

PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles. (http://www.expasy.ch/prosite/)

Integrated Family Classification

InterPro: An integrated resource unifying PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, and PIRSF. (http://www.ebi.ac.uk/interpro/search.html)

V. Databases of Protein Functions

Metabolic Pathways, Enzymes, and Compounds
  Enzyme Classification: Classification and Nomenclature of Enzyme-Catalysed Reactions (EC-IUBMB)
  KEGG (Kyoto Encyclopedia of Genes and Genomes): Metabolic Pathways
  LIGAND (at KEGG): Chemical Compounds, Reactions and Enzymes
  EcoCyc: Encyclopedia of E. coli Genes and Metabolism
  MetaCyc: Metabolic Encyclopedia (Metabolic Pathways)
  WIT: Functional Curation and Metabolic Models
  BRENDA: Enzyme Database
  UM-BBD: Microbial Biocatalytic Reactions and Biodegradation Pathways
  Klotho: Collection and Categorization of Biological Compounds

Cellular Regulation and Gene Networks
  EpoDB: Genes Expressed during Human Erythropoiesis
  BIND: Descriptions of interactions, molecular complexes and pathways
  DIP: Catalogs experimentally determined interactions between proteins
  RegulonDB: Escherichia coli Pathways and Regulation

KEGG Metabolic & Regulatory Pathways

(http://www.genome.ad.jp/dbget-bin/show_pathway?hsa00590+874)

KEGG is a suite of databases and associated software, integrating our current knowledge on molecular interaction networks, the information of genes and proteins, and of chemical compounds and reactions. (http://www.genome.ad.jp/kegg/kegg2.html)

BioCyc (EcoCyc/MetaCyc Metabolic Pathways)

The BioCyc Knowledge Library is a collection of Pathway/Genome Databases (http://biocyc.org/)

Protein-Protein Interactions: DIP

(http://dip.doe-mbi.ucla.edu/)

Protein-Protein Interaction: BIND

(http://www.bind.ca/)

BioCarta Cellular Pathways

(http://www.biocarta.com/index.asp)

VI. Databases of Protein Structures

Protein Structure and Classification
  PDB: Structure Determined by X-ray Crystallography and NMR
  CATH: Hierarchical Classification of Protein Domain Structures
  SCOP: Familial and Structural Protein Relationships
  FSSP: Protein Fold Family Database

Protein Sequence-Structure Relationship
  PIR-NRL3D: Protein Sequence-Structure Database
  PIR-RESID: Protein Structure/Post-Translational Modifications
  HSSP: Families and Alignments of Structurally-Conserved Regions

PDB Structure Data

(http://www.rcsb.org/pdb/)

PDBsum: Summary and Analysis

(http://www.biochem.ucl.ac.uk/bsm/pdbsum)

Protein Structural Classification

CATH: Hierarchical domain classification of protein structures (http://www.biochem.ucl.ac.uk/bsm/cath_new/)

Protein Structural Classification

(http://scop.mrc-lmb.cam.ac.uk/scop/)

The SCOP database aims to provide a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known, including all entries in the PDB.

VII. Proteomic Resources

GELBANK (http://gelbank.anl.gov): 2D-gel patterns from completed genomes
SWISS-2DPAGE (http://www.expasy.org/ch2d/)
PEP: Predictions for Entire Proteomes (http://cubic.bioc.columbia.edu/pep/): Summarized analyses of protein sequences
Proteome BioKnowledge Library (http://www.proteome.com): Detailed information on human, mouse and rat proteomes
Proteome Analysis Database (http://www.ebi.ac.uk/proteome/): Online application of InterPro and CluSTr for the functional classification of proteins in whole genomes
Expression Profiling databases:
  GNF (http://expression.gnf.org/cgi-bin/index.cgi): human and mouse transcriptome
  SMD (http://genome-www5.stanford.edu/MicroArray/SMD/): Stanford microarray data analysis
  EBI Microarray Informatics (http://www.ebi.ac.uk/microarray/index.html): managing, storing and analyzing microarray data

2D-Gel Image Databases (1)

(http://gelbank.anl.gov/2dgels/index.asp)

2D-Gel Image Databases (2)

(http://us.expasy.org/ch2d/2d-index.html)
(http://us.expasy.org/cgi-bin/nice2dpage.pl?P06493)

VIII. Proteome Analysis

(http://www.ebi.ac.uk/proteome)

Expression Profiling

Human and Mouse Transcriptome
(http://expression.gnf.org/cgi-bin/index.cgi)
(http://genome-www.stanford.edu/serum/)

Lab: Visit selected websites and analyze some protein sequences of your own choice.
 - A list of the bioinformatics resources in this tutorial is available at: http://pir.georgetown.edu/~huz/bioinfo_resource.html

Try some of the following sequences for analysis:
 1) Well-characterized proteins: PIR:A26366 (CYP17), JS0747 (Sp1)
 2) Less-characterized proteins: PIR:A59000 (MATER), TrEMBL:Q9QY16 (GRTH)
 3) Hypothetical proteins: PIR:T12515, T00338, T47130; SWISS-PROT:Q9BWT7