Data Retrieval
Access to Distributed Data
Biological data is widely distributed over the WWW.
Data can be retrieved by:
1. Search engines
2. Data retrieval tools
Search Engines
Examples of search engines: Google, Yahoo! Search, LeapFish, Bing
Using search engines:
1. Can find relevant web pages.
2. It is difficult to find the desired information.
3. It is difficult to find specific information.
Data Retrieval Tools
Dedicated to accessing information for molecular biologists.
Most widely used are:
1. Entrez
2. DBGET
3. SRS
Each of these allows:
- Text-based searching of a number of linked DBs.
- Sequence searching.
They differ in:
- The DBs they cover.
- How the retrieved information is accessed and presented.
Entrez
- WWW-based data retrieval system.
- Developed by NCBI (National Center for Biotechnology Information).
- Integrates information held in different DBs.
Databases covered by Entrez:
- Nucleic acid: GenBank, RefSeq, PDB
- Protein seqs: SWISS-PROT, PIR
- 3D structures: MMDB
- Genomes: Many sources
- PopSet: From GenBank
- OMIM: OMIM
- Taxonomy: NCBI taxonomy database
- Books: Bookshelf
- ProbeSet: GEO (Gene Expression Omnibus)
- Literature: PubMed
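Text searching across several Entrez databases can be sketched programmatically. The example below is a minimal sketch, assuming access through NCBI's E-utilities ESearch endpoint (the slides do not mention it); the helper name `build_esearch_url` and the sample query are illustrative assumptions. It only builds the query URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL for a text query against one Entrez database.

    `build_esearch_url` is a hypothetical helper, not part of any NCBI library.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


# The same text query pointed at three Entrez databases (sample query only).
query = "BRCA1 AND human[Organism]"
urls = {db: build_esearch_url(db, query) for db in ("pubmed", "protein", "nucleotide")}

for db, url in urls.items():
    print(db, url)
```

Fetching one of these URLs (for example with `urllib.request.urlopen`) would return the matching record identifiers; only the `db` parameter changes between databases, which mirrors how Entrez presents one search interface over many linked DBs.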
DBGET
An integrated data retrieval system developed and maintained by:
- The Institute for Chemical Research (Kyoto University)
- The Human Genome Center (University of Tokyo)
Databases covered are:
- Nucleic acid seqs: GenBank, EMBL
- Protein seqs: SWISS-PROT, PIR
- 3D structures: PDB
- Seq motifs: PROSITE
- Enzyme reactions: LIGAND
- Literature: LITDB, Medline, etc.
SRS
SRS - Sequence Retrieval System
- Data retrieval tool developed by EBI
- Integrates 80 molecular biology DBs
- Open-source software (can be installed locally)
SRS has an associated scripting language called Icarus
Genomics
What is Genomics?
The study of genomes.
In addition to the coding regions (genes), genomics covers:
- Control elements
- Introns and exons
- Gene clusters
- Elements common to all chromosomes
- Episomal elements
Benefits of Genomics
Genome sequencing helps in:
- Identifying new genes (gene discovery)
- Looking at chromosome organization and structure
- Finding gene regulatory seqs
- Comparative genomics
These in turn lead to advances in:
- Medicine
- Agriculture
- Animal husbandry
- Biotech
- Evolution
Branches of Genomics
1. Structural Genomics - Building genomic maps, 3D structures.
2. Functional Genomics - Transcriptomics, Proteomics, Metabolomics, Enzymes.
3. Comparative Genomics - Population distribution and phenotypic associations.
4. Evolutionary Genomics - Phylogenetic relationships.
5. Pharmacogenomics - Interaction of drugs with genomes, drug discovery.
Tools Required for Genomics
- Robotics - Sequencing
- Statistics - Software
- High-throughput assays - Microarrays
- High-speed computing - Database work
- Bioinformatics - Algorithms, graphics
Proteomics
- The proteome is the protein complement of the genome.
- Proteomics is the study of proteomes.
- Human genome = 30,000 to 60,000 genes
- Human proteome = 300,000 to 1,200,000 proteins
Reasons for Proteome > Genome:
- Multiple ORFs
- PTM (post-translational modification)
- Internal peptide products
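The gene and protein estimates above imply a range for the average number of protein forms per gene. The short calculation below is a sketch that uses only the slide's own figures (30,000 to 60,000 genes; 300,000 to 1,200,000 proteins); the variable names are illustrative.

```python
# Estimates taken from the slide (not independently verified).
genes_low, genes_high = 30_000, 60_000
proteins_low, proteins_high = 300_000, 1_200_000

# Fewest proteins spread over the most genes gives the lower bound;
# most proteins over the fewest genes gives the upper bound.
ratio_min = proteins_low / genes_high
ratio_max = proteins_high / genes_low

print(f"Protein forms per gene: {ratio_min:.0f} to {ratio_max:.0f}")
```

Under these estimates each gene corresponds on average to between 5 and 40 protein forms, which is the gap that multiple ORFs, PTM, and internal peptide products are listed to explain.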
Goal: Identify all the proteins expressed by a cell or tissue.
Why study proteomics?
- mRNA levels do not always correlate with the levels of expressed proteins.
- Some samples (serum, urine) cannot be used for mRNA studies.
- PTM cannot be detected from mRNA.
- The location of proteins cannot be known from mRNA.
Specialized Proteomics
1. Expression proteomics
2. Cell map proteomics
3. PTM
4. Protein-protein interactions
5. Protein-ligand interactions
6. Protein structure
Proteomics Approach
1. Separate proteins using 2D electrophoresis.
2. Stain the gel.
3. Excise spots of interest.
4. Digest with trypsin.
5. Characterize peptides by MS (MALDI-TOF).
6. Compare peptide seqs with a database of seqs.
7. Identify the class of proteins.
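The digestion and database comparison steps can be sketched in code. The example below is a simplified illustration, not the method of any specific search tool: it assumes the common rule that trypsin cleaves after K or R unless the next residue is P, uses standard monoisotopic residue masses, compares neutral peptide masses (real MALDI-TOF spectra report ions such as [M+H]+), and works on a two-entry toy database with made-up sequences and hypothetical function names.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a free peptide


def trypsin_digest(sequence):
    """Split a protein after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide, if any
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed_masses, sequence, tol=0.01):
    """Count observed masses that fall within tol of a theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in trypsin_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed_masses)


# Toy database and observed masses (hypothetical data for illustration only).
database = {"protein_1": "AKRPGR", "protein_2": "GGKAAR"}
observed = [260.148, 316.186]

best = max(database, key=lambda name: count_matches(observed, database[name]))
print(trypsin_digest("AKRPGR"))
print("Best match:", best)
```

Ranking database entries by the number of matching peptide masses is the basic idea behind peptide mass fingerprinting; production tools add scoring statistics, missed cleavages, and modifications.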
Methods to Study Protein-Protein Interactions
1. Yeast two-hybrid
2. AP-MS (affinity purification-MS)
Protein microarrays can use immobilized proteins, peptides, carbohydrates, antibodies, or small molecules to study other interactions.
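Results from yeast two-hybrid or AP-MS experiments are typically lists of interacting pairs, which can be assembled into an interaction network. The sketch below is illustrative only: the protein names are placeholders, and `build_network` and `find_hubs` are hypothetical helpers, not functions from any proteomics package.

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected interaction map from (protein_a, protein_b) pairs."""
    network = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue  # ignore self-interactions in this simple sketch
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def find_hubs(network, min_partners=3):
    """Return proteins with at least min_partners interaction partners."""
    return sorted(p for p, partners in network.items() if len(partners) >= min_partners)


# Placeholder bait/prey pairs (made-up names, not real experimental data).
pairs = [
    ("BaitA", "PreyB"),
    ("BaitA", "PreyC"),
    ("PreyB", "PreyC"),
    ("BaitD", "PreyC"),
]

network = build_network(pairs)
print(find_hubs(network))
```

Highly connected proteins ("hubs") are one simple feature that can be read off such a network once the pairwise interactions have been collected.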
Applications of Proteomics
1. Protein mining
2. Differential expression profiling
3. Network mapping
4. Study of protein modifications