Sequence Analysis with Artemis & Artemis Comparison Tool (ACT)
South East Asian Training Course on Bioinformatics Applied to Tropical Diseases - 2005
(Sponsored by UNDP/World Bank/WHO/TDR)
International Centre for Genetic Engineering and Biotechnology, New Delhi, India
Workshop Overview
Overview of genome sequencing and sequence analysis.
Demonstration of Artemis.
Hands-on guided exercise in Artemis.
Demonstration of ACT.
Hands-on guided exercise in ACT.
Generating ACT comparison files.
Wellcome Trust Photo Library
The Wellcome Trust Sanger Institute
• Funded by The Wellcome Trust, a registered charity.
• Established in 1993 to begin the Human Genome Project.
• First draft (2000), complete (2003-4).
Data release policy:
All sequence data is released immediately and is freely available via the internet in order to maximise its benefit for research.
http://www.sanger.ac.uk
ftp://ftp.sanger.ac.uk/
Generating the complete genome sequence
Infrastructure
Levels of automation
Colony picking robots
Plasmid prep robots
TOTAL:140
ABI3700
ABI3730
Automated sequencing
Each ABI reads 96 DNA sequences at once.
The machines are run 10 times a day, 7 days a week.
Throughput is 1,200 to 1,300 96-well plates per day, or approximately 120,000 DNA samples read each day.
Each day, the Sanger Institute reads 60 million base pairs. That is equal to one of the smaller human chromosomes and many times the size of an average bacterial genome.
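The throughput figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted on these slides. Two things are assumptions rather than statements from the slides: that "TOTAL: 140" is the number of ABI machines, and that dividing the daily base count by the daily sample count gives a meaningful average read length.

```python
# Figures quoted on the slides.
WELLS_PER_PLATE = 96               # each ABI run reads one 96-well plate
PLATES_PER_DAY = (1_200, 1_300)    # quoted daily throughput range
BASES_PER_DAY = 60_000_000         # quoted daily base-pair output
RUNS_PER_MACHINE_PER_DAY = 10

# Assumption: "TOTAL: 140" on the infrastructure slide is the machine count.
MACHINES = 140


def samples_per_day(plates: int) -> int:
    """Number of DNA samples read when `plates` 96-well plates are run."""
    return plates * WELLS_PER_PLATE


low, high = (samples_per_day(p) for p in PLATES_PER_DAY)
midpoint = samples_per_day(sum(PLATES_PER_DAY) // 2)   # 1,250 plates

# Upper bound if every machine completed every run (assumed machine count).
capacity = MACHINES * RUNS_PER_MACHINE_PER_DAY * WELLS_PER_PLATE

# Derived, not quoted: average bases obtained per sample.
implied_read_length = BASES_PER_DAY / midpoint

print(f"samples per day: {low:,} to {high:,} (midpoint {midpoint:,})")
print(f"theoretical capacity: {capacity:,} samples per day")
print(f"implied average read length: {implied_read_length:.0f} bp")
```

The midpoint of the quoted plate range reproduces the figure of roughly 120,000 samples per day, which stays below the theoretical capacity under the assumed machine count.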
Pathogen Sequencing Unit
http://www.sanger.ac.uk/Projects/Microbes
Bacteria: M. tuberculosis, M. leprae, Y. pestis, S. typhi, C. diphtheriae, Bordetella spp. (x3), B. pseudomallei, S. aureus MRSA, S. aureus MSSA, E. carotovora
Yeasts and Fungi: Saccharomyces cerevisiae, Schizosaccharomyces pombe, Aspergillus fumigatus, Candida dubliniensis, Candida parapsilosis
Protozoa: Plasmodium falciparum (x3), Plasmodium spp. (x5), Leishmania spp., Trypanosoma spp., Eimeria, Theileria, Babesia
The Pathogen Group is funded by the Beowulf Genomics Initiative to sequence the genomes of a wide range of small eukaryotes and microbes.
Sequencing strategy and assembly
Shotgun sequencing – strategy
Figure labels: DNA; pUC clone end sequence; contiguous sequence; physical gap; sequence gap
‘Draft sequence’: 95% coverage, 4-5x depth. Order of contigs?
‘A genome in a day’, ‘15 in a month’, ‘High-quality draft sequence’
Shotgun sequencing – strategy
Figure labels: DNA; pUC clone end sequence; large clone end sequence; contiguous sequence; physical gap; sequence gap
Finished sequence: 100% coverage, 10x depth.
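To make the relation between depth and coverage concrete, here is a minimal sketch using the idealised Lander-Waterman model, which is not named on the slides and is brought in here only for illustration. It assumes reads land uniformly at random on the genome. The genome size (4.6 Mb) and read length (500 bp) are hypothetical values chosen for the example.

```python
import math


def depth(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average sequencing depth c = N * L / G."""
    return n_reads * read_length / genome_size


def expected_fraction_covered(c: float) -> float:
    """Idealised Lander-Waterman estimate: a base is missed with probability e^-c."""
    return 1.0 - math.exp(-c)


def reads_needed(target_depth: float, read_length: int, genome_size: int) -> int:
    """Number of reads required to reach a target average depth."""
    return math.ceil(target_depth * genome_size / read_length)


# Hypothetical example values (not taken from the slides).
GENOME_SIZE = 4_600_000   # bp, roughly the scale of a bacterial genome
READ_LENGTH = 500         # bp

for target in (4.5, 10):
    n = reads_needed(target, READ_LENGTH, GENOME_SIZE)
    frac = expected_fraction_covered(depth(n, READ_LENGTH, GENOME_SIZE))
    print(f"{target}x depth: {n:,} reads, expected coverage {frac:.3%}")
```

The idealised model predicts higher coverage at 4-5x than the 95% quoted for a draft. Real projects tend to fall short of the ideal, for example where regions clone poorly or are repetitive, which is why finishing requires additional directed work.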
Repeats!!!
Shotgun assembly - Yersinia pestis
Primary DNA sequence
Dotter, BlastN, BlastX, gene finders, tRNA scan
Repeats, pseudogenes, rRNA, genes, tRNA
Manual curation
Fasta, BlastP, Pfam, Prosite, Psort, SignalP, TMHMM
Manual curation
Annotated sequence
PSU Projects
Organism
Annotated genome
Finished genome
Database entry
Artemis
• Sequence viewer and analysis tool
– Visualization of sequence features
  • DNA
  • Six frame translation
– Perform and view analysis
  • Basic analysis
  • Launch more complex analysis and searches
  • Import and view the results of other searches
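As an illustration of what a six frame translation is, the following sketch translates a DNA sequence in three forward and three reverse-complement reading frames using the standard genetic code. It is not Artemis code. The function names and the frame labels (+1 to +3, -1 to -3) are hypothetical, and Artemis may number reverse frames differently.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; '*' marks a stop, 'X' an unrecognised codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> dict:
    """Translate all six reading frames, keyed by a hypothetical frame label."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


for name, protein in six_frame_translation("ATGGCCTAA").items():
    print(name, protein)
```

A viewer such as Artemis displays all six frames at once so that open reading frames on either strand can be spotted by eye as stretches free of stop codons.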
Outline of Artemis demonstration
• Artemis window features
• Open a genome sequence
• Changing the view
• Getting around
– Goto Menu
– Navigator
– Feature Selector
• Basic analysis
– Edit a feature
– Fasta search
– Show feature plots
Artemis window
Drop Down Menus
Entry Button Line
Main Sequence View Panel
Magnified Sequence View Panel
Feature Menu
Sliders
Curating gene models in Artemis: use of multiple lines of evidence
Curating gene models in Artemis: use of FASTA evidence
EST sequencing & mapping
Figure labels: 5’UTR, M, exon, intron, stop, 3’UTR; mRNA (CAP, AAAAAAAAAA); cDNA (TTTTTTTTT); ESTs
Curating gene models in Artemis: use of EST evidence
Curation of gene models in Artemis: mapping proteome fragments to genome
Curation and annotation in Artemis: mapping InterPro domain hits to genome
Annotation of pathogen genomes at the PSU (using Artemis)
Finished sequence
– Gene finders: GeneFinder, PHAT, Glimmer, Orpheus
– Similarity searches: FASTA, BLAST
– EST
Primary gene model
– InterPro scan: HMMPfam, HMMSmart, PRINTS, PROSITE, ProDom, TIGRFAMs
– SignalP
– TMHMM
– tRNA scan
– Manual curation
Refined gene model
– Gene model annotation
– Gene function
– Organism-specific gene families
– Functional classification (GO / Riley)
– Comparative genomics (using ACT)
Complete annotation
Top tips!
Manual annotation.
Use several lines of evidence:
- Run several available gene finding programs
- Search programs: local (BLAST) and global (FASTA) alignments
- Protein domains and motifs: InterPro (Pfam, PROSITE, SMART etc.)
- Transmembrane / signal peptide prediction (TMHMM, SignalP)
- Base your annotation on characterised proteins where possible (e.g. UniProt entry)
- Read the literature (PubMed entry)
Sanger Front page