Tools and Datasets
Exploring the tools of the trade
Sequence Databases
● Understanding EMBL Entries
● Understanding SWISS-PROT Entries
Understanding EMBL Entries
Understanding SWISS-PROT Entries
General Concepts and Methods
● Predictions and Validation
Maxim 17.1
Recognise the difference between the validation of a model and the testing of it for
self-consistency
True/False/Negative/Positive
Maxim 17.2
Generally, False Negative predictions are considered more acceptable than False
Positives
Assessment/Validation Procedure and Possible Outcomes
[Figure: figOUTCOME.eps]
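The four possible outcomes of an assessment can be tallied with a few lines of code. The following is a minimal sketch, not taken from the slides: the function names and the sample data are made up for illustration, and sensitivity and specificity are the standard summary measures rather than something the slides define.

```python
def tally_outcomes(predicted, actual):
    """Count TP, FP, TN and FN for paired predictions and known answers."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, a in zip(predicted, actual):
        if p and a:
            counts["TP"] += 1      # predicted positive, really positive
        elif p and not a:
            counts["FP"] += 1      # predicted positive, really negative
        elif not p and a:
            counts["FN"] += 1      # predicted negative, really positive
        else:
            counts["TN"] += 1      # predicted negative, really negative
    return counts


def sensitivity(c):
    """Fraction of real positives that were found."""
    return c["TP"] / (c["TP"] + c["FN"])


def specificity(c):
    """Fraction of real negatives that were correctly rejected."""
    return c["TN"] / (c["TN"] + c["FP"])


# Made-up predictions and known answers (illustration only)
predicted = [True, True, False, False, True, False]
actual = [True, False, False, True, True, False]

counts = tally_outcomes(predicted, actual)
print(counts)  # {'TP': 2, 'FP': 1, 'TN': 2, 'FN': 1}
print(sensitivity(counts), specificity(counts))
```

Validation in this sense requires answers that are known independently of the model; running the same tally against the model's own training data only tests self-consistency.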
Balancing the errors
Maxim 17.3
False Negatives can be recovered later: we could come back next year and find the ones we missed. False Positives cost us now: we can waste time studying them this year,
only to find out that the time was wasted. That is why False Negatives are usually preferred, but it all depends on
the circumstances
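One way to see the balance is to move a score cutoff and watch the two error types trade against each other. This is a rough sketch under stated assumptions: the scores and their true labels are invented, and `errors_at` is a hypothetical helper, not part of any tool named in these slides.

```python
# Hypothetical (score, is_real) pairs: invented data for illustration
scored = [
    (0.95, True), (0.80, True), (0.70, False), (0.60, True),
    (0.40, False), (0.30, True), (0.10, False),
]


def errors_at(cutoff, scored):
    """Return (false_positives, false_negatives) when score >= cutoff means 'positive'."""
    fp = sum(1 for score, real in scored if score >= cutoff and not real)
    fn = sum(1 for score, real in scored if score < cutoff and real)
    return fp, fn


for cutoff in (0.9, 0.5, 0.2):
    fp, fn = errors_at(cutoff, scored)
    print(f"cutoff={cutoff}: false positives={fp}, false negatives={fn}")
```

With this data a strict cutoff of 0.9 gives no False Positives but three False Negatives, while a lenient cutoff of 0.2 gives the reverse: two False Positives and no False Negatives.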
Maxim 17.4
Sometimes all those false positives are maybe, just maybe, trying to tell you something. So, if
you aspire to a Nobel prize ...
Using multiple algorithms to improve performance
Maxim 17.5
Use a fast if inaccurate algorithm to protect your slow, accurate second-stage algorithm
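The idea behind this maxim can be sketched as a two-stage pipeline. Everything here is an assumption made for illustration: `fast_screen` and `slow_confirm` are toy stand-ins for a cheap, permissive test and an expensive, accurate one, and the sequences are invented.

```python
def fast_screen(seq):
    """Cheap, permissive test: lets some false positives through."""
    return "GG" in seq


def slow_confirm(seq):
    """Toy stand-in for an expensive, accurate test."""
    return seq.count("GG") >= 2 and len(seq) >= 8


def two_stage(candidates, screen, confirm):
    """Run the slow test only on candidates that survive the fast one."""
    survivors = [c for c in candidates if screen(c)]
    hits = [c for c in survivors if confirm(c)]
    # len(survivors) is the number of times the slow test had to run
    return hits, len(survivors)


# Invented candidate sequences
candidates = ["ATGGCCGGTA", "ATATATAT", "GGAT", "CCGGAAGGTT", "TTTTCCCC"]

hits, slow_calls = two_stage(candidates, fast_screen, slow_confirm)
print(hits)        # ['ATGGCCGGTA', 'CCGGAAGGTT']
print(slow_calls)  # 3 slow calls instead of 5
```

The first stage is tuned to tolerate False Positives, because the second stage removes them; a False Negative at the first stage is lost for good.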
An overview of tRNA: 2D, 3D and Gene Structure
[Figure: figTRNA.eps]
http://www.ncbi.nlm.nih.gov/Education/
Introducing Bioinformatics Tools
http://www-igbmc.u-strasbg.fr/BioInfo/
ftp://ftp.ebi.ac.uk/pub/software
ClustalW
ClustalX operating under Windows XP
[Figure: figCLUSTALX.eps]
$ gzip -d clustalw1.83.UNIX.tar.gz
$ tar -xvf clustalw1.83.UNIX.tar
$ cd clustalw1.83
$ make
$ ./clustalw
$ ./clustalw -h
$ ./clustalw -INFILE=../MerAHMAs_MerP.swp -OUTFILE=../Mer.aln
Algorithms and Methods
Substitution/scoring matrices
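A substitution matrix assigns a score to every pair of aligned residues, and an ungapped alignment is scored by summing those pair scores. The sketch below uses a toy nucleotide matrix whose values (match 2, transition -1, transversion -2) are chosen for illustration only; they are not taken from the slides or from any published matrix.

```python
PURINES = {"A", "G"}


def pair_score(a, b):
    """Toy scores: match 2, transition -1, transversion -2 (illustrative values)."""
    if a == b:
        return 2
    if (a in PURINES) == (b in PURINES):
        return -1  # transition: purine<->purine or pyrimidine<->pyrimidine
    return -2      # transversion: purine<->pyrimidine


def build_matrix(alphabet="ACGT"):
    """Build the full lookup table keyed by residue pair."""
    return {(a, b): pair_score(a, b) for a in alphabet for b in alphabet}


def score_alignment(s1, s2, matrix):
    """Sum the pair scores of two equal-length, ungapped sequences."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have the same length")
    return sum(matrix[(a, b)] for a, b in zip(s1, s2))


matrix = build_matrix()
print(score_alignment("GATTACA", "GACTATA", matrix))  # 8
print(score_alignment("GATTACA", "GATTACA", matrix))  # 14
```

Protein matrices work the same way with a 20-letter alphabet; only the table of pair scores changes.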
BLAST
Maxim 17.6
Exactly which BLAST is best depends on the circumstances
$ cd
$ mkdir blast
$ cp blast-2.2.6-ia32-linux.tar.gz blast
$ cd blast
$ gzip -d blast-2.2.6-ia32-linux.tar.gz
$ tar -xvf blast-2.2.6-ia32-linux.tar
Contents of the .ncbirc file in the home directory:
[NCBI]
Data="/home/michael/blast/data"
Installing NCBI-BLAST
$ mkdir databases
$ cd databases
$ mv ../All_Mer_Proteins.fsa .
$ ../formatdb -i All_Mer_Proteins.fsa -p T -o T -n Merproteins
$ blastall -p blastp -d databases/Merproteins -i test_seq.fsa
$ sed 's/sw|/sp|/' All_Mer_Proteins.fsa > Mer_db.prot
$ ../formatdb -i Mer_db.prot -p T -o T -n Merproteins
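The `sed` step above rewrites the `sw|` database tag to `sp|` before the database is reformatted. The same relabelling can be sketched in Python. This is an illustrative equivalent, not the slides' method: `relabel_headers` is a hypothetical helper, and the accession and sequence in the sample record are placeholders. Unlike the `sed` command, which substitutes on every line, this version touches only the header lines that begin with `>`.

```python
def relabel_headers(lines, old="sw|", new="sp|"):
    """Replace the first occurrence of `old` with `new` on FASTA header lines only."""
    relabelled = []
    for line in lines:
        if line.startswith(">"):
            line = line.replace(old, new, 1)
        relabelled.append(line)
    return relabelled


# Placeholder record: the accession and the sequence are invented
fasta = [
    ">sw|X00000|MERA_SHIFL placeholder description",
    "MSTLKITGMTCDSC",
]

for line in relabel_headers(fasta):
    print(line)
```

Because sequence lines never contain a `|` character, both approaches give the same result on a well-formed FASTA file.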
Preparation of database files for faster searching
$ fastacmd -d databases/Merproteins -I
$ fastacmd -d databases/Merproteins -s MERA_SHIFL
$ blastclust -d databases/Merproteins | head
The different types of BLAST search
Where To From Here