Tools and Datasets

Exploring the tools of the trade

Sequence Databases

● Understanding EMBL Entries

● Understanding SWISS-PROT Entries

Understanding EMBL Entries

Understanding SWISS-PROT Entries

General Concepts and Methods

● Predictions and Validation

Maxim 17.1

Recognise the difference between the validation of a model and the testing of it for self-consistency.

True/False/Negative/Positive
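
The four outcomes named in this slide title can be tallied by comparing each prediction with the known answer. The following is a minimal sketch, not part of the original slides: the function name `tally_outcomes` and the example values are assumptions made for illustration.

```python
def tally_outcomes(predicted, actual):
    """Count TP, FP, TN and FN for paired boolean predictions and known answers."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, a in zip(predicted, actual):
        if p and a:
            counts["TP"] += 1      # predicted positive, really positive
        elif p and not a:
            counts["FP"] += 1      # predicted positive, really negative
        elif not p and a:
            counts["FN"] += 1      # predicted negative, really positive
        else:
            counts["TN"] += 1      # predicted negative, really negative
    return counts


# Hypothetical example data (invented for illustration)
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
print(tally_outcomes(predicted, actual))
```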

Maxim 17.2

Generally, False Negative predictions are considered more acceptable than False Positives.

Assessment/Validation Procedure and Possible Outcomes

figOUTCOME.eps

Balancing the errors
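
One common way to see the balance is to move a score cut-off: a stricter cut-off removes False Positives but creates False Negatives, and a looser one does the reverse. The sketch below assumes a predictor that emits a numeric score per item. That assumption, the function name `errors_at_threshold`, and the example scores are all invented for illustration and do not come from the slides.

```python
def errors_at_threshold(scores, actual, threshold):
    """Return (false_positives, false_negatives) when score >= threshold means 'positive'."""
    if len(scores) != len(actual):
        raise ValueError("scores and actual must have the same length")
    false_pos = sum(1 for s, a in zip(scores, actual) if s >= threshold and not a)
    false_neg = sum(1 for s, a in zip(scores, actual) if s < threshold and a)
    return false_pos, false_neg


# Hypothetical scores and known answers (invented for illustration)
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
actual = [True, True, False, True, False, False]

for threshold in (0.2, 0.5, 0.85):
    fp, fn = errors_at_threshold(scores, actual, threshold)
    print(f"threshold={threshold}: FP={fp} FN={fn}")
```

With these made-up numbers, raising the threshold from 0.2 to 0.85 moves the errors from two False Positives and no False Negatives to no False Positives and two False Negatives.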

Maxim 17.3

False Negatives can be recovered later: we could come back next year and find the ones we missed. False Positives cost us now: we can spend this year studying them, only to find out that the time was wasted. That is why False Negatives are usually preferred, but it all depends on the circumstances.

Maxim 17.4

Sometimes all those false positives are maybe, just maybe, trying to tell you something. So, if you aspire to a Nobel prize ...

Using multiple algorithms to improve performance

Maxim 17.5

Use a fast if inaccurate algorithm to protect your slow, accurate second-stage algorithm
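
A minimal sketch of the two-stage idea in this maxim, under stated assumptions: the fast stage is a shared k-mer check and the slow stage is an edit-distance calculation. Neither choice comes from the slides, and the function names and sequences are invented for illustration. The point is only that the cheap filter discards most candidates before the expensive comparison runs.

```python
def shares_kmer(query, candidate, k=3):
    """Fast, crude filter: does the candidate contain any k-mer of the query?"""
    kmers = {query[i:i + k] for i in range(len(query) - k + 1)}
    return any(candidate[i:i + k] in kmers for i in range(len(candidate) - k + 1))


def edit_distance(a, b):
    """Slow, exact stage: Levenshtein distance by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # match or substitution
        prev = curr
    return prev[-1]


def two_stage_search(query, candidates, max_distance=2, k=3):
    """Run the slow stage only on candidates that pass the fast filter."""
    survivors = [c for c in candidates if shares_kmer(query, c, k)]
    hits = [c for c in survivors if edit_distance(query, c) <= max_distance]
    return hits, len(survivors)


# Hypothetical sequences (invented for illustration)
candidates = ["MKTLLVAG", "MKTLIVAG", "QQQQQQQQ", "MKTAAAAA", "GGGGMKTL"]
hits, compared = two_stage_search("MKTLLVAG", candidates)
print(hits, compared)
```

The filter here can itself produce False Negatives, since a true match that shares no exact k-mer is dropped before the accurate stage ever sees it. That is the price of the speed-up.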

An overview of tRNA: 2D, 3D and Gene Structure

figTRNA.eps

Introducing Bioinformatics Tools

http://www.ncbi.nlm.nih.gov/Education/

ClustalW

http://www-igbmc.u-strasbg.fr/BioInfo/

ftp://ftp.ebi.ac.uk/pub/software

ClustalX operating under Windows XP

figCLUSTALX.eps

$ gzip -d clustalw1.83.UNIX.tar.gz

$ tar -xvf clustalw1.83.UNIX.tar

$ cd clustalw1.83

$ make

$ ./clustalw

$ ./clustalw -h

$ ./clustalw -INFILE=../MerAHMAs_MerP.swp -OUTFILE=../Mer.aln

Algorithms and Methods

Substitution/scoring matrices

BLAST

Maxim 17.6

Exactly which BLAST is best depends on the circumstances

Installing NCBI-BLAST

$ cd

$ mkdir blast

$ cp blast-2.2.6-ia32-linux.tar.gz blast

$ cd blast

$ gzip -d blast-2.2.6-ia32-linux.tar.gz

$ tar -xvf blast-2.2.6-ia32-linux.tar

[NCBI]

Data="/home/michael/blast/data"

Preparation of database files for faster searching

$ mkdir databases

$ cd databases

$ mv ../All_Mer_Proteins.fsa .

$ ../formatdb -i All_Mer_Proteins.fsa -p T -o T -n Merproteins

$ blastall -p blastp -d databases/Merproteins -i test_seq.fsa

$ sed 's/sw|/sp|/' All_Mer_Proteins.fsa > Mer_db.prot

$ ../formatdb -i Mer_db.prot -p T -o T -n Merproteins
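
The sed command on this slide rewrites the `sw|` tag in each sequence header to `sp|`, the tag NCBI tools use for SWISS-PROT identifiers, presumably so that formatdb's `-o T` option can parse them. The same step can be sketched in Python. This is an illustrative equivalent, not the slide's method: the function name `relabel_headers` and the accession `X00000` are invented, and unlike the sed one-liner the sketch touches only lines that begin with `>`.

```python
def relabel_headers(lines, old="sw|", new="sp|"):
    """Replace the first `old` tag with `new` on FASTA header lines only."""
    return [
        line.replace(old, new, 1) if line.startswith(">") else line
        for line in lines
    ]


# Hypothetical record: X00000 is a placeholder accession, not a real entry
record = [">sw|X00000|MERA_SHIFL", "MKTLLVAG"]
print(relabel_headers(record))
```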

$ fastacmd -d databases/Merproteins -I

$ fastacmd -d databases/Merproteins -s MERA_SHIFL

$ blastclust -d databases/Merproteins | head

The different types of BLAST search

Where To From Here