GCG vs EMBOSS Gary Williams

Which is better GCG or EMBOSS?

You must decide for yourselves
You may find other packages that do what you want
Use the tools that do the job
This is a comparison of GCG and EMBOSS to help you decide

Interfaces

Web
  W2H available for both
  EMBOSS W2H still has rough edges
  PISE
  Others under development

X-Windows
  GCG - Seqlab
  EMBOSS - SPIN (+ others coming)

Telnet/xterm/Character-based
  emnu

Command line is very similar

The UNIX command line interfaces of GCG and EMBOSS are very similar.

You type the name of the program
You can add any options you want to the command-line
Press the RETURN key
Any mandatory information that was not on the command-line will be prompted for.

GCG command-line

% name -other=thing

This is the name program that reads a sequence and writes out something.

NAME what sequence ? embl:hsfau1

Begin (* 1 *) ?

End (* 2016 *) ?

Reverse (* No *) ?

What should I call the output (* hsfau.name *) ?

EMBOSS command-line

% name -other thing

Reads in sequences and writes a thing

Input sequence(s): embl:hsfau1

Output data [hsfau1.name]:

Use ‘-sask’ to make EMBOSS programs prompt for the start and end of sequences

Some common options

Running in scripts (don’t prompt, just fail if the command-line is insufficient)
  GCG: -default
  EMBOSS: -auto

Help on options
  GCG: -check
  EMBOSS: -help or -help -verbose

Boolean options (Yes/No, True/False)
  GCG: -thing, -nothing
  EMBOSS: -thing, -nothing, -thing=T, -thing=F, -thing=1, -thing=0, -thing=Y, -thing=N
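The option conventions above can be sketched in a few lines of Python for anyone wrapping EMBOSS in a script. This is only an illustration: the helper `emboss_argv` is hypothetical (it is not part of EMBOSS), and `name` is the placeholder program used in these slides. It adds `-auto` so nothing prompts, and renders booleans in the `-thing` / `-nothing` style.

```python
def emboss_argv(program, auto=True, **qualifiers):
    """Build an argument list for a scripted (non-interactive) EMBOSS run.

    Hypothetical helper for illustration; it only assembles strings.
    """
    argv = [program]
    if auto:
        argv.append("-auto")  # turn off prompts, as described on the slide
    for name, value in qualifiers.items():
        if value is True:
            argv.append(f"-{name}")            # boolean on:  -thing
        elif value is False:
            argv.append(f"-no{name}")          # boolean off: -nothing
        else:
            argv.extend([f"-{name}", str(value)])  # ordinary option and value
    return argv


# "name" and its options are the made-up examples from the slides.
print(emboss_argv("name", other="thing", thing=False))
```

The resulting list can be handed to `subprocess.run` as-is; building a list rather than a single string also avoids shell quoting problems.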

Sequence options in EMBOSS

"-sequence" related qualifiers

-sbegin    integer  first base used
-send      integer  last base used, def=seq length
-sreverse  bool     reverse (if DNA)
-sask      bool     ask for begin/end/reverse
-slower    bool     make lower case
-supper    bool     make upper case
-sformat   string   input sequence format
-ufo       string   UFO features
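To make the meaning of the begin/end/reverse qualifiers concrete, here is a small Python sketch of the effect they have on a sequence. It is not EMBOSS code. It assumes 1-based, inclusive positions ("first base used", "last base used") and reads "reverse" as reverse-complement; both are assumptions about the behaviour, and the function name is made up.

```python
# Complement table for plain DNA letters; other symbols pass through unchanged.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def apply_sequence_qualifiers(seq, sbegin=1, send=None, sreverse=False,
                              slower=False, supper=False):
    """Mimic -sbegin/-send/-sreverse/-slower/-supper on a string (illustrative)."""
    if send is None:
        send = len(seq)                # default end is the sequence length
    if not 1 <= sbegin <= send <= len(seq):
        raise ValueError("begin/end outside the sequence")
    region = seq[sbegin - 1:send]      # 1-based inclusive -> Python slice
    if sreverse:
        region = region.translate(COMPLEMENT)[::-1]  # assumed: reverse-complement
    if slower:
        region = region.lower()
    if supper:
        region = region.upper()
    return region


print(apply_sequence_qualifiers("AACGTT", sbegin=2, send=4))
```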

Sequence options in EMBOSS

"-outseq" related qualifiers

-osformat  string  output sequence format
-ossingle  bool    separate file for each entry

EMBOSS general options

-debug    bool  write debug output to program.dbg
-auto     bool  turn off prompts
-stdout   bool  write standard output
-filter   bool  read standard input, write standard output
-options  bool  prompt for required and optional values
-verbose  bool  report some/full command line options
-help     bool  report command line options

Data files

GCG uses ‘..’ to divide comments from data
EMBOSS does not use ‘..’
In general, EMBOSS uses ‘#’ to mark a comment line
Use ‘embossdata’ to extract and check on data files.
As in GCG, data files copied into the current or home directory are used in preference to the originals.
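The "local copy wins" rule can be pictured as a simple search order. The Python below is a sketch of that idea only, not how EMBOSS implements it: the function name, the `example.dat` file name in the usage note, and the choice to look in the current directory before the home directory are all assumptions.

```python
from pathlib import Path


def find_data_file(name, installed_dir, cwd=None, home=None):
    """Return the first copy of a data file found, preferring local copies.

    Search order (assumed): current directory, home directory, then the
    installed originals. Returns None if the file is nowhere to be found.
    """
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    for directory in (cwd, home, Path(installed_dir)):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

For example, `find_data_file("example.dat", "/path/to/installed/data")` would return a copy in the working directory if one exists, and fall back to the installed original otherwise.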

List files (files of file names)

Similar to GCG list files, but no ‘..’
Comment lines start with ‘#’
Can contain the names of other list files:

# This is my list file

embl:hsfau

embl:ggg*

myfile.seq:clone10

file.seq

@list2
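The list-file rules above (skip ‘#’ comments, follow ‘@’ references to other list files) are easy to mirror in a script. The following Python is a rough sketch, not EMBOSS's own parser: `read_list_file` is a made-up name, and resolving a nested list file relative to the file that mentions it is an assumption.

```python
from pathlib import Path


def read_list_file(path, _stack=()):
    """Return the entries named in a list file, expanding nested '@' files."""
    path = Path(path).resolve()
    if path in _stack:
        # Guard against a list file that (indirectly) includes itself.
        raise ValueError(f"list file includes itself: {path}")
    entries = []
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # blank line or comment
        if line.startswith("@"):
            nested = path.parent / line[1:]   # assumed: relative to this file
            entries.extend(read_list_file(nested, _stack + (path,)))
        else:
            entries.append(line)          # keep the entry text as written
    return entries
```

Entries such as `embl:ggg*` are returned untouched; expanding the wildcard is left to EMBOSS.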

File formats

GCG
  only GCG format, MSF and RSF

EMBOSS
  many formats automatically recognised
  can specify using ‘::’ or ‘-osf’, e.g.:

clustal::globin.aln

-osf gcg
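For scripts that need to know whether a format was given explicitly, the ‘::’ prefix can be peeled off with one string operation. A minimal Python sketch, with a hypothetical function name:

```python
def split_format(spec):
    """Split 'format::rest' into (format, rest); format is None if absent."""
    fmt, sep, rest = spec.partition("::")
    return (fmt, rest) if sep else (None, spec)


print(split_format("clustal::globin.aln"))
print(split_format("embl:hsfau1"))
```

A single ‘:’ (as in `embl:hsfau1`) is left alone, since it separates a database or file from an entry name rather than naming a format.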

One file, many sequences

GCG
  Only one sequence per GCG file

EMBOSS
  One or more sequences per file
  Default is to write all sequences to one file
  -ossingle will change to writing many files
  GCG, Staden and plain format files can only hold one sequence per file.

Features

GCG
  No concept of feature tables

EMBOSS
  Many programs now write out results as GFF
  Soon, all programs that find things will write the results as GFF
  GFF will become another sequence format
  Programs to manipulate and display sets of features are planned
  c.f. showfeat, coderet, maskfeat, diffseq

Databases

EMBOSS is poor at grouping many databases under one name

E.g. we need a way of referring to ‘embl’ and ‘emblnew’ as one database.

This will be done, but currently, a list file containing the following seems best:

embl:*

emblnew:*
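If several such groupings are needed, the list file can be generated rather than typed. This Python sketch writes one ‘db:*’ line per database; the function name and the `alldb.list` file name in the usage note are made up for illustration, and the returned ‘@file’ string follows the ‘@list2’ convention shown for list files.

```python
from pathlib import Path


def write_database_list(path, databases):
    """Write a list file covering every entry of each named database."""
    lines = ["# All entries from: " + ", ".join(databases)]  # comment line
    lines += [f"{db}:*" for db in databases]                 # one wildcard per db
    Path(path).write_text("\n".join(lines) + "\n")
    return "@" + str(path)   # how the list file would then be referred to
```

Calling `write_database_list("alldb.list", ["embl", "emblnew"])` produces a file containing the two lines shown above, plus a leading comment.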

Command line wildcards

GCG: embl:* - no problem

EMBOSS: embl:* - UNIX complains it can’t find the files
  solution is to quote it: "embl:*"
  or: embl:\*
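When the command line is assembled by a script, the quoting can be left to the standard library. The Python below uses `shlex.join`, which quotes any argument the shell would otherwise interpret; `name` is again the placeholder program from these slides, and `shell_line` is a hypothetical helper.

```python
import shlex


def shell_line(argv):
    """Join an argument list into a single shell-safe command line."""
    return shlex.join(argv)   # quotes arguments containing characters like '*'


argv = ["name", "-auto", "embl:*"]
print(shell_line(argv))       # name -auto 'embl:*'
```

Passing the list itself to `subprocess.run(argv)` without `shell=True` sidesteps the problem entirely, because no shell is involved to expand the ‘*’.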

HELP

GCG: genman, genhelp

EMBOSS: tfm

What program does what?

See David Martin’s list of equivalences: http://www.no.embnet.org/Programs/SAL/EMBOSS/fromGCG.php3

NB this doesn’t list EMBOSS programs with no equivalent in GCG!

What EMBOSS does NOT do

The major deficiencies in the EMBOSS package are: BLAST, FASTA, ASSEMBLY

You should use the publicly available software:
  Blast - NCBI, HGMP, many other sites
  Fasta - HGMP
  Assembly - Staden package

What EMBOSS does do

Giving ‘stdout’ as the output file name makes output go to the screen.

Much effort is put into removing arbitrary limits.
  E.g. Max. sequence length: 2Gb
  Many programs limited only by available memory

Source code available for inspection, change and writing your own programs

EMBOSS is FREE!
  GNU General Public Licence
  Open Source Software

THE END