Martin John Bishop
UK HGMP Resource Centre, Hinxton, Cambridge CB10 1 [email protected]
www.hgmp.mrc.ac.uk
Bioinformatics scope
  Genome sequences - DNA
  Transcripts - RNA
  Proteins
  Protein interactions
  Macromolecular assemblies
  Development and cellular function
  Genetic linkage analysis
Molecular biology needs bioinformatics
  Biological data (molecules)    Computer analysis (methods)
  Sequences                      Comparison
  Structures                     Modelling
  Gene expression                Co-regulation
  Proteomes                      Mass spectrometry
  Pathways                       Knowledge bases
  Evolution                      Phylogenetics
Molecular biology is about information
Central dogma
  DNA <-> RNA -> protein -> phenotype <- DNA
Molecules Processes
Central paradigm
  Genome repository <-> RNA world -> Protein sequence -> Protein structure -> Protein function -> Phenotype <- Fed back to genome
Information processing
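The DNA -> RNA -> protein flow on this slide can be read as two string transformations. The following is a minimal Python sketch of that idea, not part of the original talk. The function names and the deliberately tiny codon table are assumptions made for illustration; the full genetic code has 64 codons.

```python
# Illustrative only: a partial codon table (the real genetic code has 64 codons).
CODON_TABLE = {
    "AUG": "M",  # methionine, also the usual start codon
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "GCU": "A",  # alanine
    "UGG": "W",  # tryptophan
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA codon by codon until a stop codon or the end.

    Codons missing from the partial table are reported as 'X'.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE.get(mrna[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    rna = transcribe("ATGTTTGGCTAA")
    print(rna, "->", translate(rna))
```

The later steps of the paradigm (structure, function, phenotype) are not simple lookups of this kind, which is where the analysis methods listed earlier come in.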
The activities of HGMP-RC
HGMP-RC
  Bioinformatics Services
  Research
    MHC Fugu Mouse sequencing
    Technology development
  Biology Services
    Biological materials by mail order
    Biological services including hotel facilities
    Contract R&D
On-line service
  Mail, Network News, Files/Backup
  Services
  Data Links
  Unrestricted
  Public Data
  Private Data
  Registered users
  Information, Analytical tools
HGMP-RC SERVICE
Web menu X (or VNC) Java Telnet
Telnet menu / Unix login
GENOME WEB
  Up to date
  Relevant
  Fully searchable
  Fully verified
  Extensive
INTEGRATED ANALYSIS
  BLAST
  NIX
  PIX
  GLUE
  PIE
  MAGI
  PINT
COMMON OPTIONS
  EMBOSS
  GCG
  PINE
  CLUSTAL
  STADEN
  PASSWORD
GENOMICS APPLICATIONS
  Linkage Analysis
  Radiation Hybrid Mapping
  Sequence Ready Clone Maps
  Genome Databases
  Polymorphisms
  Sequence Analysis
  Gene Prediction
  Expression Profiling
  Phylogenetic Analysis
  Integrated Tools - GLUE, RHYME, NIX, PIE
PROTEOMICS APPLICATIONS
  Protein Sequence Analysis
  Protein Structure Analysis
  Protein Structural Modelling
  Proteome Databases
  Tools for Peptide Sequence Determination
  Protein Cellular Localisation
  Protein Functional Studies
  Pathways and Protein Interactions
  Integrated tools and databases - PIX
NETWORK / JANET SERVICE
LONDON
  Currently: 34 Mbps main link
  Future: keep 34 Mbps link for backup
CAMBRIDGE
  Currently: 8 Mbps redundant link
  Future: Gigabit Ethernet
SERVERS
  More than 80 servers
  1, 4 and 8 CPU SMP
  Sparc and Intel
  Solaris and Linux
  Databases doubling every 14 months
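To make the "doubling every 14 months" figure concrete, the sketch below projects database size under steady exponential growth. It is an illustration only: the function name, the starting size, and the assumption that the doubling time stays constant are not from the talk.

```python
def projected_size(current_size, months_ahead, doubling_months=14.0):
    """Size after `months_ahead` months, assuming a constant doubling time.

    size(t) = size(0) * 2 ** (t / doubling_months)
    """
    return current_size * 2 ** (months_ahead / doubling_months)


if __name__ == "__main__":
    start_gb = 100.0  # hypothetical starting size, for illustration only
    for months in (12, 14, 28, 60):
        print(f"{months:3d} months: {projected_size(start_gb, months):8.1f} GB")
```

Under this assumption the yearly growth factor is 2 ** (12 / 14), roughly 1.81, so storage and search capacity would need to grow by about 80% a year just to keep pace.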
LOADS
Load is the percentage of processes trying to run
  Interactive load 50%
  Job queues load 100%
  Jobs waiting can be 6-10 times the work being processed
PROCESSES AND QUEUES
  Menu service (hot swop)
  General analysis (overloaded)
  Sun BLAST and NIX queue
  Dell BLAST queue
  BLAST data file server
  Interactive Linkage queue
  Heavy Linkage queue
USERS’ REAL WORLD PROBLEMS
  Comparative method
  Extrapolate from known to similar
  Hints to reduce the amount of experimental work that needs to be done
SOFTWARE SYSTEMS
A variety of technical solutions are used
  BLAST
  NCBI Entrez
  SRS
  GeneCards
  NIX
  ENSEMBL
HELPING THE USER
  Information discovery – completeness
  Communication – multiple sites
  Ontology – uniformity?
  Software integration – ease of use
  Reasoning about results
  Monitoring – repeat queries
MAJOR CHALLENGES
  User interface
  Back end processing
  Cost recovery
NEW TECHNOLOGIES?
  Web services
  GRID (EMBnet)
  Object-orientated computing
  Multi-agent systems
TREASURE
  Web service with top level container
  Customise for the user
  User selects a service and opens it as an application
  An alternative view can be built around user data as the fundamental objects
IMPLEMENTATION
  EMBREO library written in Java handles web service layer (also CORBA, XML-RPC, JDBC and other connectivity)
  Also handles file access and transfer and display of results (including use of VNC)
  Simple Object Access Protocol (SOAP)
  Browser channel uses XML format
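As a rough illustration of what a SOAP request looks like on the wire, the sketch below builds a SOAP 1.1 envelope with Python's standard library. It is not the EMBREO library (which the slide says is written in Java) and does not reflect its API. The operation name `runBlast`, its parameters, and the `urn:example:analysis` namespace are hypothetical.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (from the SOAP specification).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace, invented for this example.
SERVICE_NS = "urn:example:analysis"


def build_request(operation, params):
    """Return a SOAP 1.1 request envelope as an XML string.

    `operation` becomes the single child of <Body>; each entry in
    `params` becomes an unqualified child element holding its value.
    """
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical operation and parameters, for illustration only.
    print(build_request("runBlast", {"program": "blastn", "sequence": "ATGC"}))
```

In practice such an envelope would be sent to the service by HTTP POST and the reply parsed in the same way; that transport step is omitted here.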
USER ACCOUNTING AND CUSTOMIZATION
Currently very complex
  HED
  NIS+
  Filesystem configuration files
Future a single database
  Lightweight Directory Access Protocol (LDAP)
CREDITS
  Gary Williams: Menu systems and Genome Web
  Geoff Gibbs: Network and systems
  Peter Tribble: Web servers, Queues, Treasure