Dr Richard Sinnott Technical Director National e-Science Centre ||| Deputy Director Technical Bioinformatics Research Centre University of Glasgow ros@dcs.gla.ac.uk 18th March 2004 BRIDGES Status Report


Page 1

Dr Richard Sinnott
Technical Director National e-Science Centre
||| Deputy Director Technical Bioinformatics Research Centre
University of Glasgow
ros@dcs.gla.ac.uk

18th March 2004

BRIDGES Status Report

Page 2

Overview

Review goals of Bridges project
Briefly summarise technical approach
Outline achievements thus far
Demonstration
Plans for the future

Page 3

Bridges Goals

High blood pressure affects 25% of adults in western societies
The Cardiovascular Functional Genomics (CFG) project is investigating this through physiological models of hypertension in the rat
BRIDGES is a supporting project to CFG and will provide Grid infrastructure to facilitate scientific research
CFG project partners are distributed but need to access and integrate various software and, especially, data resources

The main aims of BRIDGES are to develop re-useable infrastructure that provides data federation while addressing the appropriate security concerns

Page 4

CFG Partner Distribution

[Figure: the CFG partner sites (Glasgow, Edinburgh, Leicester, Oxford, London, Netherlands), six "Private data" stores, plus "Shared data" and "Public curated data".]

Page 5

Problems to be addressed

BRIDGES will address the following problems facing CFG biologists:

How to integrate data with multiple levels of security, including public data, project-only data and private data?
How to search multiple distributed databases through single optimised queries?
How to use multiple tools in a coordinated (and automated) manner, e.g. how to develop re-useable workflows for the CFG scientists?
Integration of a range of bioinformatics analysis and visualisation tools, e.g. BLAST, genome browsers, etc.
How to deal with inconsistencies of online databases and possible “dirty data”?
How to get more “up to date” data?

Make it all user friendly: portals, hidden infrastructure (e.g. security authorisation)
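The first problem above, integrating data held at several security levels, can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the three levels mirror the public, project-only and private categories on the slide, but the `Record` type, the `visible_names` function, the sample records and the access rule itself are hypothetical and are not taken from BRIDGES or PERMIS.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Level(Enum):
    PUBLIC = "public"    # readable by anyone
    PROJECT = "project"  # readable by project members only
    PRIVATE = "private"  # readable by the owning site only


@dataclass(frozen=True)
class Record:
    name: str
    level: Level
    owner_site: str  # hypothetical: the site that holds the record


def is_visible(record: Record, user_site: str, in_project: bool) -> bool:
    """Assumed rule, for illustration only."""
    if record.level is Level.PUBLIC:
        return True
    if record.level is Level.PROJECT:
        return in_project
    return record.owner_site == user_site


def visible_names(records: List[Record], user_site: str, in_project: bool) -> List[str]:
    """Return the names of the records this user may see, in input order."""
    return [r.name for r in records if is_visible(r, user_site, in_project)]


# Made-up sample records, one per security level.
SAMPLE = [
    Record("public_extract", Level.PUBLIC, "public"),
    Record("shared_qtl", Level.PROJECT, "Glasgow"),
    Record("lab_results", Level.PRIVATE, "Leicester"),
]

if __name__ == "__main__":
    print(visible_names(SAMPLE, user_site="Glasgow", in_project=True))
```

An integrated view for a given user would then be built only from the records that pass this check, whatever the real policy language turns out to be.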

Page 6

Planned Approach

BRIDGES will address these problems through:

Development of re-useable Grid services based upon GT3 technologies

Virtualisation of multiple distributed data sets to provide a single virtual data set for use by the biologists – exploiting IBM’s DiscoveryLink

Developing a collection of data on a well-managed platform, including copies of extracts of relevant public data, all project data, and the required software tools (administered using DB2 and DiscoveryLink)

Access to and integration of multiple distributed data sets in a Grid environment using results from the OGSA-DAI/DAIT projects

A secure environment offering authentication and authorisation will build on results of the PERMIS security authorisation project
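The idea of presenting several separate data sets as one virtual data set can be illustrated with a minimal Python sketch. It does not use DiscoveryLink, DB2 or OGSA-DAI: as a stand-in it attaches two independent SQLite databases to one connection so that a single query can join across them. The database names, table layouts and rows are all made up for the example.

```python
import sqlite3
from typing import List, Tuple


def build_federation() -> sqlite3.Connection:
    """Create one connection that sees two separate in-memory databases."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates an independent database with its own schema name.
    conn.execute("ATTACH DATABASE ':memory:' AS public_db")
    conn.execute("ATTACH DATABASE ':memory:' AS project_db")

    # Hypothetical public data: a gene catalogue.
    conn.execute("CREATE TABLE public_db.genes (gene_id TEXT PRIMARY KEY, symbol TEXT)")
    conn.executemany(
        "INSERT INTO public_db.genes VALUES (?, ?)",
        [("G1", "geneA"), ("G2", "geneB")],
    )

    # Hypothetical project data: QTL annotations that refer to genes.
    conn.execute("CREATE TABLE project_db.qtl (qtl_id TEXT, gene_id TEXT, trait TEXT)")
    conn.executemany(
        "INSERT INTO project_db.qtl VALUES (?, ?, ?)",
        [("Q1", "G1", "blood pressure")],
    )
    return conn


def genes_with_qtl(conn: sqlite3.Connection) -> List[Tuple[str, str]]:
    """One query spanning both databases, as a user of the virtual data set would write it."""
    cursor = conn.execute(
        """
        SELECT g.symbol, q.trait
        FROM public_db.genes AS g
        JOIN project_db.qtl AS q ON q.gene_id = g.gene_id
        ORDER BY g.symbol
        """
    )
    return cursor.fetchall()


if __name__ == "__main__":
    print(genes_with_qtl(build_federation()))
```

The point of the sketch is only that the caller writes one query and never sees where each table lives; a real federation layer would additionally handle remote sources, query optimisation and access control.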

Page 7

Bridges team

Project Management: Richard Sinnott, Dave Berry

Database Design/Development: Derek Houghton

Grid Services Developer: Micha Bayer, Magnus Ferrier

Technical Input: David White, Jean-Christophe Mestres, Andy Knox, Emmanuel Guyonnet (IBM), Ela Hunt (Glasgow), Neil Hanlon (Glasgow)
Profs David Gilbert, Malcolm Atkinson, Anna Dominiczak

Page 8

Achievements

Web site and project portal established
http://europa.nesc.gla.ac.uk/wps/portal

Engaged with CFG consortium
Staff trained in relevant technologies

GT3, DiscoveryLink, Condor

Initial version of local repository developed

Populated with data that cannot be federated e.g. public data sets with no programmatic interface

– Ensembl/EMBL-EBI, NCBI (GENBANK, REFSEQ, Gene Expression Omnibus), UCSC, SwissProt/TrEMBL, UniSTS/dbSTS, UNIGENE, LOCUSLINK, GENMAPP, OMIM, Sanger, dbSNP, dbEST, InterPro, Pfam, Prints, Cath, SCOP, ProSite, Weizmann Institute, PDB, RIKEN, Rat Genome DB, Mouse Atlas, Affymetrix, …

Includes shared data sets of CFG scientists: QTL DB, …

Page 9

Achievements …ctd

GT3-based Grid services offered, allowing use of these local data sets

Grid-enabled BLAST services produced
Offer access to large e-Science infrastructures at Glasgow (ScotGrid)

SyntenyVista tool extended to allow Grid enabled visual navigation of genomic data sets

Planned front end for many other tools

Externally
Poster at AHM 2003
Tutorial submitted to ISMB/ECCB (the major bioinformatics conference)
Liaising with other projects

eDIKT, myGrid, GeneGrid, PERMIS, ...
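The slides do not say how the Grid-enabled BLAST services listed above distribute their work. One common pattern, sketched here in Python purely as an assumption, is to split a multi-sequence FASTA query into batches that could each be submitted as an independent job. The function names `parse_fasta` and `make_batches` and the sample sequences are hypothetical, and the sketch deliberately stops before any BLAST invocation or job submission.

```python
from typing import List, Tuple

FastaRecord = Tuple[str, str]  # (header without '>', sequence)


def parse_fasta(text: str) -> List[FastaRecord]:
    """Parse FASTA text into (header, sequence) pairs."""
    records: List[FastaRecord] = []
    header = None
    seq_lines: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(seq_lines)))
            header = line[1:]
            seq_lines = []
        elif header is not None:
            seq_lines.append(line)
    if header is not None:
        records.append((header, "".join(seq_lines)))
    return records


def make_batches(records: List[FastaRecord], size: int) -> List[List[FastaRecord]]:
    """Group records into consecutive batches of at most `size` sequences."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    return [records[i:i + size] for i in range(0, len(records), size)]


# Made-up query with three short sequences.
SAMPLE_QUERY = """>seq1
ACGT
ACGT
>seq2
TTGA
>seq3
GGCC
"""

if __name__ == "__main__":
    for number, batch in enumerate(make_batches(parse_fasta(SAMPLE_QUERY), 2), start=1):
        print(number, [header for header, _ in batch])
```

Each batch could then be handed to whatever scheduler the infrastructure provides, with the per-batch results merged afterwards.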

Page 10

Achievements …ctd

Demonstration of some of the achievements

Page 11

Plans

Refine/extend requirements
Further refinement of use cases & scenarios
More data sets (public, shared, private, …)

Implementation and realisation of further use cases
e.g. extended query services for microarray data interpretation, workflows for probe set mapping, …

Security realisation and roll-out
We can only help share CFG data sets if we can get SECURE access to them; following up with CFG sites
Authorisation with PERMIS coming
GSI-based authentication

Investigate application of replication manager (RLS)
Should support illusion of data from each site being available to all other sites

Further Grid-based data visualisation services accessible via SyntenyVista
Ensure that we keep track of relevant developments (WSRF, GT4, …)
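The replication item in the plans above rests on one core idea: a catalogue that maps a logical data set name to the physical copies held at different sites, so that any site can find a copy. The Python sketch below shows only that mapping. It is not the Globus RLS API; the `ReplicaCatalogue` class, its methods and the `example.org` URLs are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


class ReplicaCatalogue:
    """Toy catalogue: logical name -> set of physical replica locations."""

    def __init__(self) -> None:
        self._replicas: Dict[str, Set[str]] = defaultdict(set)

    def register(self, logical_name: str, physical_url: str) -> None:
        """Record that a copy of `logical_name` exists at `physical_url`."""
        self._replicas[logical_name].add(physical_url)

    def lookup(self, logical_name: str) -> List[str]:
        """Return all known copies, sorted; an empty list if none are known."""
        # Use .get so that a lookup never creates an empty entry.
        return sorted(self._replicas.get(logical_name, set()))


if __name__ == "__main__":
    catalogue = ReplicaCatalogue()
    # Hypothetical replica locations at two sites.
    catalogue.register("qtl_db", "gsiftp://site-a.example.org/data/qtl_db")
    catalogue.register("qtl_db", "gsiftp://site-b.example.org/data/qtl_db")
    print(catalogue.lookup("qtl_db"))
```

A client that asks for `qtl_db` by its logical name need not know which site holds it, which is the "illusion" the slide refers to.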

Page 12

Future Vision of Tools via Portal

[Figure: drill-down functions: to sequence, to multiple alignment, to tabular summaries.]

Page 13

Questions?