myGrid: Personalised extensible environments for data-intensive in silico experiments in biology
http://www.mygrid.org.uk
Professor Carole Goble, University of Manchester, UK


Page 1

myGrid
Personalised extensible environments for data-intensive in silico experiments in biology

http://www.mygrid.org.uk
Professor Carole Goble, University of Manchester, UK

Page 2

myGrid
  EPSRC funded pilot project
  Generic middleware within application setting
  36 months
  http://www.mygrid.org.uk

IBM

Page 3

In silico experimentation

Discovery, interoperation, fusion, sharing

Process is as important as outcome

Science is dynamic – change happens

Scientific discovery is personal & global

Ad-hoc solutions, people-powered

Page 4

myGrid resources

Question: Nucleotide binding protein in mouse

Answer:
  P12345 in Swiss-Prot is an ATPase
  Terri Attwood is an expert on this
  Jackson labs have a database but you need to register
  A paper has just been published in Proteins by the Stanford lab on this.

Page 5

Grid viewpoints

[Diagram labels: interrogation, workflows, results; Access Grid; New Biology; Technology Grid; private, public; Governance & Control]

What is it? Where is it?
How to get it? When did it happen?
Who knows it? Why does it?
What are you doing?

Page 6

myGrid e-Science objectives

Active support of scientific practice in biology
Straightforward discovery, interoperation, sharing
  information AND processes AND best practice
Improving quality of both experiments and data
  provenance through information <-> process linkage
  propagating change
Individual creativity & collaborative working
  personalisation

Cottage Industry to an Industrial Scale

Page 7

myGrid operational environment

(De facto) Standards: OMG LSR, I3C, MGED, Gene Ontology
Open Source: Open-Bio Foundation, Bio*
Semantic Web: RDF, RDFS, DAML+OIL
Bioinformatics integration platforms: DAS, OpenBSA, ISYS, OpenMMS, Kleisli, Ensembl, AppLab, SRS, BioNavigator, DiscoveryLink, GX, OPM, TAMBIS
Distributed Computing Environments: CORBA, RMI, Jini, JXTA, DCOM
Web Services: XML, SOAP, WSDL, UDDI
GRID: Globus/SRB/Condor
Consortium Expertise: View propagation, reasoning, workflow …

Page 8

Approach

[myGrid Stack diagram: Personalisation, Toolkits, Metadata, Interoperation layer, Data mgt, Process mgt, Context mgt, Communication fabric, Applications]

Page 9

[myGrid Stack diagram: Communication fabric, Interoperation layer, Personalisation, Toolkits, Applications, Metadata, Data mgt, Process mgt, Context mgt]

1. Resource management
2. Middleware technologies incl. Globus
3. Incorporating existing resources

Page 10

[myGrid Stack diagram: Communication fabric, Interoperation layer, Personalisation, Toolkits, Applications, Metadata, Data mgt, Process mgt, Context mgt]

1. Integration & distributed queries
2. View management
3. Personal repositories

Page 11

[myGrid Stack diagram: Communication fabric, Interoperation layer, Personalisation, Toolkits, Applications, Metadata, Data mgt, Process mgt, Context mgt]

1. Process description & storage
2. Process enactment
3. Process personalisation

Page 12

[myGrid Stack diagram: Communication fabric, Interoperation layer, Personalisation, Toolkits, Applications, Metadata, Data mgt, Process mgt, Context mgt]

1. Security & Confidentiality & Trust
2. Provenance & Attribution
3. Versioning

Page 13

[myGrid Stack diagram: Communication fabric, Interoperation layer, Personalisation, Toolkits, Applications, Metadata, Data mgt, Process mgt, Context mgt]

1. Ontology languages & services
2. Resource service descriptions
3. Annotation with metadata

Page 14

[myGrid Stack diagram: Communication fabric, Interoperation layer, Personalisation, Toolkits, Applications, Metadata, Data mgt, Process mgt, Context mgt]

1. Agent based communication abstraction
2. Software engineering paradigm for extensible distributed services
3. Foundation for architectural evolution

Page 15

[myGrid Stack diagram: Communication fabric, Interoperation layer, Personalisation, Toolkits, Applications, Metadata, Data mgt, Process mgt, Context mgt]

1. Personal data repositories
2. Personal processes
3. Models of sharing

Page 16

[myGrid Stack diagram: Communication fabric, Interoperation layer, Personalisation, Toolkits, Applications, Metadata, Data mgt, Process mgt, Context mgt]

1. User interfaces & visualisation
2. Collaboration environments
3. Environment development
4. User-centred application development

Page 17

[myGrid Stack diagram: Communication fabric, Interoperation layer, Personalisation, Toolkits, Applications, Metadata, Data mgt, Process mgt, Context mgt]

1. Specialist process: information extraction

Page 18

myGrid outcomes

1. e-Scientists
  Environment built on toolkits for service access, personalisation & community
  Gene function expression analysis using S. cerevisiae
  Annotation workbench for the PRINTS pattern database
2. Developers
  myGrid-in-a-Box developers kit
  Re-purposing DAS, AppLab and OpenBSA …
  Integrating ISYS & GlaxoSmithKline platforms

Page 19

myGrid generic technologies

1. Database access from the Grid
2. Process enactment on the Grid
3. Personalisation services
4. Metadata services
5. Laying the foundations for Agent Services

Ontologies, Protocols & APIs

Grid + Services + Semantic Web

Page 20

[Diagram labels: Scientific Problems; Processes; Knowledge; Information; Jobs and Data; Data; Raw Resources; Knowledge / capability; Semantics / process; Data / applications; Value chain; Interoperability, higher level ontologies, reasoning, discovery; Reasoning services, Discovery services; Fulfillment Grid]

"Reproduced by permission of the IT Innovation Centre, University of Southampton." http://www.it-innovation.soton.ac.uk

Page 21

myGrid phased development

Versions of myGrid
Varying degrees of functionality

[Timeline labels: Pre-prototype; Architecture; Simple services; Early toolkit trials; Developers toolkit; Application trials; Release; Extended services. Milestones: 6 months, 12 months, 24 months, 33 months]

Page 22

myGrid
Personalised extensible environments for data-intensive in silico experiments in biology
http://www.mygrid.org.uk

Professor Carole Goble, University of Manchester, UK

Page 23

Presented at the BiGUM1: Biological Grid Users Meeting 1
NeSC, Glasgow, Scotland
October 30th 2001
http://www.nesc.ac.uk/esi/progs/bigum1.html