Taverna in 2006

Industry Workshop, tmo@ebi.ac.uk, 8th March 2006

Taverna 1

Three years old; the latest release was downloaded 1,300 times over two months

Expanding community covering an increasing variety of domains

Originally funded as part of an EPSRC pilot project, with a research rather than production focus

A success, but with limitations

Taverna 1.3.1 Workbench

Evolving challenges

Long running, data intensive workflows

Manipulation of confidential or otherwise protected information

Use with classical grid systems

Interaction with users during workflows

Workflow authoring, service discovery and composition

Data comprehension, provenance and visualization

User Interaction Handling

An Interaction Service and a corresponding Taverna processor allow a workflow to call out to an expert human user

Used to embed the Artemis annotation editor within an otherwise automated genome annotation pipeline

Interaction Service Architecture

[Figure: architecture diagram. Labels: Patterns; Submit, Status, Results, Upload, Download; Interaction Store; Proxy; Pattern (three instances); Taverna 1.3]
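The Submit, Status and Results labels in the architecture suggest an asynchronous request pattern: the workflow submits a request for human input, polls for its status, then collects the result. The following is a minimal sketch of that pattern under stated assumptions. The class, its method names and the `respond` helper are hypothetical in-memory stand-ins, not the actual Interaction Service API.

```python
import itertools


class InMemoryInteractionService:
    """Hypothetical stand-in for an interaction service (not the real API)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._requests = {}

    def submit(self, pattern, data):
        """Register a request for human input and return its identifier."""
        request_id = next(self._ids)
        self._requests[request_id] = {"pattern": pattern, "data": data, "result": None}
        return request_id

    def status(self, request_id):
        """Report whether the human expert has responded yet."""
        done = self._requests[request_id]["result"] is not None
        return "COMPLETE" if done else "PENDING"

    def results(self, request_id):
        """Return the expert's response once the request is complete."""
        if self.status(request_id) != "COMPLETE":
            raise RuntimeError("interaction not complete")
        return self._requests[request_id]["result"]

    def respond(self, request_id, result):
        """Called on behalf of the human user, e.g. when an editor is closed."""
        self._requests[request_id]["result"] = result


def await_interaction(service, request_id, poll, max_polls=10):
    """Poll until the request completes; `poll` runs between status checks."""
    for _ in range(max_polls):
        if service.status(request_id) == "COMPLETE":
            return service.results(request_id)
        poll()
    raise TimeoutError("no response from user")


service = InMemoryInteractionService()
rid = service.submit(pattern="edit-annotation", data="draft annotation")
# Simulate the user answering during the first polling interval.
answer = await_interaction(
    service, rid, poll=lambda: service.respond(rid, "curated annotation")
)
```

In a real deployment the `poll` callback would be a delay rather than a simulated response, and the workflow step would block until the expert finishes.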

DALEC – Linking Taverna and DAS

DALEC exposes a Taverna workflow as a Distributed Annotation System (DAS) annotation source.
– Design workflow in Taverna
– Deploy in DALEC
– Access through any DAS client (Spice, Ensembl web server etc.)

[Figure: Standard DAS Service; DALEC DAS Service]

Taverna 2

Funded as part of OMII-UK

10 developers

Dedicated design, implementation, testing and support team

First new developers started three weeks ago, project manager arriving in April

[Figure: diagram with labels: Ingest; Pioneers, Early adopters, Conservatives; myGrid Pre-release, myGrid Release, OMII-UK Release; Software Engineering (XP); OMII Software Engineering; Quality & Test; Evaluation; Prioritise & Plan; Production; Applications & Professional Services; myGrid Alliance; SourceForge community]

Future Direction

Enhancements to the Workflow Core

Enhancements to user interface and experience

Expanded use of semantic web technologies

Engagement with new user communities – cheminformatics, humanities, social sciences etc.

Code remains open source and always will

Composite Workflow Models

Enhanced Dataflow Model

Modular dispatcher mechanism
– Dynamic service binding
– Recursive invocation
– Data filter implementation
– Retry, failover, back-off behaviours

Transparent third party data transfers

High throughput stream handling with implicit iteration semantics
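The retry, failover and back-off behaviours listed for the dispatcher can be illustrated with a short sketch. This is not the Taverna 2 dispatcher; it is a minimal illustration assuming each service is a plain callable, with `dispatch`, `flaky_primary` and `mirror` as hypothetical names. The `sleep` parameter is injectable so the back-off delays can be observed without waiting.

```python
import time


def dispatch(services, payload, retries=2, base_delay=0.01, sleep=time.sleep):
    """Try each service in order, retrying each with exponential back-off.

    `services` is a list of callables standing in for service invocations.
    Returns the first successful result.
    """
    last_error = None
    for service in services:                  # failover: move to the next service
        for attempt in range(retries + 1):    # retry the same service
            try:
                return service(payload)
            except Exception as exc:
                last_error = exc
                if attempt < retries:
                    sleep(base_delay * (2 ** attempt))  # back-off before retrying
    raise RuntimeError("all services failed") from last_error


calls = []


def flaky_primary(payload):
    """Hypothetical primary service that is always unavailable."""
    calls.append("primary")
    raise ConnectionError("primary unavailable")


def mirror(payload):
    """Hypothetical mirror service that succeeds."""
    calls.append("mirror")
    return payload.upper()


delays = []
result = dispatch([flaky_primary, mirror], "acgt", retries=2, sleep=delays.append)
```

Here the primary is tried three times with growing delays between attempts before the dispatcher fails over to the mirror.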

Runtime Service Binding

Service definition consists of an abstract description

Resolved at workflow runtime to one or more concrete resources by a broker

Allows load balancing or economic model based service selection over grid environments
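A broker of this kind can be sketched as a registry that maps an abstract service name to candidate endpoints and picks one when the workflow runs. The sketch below is an illustration only: the `Endpoint` fields, the `Broker` class, the policy names and the example.org URLs are all assumptions, not part of Taverna.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    url: str     # hypothetical concrete resource location
    load: float  # current load, 0.0 (idle) to 1.0 (saturated)
    cost: float  # price per invocation in arbitrary units


class Broker:
    """Maps an abstract service name to registered concrete endpoints."""

    def __init__(self):
        self._registry = {}

    def register(self, abstract_name, endpoint):
        self._registry.setdefault(abstract_name, []).append(endpoint)

    def resolve(self, abstract_name, policy="load"):
        """Pick one endpoint at run time: least loaded or cheapest."""
        candidates = self._registry.get(abstract_name, [])
        if not candidates:
            raise LookupError(f"no endpoint bound to {abstract_name!r}")
        if policy == "load":
            return min(candidates, key=lambda e: e.load)
        return min(candidates, key=lambda e: e.cost)


broker = Broker()
broker.register("sequence-alignment", Endpoint("http://a.example.org/align", load=0.9, cost=1.0))
broker.register("sequence-alignment", Endpoint("http://b.example.org/align", load=0.2, cost=3.0))

least_loaded = broker.resolve("sequence-alignment", policy="load")
cheapest = broker.resolve("sequence-alignment", policy="cost")
```

The workflow definition only mentions "sequence-alignment"; which endpoint serves it is decided per run, so the same workflow can balance load or minimise cost depending on the policy.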

Recursive Invocation

Dispatcher allowing recursive invocation to be plugged into per operation semantics.

[Figure: flow diagram. Labels: Receive Input; Invoke operation; Test for completion; Modify Input Set; Gather Result Set; Return Result]
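One plausible reading of the diagram's steps is a loop that invokes an operation, gathers its output, tests for completion and otherwise modifies the input for the next call. The sketch below follows that reading. The ordering of steps, the function names and the paged `fetch_page` service are assumptions made for illustration.

```python
def invoke_until_complete(operation, initial_input, is_complete, next_input, max_rounds=100):
    """Repeatedly invoke `operation` until `is_complete` accepts its output."""
    results = []
    current = initial_input                    # receive input
    for _ in range(max_rounds):
        output = operation(current)            # invoke operation
        results.append(output)                 # gather result set
        if is_complete(output):                # test for completion
            return results                     # return result
        current = next_input(current, output)  # modify input set
    raise RuntimeError("no completion within max_rounds")


# Hypothetical paged service: returns up to two records per call.
RECORDS = ["r1", "r2", "r3", "r4", "r5"]


def fetch_page(request):
    start = request["offset"]
    page = RECORDS[start:start + 2]
    return {"records": page, "more": start + 2 < len(RECORDS)}


pages = invoke_until_complete(
    fetch_page,
    {"offset": 0},
    is_complete=lambda out: not out["more"],
    next_input=lambda req, out: {"offset": req["offset"] + len(out["records"])},
)
all_records = [record for page in pages for record in page["records"]]
```

From the workflow's point of view the paged service behaves like a single operation that returns the full record set.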

Dynamic Dispatch Configuration

3rd Party Data Transfers

Allows ‘in place’ referencing of data
– Large data sets no longer round-trip between workflow engine and data provider
– Allows restricted access to sensitive data

Automatic de-reference when a reference type is linked to a value type within a workflow
– Connecting a grid service to a web service

[Figure, repeated on each of the following steps with the active service highlighted: the logical workflow structure defined by the user (Service 1, Service 2, Service 3) above its deployment, with Service 1 and Service 2 at Provider A, Service 3 at Provider B, and the Enactment Engine inside the Workflow Enactor]

The client pushes the workflow input data value to the workflow enactor, which stores the value in a local cache for future use.

The workflow enactor sends the cached data value to Service 1.

Service 1 completes and stores its result value in a local data store, for example SRB, on the same host (Provider A). It returns a reference to that value to the workflow enactor.

The enactor examines the workflow and determines that Service 2 understands the reference it has to the Service 1 result. It sends this reference to Service 2, which uses it to directly access the local data store.

Service 2 completes, stores its result in the local store and returns a reference to that data to the enactor.

The enactor examines Service 3. This service, located on another provider, cannot consume the reference returned from Service 2. The enactor forces a de-reference, requesting and caching the value of that reference from Provider A.

As the enactor now has a value rather than a reference, it can invoke Service 3, which is fed data from the enactor's local cache, operates over that data and returns a result which is in turn cached by the enactor.

The workflow is complete and the enactor sends the final result back to the client.
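The walkthrough above can be condensed into a small simulation. It is a sketch under a simplifying assumption: a service can consume a reference only if it runs at the provider that holds the data. The `Ref`, `Provider` and `Enactor` classes and the three service functions are hypothetical and do not reflect Taverna's actual types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    provider: str  # name of the provider holding the value
    key: str       # key of the value in that provider's store


class Provider:
    """Hypothetical provider with a local data store."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def put(self, key, value):
        self.store[key] = value
        return Ref(self.name, key)

    def get(self, ref):
        return self.store[ref.key]


class Enactor:
    def __init__(self, providers):
        self.providers = {p.name: p for p in providers}
        self.transfers = 0  # number of values pulled through the enactor

    def prepare_input(self, data, consumer_provider):
        """Pass a reference through if the consumer can read it in place;
        otherwise force a de-reference and hand over the value."""
        if isinstance(data, Ref) and data.provider != consumer_provider:
            self.transfers += 1
            return self.providers[data.provider].get(data)
        return data


provider_a, provider_b = Provider("A"), Provider("B")
enactor = Enactor([provider_a, provider_b])


def service1(value):  # at Provider A, returns a reference
    return provider_a.put("s1", value + "|s1")


def service2(data):   # at Provider A, understands Provider A's references
    value = provider_a.get(data) if isinstance(data, Ref) else data
    return provider_a.put("s2", value + "|s2")


def service3(value):  # at Provider B, accepts plain values only
    return value + "|s3"


ref1 = service1(enactor.prepare_input("input", "A"))
ref2 = service2(enactor.prepare_input(ref1, "A"))
final = service3(enactor.prepare_input(ref2, "B"))
```

Only one value crosses the enactor (the forced de-reference before Service 3); the intermediate result between Service 1 and Service 2 never leaves Provider A.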

Streaming Data

Allow execution of downstream workflow stages on partially complete results from upstream.

[Figure: Service 1, Service 2 and Service 3 under the two execution modes below]

Non-streaming (Taverna 1): the entire iteration must complete at each stage

Streamed data: Service 2 starts operating on partial results from Service 1
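The difference between the two modes can be sketched with Python generators. This illustrates the ordering of work only: the generators interleave on a single thread, whereas a real enactor would presumably run the stages concurrently. The service functions and the log format are hypothetical.

```python
def service1(items, log):
    """Hypothetical upstream stage: doubles each item."""
    for item in items:
        log.append(f"s1:{item}")
        yield item * 2


def service2(items, log):
    """Hypothetical downstream stage: increments each item."""
    for item in items:
        log.append(f"s2:{item}")
        yield item + 1


def run_batch(items):
    """Non-streaming: stage 1 finishes the whole iteration before stage 2 starts."""
    log = []
    stage1 = list(service1(items, log))
    stage2 = list(service2(stage1, log))
    return stage2, log


def run_streamed(items):
    """Streamed: stage 2 consumes each partial result as stage 1 produces it."""
    log = []
    stage2 = list(service2(service1(items, log), log))
    return stage2, log


batch_out, batch_log = run_batch([1, 2, 3])
stream_out, stream_log = run_streamed([1, 2, 3])
```

Both runs produce the same output, but the streamed log alternates between the two stages while the batch log shows all of stage 1 before any of stage 2.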

New UI Development

Smart graph editing module

3D ‘virtual reality’ style enactment status display

Data playground – design workflows by example

Integrated semantic search

Knowledge driven visualization for result mining

KAVE Data and metadata management

Life Science Identifiers

Information Model

File management

Support for custom database building

Provenance metadata capture using RDF

SRB integration

OGSA-DAI integration

[Figure: RDF provenance graph. Legend: Data generated by services/workflows; Concepts; Services; literals; Properties. Data nodes: urn:data1, urn:data2, urn:data12, urn:data:3, urn:data:f1, urn:data:f2, urn:hit1…, urn:hit2…, urn:hit5…, urn:hit8…, urn:hit10…, urn:hit50…. Invocation nodes: urn:BlastNInvocation3, urn:compareinvocation3, urn:invocation5. Concept and type nodes: Blast_report, SwissProt_seq, Sequence_hit, DatumCollection, LSDatum. Other labels: Find similar sequence, New sequence, Missed sequence. Properties: [input], [output], [instanceOf], [type], [hasHits], [contains], [hasName], [performsTask], [similar_sequence_to], [directlyDerivedFrom], [distantlyDerivedFrom]]
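Provenance captured as RDF amounts to a set of subject, predicate, object triples that can be queried for the history of a data item. The sketch below uses plain tuples rather than an RDF library. The predicate names `input`, `output`, `instanceOf` and `directlyDerivedFrom` and the concept `Blast_report` appear in the figure, but the URNs and the specific edges are invented for illustration and are not recovered from it.

```python
# Triples are (subject, predicate, object). URNs and edges are illustrative.
TRIPLES = [
    ("urn:invocation:blast1", "input", "urn:data:query"),
    ("urn:invocation:blast1", "output", "urn:data:report"),
    ("urn:data:report", "instanceOf", "Blast_report"),
    ("urn:data:report", "directlyDerivedFrom", "urn:data:query"),
    ("urn:invocation:filter1", "input", "urn:data:report"),
    ("urn:invocation:filter1", "output", "urn:data:hits"),
    ("urn:data:hits", "directlyDerivedFrom", "urn:data:report"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def lineage(datum, triples=TRIPLES):
    """Follow directlyDerivedFrom edges back to the original inputs."""
    ancestors, frontier = [], [datum]
    while frontier:
        current = frontier.pop()
        for parent in objects(current, "directlyDerivedFrom", triples):
            if parent not in ancestors:
                ancestors.append(parent)
                frontier.append(parent)
    return ancestors


history = lineage("urn:data:hits")
report_types = objects("urn:data:report", "instanceOf")
```

A transitive relation such as the figure's [distantlyDerivedFrom] could be derived from a traversal like `lineage` rather than stored explicitly, though the source does not say how it is computed.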

Computational Steering

[Figure: diagram with labels: Scientists; Workflow Workbench; Enactor; Steering Control; Process 1, Process 2, Process 3; myGrid Metadata Stores]

Scientist designs, initiates and steers simulation from Taverna Workbench

Workflow definition sent to enactor

Steering of simulations by manipulation of service state

Process and data provenance captured and stored by metadata services

Service Types

Closer integration with grid systems, e.g. Condor, EGEE and others, and their associated security and access control mechanisms.

R for numerical analysis (microarray informatics amongst others)

Continued improvements to SOAP, BioMoby, Biomart, Soaplab, SGS, Local scripting and other components

Obtaining Taverna

Taverna is available under the LGPL from our project site on Sourceforge.net
– http://taverna.sourceforge.net

Release 1.3.1 as of December 2005

Win32, Solaris / Linux & OS-X

Includes online and downloadable user manual, examples etc.

Support via project mailing lists

myGrid team & Early adopters

Core
Matthew Addis, Nedim Alpdemir, Tim Carver, Rich Cawley, Neil Davis, Alvaro Fernandes, Justin Ferris, Robert Gaizauskas, Kevin Glover, Carole Goble, Chris Greenhalgh, Mark Greenwood, Yikun Guo, Ananth Krishna, Peter Li, Phillip Lord, Darren Marvin, Simon Miles, Luc Moreau, Arijit Mukherjee, Tom Oinn, Juri Papay, Savas Parastatidis, Norman Paton, Terry Payne, Matthew Pocock, Milena Radenkovic, Stefan Rennick-Egglestone, Peter Rice, Martin Senger, Nick Sharman, Robert Stevens, Victor Tan, Anil Wipat, Paul Watson and Chris Wroe.

Users
Simon Pearce and Claire Jennings, Institute of Human Genetics, School of Clinical Medical Sciences, University of Newcastle, UK
Hannah Tipney, May Tassabehji, Andy Brass, St Mary’s Hospital, Manchester, UK

Postgraduates
Martin Szomszor, Duncan Hull, Jun Zhao, Pinar Alper, John Dickman, Keith Flanagan, Antoon Goderis, Tracy Craddock, Alastair Hampshire

Industrial
Dennis Quan, Sean Martin, Michael Niemi, Syd Chapman (IBM)
Robin McEntire (GSK)

Collaborators
Keith Decker