Sequence Services Phase 2--Constellation Technologies

DESCRIPTION

Nick Trigg introduces the proof of concept provided by the consortium of Constellation Technologies and Genestack for Phase 2 of the Pistoia Alliance Sequence Services project. The presentation was delivered at the Pistoia Alliance Conference in Boston, MA, on April 24, 2012.

Constellation Technologies & GeneStack

Development of Sequence Services 2 in the Constellation Framework

Constellation: Experts in big data and bioinformatics

• Spin out from STFC (Science and Technology Facilities Council)
  – Largest research facility in UK specialising in large data computing
    • CERN, European physics and astronomy science
    • Supporting all UK disciplines in computing
• Strong IT & Bioinformatics expertise
  – Strong Bioinformatics delivery expertise
  – Strong connections into European academia
  – Excellent access to newly developed applications, tools and algorithms
• Supplier of cloud computing services to large Pharma
• Partners for Pistoia SS2
  – Microsoft Azure
  – STFC

Constellation’s “Roadmap”

[Diagram labels: Core; Service (four instances); Text Mining/Search; Genome Analysis; Data Integration; “Workflow Management”; Seamless Integration with Client systems; API; “AppMarket”; IT]

• IT
  – Platform Design
  – Support
  – Maintenance
  – Testing
  – Stability / Scalability
  – Security
• Bioinformatics
  – Novel Algorithms
  – Research
  – Scientific support
  – Discovery
  – Analysis
  – Value Added

[Diagram labels: Bioinformatics; IT]

• Hosted
  – Single Vendor
  – Hardware limitations
  – Restricted storage
  – Limited cost models
  – “Lock in”
• Cloud
  – Vendor Agnostic
  – As required
  – Selectable storage
  – Best model available
  – “Flexible”

[Diagram labels: Hosting; Cloud]

Cloud Vision

[Diagram labels: Vendor Agnostic; Flexible Storage; Flexible Compute; True Cloud; Client Business Logic; Minimise Support; “Bioinformatics Marketplace”; Virtual Organisation; Client Applications; Academic or bespoke solutions; Your Informatics]

The Value of Pistoia

• Value to users (Life Sciences R&D)
  – Access to Life Science R&D “thought” leaders
  – Opportunity to hear several innovative proposals
  – Expand the number of your global collaborators
  – Support supplier collaborations to provide enhanced services
  – Share long-term issues with other Life Sciences R&D users
  – Better and cheaper solutions to high-value challenges
• Value to suppliers (e.g. Constellation)
  – Access to major clients
  – Clarity on major long-term issues faced by the customer base
  – Partnerships and collaboration encouraged and supported

High Level Architecture

[Diagram labels: Distributed Storage; Distributed Compute; Bioinformatics Systems; Workflow Tools; Portal; Workflow UI; Deployed Workflow (Apps); Bioinformatics UIs; Bioinformatics Applications]

Our goal for SS2

• We believed the end goal was a flexible platform where ALL the applications described in the SS2 scope could be deployed for individual clients as required.
• Platform should be scalable, with security, support and maintenance easily managed.
  – Reducing support costs allows for more focus on research
• Bioinformatics applications added as required:
  – GeneStack (Analysis Portal)
  – VIB (Arctix) (Workflow) (in discussion)
  – EBI (Services) (in discussion)
• Workflow delivered as a fundamental development principle
• Development of the “AppMarket” for Bioinformatics

[Diagram labels: Company Specific; Integrating 3rd Party Systems; Secure Scalable Storage; Workflow Core; Integration With other Systems; Future Development]

Deliverables achieved

• Portal with access to all the “Must Have” Web Services described in the SS2 documentation
  – Constellation Managed Administration Interface to allow organisational mapping of users to Programs / Projects / Applications
• “Tool Box” of Integrated Applications
  – Galaxy
  – Secure Ensembl
  – Secure CellProfiler
  – Content Search (New development)
• Galaxy workflow engine with integrating applications deployed as a secure web application to cover “Must Have” tools
  – Restricted set of apps based on feedback from “testing pool” (Restrictions based on Need/Security)
  – Tools can be added on request
• Scalable storage and compute (dependent on need and security)
  – Structured Program – Project – User mapping
  – Cost effective data storage and compute
• Initial Integration with another Bioinformatics Vendor (GeneStack)
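
As an illustration only, the Program / Project / User / Application mapping mentioned in the deliverables could be modelled roughly as below. This is a minimal sketch under stated assumptions, not Constellation's implementation: the class names, the `can_use` helper and the sample data are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """A project with the users mapped to it and the applications enabled for it."""
    name: str
    users: set[str] = field(default_factory=set)
    applications: set[str] = field(default_factory=set)


@dataclass
class Program:
    """A program groups related projects (hypothetical model)."""
    name: str
    projects: dict[str, Project] = field(default_factory=dict)

    def add_project(self, project: Project) -> None:
        self.projects[project.name] = project


def can_use(program: Program, project_name: str, user: str, application: str) -> bool:
    """Return True only if the user is mapped to the project and the
    application is enabled for that project."""
    project = program.projects.get(project_name)
    if project is None:
        return False
    return user in project.users and application in project.applications


# Hypothetical example data; all names are illustrative only.
oncology = Program("oncology")
oncology.add_project(
    Project("target-discovery", users={"alice"}, applications={"Galaxy", "Ensembl"})
)

print(can_use(oncology, "target-discovery", "alice", "Galaxy"))
print(can_use(oncology, "target-discovery", "bob", "Galaxy"))
print(can_use(oncology, "target-discovery", "alice", "CellProfiler"))
```

In this sketch access is denied by default: a user reaches an application only when both the user and the application are mapped to the same project, which mirrors the "restrictions based on Need/Security" idea in the list above.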

Visiting BioIT World tomorrow

Drop us a note if you would like to meet up while we're in Boston.

rob.gill@constellationtechnologies.com or +44 7766 515153
