Architectural Constraints on Current Bioinformatics Integration Systems

Norman Paton

Department of Computer Science

University of Manchester

Manchester, UK

<norm>@cs.man.ac.uk

Structure of Presentation

Current integration proposals: what they support; what they don’t support, and why.

Requirements for integration: what could be useful, and why.

Grid opportunities: relevant Grid technologies; absent Grid technologies.

Current Integration Proposals

Classification

Feature            Values
Data Location      In-situ, Replicated, Reorganised
Integration Model  None, Relational, Semi-Structured, Object-Oriented
Architecture       Thin Client, Client-Server, Multi-Tier
Analysis Support   Function Call, Query, Workflow
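The four classification features can be written down as a small data structure. The following is a minimal sketch, not part of the original slides: the class and field names are illustrative assumptions, and the example instance uses the GIMS values given later in this presentation.

```python
from dataclasses import dataclass
from enum import Enum


class DataLocation(Enum):
    IN_SITU = "In-situ"
    REPLICATED = "Replicated"
    REORGANISED = "Reorganised"


class IntegrationModel(Enum):
    NONE = "None"
    RELATIONAL = "Relational"
    SEMI_STRUCTURED = "Semi-Structured"
    OBJECT_ORIENTED = "Object-Oriented"


class Architecture(Enum):
    THIN_CLIENT = "Thin Client"
    CLIENT_SERVER = "Client-Server"
    MULTI_TIER = "Multi-Tier"


class AnalysisSupport(Enum):
    FUNCTION_CALL = "Function Call"
    QUERY = "Query"
    WORKFLOW = "Workflow"


@dataclass(frozen=True)
class Classification:
    """One system, described by the set of values it takes for each feature."""
    name: str
    data_location: frozenset
    integration_model: frozenset
    architecture: frozenset
    analysis_support: frozenset

    def describe(self) -> str:
        def fmt(values):
            # Sort by display name so the output is deterministic.
            return ", ".join(sorted(v.value for v in values))

        return (
            f"{self.name}: location={fmt(self.data_location)}; "
            f"model={fmt(self.integration_model)}; "
            f"architecture={fmt(self.architecture)}; "
            f"analysis={fmt(self.analysis_support)}"
        )


# Values taken from the GIMS classification slide in this presentation.
gims = Classification(
    name="GIMS",
    data_location=frozenset({DataLocation.REORGANISED}),
    integration_model=frozenset({IntegrationModel.OBJECT_ORIENTED}),
    architecture=frozenset({Architecture.MULTI_TIER}),
    analysis_support=frozenset({AnalysisSupport.FUNCTION_CALL}),
)
```

Each feature is held as a set because a system may take several values for one feature, for example both Function Call and Query for analysis support.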

Sequence Retrieval System

http://srs.ebi.ac.uk/

SRS In Use

[Screenshot: list of databases, search interface, selected database]

SRS Results

[Screenshot: links to result records]

Classification of SRS

Feature            Values
Data Location      Replicated
Integration Model
Architecture       Thin Client
Analysis Support   Function Call, Query

BioNavigator

BioNavigator combines data sources and the tools that act over them.

As tools act on specific kinds of data, the interface makes available only tools that are applicable to the data in hand.

Online trial from: https://www.bionavigator.com/
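The idea of offering only the tools that apply to the data in hand can be sketched as a lookup keyed on the kind of data. This is an illustration only, not BioNavigator's implementation or API: the tool names, data kinds, and the `applicable_tools` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    accepts: frozenset  # kinds of data the tool can act on
    produces: str       # kind of data the tool returns


# Hypothetical tool catalogue, invented for this sketch.
TOOLS = [
    Tool("translate", frozenset({"dna_sequence"}), "protein_sequence"),
    Tool("similarity_search",
         frozenset({"dna_sequence", "protein_sequence"}), "hit_list"),
    Tool("multiple_alignment", frozenset({"hit_list"}), "alignment"),
]


def applicable_tools(data_kind, tools=TOOLS):
    """Return the names of the tools that accept the given kind of data."""
    return [tool.name for tool in tools if data_kind in tool.accepts]
```

An interface built this way would show `translate` and `similarity_search` for a DNA sequence, but only `multiple_alignment` once the user is looking at a hit list.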

Initiating Navigation

Select database

Enter accession number

Viewing Selected Data

Relevant display options

Navigate to related programs

Chaining Analyses in Macros

Chained collections of navigations can be saved as macros and restored for later use.
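A saved macro can be thought of as an ordered list of step names that is stored, restored, and replayed over new data. The sketch below assumes that reading; the step names and the JSON storage format are hypothetical and do not describe how BioNavigator stores macros.

```python
import json

# Hypothetical navigation steps, each a function from data to data.
STEPS = {
    "complement": lambda seq: seq.translate(str.maketrans("ACGT", "TGCA")),
    "reverse": lambda seq: seq[::-1],
}


def save_macro(step_names):
    """Serialise a chain of step names so it can be stored for later use."""
    return json.dumps(list(step_names))


def load_macro(text):
    """Restore a previously saved chain of step names."""
    return json.loads(text)


def run_macro(step_names, data):
    """Apply each step in order, feeding every result into the next step."""
    for name in step_names:
        data = STEPS[name](data)
    return data


saved = save_macro(["complement", "reverse"])
restored = load_macro(saved)
result = run_macro(restored, "ATGC")  # reverse complement of the input
```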

Classification of BioNavigator

Feature            Values
Data Location      Replicated
Integration Model
Architecture       Thin Client
Analysis Support   Function Call, Workflow

Current Public Integration Systems

Location: data is replicated (under control).

Integration model: often minimal.

Architecture: often two-tier.

Analysis support: query and analysis access is carefully contained.

Only very careful instantiation of the classification yields sufficiently predictable performance.

GIMS – recent experience

Feature            Values
Data Location      Reorganised
Integration Model  Object-Oriented
Architecture       Multi-tier
Analysis Support   Function Call

Example Analysis

Data: Yeast genome sequence. Protein-protein interaction data. 350 transcriptome experiments. Overall database ~350 MB.

Analysis: Correlate transcription of interacting proteins.
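One way to read "correlate transcription of interacting proteins" is to compute, for each interacting pair, the Pearson correlation between the two genes' expression profiles across experiments. The sketch below assumes that reading. The gene names and expression values are invented for illustration and are not GIMS data, and the slides do not say which correlation measure was used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy expression profiles: one value per transcriptome experiment.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

# Toy protein-protein interaction pairs.
interactions = [("geneA", "geneB"), ("geneA", "geneC")]


def correlate_interacting(expression, interactions):
    """Map each interacting pair to the correlation of its two profiles."""
    return {
        (a, b): pearson(expression[a], expression[b])
        for a, b in interactions
    }


correlations = correlate_interacting(expression, interactions)
```

With real data the number of pairs and the length of each profile grow quickly, which is where the single-run difficulties described next arise.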

Features of Experience

Challenging to conduct single runs of analyses; they must be broken into bits.

These are modest data sets compared with what is coming.

Environment has been designed with analysis in mind.

These analyses will never make it into the public release!
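Breaking an analysis into bits can be as simple as processing the input in fixed-size batches and combining the partial results. The slides do not say how the GIMS analyses were split, so the following is only a generic sketch with hypothetical function names.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def run_in_batches(items, analyse, size):
    """Run `analyse` on one batch at a time and concatenate the results."""
    results = []
    for chunk in batches(items, size):
        results.extend(analyse(chunk))
    return results
```

For example, `run_in_batches(pairs, analyse_pairs, 1000)` would hand a hypothetical `analyse_pairs` function at most 1000 interaction pairs per call, so that no single call has to hold the whole data set.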

Requirements for Integration

Location: replication is transparent.

Integration model: standards.

Architecture: flexible, multiple tier.

Analysis support: arbitrary analyses over diverse data sets.

True integration in bioinformatics should not be just data-oriented; it should also involve integration of analyses.

Three Tier Architecture

Clients handle user interaction and presentation.

Application servers perform computation and analysis.

Data servers manage and query databases.

[Diagram: Client, Application Server, Data Server]

Three Tier Architecture

Scalability: replace or upgrade components as needed; replace or upgrade layers independently.

Flexibility: the application server layer protects clients from changes in the database layer.

Classical three tier architectures are configured statically, and are adapted slowly as needs evolve.
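The division of labour between the three tiers, and the way the application server shields clients from database changes, can be illustrated with a small sketch. All class and method names here are hypothetical. The two data servers store records differently but expose the same `fetch_sequence` method, so the client code is identical in both cases.

```python
class DictDataServer:
    """Data tier: stores sequences in a dictionary keyed by accession."""

    def __init__(self, records):
        self._records = dict(records)

    def fetch_sequence(self, accession):
        return self._records[accession]


class ListDataServer:
    """Data tier with a different storage layout but the same interface."""

    def __init__(self, rows):
        self._rows = list(rows)

    def fetch_sequence(self, accession):
        for row_accession, sequence in self._rows:
            if row_accession == accession:
                return sequence
        raise KeyError(accession)


class ApplicationServer:
    """Middle tier: performs the computation and hides the data tier."""

    def __init__(self, data_server):
        self._data_server = data_server

    def gc_fraction(self, accession):
        sequence = self._data_server.fetch_sequence(accession)
        return (sequence.count("G") + sequence.count("C")) / len(sequence)


class Client:
    """Presentation tier: formats results and talks only to the middle tier."""

    def __init__(self, application_server):
        self._application_server = application_server

    def report(self, accession):
        fraction = self._application_server.gc_fraction(accession)
        return f"{accession}: GC fraction {fraction:.2f}"


client_a = Client(ApplicationServer(DictDataServer({"ACC1": "ATGC"})))
client_b = Client(ApplicationServer(ListDataServer([("ACC1", "ATGC")])))
```

Note that the wiring in the last two lines is fixed when the objects are constructed, which mirrors the static configuration of classical three tier systems.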

Grid Opportunities

Necessary and Missing

Necessary: Directory services. Discovery services. Co-allocation. Data replication. Workload management. Accounting and payment.

Missing: Databases. Data models. Heterogeneity resolution. Personalisation. Web services. Standards.

Dynamic Multi-Tier

[Diagram: Client; Application Server and Data Server, each shown twice]

Resources need to be identified, selected and scheduled dynamically.
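Identification, selection and scheduling can be sketched as a directory lookup followed by a choice among the candidates. This is a toy illustration under stated assumptions: the `Directory` class, the load counter, and the least-loaded selection policy are all hypothetical and do not correspond to any particular Grid middleware.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    kind: str      # e.g. "application" or "data"
    load: int = 0  # number of jobs currently assigned


class Directory:
    """Toy directory service: resources register, clients discover them."""

    def __init__(self):
        self._resources = []

    def register(self, resource):
        self._resources.append(resource)

    def discover(self, kind):
        return [r for r in self._resources if r.kind == kind]


def schedule(directory, kind):
    """Identify candidates, select the least loaded, and assign one job."""
    candidates = directory.discover(kind)
    if not candidates:
        raise LookupError(f"no resource of kind {kind!r} is registered")
    chosen = min(candidates, key=lambda r: r.load)
    chosen.load += 1
    return chosen.name


directory = Directory()
directory.register(Resource("app-1", "application", load=2))
directory.register(Resource("app-2", "application", load=0))
directory.register(Resource("data-1", "data"))

assignments = [schedule(directory, "application") for _ in range(3)]
```

Unlike the static three tier case, the server that handles each request is decided at request time, so resources can join or leave the directory without clients being reconfigured.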

Grid Classification

Feature            Values
Data Location      In-situ, Replicated
Integration Model
Architecture       Multi-Tier
Analysis Support   Function Call, Workflow

The current Grid is not the answer, but the answer subsumes the current facilities of the Grid.

Summary

Current integration facilities in biology: are cunningly restrictive; make the most of limited distributed computational architectures.

The Grid is bringing to the table: resource description facilities; resource scheduling and workflow management facilities.

The Grid does not directly address current needs in biology, but its descendants may.
