myGrid - putting the scientist at the centre

A case study investigating Williams-Beuren Syndrome

The scientist’s (Hannah’s) problem

[Figure: physical map of chromosome 7 (~155 Mb), zooming in on a ~1.5 Mb region at 7q11.23. Labels shown: clones CTA-315H11 and CTB-51J22, and a 'Gap'.]

12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt
12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt
12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct
12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt
12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt
12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt
12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg
12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga
12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc
12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa
12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa

1. Identify new, overlapping sequence of interest

2. Characterise the new sequence at nucleotide and amino acid level

Cutting and pasting between numerous web-based services, e.g. BLAST, InterProScan, etc.
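Step 1 above can be illustrated in a few lines of code. The sketch below is not the method used in the case study, which relied on services such as BLAST; it swaps in an exact suffix-prefix match as a deliberately simplified stand-in. The function names `overlap_length` and `extend_contig`, the `min_len` threshold, and the toy sequences are all assumptions made for illustration.

```python
def overlap_length(known: str, candidate: str, min_len: int = 4) -> int:
    """Length of the longest suffix of `known` that equals a prefix of `candidate`.

    Exact matching only: a simplified stand-in for a similarity search.
    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    longest = min(len(known), len(candidate))
    for n in range(longest, min_len - 1, -1):
        if known.endswith(candidate[:n]):
            return n
    return 0


def extend_contig(known: str, candidate: str, min_len: int = 4) -> str:
    """Append the non-overlapping tail of `candidate` to `known`."""
    n = overlap_length(known, candidate, min_len)
    if n == 0:
        raise ValueError("no overlap of the required length")
    return known + candidate[n:]


# Toy sequences (hypothetical, not taken from the WBS region).
known_end = "ACGTTGCA"
candidate = "TGCAGGTT"
print(overlap_length(known_end, candidate))
print(extend_contig(known_end, candidate))
```

Real genomic overlaps contain mismatches and repeats, so an alignment-based search remains the appropriate tool; the point here is only that the manual step is mechanical enough to automate.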

The Williams Workflows

A: Identification of overlapping sequence
B: Characterisation of nucleotide sequence
C: Characterisation of protein sequence
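As a rough illustration of what workflows B and C compute, the sketch below characterises a sequence at the nucleotide level (GC content) and derives a protein sequence from it using the standard genetic code. It is a minimal stand-in, assuming plain Python instead of the web services the workflows actually call; the function names `gc_content` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str, frame: int = 0) -> str:
    """Translate one forward reading frame; '*' marks a stop codon.

    Codons containing anything other than A, C, G, T become 'X', and an
    incomplete trailing codon is ignored.
    """
    seq = seq.upper()
    protein = []
    for i in range(frame, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


# First twelve bases of the AC005089.3 record shown later in the slides.
fragment = "AAGCTTTTCTGG"
print(gc_content(fragment))
print(translate(fragment))
```

A fuller characterisation would cover all six reading frames and pass the resulting proteins to domain-detection services such as InterProScan.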

Recording Architecture

19747251  AC005089.3   831   Homo sapiens BAC clone CTA-315H11 from 7, complete sequence
15145617  AC073846.6   815   Homo sapiens BAC clone RP11-622P13 from 7, complete sequence
15384807  AL365366.20  46.1  Human DNA sequence from clone RP11-553N16 on chromosome 1, complete sequence
7717376   AL163282.2   44.1  Homo sapiens chromosome 21 segment HS21C082
16304790  AL133523.5   44.1  Human chromosome 14 DNA sequence BAC R-775G15 of library RPCI-11 from chromosome 14 of Homo sapiens (Human), complete sequence
34367431  BX648272.1   44.1  Homo sapiens mRNA; cDNA DKFZp686G08119 (from clone DKFZp686G08119)
5629923   AC007298.17  44.1  Homo sapiens 12q22 BAC RPCI11-256L6 (Roswell Park Cancer Institute Human BAC Library) complete sequence
34533695  AK126986.1   44.1  Homo sapiens cDNA FLJ45040 fis, clone BRAWH3020486
20377057  AC069363.10  44.1  Homo sapiens chromosome 17, clone RP11-104J23, complete sequence
4191263   AL031674.1   44.1  Human DNA sequence from clone RP4-715N11 on chromosome 20q13.1-13.2 Contains two putative novel genes, ESTs, STSs and GSSs, complete sequence
17977487  AC093690.5   44.1  Homo sapiens BAC clone RP11-731I19 from 2, complete sequence
17048246  AC012568.7   44.1  Homo sapiens chromosome 15, clone RP11-342M21, complete sequence
14485328  AL355339.7   44.1  Human DNA sequence from clone RP11-461K13 on chromosome 10, complete sequence
5757554   AC007074.2   44.1  Homo sapiens PAC clone RP3-368G6 from X, complete sequence
4176355   AC005509.1   44.1  Homo sapiens chromosome 4 clone B200N5 map 4q25, complete sequence
2829108   AF042090.1   44.1  Homo sapiens chromosome 21q22.3 PAC 171F15, complete sequence

>gi|19747251|gb|AC005089.3| Homo sapiens BAC clone CTA-315H11 from 7, complete sequence
AAGCTTTTCTGGCACTGTTTCCTTCTTCCTGATAACCAGAGAAGGAAAAGATCTCCATTTTACAGATGAGGAAACAGGCTCAGAGAGGTCAAGGCTCTGGCTCAAGGTCACACAGCCTGGGAACGGCAAAGCTGATATTCAAACCCAAGCATCTTGGCTCCAAAGCCCTGGTTTCTGTTCCCACTACTGTCAGTGACCTTGGCAAGCCCTGTCCTCCTCCGGGCTTCACTCTGCACACCTGTAACCTGGGGTTAAATGGGCTCACCTGGACTGTTGAGCG

[Figure: provenance graph linking the data items above. Node labels: urn:lsid:taverna:datathing:15, urn:lsid:taverna:datathing:13, BLAST_Report, nucleotide_sequence, service invocation, workflow invocation, workflow definition, experiment definition, project, person, group, organisation, service description. Edge labels: rdf:type, similar_sequences_to, created_by, described_by, run_during, invocation_of, part_of, works_for, author, run_for, masked_sequence_of, filtered_version_of.]

The myGrid Information Model: Annotation & argumentation
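The labels on this slide read naturally as subject-predicate-object statements. The sketch below stores a few of them as plain tuples and queries them; it is not myGrid's storage layer. Which LSID carries which `rdf:type`, the direction of each edge, and the identifiers `service_invocation:1` and `workflow_invocation:1` are assumptions made for illustration, as are the helper names `match` and `follow`.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Assumed pairing of LSIDs and types; the slide does not make it explicit.
REPORT = "urn:lsid:taverna:datathing:15"
QUERY = "urn:lsid:taverna:datathing:13"

TRIPLES: List[Triple] = [
    (REPORT, "rdf:type", "BLAST_Report"),
    (QUERY, "rdf:type", "nucleotide_sequence"),
    (REPORT, "similar_sequences_to", QUERY),
    # Hypothetical identifiers for the process side of the graph.
    (REPORT, "created_by", "service_invocation:1"),
    ("service_invocation:1", "run_during", "workflow_invocation:1"),
]


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return every triple that agrees with the fields that are not None."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def follow(store: List[Triple], start: str, predicates: List[str]) -> Optional[str]:
    """Walk a chain of predicates from `start`; None if any hop is missing."""
    node = start
    for pred in predicates:
        hits = match(store, s=node, p=pred)
        if not hits:
            return None
        node = hits[0][2]
    return node


# Which workflow run produced the BLAST report?
print(follow(TRIPLES, REPORT, ["created_by", "run_during"]))
```

Because every statement has the same shape, questions such as "what was this report derived from?" or "who ran the workflow that made it?" reduce to following edges.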

Using workflows and web services

• Automation
  – Capturing processes in an explicit manner
  – Tedium! Computers don’t get bored/distracted/hungry/impatient!
  – Saves repeated time and effort

• Modification, maintenance, substitution and personalisation

• Easy to share, explain, relocate, reuse and build

• Available to wider audience: don’t need to be a coder, just need to know how to do Bioinformatics

• Releases Scientists/Bioinformaticians to do other work

• Record
  – Provenance: what the data is like, where it came from, its quality
  – Management of data (LSID - Life Science IDentifiers)
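An LSID is a URN of the form `urn:lsid:<authority>:<namespace>:<object>`, optionally followed by `:<revision>`. The sketch below splits one into its parts, using the identifier `urn:lsid:taverna:datathing:15` that appears earlier in the slides. The `LSID` class and `parse_lsid` function are hypothetical names, and the validation is minimal: it checks only the overall shape, not the character rules of the specification.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID into authority, namespace, object and optional revision."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    # The 'urn:lsid' prefix is matched case-insensitively.
    if [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty LSID component in {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(authority, namespace, object_id, revision)


print(parse_lsid("urn:lsid:taverna:datathing:15"))
```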

Demonstration topics

• Taverna – using a workflow editing environment to capture bioinformatics protocols

• Personalisation – setting context to allow later personalisation

• Provenance – retaining information on the origin of results

The myGrid Information Model: Programmes, studies & experiments

[Figure: UML class diagram. Classes: InvestigationProgramme, Study, ProgrammeResource, ExperimentDesign, ExperimentInstance, Operation, Workflow, WebServiceOperation, PersonStudyParticipation, StudyParticipationEpisode, StudyRole, LabBookView. Association labels: has participants, uses, contains, method, episodes, lab books, participates in, acts in, selected studies, instances, initiates. Multiplicities shown are 1 and 0..*. Annotation: "example operation types".]

The myGrid Information Model: Provenance metadata

[Figure: UML class diagram. Classes: StudyParticipation, LifeScienceDocument, ActualInputParameter, ActualOutputParameter, OperationTrace, WorkflowTrace, WebServiceTrace, DirectCreation, CreationType, DataProvenance, Investigation, ExperimentInstance. Association labels: created via, created by, outputs, inputs, includes, value, has provenance trace, initiates. Multiplicities shown are 1, 0..1 and 0..*. Annotation: "example trace types".]
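One way to read this model is as a record of which documents went into an operation and which came out. The sketch below borrows the class names from the diagram but is otherwise an assumption: the fields, the flattening of the associations into lists, and the `derived_from` helper are illustrative and not taken from the myGrid implementation. The operation name and document contents are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LifeScienceDocument:
    lsid: str
    content: str


@dataclass
class ActualInputParameter:
    name: str
    value: LifeScienceDocument


@dataclass
class ActualOutputParameter:
    name: str
    value: LifeScienceDocument


@dataclass
class OperationTrace:
    operation: str
    inputs: List[ActualInputParameter] = field(default_factory=list)
    outputs: List[ActualOutputParameter] = field(default_factory=list)


def derived_from(trace: OperationTrace, lsid: str) -> List[str]:
    """LSIDs of the inputs of `trace` if it produced `lsid`, else an empty list."""
    if any(out.value.lsid == lsid for out in trace.outputs):
        return [inp.value.lsid for inp in trace.inputs]
    return []


# Hypothetical trace of a single similarity search.
query = LifeScienceDocument("urn:lsid:taverna:datathing:13", "acatttctac...")
report = LifeScienceDocument("urn:lsid:taverna:datathing:15", "(report text)")
trace = OperationTrace(
    operation="blast",
    inputs=[ActualInputParameter("query", query)],
    outputs=[ActualOutputParameter("report", report)],
)
print(derived_from(trace, report.lsid))
```

Keeping inputs and outputs as explicit parameter objects, instead of bare values, is what lets each result be traced back to the exact data that produced it.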