Processing data from GS FLX Instrument using UNICORE workflow system

M. Borcz (1,2), R. Kluszczyński (2), K. Skonieczna (3,4), T. Grzybowski (3), Piotr Bała (1,2)

(1) Faculty of Mathematics and Computer Science, UMK, Toruń

(2) ICM University of Warsaw

(3) Collegium Medicum, UMK, Bydgoszcz

(4) Postgraduate School, Medical University of Warsaw

PTBI2012 M. Borcz

MOTIVATION

PROCESSING TIME

STORAGE

TECHNICAL SUPPORT

AUTOMATION

FLEXIBILITY

SECURITY

PL-GRID

„The goal of the PL-Grid project (Polish Infrastructure for Supporting Computational Science in the European Research Space) is to provide the Polish scientific community with an IT platform based on Grid computer clusters, enabling e-science research in various fields.

PL-Grid aims at significantly extending the amount of computing resources provided to the Polish scientific community (by approximately 215 TFlops of computing power and 2500 TB of storage capacity) and constructing a Grid system that will facilitate effective and innovative use of the available resources.”

www.plgrid.pl

UNICORE

UNICORE (Uniform Interface to Computing Resources) is middleware that enables seamless and secure access to Grid resources. UNICORE is part of the Unified Middleware Distribution developed by the EMI project.

www.unicore.eu

www.eu-emi.eu

UNICORE Rich Client (URC)

UNICORE Commandline Client (UCC)

High-Level API (HiLA)

UNICORE in PL-Grid

EXPERIMENT

Determination of 18 complete mitochondrial genome sequences from tumor and matched non-tumor tissues obtained from 9 patients diagnosed with colorectal cancer

Comparison of the mtDNA sequences with the reference sequence

Identification of mtDNA mutations

Ultra-high-speed processing of mtDNA sequence data

High-throughput GS FLX Instrument (Roche Diagnostics)

Up to 1 million reads, each approximately 500 bp long, in a single experiment
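
The comparison with the reference sequence can be illustrated with a toy sketch. It assumes the sample sequence has already been aligned to the reference and has the same length, and it reports only single-base substitutions. The function name `find_substitutions` and the short sequences are hypothetical; the analysis in this work is carried out with the GS FLX software, not with this code.

```python
from typing import List, Tuple


def find_substitutions(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes both sequences are already aligned and of equal length;
    insertions and deletions are not handled in this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(pairs, start=1)
        if ref_base != alt_base
    ]


if __name__ == "__main__":
    # Made-up fragments for illustration only, not real mtDNA data.
    reference = "GATCACAGGTCTATCACCCT"
    tumor = "GATCACAGGTCTGTCACCCT"
    for pos, ref_base, alt_base in find_substitutions(reference, tumor):
        print(f"{ref_base}{pos}{alt_base}")
```

Running the same function on the tumor and the matched non-tumor sequence of one patient and comparing the two result lists would separate differences specific to the tumor from those shared by both tissues.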

WORKFLOW

GSRunProcessor: data from GS FLX Instrument (Roche Diagnostics), SFF and CWF files

GSReferenceMapper: SFF files

GSReporter: CWF files

GSAssembler: SFF files, FASTA file

BLAST: FASTA file
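
One way to read the listing above is as a dependency graph, sketched here with Python's standard `graphlib` module (Python 3.9+). The edges are an assumption inferred from the file types named on the slide: GSRunProcessor is taken to produce the SFF and CWF files, GSAssembler to produce the FASTA file, and BLAST to consume it. This is an illustration only, not UNICORE's workflow format.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on
# (assumed from the file types listed for each tool).
WORKFLOW = {
    "GSRunProcessor": set(),
    "GSReferenceMapper": {"GSRunProcessor"},
    "GSReporter": {"GSRunProcessor"},
    "GSAssembler": {"GSRunProcessor"},
    "BLAST": {"GSAssembler"},
}


def execution_order(workflow):
    """Return one valid order in which the steps could run."""
    return list(TopologicalSorter(workflow).static_order())


def ready_batches(workflow):
    """Group steps into batches whose members could run in parallel."""
    sorter = TopologicalSorter(workflow)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    print(execution_order(WORKFLOW))
    for batch in ready_batches(WORKFLOW):
        print(batch)
```

Under these assumed dependencies, the three tools that read the GSRunProcessor output fall into the same batch and could run side by side, with BLAST waiting only for GSAssembler.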

DATA PROCESSING

High-throughput GS FLX Instrument (Roche Diagnostics)

UNICORE Commandline Client (UFTP)

Target System Storage (PL-Grid)

UNICORE Rich Client

Batch System (PL-Grid): GS Run Processor, GS Reporter, GS Reference Mapper, GS Assembler, BLAST

STORAGE

UNICORE RICH CLIENT

Gridbeans are plug-ins that make it possible to run an application on the Grid. They generate the job description and provide the user with a graphical interface for entering input data and presenting results.

WORKFLOW EDITOR

Gridbeans can be used to build simple jobs or can be treated as building blocks for workflows consisting of various tasks and operations.

DETAILS

Data: 17 GB

Images: 834 files

File size: 33 MB

Transfer: 3 s / file

GSRunAnalysisPipe:

Interlagos: AMD Opteron(TM) Processor 6272 @ 2.10GHz

AMD: AMD Opteron(tm) Processor 6174 @ 2.20GHz

Intel: Intel(R) Xeon(R) CPU X5660 @ 2.80GHz (InfiniBand)

1 cpu: 70.0 h

8x8 cpu (Intel, MPI): 2.5 h
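
A back-of-the-envelope reading of these figures, as a sketch. It assumes that "8x8 cpu" means 8 nodes with 8 cores each (64 cores), that the single-CPU and MPI timings are directly comparable, and that files are transferred one after another; none of this is stated on the slide. All names are illustrative.

```python
# Figures taken from the slide.
N_FILES = 834
SECONDS_PER_FILE = 3
SERIAL_HOURS = 70.0
PARALLEL_HOURS = 2.5
CORES = 8 * 8  # assumption: 8 nodes x 8 cores


def total_transfer_seconds(n_files: int, seconds_per_file: float) -> float:
    """Time to transfer all files sequentially."""
    return n_files * seconds_per_file


def speedup(serial_time: float, parallel_time: float) -> float:
    """Ratio of serial to parallel wall-clock time."""
    return serial_time / parallel_time


def parallel_efficiency(serial_time: float, parallel_time: float, cores: int) -> float:
    """Speedup divided by the number of cores used."""
    return speedup(serial_time, parallel_time) / cores


if __name__ == "__main__":
    transfer = total_transfer_seconds(N_FILES, SECONDS_PER_FILE)
    print(f"transfer of all files: {transfer} s ({transfer / 60:.1f} min)")
    print(f"speedup: {speedup(SERIAL_HOURS, PARALLEL_HOURS):.1f}x")
    eff = parallel_efficiency(SERIAL_HOURS, PARALLEL_HOURS, CORES)
    print(f"parallel efficiency on {CORES} cores: {eff:.1%}")
```

Under these assumptions the MPI run is 28 times faster than the single-CPU run, which corresponds to roughly 44% parallel efficiency on 64 cores, and transferring all 834 files takes about 42 minutes.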

SHORT DEMONSTRATION

REFERENCES

www.unicore.eu

www.plgrid.pl

www.eu-emi.eu

www.454.com

„Building a National Distributed e-Infrastructure - PL-Grid”, Lecture Notes in Computer Science, Vol. 7136, in the subseries Information Systems and Applications, incl. Internet/Web, and HCI.