globus.org/genomics Research Data Management and Analysis using Globus Platform Ravi K Madduri, University of Chicago and Argonne National Laboratory [email protected] @madduri


Page 1: Research Data Management and Analysis using …sites.nationalacademies.org/cs/groups/ssbsite/documents/...globus.org/genomics Research Data Management and Analysis using Globus Platform

globus.org/genomics

Research Data Management and Analysis using Globus Platform

Ravi K Madduri, University of Chicago and Argonne National Laboratory

[email protected]

@madduri

Page 2

Our vision for a 21st century discovery infrastructure

Provide more capability for more people at lower cost by delivering “Science as a Service”

www.globus.org

Page 3

Globus: Fast, reliable data transfer

Diagram labels: Data Source, Data Destination

1. User initiates transfer request
2. Globus moves and syncs files
3. Globus notifies user
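The three transfer steps on this slide can be illustrated with a small local sketch. It is not Globus code and does not use any Globus API. It assumes that "syncs" means copying only files that are missing or whose checksum differs, and the names `sync`, `sha256`, and `demo` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def sync(source: Path, destination: Path, notify) -> list[str]:
    """Copy files that are missing or differ at the destination, then notify."""
    copied = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = destination / rel
        # Step 2: move only what is missing or whose checksum differs.
        if not dst.exists() or sha256(dst) != sha256(src):
            dst.parent.mkdir(parents=True, exist_ok=True)
            dst.write_bytes(src.read_bytes())
            copied.append(rel.as_posix())
    # Step 3: tell the user what happened.
    notify(f"transfer complete: {len(copied)} file(s) copied")
    return copied


def demo() -> tuple[list[str], list[str], list[str]]:
    """Run two sync passes in temporary directories (made-up file names)."""
    messages = []
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "source"), Path(tmp, "destination")
        src.mkdir()
        dst.mkdir()
        (src / "reads.fastq").write_text("ACGT\n")
        (src / "notes.txt").write_text("run 1\n")
        first = sync(src, dst, messages.append)   # Step 1: user initiates the request
        second = sync(src, dst, messages.append)  # nothing changed, so nothing is copied
    return first, second, messages
```

Running `demo()` performs one full copy and one no-op pass, which is the behavior a checksum-based sync is meant to give.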

Page 4

Globus: Federated identity

Page 5

Simple, secure sharing off existing storage systems

Diagram label: Data Source

1. User A selects file(s) to share, selects user or group, and sets permissions
2. Globus tracks shared files; no need to move files to cloud storage!
3. User B logs in to Globus and accesses shared file

• Easily share large data with any user or group
• No cloud storage required
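The sharing model on this slide (pick a path, pick a user or group, set permissions, leave the data where it is) can be sketched as a small in-memory access list. This is an illustration only, not how Globus implements sharing. `ShareRegistry`, the permission strings "r" and "rw", and all user, group, and path names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ShareRegistry:
    """Tracks who may access which path; the files themselves never move."""
    groups: dict = field(default_factory=dict)  # group name -> set of users
    rules: list = field(default_factory=list)   # (path prefix, principal, permissions)

    def share(self, path: str, principal: str, permissions: str) -> None:
        # Step 1: the owner picks a path, a user or group, and permissions ("r" or "rw").
        self.rules.append((path.rstrip("/"), principal, permissions))

    def can(self, user: str, path: str, permission: str) -> bool:
        # Step 3: access is granted if any rule covers the path for this user,
        # either directly or through one of the user's groups.
        for prefix, principal, permissions in self.rules:
            covers = path == prefix or path.startswith(prefix + "/")
            member = principal == user or user in self.groups.get(principal, set())
            if covers and member and permission in permissions:
                return True
        return False


# Hypothetical example: a lab group gets read access, one user gets write access.
registry = ShareRegistry(groups={"lab": {"user_b", "user_c"}})
registry.share("/data/run42/", "lab", "r")
registry.share("/data/run42/results", "user_b", "rw")
```

Here `registry.can("user_b", "/data/run42/results/calls.vcf", "w")` is true, while the same check for `user_c` is false because the group rule only grants read access.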

Page 6

Curated publication of data, with relevant metadata for discovery

• Identify
• Describe
• Curate
• Verify
• Access
• Preserve

1. Researcher assembles data set; describes it using metadata (Dublin Core and domain-specific)
2. Curator reviews and approves; data set published on campus or other storage
3. Peers, public search and discover data sets; transfer using Globus

Diagram labels: Published Data Store, Metadata
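The publication steps on this slide (describe, curate, discover) can be sketched as a tiny in-memory repository. This is a sketch under stated assumptions, not the Globus publication service. The `Repository` class, the `REQUIRED` field policy, and the sample metadata are hypothetical. The 15 element names are those of the Dublin Core Metadata Element Set.

```python
from dataclasses import dataclass

# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE = {
    "title", "creator", "subject", "description", "publisher", "contributor",
    "date", "type", "format", "identifier", "source", "language", "relation",
    "coverage", "rights",
}
REQUIRED = {"title", "creator", "date"}  # hypothetical policy for this sketch


@dataclass
class Dataset:
    files: list
    metadata: dict
    status: str = "submitted"


def domain_specific(dataset: Dataset) -> list:
    """Metadata keys outside Dublin Core are treated as domain-specific."""
    return sorted(set(dataset.metadata) - DUBLIN_CORE)


class Repository:
    def __init__(self):
        self.datasets = []

    def submit(self, files, metadata) -> Dataset:
        # Step 1: the researcher assembles and describes the data set.
        missing = REQUIRED - set(metadata)
        if missing:
            raise ValueError(f"missing metadata: {sorted(missing)}")
        dataset = Dataset(list(files), dict(metadata))
        self.datasets.append(dataset)
        return dataset

    def approve(self, dataset: Dataset) -> None:
        # Step 2: a curator reviews and approves; only then is it published.
        dataset.status = "published"

    def search(self, term: str) -> list:
        # Step 3: peers discover published data sets by metadata text.
        term = term.lower()
        return [d for d in self.datasets
                if d.status == "published"
                and any(term in str(v).lower() for v in d.metadata.values())]
```

A submitted data set stays invisible to `search` until `approve` is called, which mirrors the curator gate in step 2.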

Page 7

Globus Genomics: Galaxy-based workflow management system

Data Management: Globus provides a high-performance, fault-tolerant, secure file transfer service between all data endpoints (diagram labels: Sequencing Centers, Public Data Storage, Local Cluster/Cloud, Seq Center, Research Lab).

Data Analysis: Globus Genomics on Amazon EC2 (diagram labels: Galaxy Data Libraries, Fastq, Ref Genome, Alignment, Variant Calling, Picard, GATK).

• Analytical tools are automatically run on the scalable compute resources when possible
• Globus integrated within Galaxy
• Web-based UI
• Drag-and-drop workflow creation
• Easily modify workflows with new tools
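The workflow idea on this slide can be sketched as a dependency graph run in topological order. The step names come from the slide's diagram labels. Everything else is an assumption: the edges between steps, the `run` helper, and the placeholder tools, which only build strings and do not call Galaxy, Picard, or GATK.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each step maps to the steps or raw inputs it depends on (assumed edges).
WORKFLOW = {
    "alignment": {"fastq", "ref_genome"},
    "variant_calling": {"alignment", "ref_genome"},
}


def run(workflow, inputs, tools):
    """Run every step once all of its dependencies have produced a result."""
    results = dict(inputs)
    for step in TopologicalSorter(workflow).static_order():
        if step in results:  # raw input, nothing to compute
            continue
        # Arguments are passed in alphabetical order of dependency name.
        args = [results[dep] for dep in sorted(workflow[step])]
        results[step] = tools[step](*args)
    return results


# Placeholder tools: they only format strings to show the data flow.
TOOLS = {
    "alignment": lambda fastq, ref: f"aligned({fastq},{ref})",
    "variant_calling": lambda aligned, ref: f"variants({aligned},{ref})",
}

results = run(WORKFLOW, {"fastq": "sample.fastq", "ref_genome": "ref.fa"}, TOOLS)
```

In this representation, modifying a workflow with a new tool means adding one entry to `WORKFLOW` and one to `TOOLS`; the execution order is recomputed from the graph.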

Page 8

It’s about the user experience…

…for your photos

…for your e-mail

…for your entertainment

…for your research data


Page 9

Globus Platform-as-a-Service

Identity, Group, and

Profile Management

Globus Toolkit

Globus APIs

Globus Connect

Data Publication & Discovery

File Sharing

File Transfer & Replication


Page 10

How would it look if GeneLab were powered by Globus Services?

What are the benefits? What are some of the open questions?

Page 11

Managing the research data lifecycle with Globus

1. PI initiates transfer request; or requested automatically by script, science gateway
2. Globus transfers files reliably, securely
3. PI selects files to share, selects user or group, and sets access permissions
4. Globus controls access to shared files on existing storage; no need to move files to cloud storage!
5. Researcher logs in to Globus and accesses shared files; no local account required; download via Globus
6. Researcher assembles data set; describes it using metadata (Dublin Core and domain-specific)
7. Curator reviews and approves; data set published on campus or other system
8. Peers, collaborators search and discover datasets; transfer and share using Globus

• SaaS: only a web browser required
• Access using your campus credentials
• Globus monitors and informs throughout

Diagram labels: Sequencer, Globus Genomics, Publication Repository, Personal Computer
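The slide mentions transfers that are requested automatically by a script and carried out reliably, with the user informed throughout. One common way to get that behavior is retrying transient failures with exponential backoff. The sketch below assumes that technique for illustration; the slides do not say how Globus achieves reliability. `transfer_with_retry`, `TransientError`, and `make_flaky_sender` are hypothetical names.

```python
import time


class TransientError(Exception):
    """A failure worth retrying, such as a dropped connection."""


def transfer_with_retry(send, files, max_attempts=4, base_delay=0.01, notify=print):
    """Send each file, retrying transient failures with exponential backoff."""
    done = []
    for name in files:
        for attempt in range(1, max_attempts + 1):
            try:
                send(name)
                done.append(name)
                break
            except TransientError as err:
                if attempt == max_attempts:
                    notify(f"FAILED {name} after {attempt} attempts: {err}")
                    raise
                notify(f"retrying {name} (attempt {attempt} failed: {err})")
                # Wait base_delay, then 2x, then 4x, ... before the next attempt.
                time.sleep(base_delay * 2 ** (attempt - 1))
    notify(f"SUCCEEDED: {len(done)} file(s) transferred")
    return done


def make_flaky_sender(failures_per_file):
    """Hypothetical sender that fails a fixed number of times per file."""
    remaining = {}

    def send(name):
        left = remaining.setdefault(name, failures_per_file)
        if left > 0:
            remaining[name] = left - 1
            raise TransientError("connection reset")

    return send
```

A script could call `transfer_with_retry(make_flaky_sender(2), ["a.fastq", "b.fastq"])` to see two retries per file followed by a success message; a real sender would replace the flaky stand-in.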

Page 12

Some Examples

Page 13

Services using Globus Platform

Page 14

Key Issues

• Lowering the barrier to entry and democratizing research

• Reproducibility/Transparency/Accessibility

• Benefits of Crowd Sourcing – GalaxyZoo

• Sustainability – TCO

• Ease of Use (UX)

• API

Page 15

Our work is supported by:

U.S. DEPARTMENT OF ENERGY

Page 16

Thank you!

@madduri