M.Goldberg/NSF OSG Sep 18, 2012 The Open Science Grid 1


The Open Science Grid


What is OSG?

✦ OSG is a Consortium
  ★ Resource owners and campuses, scientists and research users, computer scientists and software providers, national and international partners
✦ OSG is a Project
  ★ Provides a fabric of services across contributed resources
  ★ Funded for 5 years (2012-2016) after an initial 6-year run
✦ OSG is an Ecosystem
  ★ Provides a framework for exploring ways of scientific discovery through the use of distributed high throughput computing
  ★ Domain and computer scientists collaborating for more than a decade
  ★ Contributing to the state of the art through innovation and collaboration


OSG in Numbers

✦ Delivers up to 2 Million CPU hours every day!
  ★ about 60% go to LHC, 20% to other physics, 20% to many other sciences
✦ Has a footprint on over 100 campuses and labs in the US
✦ Transfers ~1 PetaByte of data every day
✦ Supports an active community of 20+ multi-disciplinary research groups
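As a rough illustration of what these figures imply, the sketch below splits the quoted peak of 2 million CPU hours per day by the approximate 60/20/20 shares and converts the daily total into an equivalent number of continuously busy cores. The variable names are illustrative, and the percentages are the rounded values from the slide, not accounting data.

```python
# Rounded figures quoted on the slide (peak values, not measured accounting data).
CPU_HOURS_PER_DAY = 2_000_000
SHARES_PERCENT = {"LHC": 60, "other physics": 20, "other sciences": 20}

# CPU hours per day for each group, using integer arithmetic.
hours_by_group = {
    group: CPU_HOURS_PER_DAY * pct // 100
    for group, pct in SHARES_PERCENT.items()
}

# Number of cores that would have to run around the clock
# to deliver the daily total (24 CPU hours per core per day).
equivalent_busy_cores = CPU_HOURS_PER_DAY / 24

for group, hours in hours_by_group.items():
    print(f"{group}: {hours:,} CPU hours/day")
print(f"Equivalent continuously busy cores: {equivalent_busy_cores:,.0f}")
```

Under these assumptions, 2 million CPU hours per day corresponds to roughly 83,000 cores kept busy full time.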


Providing Computational Resources to Science

✦ A factor of ~5 increase in usage over the past 4 years
✦ Besides the huge momentum and success of the LHC and its Tier-2s*, a broad spectrum of other science applications is now also using the OSG
  ★ Each color on the plot below is a different user community
  ★ Both LHC and non-LHC use expanded vastly

[Plot: OSG usage over time, one color per user community; labels on the plot include CMS, ATLAS and “Higgs!”]

* LHC computing US Tier-2s are universities to which data is distributed from CERN through BNL and Fermilab.


OSG is much more than the LHC

✦ ~20% of resources overall used by communities outside HEP
  ★ individual researchers use both local and remote resources transparently
✦ mix of applications: biology, climate, astro/gravitational/nuclear physics etc.
  ★ Pilot simulation runs from Dark Energy Survey and LSST have shown promise
✦ Protein processing programs are becoming a large user
  ★ Several different protein processing applications supported
  ★ Support for “popular” biology portals being developed
✦ From astronomy applications to “Tyler Churchill’s cochlear implant modeling”…!

More “science examples” available on request!


Globally Leveraging Local Investments in Resources


✦ Users submit jobs locally and have global access to the complete set of remote processing and storage
✦ Resource owners retain control and use of their own investments
✦ Resources funded from many different sources (many non-MPS or non-NSF) are shared and accessible to NSF PIs
✦ With more users joining, more sites make their resources available for all OSG users to use “opportunistically”


OSG is a Consortium

✦ All manner of contributors are members of the Consortium
  ★ Owners of clusters and storage e.g. LHC Tier-2s, Clemson University, …
  ★ Scientists and research users of the services e.g. Structural Biology Grid at Harvard Medical School
  ★ Software developer projects e.g. Condor Project, Globus
  ★ National and international partners e.g. São Paulo, Brazil
  ★ Campus organizations e.g. University of Nebraska-Lincoln
✦ The Council (Board) represents the Consortium members and the entire OSG “planetary system”


OSG is a Project

✦ Project extended by 5 years (2012-2016) after initial 6-year run
  ★ Very strong support across DOE (OHEP, NP) and NSF (MPS, OCI)
  ★ PI: Miron Livny, computer scientist, University of Wisconsin-Madison; Executive Director: Lothar Bauerdick, physicist, Fermilab; together with a 5-person Executive Team
✦ Provides a Fabric of Services:
  ★ Production – including Operations, Security, and Campus Infrastructure
  ★ Consulting – including technologies, architectures and user support
  ★ Software – including packaging, system testing, patching
✦ Federates with Peer Infrastructures:
  ★ Campus infrastructures e.g. University of Nebraska
  ★ In Europe – European Grid Infrastructure, national infrastructures in UK, France, Italy, Germany etc.
  ★ Worldwide – including partnerships in South America, Asia and Africa


OSG is an Ecosystem

✦ Together, OSG provides more than the sum of its parts:
✦ OSG Consortium
  ★ sites/resource providers, science communities, stakeholders
✦ OSG Project
  ★ staff, deliverables, operations
✦ Satellite Projects
  ★ Extensions, loosely coupled, flexible, responsive to funding opportunities


Distributed High-Throughput Computing

✦ The focus is on Distributed High Throughput Computing (DHTC)
  ★ Jobs and data are distributed across large numbers of sites
  ★ Also use of multi-core computing, using all cores on a single CPU box
✦ Shared use of collaborative, grid, cloud infrastructures to run jobs and to store data on
  ★ Department servers or university computing centers across the campus
  ★ Dedicated farms of worker nodes or faculty clusters
  ★ Commercial or scientific cloud compute nodes
✦ Also supports grassroots research and science communities
  ★ to solve a computation problem, a researcher can use dedicated or opportunistic resources, even extend into commercial cloud resources...
  ★ ...without having to change their interface, code or workflow environment


Integrating Cloud Computing

✦ Basic technologies are in place today to support use of cloud resources
  ★ Cloud computing can be accessed from Grids, just like other resources
  ★ OSG applications, like the LHC, successfully interface to and use Amazon cloud resources, using technologies developed in the OSG Ecosystem
    ✦ E.g. CMS can run its Monte Carlo chain on the Amazon EC2 cloud
✦ But there are still issues that need to be resolved:
  ★ E.g. what policies govern the decision to go to (commercial) clouds?
  ★ How to procure these resources in terms of administrative procedures?
  ★ Also transparent Identity Management needs work
✦ R&D into transparent resource provisioning, “cloud bursting”
  ★ Started discussion with XSEDE for joint technical work in this area
  ★ Research on resource provisioning also as part of “dV/DT” project


Data on OSG

✦ OSG communities are leaders in “Big Data”
  ★ LHC successfully extracting science from and managing tens of PetaBytes
  ★ Expect to grow to ~0.3 ExaByte of active data on OSG during this decade
✦ One successful approach: allow access to data across the wide area
  ★ NSF-funded “Any Data Anytime Anywhere” project, enabling Tier-3s
  ★ high-throughput access to PetaByte data stores from tens of sites
✦ Another example for extending the Ecosystem: DASPOS
  ★ Proposal led by Notre Dame to investigate data preservation issues
  ★ Initial technical, sociological, and policy work towards curating knowledge
  ★ Goal: repeat of analysis possible using only the preserved data, software
✦ There are many communities with growing data problems
  ★ Often crossing thresholds of scale that require novel approaches
  ★ OSG and its Ecosystem are offering solutions at several scales
  ★ “dV/DT” will address open issues in managing opportunistic storage


A Framework for Innovation

✦ Real-world problems inform and drive computer science research
  ★ This way, OSG is a perfect “test bed” for Computer Science research
  ★ OSG released a document about CS needs towards 2020, available here
  ★ Extend the Services offered, integrating fully into end-to-end solutions
✦ OSG Ecosystem enables interplay between domain needs and CS
  ★ e.g. CS PIs proposed the “dV/DT” project addressing some of these needs, including the next steps for data; it was recently awarded a grant by DOE
✦ Networking:
  ★ perfSONAR-based network monitoring system between sites
    ✦ provides unprecedented insights into connectivity and needs of science community
    ✦ initiated between the LHC sites, now broadened across OSG
  ★ A wealth of information can be harvested, correlated, etc.
    ✦ integration of sites, user communities and applications allows novel approaches
    ✦ For both research and to improve operational robustness
  ★ hugely interesting to networking providers, including Internet2, ESnet etc.


A Framework for Sustaining Support

✦ OSG commits to support software needed by the users even if the original supplier and supporter move on (a “home for orphans”).
✦ Example: Identity Management
  ★ Taking on responsibility to provide certificate services to the broader U.S. science community when ESnet stops this service
  ★ moving provisioning to commercial sector by contracting with vendor
    ✦ Innovation required to understand/manage dependency on commercial entity
  ★ continue work towards Identity Management solutions across sites


Extending OSG Across the Campuses

✦ Enable sharing of campus-local computational resources
  ★ to increase availability of resources for researchers and groups
✦ Currently ~8 campus infrastructures
  ★ Existing single-faculty resources are “beachheads” for their campus

✦ OSG working with Computer Scientists to provide software that will help users create larger, more inclusive Campus Grids

✦ a number of challenges are to be addressed, like federated Identity Management


OSG and XSEDE

✦ Partnership between OSG and XSEDE
  ★ allows XSEDE users access to the OSG HTC resources
✦ OSG became an XSEDE Service Provider
  ★ OSG makes 2M hours available per quarter to the XRAC, the XSEDE Resource Allocation Committee
  ★ Since April provides HTC resources through an XSEDE interface
    ✦ 26 users have allocations on OSG for 3.9M hours
    ✦ usage is steadily increasing: 135,000 hours in last 30 days
✦ Growing partnership between OSG and XSEDE teams
  ★ OSG collaborating with XSEDE to provide user support
  ★ Also very productive collaboration in the security area
  ★ OSG introduced DHTC principles and OSG at Campus Champions meeting, HTC tutorial and OSG software talk at XSEDE'12 conference
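To put the XSEDE figures above in proportion, here is a small back-of-the-envelope sketch. It assumes a quarter of about 91 days and a constant usage rate equal to the last 30 days; both are illustrative assumptions, and since the slide says usage is increasing, the projection is likely on the low side.

```python
# Figures quoted on the slide.
HOURS_OFFERED_PER_QUARTER = 2_000_000
ALLOCATED_HOURS = 3_900_000
USERS_WITH_ALLOCATIONS = 26
HOURS_USED_LAST_30_DAYS = 135_000

# Assumption (not from the slide): a quarter is roughly 91 days.
DAYS_PER_QUARTER = 91

# Average allocation size per user.
avg_allocation = ALLOCATED_HOURS / USERS_WITH_ALLOCATIONS

# Recent daily usage, projected over one quarter at a constant rate.
daily_usage = HOURS_USED_LAST_30_DAYS / 30
projected_quarter_usage = daily_usage * DAYS_PER_QUARTER

# Share of the quarterly offer that this projection would consume.
fraction_of_offer = projected_quarter_usage / HOURS_OFFERED_PER_QUARTER

print(f"Average allocation: {avg_allocation:,.0f} hours per user")
print(f"Projected usage per quarter: {projected_quarter_usage:,.0f} hours")
print(f"Share of quarterly offer: {fraction_of_offer:.1%}")
```

At the recent rate, usage would amount to about a fifth of the hours offered per quarter.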


How to benefit from OSG


✦ Register as an individual researcher through the Campus Researcher Club (a new OSG initiative)

✦ Work with technical experts on the details of how your Community (one already working together, e.g. a Campus or a Science collaboration) can use existing services and software, or adapt them, with mutual benefit.

✦ Identify new methods, technologies, services and activities of mutual benefit and propose a related Satellite project.


Backup Slides


OSG Goals

✦ Goals of the Open Science Grid: transform data-intensive science through a cross-domain, self-managed, nationally distributed cyberinfrastructure that brings together community resources and enables effective computational resource sharing at the academic and research campuses.
  ★ Provide a framework for sharing of distributed computing and storage resources (all resources are contributed, not owned by OSG)
  ★ Provision a set of services and methods that enable better access to ever increasing computing resources for researchers and communities
  ★ Support principles and software that enable distributed high throughput computing (DHTC) for users and communities at all scales.
  ★ Represent and promote these concepts and technology with other international partners on behalf of US science, research and scholarship.

 http://www.opensciencegrid.org/


A quick introduction to the OSG Team

✦ The Open Science Grid leadership and core team of domain and computer scientists has been working together for more than 12 years towards their vision to support scientists and researchers in achieving discovery through the use of distributed high throughput computing services.
  ★ They are Open to any researcher, any computing resource owner, any software provider.
  ★ Their goal is to benefit and enable Science (scientific scholarship) of any kind that can effectively make use of their services and experience.
  ★ They equate Grid to distributed high throughput computing of any kind, whether it be on commodity clusters, commercial Clouds, provided by high performance specialized systems etc.


Funding Slide

✦ OSG Project funded by multiple program offices at NSF and DOE starting mid-2012 (April for DOE, June for NSF)

✦ Project will be reviewed after 3 years before final 2 years.

Program Office    Funds/Year
NSF OCI           $1,000k
NSF MPS           $2,750k
DOE OHEP          $1,600k
DOE NP            $50k
Total             $5,400k

✦ Consortium members provide resources: computing, software (all s/w open source) and personnel.
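As a quick cross-check of the funding table above, the sketch below sums the per-office amounts and groups them by agency. The dictionary layout and variable names are illustrative; the amounts are the yearly figures from the slide, in thousands of dollars.

```python
# Yearly funding per program office, in thousands of dollars (from the slide).
FUNDS_PER_YEAR_K = {
    "NSF OCI": 1000,
    "NSF MPS": 2750,
    "DOE OHEP": 1600,
    "DOE NP": 50,
}

# Overall yearly total.
total_k = sum(FUNDS_PER_YEAR_K.values())

# Totals per agency; the agency is the first word of each office name.
agency_totals_k = {}
for office, amount_k in FUNDS_PER_YEAR_K.items():
    agency = office.split()[0]
    agency_totals_k[agency] = agency_totals_k.get(agency, 0) + amount_k

print(f"Total: ${total_k:,}k per year")
for agency, amount_k in agency_totals_k.items():
    print(f"{agency}: ${amount_k:,}k ({amount_k / total_k:.1%})")
```

The per-office amounts add up to the $5,400k total shown in the table, with roughly 69% coming from NSF and 31% from DOE.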