
Page 1: End to End Computing at ORNL

Presented by

End to End Computing at ORNL

Scott A. Klasky

Computing and Computational Science Directorate
Center for Computational Sciences

Page 2: End to End Computing at ORNL

2 Klasky_E2E_0611

Petascale data workspace
Massively parallel simulation

Page 3: End to End Computing at ORNL


Impact areas of R&D

• Asynchronous I/O using data-in-transit (N×M) techniques (M. Parashar, Rutgers; M. Wolf, GT)

• Workflow automation (SDM Center; A. Shoshani, LBNL; N. Podhorszki, UC Davis)

• Dashboard front end (R. Barreto, ORNL; M. Vouk, NCSU)

• High-performance metadata-rich I/O (C. Jin, ORNL; M. Parashar, Rutgers)

• Logistical networking integrated into data analysis/visualization (M. Beck, UTK)

• CAFÉ common data model (L. Pouchard, ORNL; M. Vouk, NCSU)

• Real-time monitoring of simulations (S. Ethier, PPPL; J. Cummings, Caltech; Z. Lin, UC Irvine; S. Ku, NYU): visualization in the real-time monitor; real-time data analysis

Page 4: End to End Computing at ORNL


Hardware architecture

[Architecture diagram; labels only: Jaguar compute nodes; Jaguar I/O nodes; Ewok; Lustre; Lustre2; Ewok-web; NFS DB; Ewok-sql; HPSS; login nodes; Seaborg; simulation control; job control; restart files; 40 G/s, sockets later]

Page 5: End to End Computing at ORNL


Asynchronous petascale I/O for data in transit

• High-performance I/O: asynchronous, managed buffers, respects firewall constraints

• Enables dynamic control with flexible M×N operations

• Transforms data using the shared-space framework (Seine)

[Software stack diagram: user applications and other programming paradigms sit on the Seine coupling framework interface; beneath it are shared-space management and load balancing, a directory layer and storage layer, the communication layer (buffer management), and the operating system]
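The M×N redistribution that Seine performs can be pictured as repartitioning one decomposition of the data across a different number of processes. The sketch below is a toy, single-process illustration of that idea, not the Seine API; all names are my own:

```python
# Toy N-to-M repartitioning, illustrating the kind of MxN
# redistribution the Seine shared-space framework performs
# between coupled parallel programs. Illustrative only.

def repartition(sender_blocks, m):
    """Concatenate N sender blocks and re-split them into M
    receiver blocks of near-equal size."""
    flat = [x for block in sender_blocks for x in block]
    q, r = divmod(len(flat), m)
    out, start = [], 0
    for i in range(m):
        size = q + (1 if i < r else 0)   # spread the remainder evenly
        out.append(flat[start:start + size])
        start += size
    return out

# N = 3 simulation writers, M = 2 analysis readers
blocks = [[0, 1, 2], [3, 4], [5, 6, 7, 8]]
print(repartition(blocks, 2))   # -> [[0, 1, 2, 3, 4], [5, 6, 7, 8]]
```

In the real system the "blocks" live on different processors and the shared-space framework handles placement and buffering; the arithmetic for splitting the global index range is the same.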

Page 6: End to End Computing at ORNL


Workflow automation

Automate the data-processing pipeline using the Kepler workflow system: transfer of simulation output to the e2e system, execution of conversion routines, image creation, and archiving.

• Requirements for petascale computing:

– Easy to use
– Dashboard front end
– Autonomic
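The pipeline stages above can be sketched as a chain of plain functions. In production this is a Kepler workflow, not Python; every function and path name below is an illustrative assumption:

```python
# Minimal sketch of the automated pipeline: transfer -> convert ->
# image -> archive. All names are illustrative, not the real tools.

def transfer(output):    # move simulation output to the e2e system
    return f"e2e:{output}"

def convert(path):       # run conversion routines (e.g. to HDF5)
    return path.replace(".bin", ".h5")

def make_image(path):    # create images for the dashboard
    return path.replace(".h5", ".png")

def archive(path):       # archive the converted data (e.g. to HPSS)
    return f"archived:{path}"

def pipeline(output):
    staged = transfer(output)
    converted = convert(staged)
    make_image(converted)        # side product for the front end
    return archive(converted)

print(pipeline("gtc.0001.bin"))  # -> archived:e2e:gtc.0001.h5
```

A workflow engine such as Kepler adds what this sketch lacks: triggering on new files, retries, and autonomic recovery when a stage fails.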

Page 7: End to End Computing at ORNL


Dashboard front end

• Desktop interface uses Asynchronous JavaScript and XML (AJAX); runs in a web browser

• AJAX is a combination of technologies coming together in powerful ways (XHTML, CSS, DOM, XML, XSLT, XMLHttpRequest, and JavaScript)

• The user's interaction with the application happens asynchronously, independent of communication with the server

• Users can rearrange the page without clearing it

Page 8: End to End Computing at ORNL


High-performance metadata-rich I/O

A two-step process produces the files:

Step 1: Write out binary data plus tags using parallel I/O on the XT3 (may or may not use files; could use asynchronous I/O methods). The tags contain the metadata that is placed inside the files. The workflow transfers this information to Ewok (IB cluster with 160P).

Step 2: A service on Ewok decodes the files into HDF5 files and places the metadata into an XML file (one XML file for all of the data).

Cuts I/O overhead in GTC from 25% to <3%
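The tagged-binary idea in Step 1 and the decode service in Step 2 can be sketched with the standard library. The tag layout here is my own assumption, not the real GTC format, and a real service would emit HDF5 files alongside the XML catalog:

```python
# Sketch of the two-step scheme: step 1 appends raw binary data with
# lightweight tags; step 2 walks the tags and builds one XML metadata
# catalog. Illustrative layout only, not the real GTC/Ewok format.
import struct
import xml.etree.ElementTree as ET

def write_tagged(buf, name, values):
    """Step 1: append a (name, count) tag followed by raw doubles."""
    tag = name.encode()
    buf += struct.pack("<I", len(tag)) + tag
    buf += struct.pack("<I", len(values))
    buf += struct.pack(f"<{len(values)}d", *values)
    return buf

def decode_to_xml(buf):
    """Step 2: walk the tags and build one XML metadata file,
    skipping over the raw payloads (which would go to HDF5)."""
    root, off = ET.Element("metadata"), 0
    while off < len(buf):
        (nlen,) = struct.unpack_from("<I", buf, off); off += 4
        name = buf[off:off + nlen].decode(); off += nlen
        (count,) = struct.unpack_from("<I", buf, off); off += 4
        off += 8 * count                  # skip the raw data payload
        ET.SubElement(root, "variable", name=name, count=str(count))
    return ET.tostring(root).decode()

buf = write_tagged(bytearray(), "potential", [1.0, 2.0, 3.0])
print(decode_to_xml(buf))
```

The performance win comes from Step 1 doing only cheap sequential writes on the compute nodes, deferring the expensive HDF5/XML structuring to the Ewok service.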

Page 9: End to End Computing at ORNL


Logistical networking: high-performance, ubiquitous, and transparent data access over the WAN

[Network diagram; labels only: Ewok cluster; depots; Jaguar (Cray XT3); NYU; PPPL; UCI; MIT; Portals; directory server]

Page 10: End to End Computing at ORNL


CAFÉ common data model

• Applied to combustion (S3D), astrophysics (Chimera), and fusion (GTC/XGC) environments

• Stores and organizes four types of information about a given run: provenance, operational profile, hardware mapping, and analysis metadata

• A scientist can seamlessly find the input and output variables of a given run, and the unit, average, min, and max values for a variable

[Diagram: Provenance CAFÉ model]
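The four information types above can be sketched as one per-run record. This is only a hedged illustration of the idea; the field names are my own assumptions and the real CAFÉ model is far richer than a flat structure:

```python
# Hedged sketch of the four information types the CAFE common data
# model organizes per run. Field names are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class CafeRunRecord:
    provenance: dict = field(default_factory=dict)          # code version, inputs
    operational_profile: dict = field(default_factory=dict) # timings, I/O stats
    hardware_mapping: dict = field(default_factory=dict)    # nodes, layout
    analysis_metadata: dict = field(default_factory=dict)   # units, min/max

run = CafeRunRecord()
run.analysis_metadata["potential"] = {"unit": "V", "min": -0.2, "max": 0.9}

# A scientist can now look up units and ranges per variable:
print(run.analysis_metadata["potential"]["unit"])   # -> V
```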

Page 11: End to End Computing at ORNL


Real-time monitoring of simulations

• Using end-to-end technology, we have created a monitoring technique for fusion codes (XGC, GTC)

• Kepler is used to automate the steps; the data-movement task will use the Rutgers/GT data-in-transit routines

• Metadata are placed during the movement from the XT3 to the IB cluster; data go from N to M processors using the Seine framework

• Data are archived into the High-Performance Storage System; metadata are placed in a database

• Visualization and analysis services produce data/information that are placed on the AJAX front end

• Data are replicated from Ewok to other sites using logistical networking

• Users access the files from our metadata server at ORNL

Why monitor in real time?

• Scientists need tools that let them see and manipulate their simulation data quickly

• Archiving data, staging it to secondary or tertiary computing resources for annotation and visualization, staging it yet again to a web portal: these work, but we can accelerate the rate of insight for some scientists by allowing them to observe data during a run

• The per-hour facilities cost of running a leadership-class machine is staggering

• Computational scientists should have tools that allow them to constantly observe and adjust their runs during their scheduled time slice, as astronomers at an observatory or physicists at a beam source can

Page 12: End to End Computing at ORNL


Real-time monitoring

Typical monitoring
• Look at volume-averaged quantities
• At four key times, this quantity looks good
• The code had one error that didn't appear in the typical ASCII output used to generate this graph
• Typically, users run gnuplot/grace to monitor output

More advanced monitoring
• 5 seconds to move 300 MB and process the data
• Need to use FFT for 3-D data, then process data + particles
– 50 seconds (10 time steps) to move and process data
– 8 GB for 1/100 of the 30 billion particles
• Demands low overhead, <3%!
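The quoted monitoring numbers are easy to sanity-check. This is illustrative arithmetic from the figures on the slide, not measured code:

```python
# Sanity check of the monitoring numbers quoted above: 300 MB moved
# and processed in 5 s, and 8 GB for 1/100 of 30 billion particles.
mb_per_s = 300 / 5                    # sustained rate for the 2-D data
particles = 30e9 / 100                # monitored subset: 3e8 particles
bytes_per_particle = 8e9 / particles  # ~26.7 bytes per particle

print(mb_per_s)                        # -> 60.0 MB/s
print(round(bytes_per_particle, 1))    # -> 26.7
```

At roughly 27 bytes per particle (a few double-precision coordinates plus weight), even a 1% sample of the particle population dominates the data volume, which is why the <3% overhead budget is demanding.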

Page 13: End to End Computing at ORNL


Contact

Scott A. Klasky
Lead, End-to-End Solutions
Center for Computational Sciences
(865) [email protected]
