Exploiting Web 2.0 for Scientific Simulation

Gabrielle Allen
Department of Computer Science
Center for Computation & Technology
Louisiana State University

Allen, Loeffler, Radke, Schnetter, Seidel, Integrating Web 2.0 Technologies with Scientific Simulation Codes for Real-Time Collaboration, IEEE International Conference on Cluster Computing (Cluster 2009), Workshop on The Impact and Influence of Web 2.0 on e-Research Infrastructure, Services and Applications.



Gravitational Wave Physics

Observations

Models

Analysis & Insight

Petascale problems: Full 3D general relativistic models of binary systems, supernova, gamma-ray bursts

Understanding Gravity

[Figure: Log(Data) plotted against time, with the years 1600, 1970, 1995, 2000, 2009 and the labels Galileo, Smarr, LSU, PRAC. Data and collaboration are both increasing.]

Cactus Framework

• Component-based HPC framework:
– Freely available, modular, portable, manageable environment for collaboratively developing parallel, multi-dimensional simulation codes

• Enabling applications:
– Numerical Relativity/Astrophysics, CFD, Coastal, Reservoir Engineering, Quantum Gravity, …

• Finite difference, AMR, FE/FV, multipatch, …

• Cutting edge CS:
– Grid computing, petascale, accelerators, steering, remote viz, application driver

• Active user & developer communities:
– 12 year pedigree, led from LSU. Over $10M support: NSF, EU, DOD, DOE, NASA, Microsoft, MPG, LSU, NCSA.

Cactus Structure

Core “Flesh” (ANSI C):
• parameters
• grid variables
• error handling
• scheduling
• extensible APIs
• make system

Plug-In “Thorns” (components; Fortran/C/C++):
• driver
• input/output
• interpolation
• SOR solver
• coordinates
• boundary conditions
• black holes
• equations of state
• remote steering
• wave evolvers
• multigrid

Your Physics !!

Computational Tools !!

Cactus Structure with Web 2.0

[Figure: the Flesh and Thorns diagram repeated, with “Web 2.0” added.]

Cactus Application Environment

Adaptive mesh refinement, parallel I/O, interaction, …

Flesh: APIs, information, orchestration

Domain specific shared infrastructure

Individual research groups

Einstein Toolkit

http://www.einsteintoolkit.org

• Based on Cactus Framework

• Over 130 open, community developed Cactus modules

• Building a consortium of users
– Governance and software development

• Members
– 40 listed on web page
– 10 different groups
– US, Japan, Mexico, Spain, Germany, Canada

• 300 science publications, 50 student theses

Typical Black Hole Simulations

At LSU …

• 300 Cactus thorns
• 10,000 potential parameters
• 20 different supercomputers
• 100-2000 cores
• Days/weeks to run (checkpoint/restart)
• GBs to TBs of data (HDF5, ASCII, jpeg)

Collaborative Technologies

• Technologies to share simulation-related information developed in our group from the early 1990s
– Essential to support the scientific research

• Review historical evolution of these technologies

• Show how Web 2.0 provides new tools to enable old scenarios

Web-based Mail Lists

• Mosaic web browser (1993, NCSA)
– Seidel’s group at NCSA worry about content
– http://archive.ncsa.illinois.edu/Cyberia/NumRel/GravWaves.html (1995)

• Collaborative Cork Board (CoCoBoard) (Mid 90’s)
– Researchers have web-based “project pages”
– Could attach images!! (usually 1-D plots of results)
– Used till late 90’s

• Currently
– Project based private wikis: parameter/output files, figures
– Organize material for weekly project conference calls
– Cons: network to access/edit wiki, editing slow

CoCoBoard

Simulation Web Interfaces

• Thorn “httpd”
– First collaborative tool fundamentally integrated into Cactus
– Werner Benger (1999), visiting NCSA from Germany (7 hr time difference and email)
– Used socket library developed for remote viz (John Shalf & TIKSL project)

• Thorn “HTTPD” in standard toolkit (2000)
– Simulation status, variables, timing, viewport, output files, parameter steering, etc
– Thorns can include their own web content
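The idea of a web server embedded in the running simulation can be sketched in a few lines of Python. This is only an illustration and not the HTTPD thorn itself: the `STATE` dictionary, the `/status`, `/parameters` and `/steer` paths, and the port number are all assumptions made for the example, and there is no authentication.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical simulation state; a real thorn would query the framework instead.
STATE = {"iteration": 0, "time": 0.0, "parameters": {"out_every": 10}}

def route(path, state):
    """Map a request path to (HTTP status, JSON-serializable body)."""
    url = urlparse(path)
    if url.path == "/status":
        return 200, {"iteration": state["iteration"], "time": state["time"]}
    if url.path == "/parameters":
        return 200, dict(state["parameters"])
    if url.path == "/steer":
        query = parse_qs(url.query)
        name = query.get("name", [None])[0]
        value = query.get("value", [None])[0]
        if name not in state["parameters"] or value is None:
            return 400, {"error": "unknown parameter or missing value"}
        try:
            # Convert the new value to the type of the current value.
            new_value = type(state["parameters"][name])(value)
        except ValueError:
            return 400, {"error": "value has the wrong type"}
        state["parameters"][name] = new_value
        return 200, {name: new_value}
    return 404, {"error": "not found"}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = route(self.path, STATE)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

def serve(port=5555):
    """Serve on localhost only; the port is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Keeping the routing in a plain function (`route`) separates the steering logic from the socket handling, so it can be exercised without opening a port.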

Issues

• Authorization to web pages (username/password in parameter file is insecure and awkward; newer version uses https and can also use X.509)

• Browsers can display images in certain formats; a Visualization thorn uses gnuplot to include e.g. performance with time, physical parameters

• Problem deploying on compute nodes where web server cannot be directly accessed (port forwarding, firewalls)

• How to find and track the simulations, publicize existence to a collaboration?

Cactus HTTPD Simulation Page

Cactus HTTPD Viewport

Simulation Reports and Email

• Readable report automatically generated for each simulation (computation and physics)
– Prototyped 2001 but not used (?)

• How to collect reports in one place?

• Mail Thorn (sendmail)
– Email reliable and fault tolerant (spool)
– Supercomputers do not allow mail to be sent from compute nodes.

GridLab Visualization Service

Brygg Ullmer (2004)

Announcing and Grid Portals

• Collaborations need reliable, live information about long running simulations.
– NSF Astrophysics Simulation Collaboratory (ASC), 1999
– Grid Portal provided centralized, collaborative interface to submit, monitor and archive simulations
– Java, JSP, JavaScript with back-end database, contributed to GridSphere design (GridLab)
– JavaCOG to submit jobs and basic monitoring.

ASC Portal (2002)

Announcing Simulation Info

• Publish (application provided) simulation information

• Thorn Announce, in prototype Cactus Worm scenario (2001)
– Message from Flesh/Thorn info
– Transport: XML-RPC to remote socket (portal)

• Issues
– Job IDs
– Security, mapping users
– Cumbersome user set parameters (portal location, visibility of job, notification needs)
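As a rough illustration of the XML-RPC transport, the sketch below packs simulation information into an XML-RPC call using Python's standard `xmlrpc.client` module. The method name `announce`, the field names, and the portal URL argument are hypothetical; the slides do not give the actual message format used by Thorn Announce.

```python
import xmlrpc.client

def build_announcement(job_id, user, host, iteration, visibility="group"):
    """Serialize simulation info as an XML-RPC request body (no network I/O)."""
    info = {
        "job_id": job_id,          # hypothetical field names throughout
        "user": user,
        "host": host,
        "iteration": iteration,
        "visibility": visibility,  # e.g. who may see the job on the portal
    }
    # 'announce' is an assumed method name, not one taken from the slides.
    return xmlrpc.client.dumps((info,), methodname="announce")

def send_announcement(portal_url, info):
    """Send the same call to a portal; requires a server exposing 'announce'."""
    proxy = xmlrpc.client.ServerProxy(portal_url)
    return proxy.announce(info)
```

The issues listed above show up directly in the arguments: the caller has to supply a job ID, a user mapping and a visibility setting, and the portal location has to be configured somewhere.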

Announcing to ASC Portal (2002)

Notification

• Portal notification service

• Portal users configure at portal, simulations configure in parameter file

• Email, SMS, Instant Message
– Initial experiments generated large telecom bills!

Cool and useful, but lots of work (FTE) to develop and modify portal service, difficult to configure.

Web 2.0 Technologies

• Use for collaborative, simulation-level messaging and information archiving
– Reliable, persistent, well-documented, user-configurable, cheap, well supported, good APIs

Twitter

• March 2006

• Real-time short messaging system. Users send and receive each other’s updates (tweets). Wide range of devices and rudimentary social networking.

• Receivers can filter messages they see and specify how they receive them

• Twitter API (e.g. post a new Twitter message from a user)

• Free

Thorn Twitter

• Uses libcurl

• Cactus parameters for Twitter username/password

• Twitter API: statuses/update

• At LSU “numrel” group account

• Messages when simulation starts and at different stages
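The thorn itself uses libcurl from C; the Python sketch below only illustrates the shape of such a request, building it without sending it. The endpoint URL is an assumption based on the `statuses/update` name above and the historical version 1 API, and username/password (HTTP Basic) authentication is no longer accepted by Twitter, so this would not work against the current service. The function names and the message layout are made up for the example.

```python
import base64
import urllib.parse
import urllib.request

# Assumed historical endpoint for statuses/update; not valid today.
UPDATE_URL = "https://api.twitter.com/1/statuses/update.json"
MAX_LEN = 140  # tweet length limit at the time

def format_status(sim_name, stage, iteration):
    """Compose a short status line, truncated to the tweet length limit."""
    text = f"{sim_name}: {stage} (iteration {iteration})"
    return text if len(text) <= MAX_LEN else text[:MAX_LEN - 3] + "..."

def build_update_request(username, password, status):
    """Build (but do not send) a POST request with HTTP Basic credentials."""
    data = urllib.parse.urlencode({"status": status}).encode("ascii")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    request = urllib.request.Request(UPDATE_URL, data=data, method="POST")
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request
```

A simulation would call `format_status` at startup and at later stages, then hand the request to `urllib.request.urlopen`; with a shared group account, every collaborator following that account sees the same stream.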

Flickr

• 2004, image hosting website for digital photographs (and now videos). Bought by Yahoo (2005).

• Professional account ($25/yr) for unlimited use

• Web service API for uploading and manipulating images
– Group images into Sets and Collections
– Tags, title, description, metadata from EXIF headers

• Social networking: users can comment on images, flag them, order by popularity, etc. Public/Private/Friends/Family. Blogs.

• RSS feed allows quick previewing.

Thorn Flickr

• Send images from running simulation

• Uses: flickcurl, libcurl, libxml2, openssl

• Authentication more complex (api key, shared secret)

• Thorn uploads images that are generated by Cactus (and known to I/O layer), e.g. IoJpeg

• Each simulation given its own Flickr set
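To show why authentication is more involved than for Twitter, here is a sketch of the request signing used by Flickr's original (pre-OAuth) authentication API: the shared secret is followed by the parameter names and values in alphabetical order of name, and the MD5 hash of that string is sent as `api_sig`. The thorn relies on flickcurl for this; the helper names below are made up, and the slides do not say which authentication flow the thorn uses.

```python
import hashlib

def sign_request(shared_secret, params):
    """Legacy Flickr signature: md5(secret + name1 value1 + name2 value2 ...)."""
    base = shared_secret + "".join(f"{name}{params[name]}" for name in sorted(params))
    return hashlib.md5(base.encode("utf-8")).hexdigest()

def signed_params(api_key, shared_secret, extra):
    """Return a copy of the call parameters with api_key and api_sig added."""
    params = {"api_key": api_key, **extra}
    params["api_sig"] = sign_request(shared_secret, params)
    return params
```

For example, with secret `SECRET` and parameters `foo=1`, `bar=2`, `baz=3`, the string that gets hashed is `SECRETbar2baz3foo1`.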

Future Work

• Extend capabilities, production testing

• Common authentication mechanism

• Social networking model (individual/shared accounts)

• Development of common tags, more metadata etc

• Storing videos (Flickr, YouTube, Vimeo)
– Advantage for scientists presenting

• Lots of other possibilities: Dropbox to publish files across a collaboration, WordPress for simulation reports/blogs, Facebook to replace grid portals and aggregate services, Cloud computing APIs for “grid” scenarios, …

Einstein Toolkit

• Trying to establish a community for computational relativity:
– Wiki for community documentation
– Blog for community posting
– www.einsteintoolkit.org

Conclusions

• Started as a fun project (undergrad)

• Web 2.0 provides reliable delivery, storage, access, and flexible collaborative features

• Can use Web 2.0 to easily prototype new interactive and collaborative scenarios (have really missed this)
– Small groups and individuals can do this too!!

• Target standard of ease-of-use for cyberinfrastructure development

• For real use need unified authentication, clear policies on data, site versions