Exploiting Web 2.0 for Scientific Simulation Gabrielle Allen Department of Computer Science Center for Computation & Technology Louisiana State University Allen, Loeffler, Radke, Schnetter, Seidel, Integrating Web 2.0 Technologies with Scientific Simulation Codes for Real-Time Collaboration , IEEE International Conference on Cluster Computing (Cluster 2009), Workshop on The Impact and Influence of Web 2.0 on e-Research Infrastructure, Services and Applications.
1. Exploiting Web 2.0 for Scientific Simulation
Gabrielle Allen
Department of Computer Science
Center for Computation & Technology
Louisiana State University
Allen, Loeffler, Radke, Schnetter, Seidel, Integrating Web 2.0
Technologies with Scientific Simulation Codes for Real-Time
Collaboration, IEEE International Conference on Cluster Computing
(Cluster 2009), Workshop on The Impact and Influence of Web 2.0 on
e-Research Infrastructure, Services and Applications.
2. Gravitational Wave Physics
Models
Analysis & Insight
Observations
Petascale problems: Full 3D general relativistic models of binary
systems, supernova, gamma-ray bursts
3. Understanding Gravity
Data and Collaboration increasing
Log(Data)
Galileo
Smarr
LSU PRAC
7. 12 year pedigree, led from LSU. Over $10M support : NSF, EU,
DOD, DOE, NASA, Microsoft, MPG, LSU, NCSA.
Cactus Structure
[Diagram of the Core Flesh and Plug-In Thorns (components). Labels:
ANSI C; Fortran/C/C++; extensible APIs; parameters; scheduling;
error handling; make system; grid variables; driver; input/output;
interpolation; SOR solver; multigrid; coordinates; boundary
conditions; remote steering; equations of state; wave evolvers;
black holes; Your Physics !!; Computational Tools !!]
8. Cactus Structure
[Diagram: same Core Flesh / Plug-In Thorns (components) labels as on
slide 7, with Web 2.0 added]
9. Cactus Application Environment
Individual research groups
Domain specific shared infrastructure
Flesh: APIs, information, orchestration
Adaptive mesh refinement, parallel I/O, interaction,
10. Einstein Toolkit http://www.einsteintoolkit.org
Based on Cactus Framework
Over 130 open, community developed Cactus modules
Building a consortium of users
Governance and software development
Members
40 listed on web page
10 different groups
US, Japan, Mexico, Spain, Germany, Canada
300 science publications, 50 student theses
11. Typical Black Hole Simulations
At LSU
300 Cactus thorns
10,000 potential parameters
20 different supercomputers
100-2000 cores
Days/weeks to run (checkpoint/restart)
GBs to TBs of data (HDF5, ASCII, jpeg)
12. Collaborative Technologies
Technologies to share simulation-related information developed in
our group from the early 1990s
Essential to support the scientific research
Review historical evolution of these technologies
Show how Web 2.0 provides new tools to enable old
scenarios
13. Web-based Mail Lists
Mosaic web browser (1993, NCSA)
Seidel's group at NCSA worry about content
http://archive.ncsa.illinois.edu/Cyberia/NumRel/GravWaves.html(1995)
Collaborative Cork Board (CoCoBoard) (Mid 90s)
Researchers have web-based project pages
Could attach images!! (usually 1-D plots of results)
Used till late 90s
Currently
Project based private wikis: parameter/output files, figures
Organize material for weekly project conference calls
Cons: network needed to access/edit wiki, editing slow
14. CoCoBoard
15. Simulation Web Interfaces
Thorn httpd
First collaborative tool fundamentally integrated into Cactus
Werner Benger (1999), visiting NCSA from Germany (7 hr time
difference and email)
Used socket library developed for remote viz (John Shalf &
TIKSL project)
Thorn HTTPD in standard toolkit (2000)
Simulation status, variables, timing, viewport, output files,
parameter steering, etc
Thorns can include their own web content
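The idea of a simulation serving its own status page can be sketched in a few lines. This is only an illustration in Python, not the HTTPD thorn itself (which lives inside Cactus); the names `state`, `start_status_server`, `evolve` and `fetch_status` are hypothetical, and running the server in a background thread is an assumption made to keep the sketch short.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical shared state that the simulation loop updates.
state = {"iteration": 0, "steerable": {"out_every": 10}}
lock = threading.Lock()


class StatusHandler(BaseHTTPRequestHandler):
    """Answer every GET with the current simulation state as JSON."""

    def do_GET(self):
        with lock:
            body = json.dumps(state).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the simulation's own output free of request logs


def start_status_server(port=0):
    """Serve status on localhost; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), StatusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def evolve(steps):
    """Stand-in for the evolution loop: just advance the iteration count."""
    for _ in range(steps):
        with lock:
            state["iteration"] += 1


def fetch_status(server):
    """What a collaborator's browser would do: GET the status page."""
    host, port = server.server_address[:2]
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

Calling `start_status_server()`, then `evolve(3)`, then `fetch_status(server)` returns the state with the iteration count advanced by three. Parameter steering would need an additional handler for writes, which is left out here.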
16. Issues
Authorization to web pages: username/password in the parameter file
is insecure and awkward; newer version uses HTTPS and can also use
X.509
Browsers can display images in certain formats, a Visualization
thorn uses gnuplot to include e.g. performance with time, physical
parameters
Problem deploying on compute nodes where web server cannot be
directly accessed (port forwarding, firewalls)
How to find and track the simulations, publicize existence to a
collaboration?
17. Cactus HTTPD Simulation Page
18. Cactus HTTPD Viewport
19. Simulation Reports and Email
Readable report automatically generated for each simulation
(computation and physics)
Prototyped 2001 but not used (?)
How to collect reports in one place?
Mail Thorn (sendmail)
Email reliable and fault tolerant (spool)
Supercomputers do not allow mail to be sent from compute
nodes.
20. GridLab Visualization Service
Brygg Ullmer (2004)
21. Announcing and Grid Portals
Collaborations need reliable, live information about long running
simulations.
NSF Astrophysics Simulation Collaboratory (ASC), 1999
Grid Portal provided centralized, collaborative interface to
submit, monitor and archive simulations
Java, JSP, JavaScript with back-end database, contributed to
GridSphere design (GridLab)
JavaCOG to submit jobs and basic monitoring.
ASC Portal (2002)
22. Announcing Simulation Info
Publish (application provided) simulation information
Thorn Announce, in prototype Cactus Worm scenario (2001)
Message from Flesh/Thorn info
Transport: XML-RPC to remote socket (portal)
Issues
Job IDs
Security, mapping users
Cumbersome user-set parameters (portal location, visibility of job,
notification needs)
Announcing to ASC Portal (2002)
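The transport step (simulation information sent as XML-RPC to a portal) might look roughly like the following Python sketch. The method name `simulation.announce`, the field names and the portal URL are all assumptions for illustration; the slides do not give the actual message schema used by Thorn Announce.

```python
import xmlrpc.client

# Hypothetical method name; the real portal interface is not given in the slides.
ANNOUNCE_METHOD = "simulation.announce"


def build_announcement(job_id, user, machine, iteration):
    """Encode simulation information as an XML-RPC request body."""
    info = {
        "job_id": job_id,        # job IDs are one of the listed issues
        "user": user,            # mapping users is another
        "machine": machine,
        "iteration": iteration,
    }
    return xmlrpc.client.dumps((info,), methodname=ANNOUNCE_METHOD)


def announce(portal_url, job_id, user, machine, iteration):
    """Send the same information to a portal endpoint (hypothetical URL)."""
    proxy = xmlrpc.client.ServerProxy(portal_url)
    info = {"job_id": job_id, "user": user,
            "machine": machine, "iteration": iteration}
    return proxy.simulation.announce(info)
```

`build_announcement` can be inspected without any network access, which is convenient when debugging what a simulation would publish; `announce` only works against a server that actually implements the assumed method.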
23. Notification
Portal notification service
Portal users configure at portal, simulations configure in
parameter file
Email, SMS, Instant Message
Initial experiments generated large telecom bills!
Cool and useful, but lots of work (FTE) to develop and modify
portal service, difficult to configure.
24. Web 2.0 Technologies
Use for collaborative, simulation-level messaging and information
archiving
Reliable, persistent, well-documented, user-configurable, cheap,
well supported, good APIs
25. Twitter
March 2006
Real-time short messaging system. Users send and receive each
other's updates (tweets). Wide range of devices and rudimentary
social networking.
Receivers can filter messages they see and specify how they receive
them
Twitter API (e.g. post a new Twitter message from a user)
Free
26. Thorn Twitter
Uses libcurl
Cactus parameters for twitter username/password
Twitter API: statuses/update
At LSU: numrel group account
Messages when simulation starts and at different stages
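As a rough sketch of what such a status post involves, the Python below builds (but does not send) an HTTP POST carrying a `status` field, authenticated with a username and password as the thorn parameters suggest. The endpoint is passed in by the caller and is a placeholder here; the function name `build_status_request` is hypothetical, and HTTP Basic authentication is an assumption about how the credentials were used. Twitter has since retired password-based API access, so this is a historical illustration rather than something that would work today.

```python
import base64
import urllib.parse
import urllib.request

TWEET_LIMIT = 140  # tweet length limit at the time of the talk


def build_status_request(endpoint, username, password, message):
    """Build a POST request for a statuses/update style call.

    `endpoint` is supplied by the caller; no real URL is assumed here.
    """
    data = urllib.parse.urlencode({"status": message[:TWEET_LIMIT]})
    request = urllib.request.Request(
        endpoint, data=data.encode("ascii"), method="POST"
    )
    # Assumed: credentials sent as HTTP Basic authentication.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


# Example: a message a simulation might post at startup.
request = build_status_request(
    "https://example.invalid/statuses/update.xml",  # placeholder endpoint
    "numrel", "not-a-real-password", "Simulation started",
)
```

Keeping the request construction separate from sending makes it easy to check message truncation and encoding before a long-running job starts posting.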
27. Flickr
2004, image hosting website for digital photographs (and now
videos). Bought by Yahoo (2005).
Professional account ($25/yr) for unlimited use
Web service API for uploading and manipulating images
Group images into Sets and Collections
Tags, title, description, metadata from EXIF headers
Social networking: users can comment on images, flag them, order by
popularity, etc. Public/Private/Friends/Family. Blogs.
RSS feed allows quick previewing.
28. Thorn Flickr
Send images from running simulation
Uses: flickcurl, libcurl, libxml2, openssl
Authentication more complex (api key, shared secret)
Thorn uploads images that are generated by Cactus (and known to I/O
layer), e.g. IoJpeg
Each simulation given its own Flickr set
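The "api key, shared secret" authentication can be illustrated with the request-signing step. In Flickr's legacy (pre-OAuth) scheme, as far as I understand it, each call carried an `api_sig`: the MD5 hash of the shared secret followed by the call's parameters, sorted by name and concatenated as name-value pairs. In the thorn this is handled by flickcurl; the Python below is only a sketch, and the function names and example values are hypothetical.

```python
import hashlib


def sign_call(params, shared_secret):
    """Return the legacy-style signature for a set of call parameters."""
    # Sort by parameter name, then concatenate name and value pairs.
    pairs = "".join(name + str(params[name]) for name in sorted(params))
    return hashlib.md5((shared_secret + pairs).encode("utf-8")).hexdigest()


def signed_params(params, shared_secret):
    """Copy the parameters and attach the signature as `api_sig`."""
    result = dict(params)
    result["api_sig"] = sign_call(params, shared_secret)
    return result


# Example with made-up credentials.
call = signed_params(
    {"method": "flickr.test.echo", "api_key": "abc"},
    shared_secret="secret",
)
```

Because the parameters are sorted before hashing, the signature does not depend on the order in which the caller lists them.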
29. Future Work
Extend capabilities, production testing
Common authentication mechanism
Social networking model (individual/shared accounts)
Development of common tags, more metadata etc
Storing videos (Flickr, YouTube, Vimeo)
Advantage for scientists presenting
Lots of other possibilities: Dropbox to publish files across a
collaboration, WordPress for simulation reports/blogs, Facebook to
replace grid portals and aggregate services, Cloud computing APIs
for grid scenarios,
30. Einstein Toolkit
Trying to establish a community for computational relativity:
Wiki for community documentation
Blog for community posting
www.einsteintoolkit.org
31. Conclusions
Started as a fun project (undergrad)
Web 2.0 provides reliable delivery, storage, access, and flexible
collaborative features
Can use Web 2.0 to easily prototype new interactive and
collaborative scenarios (have really missed this)
Small groups and individuals can do this too!!
Target standard of ease-of-use for cyberinfrastructure
development
For real use need unified authentication, clear policies on data,
site versions