In situ data processing for extreme-scale computing
XLDB 2011, 10/19/2011
Scott A. Klasky, [email protected]
"To Exascale and beyond!"

H. Abbasi 2, Q. Liu 2, J. Logan 1, M. Parashar 6, N. Podhorszki 1, K. Schwan 4, A. Shoshani 3, M. Wolf 4, S. Ahern 1, I. Altintas 9, W. Bethel 3, L. Chacon 1, C.S. Chang 10, J. Chen 5, H. Childs 3, J. Cummings 13, C. Docan 6, G. Eisenhauer 4, S. Ethier 10, R. Grout 7, R. Kendall 1, J. Kim 3, T. Kordenbrock 5, Z. Lin 11, J. Lofstead 5, X. Ma 15, K. Moreland 5, R. Oldfield 5, V. Pascucci 12, N. Samatova 15, W. Schroeder 8, R. Tchoua 1, Y. Tian 14, R. Vatsavai 1, M. Vouk 15, K. Wu 3, W. Yu 14, F. Zhang 6, F. Zheng 4

1 ORNL, 2 U.T. Knoxville, 3 LBNL, 4 Georgia Tech, 5 Sandia Labs, 6 Rutgers, 7 NREL, 8 Kitware, 9 UCSD, 10 PPPL, 11 UC Irvine, 12 U. Utah, 13 Caltech, 14 Auburn University, 15 NCSU

Thanks to: V. Bhat, B. Bland, S. Canon, T. Critchlow, G. Y. Fu, B. Geveci, G. Gibson, C. Jin, B. Ludaescher, S. Oral, M. Polte, R. Ross, J. Saltz, R. Sankaran, C. Silva, W. Tang, W. Wang, P. Widener
Slide 2
Outline
- I/O challenges for extreme scale computing
- The Adaptable I/O System (ADIOS)
- SOA
- In transit approach
- Future work

Work supported under:
- ASCR: CPES, SDM, ASCR SDM Runtime Staging, SciDAC Application Partnership, OLCF, eXact co-design
- OFES: GPSC, GSEP
- NSF: HECURA, RDAV
- NASA: ROSES
Slide 3
Extreme scale computing
- Trends: more FLOPS; limited number of users at the extreme scale
- Problems: performance, resiliency, debugging, getting science done
- Problems will get worse
- Need a revolutionary way to store, access, and debug to get the science done!
- ASCI Purple (49 TB / 140 GB/s); JaguarPF (300 TB / 200 GB/s)
- From J. Dongarra, "Impact of Architecture and Technology for Extreme Scale on Software and Algorithm Design," Cross-cutting Technologies for Computing at the Exascale, February 2-5, 2010
- Most people get < 5 GB/s at scale
Slide 4
Next generation I/O and file system challenges
- At the architecture or node level: use increasingly deep memory hierarchies coupled with new memory properties
- At the system level: cope with I/O rates and volumes that stress the interconnect, can severely limit application performance, and can consume unsustainable levels of power
- At the exascale machine level: immense aggregate I/O needs, with potentially uneven loads placed on the underlying resources; can result in data hotspots, interconnect congestion, and similar issues
Slide 5
Our team works directly with (24% of time at OLCF):
- Fusion: XGC1, GTC, GTS, GTC-P, Pixie3D, M3D, GEM
- Combustion: S3D, Raptor, Boxlib
- Relativity: PAMR
- Nuclear physics
- Astrophysics: Chimera, Flash
- ...and many more application teams

Requirements (in order):
1. Fast I/O with low overhead
2. Simplicity in using I/O
3. Self-describing file formats
4. QoS

Extras, if possible:
- In-transit visualization
- Coupling multiple codes
- Performing in transit workflows

Applications are the main drivers.
Slide 6
ADIOS: Adaptable I/O System
- Provides portable, fast, scalable, easy-to-use, metadata-rich output with a simple API
- Change the I/O method by changing the XML (see the write-path sketch after this list)
- Layered software architecture: allows plug-ins for different I/O implementations and abstracts the API from the method used for I/O
- Open source: http://www.nccs.gov/user-support/center-projects/adios/
- Research methods from many groups
- S3D: 10X performance improvement over previous work
- Chimera: 10X performance improvement
- XGC1 code: 40 GB/s; SCEC code: 30 GB/s
- GTC code: 40 GB/s; GTS code: 35 GB/s
- SCEC code: 10X performance improvement
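To make the "change the I/O method by changing the XML" point concrete, here is a minimal, illustrative sketch of an ADIOS 1.x style write path. The group name, variable names, file names, and XML contents are made up for this example, and exact function signatures may differ between ADIOS releases; the point is that switching the method entry in the XML (e.g. from POSIX to MPI or to a staging transport) requires no change to the application code below.

```c
/* Illustrative adios_config.xml (hypothetical group and variable names):
 *
 *   <adios-config host-language="C">
 *     <adios-group name="restart">
 *       <var name="NX"          type="integer"/>
 *       <var name="temperature" type="double" dimensions="NX"/>
 *     </adios-group>
 *     <!-- change only this line to switch I/O methods -->
 *     <method group="restart" method="MPI"/>
 *     <buffer size-MB="40" allocate-time="now"/>
 *   </adios-config>
 */
#include <mpi.h>
#include <stdint.h>
#include "adios.h"

int main(int argc, char **argv)
{
    MPI_Comm comm = MPI_COMM_WORLD;
    int rank, NX = 1024;
    double temperature[1024];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(comm, &rank);
    for (int i = 0; i < NX; i++) temperature[i] = rank + 0.001 * i;

    adios_init("adios_config.xml", comm);          /* parse the XML once */

    int64_t fh;
    uint64_t group_size = sizeof(int) + NX * sizeof(double), total_size;
    adios_open(&fh, "restart", "restart.bp", "w", comm);
    adios_group_size(fh, group_size, &total_size); /* let ADIOS size its buffer */
    adios_write(fh, "NX", &NX);
    adios_write(fh, "temperature", temperature);
    adios_close(fh);                               /* transport chosen by the XML */

    adios_finalize(rank);
    MPI_Finalize();
    return 0;
}
```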
Slide 7
We want our system to be so easy, even a chimp can use it. "Even I can use it!"
Sustainable - Fast - Scalable - Portable
Slide 8
Usability of Optimizations
- New technologies are usually constrained by the lack of usability in extracting performance
- Next generation I/O frameworks must address this concern
- Partition the task of optimization from the actual description of the I/O
- Allow the framework to optimize for both read and write performance on machines with high variability, and increase the average I/O performance
[Figure: I/O time vs. standard deviation]
Slide 9
Data Formats: Elastic Data Organization
- File systems have increased in complexity due to high levels of concurrency
- Must understand how to place data on the file system, and how to decrease the overhead of both writing and reading data to and from these complex file systems
- File formats should be easy to use, easy to program, high performance, portable, self-describing, and able to work efficiently on parallel file systems and simple file systems
- Should allow for simple queries into the data
- A major challenge has always been understanding the performance of I/O for both reading and writing, and trying to optimize for both of these cases
Slide 10
Excellent read performance across read patterns
- Use a Hilbert curve to place chunks on the Lustre file system with an Elastic Data Organization (see the sketch below)
[Figures: theoretical concurrency; observed performance]
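The following is a small illustrative sketch, not the EDO implementation, of the idea behind Hilbert-curve chunk placement: chunks that are neighbors in the simulation domain get nearby positions on the curve, so mapping curve order onto Lustre storage targets spreads neighboring chunks across targets while keeping reads of contiguous subvolumes balanced. The xy2d routine is the standard Hilbert index calculation; the chunk grid size and OST count are made-up parameters.

```c
#include <stdio.h>

/* Standard Hilbert-curve index: map chunk coordinates (x, y) on an n x n grid
 * (n a power of two) to a 1D position d along the curve. */
static unsigned xy2d(unsigned n, unsigned x, unsigned y)
{
    unsigned d = 0;
    for (unsigned s = n / 2; s > 0; s /= 2) {
        unsigned rx = (x & s) > 0;
        unsigned ry = (y & s) > 0;
        d += s * s * ((3 * rx) ^ ry);
        /* rotate the quadrant so the curve stays continuous */
        if (ry == 0) {
            if (rx == 1) { x = s - 1 - x; y = s - 1 - y; }
            unsigned t = x; x = y; y = t;
        }
    }
    return d;
}

int main(void)
{
    const unsigned grid  = 8;    /* hypothetical 8 x 8 grid of output chunks */
    const unsigned n_ost = 16;   /* hypothetical number of Lustre OSTs */

    /* Place each chunk on an OST by its Hilbert order: neighbors in the
     * domain land on nearby (but distinct) storage targets. */
    for (unsigned y = 0; y < grid; y++)
        for (unsigned x = 0; x < grid; x++)
            printf("chunk (%u,%u) -> hilbert %3u -> OST %2u\n",
                   x, y, xy2d(grid, x, y), xy2d(grid, x, y) % n_ost);
    return 0;
}
```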
Slide 11
I/O skeletal code creation
- I/O skeletal applications are automatically generated from the application's ADIOS XML file
- They contain no computation or communication
- Standard build mechanism, standard measurement strategies
- Extensible collection of applications
- Skel generates source code, makefiles, and submission scripts
- Workflow (skel_s3d example): s3d.xml -> "skel xml" -> s3d_skel.xml -> "skel params" -> s3d_params.xml -> "skel src" / "skel makefile" / "skel submit" -> source files, Makefile, and submit scripts -> "make" -> executables -> "make deploy"
Slide 12
SOA philosophy
- The overarching design philosophy of our framework is based on the Service-Oriented Architecture
- Used to deal with system/application complexity, rapidly changing requirements, evolving target platforms, and diverse teams
- Applications are constructed by assembling services, based on a universal view of their functionality, using a well-defined API
- Service implementations can be changed easily
- Integrated simulations can be assembled using these services
- Manage complexity while maintaining performance/scalability:
  - Complexity from the problem (complex physics)
  - Complexity from the codes and how they are coupled
  - Complexity of the underlying disruptive infrastructure
  - Complexity from coordination across codes and research teams
Slide 13
Data staging 101
Why staging?
- Decouple file system performance variations and limitations from application run time
- Enables optimizations based on a dynamic number of writers
- High-bandwidth data extraction from the application with RDMA
- Scalable data movement with shared resources requires us to manage the transfers
- Asynchronous data movement, for low impact on the application
- Managing interference
- Use a small staging area for large data output
Slide 14
Our discovery of data scheduling for asynchronous I/O
- The scheduling technique matters (see the sketch below):
  - Naïve scheduling (Continuous Drain) can be slower than synchronous I/O
  - Smart scheduling can be scalable and much better than synchronous I/O
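A minimal sketch of the contrast, with made-up names and timings and no real RDMA: a continuous drain would pull buffered chunks as fast as possible and compete with the application for the interconnect, while the scheduled drain below only issues transfers while the application is in a compute phase and backs off otherwise. This only illustrates the scheduling idea; it is not the DataStager/DART implementation.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define NCHUNKS 8

static bool app_in_compute_phase = false;   /* set by the application */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void transfer_chunk(int id)
{
    printf("draining chunk %d\n", id);      /* stand-in for an RDMA pull */
    usleep(100 * 1000);
}

/* Scheduled drain: only move data while the application is computing,
 * so the transfers do not interfere with its own communication. */
static void *scheduled_drain(void *arg)
{
    (void)arg;
    int next = 0;
    while (next < NCHUNKS) {
        pthread_mutex_lock(&lock);
        bool go = app_in_compute_phase;
        pthread_mutex_unlock(&lock);
        if (go) transfer_chunk(next++);
        else    usleep(10 * 1000);          /* back off during comm phases */
    }
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, scheduled_drain, NULL);

    /* Fake application loop: alternate communication and compute phases. */
    for (int step = 0; step < 4; step++) {
        pthread_mutex_lock(&lock); app_in_compute_phase = false; pthread_mutex_unlock(&lock);
        usleep(50 * 1000);                  /* application communication */
        pthread_mutex_lock(&lock); app_in_compute_phase = true;  pthread_mutex_unlock(&lock);
        usleep(300 * 1000);                 /* application compute */
    }
    pthread_join(t, NULL);
    return 0;
}
```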
Slide 15
In situ processing
- Data movement is a deterrent to effective use of the data
- Output costs increase the runtime and force the user to reduce the output data
- Input costs can make up as much as 90% of the total running time for scientific workflows
- Our goal is to create an infrastructure that makes it easy for you to forget the technology and focus on your science! ("Cool!")
Slide 16
Staging Revolution, Phase 1: Monolithic staging applications
- Multiple pipelines are separate staging areas
- Data movement is between each staging code
- High level of fault tolerance
[Diagram: Application -> Staging Process 1 -> Staging Process 2 -> Storage, inside the staging area]
Slide 17
Shared Space Abstraction in the Staging Area
- Semantically-specialized virtual shared space abstraction in the staging area
  - Shared (distributed) data objects
  - Simple put/get/query API (see the put/get sketch below)
  - Supports application-to-application coupling and workflows, e.g. ADIOS + DataSpaces (HPDC '10)
- Constructed on-the-fly on hybrid staging nodes
- Indexes data for quick access and retrieval
- Provides asynchronous coordination and interaction support
- Complements existing interaction/coordination mechanisms
- Code coupling using DataSpaces:
  - Maintains locality for in-situ exchange
  - Complex geometry-based queries
  - In-space (online) data transformation and manipulation
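To illustrate the put/get coupling pattern only, here is a toy, single-process sketch; the functions space_put/space_get and all of their arguments are hypothetical, not the DataSpaces API. One code writes a named, versioned region of a global array into the shared space, and a coupled code later retrieves an overlapping region by name, version, and bounding box, without the two codes exchanging data directly.

```c
#include <stdio.h>

/* Toy in-memory "shared space" for illustration only: one 8x8 field,
 * one version, one variable.  A real staging space is distributed,
 * indexed, and accessed asynchronously over RDMA. */
typedef struct { int lb[2], ub[2]; } box_t;

static double store[8 * 8];
static int idx(int x, int y) { return y * 8 + x; }

int space_put(const char *var, int version, box_t r, const double *data)
{
    (void)var; (void)version;
    int k = 0;
    for (int y = r.lb[1]; y <= r.ub[1]; y++)
        for (int x = r.lb[0]; x <= r.ub[0]; x++)
            store[idx(x, y)] = data[k++];
    return 0;
}

int space_get(const char *var, int version, box_t r, double *data)
{
    (void)var; (void)version;
    int k = 0;
    for (int y = r.lb[1]; y <= r.ub[1]; y++)
        for (int x = r.lb[0]; x <= r.ub[0]; x++)
            data[k++] = store[idx(x, y)];
    return 0;
}

int main(void)
{
    /* Producer: write the left half of the field for timestep 3. */
    double field[4 * 8];
    for (int i = 0; i < 4 * 8; i++) field[i] = i;
    box_t left = { {0, 0}, {3, 7} };
    space_put("turbulence", 3, left, field);

    /* Consumer: fetch a 2x2 sub-region by name, version, and bounds. */
    double patch[4];
    box_t want = { {2, 6}, {3, 7} };
    space_get("turbulence", 3, want, patch);
    printf("patch: %g %g %g %g\n", patch[0], patch[1], patch[2], patch[3]);
    return 0;
}
```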
Slide 18
Staging Revolution, Phase 2: Introducing plugins
- Disruptive step: staging codes are broken down into multiple plugins
- The staging area is launched as a framework that launches plugins
- Data movement occurs between all plugins
- A scheduler decides resource allocations
[Diagram: Application feeding Plugins A-E within the staging area]
Slide 19
Staging Revolution, Phase 3: Managing the staging area
- Resources are co-allocated to plugins
- Plugins are executed consecutively; co-located plugins require no data movement
- The scheduler attempts to minimize data movement
[Diagram: Application and Plugins A-E in a managed staging area, with co-located plugins]
Slide 20
Hybrid staging: the next evolution of staging
- Move plugins/services back to the application nodes, which have the largest computational capacity
- Plugins executed within the application nodes integrate with the plugins in the remote managed staging area
[Diagram: plugin instances on the application nodes feeding plugins in the managed staging area]
Slide 21
Third-Party Plugins in the Staging Area
- Data processing plugins in the hybrid staging area: in-situ data processing, analytics pipelines
- Many issues: programming (data and control) models for plugins, deployment mechanisms, robustness, correctness, etc.
- Multiple approaches: code, binary, scripts, etc.
- Several implementations in ADIOS: ActiveSpaces, SmartTap, etc.
- E.g., ADIOS ActiveSpaces:
  - Dynamically deploy custom application data transformations/filters on demand and execute them in the staging area (DataSpaces)
  - Provides the programming support to define custom data kernels that operate on data objects of interest
  - Runtime system dynamically deploys binary code to DataSpaces and executes it on the relevant data objects
  - Define a filter, implement it, and compile it with the application, e.g. min(), max(), sum(), avg(), etc. (see the sketch below)
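As a rough illustration of the data-kernel idea (a hypothetical interface, not the ActiveSpaces API): a filter is a small function with a fixed signature that a staging runtime could apply to each data object it holds; here a single reduction kernel computes min, max, sum, and average over one chunk of doubles.

```c
#include <stdio.h>
#include <float.h>

/* Hypothetical data-kernel interface for illustration only (not the
 * ActiveSpaces API): the runtime hands a kernel one data object (a chunk
 * of doubles) and collects its fixed-size result. */
typedef struct { double min, max, sum, avg; } stats_t;
typedef void (*data_kernel_t)(const double *data, size_t n, void *result);

static void stats_kernel(const double *data, size_t n, void *result)
{
    stats_t s = { DBL_MAX, -DBL_MAX, 0.0, 0.0 };
    for (size_t i = 0; i < n; i++) {
        if (data[i] < s.min) s.min = data[i];
        if (data[i] > s.max) s.max = data[i];
        s.sum += data[i];
    }
    s.avg = n ? s.sum / (double)n : 0.0;
    *(stats_t *)result = s;
}

/* Stand-in for the staging runtime applying a deployed kernel to a chunk. */
static void apply_kernel(data_kernel_t k, const double *chunk, size_t n)
{
    stats_t s;
    k(chunk, n, &s);
    printf("min=%g max=%g sum=%g avg=%g\n", s.min, s.max, s.sum, s.avg);
}

int main(void)
{
    double chunk[] = { 3.0, -1.5, 7.25, 0.0 };
    apply_kernel(stats_kernel, chunk, sizeof chunk / sizeof chunk[0]);
    return 0;
}
```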
Slide 22
Dividing up the pipeline
- On compute nodes (~10^6 cores): local statistics, local features, binning of data, sorting within a bucket, compression; minimize cross-node communication
- On staging nodes (~10^3 cores): global statistics, global features, ordered sorts, indexing of data, compression, spatial correlation, topology mapping; minimize cross-timestep communication
- In post-processing (~10^3 disks): interactive visualization, temporal correlations, validation, and everything else, especially tasks that require human interaction
Slide 23
Data Management (Data Reduction / in situ)
Objectives:
- Develop lossy and lossless compression methodologies to reduce incompressible spatio-temporal double-precision scientific data
- Ensure high compression rates with ISABELA and ISOBAR while adhering to user-controlled accuracy bounds
- Provide a fast and accurate approximate query processing technique directly over ISABELA-compressed scientific data
Impact:
- Up to 85% reduction in storage on data from petascale simulation applications: S3D, GTS, XGC
- Guaranteed per-point error bounds, with overall 0.99 correlation and 0.01 NRMSE with respect to the original data (see the accuracy-check sketch below)
- Communication-free, minimal-overhead technique, easily pluggable into any simulation application
[Figure: ISABELA compression accuracy at 75% reduction in size, analysis of XGC1 temperature data]
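A small illustrative sketch (not ISABELA itself) of how a user-controlled accuracy claim can be checked after any lossy reduction: compute the per-point relative error, the NRMSE, and the Pearson correlation between the original and reconstructed data, and compare them against user-chosen bounds. The bounds and the sample arrays are made up.

```c
#include <stdio.h>
#include <math.h>

/* Verify per-point relative error, NRMSE, and correlation between original
 * and reconstructed data against user-chosen bounds (illustration only). */
static int check_accuracy(const double *orig, const double *recon, size_t n,
                          double max_rel_err, double max_nrmse)
{
    double lo = orig[0], hi = orig[0], se = 0.0;
    double so = 0.0, sr = 0.0, soo = 0.0, srr = 0.0, sor = 0.0;

    for (size_t i = 0; i < n; i++) {
        double e   = orig[i] - recon[i];
        double rel = fabs(e) / (fabs(orig[i]) > 1e-300 ? fabs(orig[i]) : 1e-300);
        if (rel > max_rel_err) return 0;            /* per-point bound violated */
        if (orig[i] < lo) lo = orig[i];
        if (orig[i] > hi) hi = orig[i];
        se  += e * e;
        so  += orig[i];            sr  += recon[i];
        soo += orig[i] * orig[i];  srr += recon[i] * recon[i];
        sor += orig[i] * recon[i];
    }
    double nrmse = sqrt(se / n) / ((hi - lo) > 0 ? (hi - lo) : 1.0);
    double cov   = sor / n - (so / n) * (sr / n);
    double corr  = cov / (sqrt(soo / n - (so / n) * (so / n)) *
                          sqrt(srr / n - (sr / n) * (sr / n)));
    printf("NRMSE = %.4f, correlation = %.4f\n", nrmse, corr);
    return nrmse <= max_nrmse;
}

int main(void)
{
    double orig[]  = { 1.00, 2.00, 3.00, 4.00, 5.00 };
    double recon[] = { 1.01, 1.99, 3.02, 3.98, 5.01 };
    printf("within bounds: %d\n", check_accuracy(orig, recon, 5, 0.02, 0.01));
    return 0;
}
```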
Slide 24
Data Management (Multi-resolution)
- Store the data from each process group at multiple levels of resolution for contiguous arrays
- Make real-time decisions about which levels to stream to the staging resources and which data to store
- Can reduce overall storage by streaming/storing reduced representations of the data in real time when they are within an error bound (see the sketch below)
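A toy sketch of the multi-resolution decision (not the ADIOS implementation; the 2:1 averaging and the relative-error metric are made-up choices): build a coarser level by averaging pairs of values, then stream only the coarse level when reconstructing from it stays within the user's error bound, falling back to full resolution otherwise.

```c
#include <stdio.h>
#include <math.h>

#define N 8

/* Coarsen a 1D array by 2:1 averaging. */
static void coarsen(const double *fine, double *coarse, int n)
{
    for (int i = 0; i < n / 2; i++)
        coarse[i] = 0.5 * (fine[2 * i] + fine[2 * i + 1]);
}

/* Worst per-point relative error when reconstructing from the coarse level. */
static double max_rel_error(const double *fine, const double *coarse, int n)
{
    double worst = 0.0;
    for (int i = 0; i < n; i++) {
        double recon = coarse[i / 2];
        double rel = fabs(fine[i] - recon) /
                     (fabs(fine[i]) > 1e-300 ? fabs(fine[i]) : 1e-300);
        if (rel > worst) worst = rel;
    }
    return worst;
}

int main(void)
{
    double fine[N] = { 1.00, 1.02, 1.05, 1.04, 1.10, 1.12, 1.15, 1.16 };
    double coarse[N / 2];
    double bound = 0.02;                    /* user-chosen error bound */

    coarsen(fine, coarse, N);
    double err = max_rel_error(fine, coarse, N);

    if (err <= bound)
        printf("stream coarse level (%d values, max rel err %.4f)\n", N / 2, err);
    else
        printf("stream full resolution (%d values, coarse err %.4f too high)\n", N, err);
    return 0;
}
```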
Slide 25
Slide 26
Conclusions and Future Direction
- ADIOS, developed by many institutions, is a proven DOE framework that blends together:
  - Self-describing file formats
  - Staging methods with visualization and analysis plugins
  - Advanced data management plugins for data reduction, multi-resolution data reduction, and advanced workflow scheduling
  - Multiplexing of output using a novel SOA approach
- New in situ techniques are critical for application success on extreme scale architectures
- We still must use this to prepare data for further exploratory data analytics
- Hybrid staging integration with cloud resources for next generation HPC-Cloud services
- New customers and new technologies bring new challenges: AMR, image processing, experimental and sensor data
Slide 27
Community highlights

Fusion:
- 2006 Battelle Annual Report, cover image: http://www.battelle.org/annualreports/ar2006/default.htm
- 2/16/2007, "Supercomputing Keys Fusion Research": http://www.hpcwire.com/hpcwire/2007-02-16/supercomputing_keys_fusion_research-1.html
- 7/14/2008, "Researchers Conduct Breakthrough Fusion Simulation": http://www.hpcwire.com/hpcwire/2008-07-14/researchers_conduct_breakthrough_fusion_simulation.html
- 8/27/2009, "Fusion Gets Faster": http://www.hpcwire.com/hpcwire/2009-07-27/fusion_gets_faster.html
- 4/21/2011, "Jaguar Supercomputer Harnesses Heat for Fusion Energy": http://www.hpcwire.com/hpcwire/2011-04-19/jaguar_supercomputer_harnesses_heat_for_fusion_energy.html

Combustion:
- 9/29/2009, "ADIOS Ignites Combustion Simulations": http://www.hpcwire.com/hpcwire/2009-10-29/adios_ignites_combustion_simulations.html

Computer science:
- 2/12/2009, "Breaking the bottleneck in computer data flow": http://ascr-discovery.science.doe.gov/bigiron/io3.shtml
- 9/12/2010, "Say Hello to the NEW ADIOS": http://www.hpcwire.com/hpcwire/2010-09-12/say_hello_to_the_new_adios.html
- 2/28/2011, "OLCF, Partners Release eSiMon Dashboard Simulation Tool": http://www.hpcwire.com/hpcwire/2011-02-28/olcf_partners_release_esimon_dashboard_simulation_tool.html
- 7/21/2011, "ORNL ADIOS Team Releases Version 1.3 of Adaptive Input/Output System": http://www.hpcwire.com/hpcwire/2011-07-21/ornl_adios_team_releases_version_1.3_of_adaptive_input_output_system.html
Slide 29
I/O Pipeline Publications (2008-present)

SOA:
1. S. Klasky, et al., "In Situ Data Processing for Extreme-Scale Computing," SciDAC 2011.
2. C. Docan, J. Cummings, S. Klasky, M. Parashar, "Moving the Code to the Data: Dynamic Code Deployment using ActiveSpaces," IPDPS 2011.
3. F. Zhang, C. Docan, M. Parashar, S. Klasky, "Enabling Multi-Physics Coupled Simulations within the PGAS Programming Framework," CCGrid 2011.
4. C. Docan, S. Klasky, M. Parashar, "DataSpaces: An Interaction and Coordination Framework for Coupled Simulation Workflows," HPDC '10, ACM, Chicago, IL. (Journal version: Journal of Cluster Computing, 2011.)
5. J. Cummings, S. Klasky, et al., "EFFIS: An End-to-end Framework for Fusion Integrated Simulation," PDP 2010, http://www.pdp2010.org/.
6. C. Docan, J. Cummings, S. Klasky, M. Parashar, N. Podhorszki, F. Zhang, "Experiments with Memory-to-Memory Coupling for End-to-End Fusion Simulation Workflows," CCGrid 2010, IEEE Computer Society Press, 2010.
7. N. Podhorszki, S. Klasky, et al., "Plasma Fusion Code Coupling Using Scalable I/O Services and Scientific Workflows," WORKS 2009 (at SC09).

Data Formats:
8. Y. Tian, S. Klasky, H. Abbasi, J. Lofstead, R. Grout, N. Podhorszki, Q. Liu, Y. Wang, W. Yu, "EDO: Improving Read Performance for Scientific Applications Through Elastic Data Organization," Cluster 2011 (to appear).
9. Y. Tian, "SRC: Enabling Petascale Data Analysis for Scientific Applications Through Data Reorganization," ICS 2011, First Place, Student Research Competition.
10. M. Polte, J. Lofstead, J. Bent, G. Gibson, S. Klasky, "...And Eat it Too: High Read Performance in Write-Optimized HPC I/O Middleware File Formats," Petascale Data Storage Workshop (PDSW) at Supercomputing 2009, 2009.
11. S. Klasky, et al., "Adaptive IO System (ADIOS)," Cray User Group Meeting 2008.

Data Staging:
12. H. Abbasi, G. Eisenhauer, S. Klasky, K. Schwan, M. Wolf, "Just In Time: Adding Value to IO Pipelines of High Performance Applications with JITStaging," HPDC 2011.
13. H. Abbasi, M. Wolf, G. Eisenhauer, S. Klasky, K. Schwan, F. Zheng, "DataStager: Scalable Data Staging Services for Petascale Applications," Cluster Computing, Springer, Vol. 13, Issue 3, pp. 277-290, 2010.
14. C. Docan, M. Parashar, S. Klasky, "Enabling High-Speed Asynchronous Data Extraction and Transfer Using DART," Concurrency and Computation: Practice and Experience 22(9): 1181-1204, 2010.
15. H. Abbasi, M. Wolf, G. Eisenhauer, S. Klasky, K. Schwan, F. Zheng, "DataStager: Scalable Data Staging Services for Petascale Applications," HPDC '09 (Garching, Germany, June 11-13, 2009), ACM, New York, NY, pp. 39-48.
16. H. Abbasi, J. Lofstead, F. Zheng, S. Klasky, K. Schwan, M. Wolf, "Extending I/O through High Performance Data Services," Cluster Computing 2009, New Orleans, LA, August 2009.
17. C. Docan, M. Parashar, S. Klasky, "Enabling High Speed Asynchronous Data Extraction and Transfer Using DART," 17th International Symposium on High-Performance Distributed Computing (HPDC), Boston, MA, IEEE Computer Society Press, June 2008.
18. H. Abbasi, M. Wolf, K. Schwan, S. Klasky, "Managed Streams: Scalable I/O," HPDC 2008.
Slide 30
I/O Pipeline Publications (2008-present)

Usability of Optimizations:
19. J. Lofstead, F. Zheng, Q. Liu, S. Klasky, R. Oldfield, T. Kordenbrock, K. Schwan, M. Wolf, "Managing Variability in the IO Performance of Petascale Storage Systems," SC '10, New Orleans, LA, November 2010.
20. J. Lofstead, M. Polte, G. Gibson, S. Klasky, K. Schwan, R. Oldfield, M. Wolf, "Six Degrees of Scientific Data: Reading Patterns for Extreme Scale Science IO," HPDC 2011.
21. Y. Xiao, I. Holod, W. L. Zhang, S. Klasky, Z. H. Lin, "Fluctuation Characteristics and Transport Properties of Collisionless Trapped Electron Mode Turbulence," Physics of Plasmas 17, 2010.
22. J. Lofstead, F. Zheng, S. Klasky, K. Schwan, "Adaptable Metadata Rich IO Methods for Portable High Performance IO," IPDPS 2009, IEEE Computer Society Press, 2009.
23. J. Lofstead, F. Zheng, S. Klasky, K. Schwan, "Input/Output APIs and Data Organization for High Performance Scientific Computing," PDSW at SC2008.
24. J. Lofstead, S. Klasky, K. Schwan, N. Podhorszki, C. Jin, "Flexible IO and Integration for Scientific Codes Through the Adaptable IO System," Challenges of Large Applications in Distributed Environments (CLADE), June 2008.
25. Y. Tian, S. Klasky, B. Wang, H. Abbasi, N. Podhorszki, R. Grout, M. Wolf, "Chimera: Efficient Multiresolution Data Representation through Chimeric Organization," submitted to IPDPS 2012.
Slide 31
I/O Pipeline Publications (2008-present)

Data Management:
26. J. Kim, H. Abbasi, C. Docan, S. Klasky, Q. Liu, N. Podhorszki, A. Shoshani, K. Wu, "Parallel In Situ Indexing for Data-Intensive Computing," LDAV 2011 (accepted).
27. T. Critchlow, et al., "Working with Workflows: Highlights from 5 Years Building Scientific Workflows," SciDAC 2011.
28. S. Lakshminarasimhan, N. Shah, S. Ethier, S. Klasky, R. Latham, R. Ross, N. F. Samatova, "Compressing the Incompressible with ISABELA: In Situ Reduction of Spatio-Temporal Data," Euro-Par 2011.
29. S. Lakshminarasimhan, J. Jenkins, I. Arkatkar, Z. Gong, H. Kolla, S.-H. Ku, S. Ethier, J. Chen, C.S. Chang, S. Klasky, R. Latham, R. Ross, N. F. Samatova, "ISABELA-QA: Query-driven Analytics with ISABELA-compressed Extreme-Scale Scientific Data," SC 2011 (to appear).
30. K. Wu, R. Sinha, C. Jones, S. Ethier, S. Klasky, K. L. Ma, A. Shoshani, M. Winslett, "Finding Regions of Interest on Toroidal Meshes," Journal of Computational Science and Discovery, 2011 (to appear).
31. J. Logan, S. Klasky, H. Abbasi, et al., "Skel: Generative Software for Producing I/O Skeletal Applications," submitted to the Workshop on D3Science, 2011.
32. E. Schendel, Y. Jin, N. Shah, J. Chen, C.S. Chang, S.-H. Ku, S. Ethier, S. Klasky, R. Latham, R. Ross, N. Samatova, "ISOBAR Preconditioner for Effective and High-throughput Lossless Data Compression," submitted to ICDE 2012.
33. I. Arkatkar, J. Jenkins, S. Lakshminarasimhan, N. Shah, E. Schendel, S. Ethier, C.S. Chang, J. Chen, H. Kolla, S. Klasky, R. Ross, N. Samatova, "ALACRITY: Analytics-driven Lossless Data Compression for Rapid In-situ Indexing, Storing, and Querying," submitted to ICDE 2012.
34. Z. Gong, S. Lakshminarasimhan, J. Jenkins, H. Kolla, S. Ethier, J. Chen, R. Ross, S. Klasky, N. Samatova, "Multi-level Layout Optimizations for Efficient Spatio-Temporal Queries of ISABELA-compressed Data," submitted to ICDE 2012.
35. Y. Jin, S. Lakshminarasimhan, N. Shah, Z. Gong, C. Chang, J. Chen, S. Ethier, H. Kolla, S. Ku, S. Klasky, R. Latham, R. Ross, K. Schuchardt, N. Samatova, "S-preconditioner for Multi-fold Data Reduction with Guaranteed User-Controlled Accuracy," ICDM 2011.
Slide 32
I/O Pipeline Publications (2008-present)

Simulation Monitoring:
36. R. Tchoua, S. Klasky, N. Podhorszki, B. Grimm, A. Khan, E. Santos, C. Silva, P. Mouallem, M. Vouk, "Collaborative Monitoring and Analysis for Simulation Scientists," CTS 2010.
37. P. Mouallem, R. Barreto, S. Klasky, N. Podhorszki, M. Vouk, "Tracking Files in the Kepler Provenance Framework," SSDBM 2009, M. Winslett (Ed.), Springer-Verlag, pp. 273-282, 2009.
38. E. Santos, J. Tierny, A. Khan, B. Grimm, L. Lins, J. Freire, V. Pascucci, C. Silva, S. Klasky, R. Barreto, N. Podhorszki, "Enabling Advanced Visualization Tools in a Web-Based Simulation Monitoring System," eScience 2009.
39. R. Barreto, S. Klasky, N. Podhorszki, P. Mouallem, M. Vouk, "Collaboration Portal for Petascale Simulations," International Symposium on Collaborative Technologies and Systems, pp. 384-393, Baltimore, MD, May 2009.
40. R. Barreto, S. Klasky, C. Jin, N. Podhorszki, "Framework for Integrating End to End SDM Technologies and Applications (FIESTA)," CHI '09 workshop, The Changing Face of Digital Science, 2009.
41. S. Klasky, et al., "Collaborative Visualization Spaces for Petascale Simulations," International Symposium on Collaborative Technologies and Systems (CTS 2008).
42. R. Tchoua, S. Klasky, et al., "Schema for Collaborative Monitoring and Visualization of Scientific Data," PDAC 2011.
Slide 33
I/O Pipeline Publications (2008-present)

In situ processing:
43. A. Shoshani, et al., "The Scientific Data Management Center: Available Technologies and Highlights," SciDAC 2011.
44. A. Shoshani, S. Klasky, R. Ross, "Scientific Data Management: Challenges and Approaches in the Extreme Scale Era," SciDAC 2010.
45. F. Zheng, H. Abbasi, C. Docan, J. Lofstead, Q. Liu, S. Klasky, M. Parashar, N. Podhorszki, K. Schwan, M. Wolf, "PreDatA - Preparatory Data Analytics on Peta-Scale Machines," IPDPS 2010, IEEE Computer Society Press, 2010.
46. S. Klasky, et al., "High Throughput Data Movement," in Scientific Data Management: Challenges, Existing Technologies, and Deployment, A. Shoshani and D. Rotem (Eds.), Chapman and Hall, 2009.
47. G. Eisenhauer, M. Wolf, H. Abbasi, S. Klasky, K. Schwan, "A Type System for High Performance Communication and Computation," submitted to the Workshop on D3Science, 2011.
48. F. Zhang, C. Docan, M. Parashar, S. Klasky, N. Podhorszki, H. Abbasi, "Enabling In-situ Execution of Coupled Scientific Workflow on Multi-core Platform," submitted to IPDPS 2012.
49. F. Zheng, H. Abbasi, J. Cao, J. Dayal, K. Schwan, M. Wolf, S. Klasky, N. Podhorszki, "In-Situ I/O Processing: A Case for Location Flexibility," PDSW 2011.
50. F. Zheng, J. Cao, J. Dayal, G. Eisenhauer, K. Schwan, M. Wolf, H. Abbasi, S. Klasky, N. Podhorszki, "High End Scientific Codes with Computational I/O Pipelines: Improving their End-to-End Performance," PDAC 2011.
51. K. Moreland, R. Oldfield, P. Marion, S. Jourdain, N. Podhorszki, C. Docan, M. Parashar, M. Hereld, M. Papka, S. Klasky, "Examples of In Transit Visualization," PDAC 2011.

Misc.:
52. C. S. Chang, S. Ku, et al., "Whole-Volume Integrated Gyrokinetic Simulation of Plasma Turbulence in Realistic Diverted-Tokamak Geometry," SciDAC 2009.
53. C. S. Chang, S. Klasky, et al., "Towards a First-Principles Integrated Simulation of Tokamak Edge Plasmas," SciDAC 2008.
54. Z. Lin, Y. Xiao, W. Deng, I. Holod, C. Kamath, S. Klasky, Z. Wang, H. Zhang, "Size Scaling and Nondiffusive Features of Electron Heat Transport in Multi-Scale Turbulence," IAEA 2010.
55. Z. Lin, Y. Xiao, I. Holod, W. Zhang, W. Deng, S. Klasky, J. Lofstead, C. Kamath, N. Wichmann, "Advanced Simulation of Electron Heat Transport in Fusion Plasmas," SciDAC 2009.