Center for Autonomic Computing
Enabling End-to-end Data-driven Sensor-based Scientific and Engineering Applications
DDDAS at ICCS 2009
Nanyan Jiang and Manish Parashar
Rutgers University, Center for Autonomic Computing
May 25, 2009
Acknowledgments: University of Texas at Austin (M. Wheeler, H. Klie, et al.), NSF, DoE.
Motivations
  Enable the integration of physical and computational worlds for management, optimization and control
    Integrate simulations and data with (near) real-time sensors and actuators
    Data acquisition, assimilation and coupling with computational models for end-to-end decision processes.
[Figure: sensor data mapped onto the simulation grid by Kriging (Err<0.01); observation data feeds the scientific application / computational model (simulation model). Other labels: X, R.]
Application Scenarios: Examples
  Oil reservoir management and optimization
    Processes: detect and track changes; invert data to reservoir properties; assimilate data and reservoir properties, …
    Feedback loop between measured data and computational models to enable optimal oil management and control
  Contaminant modeling and control: Ruby Gulch Waste Repository
    Understanding the evolution of sites
    Impact on drinking water supplies
    Autonomous monitoring: temperature, moisture, chemical sensors, etc.
    Extract the information the application needs from complex coupled transport models that use data gathered by field sensors.
  Structural health monitoring
    Temperature, pressure sensors, etc.
    Using models and real-time data to predict the status of the bridge
Challenges
  Data volumes and rate
  Constrained heterogeneous sensor systems
  Dynamics and uncertainty of applications, systems, and data
Key Requirements
  Application-level requirements
    Multi-scale, multi-resolution data access
    Spatial-temporal variation, e.g. online spatial and temporal interpolation
    Data quality and uncertainty management
  System-level requirements
    Adaptive runtime management of in-network processing
    Resource management and computation/communication/energy tradeoffs
  Programming requirements
    Simple, extensible end-to-end programming models and interfaces
Objectives
  Provide programming abstractions and systems software support for sensor-driven applications
    Programming abstractions and runtime systems for integrating sensor systems with application processes in an end-to-end dynamic sensor-driven application.
    Efficient in-network computation/communication mechanisms for resource-constrained heterogeneous sensor networks.
    Support for tradeoffs between data quality, resource consumption and performance.
Our Approach
  Support end-to-end sensor-driven applications and the interactions between computational models and the sensor system.
    The GridMap/iZone programming system provides semantically meaningful abstractions and runtime mechanisms for integrating sensor systems with computational models for scientific processes.
    Abstractions and in-network mechanisms to support different sensor selection and interpolation approaches, e.g., aggregation, adaptive interpolation and assimilation.
    Use virtualization to address the mismatch between the instrumentation of the physical domain and the representation used with computational models.
Contributions
  The GridMap/iZone Programming System
    Provides programming abstractions for integrating sensor systems with computational models for scientific processes (e.g. biophysical, geophysical processes) and with other application components in an end-to-end experiment.
    Provides programming abstractions and system software support for developing in-network data processing mechanisms.
  Enabling End-to-end Sensor-driven Scientific and Engineering Applications
    End-to-end oil reservoir
Overview of Programming System
[Figure: architecture of the programming system. Labels recovered from the diagram, not necessarily in layer order:]
  CyberPhysical Applications
  Scientific Application / Computational Model
  GridMap Programming Abstraction
  iZone Programming Abstraction
  Interpolation / Regression / Modeling
  In-network Data Processing: Space, Time and Resource aware Opt; dissemination, aggregation, collaboration
  In-network algorithms: IDW, Kriging, regression, etc.
  Location-aware Content-based Middleware: discovery, Associative Rendezvous messaging
  Content Overlay: content-based routing engine, self-organized overlay
  Clustering/Geo-routing: wireless overlay, mesh, etc.
  CyberInfrastructure
  Data: observation data, system data
Outline
  Motivation
  The GridMap/iZone Programming System
    Programming abstractions and support
  Enabling end-to-end data-driven sensor-based applications
    An end-to-end oil reservoir application
  Conclusion
GridMap Programming Abstraction
  A virtualization of the physical sensor grid to match the representation of the physical domain used by the application
    Sensor data can be simply and seamlessly integrated into computational models
    Changes in the underlying sensor network are transparent to the applications
    In-network spatial-temporal data processing can help remove computational bias
  Flexible and powerful operators for querying and processing sensor data
    Simple query interface using content and supporting wildcards, ranges, etc.
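The query interface described above can be illustrated with a minimal sketch. The `Reading` record and the `query` function are hypothetical names chosen for this example, not the GridMap interface, and the sample values are made up. Python's `fnmatchcase` stands in for the wildcard matching.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

INF = float("inf")


@dataclass(frozen=True)
class Reading:
    """One sensor reading (hypothetical record layout)."""
    kind: str     # measured quantity, e.g. "pressure"
    x: float      # sensor position
    y: float
    value: float


def query(readings, kind="*", x_range=(-INF, INF), y_range=(-INF, INF)):
    """Return readings whose kind matches the wildcard pattern and whose
    position lies inside the closed x and y ranges."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    return [
        r for r in readings
        if fnmatchcase(r.kind, kind)
        and x_lo <= r.x <= x_hi
        and y_lo <= r.y <= y_hi
    ]


# Made-up readings, for illustration only.
readings = [
    Reading("pressure", 120.0, 30.0, 2150.0),
    Reading("pressure", 400.0, 30.0, 2210.0),
    Reading("temperature", 130.0, 40.0, 181.5),
]
hits = query(readings, kind="press*", x_range=(118, 223), y_range=(23, 333))
```

Here `hits` contains only the first reading: the second pressure sensor lies outside the x range, and the third reading does not match the `press*` pattern.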
iZone Programming Abstraction
  Abstraction and operators to implement in-network mechanisms to map raw sensor data to application representations
    Hide details/irregularities of measurement infrastructures and sensed data using user-defined functions (e.g. regression, interpolation)
    Provide a consistent representation of sensor data in time and space
  Defining an iZone based on interpolation/regression/model
    Can be specified using a reference point (or line, polygon), ranges, and/or number of readings
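As a rough illustration of specifying an iZone by a reference point, a range, and/or a number of readings, the sketch below selects sensors around a point. The function name `izone`, its parameters, and the tuple layout `(x, y, value)` are assumptions made for this example. Only the reference-point case is covered, not lines or polygons.

```python
import math


def izone(sensors, ref, radius=None, max_readings=None):
    """Select the sensors that form a zone around a reference point.

    sensors: list of (x, y, value) tuples (hypothetical layout)
    ref: (x, y) reference point
    radius: keep only sensors within this distance, if given
    max_readings: keep at most this many nearest sensors, if given
    """
    def distance(sensor):
        return math.dist(ref, sensor[:2])

    ranked = sorted(sensors, key=distance)  # nearest first
    if radius is not None:
        ranked = [s for s in ranked if distance(s) <= radius]
    if max_readings is not None:
        ranked = ranked[:max_readings]
    return ranked


# Made-up sensor positions and values, for illustration only.
sensors = [(0.0, 0.0, 1.0), (3.0, 4.0, 2.0), (10.0, 0.0, 3.0)]
zone_by_range = izone(sensors, ref=(0.0, 0.0), radius=5.0)
zone_by_count = izone(sensors, ref=(0.0, 0.0), max_readings=1)
```

With these inputs, `zone_by_range` keeps the two sensors within distance 5 of the origin, and `zone_by_count` keeps only the nearest one.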
[Figure: iZone illustration with labels X and R.]
Assume a two-level tiered sensor network with sensor clusters and cluster heads.
[Figure: software stack with Applications, iZone operators, and In-network Programming Primitives.]
Programming primitives
  GridMap representation
    E.g. <118:8:223, 23:10:333>, pressure
  GridMap operators
    E.g. query, retrieve, notify, delete, init, refine, coarsen, etc.
  Interpolation functions
    Kriging, IDW, user-defined, …
  iZone operators
    E.g. get/put, aggregate (e.g. sum, weighted sum, …)
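A minimal sketch of how a GridMap region could be expanded into grid points and filled by inverse distance weighting (IDW), one of the interpolation functions listed above. The slides do not define the `<118:8:223, 23:10:333>` notation; reading each axis as `start:step:end` is an assumption. The function names and the `(x, y, value)` sample layout are hypothetical.

```python
import math


def parse_axis(spec):
    """Expand 'start:step:end' into grid coordinates (assumed meaning)."""
    start, step, end = (int(part) for part in spec.split(":"))
    return list(range(start, end + 1, step))


def parse_gridmap(text):
    """Turn '<118:8:223, 23:10:333>' into a list of (x, y) grid points."""
    x_spec, y_spec = (axis.strip() for axis in text.strip("<> ").split(","))
    return [(x, y) for x in parse_axis(x_spec) for y in parse_axis(y_spec)]


def idw(point, samples, power=2.0):
    """Inverse distance weighted estimate at `point`.

    samples: list of (x, y, value) sensor readings
    """
    if not samples:
        raise ValueError("IDW needs at least one sample")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.dist(point, (sx, sy))
        if d == 0.0:
            return value  # the grid point coincides with a sensor
        weight = 1.0 / d ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


grid = parse_gridmap("<118:8:223, 23:10:333>")
# Two made-up pressure samples equidistant from the query point.
midpoint_estimate = idw((1.0, 0.0), [(0.0, 0.0, 2.0), (2.0, 0.0, 4.0)])
```

Under the assumed notation, the region expands to 14 x-coordinates and 32 y-coordinates. Because the two samples are equidistant from the query point, the IDW estimate is their plain average.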
[Figure: software stack with Applications, GridMap operators, and iZone operators.]
Oil Reservoir Management and Optimization
  (1) Start simulation processes
  (2) Request application-specified sensor data
  (3a) Retrieve real-time interpolated pressures
  (3b) Retrieve production rates
  (3c) Retrieve historical information
  (4) Optimize oil reservoir production with simulation process
  (5) Adjust gas injection places/pressures
  (6) Update production rates
  (7) Update data archive
[Figure: workflow around an instrumented oil reservoir. Other labels: data archive; sensor data mapped to simulation grid; pressure models; three-oil-well production distribution, before and after adjustment; water/gas injection placements, before and after adjustment.]
Using GridMap/iZone programming primitives
Deployed sensor network in the oil field
An end-to-end simulation process
(a) init
(b) query
(c) retrieve
Update production policy
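The init, query, retrieve and policy-update steps can be sketched as one pass of a simulation loop. `ToyGridMap` is an in-memory stand-in written for this example; it is not the GridMap/iZone API, and the operator semantics shown are assumptions. Nearest-sensor lookup stands in for in-network interpolation, and the threshold policy is a placeholder.

```python
class ToyGridMap:
    """In-memory stand-in for a GridMap; NOT the real GridMap/iZone API."""

    def init(self, grid_points):
        # (a) init: fix the simulation grid points of interest.
        self.grid_points = list(grid_points)
        self.attribute = None

    def query(self, attribute):
        # (b) query: register which attribute the application wants.
        self.attribute = attribute

    def retrieve(self, sensor_readings):
        # (c) retrieve: return one value per grid point. Nearest-sensor
        # lookup stands in for the in-network interpolation.
        samples = sensor_readings[self.attribute]

        def nearest_value(point):
            closest = min(
                samples,
                key=lambda s: (s[0] - point[0]) ** 2 + (s[1] - point[1]) ** 2,
            )
            return closest[2]

        return {point: nearest_value(point) for point in self.grid_points}


def update_policy(pressures, low_threshold):
    """Placeholder policy: list grid points whose pressure is too low."""
    return sorted(p for p, value in pressures.items() if value < low_threshold)


gridmap = ToyGridMap()
gridmap.init([(0, 0), (10, 0)])
gridmap.query("pressure")
# Made-up readings as (x, y, value), for illustration only.
readings = {"pressure": [(1, 0, 2100.0), (9, 1, 1900.0)]}
pressures = gridmap.retrieve(readings)
flagged = update_policy(pressures, low_threshold=2000.0)
```

In this run the grid point at (10, 0) takes its value from the nearer low-pressure sensor, so it is the only point flagged for a policy update.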
Experimental Evaluation
  Objectives
    Demonstrate using GridMap/iZone to support the integration of computational processes with real-time in-network processing of sensor information
    Proof-of-concept performance evaluations: impact of accuracy and communication costs.
  Experiment setup
    About 800 sensors distributed over an 800 × 2000 ft² area
    About 40 clusters are initialized
[Figure: deployed sensor network (elements labeled 1 to 5 and a gateway), in-network computation, and an end-to-end simulation process.]
Impact on communication costs
  Given accuracy requirements, what are the communication costs of querying the GridMap for different numbers of grid points?
    For a given number of grid points, increasing the required accuracy increases the volume of communication.
    The number of grid points on the GridMap increases less than proportionally to the increase in the volume of communication.
Conclusions
  The GridMap/iZone Programming System
    Provides programming abstractions for integrating sensor systems with computational models for scientific processes (e.g. biophysical, geophysical processes) and with other application components in an end-to-end experiment.
    Provides programming abstractions and system software support for developing in-network data processing mechanisms.
  Enabling End-to-end Sensor-driven Scientific and Engineering Applications:
    End-to-end oil reservoir
Links
  http://nsfcac.rutgers.edu/
  http://nsfcac.rutgers.edu/doc/html/Programming_Sensor-driven_Autonomic_Applications/
  http://www.caip.rutgers.edu/~nanyanj/GridMap.html
Related Publication
1. Enabling End-to-end Data-driven Sensor-based Scientific and Engineering Applications, Proceedings of the Workshop on Dynamic Data-Driven Application Systems (DDDAS '09), in conjunction with the International Conference on Computational Science (ICCS 2009), Springer Verlag, Baton Rouge, Louisiana, May, 2009.
2. Enabling Autonomic Power-Aware Management of Instrumented Data Centers, The Fifth Workshop on High-Performance, Power-Aware Computing (HPPAC 2009), in conjunction with the 23rd Annual International Parallel & Distributed Processing Symposium (IPDPS 2009), Rome, Italy. Accepted, 2009.
3. In-network Data Estimation Mechanisms for Sensor-driven Scientific Applications, 15th IEEE International Conference on High Performance Computing (HiPC 2008), Bangalore, India, December, 2008.
4. Programming Sensor-based, Dynamic Data-driven Scientific Applications, IEEE International Parallel & Distributed Processing Symposium (IPDPS) Ph.D Forum, Miami, FL, April, 2008
5. Programming Support for Sensor-based Scientific Applications, The NSF Next Generation Software (NGS) Workshop held in conjunction with IPDPS, Miami, FL, April, 2008
6. Meteor: A Middleware Infrastructure for Content-based Decoupled Interactions in Pervasive Grid Environments, Concurrency and Computation: Practice and Experience, John Wiley and Sons, 2007.
7. A Decentralized Content-based Aggregation Service for Pervasive Environments, International Conference of Pervasive Services (ICPS), June, 2006.
8. Enabling Applications in sensor-based Pervasive Environments, (BaseNets 2004), San Jose, CA, USA October 25, 2004
Questions?
Acknowledgement
The research presented in this paper is supported in part by the National Science Foundation via grant numbers CNS 0723594, IIP 0758566, IIP 0733988, CNS 0305495, CNS 0426354, IIS 0430826 and ANI 0335244, and by the Department of Energy via grant number DE-FG02-06ER54857, and was conducted as part of the NSF Center for Autonomic Computing at Rutgers University.
Thank you
Backup Slides
Autonomic Instrumented Data Center Management
  Large-scale and dense sensor networks, e.g. temperature, humidity, etc.
  Data center power efficiency and job allocation management
    Integrate snapshots of physical phenomena monitored by sensor networks with computational models (e.g., heat distribution or online learning, power consumption models, etc.) to improve energy efficiency.
    Simulation model: optimize power consumption and job processing throughput
[Figure: instrumented data center connected over a wide-area network. Other labels: sensor data mapped to simulation grid, data archive, job requests, power models, job allocation policy.]
Programming Instrumented Data Center
[Figure: temperature (F) plotted over a grid with both horizontal axes running from 0 to 20; temperature axis from 65 to 95.]
  (1) Job requests
  (2) Request application-specified sensor data
  (3a) Retrieve real-time interpolated data
  (3b) Retrieve current load distribution
  (3c) Retrieve historical information
  (4) Simulation processes: e.g., optimize power usage and job throughput
  (5a) Update temperature control policy, e.g., cooling configuration
  (5b) Update job allocation policy
  (6) In-network analysis, e.g., hotspots
  (7) Update data archive
[Figure: workflow around an instrumented data center. Other labels: data archive, job requests, power models, job migration, sensor data mapped to simulation grid.]
Communication Costs
As the area of the GridMap increases, so does the average number of messages required per update.