Copyright Martin Litzenberger at CeDEM14
Management and Analysis of Large Scale Heterogeneous Time-Series Data
Sensor and Government Data: Their Role in Public Policy
Martin Litzenberger
Safety and Security Department
AIT Austrian Institute of Technology
Martin Litzenberger | Senior Engineer | DSS SNI
Motivation
• Public institutions today collect a plethora of heterogeneous data with various sensors
• But the data and their use are usually restricted to the domain or department they belong to, e.g. security surveillance, traffic, public transport, air quality, power grids, ...
• Reasons: lack of interoperability, and often a lack of communication and cooperation among data owners
23.05.2014
Advantages
• Connecting these data, or even collecting them on a common platform, would allow new ways of analysis and insight into important and interesting mechanisms (e.g. traffic / air quality)
• But the data are heterogeneous in many aspects, such as format, update frequency, representation, owner and accessibility, which makes a joint analysis a big challenge
• Real-time 24/7 processing and availability, not a “one-time” academic investigation!
Challenge: Heterogeneity of Data
• Temporal heterogeneity
  • Discrete events versus regular time series
• Spatial heterogeneity
  • “On-site” versus “as near as possible”
• Semantic heterogeneity
  • The same parameters might have different significance in different contexts
• Technical heterogeneity
  • Non-standardized interfaces, formats, etc.
• Political heterogeneity
  • “Owners” of data have different missions and goals
Case Study
• Investigating the effects of traffic state (free flow / stop & go) on local air quality
• Data sources
  • Traffic monitor for traffic volume and acceleration
  • Black carbon sensors at the road side and at a background station
  • Meteorological station
Case Study: Combined Air Quality and Traffic Monitoring
• Different owners
  • City Council, State AQ Department and the project's own sensors
• Different data intervals
  • Traffic: individual vehicles (~4000 data sets per hour, each with speed, acceleration and vehicle class)
  • Air quality & meteo: fixed frequency, 30 min averages (48 data points/day)
• Pre-processing
  • Temporal alignment & aggregation
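The pre-processing step above can be sketched as follows: per-vehicle events are counted into 30 min bins, which are then joined with the 30 min air quality averages on their timestamps. This is a minimal illustration, not the openUwedat implementation. The function names, the tuple layout of an event, the `acc_threshold` parameter and the convention that each bin is the left-open interval ]end - 30 min, end] labelled by its end time are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BIN_WIDTH = timedelta(minutes=30)  # matches the 30 min air quality averages


def bin_end(ts, width=BIN_WIDTH):
    """Return the end of the interval ]end - width, end] that contains ts.

    Assumption: bins are aligned to midnight and labelled by their end time.
    """
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    # Ceiling division: (midnight - ts) is <= 0, floor-divide, then negate.
    n_bins = -((midnight - ts) // width)
    return midnight + n_bins * width


def aggregate_vehicles(events, acc_threshold):
    """Count vehicles per bin.

    events: iterable of (timestamp, speed, acceleration, vehicle_class).
    acc_threshold: hypothetical cut-off above which a vehicle counts
    as "accelerating"; the source does not give a value.
    """
    counts = defaultdict(lambda: {"total": 0, "accelerating": 0})
    for ts, _speed, acceleration, _vehicle_class in events:
        slot = counts[bin_end(ts)]
        slot["total"] += 1
        if acceleration > acc_threshold:
            slot["accelerating"] += 1
    return dict(counts)


def align(traffic_bins, air_quality_bins):
    """Inner join of two {bin_end: value} mappings on the bin timestamp."""
    common = sorted(traffic_bins.keys() & air_quality_bins.keys())
    return [(t, traffic_bins[t], air_quality_bins[t]) for t in common]
```

Bins that exist in only one of the two series are dropped by the join, which keeps later comparisons restricted to intervals where both sensors delivered data.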
Goal: Investigating a “black carbon equivalent” for traffic
• Accelerating cars have higher tailpipe emissions than “free flowing” vehicles
Approach:
$Q_{\text{BC}} = Q_{\text{total vehicles}} + 6 \cdot Q_{\text{accelerating vehicles}}$
(can be made more complex, e.g. by including weight factors for HGVs)
• Local (road-side) black carbon concentrations need to be reduced by “background” values to isolate the traffic-related component
$C_{\text{BC}} = C_{\text{road}} - C_{\text{background}}$
Wind speed needs to be considered at the same time.
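The two formulas above can be written as a short sketch. The function and parameter names are illustrative assumptions; only the weight of 6 and the subtraction of the background value come from the slide.

```python
def bc_equivalent(total_vehicles, accelerating_vehicles, weight=6.0):
    """Traffic "black carbon equivalent" for one aggregation interval.

    Implements Q_BC = Q_total + weight * Q_accelerating, with the
    weight of 6 taken from the slide. Further weights (e.g. for HGVs)
    are mentioned in the source but not specified, so they are left out.
    """
    return total_vehicles + weight * accelerating_vehicles


def traffic_component(c_road, c_background):
    """Traffic-related black carbon: road-side minus background value.

    The result can be negative when the background station reads higher
    than the road-side sensor; this sketch does not clamp it.
    """
    return c_road - c_background
```

For example, an interval with 100 vehicles of which 10 were accelerating gives `bc_equivalent(100, 10) == 160.0`.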
Solution: What is openUwedat?
• openUwedat is a toolbox for building time-series related applications
• The toolbox contains many ready-made, adaptable programs
• The toolbox contains libraries for writing your own programs, which integrate seamlessly with the existing ones
[Diagram: three “Driver” modules and a “Database”, labelled “configurable”]
What can I do with openUwedat?
• openUwedat lets you interact with any kind of time-series device. You can integrate new devices by writing new modules which act as “drivers”.
• Typical devices are:
  • Measurement devices
  • Data acquisition systems (station computers)
  • Other time-series management systems
  • Databases (SQL and NoSQL)
  • …
Implementation in openUwedat
• Powerful scripting language “Formula 3”
• Real-time interfaces and real-time processing pipes
Example: implementing the BC-equivalent function in Formula 3
@A="name=Database; type=Aggregation;Source=TDS;Sensor=S4.TDS1;Lane=0"
@B="name=Database; type=Aggregation;Source=TDS;Sensor=S4.TDS1;Lane=1"
<<(A.accCount[i]+B.accCount[i]+A.decCount[i]+B.decCount[i])*6+A.totalFlow[i]+B.totalFlow[i]>> |
<< sum( _ ]t-60mins..t] ) >> every 60 mins
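For readers unfamiliar with Formula 3, the snippet above can be paraphrased in Python. This is only one reading of the snippet, not a statement about Formula 3 semantics: it assumes that the first stage computes one weighted value per sample from both lanes, and that the second stage sums those values over the left-open interval ]t - 60 min, t] once per hour. Note that the snippet weights decelerating as well as accelerating vehicles. The function names and the dict-based sample layout are hypothetical.

```python
from datetime import datetime, timedelta


def bc_sample(lane_a, lane_b, weight=6):
    """Per-sample BC equivalent from two lanes (first stage of the pipe).

    lane_a, lane_b: dicts with the keys accCount, decCount and totalFlow,
    mirroring the field names used in the Formula 3 snippet.
    """
    weighted = (lane_a["accCount"] + lane_b["accCount"]
                + lane_a["decCount"] + lane_b["decCount"]) * weight
    return weighted + lane_a["totalFlow"] + lane_b["totalFlow"]


def periodic_sums(samples, start, end, step=timedelta(minutes=60)):
    """Sum samples over ]t - step, t] for t = start + step, ..., end.

    samples: list of (timestamp, value) pairs.
    A sample exactly at t - step is excluded, one exactly at t is included.
    """
    results = []
    t = start + step
    while t <= end:
        total = sum(value for ts, value in samples if t - step < ts <= t)
        results.append((t, total))
        t += step
    return results
```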
Typical Result Traffic / Air Quality
Very good correlation between the traffic and air quality data, but it depends on meteorological conditions: during episodes of stronger wind, the correlation drops.
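One way to check the wind dependence is to compute the correlation separately for calm and windy intervals. The sketch below uses a plain Pearson coefficient. The record layout, the function names and the `wind_threshold` parameter are assumptions, and the source gives no threshold value or correlation figures.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def correlation_by_wind(records, wind_threshold):
    """Split records at a wind speed threshold and correlate each group.

    records: iterable of (q_bc, c_bc, wind_speed) per aggregation interval.
    wind_threshold: hypothetical cut-off chosen by the analyst.
    Returns (r_calm, r_windy); a group with fewer than 2 records gives None.
    """
    calm, windy = [], []
    for q_bc, c_bc, wind_speed in records:
        (calm if wind_speed <= wind_threshold else windy).append((q_bc, c_bc))

    def group_r(pairs):
        if len(pairs) < 2:
            return None
        return pearson([q for q, _ in pairs], [c for _, c in pairs])

    return group_r(calm), group_r(windy)
```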
Conclusions
• Public authorities collect plenty of heterogeneous data on a regular basis, day by day
• The potential to analyse these data together remains mostly unused because of:
  • Lack of cooperation between authorities / departments
  • Lack of interoperability of the systems
• The case study on traffic / air quality shows how heterogeneous data analysis can create new insights
• AIT’s openUwedat data management toolbox allows:
  • Collection of large scale heterogeneous time-series data from different sources
  • Complex analysis using a powerful scripting language
AIT Austrian Institute of Technology
your ingenious partner
Martin Litzenberger