National Technical University of Athens, Forecasting & Strategy Unit
35th International Symposium on Forecasting, Riverside, California – Demand Forecasting 1
Spiliotis Evangelos – Forecasting & Time Series Prediction stream
Exploiting business intelligence of water companies. ForWarD: an online Water Demand Forecasting tool
Evangelos Spiliotis
Co-Authors: Achilleas Raptis, Elektra Skepetari, Prof. Vassilios Assimakopoulos
Our motivation
Water companies face numerous challenges:
• Decreasing water supply
• Increasing population
• Changes in the distribution of the population and its habits
Aim: Produce mid-term forecasts of the water demand, the water supplies and the consumption per water supply for large-scale supply systems.
Detailed forecasts at monthly frequency. Forecasts may refer:
• to the whole system
• to specific areas (defined through postal codes and/or municipalities)
• to specific consumption intervals
High forecasting accuracy, low computational time
Water companies need to optimize their services and pricing policy based on reliable forecasts.
Proposed solution
The amount of data and the complexity of the problem led to the development of a Custom, Web Based & Open Source Water Demand Forecasting Tool: ForWarD
1. Fully based on Open Source solutions
2. A portal to easily access the whole water demand dataset
3. Instant cluster analysis of the data
4. Instant forecasts for the chosen data
5. Fully adjustable forecasting parameters
6. Good accuracy relative to the available computational time
7. Remote use of the tool
8. Multiple applications can be handled simultaneously
9. User-friendly interface
10. Suitable for both experienced and inexperienced users in the field of forecasting
11. Visualization and export of the raw data and the results (PNG/JPEG images, PDF documents, SVG, CSV)
Architecture of the system
Database Server
• Raw data derived from the DSS data warehouse of the company
• Intermediate MySQL database between the data warehouse and ForWarD:
  • Deals with big data by filtering useful information
  • Database updated monthly via Oracle stored procedures
Application Web Server
• Shiny: R package for building web applications using R
• RMySQL: allows ForWarD to retrieve data from a MySQL database
• R packages: forecast & MAPA
• rCharts: R interface to JavaScript charting libraries such as MorrisJS, NVD3, xCharts and HighCharts
Architecture of the system (2/2)
Data cleansing & preprocessing:
• Detect & smooth additive outliers and level shifts
• Normalize zero values
• Deseasonalization (optional)
Forecasting:
• Methods: ETS, MAPA, Theta, ARIMA, Naïve or ‘Auto Forecast’
• Calculate in-sample & out-of-sample errors
• Confidence intervals
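The slides do not say how additive outliers are detected, so the following is only a minimal sketch of one common approach, not ForWarD's actual rule. It assumes a centered rolling median as the local level and flags points whose residual exceeds `k` times a MAD-based scale estimate. The function name, `window` and `k` are hypothetical choices made for illustration.

```python
from statistics import median


def smooth_additive_outliers(series, window=5, k=3.0):
    """Replace isolated spikes with the local median (illustrative rule only)."""
    half = window // 2
    n = len(series)
    # Centered rolling median; the window is truncated at both ends of the series.
    local = [median(series[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]
    residuals = [x - m for x, m in zip(series, local)]
    # Median absolute deviation of the residuals, used as a robust scale estimate.
    mad = median(abs(r) for r in residuals)
    if mad == 0:
        return list(series), []
    # 1.4826 rescales the MAD to match the standard deviation under normality.
    threshold = k * 1.4826 * mad
    flagged = [i for i, r in enumerate(residuals) if abs(r) > threshold]
    cleaned = [local[i] if i in flagged else x for i, x in enumerate(series)]
    return cleaned, flagged


# Hypothetical monthly demand with a single spike at position 4.
demand = [100, 102, 98, 101, 250, 99, 103, 97, 100, 102]
cleaned, flagged = smooth_additive_outliers(demand)
```

A rule like this only handles isolated spikes. Level shifts, which the slide also lists, would need a separate test, for example comparing the mean level before and after each candidate break point.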
Clustering:
Variables:
• Monthly data with a duration of a year
• The consumption intervals chosen
• Number of clusters
Method: Ward's minimum variance method
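To make the clustering criterion concrete, here is a minimal pure-Python sketch of agglomerative clustering with Ward's minimum variance criterion. At each step it merges the pair of clusters whose merge causes the smallest increase in the total within-cluster sum of squares. This is an illustration, not the tool's code: ForWarD is built in R, and the function name and the toy profiles below are hypothetical.

```python
def ward_clusters(points, n_clusters):
    """Agglomerative clustering using Ward's minimum variance criterion."""
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ia, ca), (ib, cb) = clusters[a], clusters[b]
                na, nb = len(ia), len(ib)
                dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in within-cluster sum of squares if a and b are merged.
                cost = na * nb / (na + nb) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ia, ca), (ib, cb) = clusters[a], clusters[b]
        na, nb = len(ia), len(ib)
        # The centroid of the merged cluster is the size-weighted mean.
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[a] = (ia + ib, centroid)
        del clusters[b]
    return [sorted(members) for members, _ in clusters]


# Hypothetical two-dimensional profiles (e.g. demand in two consumption intervals).
profiles = [[1.0, 1.0], [1.2, 0.9], [10.0, 10.0], [10.5, 9.8], [5.0, 0.0]]
groups = ward_clusters(profiles, n_clusters=3)
```

The pairwise search makes this sketch cubic in the number of series, which is fine for a few hundred postal codes or municipalities but not for individual water supplies.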
Data & requirements
The data set:
Our data refer to the county of Attica, Greece
• 3 variables to be forecasted
• 22 consumption intervals
• 96 municipalities
• 309 postal codes
• 2,400,000+ water supplies
Data available per water supply
Length: Jan 2007 to Dec 2014
Demonstration of the tool: Forecasting Module (1/2)
Demonstration of the tool: Forecasting Module (2/2)
Demonstration of the tool: Clustering Module (1/2)
Demonstration of the tool: Clustering Module (2/2)
Case study: Forecasting water demand in Attica
Experimental design
• Postal Code 10 432, interval (0,1]
• Postal Code 10 432, all intervals
• Athens, all intervals
• Attica, all intervals
Aggregate data? Bottom-up or top-down?
Group based on total water demand or on water demand per consumption interval?
Results
4 clusters indicating differences in the amount of water consumed across Attica per interval
• C1: Downtown and rural areas (92)
• C2: Urban areas (79)
• C3: Areas with parks, hospitals, stadiums, airports & industries (42)
• C4: Suburban areas (85)
Data: Jan ’07 to Dec ’11
Forecasting horizon: 24 months
Method: Auto forecast
[Figure: Water Demand (1000 m³) per Consumption Interval for Cluster 1, Cluster 2, Cluster 3 and Cluster 4. Vertical axis from 0 to 300; consumption intervals from [0,0] and (0,1) up to [40,45).]
The bottom-up approach (sMAPE 3.56%) performed better than the top-down approach (sMAPE 3.82%) for predicting total water demand
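For readers unfamiliar with the error measure, the sketch below shows one widely used definition of sMAPE and how a bottom-up forecast of the total is formed by summing lower-level forecasts. The slides do not state which sMAPE variant was used, so the formula is an assumption, and all numbers are made up for illustration; they are not the study's data.

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent: mean of 2|y - f| / (|y| + |f|).

    Assumes actual and forecast are never both zero in the same period.
    """
    terms = [2 * abs(y - f) / (abs(y) + abs(f)) for y, f in zip(actual, forecast)]
    return 100 * sum(terms) / len(terms)


def bottom_up(group_forecasts):
    """Sum the forecasts of the lower-level series, period by period."""
    return [sum(values) for values in zip(*group_forecasts)]


# Hypothetical example: two clusters, two forecast periods.
actual_total = [100, 110]
cluster_forecasts = [[40, 45], [58, 62]]   # one list per cluster
direct_total_forecast = [95, 115]          # forecast made on the aggregated series

bottom_up_forecast = bottom_up(cluster_forecasts)
error_bottom_up = smape(actual_total, bottom_up_forecast)
error_direct = smape(actual_total, direct_total_forecast)
```

Comparing `error_bottom_up` with `error_direct` mirrors the comparison reported on the slide, with the caveat that which approach wins depends on the data and the forecasting method.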
Further work
• Improve the existing forecasting methodology behind ‘Auto forecast’:
  • Include more forecasting models
  • Advanced model selection methods (e.g. rolling origins)
  • Advanced decomposition methods
  • More preprocessing techniques (e.g. transforms)
• Extended customization of the applied forecasting methodology by the user
• Emphasis on the application interface
• Optimization of the code (parallel programming)
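Rolling-origin evaluation, mentioned above as a possible model selection method, can be sketched as follows. The forecast origin is moved forward one period at a time, each candidate is refitted on the data up to the origin, and its errors over the following `horizon` periods are averaged. This is an illustrative sketch under simple assumptions (two toy forecasters, mean absolute error), not the planned ForWarD implementation; all function names are hypothetical.

```python
def naive(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def seasonal_naive(history, horizon, period=12):
    """Repeat the value observed one seasonal period earlier."""
    return [history[len(history) - period + (h % period)] for h in range(horizon)]


def rolling_origin_error(series, forecaster, min_train, horizon):
    """Mean absolute error averaged over successive forecast origins."""
    errors = []
    for origin in range(min_train, len(series) - horizon + 1):
        train, test = series[:origin], series[origin:origin + horizon]
        forecast = forecaster(train, horizon)
        errors.extend(abs(y - f) for y, f in zip(test, forecast))
    return sum(errors) / len(errors)


def select_model(series, candidates, min_train, horizon):
    """Return the name of the candidate with the lowest rolling-origin error."""
    scores = {name: rolling_origin_error(series, f, min_train, horizon)
              for name, f in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical series with a clean period-4 pattern.
series = [10, 20, 30, 40] * 6
candidates = {
    "naive": naive,
    "seasonal_naive": lambda hist, h: seasonal_naive(hist, h, period=4),
}
best, scores = select_model(series, candidates, min_train=8, horizon=4)
```

Because every origin is evaluated independently, the loop over origins is also a natural place to apply the parallel programming mentioned in the last bullet.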
References
1. Assimakopoulos, V., Nikolopoulos, K., 2000. The theta model: a decomposition approach to forecasting. International Journal of Forecasting 16 (4), 521-530.
2. Hyndman, R., Khandakar, Y., 2008. Automatic time series forecasting: the forecast package for R. Journal of Statistical Software 26 (3), 1-22.
3. Hyndman, R. J., Koehler, A. B., Snyder, R. D., Grose, S., 2002. A state space framework for automatic forecasting using exponential smoothing methods. International Journal of Forecasting 18 (3), 439-454.
4. Kourentzes, N., Petropoulos, F., Trapero, J. R., 2014. Improving forecasting by estimating time series structural components across multiple frequencies. International Journal of Forecasting 30 (2), 291-302.
5. Kourentzes, N., Petropoulos, F., 2014. Improving forecasting via multiple temporal aggregation. Foresight: The International Journal of Applied Forecasting 34, 12-17.
6. Murtagh, F., Legendre, P., 2014. Ward's hierarchical agglomerative clustering method: Which algorithms implement Ward's criterion? Journal of Classification 31 (3), 274-295.
7. R Development Core Team, 2008. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
8. RStudio, Inc., 2014. shiny: Web Application Framework for R. R package version 0.10.2.2.
9. Hyndman, R., 2015. forecast: Forecasting functions for time series and linear models. R package version 5.8.
10. Vaidyanathan, R., 2013. rCharts: Interactive Charts using JavaScript Visualization Libraries. R package version 0.4.5.
Thank you for your attention
Any questions?
If you would like more information about our work, contact me at: [email protected]
Or visit the Forecasting & Strategy Unit’s website:
http://www.fsu.gr