
Renewable Energy 123 (2018) 191–203

Contents lists available at ScienceDirect

Renewable Energy

journal homepage: www.elsevier.com/locate/renene

Assessment of machine learning techniques for deterministic and probabilistic intra-hour solar forecasts

Hugo T.C. Pedro a, Carlos F.M. Coimbra a,*, Mathieu David b, Philippe Lauret b,a

a Department of Mechanical and Aerospace Engineering, Jacobs School of Engineering, Center for Energy Research, University of California San Diego, La Jolla, CA, 92093, USA
b Université de La Réunion, 15 avenue René Cassin, 97715, Saint-Denis Cedex 9, France

Article info

Article history:
Received 5 September 2017
Received in revised form 31 January 2018
Accepted 1 February 2018
Available online 8 February 2018

Keywords:
Probabilistic solar forecasts
Global irradiance
Direct irradiance
Machine learning
Sky imagery

* Corresponding author. E-mail address: [email protected] (C.F.M. Coimbra).

https://doi.org/10.1016/j.renene.2018.02.006
0960-1481/© 2018 Elsevier Ltd. All rights reserved.

Abstract

This work compares the performance of machine learning methods (k-nearest-neighbors (kNN) and gradient boosting (GB)) in intra-hour forecasting of global (GHI) and direct normal (DNI) irradiances. The models predict the GHI and DNI and the corresponding prediction intervals. The data used in this work include pyranometer measurements of GHI and DNI and sky images. Point forecasts are evaluated using bulk error metrics, while the performance of the probabilistic forecasts is quantified using metrics such as the Prediction Interval Coverage Probability (PICP), the Prediction Interval Normalized Averaged Width (PINAW) and the Continuous Ranked Probability Score (CRPS). Graphical verification displays like the reliability diagram and the rank histogram are used to assess the probabilistic forecasts. Results show that the machine learning models achieve significant forecast improvements over the reference model. The reduction in the RMSE translates into forecasting skills ranging between 8% and 24%, and 10% and 30%, for the GHI and DNI testing sets, respectively. CRPS skill scores of 42% and 62% are obtained for GHI and DNI probabilistic forecasts, respectively. Regarding the point forecasts, the GB method performs better than the kNN method when sky image features are included in the model. Conversely, for probabilistic forecasts the kNN exhibits rather good performance.

© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

Solar forecasts at various time horizons are needed in order to increase the share of solar energy in electricity grids. Indeed, the intermittent character of solar energy may result in imbalances between electricity supply and demand. This requires the power system to either procure additional reserves or adjust the output of conventional generators so as to ensure balance between supply and demand [1]. It is therefore important that solar irradiance and the corresponding solar power output are accurately predicted so that the utility grid is able to take appropriate actions to manage intermittency.

In this study, we focus on intra-hour global horizontal irradiance (GHI) and direct normal irradiance (DNI) forecasts with forecasting time horizons ranging from 5 min up to 30 min. Targeting these two irradiance components is important: DNI is of particular interest to concentrating solar power (CSP) plants and installations that track


the position of the sun; and both DNI and GHI can be used to estimate the plane-of-array irradiance on tilted/tracking PV panels. With respect to the time horizons, as mentioned by Ref. [1], intra-hour forecasts are relevant for optimal central plant operations, and are an enabling technology for optimal dispatch of ancillary resources and storage systems. Moreover, mitigation measures for large drops in solar irradiance, such as demand response, storage and intra-hour scheduling, can only be maximized with accurate and reliable intra-hour forecasts [1].

Depending on the forecast horizon, different types of forecasting models are appropriate [2]. In the solar forecasting community, numerous works have been devoted to the development of models that generate point forecasts, also called deterministic forecasts [3–6]. In addition, it must be noted that most of these works focus on the prediction of the GHI with forecasting models using only endogenous ground telemetry.

Only recently have Refs. [7,8] added exogenous features derived from a sky camera to produce intra-hour DNI and GHI forecasts. Indeed, short-term fluctuations in solar irradiance are dictated, almost exclusively, by cloud cover. A single passing cloud can bring the power output of a solar farm from full production to minimum and back to


full in a matter of minutes or even seconds [9]. In general, the methodology for processing ground-based images and including them in the forecast models is based on cloud motion detection and the propagation of the cloud field into the future [7,10–17].

In this work, the images are not used to monitor cloud dynamics; instead, they are used to provide the models' input features, such as the entropy of the Red, Green and Blue channels. Consequently, the data used here consist of pyranometer measurements of GHI and DNI and sky images captured with an off-the-shelf security camera pointing at the zenith.

Another timely and relevant topic for grid operators concerns the generation of solar probabilistic forecasts. A forecast is inherently uncertain and, in the context of the decision-making faced by the grid operator, a point forecast plus a prediction interval is of key importance. Indeed, it is a way of judging the reliability of forecasts. Works regarding the production of short-term solar probabilistic forecasts using different types of techniques are relatively recent [18–21]. Similarly to the deterministic point forecasts, most of these works were devoted to GHI and are based on endogenous predictors.

In Ref. [8], one of the simplest methods among the machine learning algorithms, namely the kNN method, was chosen to build intra-hour GHI and DNI forecasts. An optimization algorithm determines the best set of features and other free parameters in the model, such as the number of nearest neighbors. A simple method derived from the k nearest neighbors allowed the computation of the prediction uncertainty intervals. Results showed that the model achieves significant forecast improvements (between 10% and 25%) over a reference persistence forecast. However, the inclusion of sky images in the pattern recognition results in only a small improvement (below 5%) relative to the kNN without images, but it helps in the definition of the prediction intervals (especially in the case of DNI).

Machine learning methods like kNN are increasingly employed in the solar forecasting community for producing point and probabilistic forecasts [22]. As an illustration, Ref. [23] used gradient boosting for the deterministic forecasting of solar power and kNN for estimating prediction intervals.

The goal of this work is to assess whether more sophisticated machine learning methods like Gradient Boosting (GB) can yield better results than the kNN method in terms of point and probabilistic forecasts. In particular, we will check whether this method performs better in the case where sky images are included in the model.

In addition, contrary to the simple method used by Pedro and Coimbra [8] to compute the uncertainty prediction intervals, the uncertainty related to the forecast will be given by a set of quantile forecasts. Indeed, as noted by Ref. [24], quantile forecasts are key products for energy applications. Prediction intervals can then be inferred from this set of quantiles. Finally, in order to evaluate and to rank the solar probabilistic forecasts, and following Iversen et al. [21], Alessandrini et al. [25], Sperati et al. [26], Bessa et al. [27] and Pinson et al. [28], we propose to use the reliability diagram, the rank histogram (or Talagrand histogram) and the continuous ranked probability score (CRPS). We will also assess the sharpness of the forecasts by calculating the prediction interval normalized average width (PINAW).

The remainder of this paper is organized as follows. Section 2 describes the data used to build and to test the different models and describes how the input features are derived from irradiance data and sky images. Section 3 gives a description of the methods used for generating point forecasts. Section 4 explains the techniques used to produce the probabilistic forecasts. Section 5 details the error metrics used to evaluate the point and the probabilistic forecasts. Section 6 discusses the results of point forecasts and probabilistic forecasts. Finally, Section 7 gives some concluding remarks.

2. Data and methodology

2.1. Data

The forecasting models are trained for solar irradiance measurements (GHI and DNI) obtained at Folsom, CA (38.64°N, 121.14°W). The data span one year from December 2012 to December 2013. The raw 1-min data were quality controlled to remove physically impossible values, averaged into 5-min bins and divided into three data sets: training, validation and testing. The three data sets were constructed by grouping disjoint subsets for each month, thus ensuring that all data sets are representative of the irradiance data over the whole year.

We follow the common practice in solar energy of working with the clear-sky index k_t instead of the original solar irradiance time series. The clear-sky index is defined as

k_t(t) = \frac{I(t)}{I_{clr}(t)}    (1)

where I is the solar irradiance, GHI or DNI, and I_{clr} is the clear-sky irradiance computed following the algorithm in Ref. [29]. Table 1 lists several statistical metrics for GHI, DNI and the respective k_t values for the three data sets.
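Eq. (1) amounts to an element-wise division, with a guard against near-zero clear-sky values around sunrise and sunset; a minimal sketch with hypothetical values (the clear-sky irradiance itself would come from the algorithm of Ref. [29]):

```python
import numpy as np

def clear_sky_index(irradiance, clear_sky, eps=1e-6):
    """Clear-sky index kt = I / I_clr (Eq. (1)), with a guard against
    division by very small clear-sky values near sunrise/sunset."""
    irradiance = np.asarray(irradiance, dtype=float)
    clear_sky = np.asarray(clear_sky, dtype=float)
    # Where the clear-sky irradiance is essentially zero, define kt = 0.
    return np.where(clear_sky > eps,
                    irradiance / np.maximum(clear_sky, eps),
                    0.0)

# Hypothetical measured GHI vs. modeled clear-sky GHI (W/m^2)
ghi = np.array([0.0, 250.0, 480.0, 900.0])
ghi_clr = np.array([0.0, 300.0, 600.0, 1000.0])
kt = clear_sky_index(ghi, ghi_clr)
print(kt)  # kt is 0 where the clear-sky value is ~0
```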

The sky images were obtained, for the same time period, with an off-the-shelf security camera (Vivotek, model FE8171V) pointing at the zenith. The camera provides 24-bit images compressed in JPEG format, with 8 bits per color channel (Red, Green and Blue). The image resolution is 1563 by 1538 pixels, of which ≈50% correspond to the sky dome. The images were synchronized with the irradiance data and also divided into the same three data sets.

Finally, it must be stressed that the present work focuses on the methodology used for developing the forecasting methods and therefore does not address more general questions such as the applicability of the methods to a wide range of solar variability microclimates or the effect of very long (multiple years of) training data.

2.1.1. Features from irradiance and sky images

Features or predictors are derived from the irradiance time series and the sky images. The predictors are then used by the forecasting models explained below. The algorithm used to extract the features is explained in detail in Pedro and Coimbra [8]. The set of features derived from the irradiance time series (GHI and DNI) consists of:

• The backward average of the clear-sky index time series, denoted B(t);
• The lagged 5-min average values of the 5-min clear-sky index time series, denoted L(t);
• The time series variability V(t).

All these features are derived from the clear-sky index, thus removing deterministic components due to the daily solar cycle. The features are computed for several windows in the 120 min that precede the forecast issuing time. Fig. 1(b) shows these features computed for the GHI time series. Fig. 1(a) highlights (shaded area) the data in the window between 13:30 and 15:30, which is used to compute the three features.
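The exact definitions of B(t), L(t) and V(t) are given in Ref. [8]; the sketch below shows one plausible reading of the three features over a trailing window (the variability definition in particular is an assumption of this sketch):

```python
import numpy as np

def backward_average(kt, window):
    """B(t): mean of the clear-sky index over the trailing `window` samples."""
    return float(np.mean(kt[-window:]))

def lagged_values(kt, n_lags):
    """L(t): the last `n_lags` 5-min clear-sky-index values, most recent first."""
    return kt[-n_lags:][::-1].copy()

def variability(kt, window):
    """V(t): standard deviation of the step-to-step changes of kt in the
    window (a plausible definition; Ref. [8] gives the exact one)."""
    return float(np.std(np.diff(kt[-window:])))

# Hypothetical 5-min clear-sky-index samples
kt = np.array([1.0, 0.9, 0.95, 0.7, 0.8, 1.0])
print(backward_average(kt, 4), variability(kt, 4))
```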

The cloud cover information is also incorporated into the forecast using predictors derived from the sky images. Here we use a


Table 1
Statistical values (μ = mean, σ = standard deviation, and maximum) for GHI, DNI and the respective clear-sky indices. Values are reported for each data set. The number of samples in each data set (N) is given in the second column.

Set         N      | GHI [W m^-2]          | kt [-]      | DNI [W m^-2]          | kt [-]
                   | μ      σ      max     | μ     σ     | μ      σ      max     | μ     σ
Training    22203  | 477.2  282.3  1221.0  | 0.86  0.25  | 584.7  329.5  962.4   | 0.79  0.44
Validation  8879   | 526.1  271.6  1083.8  | 0.95  0.16  | 701.4  238.6  972.6   | 0.95  0.32
Testing     15807  | 446.4  256.3  1048.0  | 0.91  0.23  | 628.3  301.3  977.4   | 0.88  0.42

Fig. 1. Irradiance and image features for the forecast issuing time “2013-02-22 15:30:00 PST”. (a) GHI, clear-sky GHI and the clear-sky index for GHI for a 6-h window on 2013-02-22 at Folsom, CA. The arrow indicates that the features are computed using data that precede the forecast issuing time. The shaded window shows the maximum 2-h window used to compute the features for this time stamp. (b) The three GHI features B, L, and V. (c) Nine sky images spaced evenly in the 2-h window. (d) The average of the R, G and B channels (left) and the entropy of the R, G and B channels (right) for the sky images in (c).


simple image processing algorithm that computes the average (μ), standard deviation (σ) and entropy (e) for the Red (r), Green (g), Blue (b) and Red-to-Blue ratio (ρ) data sets. The first three data sets are obtained by simply converting the 8-bit data into floating point vectors. The last one is a vector with components ρ_i = r_i / b_i.

These values are computed for every image, returning 12 values (3 metrics μ, σ and e for the 4 data sets r, g, b and ρ). For each forecast instance the models use the values from images in the 10 min that precede the issuing time as features. Fig. 1(c) shows nine images in the 2-h period referenced above and Fig. 1(d) shows the average and entropy for the r, g, b data sets from each image. The presence of clouds is clearly reflected in the behavior of the features extracted from the images.
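A minimal sketch of the per-channel statistics (the histogram-based entropy estimator and the guard on the red-to-blue ratio are assumptions of this sketch; the paper does not spell out the estimator):

```python
import numpy as np

def channel_stats(channel, bins=256):
    """Mean, standard deviation and Shannon entropy of one image channel.
    Entropy is computed from the normalized intensity histogram."""
    x = np.asarray(channel, dtype=float).ravel()
    hist, _ = np.histogram(x, bins=bins)
    p = hist[hist > 0] / hist.sum()
    entropy = float(-np.sum(p * np.log2(p)))
    return float(x.mean()), float(x.std()), entropy

# Fake 64x64 RGB sky image standing in for a real camera frame
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3)).astype(float)
r, g, b = img[..., 0], img[..., 1], img[..., 2]
ratio = r / np.maximum(b, 1.0)  # red-to-blue ratio, guarded against b = 0

# 3 metrics x 4 data sets = the 12 image features per frame
features = [v for ch in (r, g, b, ratio) for v in channel_stats(ch)]
print(len(features))
```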

2.2. Methods for producing deterministic point forecasts

In this section we describe the models used to calculate the point forecasts for GHI and DNI 5–30 min ahead of time. Forecasts are issued every 5 min for solar elevations larger than 5°.

2.2.1. Persistence

The first model implemented here is the so-called smart persistence model, whose accuracy is used as the baseline for the point forecasts. This model assumes that the clear-sky index for the forecasted variable (GHI or DNI) persists over the forecasting horizon. For a given forecast horizon Δt, the point forecast for the average irradiance in the interval [t, t + Δt], denoted \hat{I}(t + Δt), is calculated as

\hat{I}_p(t + \Delta t) = \langle k_t \rangle_{[t - \Delta t,\, t]} \, \langle I_{clr} \rangle_{[t,\, t + \Delta t]}    (2)

where \langle \cdot \rangle_{[t,\, t \pm \Delta t]} is the average in the window [t, t ± Δt].
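A minimal sketch of Eq. (2) (the 5-min sample values below are hypothetical):

```python
import numpy as np

def smart_persistence(kt_history, clr_future):
    """Smart persistence (Eq. (2)): the clear-sky index averaged over the
    trailing window [t - dt, t] is assumed to persist, and is rescaled by
    the clear-sky irradiance averaged over the target window [t, t + dt]."""
    return float(np.mean(kt_history)) * float(np.mean(clr_future))

kt_last_30min = [0.82, 0.85, 0.80, 0.84, 0.83, 0.81]          # 5-min kt samples
clr_next_30min = [950.0, 940.0, 930.0, 920.0, 910.0, 900.0]   # clear-sky GHI, W/m^2
print(smart_persistence(kt_last_30min, clr_next_30min))
```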

2.2.2. The kNN method

The kNN model uses the predictors introduced above to forecast GHI or DNI. This method is based on the similarity of the predictors at the forecast issuing time to the predictors computed with the training data set. The details of this model and its optimization are explained in Pedro and Coimbra [8]. For the sake of


completeness, we summarize here the main steps in obtaining the optimized kNN forecasts.

The kNN algorithm starts by computing the Euclidean distance between the features listed in Table 2 for a new data set (i.e. testing or validation) and the features in the training set. This operation yields a distance vector for each feature. These are then combined into a single vector using a weighted sum, denoted D_s, where the subscript s indicates the set of features used in the calculations. As demonstrated in Ref. [8], this set varies from case to case and is obtained as the result of an optimization algorithm.

The algorithm proceeds to extract the k instances in the training data with the lowest distance. To each instance is associated a time stamp {t_1, …, t_k} in the training set. k forecasts are then computed using the GHI or DNI training data subsequent to these time stamps:

\hat{f}_i(t + \Delta t) = \langle k_t \rangle_{[t_i,\, t_i + \Delta t]} \, \langle I_{clr} \rangle_{[t,\, t + \Delta t]}, \quad i = 1, \ldots, k    (3)

from which the final point forecast is calculated as:

\hat{I}_{kNN}(t + \Delta t) = \frac{\sum_{i=1}^{k} a_i \hat{f}_i(t + \Delta t)}{\sum_{i=1}^{k} a_i}    (4)

where the weights a_i are a function of the distance D_s:

a_i = \left( 1 - \frac{D_{s,i}}{\max D_s - \min D_s} \right)^{n}, \quad i = 1, \ldots, k    (5)

and n is an adjustable positive integer parameter.

The algorithm summarized above depends on several parameters:

1. The number of nearest neighbors, k ∈ {1, 2, …, max k}, where max k = 150 in this case;
2. The set of features S, i.e., which features are used in the search for the nearest neighbors;
3. The weights in the weighted sum D_s, denoted ω_i;
4. The exponent n ∈ {1, 2, …, 5} for the weights a_i in Eq. (5).

The optimal model is determined by minimizing the forecast error for the validation data set:

\arg\min_{k, S, \omega, n} \sqrt{ \frac{1}{n} \sum_{i}^{n} \left( \hat{I}(t_i + \Delta t; k, S, \omega, n) - I(t_i + \Delta t) \right)^2 }    (6)

Further details about the optimization procedure and the respective optimal kNN models for GHI and DNI can be found in Ref. [8].
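Eqs. (4)–(5) can be sketched as follows (the distances and neighbor forecasts are illustrative; the full feature search and the optimization of Eq. (6) are described in Ref. [8]):

```python
import numpy as np

def knn_forecast(distances, neighbor_forecasts, n=2):
    """Weighted kNN point forecast (Eqs. (4)-(5)): neighbors with a smaller
    feature-space distance D_s receive a larger weight a_i. If all
    distances are equal, the weights collapse to a plain average."""
    d = np.asarray(distances, dtype=float)
    f = np.asarray(neighbor_forecasts, dtype=float)
    span = d.max() - d.min()
    a = (1.0 - d / span) ** n if span > 0 else np.ones_like(d)  # Eq. (5)
    return float(np.sum(a * f) / np.sum(a))                     # Eq. (4)

d = [0.0, 1.0, 2.0]          # combined distances D_s of the k = 3 neighbors
f = [100.0, 110.0, 120.0]    # their persistence-style forecasts, W/m^2
print(knn_forecast(d, f, n=2))  # the closest neighbor dominates
```

Note that, as written in Eq. (5), the farthest neighbor receives zero weight when min D_s = 0; the guard for equal distances is an assumption of this sketch.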

Table 2
Description of the kNN features used in this work.

Source                    Feature             Window size    Increment  Feature length
Irradiance (GHI or DNI)   Backward average    5 to 120 min   5 min      24
                          Lagged average      5 min          5 min      24
                          Variability         5 to 120 min   5 min      24
Sky images (R, G, B       Mean                1 to 10 min    1 min      10
or R/B)                   Standard deviation  1 to 10 min    1 min      10
                          Entropy             1 to 10 min    1 min      10

2.2.3. The gradient boosting method

Boosting is a general approach that can be applied to many statistical learning methods for regression or classification [30,31]. For regression problems, given a training data set, the goal is to find a function f(x) such that a specified loss function is minimized. Boosting approximates f(x) by an additive expansion of the form:

\hat{f}(x) = \sum_{m=0}^{M} \beta_m h(x; \theta_m)    (7)

where the functions h(x; θ_m) are simply functions of x parameterized by θ_m. The h(x; θ_m) are called “base learners” or “weak learners” [30]. The expansion coefficients β_m and the parameters θ_m are fit to the training data in a forward “stage-wise” manner (i.e. without adjusting the expansion coefficients and parameters of the base learners that have already been added). Here, we restrict the application of boosting to the context of regression trees (i.e. the base learner h(x; θ) is a tree T(θ)). For that purpose, boosting builds an ensemble of trees iteratively in order to optimize a loss function J: the squared loss J(y, f(x)) = (y − f(x))², in this case.

2.2.3.1. Construction of a regression tree. Regression trees are simple models that divide the input (or feature) space into a set of rectangular regions and then associate to each region a constant value (also called the terminal-node mean value) corresponding to the average of the targets y_i of the training set that fall in this region. A recursive greedy top-down algorithm is used to partition the input space into regions. The interested reader can refer to [31] for details regarding the stratification of the input space. Following the notation given by Ref. [32], a tree is denoted by T(θ). θ is a random parameter vector that determines how a tree is grown, i.e. how the input space is stratified in terms of split variables, split locations, and terminal-node mean values. T(θ) partitions the input space into L distinct regions \{R_l\}_{l=1}^{L}. Using the N independent observations of the training set (x_i, y_i), i = 1, 2, …, N, the prediction \hat{y}(x) of a single tree T(θ) for a new data point x is obtained by averaging over the training observations that fall in the region R_l(x, θ) of the input space to which x belongs:

\hat{y}(x) = \sum_{i=1}^{N} \omega_i(x, \theta)\, y_i    (8)

where the weights are given by \omega_i(x, \theta) = \mathbf{1}_{x_i \in R_l(x, \theta)} / \#\{x_j \in R_l(x, \theta)\}. The indicator function \mathbf{1}\{u\} has the value of 1 if its argument u is true and 0 otherwise.

Fig. 2 illustrates the construction of a regression tree to model a function of two variables (also called co-variates), y = f(X_1, X_2).

2.2.3.2. Generic boosting algorithm. The generic gradient tree boosting algorithm [31] is depicted below. The training set contains



Fig. 2. (a) Recursive binary splitting of the feature space. (b) Tree corresponding to the previous partition. (c) A perspective plot of the prediction surface. In this example there are d = 4 splits, corresponding to 5 regions. Illustrations taken from Ref. [31].


N samples (x_i, y_i), i = 1, 2, …, N, and J is the loss function to minimize. The algorithm follows the steps:

1. Initialize f(x) to be a constant, \hat{f}_0(x) = \arg\min_{\gamma} \sum_{i=1}^{N} J(y_i, \gamma).
2. Perform a loop from m = 1 to M, where:

   a) Compute the negative gradient of the loss function J (also called the pseudo-residuals \tilde{y}_{im}):

      \tilde{y}_{im} = - \left[ \frac{\partial J(y_i, f(x_i))}{\partial f(x_i)} \right]_{f(x) = \hat{f}_{m-1}(x)}    (9)

   b) Fit a tree T(θ) with d splits predicting the pseudo-residuals \tilde{y}_{im} from the co-variates x_i. T(θ) partitions the input space into L = d + 1 distinct regions R_{lm}.

   c) Compute the optimal node predictions, l = 1, 2, …, L:

      \gamma_{lm} = \arg\min_{\gamma} \left[ \sum_{x_i \in R_{lm}} J(y_i, f_{m-1}(x_i) + \gamma) \right]    (10)

   d) Update the estimates of f(x):

      \hat{f}_m(x) = \hat{f}_{m-1}(x) + \nu\, \gamma_{lm}\, \mathbf{1}_{x \in R_{lm}}    (11)

In Eq. (11), ν is called the shrinkage parameter. This parameter controls the rate at which boosting learns. The loss function here (J) is the squared error loss function for determining point forecasts. The tree that best fits the residuals \tilde{y}_{im} = r_{im} = y_i − f_{m−1}(x_i) is added at each step. Once a boosted tree is grown, a prediction \hat{y}(x) for a new data point is obtained by accumulating the contributions of the M single trees as in Eq. (11).
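Steps 1–2d can be sketched for the squared loss; this is an illustrative Python implementation (the data, the parameter values and the use of scikit-learn's depth-limited trees as base learners are assumptions of this sketch, not the paper's XGBoost setup; for the squared loss the node predictions of step 2c are just the region means of the residuals, which the fitted tree already provides):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gb(X, y, M=200, nu=0.1, max_depth=3):
    """Gradient tree boosting for the squared loss: each iteration fits a
    small tree to the pseudo-residuals (steps 2a-2c) and adds it with
    shrinkage nu (step 2d / Eq. (11))."""
    f0 = float(np.mean(y))            # step 1: constant initial model
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(M):
        resid = y - pred              # pseudo-residuals for squared loss
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, resid)
        pred = pred + nu * tree.predict(X)
        trees.append(tree)
    return f0, trees

def predict_gb(model, X, nu=0.1):
    """Accumulate the shrunken tree contributions; nu must match fit_gb."""
    f0, trees = model
    return f0 + nu * sum(t.predict(X) for t in trees)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(300)  # noisy 1-D signal
model = fit_gb(X, y)
train_rmse = float(np.sqrt(np.mean((predict_gb(model, X) - y) ** 2)))
print(round(train_rmse, 3))
```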

In this work, we use the XGBoost R package [33]. XGBoost is a very efficient implementation of the gradient boosting technique. As shown in Ref. [34], this implementation achieves state-of-the-art results on a wide range of problems. Among other variants, XGBoost proposes a specific greedy algorithm for splitting the tree as well as an additional improvement based on a regularized learning objective function in order to prevent overfitting. The interested reader is referred to [34] for details regarding the derivation of this variant.

Gradient boosting models must be finely tuned to prevent overfitting. Some tuning parameters or hyperparameters are adjustable by the user to control the model's complexity, including M, ν and d:

• The number of trees (or iterations) M. Boosting can overfit if M is too large.
• The shrinkage parameter ν. Typical values are 0.01 or 0.001, and the right choice can depend on the problem. Smaller values of ν require larger numbers of iterations (M) to converge.
• The number d of splits (also called the interaction depth or maximum depth) in each tree, which controls the complexity of the boosted ensemble.

In this study, using the optimization data set, the optimal number of iterations M was found by early stopping. The selection of a maximum depth parameter d = 3 together with a shrinkage parameter set at ν = 0.001 led to a model that shows a reasonable trade-off between its complexity and its predictive power. Notice that the XGBoost package also offers additional control parameters related in particular to the regularization terms [33].
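The paper tunes the R XGBoost implementation; as a hedged illustration only, an analogous configuration (small shrinkage, shallow trees, early stopping on a held-out fraction playing the role of the optimization set) can be expressed in Python with scikit-learn's GradientBoostingRegressor on synthetic data:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic regression data standing in for the irradiance features/targets
rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(500, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.05 * rng.standard_normal(500)

gb = GradientBoostingRegressor(
    n_estimators=5000,        # upper bound on the number of trees M
    learning_rate=0.01,       # shrinkage nu (the paper uses 0.001)
    max_depth=3,              # maximum depth d
    validation_fraction=0.2,  # held-out fraction for early stopping
    n_iter_no_change=20,      # stop when validation loss stalls
).fit(X, y)
print(gb.n_estimators_)       # effective M selected by early stopping
```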

The input features described in Table 2 were used as the set of co-variates for the GB model. With the help of the training database, we build one GB model per forecast horizon by changing the target variable y.

Finally, the XGBoost package provides a specific metric to rank the input features according to their impact on predicting the model's response (not detailed here). This allows selecting the most important features for the probabilistic GB model in a similar way to what was done for the kNN model, as explained in Section 2.2.2.

2.3. Generation of probabilistic forecasts

As noted by Pinson [24], quantile forecasts are key products for energy applications given that they are closely related to the uncertainty of the forecasted variable. Formally, a quantile q_τ at probability level τ (0 < τ < 1) is defined as:

q_\tau = F^{-1}(\tau) = \inf\{y : F(y) \ge \tau\}    (12)

where F is the cumulative distribution function (CDF) of the random variable Y such that:


F(y) = \Pr(Y \le y)    (13)

In other words, a quantile q_τ indicates that there is a probability τ that the observation falls below q_τ. Prediction intervals can be inferred from this set of quantiles. Although different types of prediction intervals exist, here we restrict ourselves to central prediction intervals. For instance, an 80% prediction interval is given by:

PI_{80\%} = [q_{\tau=0.1},\, q_{\tau=0.9}]    (14)

The purpose of this study is to generate solar probabilistic forecasts at different forecasting time horizons Δt = 5, 10, …, 30 min. To this end, all the methods estimate a set of quantiles of irradiances (GHI or DNI), \{\hat{I}_\tau(t + \Delta t)\}_{\tau = 0.1, 0.2, \ldots, 0.9}, for each forecasting time horizon Δt. If no assumption is made about the shape of the predictive distributions, this set of quantiles summarizes the forecast predictive cumulative (CDF) and corresponding probability (PDF) distribution functions. This set of quantiles may also form what the weather verification community [35] calls an Ensemble Prediction System (EPS).
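Eqs. (12)–(14) reduce, for an empirical set of ensemble members, to reading off sample quantiles; the member values below are hypothetical:

```python
import numpy as np

# Nine probability levels tau = 0.1, ..., 0.9 and a small hypothetical
# ensemble of GHI member forecasts (W/m^2) for one issuing time.
taus = np.arange(0.1, 1.0, 0.1)
member_forecasts = np.array([612.0, 630.0, 655.0, 640.0,
                             598.0, 671.0, 645.0, 620.0])

# The set of quantiles summarizing the predictive CDF (Eq. (12))
quantiles = np.quantile(member_forecasts, taus)

# Central 80% prediction interval (Eq. (14)): [q_0.1, q_0.9]
pi_80 = (quantiles[0], quantiles[-1])
print(pi_80)
```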

2.3.1. Generation of quantiles with the persistence ensemble model

The persistence ensemble (PeEn) method [19,25,36] is a method commonly used to provide reference probabilistic forecasts. In the present case, the PeEn considers the GHI or DNI lagged measurements in the 120 min that precede the forecast issuing time. The selected measurements are ranked to define quantile values for the irradiance forecast.
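A minimal sketch of the PeEn (the variable name and window contents are illustrative):

```python
import numpy as np

def peen_quantiles(lagged_irradiance, taus=np.arange(0.1, 1.0, 0.1)):
    """Persistence ensemble (PeEn): rank the lagged measurements in the
    trailing window and read off the empirical quantiles as the
    probabilistic forecast."""
    return np.quantile(np.asarray(lagged_irradiance, dtype=float), taus)

# Hypothetical GHI measurements (W/m^2) from the trailing 2-h window
ghi_last_2h = [510.0, 840.0, 790.0, 530.0, 805.0, 760.0, 650.0, 700.0]
print(peen_quantiles(ghi_last_2h))
```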

2.3.2. Generation of quantiles with the kNN method

The generation of the set of quantiles for the kNN forecast is straightforward: we simply compute the quantiles of the distribution of the k forecasts given by Eq. (3).

2.3.3. Generation of quantiles with the GB method

The GB method is also part of a class of methods called quantile regression methods. The GB technique estimates the set of quantiles from a regression model that relates the irradiance \hat{I}(t + Δt) at forecast horizon Δt to a set of co-variates (or predictor variables). The most important features of Table 2, extracted with the help of XGBoost, constitute this set of predictor variables. In quantile regression, quantiles are estimated by applying asymmetric weights to the mean absolute error [37]. A specific loss function J_τ (also called the check, pinball or quantile loss function) is used to achieve this goal:

jtðy; f ðxÞÞ ¼

tðy� f ðxÞÞ if  y � fðt� 1Þðy� f ðxÞÞ if  y< f

(15)

with $\tau$ representing the quantile probability level.
To assist in this task, we use the R gbm package [38] that implements the stochastic gradient boosting algorithm. Stochastic gradient boosting is a variant of the generic algorithm described above: at each iteration of the process, a subsample of the data is drawn at random (without replacement) from the full training database. Line 2b (fitting of the tree) in 2.2.3 and Eq. (11) (model update) of the GB algorithm make use of this subsample instead of the full training data set. Ref. [30] showed that the introduction of this randomization step improves the accuracy of the GB algorithm. The gbm package also provides the quantile loss function $\Psi_\tau$ in order to compute estimates of the conditional quantiles of irradiance $\hat{I}_\tau$. As mentioned above, the hyper-parameters are: M, the number of trees; $\nu$, the shrinkage parameter; and d, the interaction depth of each tree. In this study, as suggested by Ref. [38], we set the shrinkage parameter to 0.001. The interaction depth was set to 5 for models including image features and 3 for models without image features, and we use the optimization data set to estimate the optimal number of trees M.
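Since the study itself uses the R gbm package, the following Python sketch is only an analogue: it implements the pinball loss of Eq. (15) directly and fits one scikit-learn gradient boosting model per quantile level. The toy data, learning rate and tree count are ours (the paper uses shrinkage 0.001 with the number of trees chosen on the optimization set):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def pinball_loss(y, f, tau):
    """Eq. (15): tau*(y - f) when y >= f, (tau - 1)*(y - f) otherwise."""
    diff = np.asarray(y, dtype=float) - np.asarray(f, dtype=float)
    return np.where(diff >= 0, tau * diff, (tau - 1) * diff)

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 3))               # stand-in predictor variables
y = 400.0 * X[:, 0] + 50.0 * rng.standard_normal(500)  # hypothetical irradiance target

# One model per quantile level; loss="quantile" selects the pinball loss,
# and subsample < 1 gives the stochastic gradient boosting variant.
models = {}
for tau in (0.1, 0.5, 0.9):
    models[tau] = GradientBoostingRegressor(
        loss="quantile", alpha=tau, n_estimators=100,
        learning_rate=0.1, max_depth=3, subsample=0.5,
        random_state=0).fit(X, y)

q10 = models[0.1].predict(X)
q90 = models[0.9].predict(X)
```

On average the 0.9-quantile predictions sit above the 0.1-quantile ones; the per-quantile models together play the role of the quantile set described above.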

2.4. Error metrics

2.4.1. Deterministic error metrics
RMSE is used as the main error metric since it emphasizes the larger errors:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{I}_i - I_i\right)^2}$$  (16)

with N representing the number of samples of the testing set. A second metric, used to evaluate the improvement relative to the baseline model (here the persistence model), is the forecast skill, which is given by:

$$s = \left(1 - \frac{\mathrm{RMSE}_m}{\mathrm{RMSE}_0}\right) \times 100 \;\; [\%]$$  (17)

where $\mathrm{RMSE}_0$ is the RMSE for the model described by Eq. (2) and $\mathrm{RMSE}_m$ is the RMSE for the model m (here the kNN or GB model).
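Eqs. (16) and (17) translate directly into code; the toy numbers below are hypothetical:

```python
import numpy as np

def rmse(forecast, obs):
    """Eq. (16): root mean square error."""
    forecast, obs = np.asarray(forecast, float), np.asarray(obs, float)
    return float(np.sqrt(np.mean((forecast - obs) ** 2)))

def forecast_skill(rmse_model, rmse_reference):
    """Eq. (17): skill in % relative to the persistence reference."""
    return (1.0 - rmse_model / rmse_reference) * 100.0

obs     = [500., 480., 300., 450.]   # measured GHI (W/m^2), hypothetical
model   = [505., 470., 320., 445.]   # kNN or GB forecast
persist = [520., 450., 250., 430.]   # persistence reference (Eq. (2))

s = forecast_skill(rmse(model, obs), rmse(persist, obs))  # positive: beats persistence
```

A skill of 0% means no improvement over persistence; 100% would mean a perfect forecast.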

2.4.2. Probabilistic error metrics
In order to evaluate the solar probabilistic forecasts, we use the reliability diagram, the rank histogram (also known as the Talagrand histogram) and the continuous ranked probability score (CRPS).

2.4.2.1. CRPS. The Continuous Ranked Probability Score (CRPS) provides an evaluation of the global skill of an EPS. The CRPS measures the difference between the predicted and observed cumulative distribution functions [39]. The formulation of the CRPS is:

$$\mathrm{CRPS} = \frac{1}{N}\sum_{i=1}^{N}\int_{-\infty}^{+\infty}\left[\hat{P}^i_{\mathrm{fcst}}(x) - P^i_{x_0}(x)\right]^2 dx$$  (18)

where $\hat{P}^i_{\mathrm{fcst}}(x)$ is the predictive CDF of the variable of interest x and $P^i_{x_0}(x)$ is a cumulative-probability step function that jumps from 0 to 1 at the point where the forecast variable x equals the observation $x_0$ (i.e. $P_{x_0}(x) = \mathbf{1}_{x \geq x_0}$). The squared difference between the two CDFs is averaged over the N ensemble forecast/observation pairs. The CRPS has the same dimension as the forecast variable (here GHI or DNI). The CRPS is negatively oriented (smaller values are better), and it rewards concentration of probability around the step function located at the observed value [40]. Put differently, the CRPS penalizes both lack of sharpness of the predictive distributions and biased forecasts.
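For a finite ensemble, the integral in Eq. (18) can be evaluated in closed form through the standard kernel identity CRPS = E|X − x0| − ½·E|X − X′| (this identity is not stated in the text; it is due to Gneiting and Raftery [42]). The sketch below, with our own toy numbers, illustrates the penalty on bias:

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of one ensemble forecast against one observation, using the
    kernel identity CRPS = E|X - x0| - 0.5 E|X - X'|, equivalent to the
    integral of Eq. (18) for an empirical predictive CDF."""
    m = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(m - obs))
    term2 = 0.5 * np.mean(np.abs(m[:, None] - m[None, :]))
    return float(term1 - term2)

ens = [480., 490., 500., 510., 520.]       # hypothetical ensemble members (W/m^2)
crps_centered = crps_ensemble(ens, 500.0)  # sharp and centered: low CRPS
crps_biased   = crps_ensemble(ens, 600.0)  # same spread, biased: high CRPS
```

For a single-member "ensemble" the score reduces to the absolute error, which is why the CRPS is often read as a probabilistic generalization of the MAE.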

Similarly to the forecast skill score commonly used to assess the quality of point forecasts, we also include the CRPSS (continuous ranked probability skill score). The latter (in %) is given by:

$$\mathrm{CRPSS} = \left(1 - \frac{\mathrm{CRPS}_m}{\mathrm{CRPS}_0}\right) \times 100$$  (19)

where $\mathrm{CRPS}_0$ is the CRPS for the persistence ensemble model and $\mathrm{CRPS}_m$ is the CRPS for the model m (here the kNN or GB model). Negative values of the CRPSS indicate that the probabilistic method fails to outperform the persistence ensemble model, while positive


Table 3
RMSE and forecast skill s for the GHI forecast for the testing set. RMSE values are in W m−2 and the skill s is in percentage. TH is the time horizon in minutes.

TH   Persistence   kNN without images   kNN with images   GB without images   GB with images
     RMSE          RMSE     s           RMSE     s        RMSE     s          RMSE     s
 5   37.7          34.7     8.0         34.4     8.6      34.5     8.4        32.7    13.3
10   40.8          34.5    15.5         34.4    15.7      34.6    15.1        33.4    18.1
15   43.0          35.2    18.1         35.5    17.5      35.0    18.6        33.7    21.5
20   44.4          36.2    18.5         34.8    21.6      35.5    20.1        34.0    23.3
25   44.9          36.5    18.7         36.5    18.7      35.7    20.4        34.2    23.8
30   45.1          36.8    18.4         35.3    21.7      36.1    19.9        34.4    23.6


values of the CRPSS mean that the forecasting method improves on the persistence ensemble. Further, the higher the CRPSS, the greater the improvement.

2.4.2.2. Reliability diagram. The reliability diagram is a graphical verification display used to verify the reliability component of a probabilistic forecast. As noted by Ref. [28], reliability is seen as a primary requirement when verifying probabilistic forecasts, since a lack of reliability would introduce a systematic bias in subsequent decision-making. A reliable EPS is said to be calibrated. Here, reliability diagrams are constructed according to the methodology defined by Ref. [28], which is especially designed for density forecasts of continuous variables.

This type of reliability diagram plots the observed probabilities against the nominal ones. The latter are simply the probability levels of the set of quantiles, i.e. nominal proportions ranging from 0.1 to 0.9 with an increment of 0.1. This representation is appealing because deviations from perfect reliability (the diagonal) can be visually assessed [28].

However, due to the finite number of observation/forecast pairs and also due to possible serial correlation in the sequence of forecast-verification pairs, the observed proportions are not expected to lie exactly along the diagonal, even if the density forecasts are perfectly reliable [28]. Note that Ref. [28] proposed to associate reliability diagrams with consistency bars in order to take this effect of serial correlation into account.

2.4.2.3. Rank histogram. The rank histogram [40] is another graphical verification tool for evaluating ensemble forecasts. Rank histograms are useful for determining the statistical consistency of the ensemble, that is, whether the observation being predicted looks statistically like just another member of the forecast ensemble (Wilks [40]). A necessary condition for ensemble consistency is an appropriate degree of ensemble dispersion, leading to a flat rank histogram [40]. In other words, a flat rank histogram shows that the members of an ensemble system are statistically indistinguishable from the observations. If the ensemble dispersion is consistently too small (underdispersed ensemble), then the observation (also called the verification sample) will often be an outlier in the distribution of ensemble members. This will result in a rank histogram with a U-shape. Conversely, if the ensemble dispersion is consistently too large (overdispersed ensemble), then the observation may too often fall in the middle of the ensemble distribution. This will give a rank histogram with a hump shape. However, consistent ensembles may still exhibit biases. Ensemble bias can be detected from overpopulation of either the smallest ranks or the largest ranks in the rank histogram. An overforecasting bias will correspond to an overpopulation of the smallest ranks, while an underforecasting bias will overpopulate the highest ranks. As a consequence, rank histograms can reveal deficiencies in ensemble calibration or reliability [40]. Again, care must be taken when analysing rank histograms, as the number of verification samples is limited. In addition, as demonstrated by Ref. [41], a perfect rank histogram does not mean that the corresponding EPS is reliable.

To obtain a verification rank histogram, one needs to find the rank of the observation when pooled within the ordered ensemble of quantile values and then plot the histogram of the ranks. For a number of members M, the number of ranks of the histogram of an ensemble is M + 1. If the consistency condition has been met, this histogram of verification ranks will be uniform, with a theoretical relative frequency of 1/(M + 1).
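The rank computation can be sketched as follows (toy two-member ensembles of our own; ties between the observation and a member would need random splitting in practice, which we omit):

```python
import numpy as np

def rank_histogram(observations, ensembles):
    """Histogram of the rank of each observation within its ordered
    ensemble of M members (M + 1 possible ranks; rank 0 = below all)."""
    M = len(ensembles[0])
    ranks = [int(np.searchsorted(np.sort(ens), obs))
             for obs, ens in zip(observations, ensembles)]
    return np.bincount(ranks, minlength=M + 1)

# Hypothetical two-member ensembles (three possible ranks):
obs = [5.0, 1.0, 9.0]
ens = [[4.0, 6.0], [2.0, 3.0], [7.0, 8.0]]
hist = rank_histogram(obs, ens)  # one observation in each rank -> flat
```

With many verification cases, departures of `hist` from uniformity diagnose the under/overdispersion and bias patterns described above.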

2.4.2.4. PICP and PINAW. The sharpness of the forecasts (here sharpness refers to the concentration of the EPS) is assessed by calculating the prediction interval normalized averaged width (PINAW) [8]. As mentioned by Ref. [42], the goal of probabilistic forecasting is to maximize the sharpness of a calibrated (reliable) EPS. An EPS is sharp if prediction intervals are shorter on average than prediction intervals derived from naïve methods, such as climatology or persistence. In this work, to generate a $(1-\alpha)\cdot100\%$ central prediction interval (PI), we use the $\alpha/2$ quantile as the lower bound and the $1-\alpha/2$ quantile as the upper bound. As an illustration, we will give the results regarding the $\mathrm{PI}_{80\%}$, which is calculated from the two extreme quantiles of the forecast irradiance distribution, i.e. $\hat{I}_{0.1}(t+\Delta t)$ and $\hat{I}_{0.9}(t+\Delta t)$. The prediction interval coverage probability (PICP) metric makes it possible to assess the empirical coverage probability of the $\mathrm{PI}_{80\%}$. The PICP is another way to evaluate the reliability of the EPS in relation to the forecast horizon. Eqs. (20) and (21) give the definitions of the PICP and PINAW for a specific forecast horizon $\Delta t$:

$$\mathrm{PICP} = \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}_{\{I(t+\Delta t)\,\in\,\mathrm{PI}_{80\%}\}}$$  (20)

The second metric, the Prediction Interval Normalized Averaged Width (PINAW), is related to the informativeness of the PIs or, equivalently, to the sharpness of the predictions. This parameter is the average width of the $(1-\alpha)\cdot100\%$ prediction interval. Note that PINAW is normalized by the mean of GHI or DNI for the testing set.

$$\mathrm{PINAW} = \frac{\sum_{i=1}^{N}\left(\hat{I}_{0.9}(t+\Delta t) - \hat{I}_{0.1}(t+\Delta t)\right)}{\sum_{i=1}^{N} I(t+\Delta t)}$$  (21)

If PINAW is large, the PI will have little value, as it is trivial to say that the future irradiance will be between its possible extreme values [8]. Ideally, prediction intervals should have PICPs close to the expected coverage rate and low PINAWs.
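Eqs. (20) and (21) can be computed as below; the numbers are hypothetical, with the bounds standing in for the 0.1 and 0.9 quantile forecasts of the PI80%:

```python
import numpy as np

def picp(obs, lower, upper):
    """Eq. (20): empirical coverage, i.e. the fraction of observations
    falling inside the prediction interval."""
    obs, lower, upper = map(np.asarray, (obs, lower, upper))
    return float(np.mean((obs >= lower) & (obs <= upper)))

def pinaw(obs, lower, upper):
    """Eq. (21): average interval width normalized by the mean observation."""
    obs, lower, upper = map(np.asarray, (obs, lower, upper))
    return float(np.sum(upper - lower) / np.sum(obs))

obs = np.array([500., 300., 450., 700.])   # measured irradiance, hypothetical
q10 = np.array([450., 250., 500., 650.])   # lower PI80% bound
q90 = np.array([560., 340., 600., 760.])   # upper PI80% bound

coverage = picp(obs, q10, q90)  # 3 of 4 observations covered -> 0.75
width = pinaw(obs, q10, q90)
```

A well-behaved 80% interval would drive `coverage` toward 0.80 while keeping `width` as small as possible.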

3. Results

3.1. Results for deterministic point forecasts

The different optimized models were applied to the testing set in order to analyze their performance on an independent data set. The RMSE and the forecast skill for the different models are listed in Tables 3 and 4 for GHI and DNI, respectively. For a better visualization, these results are plotted in Figs. 3 and 4 for GHI and DNI, respectively.

These results show that GHI is much easier to forecast than DNI. The RMSE for all forecast horizons ranges between 32 and 45 W m−2 for GHI, whereas for DNI the RMSE range more than doubles, from 60 to 100 W m−2. Figs. 3 and 4 show that the optimized kNN and GB models reduce the RMSE with respect to the baseline persistence


Table 4
RMSE and forecast skill s for the DNI forecast for the testing set. RMSE values are in W m−2 and the skill s is in percentage.

TH   Persistence   kNN without images   kNN with images   GB without images   GB with images
     RMSE          RMSE     s           RMSE     s        RMSE     s          RMSE     s
 5    68.0         61.0    10.3         60.2    11.5      60.5    11.1        58.2    14.3
10    79.0         65.1    17.6         64.9    17.8      64.1    18.9        62.2    21.2
15    87.5         68.6    21.6         67.3    23.1      67.3    23.0        65.0    25.7
20    93.1         71.1    23.6         69.7    25.1      70.1    24.6        67.3    27.7
25    98.1         75.3    23.2         72.8    25.8      72.4    26.2        69.1    29.1
30   101.2         77.5    23.4         74.5    26.4      72.2    26.6        71.3    29.6

Fig. 3. (a) RMSE and (b) forecast skill for GHI forecasts (testing set).

Fig. 4. (a) RMSE and (b) forecast skill for DNI forecasts (testing set).


model for all the cases studied. The reduction in the RMSE translates into significant forecast skills that range between 8 and 24%, and 10 and 30%, for the GHI and DNI testing sets, respectively. Moreover, as shown by Figs. 3 and 4, a clear improvement is brought by the GB method that includes the image features. Indeed, overall, the GB method increases the skill score obtained by the kNN algorithm by 3.3%.

Table 5
CRPS and CRPS skill score (CRPSS) for the GHI forecast for the testing set. CRPS values are in W m−2 and the CRPSS is in percentage.

TH   Persistence   kNN without images   kNN with images   GB without images   GB with images
     CRPS          CRPS    CRPSS        CRPS    CRPSS     CRPS    CRPSS       CRPS    CRPSS
 5   16.2           9.4    41.7          9.6    40.7       9.1    43.7         9.1    43.8
10   16.9           9.9    41.2         10.0    40.5       9.9    41.6         9.8    41.8
15   17.6          10.5    40.1         10.5    40.3      10.4    40.9        10.3    41.2
20   18.2          10.7    40.9         10.4    42.8      10.8    40.5        10.7    41.1
25   18.8          11.0    41.5         11.0    41.5      11.1    40.7        11.0    41.5
30   19.4          11.2    42.2         10.7    44.7      11.4    41.0        11.2    42.2

Fig. 5. (a) CRPS (W m−2) and (b) CRPSS in % for GHI forecasts.

Fig. 6. (a) PICP and (b) PINAW for GHI forecasts.


Table 6
PICP and PINAW for the GHI forecast for the testing set, in percentage.

TH   Persistence     kNN without images   kNN with images   GB without images   GB with images
     PICP   PINAW    PICP   PINAW         PICP   PINAW      PICP   PINAW        PICP   PINAW
 5   51.7   11.7     83.2    9.2          83.0    9.1       81.8    9.7         81.6    9.7
10   45.0   10.6     82.1    9.4          82.4    9.6       82.1   10.6         81.4   10.4
15   39.9    9.7     81.4    9.8          78.9    9.5       81.9   11.2         81.3   10.8
20   36.4    8.9     77.4    9.9          80.4    9.6       81.6   11.7         80.2   11.2
25   33.5    8.4     76.9   10.1          76.9   10.1       81.1   12.0         80.9   11.5
30   31.3    8.1     76.5   10.3          81.5    9.9       80.7   12.2         80.4   11.7


3.2. Results for probabilistic forecasts

3.2.1. GHI forecasts
Regarding the CRPS metric, all four techniques (kNN and GB with and without image features) clearly outperform the persistence ensemble model. This statement is reinforced by the high CRPS skill scores (around 42%) shown in Table 5 and Fig. 5. However, and contrary to the results obtained with the point forecasts, the GB method with images does not clearly beat its kNN counterpart. Also, it appears that, whatever the machine learning technique, the inclusion of image features does not bring a clear improvement.

Fig. 7. Rank histograms for the GHI/DNI forecasts. The black horizontal line represents the theoretical relative frequency (here 1/11).

The analysis of the PICP metric (see Fig. 6(a) and Table 6) shows that all the methods, except the persistence ensemble model, exhibit an empirical coverage close to the expected nominal one of 80%. As confirmed below by the analysis of the rank histogram of the persistence ensemble, this probabilistic model leads to an underdispersed EPS. Put in other words, the persistence ensemble model is over-confident, i.e. it underestimates the uncertainty of the forecasts. More interestingly, as shown by Fig. 6(b) and Table 6, the kNN models provide the lowest PINAW. In addition, the normalized widths provided by the kNN methods are rather



constant across all forecast horizons.

Fig. 7(c) plots rank histograms of the four different models for GHI forecasts. It must be noted that, in order to provide synthetic results, the rank histograms and reliability diagrams are averaged across all the forecasting time horizons. As mentioned above, rank histograms are designed to assess the consistency property of the different EPS. Fig. 7(a) shows the rank histogram of the persistence ensemble model. With its U-shape, the latter exhibits a lack of spread; put differently, the persistence ensemble model is underdispersed. These results match previous research [19,36] which showed that the persistence ensemble model tends to underestimate the uncertainty. As a consequence, and as shown by Fig. 6, persistence ensemble prediction intervals have both low PICP and low PINAW, because the ensemble members of the persistence ensemble tend to be too much like each other, and different from the observation.

The four other models exhibit rather flat rank histograms (especially the GB model that includes image features), hence indicating that the ensembles are statistically consistent (i.e. the ensembles apparently include the observations being predicted as equiprobable members). In particular, as neither the highest ranks nor the lowest ranks are overpopulated, no overforecasting or underforecasting bias is detected.

Fig. 8. Reliability diagrams for the GHI/DNI forecasts.

The analysis of the reliability diagrams in Fig. 8(c) confirms this previous statement. Indeed, only slight departures from the ideal line are noted for all four models. Moreover, fewer discrepancies are observed for the GB model that includes image features.

The results regarding the rank histograms, the PICP and the reliability diagrams indicate that the EPS produced by the four models are consistent and reliable.

3.2.2. DNI forecasts
In terms of CRPS, both kNN models slightly outperform the GB technique that includes image features. Also, the inclusion of image features has a small positive impact for the GB method but not for the kNN method. As shown in Fig. 9 and Table 7, high CRPS skill scores (around 62%) are obtained.

As for the GHI probabilistic forecasts, the analysis of the PICP metric (see Fig. 10(a) and Table 8) shows that all the models, except the persistence ensemble model, exhibit an empirical coverage close to the expected nominal one of 80%. Again, the kNN methods provide the lowest PINAW (see Fig. 10(b) and Table 8).

Except for the rank histogram of the GB model without image features, Fig. 7(d) shows that the other three rank histograms exhibit a positive bias, i.e. the left extreme rank is highly populated. Therefore, the corresponding EPS have a tendency for overforecasting. In


Fig. 9. (a) CRPS (W m−2) and (b) CRPSS in % for DNI forecasts.

Table 7
CRPS and CRPS skill score (CRPSS) for the DNI forecast for the testing set. CRPS values are in W m−2 and the CRPSS is in percentage.

TH   Persistence   kNN without images   kNN with images   GB without images   GB with images
     CRPS          CRPS    CRPSS        CRPS    CRPSS     CRPS    CRPSS       CRPS    CRPSS
 5   45.0          16.9    62.6         17.3    61.6      17.1    62.0        17.2    61.7
10   50.0          19.0    61.9         19.2    61.6      21.4    57.2        19.7    60.5
15   54.4          20.7    62.0         20.9    61.6      23.4    57.1        21.6    60.2
20   58.4          22.0    62.3         21.9    62.6      25.2    56.9        23.4    60.0
25   62.0          23.8    61.6         23.2    62.5      25.6    58.7        23.9    61.5
30   65.4          25.1    61.6         24.0    63.3      25.9    60.4        24.3    62.9

Fig. 10. (a) PICP and (b) PINAW for DNI forecasts as a function of the forecast horizon.


addition, the rank histogram of the kNN method with image features also shows that the corresponding ensemble has an excess of variability (overdispersed ensemble), as the middle ranks are overpopulated. Again, the DNI persistence ensemble (see Fig. 7(b)) leads to an underdispersed EPS.

Accordingly, the analysis of the corresponding reliability diagrams plotted in Fig. 8(d) reveals that, except for the GB technique without image features, the other three models exhibit departures from the ideal line. This observation motivates the statement that DNI probabilistic forecasts are less reliable than their GHI counterparts, confirming again that DNI is much more difficult to predict than GHI.

Finally, Figs. 11 and 12 show the 30-min forecast time series and the corresponding PI80% (the blue shaded areas) for GHI and DNI, respectively. The selected days cover different weather conditions, from completely overcast to completely clear days in the testing data set. As expected, narrower prediction intervals are obtained for clear days (days 5 and 6) or completely overcast days (day 1) than for variable sky conditions (days 2, 3 and 4). For these variable-sky days, and especially for days 2 and 3, the kNN technique leads to the sharpest predictive distribution (i.e. to narrower PIs than the GB technique) for both GHI and DNI.

The results presented above clearly indicate that machine learning forecasting models that use ground telemetry and data retrieved from sky images achieve good performance both in terms of deterministic point forecasts and probabilistic forecasts. This was demonstrated for both GHI and DNI for horizons shorter than 30 min. Such forecasting models should be a key feature for the optimization of grid management and contribute to a more efficient integration of intermittent sources in the electricity network. Indeed, knowledge of the uncertainty of the solar forecasts will permit assessing the risk linked to the scheduling of energy sources and to the planning of unit commitment. More precisely, probabilistic forecasts are important inputs for stochastic models of grid management [43,44]. Additionally, optimal operational management of grid-connected storage associated with intermittent renewable resources requires accurate solar forecasts [45]. The presented work provides a methodology that can further these goals for the sub-30 min window.

4. Conclusions

In this study, we performed a comparison of machine learning techniques for deterministic and probabilistic GHI and DNI forecasts using local irradiance data and sky images. We carried out a detailed analysis of the forecast performance of the techniques, which are capable of achieving relatively large forecasting skills with very limited investment in instrumentation. The present work focuses on the methodology used for developing the forecasting methods and therefore does not address more general questions such as the applicability of the methods to a wide range of solar variability microclimates or the effect of very long (multiple years) training data. Nevertheless, the following conclusions can be drawn from this study:

• The GB method including features from sky images resulted in a clear improvement of the skill of the deterministic forecasts. Values of s for this method range between 13.3% and 23.6% for the 5–30 min GHI forecasting, and between 14.3% and 29.6% for the DNI forecasting, whereas the corresponding values obtained with the kNN algorithm range between 8.6% and 21.7%, and 11.5% and 26.4%.

• Conversely, regarding the probabilistic forecasts, the simpler kNN algorithm led to better results than the GB method, especially in terms of prediction interval average width (PINAW ≤


Table 8
PICP and PINAW for the DNI forecasts for the testing set, in percentage.

TH   Persistence     kNN without images   kNN with images   GB without images   GB with images
     PICP   PINAW    PICP   PINAW         PICP   PINAW      PICP   PINAW        PICP   PINAW
 5   46.4   22.1     79.4   10.7          87.0   11.5       85.0   13.7         84.3   13.3
10   39.8   21.2     79.0   11.7          83.9   12.1       84.7   16.7         82.8   14.8
15   35.3   20.5     80.3   12.8          85.4   13.4       84.2   17.9         82.5   16.0
20   31.9   19.9     79.6   13.9          83.6   13.9       84.8   19.2         81.5   17.1
25   29.0   19.2     78.5   14.9          84.3   15.3       84.6   19.3         80.8   17.0
30   26.6   18.7     78.0   15.7          82.0   15.4       84.5   20.6         80.7   17.5

Fig. 11. Measured and forecasted time series and the corresponding PI80% for GHI. The graph shows 7 days that cover different weather conditions, from completely overcast to completely clear days in the testing data set. (a) kNN model; (b) GB model.

Fig. 12. Same as in Fig. 11 but for DNI.


10.1% for GHI and PINAW ≤ 15.4% for DNI using the kNN algorithm, and PINAW ≤ 11.7% for GHI and PINAW ≤ 17.5% for DNI using the GB method).

• For GHI, the ensemble prediction system generated by both techniques appears to be statistically consistent and reliable, as shown by the rank histograms and reliability diagrams. However, for DNI, the probabilistic forecasts appear to be less reliable.

In conclusion, we can state that the simpler kNN method appears to be a viable technique for generating probabilistic forecasts if correctly optimized. Indeed, it has been demonstrated that the kNN method is capable of generating reliable and sharp GHI probabilistic forecasts. Conversely, while the inclusion of image features in the gradient boosting model improves the forecast skill of the deterministic forecasts, this inclusion (irrespective of the machine learning method used) has unfortunately no clear impact on the probabilistic forecasts. A reason that may explain this fact is related to the methodology used to extract features from the sky images. In this work, these features are computed using all the



unmasked pixels in the sky image regardless of cloud motion. In future work we will explore the effect of taking into account only clouds moving toward the sun's position.

Acknowledgments

Philippe Lauret has received support for his research work from Région Réunion under the European Regional Development Fund (ERDF) - Programmes Opérationnels Européens FEDER 2014-2020 - Fiche action 1.07. Hugo T.C. Pedro and Carlos F.M. Coimbra gratefully acknowledge the partial support provided by the California Energy Commission PIER EPC-14-008 project, which is managed by Dr. Silvia Palma-Rojas.

References

[1] R.H. Inman, H.T.C. Pedro, C.F.M. Coimbra, Solar forecasting methods for renewable energy integration, Prog. Energy Combust. Sci. 39 (2013) 535–576.
[2] V. Kostylev, A. Pavlovski, Solar power forecasting performance towards industry standards, in: Proceedings of the 1st International Workshop on the Integration of Solar Power into Power Systems, Aarhus, Denmark, 2011.
[3] J. Huang, M. Korolkiewicz, M. Agrawal, J. Boland, Forecasting solar radiation on an hourly time scale using a Coupled AutoRegressive and Dynamical System (CARDS) model, Sol. Energy 87 (2013) 136–149.
[4] P. Lauret, C. Voyant, T. Soubdhan, M. David, P. Poggi, A benchmarking of machine learning techniques for solar radiation forecasting in an insular context, Sol. Energy 112 (2015) 446–457.
[5] J.R. Trapero, N. Kourentzes, A. Martin, Short-term solar irradiation forecasting based on Dynamic Harmonic Regression, Energy 84 (2015) 289–295.
[6] C. Voyant, F. Motte, A. Fouilloy, G. Notton, C. Paoli, M.L. Nivet, Forecasting method for global radiation time series without training phase: comparison with other well-known prediction methodologies, Energy 120 (2017) 199–208.
[7] Y. Chu, H.T.C. Pedro, C.F.M. Coimbra, Hybrid intra-hour DNI forecasts with sky image processing enhanced by stochastic learning, Sol. Energy 98 (Part C) (2013) 592–603.
[8] H.T.C. Pedro, C.F.M. Coimbra, Nearest-neighbor methodology for prediction of intra-hour global horizontal and direct normal irradiances, Renew. Energy 80 (2015) 770–782.
[9] H.T.C. Pedro, C.F.M. Coimbra, Assessment of forecasting techniques for solar power production with no exogenous inputs, Sol. Energy 86 (2012) 2017–2028.
[10] C.W. Chow, B. Urquhart, M. Lave, A. Dominguez, J. Kleissl, J. Shields, B. Washom, Intra-hour forecasting with a total sky imager at the UC San Diego solar energy testbed, Sol. Energy 85 (2011) 2881–2893.
[11] Y. Chu, H.T.C. Pedro, L. Nonnenmacher, R.H. Inman, Z. Liao, C.F.M. Coimbra, A smart image-based cloud detection system for intrahour solar irradiance forecasts, J. Atmos. Ocean. Technol. 31 (2014) 1995–2007.
[12] E.M. Crispim, P.M. Ferreira, A.E. Ruano, Prediction of the solar radiation evolution using computational intelligence techniques and cloudiness indices, Int. J. Innovat. Comput. Inf. Control 4 (2008) 1121–1133.
[13] P. Ferreira, J. Gomes, I. Martins, A. Ruano, A neural network based intelligent predictive sensor for cloudiness, solar radiation and air temperature, Sensors 12 (2012) 15750–15777.
[14] R. Marquez, C.F.M. Coimbra, Intra-hour DNI forecasting based on cloud tracking image analysis, Sol. Energy 91 (2013) 327–336.
[15] R. Marquez, V.G. Gueorguiev, C.F.M. Coimbra, Forecasting of global horizontal irradiance using sky cover indices, J. Sol. Energy Eng. 135 (2012) 011017–011021.
[16] R. Marquez, H.T.C. Pedro, C.F.M. Coimbra, Hybrid solar forecasting method uses satellite imaging and ground telemetry as inputs to ANNs, Sol. Energy 92 (2013) 176–188.
[17] S. Quesada-Ruiz, Y. Chu, J. Tovar-Pescador, H.T.C. Pedro, C.F.M. Coimbra, Cloud-tracking methodology for intra-hour DNI forecasting, Sol. Energy 102 (2014) 267–275.
[18] P. Bacher, H. Madsen, H.A. Nielsen, Online short-term solar power forecasting, Sol. Energy 83 (2009) 1772–1783.
[19] M. David, F. Ramahatana, P. Trombe, P. Lauret, Probabilistic forecasting of the solar irradiance with recursive ARMA and GARCH models, Sol. Energy 133 (2016) 55–72.
[20] A. Grantham, Y.R. Gel, J. Boland, Nonparametric short-term probabilistic forecasting for solar radiation, Sol. Energy 133 (2016) 465–475.
[21] E.B. Iversen, J.M. Morales, J.K. Møller, H. Madsen, Probabilistic forecasts of solar irradiance using stochastic differential equations, Environmetrics 25 (2014) 152–164.
[22] T. Hong, P. Pinson, S. Fan, H. Zareipour, A. Troccoli, R.J. Hyndman, Probabilistic energy forecasting: global energy forecasting competition 2014 and beyond, Int. J. Forecast. 32 (2016) 896–913.
[23] J. Huang, M. Perry, A semi-empirical approach using gradient boosting and k-nearest neighbors regression for GEFCom2014 probabilistic solar power forecasting, Int. J. Forecast. 32 (2016) 1081–1086.
[24] P. Pinson, Wind energy: forecasting challenges for its operational management, Stat. Sci. 28 (2013) 564–585.
[25] S. Alessandrini, L. Delle Monache, S. Sperati, G. Cervone, An analog ensemble for short-term probabilistic solar power forecast, Appl. Energy 157 (2015) 95–110.
[26] S. Sperati, S. Alessandrini, L. Delle Monache, An application of the ECMWF Ensemble Prediction System for short-term solar power forecasting, Sol. Energy 133 (2016) 437–450.
[27] R. Bessa, A. Trindade, C.S. Silva, V. Miranda, Probabilistic solar power forecasting in smart grids using distributed information, Int. J. Electr. Power Energy Syst. 72 (2015) 16–23.
[28] P. Pinson, P. McSharry, H. Madsen, Reliability diagrams for non-parametric density forecasts of continuous variables: accounting for serial correlation, Q. J. R. Meteorol. Soc. 136 (2010) 77–90.
[29] P. Ineichen, A broadband simplified version of the Solis clear sky model, Sol. Energy 82 (2008) 758–762.
[30] J. Friedman, Stochastic gradient boosting, Comput. Stat. Data Anal. 38 (2002) 367–378.
[31] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning, Springer Series in Statistics, Springer, New York, NY, 2009.
[32] L. Breiman, Random forests, Mach. Learn. 45 (2001) 5–32.
[33] T. Chen, T. He, xgboost: Extreme Gradient Boosting, R package version 0.6-4, 2017.
[34] T. Chen, C. Guestrin, XGBoost: a scalable tree boosting system, ACM Press, 2016, pp. 785–794.
[35] G. Candille, C. Côté, P. Houtekamer, G. Pellerin, Verification of an ensemble prediction system against observations, Mon. Weather Rev. 135 (2007) 2688–2699.
[36] Y. Chu, C.F.M. Coimbra, Short-term probabilistic forecasts for direct normal irradiance, Renew. Energy 101 (2017) 526–536.
[37] R. Koenker, G. Bassett, Regression quantiles, Econometrica 46 (1978) 33–50.
[38] G. Ridgeway, Generalized Boosted Models: a guide to the gbm package, 2007.
[39] H. Hersbach, Decomposition of the continuous ranked probability score for ensemble prediction systems, Weather Forecast. 15 (2000) 559–570.
[40] D. Wilks, Statistical Methods in the Atmospheric Sciences: an Introduction, Elsevier Science, Burlington, 2014.
[41] T.M. Hamill, Interpretation of rank histograms for verifying ensemble forecasts, Mon. Weather Rev. 129 (2001) 550–560.
[42] T. Gneiting, F. Balabdaoui, A.E. Raftery, Probabilistic forecasts, calibration and sharpness, J. Roy. Stat. Soc. B 69 (2007) 243–268.
[43] R.B. Hytowitz, K.W. Hedman, Managing solar uncertainty in microgrid systems with stochastic unit commitment, Elec. Power Syst. Res. 119 (2015) 111–118.
[44] D.E. Olivares, J.D. Lara, C.A. Cañizares, M. Kazerani, Stochastic-predictive energy management system for isolated microgrids, IEEE Trans. Smart Grid 6 (2015) 2681–2693.
[45] R. Hanna, J. Kleissl, A. Nottrott, M. Ferry, Energy dispatch schedule optimization for demand charge reduction using a photovoltaic-battery storage system with solar forecasting, Sol. Energy 103 (2014) 269–287.