

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON POWER SYSTEMS

Start-Up Decision of a Rapid-Start Unit for AGC Based on Machine Learning

Inmaculada Saboya, Ignacio Egido, and Luis Rouco, Member, IEEE

Abstract—Units within a control area, participating in the secondary frequency control, are usually spinning generating units already connected to the network and operating outside their range of optimal performance. This paper deals with an alternative method of providing secondary frequency control called rapid-start (RS). It consists in assigning a regulation band to several offline units (RS units) which are capable of being started and connected rapidly, therefore allowing the online units to function more closely to their nominal power. RS units have commonly been used for peaking generation and for tertiary control reserve, and have been rarely used for secondary control reserve. As RS operation may have economic benefits, since it allows for better dispatch of the other units in the control area, an appropriate algorithm to start up an RS unit needs to be developed. This paper proposes a machine learning based system (MLBS) to be employed in the decision to start up an RS unit while being used to provide secondary frequency control. The decision-making procedure is carried out by a decision tree. The building and implementation of the RS machine learning based system is illustrated for a secondary frequency control zone within the Spanish power system.

Index Terms—Clustering, decision tree, machine learning, rapid-start, secondary regulation.

Manuscript received July 26, 2012; revised November 23, 2012 and March 30, 2013; accepted April 11, 2013. Paper no. TPWRS-00874-2012.

The authors are with the School of Engineering (ICAI), Universidad Pontificia Comillas, 28015 Madrid, Spain (e-mail: [email protected]; [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TPWRS.2013.2259267

0885-8950/$31.00 © 2013 IEEE

I. INTRODUCTION

Ancillary services (AS) are crucial to assuring the security and to maintaining the reliability of a power system. Although there are AS related to the control of the frequency (active power ancillary services) and AS related to the control of the voltage (reactive power ancillary services), this paper focuses on frequency control services.

It is difficult to group and to compare AS, since they vary from system to system, due to the independent and non-simultaneous processes of their electricity market liberalization and due to the differences in their structure [1]–[4].

Efforts have been made in the literature to compare AS across several power systems. For example, the surveys of frequency and voltage control ancillary services detailed in [1] and [5] include the definitions and technical and economic characteristics of the frequency control ancillary services from various power systems around the world, including the regulator and the transmission system operator (TSO) names.

Frequency control is usually divided into three different loops, primary, secondary and tertiary control, each one with a different reserve. For example, definitions and technical and economic characteristics of the frequency control ancillary services from the UCTE countries can be found in [6].

Primary frequency control is a proportional control to correct generation-load imbalances, resulting in steady-state frequency deviations. Transients of primary frequency control are in the time-scale of seconds [7]. Secondary frequency control restores the system frequency to its scheduled value and maintains the interchange power between control areas at the scheduled values [7]. When the system fails to correctly fulfill these objectives, a less secure power system operation results [8]; sufficient secondary frequency control reserves must therefore be available. Transients of secondary frequency control are in the time-scale of minutes [7]. Tertiary frequency control is in charge of restoring the secondary control reserve to the initial scheduled value.

The secondary frequency control is provided by control areas. Each control area is composed of two types of units: 1) units previously authorized by the system operator (SO) to respond to control signals sent by the corresponding AGC; and 2) units that are not authorized for the active participation in the secondary frequency control service and which consequently must maintain their generation at its scheduled value. The full availability time of secondary reserve among different systems is in the range from 5 to 15 min [9]. Examples include PJM, California, Spain (5 to 8 min), and the UCTE as a whole.

Units providing secondary frequency control reserves are presently spinning generating units already connected to the grid [1], [10]–[12]. In order to provide reserve, these units operate below their optimal power level, and when necessary, they quickly ramp up/down to provide the required reserve [12]. Increasing the available secondary reserve results in higher operation security, but also involves higher costs, since it requires that additional generation units be committed and that those units operate outside their optimal power level [12]–[14]. In other words, the current way of providing secondary reserve is rather inefficient and environmentally unfriendly [12], [15]. Moreover, this provided secondary reserve cannot be offered in other markets (e.g., energy market), which results in an opportunity cost.

This paper deals with a method of providing secondary frequency control called rapid-start (RS). In contrast to the current way, this method keeps most online units working close to their nominal power and thus with less regulation reserve; the remaining regulation reserve cleared in the market is assigned to offline units (RS units). An RS unit is an offline, quick-start unit that is capable of starting up, generating power and participating in the secondary reserve in a short time. In fact, RS units have high ramp rates and can start up within 5 to 10 min [16]–[18], making them suitable for AGC operation. Thus an RS unit will be started up when necessary in order to support the control area to provide secondary frequency control under the conditions set by the SO. Note that online units are still required to provide the main part of the secondary reserve, since an RS unit provides only a fraction of it. This fraction, here approximately 10% to 20%, depends on the control area and the RS unit characteristics.

As RS units usually have high operational costs, they are generally used for peaking generation. Regarding the provision of ancillary services, they are also commonly used for the tertiary frequency control [2], [11], [15], [16], [19]–[22]. However, depending on the secondary reserve price, RS units can also obtain benefits from providing secondary frequency control. The RS operating method augments these benefits; as RS units are only started up when a high quantity of upward secondary reserve is demanded by the power system, they do not always need to be online. Using an RS unit for secondary regulation has only been suggested in [23] so far, where a very simple algorithm to start up the RS units is presented. Table I summarizes the references that use RS operation for the provision of ancillary services, emphasizing the frequency control reserve that is provided by the RS units. Table I also indicates whether the activation of those reserves is carried out manually or automatically, and how the start-up decision is made. Note that in most cases, RS units' start-up decision rules have not been described. Only in some cases is it mentioned that RS units are called to start up when a contingency occurs and that the activation is done manually.

TABLE I. REFERENCES CLASSIFICATION

Rapid-start operation may have economic benefits, since it allows for a better dispatch of the other units in the control area. However, the cost of starting up and shutting down the RS unit when necessary, and the possible costs for non-compliance with the dynamic response necessary for AGC, may reduce these benefits or even make this kind of operation inadvisable. For instance, a utility operating in a Spanish control area with approximately 40% market share could expect benefits of 3–4€ due to a cost reduction in the re-dispatch (units offer their full capacity in the Spanish energy market and a re-dispatch is needed for the assignment of the secondary regulation band) and due to the higher margin of the online units if a 50-MW RS unit is used. Accordingly, an algorithm to start the RS unit is needed, which takes into account the technical constraints of repeatedly starting up and shutting down the RS unit. The aim of this paper is to propose an advanced RS machine learning based system (RS-MLBS) in order to provide an effective RS operation for secondary frequency control. The proposed algorithm is an automatic RS algorithm that includes an MLBS to take the decision of starting up and shutting down an RS unit. The decision making is carried out by a decision tree. Moreover, clustering techniques are used to group one (or several) of the predictor variables of the decision tree. This proposal is also included at the bottom of Table I.

Machine learning is concerned with the automatic design of rules similar to those used by human experts, and it is widely applied to power system problems such as security assessment, contingency analysis, transient stability, etc. [24], [25]. Decision tree methodology and clustering techniques are successful classes of such machine learning methods [26], [27]. The main strengths of decision trees are [24], [28]: 1) their interpretability, 2) their ability to identify the most relevant attributes for each problem, and 3) their computational efficiency, which is compatible with real-time constraints. Clustering techniques group a given set of data by maximizing the similarity within groups (clusters) and minimizing the similarity between different groups [29].

The paper is organized as follows. Section II describes rapid-start operation. Section III describes the RS experience-based system presented in [23]. Section IV presents the proposed RS-MLBS. The building of the RS-MLBS is shown in Section V. In Section VI, the built RS-MLBS, as obtained in Section V, is implemented in a Spanish secondary frequency control zone. Section VII concludes the paper.

II. RAPID START

RS operation is an alternative way of providing secondary reserve that allows offline RS units to provide part of the total secondary regulation reserve assigned to an area [23]. This section describes the RS operation, the RS units' characteristics and the difficulties of rapid-start operation for secondary frequency control. Finally, a summary of the rapid-start algorithm is presented.

A. RS Operation

RS operation seeks to increase the output of connected units, and thereby bring it closer to the optimal output, while avoiding the commitment of additional units. Although some economic benefits could be expected with RS operation, this paper focuses on obtaining an automatic start-up algorithm such that the RS units are online when required.

Fig. 1 compares the current way of providing secondary reserve and the method based on RS operation. In the upper part of Fig. 1, the current way of providing secondary reserve is shown. In this example, two online generation units (G1 and G2) are required to operate below their nominal power level in order to provide a determined amount of secondary control reserve. In the lower part of Fig. 1, the method of RS operation is depicted. In this case, and in order to provide the same amount of secondary control reserve, only one online generation unit (G1) is required, which operates closer to its nominal power level whilst satisfying the demand. The remaining regulation reserve is provided by an offline RS unit that will only be started up if necessary. Since they are offline until started, RS units provide only upward secondary reserve.

B. RS Unit Characteristics

RS units (also called quick-start units) should be offline units that are capable of starting up, generating power and participating in the secondary reserve in a short time (RS units are capable of starting up in 5–10 min) [16]–[18]. Hydro units can be seen as rapid-start units due to their low start-up time and their fast response [2], [23], [30], [31]. Open cycle gas turbines (OCGT) can also be considered as rapid-start units [23].

C. RS Operation Difficulties

The main task of RS operation is to decide, once an amount of regulation band has been assigned to the RS units, when to start up and shut down the RS unit. This is a difficult decision to take because of the complexity of predicting how much power will be needed for secondary frequency control in the following several minutes. As RS units are capable of starting up in 5–10 min, a prediction time frame of at least this length is needed to evaluate when the RS unit will be required. Accordingly, an algorithm for starting the RS unit is needed, which takes into account the technical constraints of repeatedly starting up and shutting down the RS unit. A very simple algorithm for secondary frequency control has been suggested in [23]. This paper proposes an RS-MLBS to take the decision to start up and shut down an RS unit.

Note that the decision to start up a generation unit should be taken in real time, when it seems that the secondary reserve provided by online, non-RS generation units is being depleted.

It is important to mention that each area is required to provide the secondary reserve previously agreed, and that non-fulfillment, in the case that the power system requires it, will result in severe penalties for the area [10]. Thus, in the case that an RS unit is not online within due time, the whole area may be penalized.

Fig. 1. Traditional way of providing secondary frequency control reserve (top) versus rapid-start way (bottom).

D. Rapid-Start Algorithm

As a consequence, in order to ensure that RS operation is feasible and realistic in terms of secondary frequency control reserve provision, an effective algorithm for starting up an RS unit in due time should be used. This algorithm should fulfill two objectives: 1) to start up an RS unit within due time when needed and 2) to minimize operation costs, i.e., to minimize the number of start-ups and the duration of the RS unit's online operation. Thus, the RS unit should be started up when it is needed but must remain offline during the remainder of the time.

III. EXPERIENCE-BASED SYSTEM

This section describes the RS experience-based (RS-EB) algorithm presented in [23], a very simple algorithm for obtaining the start-up and shut-down signal of an RS unit. It consists in starting up the RS unit only when the secondary reserve provided by online, non-RS units is becoming depleted.

This RS algorithm calculates the maximum power that can be generated with the units that are currently connected (PMaxNoRS). The RS algorithm then decides to start up an RS unit when the total power generated within the control area (PGen) approaches this maximum power, i.e., when the secondary up reserve is becoming depleted. Accordingly, the power level which activates a new start-up must be lower than PMaxNoRS. This start-up power level is calculated by subtracting DifStart from PMaxNoRS, with DifStart being a threshold to anticipate the need for an RS start-up.

The RS unit is stopped when PGen reaches a shut-down power level that is calculated as PMaxNoRS minus a threshold DifStop. DifStop should be higher than DifStart in order to avoid consecutive shut-downs and start-ups. Fig. 2 shows an example of the behavior of the RS experience-based algorithm.

Fig. 2. RS-EB algorithm [23].
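The start/stop rule described in this section can be sketched as a simple hysteresis check. The following Python fragment is only an illustration under stated assumptions: the function name `rs_eb_step`, its argument names and the numbers in the usage note are hypothetical, and the shut-down test is assumed to apply when PGen falls to or below PMaxNoRS minus DifStop.

```python
def rs_eb_step(p_gen, p_max_no_rs, dif_start, dif_stop, rs_online):
    """Return the new on/off state of the RS unit for one control step.

    p_gen        -- total power generated within the control area (PGen)
    p_max_no_rs  -- maximum power of the connected non-RS units (PMaxNoRS)
    dif_start    -- threshold anticipating the need for a start-up (DifStart)
    dif_stop     -- threshold for shutting down (DifStop)
    rs_online    -- current state of the RS unit (True if started)
    """
    if dif_stop <= dif_start:
        # Without this margin the unit could start and stop consecutively.
        raise ValueError("DifStop should be higher than DifStart")

    start_level = p_max_no_rs - dif_start  # power level that triggers a start-up
    stop_level = p_max_no_rs - dif_stop    # power level that triggers a shut-down

    if not rs_online and p_gen >= start_level:
        return True   # up reserve of the online units is becoming depleted
    if rs_online and p_gen <= stop_level:
        return False  # assumed shut-down condition: PGen has fallen far enough
    return rs_online  # inside the hysteresis band: keep the current state


def rs_eb_trace(p_gen_series, p_max_no_rs, dif_start, dif_stop):
    """Apply rs_eb_step to a series of PGen samples, starting with the unit offline."""
    state, states = False, []
    for p_gen in p_gen_series:
        state = rs_eb_step(p_gen, p_max_no_rs, dif_start, dif_stop, state)
        states.append(state)
    return states
```

With illustrative values PMaxNoRS = 500 MW, DifStart = 20 MW and DifStop = 60 MW, the unit would start once PGen reaches 480 MW and would stay online until PGen drops to 440 MW, which is the behavior the gap between the two thresholds is meant to produce.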

IV. MACHINE LEARNING BASED SYSTEM

This paper proposes an RS algorithm that includes a machine learning based system (MLBS) to take the decision to start up and shut down an RS unit. The MLBS as developed seeks to predict when the RS unit will be needed by using a set of predictor variables. The MLBS improves the aforementioned simple RS experience-based algorithm.

This section describes the RS-MLBS. Moreover, the training set of the MLBS is described and the clustering analysis is presented.

A. Description of the RS Algorithm With the MLBS

The RS-MLBS developed in this paper is based on the RS-EB algorithm as described in Section III. However, the RS-EB algorithm leads on many occasions to a situation where an RS unit is started up even though this unit is finally not needed due to a change in the power trend. The MLBS is used to improve the decision-making regarding the start-up by taking into account a set of predictor variables.

Thus, when the control area power approaches the start-up power level of the RS-EB algorithm (PMaxNoRS minus DifStart), the RS unit is not started up, but the MLBS (decision tree) is instead run. The decision tree supplies information as to whether the RS unit will be needed, and only in this case is the RS unit started up. The start-up decision is now carried out at all those points (start-up decision points) above this power level.

The use of a Classification And Regression Tree (CART) decision tree generation algorithm [26] is here proposed for the building of a decision tree which predicts how high the non-fulfillment is in the case that an RS unit does not start up. Based on this result, the RS algorithm will decide whether or not to start up an RS unit. In addition, other algorithms can be used to build the tree, such as the ID3 [32] and the C4.5 [27] decision tree generation algorithms.

It is important to note that each area is required to provide the agreed secondary reserve. A non-fulfillment in the case that the power system requires the secondary regulation band will result in severe penalties for the area. Therefore, a variable that reflects this non-fulfillment should be used to take the start-up decision. This target variable measures whether or not an RS unit is needed. Variables that can be used are: 1) the time that the area is not providing the required secondary regulation band, termed time of non-served secondary power (TNSSP); and 2) the energy that would not be provided by the area if the RS unit were not started, termed non-served secondary energy (NSSE).

As decision tree methods are more appropriate for predicting categorical variables, the predicted variable should be discretized into several levels by taking into account the size of the data base. Using a higher number of conceptual levels would result in a smaller data range at each level, and this would provide a more detailed description, but the accuracy of decision tree models might be reduced [33].

Several attributes can be selected as input variables. These attributes are considered to be potential determinants of non-fulfillment in the case that the power system requires the secondary regulation band assigned to the RS unit. Choosing an appropriate set of attributes and preprocessing the attribute values in order to apply a given learning algorithm (for example, a possible discretization of an attribute) is carried out in an iterative trial and error process [24]. One possible input variable is the trend that the power generated within the control area has had until it reaches a level equal to or higher than the start-up power level. This variable is grouped into several levels by clustering techniques (see subsection C). Other attributes such as the secondary regulation band assigned to the control area, or the marginal price of that secondary regulation band, can also be used as input variables.

Fig. 3 shows an overview of the behavior of the proposed RS algorithm. Several input variables can be used, some of them following the use of clustering or discretization. Based on the values of the target variables resulting from the decision tree, the RS algorithm will decide whether or not to start up an RS unit. Input variables from a training set and a testing set are used in Section V to build and to test the tree of the RS-MLBS. In Section VI, input variables from the simulated behavior of a real secondary frequency control area with real data signals are used for the RS-MLBS implementation.

Fig. 3. Overview of the RS machine learning based algorithm.

These input data sets should incorporate various hours from several days of the week and from differing seasons. In this way the RS-MLBS takes into account the current uncertainty and variability of the system due to demand and renewable generation. If the demand profiles change, due to the higher penetration of renewable generation or other reasons, the decision tool created should be tested with this new data and if necessary its implementation should be revised and adjusted.

Finally, Fig. 4 shows an example of the behavior of the proposed RS algorithm. In this example, the variable that would indicate whether the RS unit must be started up is the time of non-served secondary power (TNSSP). In Fig. 4, it can be observed how one of the predictor variables, the trend of the power generated within the control area until it reaches a start-up decision point (i.e., the points equal to or higher than the start-up power level), may be classified into several groups by using clustering techniques. It can also be observed that this trend can be used to predict how high the TNSSP could be. The signals classified by cluster 1 involve those cases which lead to a small TNSSP when the RS unit is not started up. By contrast, signals grouped into cluster 2 are cases that involve high TNSSP.

Fig. 4. Description of the RS machine learning based algorithm.
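As a rough illustration of the two candidate target variables, the sketch below computes TNSSP and NSSE from sampled signals. It is a minimal sketch under assumptions that are not taken from the paper: the function name `non_served_metrics`, the uniformly sampled inputs, and the idea of comparing the required power against the power the area could deliver without the RS unit are all hypothetical choices made for the example.

```python
def non_served_metrics(p_required, p_available_no_rs, dt_s):
    """Estimate TNSSP (minutes) and NSSE (MWh) over an evaluation window.

    p_required         -- required power samples in MW (assumed zone set point)
    p_available_no_rs  -- power deliverable without the RS unit, in MW (assumed)
    dt_s               -- sampling period in seconds (assumed uniform)
    """
    if len(p_required) != len(p_available_no_rs):
        raise ValueError("both signals must have the same number of samples")

    # Shortfall is the part of the required power that cannot be served.
    shortfalls = [max(0.0, req - avail)
                  for req, avail in zip(p_required, p_available_no_rs)]

    # TNSSP: total time with a non-zero shortfall, expressed in minutes.
    tnssp_min = sum(1 for s in shortfalls if s > 0.0) * dt_s / 60.0
    # NSSE: shortfall integrated over time (rectangle rule), expressed in MWh.
    nsse_mwh = sum(shortfalls) * dt_s / 3600.0
    return tnssp_min, nsse_mwh
```

For one-minute samples with a required power of 480, 500, 520, 510 and 490 MW against a constant 500 MW available without the RS unit, this sketch gives a TNSSP of 2 min and an NSSE of 0.5 MWh.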

B. Description of the Training Set of the MLBS

To develop the RS-MLBS, a data set composed of data collected from real operation should be used. Each data item should have several attributes: 1) the model target variable and 2) the model input variables. In the learning process of a decision tree, the data are split into the training set and the testing set. The training set is used to build the tree, while the testing set is used to measure whether the accuracy of the decision tree is acceptable [33]. Usually, the results are also validated by applying the tree to a new dataset. Training set size typically corresponds to two thirds of the data set, whereas the testing set amounts to one third [34].
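The two-thirds/one-third split mentioned above can be sketched with the Python standard library. The helper name `split_data` and the fixed seed are assumptions made for the example; the paper does not state how the split was drawn.

```python
import random


def split_data(items, train_fraction=2 / 3, seed=0):
    """Randomly split a data set into a training set and a testing set.

    items           -- sequence of data items (target plus input variables)
    train_fraction  -- share of the items assigned to the training set
    seed            -- seed for a reproducible shuffle (illustrative choice)
    """
    shuffled = list(items)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # reproducible random ordering
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]
```

Applied to a data set of 900 items, this yields 600 training items and 300 testing items.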

C. Clustering

As previously mentioned, the trends of the power generated within the control area until it reaches a start-up decision point are grouped by clustering techniques. Two of the most important clustering algorithms are the K-Means and the Fuzzy C-Means [35], [36]. Fuzzy clustering allows data to belong to several clusters simultaneously, with grades of membership.

The first step is to determine the optimal number of clusters. In order to obtain this optimal number, the value of the objective function is evaluated as a function of the number of clusters. The objective function evaluates the sum of the distances from the data (i.e., the trends of the power generated within the control area until it reaches a start-up decision point) to the cluster centers assigned to these data. The value of the objective function decreases as the number of clusters increases, and the most appropriate number of clusters is selected by detecting the value that corresponds to a “knee” in the objective function curve [24].

As previously mentioned in the description of the proposed algorithm, when the power generated within the control area approaches the start-up power level, the unit is not started up, but the MLBS (decision tree) is run. For this purpose, the trends of the power generated within the control area until it reaches a start-up decision point are classified according to their membership to one of the previously obtained clusters. The cluster closest to the trend is used as input to the decision tree.
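The classification of a new trend against previously obtained cluster centers can be sketched as follows. This is a minimal sketch, assuming that each trend is represented as a fixed-length vector of power samples and that the Euclidean distance is used; the function names are hypothetical. The membership function uses the standard Fuzzy C-Means membership formula with fuzzifier m, which may differ in detail from the setup used by the authors.

```python
import math


def closest_cluster(trend, centers):
    """Return the index of the cluster center nearest to the trend."""
    distances = [math.dist(trend, center) for center in centers]
    return min(range(len(centers)), key=distances.__getitem__)


def fcm_memberships(trend, centers, m=2.0):
    """Return Fuzzy C-Means membership grades of the trend in each cluster.

    Uses u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the
    distance from the trend to center i and m > 1 is the fuzzifier.
    """
    distances = [math.dist(trend, center) for center in centers]
    on_center = [d == 0.0 for d in distances]
    if any(on_center):
        # The trend coincides with one or more centers: share full membership.
        count = sum(on_center)
        return [1.0 / count if hit else 0.0 for hit in on_center]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
            for d_i in distances]
```

The index returned by `closest_cluster` plays the role of the categorical Cluster attribute fed to the decision tree, while the membership grades show how strongly a trend belongs to each group.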

V. EXAMPLE CASE: BUILDING THE RS-MLBS

This section describes an example of the development of the RS-MLBS for a secondary frequency control zone within the Spanish power system. Firstly, the secondary frequency control service in Spain is reviewed. Then, the training set of the MLBS, the clustering and the resulting decision tree are detailed. Finally, the accuracy of the decision tree is verified.

A. Secondary Frequency Control in Spain

The Spanish power system belongs to the ENTSO-E power system and comprises a regulation area for frequency control in this system. The secondary frequency control in Spain is operated hierarchically. The control area is divided into several zones, each corresponding to a different generation company. It is the responsibility of each zone to distribute its regulation requirement among its generation units and thus each zone has a zonal AGC regulator. In order to obtain a correct dynamic response of the control area, the TSO (REE) has established dynamic performance criteria that every zone must fulfill. A zone will be economically penalized if [37]: 1) it does not comply with the required response criteria, or 2) its scheduled regulation band is not available. The zone is penalized in accordance with the time that it does not comply with the power required (zone set point).

B. Description of the Training Set of the MLBS

From 502 hours of real operation (covering several hours, from 00:00 to 24:00, from all the days of the week and from all four seasons), a large number of data corresponding to start-up decision points has been collected. A total of 900 data points have been randomly selected, which constitute the data set for the development of the decision tree. Two-thirds of the data (600 data points) have been used for the training set, and the remaining 300 data points for the testing set. The RS unit under control is a 65-MW unit.

The model target variable used here is the TNSSP, since the zone is penalized with respect to the time it does not comply with the power required. TNSSP (min.) measures the time during which the required power is not fully supplied if the RS unit is not started up. A period of 40 min following the instant of achieving a start-up decision point has been chosen in order to evaluate TNSSP, since this is the minimum time for which an RS unit has to be started up.

Two attributes have been selected as input variables following a detailed iterative process, resulting in the best choice of attributes. The first attribute (termed Cluster) is the trend of the power generated within the control zone until it achieves a start-up decision point. This attribute has been classified into five levels using the Fuzzy C-Means clustering algorithm. The second attribute (termed Band) is the secondary regulation band assigned to the control zone. The band is expressed in MW.

TABLE II
DISCRETIZATION OF THE INPUT AND THE TARGET VARIABLES

Fig. 5. Power generated trend (PInc) and clusters.

Since the decision tree method is more appropriate for predicting categorical variables, attributes and target variables (i.e., Band and TNSSP) have been discretized into several levels as shown in Table II. The distribution of these variables is also shown in this table. First, the attribute Band has been discretized into the five intervals shown in Table II, since most of the values are between 300 MW and 600 MW. The attribute Cluster has been classified into five levels using the Fuzzy C-Means clustering algorithm. Finally, the target variable TNSSP has been discretized into two levels. TNSSP is considered to be low, and therefore the RS unit is not started up, if TNSSP is lower than five minutes. Otherwise, TNSSP is considered to be high and the RS unit is started up. In other words, the decision to start up the RS unit depends on whether TNSSP is high or low. The proposed discretization results from an iterative process where other discretizations have also been contemplated. The proposed discretization is the one that leads to the best results.
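As an illustration of the target variable described in this subsection, the following sketch evaluates TNSSP over the 40-min window and discretizes it with the five-minute threshold. The one-minute sampling, the function names, and the power values are assumptions made for this example only; the paper does not state how the signals are sampled.

```python
def tnssp_minutes(required_mw, available_mw, step_min=1.0):
    """Time (min) during which the required power exceeds the power that can be
    supplied without the RS unit. One sample per step_min minutes is assumed."""
    return sum(step_min for req, avail in zip(required_mw, available_mw) if req > avail)

def tnssp_level(tnssp_min, threshold_min=5.0):
    """Two-level discretization: Low if TNSSP is lower than five minutes, High otherwise."""
    return "Low" if tnssp_min < threshold_min else "High"

# Hypothetical 40-min window sampled once per minute (values are illustrative).
available = [310.0] * 40                 # maximum power without the RS unit (MW)
required = [305.0] * 34 + [315.0] * 6    # zone set point (MW)
tnssp = tnssp_minutes(required, available)
level = tnssp_level(tnssp)
```

In this example the required power exceeds the available power during six samples, so TNSSP is 6 min and the level is High.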

C. Clustering

As stated before, the trend of the power generated within the control zone until it achieves a start-up decision point has been classified into several clusters. Fuzzy C-Means and K-Means clustering algorithms have been tested. Finally, five clusters obtained by means of the Fuzzy C-Means clustering algorithm have been used, since better results have been obtained than with the K-Means algorithm.

The procedure developed has the following characteristics:

• The trends (PInc) of the power generated within the control zone in the period 40 minutes prior to the instant of achieving a start-up decision point have been clustered. The whole data set (900 data) has been used for clustering. The Fuzzy C-Means clustering algorithm has been applied.

• In order to obtain the optimal number of clusters, the value of the objective function is evaluated as a function of the number of clusters.
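The clustering procedure listed above can be illustrated with a compact version of the standard Fuzzy C-Means iteration. This is a minimal sketch, not the implementation used by the authors: the fuzzifier m = 2, the random initialization, the stopping tolerance, and the squared-distance form of the objective are all assumptions of this example.

```python
import math
import random

def fuzzy_c_means(data, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-Means; returns (centers, memberships, objective value)."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial memberships, each row normalized to sum to one.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([value / total for value in row])
    centers = []
    for _ in range(n_iter):
        # Center update: mean of the data weighted by membership to the power m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append([sum(w[i] * data[i][d] for i in range(n)) / w_sum
                            for d in range(dim)])
        # Membership update from the distances to the new centers.
        shift = 0.0
        for i in range(n):
            dist = [max(math.dist(data[i], c), 1e-12) for c in centers]
            new_row = [1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                                 for k in range(n_clusters))
                       for j in range(n_clusters)]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    objective = sum(u[i][j] ** m * math.dist(data[i], centers[j]) ** 2
                    for i in range(n) for j in range(n_clusters))
    return centers, u, objective
```

Running the function for an increasing number of clusters and recording the returned objective value yields the curve whose knee is inspected to choose the number of clusters. Each trend would be passed as one row of `data`, e.g., as the list of PInc samples in the 40 minutes before the decision point.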

Fig. 5 represents all the 900 trends. It should be mentioned that trends have been obtained by subtracting at each point the maximum power that can be generated with the units that are currently connected (PMaxNoRS) from the power generated. On the right, Fig. 5 also shows the evaluation of the objective function for different numbers of clusters. It can be deduced that four to five clusters are sufficient. Five clusters have been chosen here to represent the data.

Fig. 6 shows the 900 trends (PInc) separated by clusters together with their corresponding representative cluster. In addition, the number of trends resulting in high or low TNSSP is depicted for each cluster.


Fig. 6. Power generated trend and clusters separately represented.

Fig. 7. Decision tree.

From Fig. 6 it can be inferred that the clusters with most cases of high TNSSP are Cluster 1 and Cluster 4. Cluster 3 and Cluster 5 are those with the smallest number of high TNSSP data.

D. Knowledge Extraction Process: Decision Tree

Decision trees are used to predict how high the TNSSP is when the RS unit is not started up. The CART decision tree generation algorithm is applied here since this algorithm is widely used, and is also implemented within the MATLAB software package [38]. The results obtained with the C4.5 [27] decision tree generation algorithm are similar to the ones obtained with the CART algorithm.

Fig. 7 shows the decision tree obtained. The tree includes 23 nodes, twelve of which are leaf nodes. The number of data of the node as well as the number of incorrectly classified instances in each node are shown. In fact, 405 of the 600 trends are correctly classified (67.5%), whereas 195 are incorrectly classified (32.5%).

Fig. 8. Control zone model [23].

The first classification attribute of the tree is Cluster and the second attribute is Band. As expected, the band assigned strongly affects the variable TNSSP: with higher bands it is less likely that the band is exhausted, so there are fewer cases of High TNSSP.
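To make the tree-growing step more concrete, the sketch below evaluates candidate splits with the Gini impurity, the criterion commonly associated with CART classification trees. It is a simplified illustration under stated assumptions: only "one level versus the rest" binary splits are tried, the attribute levels and labels are invented, and it does not reproduce the MATLAB implementation cited in the text.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_binary_split(rows, labels, attributes):
    """Return (weighted impurity, attribute, level) of the best 'level vs. rest' split."""
    best = None
    n = len(rows)
    for attr in attributes:
        for level in sorted({row[attr] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[attr] == level]
            right = [y for row, y in zip(rows, labels) if row[attr] != level]
            if not left or not right:
                continue  # a split must send data to both branches
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, attr, level)
    return best

# Hypothetical discretized data (levels and labels are illustrative only).
rows = [
    {"Cluster": 1, "Band": "B1"},
    {"Cluster": 1, "Band": "B2"},
    {"Cluster": 3, "Band": "B1"},
    {"Cluster": 3, "Band": "B2"},
]
labels = ["High", "High", "Low", "Low"]
split = best_binary_split(rows, labels, ["Cluster", "Band"])
```

In this toy data set the attribute Cluster separates the two classes perfectly, so it is selected for the first split, which mirrors the role of Cluster as the first attribute of the tree in Fig. 7.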

E. Accuracy of the Decision Tree

Once the tree has been built with the training set, the testing set is used to check whether the decision tree’s accuracy is acceptable or not and whether it can be applied to a new dataset.

In this case, 199 of the 300 trends are correctly classified (66.3%) and 101 are incorrectly classified (33.7%). It is interesting to observe that the accuracy of the decision tree using the testing set is similar to that obtained with the training set. Within the 101 incorrectly classified trends, 73 trends that should be classified as Low TNSSP have been classified as High TNSSP, and 28 trends that should be classified as High TNSSP have been classified as Low TNSSP. This means that in only 9.33% of all cases, the proposed algorithm is not starting a unit when needed. The accuracy of the decision tree obtained seems to be sufficiently acceptable because of the high variability of the data, the complexity of the TNSSP prediction, and the slight differences between learning and testing set results; therefore, the tree can be applied to a new dataset in order to validate its performance.
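The percentages quoted for the testing set follow directly from the counts given above, as the short check below shows. The function and key names are chosen here for illustration only.

```python
def summarize(correct, false_high, false_low):
    """Percentages of a two-class test from raw counts.

    false_high: Low TNSSP cases classified as High (unneeded start-up).
    false_low:  High TNSSP cases classified as Low (start-up missed when needed).
    """
    total = correct + false_high + false_low
    return {
        "accuracy_pct": 100.0 * correct / total,
        "error_pct": 100.0 * (false_high + false_low) / total,
        "unneeded_start_pct": 100.0 * false_high / total,
        "missed_start_pct": 100.0 * false_low / total,
    }

# Counts reported for the 300-trend testing set.
test_set = summarize(correct=199, false_high=73, false_low=28)
```

With these counts, the accuracy is 199/300 (about 66.3%) and the share of missed start-ups is 28/300 (about 9.33%); the remaining errors, 73/300, correspond to start-ups that would not have been needed.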

VI. EXAMPLE CASE: IMPLEMENTING THE RS-MLBS IN A SPANISH SECONDARY FREQUENCY CONTROL ZONE

To validate the performance of the whole RS-MLBS, the decision tree is applied to a simulation of a control zone model by using real data signals. The results are compared with those obtained by the experience-based (RS-EB) algorithm in [23]. A 612-hour simulation has been used, and 65 MW of the regulation band have been assigned to a 65-MW RS unit. The start-up time of the RS unit amounts to seven minutes, and once started up, the unit must remain online for 33 min.

The control zone model used here is the one described in [23], and it is shown in Fig. 8. The model includes: 1) an AGC regulator that distributes the AGC setpoints as calculated from the ACE signal, 2) several blocks modeling the response of the units that are not RS units (Regulating units), 3) a block modeling the RS unit response, and 4) a block that includes the RS algorithm.

The decision tree is run whenever the difference between the current PGen and PMaxNoRS is lower than 30 MW (i.e., DifStart is 30 MW). According to the proposed MLBS algorithm, an RS unit will be started up whenever the TNSSP predicted by the decision tree is high.

The results used to compare the algorithms are: 1) the total number of RS unit start-ups, 2) the number of RS unit start-ups per day, 3) the time of non-compliance (min), and 4) the percentage of the time of non-compliance with respect to the total time simulated. It has been assumed that there is non-compliance in a given period if the difference between the power generated by the whole zone and the power required of the zone (set point) in this period is greater than 5 MW. Note that the time of non-compliance and TNSSP are not the same, since the former evaluates the final behavior of the model and not only the zone of the regulation band assigned to the RS units.

In addition, 5) the energy not served by the whole control area (MWh), NSSE, and 6) the energy not served by the whole control area per simulated hour (MWh) are also compared. Note that NSSE has not been a criterion in the RS algorithm to decide on a unit start-up.

TABLE III
RESULTS OF THE RS ALGORITHMS BEHAVIOR

Table III shows the results obtained by using the RS-EB algorithm in [23] and by using the proposed MLBS algorithm. In addition, a worst case is also included where the RS band is assigned to the RS unit, but the unit is never started up. From Table III it can be concluded that the proposed RS algorithm leads to a remarkable reduction in the number of start-ups of the RS unit. Note, however, that the time of non-compliance (and also the non-served secondary energy) with the proposed RS algorithm did not increase, but it is actually slightly lower than the one obtained with the RS-EB algorithm. The proposed MLBS algorithm reduces the number of start-ups by 46% and decreases the non-compliance time by 11% in comparison with the RS-EB algorithm. This implies a significant reduction in costs without a deterioration in the response of the secondary frequency control zone.

Finally, in Fig. 9, an example of the behavior of both algorithms during the simulation is presented. The RS-EB algorithm, once the power that indicates a new start-up (shown with a black dot in Fig. 9) is achieved, decides to start up the RS unit. However, it can be observed in Fig. 9 that due to the start-up time of the RS unit (in this example seven minutes), it is not possible to provide the required power within due time. When the RS unit is finally started up and available, this RS unit is no longer needed, and neither is it necessary during the entire period the RS unit must remain online (in this example 33 min).

By contrast, with the proposed algorithm there is no decision to start up the RS unit once that power level has been achieved. The reason for this is that the decision tree predicts a low TNSSP. This is an example in which the proposed MLBS algorithm has been able to prevent an unnecessary start-up.

It can be concluded that the performance of the MLBS as presented in this paper is acceptable. The results obtained show the benefits of using a machine learning based system which employs decision trees in rapid-start operation.
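The start-up rule and the non-compliance measure used in this section can be summarized in a few lines of code. The sketch below is illustrative only: the function names, the one-minute sampling, the use of the absolute difference between generated and required power, and the sample values are assumptions of this example, and the tree prediction is passed in as a hypothetical callable.

```python
def should_start_rs(p_gen_mw, p_max_no_rs_mw, predict_tnssp, dif_start_mw=30.0):
    """Start the RS unit only if the margin is below DifStart and the tree predicts High."""
    if p_max_no_rs_mw - p_gen_mw >= dif_start_mw:
        return False  # enough margin left: the decision tree is not run
    return predict_tnssp() == "High"

def non_compliance(generated_mw, required_mw, tolerance_mw=5.0, step_min=1.0):
    """Return (minutes of non-compliance, percentage of the simulated time)."""
    minutes = sum(step_min for gen, req in zip(generated_mw, required_mw)
                  if abs(gen - req) > tolerance_mw)
    total = step_min * len(generated_mw)
    return minutes, 100.0 * minutes / total

# Hypothetical values: 20 MW of margin left and a High TNSSP prediction.
start = should_start_rs(480.0, 500.0, lambda: "High")

# Hypothetical four-minute excerpt of zone generation and set point (MW).
minutes, percent = non_compliance([100.0, 100.0, 100.0, 100.0],
                                  [100.0, 104.0, 106.0, 110.0])
```

In the excerpt, the deviation exceeds 5 MW in two of the four samples, which gives 2 min of non-compliance, i.e., 50% of the simulated time.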

Fig. 9. Example of the behavior of both algorithms once the control zone achieves the power that indicates a new start-up.

VII. CONCLUSIONS

In this paper, an MLBS to start up an RS unit has been proposed. The proposed algorithm allows the automatic start-up of an RS unit for AGC operation.

The proposed algorithm employs several variables to predict the non-fulfillments in the case of an RS unit not starting up, when the secondary regulation reserve without the RS unit is getting depleted. If the expected non-fulfillment is considered excessive, the RS unit is started up. A decision tree is used to predict the non-fulfillment time. The trend of the power generated in the final minutes is taken into account by means of clustering procedures.

The proposed MLBS has been applied to a secondary frequency control zone within the Spanish power system. A real AGC set point is used as input for the simulation of the control zone model. For the 612 hours of the secondary frequency control zone simulation (around 26 days), the proposed MLBS reduces the number of start-ups and decreases the non-compliance time in comparison with a very simple algorithm (RS-EB algorithm). This implies a significant reduction of costs with no corresponding deterioration in the response of the secondary frequency control zone.

ACKNOWLEDGMENT

The authors would like to thank E. Porras, J. L. Ruiz, A. Morón, and R. González from ENDESA for their valuable comments and for making available the data needed for the example case.

REFERENCES

[1] Y. G. Rebours, D. S. Kirschen, M. Trotignon, and S. Rossignol, “A survey of frequency and voltage control ancillary services-part I: Technical features,” IEEE Trans. Power Syst., vol. 22, pp. 350–357, 2007.

[2] Z. A. Vale, C. Ramos, P. Faria, J. P. Soares, B. Canizes, and H. M. Khodr, “Ancillary services market clearing simulation: A comparison between deterministic and heuristic methods,” in Proc. 2010 IEEE Power and Energy Soc. General Meeting, 2010, pp. 1–6.

[3] P. Faria, Z. Vale, J. Soares, H. Khodr, and B. Canizes, “ANN based day-ahead ancillary services forecast for electricity market simulation,” in Proc. MELECON 2010–2010 15th IEEE Mediterranean Electrotechnical Conf., 2010, pp. 1159–1164.

[4] A. Papalexopoulos and H. Singh, “On the various design options for ancillary services markets,” in Proc. 34th Annu. Hawaii Int. Conf. Syst. Sci., 2001, pp. 8–8.

[5] Y. G. Rebours, D. S. Kirschen, M. Trotignon, and S. Rossignol, “A survey of frequency and voltage control ancillary services-part II: Economic features,” IEEE Trans. Power Syst., vol. 22, pp. 358–366, 2007.

[6] UCTE, “Operation Handbook,” Jul. 2004. [Online]. Available: http://www.pse-operator.pl/uploads/kontener/542UCTE_Operation_Handbook.pdf.

[7] P. Kundur, Power System Stability and Control. Palo Alto, CA, USA: McGraw-Hill, 1994.

[8] D. Soler, P. Frias, T. Gomez, and C. A. Platero, “Calculation of the elastic demand curve for a day-ahead secondary reserve market,” IEEE Trans. Power Syst., vol. 25, pp. 615–623, 2010.

[9] Y. Rebours and D. Kirschen, “A survey of definitions and specifications of reserve services,” 2005. [Online]. Available: http://www.eee.manchester.ac.uk/research/groups/eeps/publications/reportstheses/aoe/rebours%20et%20al_tech%20rep_2005B.pdf.

[10] J. Garcia-Gonzalez, A. M. S. Roque, F. A. Campos, and J. Villar, “Connecting the intraday energy and reserve markets by an optimal redispatch,” IEEE Trans. Power Syst., vol. 22, pp. 2220–2231, 2007.

[11] P. Havel and O. Malik, “Optimal dispatch of regulation reserves for power balance control in transmission system,” in Proc. Control and Automation 2009, 2009, pp. 807–812.

[12] B. J. Kirby and J. D. Kueck, Spinning Reserve from Pump Load: A Report to the California Department of Water Resources, 2003. [Online]. Available: http://www.consultkirby.com/files/TM2003–99_CDWR.pdf.

[13] M. A. Ortega-Vazquez and D. S. Kirschen, “Optimizing the spinning reserve requirements using a cost/benefit analysis,” IEEE Trans. Power Syst., vol. 22, pp. 24–33, 2007.

[14] L. Toma, L. Urluescu, M. Eremia, and J. -. Revaz, “Trading ancillary services for frequency regulation in competitive electricity markets,” in Proc. 2007 IEEE Lausanne Power Tech, 2007, pp. 879–884.

[15] ERCOT Topaz Power Management, “10-Minute Non-Spinning Reserve Service,” Sep. 2009. [Online]. Available: http://www.ercot.com/content/meetings/qstf/keydocs/2009/20091023-QSTF/2009–09-18_10MNSRS_Proposal.pdf.

[16] H. Chao, Li Fangxing, J. Pan, M. Gopinathan, and P. Wong, “Assessment of quick start resource requirements in market operations,” in Proc. 2005/2006 IEEE PES Transmission and Distribution Conf. Exhib., 2006, pp. 1363–1367.

[17] R. Billinton and A. V. Jain, “The effect of rapid start and hot reserve units in spinning reserve studies,” IEEE Trans. Power App. Syst., vol. PAS-91, pp. 511–516, 1972.

[18] M. E. Khan and R. Billinton, “Composite system spinning reserve assessment in interconnected systems,” Proc. Inst. Elect. Eng., Gen., Transm. Distrib., vol. 142, pp. 305–309, 1995.

[19] PJM, “Manual 35: Definitions and Acronyms,” 2010. [Online]. Available: http://www.pjm.com~media/documents/manuals/m35.ashx.

[20] California ISO, “Spinning and Non-Spinning Reserves,” 2006. [Online]. Available: http://www.caiso.com/docs/2003/09/08/2003090815135425649.pdf.

[21] New York Independent System Operator, “Ancillary Service Manual, Version 3.19,” Sep. 2011. [Online]. Available: http://www.nyiso.com/public/webdocs/documents/manuals/operations/ancserv.pdf.

[22] RTE France, Réseau de Transport d’Électricité, “Memento of Power System Reliability,” 2005. [Online]. Available: http://www.rte-france.com/uploads/media/pdf_zip/publications-annuelles/memento_surete_2005_completVA__.pdf.

[23] I. Egido, F. Fernández-Bernal, L. Rouco, and I. Saboya, “Operation of rapid start units in an AGC area,” Przeglad Elektrotechniczny, vol. 88, pp. 141–145, 2012.

[24] L. Wehenkel, Automatic Learning Techniques in Power Systems. Boston, MA, USA: Kluwer, 1997.

[25] L. Sigrist, I. Egido, E. F. Sanchez-Ubeda, and L. Rouco, “Representative operating and contingency scenarios for the design of UFLS schemes,” IEEE Trans. Power Syst., vol. 25, pp. 906–913, 2010.

[26] L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees. Belmont, CA, USA: Chapman & Hall, 1984.

[27] R. Quinlan, C4.5: Programs for Mach. Learning. San Mateo, CA, USA: Morgan Kaufmann, 1993.

[28] Linna Li and Z. Xuemin, “Study of data mining algorithm based on decision tree,” in Proc. 2010 Int. Conf. Computer Design and Applications (ICCDA), 2010, pp. 1–155.

[29] G. Karypis, E.-H. Han, and V. Kumar, “Chameleon: Hierarchical clustering using dynamic modeling,” Computer, vol. 32, pp. 68–75, 1999.

[30] G. Hinuber and H. -. Haubrich, “Optimal intraday operation strategy of power plants at wholesale and reserve markets,” in Proc. 2007 IEEE Lausanne Power Tech, 2007, pp. 855–860.

[31] G. K. Toh and H. B. Gooi, “Cost/benefit and reliability studies on rapid-start units for energy/reserve contributions,” in Proc. 6th Int. Conf. Eur. Energy Market, 2009 (EEM 2009), 2009, pp. 1–6.

[32] R. Quinlan, “Induction of decision trees,” Mach. Learn., vol. 1, pp. 81–106, 1986.

[33] Z. Yu, F. Haghighat, B. C. M. Fung, and H. Yoshino, “A decision tree method for building energy demand modeling,” Energy & Buildings, vol. 42, pp. 1637–1646, 2010.

[34] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining. Reading, MA, USA: Addison-Wesley, 2005.

[35] T. Velmurugan and T. Santhanam, “Performance evaluation of K-means and fuzzy C-means clustering algorithms for statistical distributions of input data points,” Eur. J. Sci. Res., vol. 46, pp. 320–330, 2010.

[36] S.-C. Wang and P.-H. Huang, “A fuzzy method for power system model reduction,” in Proc. 2004 IEEE Int. Conf. Fuzzy Systems, 2004, vol. 2, pp. 891–894.

[37] E. L. Miguélez, I. E. Cortés, L. R. Rodríguez, and G. L. Camino, “An overview of ancillary services in Spain,” Elect. Power Syst. Res., vol. 78, pp. 515–523, 2008.

[38] MathWorks, “Statistics Toolbox User’s Guide,” Natick, MA, USA, 2009.

Inmaculada Saboya was born in Madrid, Spain, in 1986. She received the degree of Industrial Engineer from the Universidad Pontificia Comillas, Madrid, Spain, in 2010.

In September 2010, she joined the Instituto de Investigación Tecnológica (IIT) of Universidad Pontificia Comillas as a Research Assistant, where she works in different projects related to automatic generation control (AGC) and system stability. Her interests include system modeling and control, control system design, and power system stability.

Ignacio Egido was born in Arévalo (Ávila), Spain, in 1976. He received the M.S. and Ph.D. degrees in electrical engineering from the Universidad Pontificia Comillas, Madrid, Spain, in 2000 and 2005, respectively.

He is currently an Assistant Professor in the Department of Electrical Engineering of the School of Engineering of Universidad Pontificia Comillas. He develops his research activities at the Instituto de Investigación Tecnológica (IIT) of the same university, where he has been involved in a number of research projects related to AGC and power system stability. His interests include control system design and power systems stability and control.

Luis Rouco (M’91) received the Ingeniero Industrial and Doctor Ingeniero Industrial degrees from Universidad Politécnica de Madrid, Madrid, Spain, in 1985 and 1990, respectively.

He is a Professor of electrical engineering in the School of Engineering of Universidad Pontificia Comillas, Madrid, attached to the Department of Electrical Engineering. He served as Director of the Department of Electrical Engineering from 1999 to 2005. He develops his research activities at Instituto de Investigación Tecnológica (IIT) of the same university, where he has supervised more than 100 research and consultancy projects for Spanish and foreign companies. He has published more than 70 papers in conferences and journals. He has been a visiting researcher at Ontario Hydro, Toronto, ON, Canada; the Massachusetts Institute of Technology, Cambridge, MA, USA; and ABB Power Systems, Vasteras, Sweden. His areas of interest are modeling, analysis, simulation, and identification of electric power systems.

Prof. Rouco is a member of Cigré, the Vice-President of the Spanish Chapter of the IEEE Power Engineering Society, and a member of the Executive Committee of Spanish National Committee of Cigré.