Repeated games for privacy-aware distributed state estimation in the Smart Grid

E. Veronica Belmega†, Lalitha Sankar‡, and H. Vincent Poor∗
† ETIS / ENSEA - UCP - CNRS, France
‡ Arizona State University, AZ, USA
∗ Princeton University, NJ, USA

29/11/2012
Outline
1 State Estimation Problem in the Smart Grid
2 System Model
3 Games Framework
4 Conclusions
NetGCoop 2012 | E.V. Belmega | 29/11/2012 2 / 22
Introduction

State Estimation (SE) Problem

General issue: large-scale monitoring and observability of interconnected power systems

Managed locally by several Regional Transmission Organisations (RTOs)

Traditional SE: fully centralized approach
- Measurements are sent from the RTOs to a central coordinator
- SE is performed at the central coordinator

Hierarchical SE: distributed approaches [VCHRP-1981] [GEAVJGQ-2011]
- Each RTO performs local SE
- All results are gathered at the central coordinator to obtain a system-wide solution

A fully distributed approach is needed for size and scalability reasons.

Is it possible to design fully distributed SE via cooperation among RTOs? Is there a conflict of interest? What will be the signaling cost among RTOs?
System Model
System Model and Communication Protocol
[Figure: block diagram of the two-RTO communication protocol. Each RTO j observes the noisy measurement Y_j of the states (X_1, X_2), and its encoder/decoder exchanges the messages J_j = (J_j^1, ..., J_j^K) with the other RTO over K rounds.]
Assumptions [SKTP-2011]:

Two adjacent RTOs

Linearized noisy Gaussian measurement model:

Y_j = H_{j,j} X_j + H_{i,j} X_i + Z_j

where Y_j is the measurement of RTO j ∈ {1, 2}, X_j ∼ N(0, 1) is the state of RTO j, and Z_j ∼ N(0, σ_j²)

H_{1,1} = H_{2,2} = 1, H_{1,2} = α > 0, H_{2,1} = β > 0

RTOs encode (compress) a block of measurements and share it with each other

K such rounds are allowed
[SKTP-2011] L. Sankar, S. Kar, R. Tandon, H. V. Poor, “Competitive Privacy in the Smart Grid: An Information-theoretic Approach”, Smart Grid Communications, Brussels, Belgium, Oct. 2011.
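The measurement model can be sketched in a few lines of Python. This is an illustrative simulation (the helper name and sample size are my own choices), using σ_j² = 0.1 as in the later numerical illustrations:

```python
import math
import random

def sample_measurements(alpha, beta, sigma1_sq, sigma2_sq, rng):
    """One draw of the linearized two-RTO model:
    Y1 = X1 + alpha*X2 + Z1,  Y2 = beta*X1 + X2 + Z2,
    with states X_j ~ N(0, 1) and noises Z_j ~ N(0, sigma_j^2)."""
    x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
    y1 = x1 + alpha * x2 + rng.gauss(0, math.sqrt(sigma1_sq))
    y2 = beta * x1 + x2 + rng.gauss(0, math.sqrt(sigma2_sq))
    return (x1, x2), (y1, y2)

# Sanity check: Var(Y1) should approach V1 = 1 + alpha^2 + sigma1^2.
rng = random.Random(0)
alpha, beta = 0.5, 0.9
ys = [sample_measurements(alpha, beta, 0.1, 0.1, rng)[1][0]
      for _ in range(200_000)]
var_y1 = sum(y * y for y in ys) / len(ys)
# var_y1 is close to 1 + 0.25 + 0.1 = 1.35
```

The empirical variance of Y_1 approaches V_1 = 1 + α² + σ_1², the constant that reappears in the rate-distortion-leakage characterization.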
System Model
Competitive Privacy (Distributed SE) in the Smart Grid
[SKTP-2011]
Conflict at the RTO level:

Need to share data for better estimation fidelity

Withholding data for economic and end-user privacy reasons

Information-theoretic formalism:

RTO j encodes (quantizes) its measurements at rate R_j

Fidelity of SE is measured by the distortion D_j: the mean square error between the original and reconstructed state vectors

Leakage measure L_j: the average mutual information between RTO j's state vector and the information available at the other RTO, i.e., the data revealed by the two RTOs and the measurement at the other RTO
System Model

Rate-Distortion-Leakage (RDL) Tradeoff

[SKTP-2011]: (R_1, R_2, D_1, D_2, L_1, L_2)

For RTO j ∈ {1, 2} (with i the other index):

I. If D_min,i < D_i < D_max,i:

R_j = (1/2) log( c_j m_j² / (D_i − D_min,i) )  and  L_j = (1/2) log( m_j² / (m_j² D_min,j + n_j² (D_i − D_min,i)) )

II. If D_i ≥ D_max,i: R_j = 0 and L_j = (1/2) log( V_i / (V_i − q_j) )

minimum distortion ⇐⇒ maximum leakage
minimum leakage ⇐⇒ maximum distortion

How to choose the operating point of the network?
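A sketch evaluating these closed-form expressions (using the constants defined on the final slide; regime II is omitted and the function names are mine):

```python
import math

def model_constants(alpha, beta, s1_sq, s2_sq):
    """Constants of the [SKTP-2011] RDL characterization (final slide)."""
    V1 = 1 + alpha**2 + s1_sq
    V2 = 1 + beta**2 + s2_sq
    E = alpha + beta
    det = V1 * V2 - E**2
    return {
        "c": (det / V2, det / V1),                      # c_j = det / V_i
        "m": ((alpha * V2 - E) / det, (beta * V1 - E) / det),
        "n": ((V2 - beta * E) / det, (V1 - alpha * E) / det),
        "Dmin": (1 - (beta**2 * V1 + V2 - 2 * beta * E) / det,
                 1 - (V1 + alpha**2 * V2 - 2 * alpha * E) / det),
        "Dmax": (1 - 1 / V1, 1 - 1 / V2),
    }

def rate_and_leakage(j, D_other, k):
    """(R_j, L_j) in regime I, as functions of the distortion D_i that
    RTO j's revealed data grants to the other RTO i (j, i are 0-based)."""
    i = 1 - j
    cj, mj, nj = k["c"][j], k["m"][j], k["n"][j]
    Rj = 0.5 * math.log(cj * mj**2 / (D_other - k["Dmin"][i]))
    Lj = 0.5 * math.log(
        mj**2 / (mj**2 * k["Dmin"][j] + nj**2 * (D_other - k["Dmin"][i])))
    return Rj, Lj

k = model_constants(0.5, 0.9, 0.1, 0.1)
# Pushing D_2 toward D_min,2 (best fidelity for RTO 2) raises both RTO 1's
# rate and RTO 1's leakage: minimum distortion <=> maximum leakage.
R_lo, L_lo = rate_and_leakage(0, k["Dmin"][1] + 1e-3, k)
R_hi, L_hi = rate_and_leakage(0, k["Dmax"][1] - 1e-3, k)
```

For α = 0.5, β = 0.9, σ_j² = 0.1 this reproduces D_min,2 ≈ 0.3088 and confirms that granting the other RTO a smaller distortion costs both rate and leakage.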
Games Framework

What happens in a strategic environment?

RTOs are strategic autonomous entities

Target distortion levels: D̃_j ∈ (D_min,j, D_max,j]

Objective: each RTO chooses its own encoding rate R_j to optimize its utility

u_j(R_j, R_i) = −w̃_j L_j(D_i(R_j)) + w_j log( D̃_j / D_j(R_i) )

with w̃_j, w_j > 0 the weights on the leakage and fidelity terms.

The encoding rates are constrained by the target distortions:

R_j(D_i) ∈ [ (1/2) log( c_j m_j² / (D_i − D_min,i) ), ∞ ), if D_i < D_max,i
R_j(D_i) ∈ [0, ∞), otherwise
Games Framework

Equivalent model

Rate-distortion pairs are one-to-one mappings

Action of RTO j: a_j ≜ D_i ∈ A_j = (D_min,i, D̃_i]

Simplified expression of the utility function:

u_j(a_j, a_i) = −w̃_j L_j(a_j) + w_j log( D̃_j / a_i )

One-shot interaction G = (P, {A_j}_{j∈P}, {u_j}_{j∈P}), with P ≜ {1, 2}

Decoupled optimization problems: a_j* = D̃_i, ∀ j ∈ P

The RTOs reveal their measurements at the minimum rates required to achieve the target distortions.

What are the incentives required to achieve other possible RDL tuples?
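The decoupling can be verified numerically: only the leakage term depends on a player's own action, and it is strictly decreasing in the distortion granted. A minimal sketch for RTO 1, assuming the appendix constants for α = 0.5, β = 0.9, σ_j² = 0.1 (the weights, target, and opponent action are illustrative values):

```python
import math

# Constants for alpha = 0.5, beta = 0.9, sigma_j^2 = 0.1 (appendix slide).
V1, V2, E = 1.35, 1.91, 1.4
det = V1 * V2 - E**2
m1 = (0.5 * V2 - E) / det
n1 = (V2 - 0.9 * E) / det
Dmin1 = 1 - (0.9**2 * V1 + V2 - 2 * 0.9 * E) / det   # ~0.2183
Dmin2 = 1 - (V1 + 0.5**2 * V2 - 2 * 0.5 * E) / det   # ~0.3088

def u1(a1, a2, w_leak=1.0, w_fid=1.0, D1_target=0.2388):
    """One-shot utility of RTO 1: -w_leak * L1(a1) + w_fid * log(target/a2).
    Only the leakage term depends on RTO 1's own action a1."""
    L1 = 0.5 * math.log(m1**2 / (m1**2 * Dmin1 + n1**2 * (a1 - Dmin2)))
    return -w_leak * L1 + w_fid * math.log(D1_target / a2)

# u1 is strictly increasing in a1 for any fixed a2: the best response is the
# upper end of the action set, i.e., granting the maximum allowed distortion.
grid = [Dmin2 + t * 0.0008 for t in range(1, 100)]
vals = [u1(a, 0.23) for a in grid]
assert all(v_next > v for v, v_next in zip(vals, vals[1:]))
```

This is why the one-shot Nash equilibrium is the trivial point: each RTO reveals only the minimum required information.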
Games Framework: Pricing Mechanisms

Plan

1 State Estimation Problem in the Smart Grid
2 System Model
3 Games Framework
  ⊲ Pricing Mechanisms
  ⊲ Repeated Interactions
4 Conclusions
Games Framework: Pricing Mechanisms

Logarithmic pricing function [BSP-2012]

Each RTO is rewarded proportionally to the improvement of the SE fidelity at the other RTO

Objective function for the one-shot interaction:

u_j^(p)(a_j, a_i) = u_j(a_j, a_i) + p_j log( D̃_i / a_j )

The rate-distortion function is proportional to the logarithm of the distortion

Any distortion pair (D_1, D_2) can be achieved by appropriately tuning the price level, at the cost of leaking information.
[BSP-2012] E. V. Belmega, L. Sankar, H. V. Poor, “Pricing mechanisms for cooperative state estimation”, ISCCSP 2012, Rome, Italy, May 2012.
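A sketch of the price effect via grid search over RTO 1's action (a simplified objective that ignores the minimum-rate constraint; the price 1.2 is chosen between the thresholds T_2,1 ≈ 1.05 and T_1,1 ≈ 1.51 reported on the next slide):

```python
import math

# Constants for alpha = 0.5, beta = 0.9, sigma_j^2 = 0.1 (appendix slide).
V1, V2, E = 1.35, 1.91, 1.4
det = V1 * V2 - E**2
m1 = (0.5 * V2 - E) / det
n1 = (V2 - 0.9 * E) / det
Dmin1 = 1 - (0.9**2 * V1 + V2 - 2 * 0.9 * E) / det
Dmin2 = 1 - (V1 + 0.25 * V2 - 1.0 * E) / det          # 2*alpha*E = E here
Dmax2 = 1 - 1 / V2
D2_target = Dmin2 + 0.5 * (Dmax2 - Dmin2)             # midpoint target

def priced_u1(a1, p1):
    """RTO 1's priced objective; terms in the other player's action are
    constant in a1 and dropped."""
    L1 = 0.5 * math.log(m1**2 / (m1**2 * Dmin1 + n1**2 * (a1 - Dmin2)))
    return -L1 + p1 * math.log(D2_target / a1)

grid = [Dmin2 + t * (D2_target - Dmin2) / 2000 for t in range(1, 2001)]

def best_a1(p1):
    return max(grid, key=lambda a: priced_u1(a, p1))

# With no price RTO 1 grants the maximum allowed distortion; a high enough
# price buys a strictly lower distortion for RTO 2.
a_free, a_paid = best_a1(0.0), best_a1(1.2)
assert a_free == grid[-1]
assert a_paid < a_free
```

This reproduces the qualitative behavior of the slide that follows: below a first threshold the reward is too small to change behavior, and in an intermediate price range the optimal granted distortion decreases with the price.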
Games Framework: Pricing Mechanisms

Numerical Illustration [BSP-2012]

Scenario: α = 0.5, β = 0.9, σ_1² = σ_2² = 0.1, D̃_j = D_min,j + (D_max,j − D_min,j)/2

Optimal rate as a function of the price: R_1*(p_1)

[Figure: R_1*(p_1) for α = 0.5, β = 0.9, σ_1² = σ_2² = 0.1, with thresholds T_0,1 = 0.5, T_2,1 = 1.0548, T_1,1 = 1.5093]

If p_1 < T_2,1 then R_j* = R_min,j(D̃_i). If T_2,1 ≤ p_1 < T_1,1 then R_j* = R_j(p_j). If p_1 ≥ T_1,1 then R_j* → ∞.

Optimal distortion of RTO 2 as a function of the price: D_2*(p_1)

[Figure: D_2*(p_1) for the same scenario, decreasing from D̃_2 toward D_min,2 across the thresholds T_0,1 = 0.5, T_2,1 = 1.0548, T_1,1 = 1.5093]

Strictly positive pricing is required for the RTOs to reveal information. Any distortion D_i ∈ (D_min,i, D̃_i] can be achieved by tuning the price p_j.
Games Framework: Repeated Interactions

Plan

1 State Estimation Problem in the Smart Grid
2 System Model
3 Games Framework
  ⊲ Pricing Mechanisms
  ⊲ Repeated Interactions
4 Conclusions
Games Framework: Repeated Interactions

Repeated games

Pricing techniques may appear artificial (RTOs are paid for sharing their data)

With no pricing, can better distortion levels be achieved as a result of repeated interactions among the RTOs?

Assumptions:

G is played repeatedly

T > 1 rounds (finite or infinite)

At each round, players can observe the history of the game and condition their current play on past action profiles
Games Framework: Repeated Interactions

Definitions and notations

a^(t) = (a_1^(t), a_2^(t)): action profile at stage t

h^(t+1) = (a^(1), ..., a^(t)) ∈ H^(t+1): history at the end of stage t ≥ 1

Pure strategy {s_j^(t)}_{t=1}^T: a contingent plan of how to play in each stage t for any history h^(t):

s_j^(t): H^(t) → (D_min,i, D̃_i],  s_j^(t)(h^(t)) = a_j^(t) ∈ (D_min,i, D̃_i].

Discounted payoff of player j for the joint strategy s = (s_1, s_2):

v_j(s) = (1 − ρ) Σ_{t=1}^{T} ρ^{t−1} u_j(a^(t))

Present payoffs are more important than future ones.
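The discounted payoff is easy to compute directly; a minimal sketch showing the normalization and why present payoffs dominate:

```python
def discounted_payoff(stage_payoffs, rho):
    """v_j = (1 - rho) * sum_{t>=1} rho^(t-1) * u_j(a^(t)).
    enumerate() starts at 0, which matches the exponent t-1 for 1-based t.
    The (1 - rho) factor normalizes a constant stream u to v_j ~ u."""
    return (1 - rho) * sum(rho**t * u for t, u in enumerate(stage_payoffs))

# A constant stream of 2.0 over a long horizon evaluates to ~2.0.
v_const = discounted_payoff([2.0] * 10_000, 0.9)

# The same total undiscounted payoff is worth more when it comes early.
v_early = discounted_payoff([1.0] + [0.0] * 99, 0.5)
v_late = discounted_payoff([0.0] * 99 + [1.0], 0.5)
```

Here v_early = (1 − ρ)·1 at stage 1, while the same unit payoff delayed to stage 100 is discounted by ρ^99 and is worth almost nothing.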
Games Framework: Repeated Interactions

Subgame perfect equilibrium

For any history h^(t) ∈ H^(t), the game from stage t onwards is a subgame G_R(h^(t))

Final history in G_R(h^(t)): h^(T+1) = (h^(t), a^(t), ..., a^(T))

Strategies for G_R(h^(t)) are functions of the possible histories that are consistent with h^(t)

Definition
A subgame perfect equilibrium s* is a strategy profile (in a multi-stage game with observed actions) such that for all h^(t) ∈ H^(t), the restriction s*|h^(t) is a Nash equilibrium of the subgame G_R(h^(t)).

Refined equilibrium concept: it allows players to make only credible commitments, on which they have incentives to follow through.

Are there any non-trivial subgame perfect equilibria in our game?
Games Framework: Repeated Interactions

Finite horizon

The number of rounds T < +∞ is finite

Players have perfect knowledge of the ending of the game

Proposition
In the finite-horizon repeated game G_R^(T) = (P, {S_j}_{j∈P}, {v_j}_{j∈P}, T), the unique subgame perfect equilibrium is “not to share any information above the minimum requirement” at each stage of the game and for both players:

s_j^(t),* = D̃_i, ∀t ∈ {1, ..., T}, ∀j ∈ P.

Similar to the prisoners' dilemma

Cooperation by sharing data beyond the minimum requirement is not a credible commitment and cannot be enabled in the finite-horizon repeated game.
Games Framework: Repeated Interactions

Infinite horizon

The number of rounds T → +∞

The players are unsure of the ending of the game

Proposition
In the infinite-horizon repeated game G_R^(∞) = (P, {S_j}_{j∈P}, {v_j}_{j∈P}), if D_min,j > 0 for all j ∈ P, then the strategy “not to share any information above the minimum requirement” at each stage of the game and for both players is a subgame perfect equilibrium:

s_j^(t),* = D̃_i, ∀t ≥ 1, ∀j ∈ P.

Are there any other subgame perfect equilibria? Can a non-trivial exchange of data be sustainable in the long term?
Games Framework: Repeated Interactions

Proposition
In the infinite-horizon repeated game G_R^(∞) = (P, {S_j}_{j∈P}, {v_j}_{j∈P}), consider any agreement profile (D_2*, D_1*) ∈ (D_min,2, D̃_2) × (D_min,1, D̃_1) such that

u_1(D_2*, D_1*) > u_1(D̃_2, D̃_1)
u_2(D_1*, D_2*) > u_2(D̃_1, D̃_2).

If the discount factor is bounded as follows:

1 > ρ > max_{j∈P, D_i ∈ (D_i*, D̃_i]} { [u_j(D_i, D_j*) − u_j(D_i*, D_j*)] / [u_j(D_i, D_j*) − u_j(D_i, D̃_j)] }

and D_min,j > 0 for all j ∈ P, then the following strategy is a subgame perfect equilibrium: “Any player j shares data at the agreement point D_i* in the first stage and continues to share data at this agreement point so long as the other player i shares data at the agreement point D_j*. If any player has ever defected from the agreement point, then the players do not cooperate beyond the minimum requirement from this stage on.”

If the discount factor is large enough, better distortion levels can be achieved in the long term.
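The trigger strategy can be sanity-checked with the standard one-shot-deviation argument; the stage payoffs below are hypothetical numbers (not values from the deck), and the bound shown is the textbook grim-trigger condition rather than the exact expression of the proposition:

```python
def discounted(stream, rho):
    """Normalized discounted payoff (1 - rho) * sum_t rho^(t-1) * u_t."""
    return (1 - rho) * sum(rho**t * u for t, u in enumerate(stream))

def min_discount_factor(u_coop, u_dev, u_pun):
    """Grim trigger sustains cooperation against a one-shot deviation iff
    u_coop >= (1 - rho) * u_dev + rho * u_pun, i.e.
    rho >= (u_dev - u_coop) / (u_dev - u_pun)."""
    return (u_dev - u_coop) / (u_dev - u_pun)

# Hypothetical stage payoffs: cooperating at the agreement point, the best
# one-shot deviation, and the minimum-sharing punishment payoff.
u_coop, u_dev, u_pun = 1.0, 1.5, 0.2
rho_min = min_discount_factor(u_coop, u_dev, u_pun)   # = 0.5 / 1.3

T = 5_000
cooperate = [u_coop] * T
defect_once = [u_dev] + [u_pun] * (T - 1)             # deviate, then punished

# Patient player (rho > rho_min): cooperation beats defection.
patient = (discounted(cooperate, 0.6), discounted(defect_once, 0.6))
# Impatient player (rho < rho_min): defection wins.
impatient = (discounted(cooperate, 0.2), discounted(defect_once, 0.2))
```

The comparison mirrors the proposition: above the discount-factor threshold the punishment threat makes sharing at the agreement point self-enforcing, below it the one-stage deviation gain dominates.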
Games Framework: Repeated Interactions

Numerical Illustration

Scenario: r_1 ≜ w_1/w̃_1, r_2 ≜ w_2/w̃_2, α = 0.9, β = 0.5, σ_1² = σ_2² = 0.1,
D̃_j = D_min,j + 0.5 (D_max,j − D_min,j), D_min,1 = 0.3088, D̃_1 = 0.3926, D_min,2 = 0.2183, D̃_2 = 0.2388.

[Figure: sustainable distortion pairs for r_1 = 1, r_2 = 5]
Relatively asymmetric distortion pairs are not achievable.

[Figure: sustainable distortion pairs for r_1 = 5, r_2 = 5]
The higher the impact of the fidelity term relative to the leakage term on the payoff function, the larger the sustainable region and the lower the distortion levels.
Conclusions

Questions: How high does the discount factor have to be? Is it always possible to achieve non-trivial distortion pairs in the long term?

Knowledge required at each RTO:
- Its own state measurements
- Perfect knowledge of the history of play
- Overall system parameters

Tradeoff: signaling between the RTOs vs. the quality of the distributed SE

Extension to more than two RTOs: finding the achievable RDL tradeoff is an open issue [Sankar-Kar-Poor-Tandon]
Bibliography

[SKTP-2011] L. Sankar, S. Kar, R. Tandon, H. V. Poor, “Competitive Privacy in the Smart Grid: An Information-theoretic Approach”, Smart Grid Communications, Brussels, Belgium, Oct. 2011.

[VCHRP-1981] T. Van Cutsem, J. L. Horward, and M. Ribbens-Pavella, “A two-level static state estimator for electric power systems”, IEEE Trans. Power Apparatus and Systems, vol. 100, no. 8, pp. 3722–3732, Aug. 1981.

[GEAVJGQ-2011] A. Gómez-Expósito, A. Abur, A. de la Villa Jaén, and C. Gómez-Quiles, “A Multilevel State Estimation Paradigm for Smart Grids”, Proc. IEEE, vol. 99, no. 6, pp. 952–976, Jun. 2011.

[BSP-2012] E. V. Belmega, L. Sankar, and H. V. Poor, “Pricing mechanisms for cooperative state estimation”, ISCCSP 2012, Rome, Italy, May 2012.
Conclusions

Rate-Distortion-Leakage (RDL) Tradeoff

[SKTP-2011]: (R_1, R_2, D_1, D_2, L_1, L_2)

For RTO j ∈ {1, 2} (with i the other index):

I. If D_min,i < D_i < D_max,i:

R_j = (1/2) log( c_j m_j² / (D_i − D_min,i) )  and  L_j = (1/2) log( m_j² / (m_j² D_min,j + n_j² (D_i − D_min,i)) )

II. If D_i ≥ D_max,i: R_j = 0 and L_j = (1/2) log( V_i / (V_i − q_j) )

where q_1 = β, q_2 = α, V_1 = 1 + α² + σ_1², V_2 = 1 + β² + σ_2², E = α + β,

c_j = (V_1 V_2 − E²) / V_i,
n_1 = (V_2 − βE) / (V_1 V_2 − E²), n_2 = (V_1 − αE) / (V_1 V_2 − E²),
m_1 = (αV_2 − E) / (V_1 V_2 − E²), m_2 = (βV_1 − E) / (V_1 V_2 − E²),

D_min,1 = 1 − (β²V_1 + V_2 − 2βE) / (V_1 V_2 − E²), D_min,2 = 1 − (V_1 + α²V_2 − 2αE) / (V_1 V_2 − E²), D_max,j = 1 − 1/V_j