Repeated games for privacy-aware distributed state estimation in the Smart Grid

E. Veronica Belmega†, Lalitha Sankar‡, and H. Vincent Poor∗
† ETIS / ENSEA - UCP - CNRS, France
‡ Arizona State University, AZ, USA
∗ Princeton University, NJ, USA

29/11/2012
Outline
1 State Estimation Problem in the Smart Grid
2 System Model
3 Games Framework
4 Conclusions
NetGCoop 2012 | E.V. Belmega | 29/11/2012 2 / 22
Introduction

State Estimation (SE) Problem

General issue: large-scale monitoring and observability of interconnected power systems

Managed locally by several Regional Transmission Organisations (RTOs)

Traditional SE: fully centralized approach
- Measurements are sent from the RTOs to a central coordinator
- SE is performed at the central coordinator

Hierarchical SE: distributed approaches [VCHRP-1981] [GEAVJGQ-2011]
- Each RTO performs local SE
- All results are gathered at the central coordinator to obtain a system-wide solution

A fully distributed approach is needed for size and scalability reasons.

Is it possible to design fully distributed SE via cooperation among RTOs? Is there a conflict of interest? What will be the signaling cost among RTOs?
System Model
System Model and Communication Protocol
[Figure: block diagram of the two-RTO communication protocol. Each RTO j observes the noisy measurement Y_j of the states (X_1, X_2), and its encoder/decoder exchanges the messages J_j = (J_j^1, ..., J_j^K) with the other RTO over K rounds.]
Assumptions [SKTP-2011]:

Two adjacent RTOs

Linearized noisy Gaussian measurement model:

Y_j = H_{j,j} X_j + H_{i,j} X_i + Z_j

where Y_j is the measurement of RTO j ∈ {1, 2}, X_j ∼ N(0, 1) is the state of RTO j, and Z_j ∼ N(0, σ_j²)

H_{1,1} = H_{2,2} = 1, H_{1,2} = α > 0, H_{2,1} = β > 0

RTOs encode (compress) a block of measurements and share it with each other

K such rounds are allowed
[SKTP-2011] L. Sankar, S. Kar, R. Tandon, H. V. Poor, “Competitive Privacy in the Smart Grid: An Information-theoretic Approach”, Smart Grid Communications, Brussels, Belgium, Oct. 2011.
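The measurement model can be sketched in a few lines of Python. This is an illustrative simulation (the helper name and sample size are my own choices), using σ_j² = 0.1 as in the later numerical illustrations:

```python
import math
import random

def sample_measurements(alpha, beta, sigma1_sq, sigma2_sq, rng):
    """One draw of the linearized two-RTO model:
    Y1 = X1 + alpha*X2 + Z1,  Y2 = beta*X1 + X2 + Z2,
    with states X_j ~ N(0, 1) and noises Z_j ~ N(0, sigma_j^2)."""
    x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
    y1 = x1 + alpha * x2 + rng.gauss(0, math.sqrt(sigma1_sq))
    y2 = beta * x1 + x2 + rng.gauss(0, math.sqrt(sigma2_sq))
    return (x1, x2), (y1, y2)

# Sanity check: Var(Y1) should approach V1 = 1 + alpha^2 + sigma1^2.
rng = random.Random(0)
alpha, beta = 0.5, 0.9
ys = [sample_measurements(alpha, beta, 0.1, 0.1, rng)[1][0]
      for _ in range(200_000)]
var_y1 = sum(y * y for y in ys) / len(ys)
# var_y1 is close to 1 + 0.25 + 0.1 = 1.35
```

The empirical variance of Y_1 approaches V_1 = 1 + α² + σ_1², the constant that reappears in the rate-distortion-leakage characterization.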
System Model
Competitive Privacy (Distributed SE) in the Smart Grid
[SKTP-2011]
Conflict at the RTO level:

Need to share data for better estimation fidelity

Withholding data for economic and end-user privacy reasons

Information-theoretic formalism:

RTO j encodes (quantizes) its measurements at rate R_j

Fidelity of SE is measured by the distortion D_j: the mean square error between the original and reconstructed state vectors

Leakage measure L_j: the average mutual information between RTO j's state vector and the information available at the other RTO, i.e., the data revealed by the two RTOs and the measurement at the other RTO
System Model

Rate-Distortion-Leakage (RDL) Tradeoff

[SKTP-2011]: (R_1, R_2, D_1, D_2, L_1, L_2)

For RTO j ∈ {1, 2} (with i the other index):

I. If D_min,i < D_i < D_max,i:

R_j = (1/2) log( c_j m_j² / (D_i − D_min,i) )  and  L_j = (1/2) log( m_j² / (m_j² D_min,j + n_j² (D_i − D_min,i)) )

II. If D_i ≥ D_max,i: R_j = 0 and L_j = (1/2) log( V_i / (V_i − q_j) )

minimum distortion ⇐⇒ maximum leakage
minimum leakage ⇐⇒ maximum distortion

How to choose the operating point of the network?
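A sketch evaluating these closed-form expressions (using the constants defined on the final slide; regime II is omitted and the function names are mine):

```python
import math

def model_constants(alpha, beta, s1_sq, s2_sq):
    """Constants of the [SKTP-2011] RDL characterization (final slide)."""
    V1 = 1 + alpha**2 + s1_sq
    V2 = 1 + beta**2 + s2_sq
    E = alpha + beta
    det = V1 * V2 - E**2
    return {
        "c": (det / V2, det / V1),                      # c_j = det / V_i
        "m": ((alpha * V2 - E) / det, (beta * V1 - E) / det),
        "n": ((V2 - beta * E) / det, (V1 - alpha * E) / det),
        "Dmin": (1 - (beta**2 * V1 + V2 - 2 * beta * E) / det,
                 1 - (V1 + alpha**2 * V2 - 2 * alpha * E) / det),
        "Dmax": (1 - 1 / V1, 1 - 1 / V2),
    }

def rate_and_leakage(j, D_other, k):
    """(R_j, L_j) in regime I, as functions of the distortion D_i that
    RTO j's revealed data grants to the other RTO i (j, i are 0-based)."""
    i = 1 - j
    cj, mj, nj = k["c"][j], k["m"][j], k["n"][j]
    Rj = 0.5 * math.log(cj * mj**2 / (D_other - k["Dmin"][i]))
    Lj = 0.5 * math.log(
        mj**2 / (mj**2 * k["Dmin"][j] + nj**2 * (D_other - k["Dmin"][i])))
    return Rj, Lj

k = model_constants(0.5, 0.9, 0.1, 0.1)
# Pushing D_2 toward D_min,2 (best fidelity for RTO 2) raises both RTO 1's
# rate and RTO 1's leakage: minimum distortion <=> maximum leakage.
R_lo, L_lo = rate_and_leakage(0, k["Dmin"][1] + 1e-3, k)
R_hi, L_hi = rate_and_leakage(0, k["Dmax"][1] - 1e-3, k)
```

For α = 0.5, β = 0.9, σ_j² = 0.1 this reproduces D_min,2 ≈ 0.3088 and confirms that granting the other RTO a smaller distortion costs both rate and leakage.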
Games Framework

What happens in a strategic environment?

RTOs are strategic autonomous entities

Target distortion levels: D̃_j ∈ (D_min,j, D_max,j]

Objective: each RTO chooses its own encoding rate R_j to optimize its utility

u_j(R_j, R_i) = −w̃_j L_j(D_i(R_j)) + w_j log( D̃_j / D_j(R_i) )

with w̃_j, w_j > 0 the weights on the leakage and fidelity terms.

The encoding rates are constrained by the target distortions:

R_j(D_i) ∈ [ (1/2) log( c_j m_j² / (D_i − D_min,i) ), ∞ ), if D_i < D_max,i
R_j(D_i) ∈ [0, ∞), otherwise
Games Framework

Equivalent model

Rate-distortion pairs are one-to-one mappings

Action of RTO j: a_j ≜ D_i ∈ A_j = (D_min,i, D̃_i]

Simplified expression of the utility function:

u_j(a_j, a_i) = −w̃_j L_j(a_j) + w_j log( D̃_j / a_i )

One-shot interaction G = (P, {A_j}_{j∈P}, {u_j}_{j∈P}), with P ≜ {1, 2}

Decoupled optimization problems: a_j* = D̃_i, ∀ j ∈ P

The RTOs reveal their measurements at the minimum rates required to achieve the target distortions.

What are the incentives required to achieve other possible RDL tuples?
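The decoupling can be verified numerically: only the leakage term depends on a player's own action, and it is strictly decreasing in the distortion granted. A minimal sketch for RTO 1, assuming the appendix constants for α = 0.5, β = 0.9, σ_j² = 0.1 (the weights, target, and opponent action are illustrative values):

```python
import math

# Constants for alpha = 0.5, beta = 0.9, sigma_j^2 = 0.1 (appendix slide).
V1, V2, E = 1.35, 1.91, 1.4
det = V1 * V2 - E**2
m1 = (0.5 * V2 - E) / det
n1 = (V2 - 0.9 * E) / det
Dmin1 = 1 - (0.9**2 * V1 + V2 - 2 * 0.9 * E) / det   # ~0.2183
Dmin2 = 1 - (V1 + 0.5**2 * V2 - 2 * 0.5 * E) / det   # ~0.3088

def u1(a1, a2, w_leak=1.0, w_fid=1.0, D1_target=0.2388):
    """One-shot utility of RTO 1: -w_leak * L1(a1) + w_fid * log(target/a2).
    Only the leakage term depends on RTO 1's own action a1."""
    L1 = 0.5 * math.log(m1**2 / (m1**2 * Dmin1 + n1**2 * (a1 - Dmin2)))
    return -w_leak * L1 + w_fid * math.log(D1_target / a2)

# u1 is strictly increasing in a1 for any fixed a2: the best response is the
# upper end of the action set, i.e., granting the maximum allowed distortion.
grid = [Dmin2 + t * 0.0008 for t in range(1, 100)]
vals = [u1(a, 0.23) for a in grid]
assert all(v_next > v for v, v_next in zip(vals, vals[1:]))
```

This is why the one-shot Nash equilibrium is the trivial point: each RTO reveals only the minimum required information.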
Games Framework: Pricing Mechanisms

Plan

1 State Estimation Problem in the Smart Grid
2 System Model
3 Games Framework
  ⊲ Pricing Mechanisms
  ⊲ Repeated Interactions
4 Conclusions
Games Framework: Pricing Mechanisms

Logarithmic pricing function [BSP-2012]

Each RTO is rewarded proportionally to the improvement of the SE fidelity at the other RTO

Objective function for the one-shot interaction:

u_j^(p)(a_j, a_i) = u_j(a_j, a_i) + p_j log( D̃_i / a_j )

The rate-distortion function is proportional to the logarithm of the distortion

Any distortion pair (D_1, D_2) can be achieved by appropriately tuning the price level, at the cost of leaking information.
[BSP-2012] E. V. Belmega, L. Sankar, H. V. Poor, “Pricing mechanisms for cooperative state estimation”, ISCCSP 2012, Rome, Italy, May 2012.
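A sketch of the price effect via grid search over RTO 1's action (a simplified objective that ignores the minimum-rate constraint; the price 1.2 is chosen between the thresholds T_2,1 ≈ 1.05 and T_1,1 ≈ 1.51 reported on the next slide):

```python
import math

# Constants for alpha = 0.5, beta = 0.9, sigma_j^2 = 0.1 (appendix slide).
V1, V2, E = 1.35, 1.91, 1.4
det = V1 * V2 - E**2
m1 = (0.5 * V2 - E) / det
n1 = (V2 - 0.9 * E) / det
Dmin1 = 1 - (0.9**2 * V1 + V2 - 2 * 0.9 * E) / det
Dmin2 = 1 - (V1 + 0.25 * V2 - 1.0 * E) / det          # 2*alpha*E = E here
Dmax2 = 1 - 1 / V2
D2_target = Dmin2 + 0.5 * (Dmax2 - Dmin2)             # midpoint target

def priced_u1(a1, p1):
    """RTO 1's priced objective; terms in the other player's action are
    constant in a1 and dropped."""
    L1 = 0.5 * math.log(m1**2 / (m1**2 * Dmin1 + n1**2 * (a1 - Dmin2)))
    return -L1 + p1 * math.log(D2_target / a1)

grid = [Dmin2 + t * (D2_target - Dmin2) / 2000 for t in range(1, 2001)]

def best_a1(p1):
    return max(grid, key=lambda a: priced_u1(a, p1))

# With no price RTO 1 grants the maximum allowed distortion; a high enough
# price buys a strictly lower distortion for RTO 2.
a_free, a_paid = best_a1(0.0), best_a1(1.2)
assert a_free == grid[-1]
assert a_paid < a_free
```

This reproduces the qualitative behavior of the slide that follows: below a first threshold the reward is too small to change behavior, and in an intermediate price range the optimal granted distortion decreases with the price.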
Games Framework: Pricing Mechanisms

Numerical Illustration [BSP-2012]

Scenario: α = 0.5, β = 0.9, σ_1² = σ_2² = 0.1, D̃_j = D_min,j + (D_max,j − D_min,j)/2

Optimal rate as a function of the price: R_1*(p_1)

[Figure: R_1*(p_1) for α = 0.5, β = 0.9, σ_1² = σ_2² = 0.1, with thresholds T_0,1 = 0.5, T_2,1 = 1.0548, T_1,1 = 1.5093]

If p_1 < T_2,1 then R_j* = R_min,j(D̃_i). If T_2,1 ≤ p_1 < T_1,1 then R_j* = R_j(p_j). If p_1 ≥ T_1,1 then R_j* → ∞.

Optimal distortion of RTO 2 as a function of the price: D_2*(p_1)

[Figure: D_2*(p_1) for the same scenario, decreasing from D̃_2 toward D_min,2 across the thresholds T_0,1 = 0.5, T_2,1 = 1.0548, T_1,1 = 1.5093]

Strictly positive pricing is required for the RTOs to reveal information. Any distortion D_i ∈ (D_min,i, D̃_i] can be achieved by tuning the price p_j.
Games Framework: Repeated Interactions

Plan

1 State Estimation Problem in the Smart Grid
2 System Model
3 Games Framework
  ⊲ Pricing Mechanisms
  ⊲ Repeated Interactions
4 Conclusions
Games Framework: Repeated Interactions

Repeated games

Pricing techniques may appear artificial (RTOs are paid for sharing their data)

With no pricing, can better distortion levels be achieved as a result of repeated interactions among the RTOs?

Assumptions:

G is played repeatedly

T > 1 rounds (finite or infinite)

At each round, players can observe the history of the game and condition their current play on past action profiles
Games Framework: Repeated Interactions

Definitions and notations

a^(t) = (a_1^(t), a_2^(t)): action profile at stage t

h^(t+1) = (a^(1), ..., a^(t)) ∈ H^(t+1): history at the end of stage t ≥ 1

Pure strategy {s_j^(t)}_{t=1}^T: a contingent plan of how to play in each stage t for any history h^(t):

s_j^(t): H^(t) → (D_min,i, D̃_i],  s_j^(t)(h^(t)) = a_j^(t) ∈ (D_min,i, D̃_i].

Discounted payoff of player j for the joint strategy s = (s_1, s_2):

v_j(s) = (1 − ρ) Σ_{t=1}^{T} ρ^{t−1} u_j(a^(t))

Present payoffs are more important than future ones.
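The discounted payoff is easy to compute directly; a minimal sketch showing the normalization and why present payoffs dominate:

```python
def discounted_payoff(stage_payoffs, rho):
    """v_j = (1 - rho) * sum_{t>=1} rho^(t-1) * u_j(a^(t)).
    enumerate() starts at 0, which matches the exponent t-1 for 1-based t.
    The (1 - rho) factor normalizes a constant stream u to v_j ~ u."""
    return (1 - rho) * sum(rho**t * u for t, u in enumerate(stage_payoffs))

# A constant stream of 2.0 over a long horizon evaluates to ~2.0.
v_const = discounted_payoff([2.0] * 10_000, 0.9)

# The same total undiscounted payoff is worth more when it comes early.
v_early = discounted_payoff([1.0] + [0.0] * 99, 0.5)
v_late = discounted_payoff([0.0] * 99 + [1.0], 0.5)
```

Here v_early = (1 − ρ)·1 at stage 1, while the same unit payoff delayed to stage 100 is discounted by ρ^99 and is worth almost nothing.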
Games Framework: Repeated Interactions

Subgame perfect equilibrium

For any history h^(t) ∈ H^(t), the game from stage t onwards is a subgame G_R(h^(t))

Final history in G_R(h^(t)): h^(T+1) = (h^(t), a^(t), ..., a^(T))

Strategies for G_R(h^(t)) are functions of the possible histories that are consistent with h^(t)

Definition
A subgame perfect equilibrium s* is a strategy profile (in a multi-stage game with observed actions) such that for all h^(t) ∈ H^(t), the restriction s*|h^(t) is a Nash equilibrium of the subgame G_R(h^(t)).

Refined equilibrium concept: it allows players to make only credible commitments, on which they have incentives to follow through.

Are there any non-trivial subgame perfect equilibria in our game?
Games Framework: Repeated Interactions

Finite horizon

The number of rounds T < +∞ is finite

Players have perfect knowledge of the ending of the game

Proposition
In the finite-horizon repeated game G_R^(T) = (P, {S_j}_{j∈P}, {v_j}_{j∈P}, T), the unique subgame perfect equilibrium is “not to share any information above the minimum requirement” at each stage of the game and for both players:

s_j^(t),* = D̃_i, ∀t ∈ {1, ..., T}, ∀j ∈ P.

Similar to the prisoners' dilemma

Cooperation by sharing data beyond the minimum requirement is not a credible commitment and cannot be enabled in the finite-horizon repeated game.
Games Framework: Repeated Interactions

Infinite horizon

The number of rounds T → +∞

The players are unsure of the ending of the game

Proposition
In the infinite-horizon repeated game G_R^(∞) = (P, {S_j}_{j∈P}, {v_j}_{j∈P}), if D_min,j > 0 for all j ∈ P, then the strategy “not to share any information above the minimum requirement” at each stage of the game and for both players is a subgame perfect equilibrium:

s_j^(t),* = D̃_i, ∀t ≥ 1, ∀j ∈ P.

Are there any other subgame perfect equilibria? Can a non-trivial exchange of data be sustainable in the long term?
Games Framework: Repeated Interactions

Proposition
In the infinite-horizon repeated game G_R^(∞) = (P, {S_j}_{j∈P}, {v_j}_{j∈P}), consider any agreement profile (D_2*, D_1*) ∈ (D_min,2, D̃_2) × (D_min,1, D̃_1) such that

u_1(D_2*, D_1*) > u_1(D̃_2, D̃_1)
u_2(D_1*, D_2*) > u_2(D̃_1, D̃_2).

If the discount factor is bounded as follows:

1 > ρ > max_{j∈P, D_i ∈ (D_i*, D̃_i]} { [u_j(D_i, D_j*) − u_j(D_i*, D_j*)] / [u_j(D_i, D_j*) − u_j(D_i, D̃_j)] }

and D_min,j > 0 for all j ∈ P, then the following strategy is a subgame perfect equilibrium: “Any player j shares data at the agreement point D_i* in the first stage and continues to share data at this agreement point so long as the other player i shares data at the agreement point D_j*. If any player has ever defected from the agreement point, then the players do not cooperate beyond the minimum requirement from this stage on.”

If the discount factor is large enough, better distortion levels can be achieved in the long term.
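The trigger strategy can be sanity-checked with the standard one-shot-deviation argument; the stage payoffs below are hypothetical numbers (not values from the deck), and the bound shown is the textbook grim-trigger condition rather than the exact expression of the proposition:

```python
def discounted(stream, rho):
    """Normalized discounted payoff (1 - rho) * sum_t rho^(t-1) * u_t."""
    return (1 - rho) * sum(rho**t * u for t, u in enumerate(stream))

def min_discount_factor(u_coop, u_dev, u_pun):
    """Grim trigger sustains cooperation against a one-shot deviation iff
    u_coop >= (1 - rho) * u_dev + rho * u_pun, i.e.
    rho >= (u_dev - u_coop) / (u_dev - u_pun)."""
    return (u_dev - u_coop) / (u_dev - u_pun)

# Hypothetical stage payoffs: cooperating at the agreement point, the best
# one-shot deviation, and the minimum-sharing punishment payoff.
u_coop, u_dev, u_pun = 1.0, 1.5, 0.2
rho_min = min_discount_factor(u_coop, u_dev, u_pun)   # = 0.5 / 1.3

T = 5_000
cooperate = [u_coop] * T
defect_once = [u_dev] + [u_pun] * (T - 1)             # deviate, then punished

# Patient player (rho > rho_min): cooperation beats defection.
patient = (discounted(cooperate, 0.6), discounted(defect_once, 0.6))
# Impatient player (rho < rho_min): defection wins.
impatient = (discounted(cooperate, 0.2), discounted(defect_once, 0.2))
```

The comparison mirrors the proposition: above the discount-factor threshold the punishment threat makes sharing at the agreement point self-enforcing, below it the one-stage deviation gain dominates.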
Games Framework: Repeated Interactions

Numerical Illustration

Scenario: r_1 ≜ w_1/w̃_1, r_2 ≜ w_2/w̃_2, α = 0.9, β = 0.5, σ_1² = σ_2² = 0.1,
D̃_j = D_min,j + 0.5 (D_max,j − D_min,j), D_min,1 = 0.3088, D̃_1 = 0.3926, D_min,2 = 0.2183, D̃_2 = 0.2388.

[Figure: sustainable distortion pairs for r_1 = 1, r_2 = 5]
Relatively asymmetric distortion pairs are not achievable.

[Figure: sustainable distortion pairs for r_1 = 5, r_2 = 5]
The higher the impact of the fidelity term relative to the leakage term on the payoff function, the larger the sustainable region and the lower the distortion levels.
Conclusions

Questions: How high does the discount factor have to be? Is it always possible to achieve non-trivial distortion pairs in the long term?

Knowledge required at each RTO:
- Its own state measurements
- Perfect knowledge of the history of play
- Overall system parameters

Tradeoff: signaling between the RTOs vs. the quality of the distributed SE

Extension to more than two RTOs: finding the achievable RDL tradeoff is an open issue [Sankar-Kar-Poor-Tandon]
Bibliography

[SKTP-2011] L. Sankar, S. Kar, R. Tandon, H. V. Poor, “Competitive Privacy in the Smart Grid: An Information-theoretic Approach”, Smart Grid Communications, Brussels, Belgium, Oct. 2011.

[VCHRP-1981] T. Van Cutsem, J. L. Horward, and M. Ribbens-Pavella, “A two-level static state estimator for electric power systems”, IEEE Trans. Power Apparatus and Systems, vol. 100, no. 8, pp. 3722–3732, Aug. 1981.

[GEAVJGQ-2011] A. Gómez-Expósito, A. Abur, A. de la Villa Jaén, and C. Gómez-Quiles, “A Multilevel State Estimation Paradigm for Smart Grids”, Proc. IEEE, vol. 99, no. 6, pp. 952–976, Jun. 2011.

[BSP-2012] E. V. Belmega, L. Sankar, and H. V. Poor, “Pricing mechanisms for cooperative state estimation”, ISCCSP 2012, Rome, Italy, May 2012.
Conclusions

Rate-Distortion-Leakage (RDL) Tradeoff

[SKTP-2011]: (R_1, R_2, D_1, D_2, L_1, L_2)

For RTO j ∈ {1, 2} (with i the other index):

I. If D_min,i < D_i < D_max,i:

R_j = (1/2) log( c_j m_j² / (D_i − D_min,i) )  and  L_j = (1/2) log( m_j² / (m_j² D_min,j + n_j² (D_i − D_min,i)) )

II. If D_i ≥ D_max,i: R_j = 0 and L_j = (1/2) log( V_i / (V_i − q_j) )

where q_1 = β, q_2 = α, V_1 = 1 + α² + σ_1², V_2 = 1 + β² + σ_2², E = α + β,

c_j = (V_1 V_2 − E²) / V_i,
n_1 = (V_2 − βE) / (V_1 V_2 − E²), n_2 = (V_1 − αE) / (V_1 V_2 − E²),
m_1 = (αV_2 − E) / (V_1 V_2 − E²), m_2 = (βV_1 − E) / (V_1 V_2 − E²),

D_min,1 = 1 − (β²V_1 + V_2 − 2βE) / (V_1 V_2 − E²), D_min,2 = 1 − (V_1 + α²V_2 − 2αE) / (V_1 V_2 − E²), D_max,j = 1 − 1/V_j