Upload
tilden
View
65
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Monitoring and Controlling the Smart Electric Power Grid with Isis2. Ken Birman. The power grid is getting old…. Today’s electric power grid is a legacy of a long evolution that has many structural vestiges of the monopoly period As a result much power is wasted - PowerPoint PPT Presentation
Citation preview
1
MONITORING AND CONTROLLING THE SMART ELECTRIC POWER GRID WITH ISIS2.
Ken Birman
2
The power grid is getting old…
Today’s electric power grid is a legacy of a long evolution that has many structural vestiges of the monopoly period
As a result much power is wasted With new “green” power sources, some utilities
literally purchase power and throw it away A great deal of what we generate is lost in transmission And much of what gets delivered ends up being used to light empty rooms,
keep water hot when nobody is home to use the shower, or gets turned into heat by computers that has to be removed using air conditioning
Statistics 66% of the energy produced for electricity is lost, 10% of that in transmission. 71% of the energy produced for transportation is wasted. 20% of the energy produced to run American industry is lost, and 20% of the energy that we use in our commercial and residential buildings is
wasted.
3
Smart-Grid Vision
Within the home, office or building: a smart-meter dispatches power-hungry tasks at times convenient to the grid Home A/C, freezer Hot water heater, dish or clothes washer Recharging an electric vehicle
The meter knows a lot about you and privacy is a concern here. “Privacy preservation” highly desired
4
The smart grid would also control:
High-power inverters for DC / AC linkages
High-tension long-haul interconnects for sharing excess power needed to maintain load balance
Power storage units (pumped water, inertial storage systems, even batteries)
Microgenerators that phase in when demand is high and prices justify doing so, out when not in use
Grid use of excess power from home solar arrays
5
A little bit of history
Prior to the 1960’s, most power in the United States was controlled by regional monopolies They had sole control over the power generation,
delivery and billing structure There was no competition, so prices were
regulated Owning a utility was a guarantee of hefty revenue
But the lack of competition also meant that there was little incentive for innovation
Similar stories in many countries worldwide
6
Restructuring: The US response A process of restructuring has slowly changed
this: Big regions broken into a collection of power
producing companies, “independent service operator” (ISO), consumer-delivery companies
Emergence of 24-hour ahead of time power exchange markets where larger “parcels” of power can be bought and sold, permitting 24-hour lead time planning to ensure that the grid will meet its needs
ISO helps ensure competitiveness and stability Increased use of long-distance high-tension lines to
share “reactive power” between regions
7
How a small power grid operates Power flows “like water”
Path of least resistance Governed by Kirchoff’s Law Power enters at every generator,
exits at every load Hierarchical structure:
Primary “power busses” Secondary smaller local feeds
10-Generator, 39-bus New England System
8
Operating a restructured grid
Many entities, each with very limited visibility into the state of their neigbors Two issues: limited sharing of topological data
and limited sharing of real-time status, driven by competitive concerns, security worries, “evolution” of the system topology as lines break, are fixed, are changed
In practice, you “know your own grid” and know (a little) about tie-lines to your neighbors
Various ad-hoc sharing and collaboration solutions are in place.... long list of blackouts in past decade testify to the issues with this approach!
9
State estimation
This is the problem of collecting measurements of the power system
Then fitting the measurements to a model generated from the systems bus architecture
Like a least-squares optimization… complicated by: Transients as devices come on/go offline Imprecision of our models of the physical grid
itself: The bus architecture is really a simplification of the actual system and fails to capture many aspects
10
Ways of doing state estimation Historical: SCADA
Track the line voltage and frequency If frequency drifts, adjust generators “State estimation” by fitting power equation to
observed data, a computationally costly task
New: “synchrophasor measurement unit” or PMU Directly measures the phase angle of a power bus Enables fast, hierarchical state estimation... but
only if the current topology is accurately known
11
Adoption Limitations
Renewables very hard to “plan” for, hence often offer power at times when it isn’t convenient and can sag suddenly (visualize: cloud over solar field)
Poor past experiences with machine-control solutions in the physical loop
In a restructured grid, visibility into neighboring regions is of limited quality, topology data may be unavailable
Commercial market place dominated by six big players, and they sell all the products. Extremely proprietary
12
Power grid under attack?
Richard Clarke has argued (in his book CyberWar) that we are at grave risk of attack on the grid He believes that the grid is already so connected
to the Internet that it can be disrupted easily Argues that some forms of attack would destroy
huge amounts of hard to replace infrastructure: “logic bombs”
And he suggests that many countries may actually have prepositioned these logic bombs for use during times of political stress or war
13
GridCloud: Scalable, secure monitoring
14
As deployed: Mutually-distrusting domains
Distinct owners: Peersplus hierarchy (ISO)
Owner controls data flow:distinct security policies
ISO integrates data...but as we get furtherfrom sources, quality ofinformation is apotential concern
15
... and we’re using Isis2 to build it! Our challenge:
Ken and his students understand the Isis2 library
But the people who create the smart applications for the smart grid are not likely to be Isis2 programmers
Can we leverage Isis2 without it being too visible?
16
Initial insights
We want to run at cloud scale, but with an emphasis on accurate real-time data Recall the Facebook challenge of computing
with extensive non-coherent caching We can only use that technique for data
that changes very rarely, and in GridCloud there would not be a great deal of such data
Thus we will need to use coherent replication very heavily
17
Initial Insights
Most GridCloud users won’t be “programmers” They won’t build new programs, but may
run existing ones that evolved over decades in other settings
They will want to important those solutions and run them in a cloud but with minimal changes
So GridCloud needs to leverage Isis2 yet the GridCloud user probably won’t be a person who could learn to use Isis2 directly
18
Key design features
Redundancy to overcome network or cloud scheduling delays, faults
Open-flow network routing scheme integrated with a new security architecture we’ve developed
On cloud, can host various applications: some ignore sharded data format but others could leverage it
We use Isis2 to manage and control the system Free open source group communication
system Includes a new scalable distributed “data
analysis” tool
19
What will GridCloud applications do?
Consume monitoring data, output “advice” (later, perhaps do direct control for some technologies)
These applications fall into categories Use of machine-learning to develop optimized plans
for dispatch of power Tools for helping operators manage and control their
smart grid networks Contingency reaction solutions for dealing with various
kinds of environmental (or other) disruptions “What-if?” technologies to assist the owner in evolving
their system with new technology capabilities
20
Dangers of Inconsistency
Inconsistency causes bugs Cloud control system speaks with “two
voices” In physical infrastructure settings,
consequences can be very costly
“Switch on the 50KV Canadian bus”
“Canadian 50KV bus going offline”
Bang!
GridCloud Consistency: Based on Isis2
21
Virtually synchronous runs are indistinguishable from synchronous runs
We are using Isis2 to create a control and management system for GridCloud
p
q
r
s
t
Time: 0 10 20 30 40 50 60 70
p
q
r
s
t
Time: 0 10 20 30 40 50 60 70
Synchronous execution Virtually synchronous execution
Non-replicated reference execution
A=3 B=7 B = B-A
A=A+1
22
Security
Mutually suspicious organizations that nonetheless must collaborate safely while repelling attackers Human issue: design acceptable information
sharing policies at a relatively “fine grained” level API issue: Need suitable ways to express these
policies so that operators will understand them Technical issue: GridCloud managed network would
need to correctly implement the desired behaviors Given complexity of the setting, these goals
represent significant challenges
23
Red Sky
First part of our solution, being developed by PhD students Zhiyuan Teo and Erluo Li Special secure hardware endpoints for
sensors Redundant network routing over Open Flow
switches Security policy controls connections Data arrives in Cloud on sharded data
collectors Special “TCP Reliability” (TCPR) layer allows
us to shift endpoints from collector to collector in a seamless way
Isis2 used within Red Sky but invisble to the user!
24
DMake
Created by PhD student Theo Gkountouvas Role is to monitor and manage cloud
application Each application is provided with
configuration data (runtime command line arguments plus XML files with parameter information)
As it runs, each creates XML files of status, which DMake collects
DMake then runs an optimization module that updates the configuration data, which is pushed back to the applications
25
Why do we call it DMake?
On Linux, the Make program and Makefile format is familiar and very popular DMake uses the exact same format for its control
files! New functionality is to collect information in a
secure, consistent and fault-tolerant way Optimization component is a “plug in” program
provided by the GridCloud user
Thus DMake uses Isis2 and the GridCloud user benefits without seeing Isis2 directly!
26
Experiments with our prototype 2 Area System Simulation – Standard IEEE
test system Two Induced Contingencies
@ 1min mark – line 7-8 disconnected Reconnected after 5 seconds
@ 5min mark – lines 7-8 & 8-9 disconnected Not reconnected
27
Early experience with GridCloud Starting point: An existing state
estimation systemWSU Code Cornell Code WSU Code
28
Modified version leverages redundancy
29
Latency DistributionGraphs: Number of times a particular latency occurs
1751902052202352502652802953103253403553703854004154304454604754905055205355505650
50
100
150
200
250
300
350
400
Best ReplicaReplica 3Replica 2Replica 1
1751902052202352502652802953103253403553703854004154304454604754905055205355505650
50
100
150
200
250
300
350
400
Best Replica
30
JitterGraphs: Jitter of previous latency graphs
0 50 100 150 200 250 300 350-40
-20
0
20
40
Time (Seconds)
Jitter
(ms)
0 50 100 150 200 250 300 350-40
-20
0
20
40
Time (Seconds)
Jitter
(ms)
0 50 100 150 200 250 300 350-40-30-20-10
010203040
Time (Seconds)
Jitter
(ms)
0 50 100 150 200 250 300 350-40
-20
0
20
40
Time (Seconds)Jitt
er (m
s)
31
Replica Latency ReductionGraphs: Latency – Length of time until all necessary data is available
0 50 100 150 200 250 300 350175
275
375
475
575
Time (Seconds)
Late
ncy
(ms)
0 50 100 150 200 250 300 350175
275
375
475
575
Time (Seconds)
Late
ncy
(ms)
0 50 100 150 200 250 300 350175
275
375
475
575
Time (Seconds)
Late
ncy
(ms)
0 50 100 150 200 250 300 350175
275
375
475
575
Time (Seconds)La
tenc
y (m
s)
32
Improvement Summary
Vastly reduced latency and jitter All replica latency reduced from typical [235 , 350]
to [215 , 250] All replica jitter reduced from typical [-10 ,10] to [-
5 , 5] Consistency from replica to replica has
improved Consistency between “all replicas” to a
particular replica has improved Consistency over time for each run has
improved
33
Today: Latency DistributionGraphs: Number of times a particular latency occurs
1751902052202352502652802953103253403553703854004154304454604754905055205355505650
50
100
150
200
250
300
350
400
450
500
Best ReplicaReplica 3Replica 2Replica 1
1751902052202352502652802953103253403553703854004154304454604754905055205355505650
50
100
150
200
250
300
350
400
450
500
Best Replica
34
Today: JitterGraphs: Jitter of previous latency graphs
0 50 100 150 200 250 300 350-40
-20
0
20
40
Time (Seconds)
Jitter
(ms)
0 50 100 150 200 250 300 350-40
-20
0
20
40
Time (Seconds)
Jitter
(ms)
0 50 100 150 200 250 300 350-40-30-20-10
010203040
Time (Seconds)
Jitter
(ms)
0 50 100 150 200 250 300 350-40
-20
0
20
40
Time (Seconds)Jitt
er (m
s)
35
Case Study: Failure of 1/6th of Collector Nodes
We fail Replica1-Node2 of the collectors for 1 minute
This results in 62 out of 294 PMU streams being unavailable for 1 minute
36
0 50 100 150 200 250 300 350175
275
375
475
575
Time (Seconds)
Late
ncy
(ms)
Failure Case: Replica Latency ReductionGraphs: Latency – Length of time until all necessary data is available
0 50 100 150 200 250 300 350175
275
375
475
575
Time (Seconds)
Late
ncy
(ms)
0 50 100 150 200 250 300 350175
275
375
475
575
Late
ncy
(ms)
Data Loss
0 50 100 150 200 250 300 350175
275
375
475
575
Time (Seconds)La
tenc
y (m
s)
37
Failure Case: Latency DistributionGraphs: Number of times a particular latency occurs
175 188 201 214 227 240 253 266 279 292 305 318 331 344 357 370 383 396 409 422 435 448 461 474 487 500 513 526 539 552 5650
50
100
150
200
250
300
Best ReplicaReplica 3Replica 2Replica 1
175 188 201 214 227 240 253 266 279 292 305 318 331 344 357 370 383 396 409 422 435 448 461 474 487 500 513 526 539 552 5650
50
100
150
200
250
300
Best Replica
38
Failure Case: JitterGraphs: Jitter of previous latency graphs
0 50 100 150 200 250 300 350-40
-20
0
20
40
Time (Seconds)
Jitter
(ms)
0 50 100 150 200 250 300 350-40
-20
0
20
40
Time (Seconds)
Jitter
(ms)
0 50 100 150 200 250 300 350-40-30-20-10
010203040
Time (Seconds)
Jitter
(ms)
0 50 100 150 200 250 300 350-40
-20
0
20
40
Time (Seconds)Jitt
er (m
s)
39
A long road ahead...
Future monitoring need could involve tens or hundreds of thousands of PMUs operated by multiple mutually distrusting organizations
Security requirements will force us into a world with multiple side-by-side “micro-clouds” that share data only in controlled ways
Any serious system will host many “smart” applications and need to automate dispatch, scheduling, configuration management
40
Building Smart Grid Applications These need to be machine learning algorithms
For example: in these weather conditions, demand for electric power tends to follow such-and-such a curve
With this knowledge, you can predict demand... and plan accordingly
If demand will exceed supply, you can learn what steps would lower demand, and by how much
So the typical application is a machine learning application that performs well (in real-time) over the Ida infrastructure where the data is living
41
“Killer apps”
Use grid state to Predict load Make decisions about what to repair first after a
storm Explore options if something unexpectedly fails and
the grid must be operated in a damaged mode Decide what additional power lines would have the
best impact on power pricing Use electric power from a wide region as “assurance”
to permit better use of renewable energy sources Identify malfunctioning components or those close to
failure
42
Created by Professor Dave BakkenExpert on Electric Power Grids and Smart Grid
Some Killer Apps slides
#1: Decision Support
Operators must constantly make decisions Conditions change as lines come up or go down,
power demand rises or falls, generators go online or offline…
Emerging Smart Grid is more and more complex: best decision is not evident and impact unclear Smart tools can really help with planning
Acknowledgments: Professor Anjan Bose
#2: Integration of Renewables
More and more homes have micro-generators But to use this green, renewable power the grid
needs to be able to “plan” and balance loads Even a cloud can have a big impact on use
patterns GridCloud commissions thousands of nodes
analyzing candidate control steps in parallel Best approach (actionable steps) is given to
operators Acknowledgements: Prof. Anjan Bose (WSU)
#3: Mitigation Control
Rare combination of events do happen Have lead to many blackouts when not mitigated!
E.g., N-3 contingency (3 failures) never planned for Infrequent but hugely expensive to analyze GridCloud commissions thousands of nodes
analyzing candidate mitigation steps in parallel Best approach (actionable steps) is given to
operators Acknowledgements: Prof. Mani
Venkatasubramanian (WSU)
#2: Oscillation Alarm Processing
Grids oscillate between regions Negatively damping can lead to blackout E.g., Oregon/California in July 1996: 0.3 Hz (!!)
GridCloud commissions massive parallel computations exploring huge permutation space Looking for trends and correlations of alarm data Also huge number of model-based simluations too Finds root cause much faster than possible today in
much broader set of conditions Acknowledgements: Prof. Mani
Venkatasubramanian (WSU)
#3: Post-Tripping Fault Diagnosis
Protection scheme trips a relay, but why? Underlying cause must be ascertained post facto
GridCloud commissions massive computations to identify the fault(s) that provoked the trip(s) Many different kinds of fault diagnosis
algorithms, all could be run in parallel Possible integration candidate: openFLE (fault
location engine) from Grid Protection Alliance Acknowledgements: Prof. Anuraug Srivastava
(WSU)
#4: Multi-Resolution Frequency Disturbance Visualization
Grid operates in very narrow range unless stressed Frequency excursions outside this give clues to problems
Frequency disturbance recorder (FDR): new device recording frequency disturbances at high rates E.g., internal sampling of FNET device (in our lab): 1440 Hz
GridCloud commissions thousands of parallel frequency rendering computations Provide operators a rich suite of visualizations with which
to better understand nature and cause of present excursion
Acknowledgements: Prof. Yilu Liu (University of Tennessee, Knoxville)
#5: Multi-Dimensional Computations over Both Space and Time
Two existing GridSim apps can be combined in rich ways possible only with cloud computing
Hierarchical linear state estimation: rich coverage of (geographical) space At one snapshot in time Obvious extensions over more space with more PMUs
Oscillation monitoring Uses moving window of time (a few seconds typically) Over streaming data Produces a single number: damping factor Obvious parallel computations over different sets of
data with different time windows and algorithms
#5: Multi-Dimensional Computations over Both Space and Time (cont.)
Combination: provide rich set of two-dimensional (space, time) data to any desired location Enables extremely powerful new families of
applications operating coherently over both space and time
At each location: different time windows, different algorithms, different sets of data
If available, people would inevitably think of many uses for this data
Acknowledgements: Prof. Anjan Bose (WSU)
#6: Ultimate Scale: Tertiary Monitoring Centers
Balancing authorities (144 in North America) must have remote backup control centers Hot backups with same data and apps
TVA found great value in having a tertiary control center Limited to monitoring: control outputs
computed but not used Obvious candidates for the cloud But this is barely scratching the surface here…
#6: Ultimate Scale: Tertiary Monitoring Centers (cont.)
Major problem today: balancing authorities have almost no visibility anywhere in grid except for a few places in a few neighbors “Flying blind”, The Economist, 2004
Why not just share more? Data stored at another utility is problematic for
owner Storing in cloud could alleviate this
Only access a subset of data and/or derived info Access opened up when grid sufficiently stressed
#6: Ultimate Scale: Tertiary Monitoring Centers (cont.)
Above is static with default steady state Could also drill down on demand with
elastic computations Using higher-fidelity algorithms Using higher-resolution data
Acknowledgements: Russell Robertson (Grid Protection Alliance), for the TVA example (though not the cloud possibilities)
#7: Robust Adaptive Topology Control (RATC)
Use software to optimize grid topology switching as the control resource
Technology: use topology control to enhance operations and manage disruptions in grid
• Massively parallel computations to Detect, classify, and respond to grid
disturbances Ensure the grid maintains efficient
operations while guaranteeing reliability Acknowledgements: Prof. Mladen
Kezunivoc, Texas A&M University. Funded by the ARPA-E GENI program
Prosumer: An economically motivated power system participant that can consume, produce, store, or transport electricity Interact with other prosumers through
services – generation, consumption, storage, and transportationE.g. A utility prosumer aggregating
heterogeneous home user prosumers to provide consumption and storage services to a distribution ISO prosumer
Drastically increased data acquisition rates, autonomy, distributed control capability
#8: Prosumer-Based Distributed Autonomous
Cyber-Physical Architecture (cont.)
#8: Prosumer-Based Distributed Autonomous
Cyber-Physical Architecture (cont.) GridCloud commissions massive parallel computations
exploring huge permutation space Heterogeneous data aggregation for utility level
device management that accounts for instantaneous interoperability Home users can change their strategies (e.g. local
storage is not available) Scenario generators for prosumers at different level
(in scale) Data organization and processing
Acknowledgements: Prof. Santiago Grijalva (Georgia Institute of Technology, Georgia) Funded by ARPA-E GENI program
57
Famous baseball quote (Yogi Berra)“The future is hard to predict, if it hasn’t already happened.”
Future Vision
58
Future vision
System continously collects data from a wide range of data sources
Pulls it into an archive and also delivers it to real-time state estimation and “problem sensing” apps
Other applications run from time to time to assist in planning, system management, rapid response to problems that have arisen
Even the “use” of electricity is increasingly controlled and scheduled for optimization
59
Western privacy concerns
Smart electric power meters will know a great deal about the lives of the customers When they are home, how many are in the
house What time do they take showers? What movies do they watch? Do they drink coffee with breakfast? Make
toast? People do not trust the public utilities
with this data. Can we optimize decision making while
preserving data privacy?
60
GridCloud envisions a kind of“GooglePlex” for the Smart Grid
A complex mix of technical, security andeven social challenges Research can demonstrate feasibility but actual uptake
will depend on economics and political factors Prototype uses Isis2 in many ways (isis2.codeplex.com)
We’re trying to hide the technologies we use, such as Isis2, from users. Analogy: conductor of an orchestra Nonetheless, the style of power grid computing changes Even our open source model is new for this community