1
Enabling Realistic Fine-Grain Voltage Scaling with Reconfigurable Power Distribution Networks Waclaw Godycki, Christopher Torng, Ivan Bukreyev Alyssa Apsel, Christopher Batten School of Electrical and Computer Engineering Cornell University 1 Abstract Recent work has shown that monolithic integration of voltage regulators will be feasible in the near future, enabling reduced system cost and the potential for fine-grain voltage scaling (FGVS). In this project, we use architecture-level modeling to explore a new dynamic voltage/frequency scaling controller called the fine-grain synchronization controller (FG-SYNC+). FG-SYNC+ enables improved performance and energy efficiency at similar average power for multithreaded applications with activity imbalance. We then use circuit-level modeling to explore various approaches to organizing on-chip voltage regulation, including a new approach called reconfigurable power distribution networks (RPDNs). RPDNs allow one regulator to “borrow” energy storage from regulators associated with underutilized cores resulting in improved area/power efficiency and faster response times. We evaluate FG-SYNC+ and RPDN using a vertically integrated research methodology, and our results demonstrate a 10–50% performance and 10–70% energy-efficiency improvement on the majority of the applications studied compared to no FGVS, yet RPDN uses 40% less area compared to a more traditional per-core regulation scheme. 2 Motivation Monolithic integration using a standard CMOS process provides a tremendous cost incentive for integrating closed-loop voltage regulators on the die. Recent technology trends suggest that it is now becoming feasible to integrate switching regulators on-chip (e.g., Intel Haswell), enabling reduced system cost as well as the potential for fine-grain voltage scaling (FGVS) to exploit fine-grain activity imbalance in multi-threaded applications for performance and energy efficiency benefits. Integrated Voltage Regulation in Intel Haswell Processors Haswell V LLC V DDQ Graphics Engine Ring Interconnect System Agent + PCIe + DMI DRAM Cntl V CCIN FIVR CPUs + Cache V CPU0 V RING V GPU V SA V IOA V CPU1 Figure from D. Kanter, MPR’13 Fine-Grain Activity Imbalance in Multi-Threaded Applications bfs rsort strsearch 0 7 0 7 0 7 Application Start Application End 90μs 7μs 2.3μs 3 Target System Our target system is an embedded processor composed of: eight in-order, single-issue, five-stage, RISC cores; private, coherent 16 KB instruction and data L1 caches; and a shared 512 KB unified L2 cache. We implemented the core and L1 memory system for this design in RTL and used a commercial standard-cell-based CAD toolflow targeting a TSMC 65 nm process to generate layout. Each core can run at 333 MHz at 1 V and the full eight-core system is approximately 6 mm 2 . Target System Layout 4 Types of Integrated Voltage Regulation The three primary types of step-down voltage regulators are linear regulators (e.g., LDOs), inductor-based switching regulators (e.g., buck), and capacitor-based switching regulators (e.g., switched-capacitor). 5 FGVS Architecture Design: FG-SYNC+ We explore a new FGVS controller called the fine-grain synchronization controller (FG-SYNC+) that exploits fine-grain scaling in level (i.e., many voltage levels), space (i.e., per-core regulation), and time (i.e., fast transition times between levels) to improve performance and energy efficiency while maintaining similar average power. FG-SYNC+ uses a thread library instrumented with hint instructions to inform the hardware about which cores are doing useful work vs. useless work (e.g., waiting for a task, waiting at a barrier). (d) rsort with 100 ns Response Time (b) splash2-lu with Eight Domains time (a) splash2-lu with Two Domains (c) rsort with 1000 ns Response Time time time time 7 Core 0 Exec. Time Reduced by 20% 7 Core 0 Super-Sprint Sprint Nominal Rest Busy Waiting 7 Core 0 7 Core 0 In these application activity plots illustrating FG-SYNC+, rows show controller decisions per-core and black strips above cores show when that core is active. We compare SPLASH-2 LU factorization with two vs. four voltage domains (a,b). We also illustrate the impact of slow voltage-settling response times over a small excerpt from radix sort (c,d). Fine-Grain Voltage Scaling with FG-SYNC+ We use three sensitivity studies to understand the implication of varying: (1) the number of voltage levels, (2) the number of voltage domains, (3) and voltage-settling response times. 0.7/1.0V 1.0/1.15V 1.0/1.33V 0.7/1.0/1.33V 0.7/1.0/1.15V 0.7/1.0/1.15/1.33V 2 Domains 4 Domains 8 Domains 1000 ns 100 ns 0 ns isopower isopower Increasing sprinting levels isopower Increasing number of domains Decreasing response time isopower Norm. Energy Efficiency 2.0 1.8 1.6 1.4 1.2 1.0 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.0 1.1 1.2 1.3 1.4 1.5 Speedup Speedup Norm. Energy Efficiency 2.0 1.8 1.6 1.4 1.2 1.0 0.8 1.0 1.1 1.2 1.3 1.4 1.5 0.4 0.6 0.8 1.0 1.2 1.4 2-Level Controllers 3-Level vs. 4-Level Controllers Different Numbers of Domains Different Response Times Speedup Speedup To exploit fine-grain activity imbalance, (1) at least three levels are required and four levels helps further; (2) more domains results in improved performance and energy efficiency; (3) response times of 100 ns or faster are required. 6 FGVS Circuit Design: RPDN Shown in (a), we use a single fixed-voltage regulator (SFVR) as a baseline to compare against more sophisticated regulation schemes. We choose a configuration that can provide 80% efficiency at 1 V with an area of 0.26 mm 2 (4% of the core/L1 area). Shown in (b), multiple adjustable voltage regulators (MAVR) enable fine-grain voltage scaling in space and level. The power efficiency vs. area plot in (c) shows how we choose a per-core regulator area of 0.08 mm 2 to allow efficient voltage regulation for super-sprint. Note that designing for super-sprint significantly over-provisions for rest, nominal, and sprint modes; also, only one or two cores will ever be using the super-sprint mode at any given time. Core 1 with Adjustable Regulator Vin AVR Control Unit Core Phase 0 Core 0 with Adjustable Regulator (b) MAVR Abstract Schematic Vin Flyback Cap Core 0 SFVR Control Unit S P P Vout = Vin/2 S Phase 0 Core 1 Core 7 (a) SFVR Abstract Schematic 80 75 70 65 60 55 0.02 0.04 0.06 0.08 Power Efficiency (%) Area (mm 2 ) Nominal Mode Super-Sprint Mode 0.08 mm 2 80% Efficiency at Super-Sprint Chosen Design Point (c) MAVR Efficiency vs. Area Reconfigurable Power Distribution Networks We propose a new approach called reconfigurable power distribution networks (RPDNs). As shown below, RPDNs include many small “unit cells” that each contain the flyback capacitance and regulator switches required for a SC regulator. These cells can be flexibly reconfigured through a switch fabric and combined with per-core control circuitry to effectively create multiple differently-sized SC regulators “on-demand” for cores. The inset shows how 16 unit cells can be allocated to four cores operating in four different modes. Vout Control Loop Core 0 Core 1 RPDN Fabric Control RPDN Switch Fabric D C B A Unit Cell X r S N Allocating 16 Cells The efficiency vs. power plot shown for a single core illustrates how lower-voltage modes require less area for high efficiency. A resting core only requires a single cell; RPDN would allow a super-sprinting core to “borrow” three cells from a resting core while maintaining high efficiency. Power Efficiency 10 20 30 40 50 60 70 0 60% 65% 70% 75% 80% 85% Rest (0.7V) Nominal (1.0V) Sprint (1.15V) Super-Sprint (1.33V) Output Power (mW) 7 Cells 6 Cells 5 Cells 4 Cells 3 Cells 2 Cells 1 Cell 100 150 200 250 300 350 400 1.4 1.3 1.2 1.1 1.0 0.9 0.8 0.7 Time (ns) Voltage (V) 120 ns 150 ns 1.35 1.25 1.15 1.05 0.95 0.85 0.75 0.25 1.0 1.75 2.5 3.25 4.0 Time (us) Voltage (V) 1390 ns 960 ns 2900 ns MAVR Transient Response RPDN Transient Response The RPDN architecture provides area savings of 40% over MAVR when supporting per-core supply regulation across the same number of cores. In addition to reducing area overhead, RPDN significantly reduces the voltage-settling response time. The waveforms shown here illustrate the difference in the transient responses for RPDN and MAVR when transitioning between modes. For RPDN, the response time for the nominal to super-sprint transition takes 150 ns while the same transition takes 2.9 μs with MAVR. 7 Evaluation Evaluation Methodology We use a vertically integrated evaluation methodology that uses a mix of circuit-, gate-, register-transfer-, and architectural-level modeling. Cross Compiler RISC ISA Sim RISC Binary C++ Multithreaded App Using Hints Cycle-Level Simulator Verilog RTL of Multicore RISC Processor Gate-Level Model Verilog Simulator Verilog Simulator Switching Activity Power Model Layout Synthesis P&R SPICE Circuit Models of PDNs SPICE Simulator Per-Core Stats Cycle-Level Multicore with DVFS Controller Instruction-Based Energy Dictionary System Performance and Energy Results DVFS Mode Transition Times Power Efficiencies for each PDN Evaluation Results We evaluate the performance, energy efficiency, and power of applications on our target system with each PDN. We use a 4-level, 8-domain FG-SYNC+ controller and account for realistic voltage-settling response times and regulator power efficiencies in each DVFS mode for varying load currents. 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 Norm. Energy Efficiency MAVR RPDN 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 Speedup isopower leak cores regulator efficiency overhead transition System-level Power Breakdowns for SFVR, MAVR, RPDN bfs bilateral dither kmeans mriq pbbs-dr pbbs-knn pbbs-mm rsort splash2-fft splash2-lu strsearch viterbi 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 SMR SMR SMR SMR SMR SMR SMR SMR SMR SMR SMR SMR SMR Normalized Power Our promising results suggest that there is an important opportunity for architecture and circuit co-design of integrated voltage regulation in future systems. 8 Acknowledgments This work was supported in part by NSF CAREER Award #1149464, a Spork Fellowship, and donations from Intel Corporation and Synopsys, Inc. The authors acknowledge and thank Ji Kim for his help in developing the instruction-based energy model, and Derek Lockhart, Shreesha Srinath, and Pol Rosello for their help in writing multicore application kernels. Publication: Appears in the Proceedings of the 47th Int’l Symp. on Microarchitecture (MICRO-47), Dec. 2014. URL: http://www.csl.cornell.edu/~cbatten/pdfs/godycki-rpdn-micro2014.pdf Contact Author: Christopher Torng, 364 Upson Hall, Ithaca, NY 14853, [email protected]

Core 1 with Adjustable Regulator ... - Cornell University

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Core 1 with Adjustable Regulator ... - Cornell University

Enabling Realistic Fine-Grain Voltage Scalingwith Reconfigurable Power Distribution Networks

Waclaw Godycki, Christopher Torng, Ivan BukreyevAlyssa Apsel, Christopher Batten

School of Electrical and Computer EngineeringCornell University

1 Abstract

Recent work has shown that monolithic integration of voltage regulators will befeasible in the near future, enabling reduced system cost and the potential forfine-grain voltage scaling (FGVS). In this project, we use architecture-levelmodeling to explore a new dynamic voltage/frequency scaling controller called thefine-grain synchronization controller (FG-SYNC+). FG-SYNC+ enables improvedperformance and energy efficiency at similar average power for multithreadedapplications with activity imbalance. We then use circuit-level modeling to explorevarious approaches to organizing on-chip voltage regulation, including a newapproach called reconfigurable power distribution networks (RPDNs). RPDNs allowone regulator to “borrow” energy storage from regulators associated withunderutilized cores resulting in improved area/power efficiency and fasterresponse times. We evaluate FG-SYNC+ and RPDN using a vertically integratedresearch methodology, and our results demonstrate a 10–50% performance and10–70% energy-efficiency improvement on the majority of the applications studiedcompared to no FGVS, yet RPDN uses 40% less area compared to a more traditionalper-core regulation scheme.

2 Motivation

Monolithic integration using a standard CMOS process provides a tremendous costincentive for integrating closed-loop voltage regulators on the die. Recent technologytrends suggest that it is now becoming feasible to integrate switching regulatorson-chip (e.g., Intel Haswell), enabling reduced system cost as well as the potentialfor fine-grain voltage scaling (FGVS) to exploit fine-grain activity imbalance inmulti-threaded applications for performance and energy efficiency benefits.

Integrated Voltage Regulation inIntel Haswell Processors

HASWELL’S FIVR EXTENDS BATTERY LIFE First Processor With Integrated Voltage Regulator Won’t Be the Last

By David Kanter (July 29, 2013)

...................................................................................................................

July 2013 © Microprocessor Report

The PC ecosystem has lagged phones and tablets in power management. For many years, voltage regulators (VRs) for PC processors have been a staid and boring af-fair, delivering relatively incremental improvements. Tra-ditional VRs are bulky and consume high-value mother-board real estate close to the processor, but they operate at very low frequencies (e.g., 1MHz) compared with modern 1–3GHz processors.

Intel’s introduction of fully integrated voltage regula-tors (FIVRs) in its newest Core processor, code-named Haswell, is fundamentally changing the power-delivery land-scape. This innovation is the technical underpinning for new platform-level power management that adopts and in some aspects exceeds the capabilities of mobile systems, radically reducing average power consumption for the sys-tem and improving responsiveness. The result is one of the greatest increases in mobile-PC battery life ever, as evi-denced in the initial Haswell-based notebooks that are now shipping.

Specifically, the FIVR’s low latency enables more-agile power management that can quickly turn platform components on or off, enabling the system to spend most of its time in a deep-sleep state. The OS and applications, however, remain alive and connected to the network, and the system swiftly responds to external stimuli. For exam-ple, in Windows 8 Connected Standby mode, Intel claims the Haswell platform reduces power consumption by 20x compared with Sandy Bridge, while still meeting the 300ms wake-up requirement.

Looking beyond the PC, the implications are signifi-cant across a wide range of markets. Properly designed IP blocks such as CPUs, GPUs, FPGA fabrics, wireless modems, and caches can all take advantage of a FIVR. Ap-plications from smartphones to servers all benefit from

improvements in power efficiency. Although Intel is the first to bring FIVRs to market, others will eventually follow.

Tiny Capacitors and Inductors The FIVR is tightly coupled to Haswell, using both on-die and package-level integration. It complements an external voltage-regulator module (VRM) to deliver power to the processor. The FIVR’s main input voltage is supplied by the VRM at 1.8V (nominal) but is dynamically adjusted via serial voltage identification (SVID) that ranges between 1.6V and 2.3V. The output voltages that the FIVR supplies to the different die regions vary from about 0.65V to 2.0V.

Conceptually, the VRM is optimized for high con-version efficiency and acts as a slow and coarse-grained

Figure 1. Block diagram of Haswell’s power distribution. The FIVR converts a single voltage, VCCIN (1.8V nominal), into several independent voltage rails, each of which varies dynamically according to the needs of the circuit.

Haswell

VLLC

VDDQ

GraphicsEngine

Rin

g In

terc

onne

ct

SystemAgent

+PCIe

+DMI

DRAM CntlVCCIN

FIV

R

CPUs+

Cache

VCPU0

VRING

VGPU

VSA

VIOA

VCPU1……

Figure from D. Kanter, MPR’13

Fine-Grain Activity Imbalance inMulti-Threaded Applications

bfs

rsort

str

sea

rch

0

7

0

7

0

7

Application Start Application End

90μs

7μs

2.3μs

3 Target System

Our target system is an embedded processor composedof: eight in-order, single-issue, five-stage, RISC cores;private, coherent 16 KB instruction and data L1 caches;and a shared 512 KB unified L2 cache. We implementedthe core and L1 memory system for this design in RTL andused a commercial standard-cell-based CAD toolflowtargeting a TSMC 65 nm process to generate layout. Eachcore can run at 333 MHz at 1 V and the full eight-coresystem is approximately 6 mm2. Target System Layout

4 Types of Integrated Voltage Regulation

The three primary types of step-down voltage regulators are linear regulators (e.g.,LDOs), inductor-based switching regulators (e.g., buck), and capacitor-basedswitching regulators (e.g., switched-capacitor).

5 FGVS Architecture Design: FG-SYNC+

We explore a new FGVS controller called the fine-grain synchronization controller(FG-SYNC+) that exploits fine-grain scaling in level (i.e., many voltage levels), space(i.e., per-core regulation), and time (i.e., fast transition times between levels) toimprove performance and energy efficiency while maintaining similar average power.FG-SYNC+ uses a thread library instrumented with hint instructions to inform thehardware about which cores are doing useful work vs. useless work (e.g., waiting fora task, waiting at a barrier).

(d) rsort with 100 ns Response Time(b) splash2-lu with Eight Domains

time

(a) splash2-lu with Two Domains (c) rsort with 1000 ns Response Time

time

time time7

Core

0

Exec.Time

Reducedby 20%

7

Core

0

Super-SprintSprintNominalRestBusy Waiting

7

Core

0

7

Core

0

In these application activity plots illustrating FG-SYNC+, rows show controllerdecisions per-core and black strips above cores show when that core is active. Wecompare SPLASH-2 LU factorization with two vs. four voltage domains (a,b). We alsoillustrate the impact of slow voltage-settling response times over a small excerpt fromradix sort (c,d).

Fine-Grain Voltage Scaling with FG-SYNC+

We use three sensitivity studies to understand the implication of varying: (1) thenumber of voltage levels, (2) the number of voltage domains, (3) and voltage-settlingresponse times.

0.7/1.0V 1.0/1.15V

1.0/1.33V

0.7/1.0/1.33V0.7/1.0/1.15V

0.7/1.0/1.15/1.33V

2 Domains 4 Domains

8 Domains

1000 ns 100 ns

0 ns

isopower

isopower

Increasing

sprinting

levels

isopower

Increasing

number of

domains

Decreasing

response

time

isopower

Norm

. E

nerg

y E

ffic

ien

cy 2.0

1.8

1.6

1.4

1.2

1.0

0.80.9 1.0 1.1 1.2 1.3 1.4 1.5 1.0 1.1 1.2 1.3 1.4 1.5

Speedup Speedup

Norm

. E

nerg

y E

ffic

ien

cy 2.0

1.8

1.6

1.4

1.2

1.0

0.81.0 1.1 1.2 1.3 1.4 1.5 0.4 0.6 0.8 1.0 1.2 1.4

2-Level Controllers 3-Level vs. 4-Level Controllers

Different Numbers of Domains Different Response Times

Speedup Speedup

To exploit fine-grain activity imbalance, (1) at least three levels are required and fourlevels helps further; (2) more domains results in improved performance and energyefficiency; (3) response times of 100 ns or faster are required.

6 FGVS Circuit Design: RPDN

Shown in (a), we use a single fixed-voltage regulator (SFVR) as a baseline tocompare against more sophisticated regulation schemes. We choose a configurationthat can provide 80% efficiency at 1 V with an area of 0.26 mm2 (4% of the core/L1area). Shown in (b), multiple adjustable voltage regulators (MAVR) enablefine-grain voltage scaling in space and level. The power efficiency vs. area plot in (c)shows how we choose a per-core regulator area of 0.08 mm2 to allow efficient voltageregulation for super-sprint. Note that designing for super-sprint significantlyover-provisions for rest, nominal, and sprint modes; also, only one or two cores willever be using the super-sprint mode at any given time.

Core 1 with Adjustable Regulator

Vin

AVRControl

Unit

Core

Phase 0

Core 0 with Adjustable Regulator

(b) MAVR Abstract Schematic

Vin Flyback Cap

Core 0

SFVRControl

Unit

SP P

Vout = Vin/2

S

Phase 0

Core 1 Core 7

(a) SFVR Abstract Schematic

80

75

70

65

60

55

0.02 0.04 0.06 0.08

Pow

er

Effic

ien

cy (

%)

Area (mm2)

NominalMode

Super-SprintMode

0.08 mm2

80% Efficiencyat Super-Sprint

Chosen Design Point

(c) MAVR Efficiency vs. Area

Reconfigurable Power Distribution Networks

We propose a new approach called reconfigurable power distribution networks(RPDNs). As shown below, RPDNs include many small “unit cells” that each containthe flyback capacitance and regulator switches required for a SC regulator. Thesecells can be flexibly reconfigured through a switch fabric and combined with per-corecontrol circuitry to effectively create multiple differently-sized SC regulators“on-demand” for cores. The inset shows how 16 unit cells can be allocated to fourcores operating in four different modes.

VoutControlLoop

Core 0 Core 1

RPDNFabricControl

RPDN Switch Fabric

DCBAUnitCell

X

r

S N

Allocating 16 Cells

The efficiency vs. powerplot shown for a singlecore illustrates howlower-voltage modesrequire less area for highefficiency. A resting coreonly requires a single cell;RPDN would allow asuper-sprinting core to“borrow” three cells from aresting core whilemaintaining high efficiency.

Pow

er

Effic

ien

cy

10 20 30 40 50 60 700

60%

65%

70%

75%

80%

85%Rest

(0.7V)

Nominal

(1.0V)

Sprint

(1.15V)

Super-Sprint

(1.33V)

Output Power (mW)

7 Cells

6

Cells

5

Cells

4

Cells

3

Cells

2

Cells

1

Cell

100 150 200 250 300 350 400

1.4

1.3

1.2

1.1

1.0

0.9

0.8

0.7

Time (ns)

Volta

ge (

V)

120 ns

150 ns

1.35

1.25

1.15

1.05

0.95

0.85

0.75

0.25 1.0 1.75 2.5 3.25 4.0Time (us)

Volta

ge (

V)

1390 ns 960 ns

2900 ns

MAVR Transient Response

RPDN Transient Response

The RPDN architecture provides areasavings of 40% over MAVR whensupporting per-core supply regulationacross the same number of cores. Inaddition to reducing area overhead,RPDN significantly reduces thevoltage-settling response time. Thewaveforms shown here illustrate thedifference in the transient responses forRPDN and MAVR when transitioningbetween modes. For RPDN, theresponse time for the nominal tosuper-sprint transition takes 150 ns whilethe same transition takes 2.9 µs withMAVR.

7 Evaluation

Evaluation Methodology

We use a vertically integrated evaluation methodology that uses a mix of circuit-,gate-, register-transfer-, and architectural-level modeling.

CrossCompiler

RISCISA Sim

RISC Binary

C++ MultithreadedApp Using Hints

Cycle-LevelSimulator

Verilog RTL of MulticoreRISC Processor

Gate-Level Model

VerilogSimulator

VerilogSimulator

SwitchingActivity

Power Model

Layout

SynthesisP&R

SPICE CircuitModels of PDNs

SPICESimulator

Per-Core Stats

Cycle-Level Multicorewith DVFS Controller

Instruction-BasedEnergy Dictionary System

Performanceand Energy

Results

DVFS ModeTransition

Times

Power Efficienciesfor each PDN

Evaluation Results

We evaluate the performance, energyefficiency, and power of applications onour target system with each PDN. We usea 4-level, 8-domain FG-SYNC+ controllerand account for realistic voltage-settlingresponse times and regulator powerefficiencies in each DVFS mode forvarying load currents. 1.0

1.1

1.2

1.3

1.4

1.5

1.6

1.7

Norm

. E

nerg

y E

ffic

ien

cy

MAVR RPDN

0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5Speedup

isopower

leak cores regulator efficiency overhead transition

System-level Power Breakdowns for SFVR, MAVR, RPDN

bfs

bilater

aldith

er

kmeans

mriq

pbbs-dr

pbbs-knn

pbbs-mm

rsort

splash2-ff

t

splash2-lu

strsea

rchvite

rbi

0.00.10.20.30.40.50.60.70.80.91.0

SMR SMR SMR SMR SMR SMR SMR SMR SMR SMR SMR SMR SMR

Norm

aliz

ed

Pow

er

Our promising results suggest that there is an important opportunity for architectureand circuit co-design of integrated voltage regulation in future systems.

8 Acknowledgments

This work was supported in part by NSF CAREER Award #1149464, a SporkFellowship, and donations from Intel Corporation and Synopsys, Inc. The authorsacknowledge and thank Ji Kim for his help in developing the instruction-based energymodel, and Derek Lockhart, Shreesha Srinath, and Pol Rosello for their help in writingmulticore application kernels.

Publication: Appears in the Proceedings of the 47th Int’l Symp. on Microarchitecture (MICRO-47), Dec. 2014. URL: http://www.csl.cornell.edu/~cbatten/pdfs/godycki-rpdn-micro2014.pdf Contact Author: Christopher Torng, 364 Upson Hall, Ithaca, NY 14853, [email protected]