48
Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold Processors Chris H. Kim University of Minnesota, Minneapolis, MN [email protected] www.umn.edu/~chriskim/

Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Processors

Chris H. Kim

University of Minnesota, Minneapolis, MN

[email protected]/~chriskim/

Page 2: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

2

Scaling Challenges

Power wall Reliability wallVariability wall

2000 2010 2020

Year

Pow

er (W

)

Page 3: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

3

Overcoming the Power Wall

• Proven solutions: Multi-core chips, dynamic voltage frequency scaling, clock gating, power gating, …

Y=AxBY=AxBY=AxB

Freq=1Vdd=1Throughput=1Area=1Power=1Pwr Den=1

Freq=0.5Vdd=0.5Throughput=1Area=2Power=0.25Pwr Den=0.125

87%↓

Page 4: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

4

Overcoming the Variability WallVID

6+

-DAC+

-

PLimit

VDie

IIR

PCalc Calc

VConnectorA/D

A/D

Power SupplyMicro-Controller

RPackage

Package/Die

VConnectorVDie

RPackage

Intel Foxton Technology

• Proven solutions: Variation aware design, memory assist/repair, lithography techniques, adaptive systems

Page 5: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

5

Overcoming the Reliability Wall

• Possible solutions: Guardbanding, sensing and compensation, wear-leveling, failure resistant systems, …

Page 6: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

6

Outline• Device Reliability Issues

• Reliability Monitors and Measurements

• Reliability Effects in NTV Processors

• Summary

Page 7: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Aging in CMOS Transistors

7

Page 8: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

8

• Transistors are exposed to different stress conditions during normal digital circuit operation

HCI, BTI, and TDDB in Digital Logic

D Inverted Channel

ID Inverted Channel

Page 9: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Practical Solutions for Preventing Aging Related Failures

• BTI and HCI– Gradual decline in performance– Guard banding (static or dynamic), adjust Vmax– CAD, firmware & architecture level support essential

• TDDB– Single incident may lead to outright system failure– Can happen anywhere inside a chip– Improve fabrication procedure, adjust Vmax

• Bottom line: Precise measurement and understanding of circuit degradation a key aspect of robust design

9

Page 10: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Transistor Lifetime Estimation

• Extrapolate stress results with respect to:– Op. conditions based on acceleration models– Larger chip areas (e.g., Poisson scaling for TDDB)– Lower percentiles based on chosen distribution

10

real

sup

ply

volta

ge

Page 11: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Benefits of In-Situ Reliability Monitors over Device Probing

11

• Information from actual circuits (test circuit must be representative)

• High (timing) precision + short measurement interrupt

• No expensive equipment• Short test time and reduced test area• Measurements at use condition

allows realistic lifetime projection• Complements traditional probing

methods

Page 12: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Usage Scenarios and Design Issues of In-situ Reliability Monitors

• Usage scenario 1: Process characterization and yield improvement• Early technology characterization is often performed

before many metallization layers are being fabricated• Library cells may not be available (flip-flops, scan)• Device probing would still be a competitive solution

for extracting analog parameters such as I–V or C–V• Usage scenario 2: In-field monitoring and data

collection• Workload unknown• Simple circuits are practical but they have limited

capabilities• Firmware and architecture support needed

12

Page 13: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Usage Scenarios and Design Issues of In-situ Reliability Monitors

• Usage scenario 3: Sensor for real time aging compensation• Effectiveness versus overhead• Measurements are from a proxy circuit• Practical issues: type of sensor, temporal granularity,

spatial granularity, communication with sensors, interface and protocol

• Personally not a big fan

13

Page 14: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

14

Outline• Device Reliability Issues

• Monitors and Measurements

• Effects in NTV Processors

• Summary

Page 15: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Circuit Based Reliability Monitors (or Silicon Odometers)

15

Die Photo

Odometer Projects

Original Silicon

Odometer

Focused Reliability

Issues

Year 2007 2008 2009 2010 2011

130nm 65nm 65nm 65nm 32nmSOI

All-In-One Odometer

Process

Statistical, Duty-Cycle,

and RTN Odometer

Interconnect Odometer

PBTI and SRAM

Odometer

32nmSOI

SRAM and RTN

Odometer

2012

NBTI Induced

Frequency Degradation

Separately Monitoring

NBTI, HCI and TDDB

Statistical Behavior of

NBTI;RTN on Logic

Circuit

Impact of Interconnect on BTI and HCI Aging

Monitoring PBTI in HKMG

Process;BTI Impact on SRAM Read/

Write

SRAM Timing Issues Due to

BTI;RTN Impact

on Ring Oscillator

Page 16: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Beat Frequency Silicon Odometer

• Beat frequency of two free running ROSCs measured by DFF and edge detector

• Benefits of beat frequency detection system– Achieve ps resolution with μs measurement interrupt– Insensitive to common mode noise such as

temperature drifts– Fully digital, scan based interface, easy to implement

16

Page 17: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Beat Frequency Silicon Odometer

ref

stressbeat ref stress

stress

ref

• Sample stressed ROSC output with reference ROSC– 1% frequency difference before stress N=100– 2% frequency difference after stress N=50– Δf or ΔT sensing resolution is >0.01%

17

Page 18: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

18

ROSC Based Aging Sensor Comparison

Block Diagram

FunctionCount Stress ROSC

periods during externally controlled meas. time

Count Stress ROSC periods during N1 periods

of Ref. ROSC

Count Ref. ROSC periods during one period of

PC_OUT

Features Simple; compact Simple; immune to common mode variations

High resolution w/ short meas. time; immune to

common mode variations

Issues

Voltage and temp. varations; meas. time vs.

resolution tradeoff; requires absolute timing reference

(e.g. oscilloscope)

Meas. time vs. resolution tradeoff

Requires extra circuits (e.g., Phase Comp., edge

detector, etc...)

Meas. time for 0.01% max res. *

30 μs 30 μs

Meas. error wrt. common

mode variations **

+10.18% / -8.57% +0.06% / -0.07%+0.26% / -0.38%

*ROSC period = 3 ns ** simulated with +/- 4% ∆VCC

0.3 μs

System Single ROSC 2 ROSC, simple 2 ROSC, beat freq.

Page 19: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Separately Monitoring NBTI and PBTI

• PBTI becoming an important concern in high-k metal-gate

• Conventional Ring Oscillator (ROSC) can only provide overall frequency degradation information due to combined NBTI and PBTI effects

• New RO structure separates NBTI and PBTI effects

19

PBTI stress

NBTI stress

N/PBTI stressJ. Kim, et al., IBM, IRPS 2011

Page 20: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Separately Monitoring BTI and HCI

20

BTI_ROSC(BTI Stress Only)

DRIVE_ROSC(BTI & HCI Stress)

BTI_REF_ROSC

DRIVE_REF_ROSC

Beat Frequency Detection Circuit 1

SCANOUT

STRESSED

UNSTRESSED

SCANOUT Beat Frequency

Detection Circuit 2

BTI_ROSC

DRIVE_ROSC

Page 21: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Separately Monitoring BTI and HCI

• Backdriving action equalizes BTI in both BTI_ROSC and DRIVE_ROSC

• Negligible HCI in BTI_ROSC: only 3-5% of the switching current in the DRIVE_ROSC

• Fresh power gates are used for frequency measurements21

Page 22: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Temp. and Voltage Dependencies

• HCI slightly reduced with temperature– Due to reduced drain current

• Both mechanisms degrade with stress voltage– Point when HCI begins to dominate pushed out in

time by >1 order of magnitude at 1.8V vs. 2.4V

1.E-02

1.E-01

1.E+00

1.E+0 1.E+1 1.E+2 1.E+3 1.E+4 1.E+5

Freq

uenc

y Shi

ft (%

)

Stress Time (s)

250MHz stress freq.2.0V stress

30OC: HCIDEG , BTIDEG

120OC: HCIDEG , BTIDEG

1.E-02

1.E-01

1.E+00

1.E+01

1.E+00 1.E+02 1.E+04 1.E+06

Freq

uenc

y Sh

ift (%

)

Stress Time (s)

26OC470MHz stress freq.

2.4V stress

1.8V stress

22

Page 23: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Aging Issues in Interconnects

• Interconnect affects the voltage and current shapes– Increased transition time (decreased slew rate)– Increased current pulse; decreased current peak value

• BTI and HCI have different sensitivities to bias conditions

23

Page 24: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Interconnect Aging Monitor

24

• Serpentine wires for a dense chip implementation• Ground shielding on both sides for reducing noise

X. Wang, et al., IRPS 2012, TVLSI 2014

Page 25: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

• BTI aging decreases with interconnect length• HCI degradation peaks at L=500µm

BTI and HCI Aging: With Interconnect

25

Page 26: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

BTI Aging vs. Interconnect Length

• BTI induced frequency degradation decreases with longer interconnect

• Longer transition time shorter PMOS stress duration Less BTI aging

26

Page 27: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

HCI Aging vs. Interconnect Length

• HCI aging exhibits a non-monotonic behavior with respect to interconnect length– Current pulse width increases – Current peak decreases

27

Page 28: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Statistical Behavior of Aging

• Finite number and random spatial distribution of discrete charges NBTI & HCI variation

• Inversely proportional to AGATE worse with scaling• Small number of aging measurements not sufficient to

characterize aging28

Spread in ∆Vt increases with scaling CDF of ∆Vt at different stress timesS. Pae, et al., TDMR‘08 S. Rauch, TDMR, Dec. ‘07

Page 29: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Statistical Reliability Monitor• Need stressed &

reference ROSC frequencies to be close

• Difficult, costly to tune each stressed ROSC

• Use multiple ref. ROSCs with different frequencies

• Cover the frequency distribution of the stressed array

SCANOUTRESULTS

FSM+

ScanChain

Column Peripherals

Ref ROSC 3

Ref ROSC 1

Ref ROSC 2

3 Silicon Odometer Beat

Frequency Detection Systems

J. Keane, et al., IEDM 2010, JSSC 201129

Page 30: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

65nm Test Chip Data

• Fresh and post-stress ROSC frequency PDFs• No significant correlation of the frequency shift

with fresh frequency

Perc

enta

ge o

f RO

SCs

245 254 263 272DUT Frequency (MHz)

Fresh3.1hr Stress1.8V

2.0V

2.2V

30

0.011

-0.126

-0.112

20OC, DC; 2.2V, 2.0V, & 1.8V120 ROSCs each @ 11200s

Correl. Coef.

Page 31: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

SRAM Memory Design Challenges at Low Supply Voltages

• Ratio-ed operation leads to poor noise margin at low voltages for 6T SRAM cells

• Conflicting requirements: a stronger access transistor improves write margin but worsens read margin

BLBB

L

31

Page 32: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

• With BTI: Read stability degrades

• Cell recovers on a fail

32

Impact of BTI on SRAM Read

Bit

Failu

re R

ate

(BFR

, %)

Page 33: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

33

Impact of BTI on SRAM Write

• With BTI: Write stability improves or remains unchanged

• Cell recovers on a pass

Bit

Failu

re R

ate

(BFR

, %)

Page 34: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

34

Representative SRAM Reliability Macro

256x128b SRAM array

Deco

der

FSM

VMEAS, VSTRESS

128b scan reg.

Col. peripheralVCO

Peripheral supply

Off-chip DAQ

CLK

Array supply domain

Peripheral supply domain

WL

BL

Supply switches

• Represents a product SRAM sub-array• BIST function done by on-chip FSM with supply switches

P. Jain, et al., IEDM, 2012

0.1

1

100

Read

Fai

lure

Rat

e (%

)TSTRESS (s)

1 10 100 1000

10

TMEAS increased from 3µs to 2ms

32nm, 0.52V, 85°C

10x

Page 35: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

35

• Implemented on IBM’s z196 Enterprise systems for long term degradation under real-use conditions.

• Over 500 days worth of ring oscillator degradation data from customer systems

• Other companies have aging monitors too, but they tend not to publish their work

Aging Monitor in IBM MicroprocessorsPongfei Flu, Keith Jenkins, IBM, IRPS 2013

Page 36: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

36

• Time-zero problem: Some time will elapse between applying voltage (burn-in, test, operation) and making the first measurement time-zero frequency is completely unknown incorrect time slope of 0.42

• Use fitting parameters assuming Δf = A(t-to)n-Atn time slope of 0.172

Aging Monitor in IBM MicroprocessorsPongfei Flu, Keith Jenkins, IBM, IRPS 2013

Page 37: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

37

Design Considerations Examples of Practical Issues

BTI, HCI, TDDB, RTN, transient errors, memory bit failures, etc.Type of Sensor

Temporal Granularity Sensing period, threshold setting, dynamic range, etc.

Spatial Granularity Per CPU/GPU/memory, per functional unit, per sub-block, etc.

Stress and Measurement

ConditionAC vs. DC, accelerated vs. usage

condition, fast measurement

CommunicationBetween data gathering sensor,

across sensors, between sensors and processor

Interface and ProtocolInterrupt based, polling, event

alarms, performance counter based, etc.

Aging Sensor Implementation in IBM z196 Server [3]

Ring Oscillator based BTI monitor for long-term frequency degradation measurement

Sampling period: once a week

Total: 5 sensors per chip; One sensor per core (x4 cores) plus one sensor in L2 cache

AC stress, usage condition, 0.5ms measurement time

Sensors are integrated with IBM z196 pervasive infrastructure with firmware

support

Interrupt based in-field frequency degradation measurement

Testing and Calibration

Similar to any other on-chip monitor circuit

Time 0 frequency shift unknown since first sample is taken after some stress

Aging Monitor in IBM MicroprocessorsPongfei Flu, Keith Jenkins, IBM, IRPS 2013

Page 38: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

Outline• Device Reliability Issues

• Monitors and Measurements

• Effects in NTV Processors

• Summary

38

Page 39: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

DVFS Systems in ISSCC 2014

22nm Intel Haswell processor N. Kurd, et al., ISSCC, 2014

• Latest trends: On-chip distributed VRM (fast transients, supply noise suppression), per-core DVS, NTV/Turbo

22nm IBM POWER8 processorZ. Toprak-Deniz, et al., ISSCC, 2014

<1% area overhead

39

Page 40: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

40

Frequency Fluctuation in DVFS(BTI Example)

Time

V DD

Freq

uenc

y

Time

V DD

Freq

uenc

y

Time

V DD

Freq

uenc

y

• Constant VDD: Frequency degrades with stress • High VDD to low VDD: Freq. dips due to lower VDD followed

by recovery• Low VDD to high VDD: Freq. jumps and then degrades

Freq. dip

Freq. peak

Page 41: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

41

Time

V DD

Freq

uenc

y

Time

V DD

Freq

uenc

y

Time

V DD

Freq

uenc

y

fCLK

GuardbandfCLK

Guardband

fCLK

Guardband

• Constant VDD: Frequency degrades with stress • High VDD to low VDD: Freq. dips due to lower VDD followed

by recovery• Low VDD to high VDD: Freq. jumps and then degrades

Frequency Fluctuation in DVFS(BTI Example)

Page 42: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

42

Modeling Approach using Superposition• Rationale for empirical

superposition method– Complicated VDD trace

can be broken down into multiple pulses

– Suitable for long-and short-term, DC and AC

– Computation is more efficient, short runtime

Δf(%

)

ΔV T

ΔV T

V T2

ΔV T

1V D

D

V1

V2V3

V1

ΔVT1

V2ΔVT2

V3ΔVT3

ΔVT=ΔVT1+ΔVT2+ΔVT3

Time (a.u.)

Not captured in previous work

Time (a.u.)

C. Zhou, et al., IRPS 2014

Page 43: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

43

BTI Recovery Model using Superposition

• Stress model: tn (power law)• Recovery model derived from superposition property:

ΔVT,recovery(t) = tn-(t-t0)n

Page 44: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

44

Translating VT Shift to Delay Shift

• ROSC mimics logic path• Translate ΔVT to pull-up,

pull-down delayPull-down Delay Pull-up Delay

0.6 0.7 0.8 0.9 1.00

0.005

0.010

0.015

0.020

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Dealy

(normalized)Δ

V T (n

orm

aliz

ed)

VDD (normalized)

0.6 0.7 0.8 0.9 10

0.005

0.010

0.015

0.020

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1HSPICE, TT, 60°C

Dealy

(normalized)Δ

V T (n

orm

aliz

ed)

VDD (normalized)

HSPICE, TT, 60°C

Page 45: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

45

Android Development Board for Collecting DVFS Traces

• VDD and operating frequency collected in real time• Navigating websites, running benchmark applications

Linux kernel v 3.4.5Operating system Android v 4.2.2

Processor ARM Cortex A15

System Samsung Exynos 5410 SoC

Frequency 0.8 – 1.8 GHzVoltage 0.9 – 1.25 V

DVFS meas. National Instr. DAQ

Sampling frequency

1000 samples per second

Process 28nm

Page 46: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

46

Sample Waveform and Estimated Frequency Shift

0 1 2 3 4 5 6Time (s)

-0.8-0.6-0.4-0.2

0

Freq

uenc

y Sh

ift (%

)

-1.0-0.90%

high VDD stress

low VDD recovery

Amazon.com

0.60.8

11.2

V DD

(nor

mal

ized

)

• High VDD duration: Freq. degrades with time • Low VDD duration: Freq. shift dips and then recovers

Page 47: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

47

Applying Model to Other DVFS Traces

• Worst case frequency dip– 3D-raytrace: Δf=1.0% at t=6s when VDD drops by 29%

after staying in high VDD mode for 5.8s

Sina.com

Google.comNYTimes.comAmazon.com

Page 48: Circuit Reliability: Mechanisms, Monitors, and Effects in ...web.cse.ohio-state.edu/~teodorescu.1/workshops/... · Circuit Reliability: Mechanisms, Monitors, and Effects in Near-Threshold

48

Summary• Power wall (2000) Variability wall (2010)

Reliability wall (2020) • Example: NTV + RDF + BTI

• Aging sensor deployed for the first time in a commercial processor (IBM z systems)

• Per-Core DVFS with sub-microsecond ramp time becoming a standard feature in new processors

• Turbo boost + NTV: Best of both worlds in terms of power and performance, but presents new reliability challenges