52
Next-Generation Power-Aware Design Prof. Takayasu Sakurai Institute of Industrial Science, University of Tokyo E-mail:[email protected] ISLPED’08 Aug. 13, 16:30-17:30, Bangalore, India

Next-Generation Power-Aware Design - ISLPED · T.Sakurai Next-Generation Power-Aware Design Prof. Takayasu Sakurai Institute of Industrial Science, University of Tokyo E-mail:[email protected]

Embed Size (px)

Citation preview

T.Sakurai

Next-Generation Power-Aware Design

Prof. Takayasu Sakurai

Institute of Industrial Science,University of Tokyo

E-mail:[email protected]

ISLPED’08 Aug. 13, 16:30-17:30, Bangalore, India

T.Sakurai

We can help in two ways for “Cool Earth”

Green of IT

Green by IT

T.Sakurai

Next-Generation Power-Aware Design

3D integration

Deep sub-volt design

Organic integrated circuits (Green by IT)

T.Sakurai

Power distribution is diverse

Clock

ASSP1

LogicMemory

I/O

ASSP2

Clock

Logic

MemoryI/O

MPU1 Clock

Logic

MemoryI/O

MPU2Clock

Logic

Memory

I/O

T.Sakurai

System-on-a-Chip reduces I/O power but…

Separatechips 891mW

240mW

DRAMLogic & memory

DRAM on a chip

Power 16MbitDRAM

Speech codec

Multiplexer

MPEG-4 VideoCodec

HostI/F

DRAM

I/F

PLLCamI/F

DisplayI/F

Pre-filter

VTVT

VT VT

MPEG4 codec

DRAM - logic interface

70% power reduction by DRAM embedding but expensive.

T.Sakurai

3D achieves low power and high performance

2D assembly

3-D SiP

More devicesin closer vicinity

↓Reducing R and C

↓Lower power

Higher performance

Substrate< 20µm thick

8 times more devicesin 1mm distance

with reasonable cost

T.Sakurai

Stacked processor & cache by processor companies

Heat sink from back of processor

Processor, 1TFLOPS at 98W22 mm x 13.75 mm80 cores, face down Each unit is core + router

Stacked memory256KB SRAM per core4x C4 bump density8490 through-Si vias

Package

T.Sakurai

New contender – wireless link

Capacitive-Coupling LinkU. Tokyo and Keio U., (ISSCC’03)

Inductive-Coupling LinkKeio U. and U. Tokyo, (ISSCC’04)

Metal Electrode Metal Coil

Capacitive and inductive-coupling linksWireless data links between stacked chipsNo need for additional wafer process low cost No need for ESD protection circuit

high speed + low power

T.Sakurai

How were we and how are we?

VR

E

Electrode

IT

VR

Coil

+

H

-

VT

IT

+- VR

M

VT

VR

CC

VR=VTCC

CSUB

CC+CSUB

VR=MdITdt

0

2.5

5

7.5

10

12.5

0 10 20 30 40 50

Dat

a R

ate

[Gb/

s]

Communication Distance [μm]

[1]UCLA

[2]

[3]

[4]

[5][6]Sun

[7][8]

[9]

[10]

[11]

[12][12] [12]

CapacitiveInductive

This Work

N.Miura, Y.Kohama, Y.Sugimori, H.Ishikuro, T.Sakurai, and T.Kuroda, "An 11Gb/s Inductive-Coupling Link with Burst Transmission," ISSCC'08, pp.298-299, Feb.2008.

T.Sakurai

Inductive Wireless SuperconnectConnecting multiple chips

D. Mizoguchi, Y. Yusof, N. Miura, T. Sakurai, T. Kuroda, "A 1.2Gb/s/pin Wireless Superconnectbased on Inductive Inter-chip Signaling (IIS) ,”ISSCC’04, pp. 142-143, Feb. 2004. N. Miura, D. Mizoguchi, Y. B. Yusof, T. Sakurai, and T. Kuroda, “Analysis and Design of Transceiver Circuit and Inductor Layout for Inductive Inter-chip Wireless Superconnect,”Symp. on VLSI Circuits, pp. 246-249 June 2004.

Metal Inductor

Chip#1(Logic)

Chip#4

Chip#2(Memory)

Tx

RxChip#3

Tx

RxChip#1

Tx

RxChip#2

VR

VR

ITIT

Chip#3(Memory)

Clock & Power

Analog

T.Sakurai

120µm coils couple through stacked chipsTransmitter (Top Chip)

Top Chip(40,25,10µm-Thick)

Bottom ChipReceiver (Bottom Chip)

120µm

Tx

Rx

Fabricatedin 180nm CMOS

Distance=45,30,15μm

Voltage supply by bonding does not increase power nor decrease speed.

T.Sakurai

Just talk without waiting for clock - fast

Simulated in 180nm CMOS

Keio U. / U. Tokyo(ISSCC’07) Proposed Transceiver

CircuitTopology

Data Rate

Latency

Energy/bit

1Gb/s 11Gb/s

600ps 36ps

0.4pJ/b 1.4pJ/b

+-IT

H

VR

Rxdata

Txdata

+-IT

H

VR

Rxdata

TxdataPulse

Generator

Rxclk

TxclkH-Bridge

ComparatorTiming

Controller

Slow

Long Latency

Synchronous front-end Asynchronous front-end

T.Sakurai

High-Speed Inductive-Coupling Link

VRVB

+ -

Rxdata

IT

TxdataTxdata

RxdataReceiver

Transmitter

Txda

ta[V

]V R

[mV]

1.5

Time [ns]

I T[m

A]

Rxd

ata

[V]

0 0.5 1 1.5

-1.5

100

-100

0.5

-0.5

4

-4

0

0

0

0

T.Sakurai

11Gps over 15µm Measured

10-14

10-12

10-10

10-8

10-6

10-4

10-2

100

6 7 8 9 10 11 12

BER

Data Rate [Gb/s]

X=15

µm

X=30

µm

X=45

µm

3 4 5

120µm

Tx

Rx

Communication Distance, X

Inductive-Coupling Link

T.Sakurai

Lower power solution in scaled CMOS

Parallel

Burst

*Multiple Use of Data Links with 400MHz System Clock

Area Reduction byBurst Transmission

Data Rate

Energy

Area

180nm CMOS(Measured)

90nm CMOS(Simulated)

11Gb/s 30Gb/s

0.11pJ/b ~0.03pJ/b @ 45nm

0.3mm2

(16Links for 6.4Gb/s)0.96mm2

(52Links for20.8Gb/s)0.1mm2

(2Links for 6.4Gb/s)0.08mm2

(2Links for 20.8Gb/s)Burst

1.4pJ/b

Link

Burst 6.4Gb/s 20.8Gb/s

Link

Burst

Parallel*

11.2pJ/b52.6pJ/b

1/3 1/12

For TSV connection: E= PD =ƒCV2/ƒ =CV2 ~C (V~1) = 0.03pJ/b (w/ ESD)

T.Sakurai

L-coupled link: low power contenderEn

ergy

per

bit

tran

sfer

[m

W/G

b/s=

pJ/b

]

N.Miura, H.Ishikuro, T.Sakurai, T.Kuroda, "A 0.14pJ/b Inductive-Coupling Inter-Chip Data Transceiver with Digitally-Controlled Precise Pulse Shaping," Paper#20.2, ISSCC, Feb.2007.

Drastically lowering power consumption of I/O’s

1

10

100

1000

’96Year

’98 ’00 ’02 ’04 ’06

Toshiba

NTT

Hitachi

NEC

NECIntel

TINEC

Our Work

Our Work

Our Work

TeraChip

RambusFlexIO Current

products

’07 ’08 ’09 ’10

0.1

0.01

Our Work

Estimate @45nm

T.Sakurai

EMI? Near Field and Far Field

50G

50M

50k

1μm 1mm 1m 1kmCommunication Distance: x

Sign

al F

requ

ency

: f

[Hz]

RFID (135kHz)

RFID (13.56MHz)

mm-Wave

Cellular

FM

RFID(2.4MHz)

WLAN

f = c/(2πx), x=λ/2πNear Field (Reactive)

Far Field (Radiative)

■Chip-link in SiP

H.Ishikuro, N. Miura, T. Kuroda, “ Wideband Inductive-coupling Interface for High-performance Portable System,” CICC '07, pp.13-20, Sept. 2007.

T.Sakurai

It’s not paradise: remaining issues for 3D SiPKGD (Known Good Die)

At-speed testing of wafer, Wafer burn-in, Huge pin counts

Heat removal and inspection of contactsHeat estimate, Testing by X-ray and ultrasonic

InterposerSecure power distribution circuits, RLC testing

Design environmentsEMC, Noise, Heat, 3D modeling, Simulation

StandardizationProtocol, Electrical, Physical, Testing, Logistics,Legal issues, 3D data handling

T.Sakurai

3D SiP house

Foundry A

Foundry B

Foundry C

TSV /wireless KGD test

KGD

KGD test

Interposer, Assembly & test

3D SiP house

TSV /wireless

TSV /wireless

3D SiP to system customer

Foundry may provide TSV service.

TSV by 3D SiP house is based on “Via-last”

T.Sakurai

Interposer to ensure design freedom

Re-distribution layer to adjust TSV location and material discrepancy among dies.

Foundry AFoundry A

Foundry A or B

Assembly-specific

TSV’ed memory can be considered as one LSI product.

May experience separate shrink and multi-vendor supply.

Silicon/glass or organic depends on design rule (~10µm) and cost trade-off

T.Sakurai

Future power-aware 3D integration

Specialized blocksSensors and information collecting circuits

Proc. unit

Stacked analog/RF,HV, sensors, MEMS…

Power unit

Stacked memories

HV Power grid

Power control proc.

T. Sakurai, “Low Power Digital Circuit Design (Keynote),” ESSCIRC‘04, pp.11-18, Sept. 2004. T.Sakurai, "Moore's Law Plus (Keynote)" VMIC, Oct. 2005.

T.Sakurai

Next-Generation Power-Aware Design

3D integration

Deep sub-volt design

Organic integrated circuits (Green by IT)

T.Sakurai

Ultra-low voltage domain

Nor

mal

ized

del

ay &

pow

er

VDD [V]0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 110-5

10-4

10-3

10-2

10-1

100

00.10.20.30.40.50.60.70.80.91

Delay

Power

PD product

Simulation(fitted to measurement)

Nor

mal

ized

PD

pro

duct

T.Sakurai

Ring Oscillators to enable VDDmin Measurement

Ring Oscillator Output Buffer

VDD2

VSS2

Low swing

0V

1V 1V swingVDD2

VSS2(manually tuned)

VDD

VSS

VDD-VSS of output buffer is separated from VDD2-VSS2 of ring oscillator, so that small swing output signal can be measured.NMOS/PMOS body bias voltage of ring oscillators can be tuned independently.

T.Sakurai

Fabricated Ring Oscillators (RO’s)

90nm CMOS process, based on standard cell libraryThree RO’s (11-stage, 101-stage and 1001-stage)More stages up to million gates in subsequent tapeout 1m

m

400µm

MicrographLayout

11-stage

101-stage

1001-stage

T. Niiyama, P. Zhe, K.Ishida, M. Murakata, M. Takamiya, and T. Sakurai, "Dependence of Minimum Operating Voltage (VDDmin) on Block Size of 90-nm CMOS Ring Oscillators and Its Implications in Low Power DFM, ISQED, March, 2008.

T.Sakurai

VDD

VOUT

VDDmin

VOUT

0 20 40 60 80 1000

50

100

150

200

Volta

ge (m

V)

Time (µs) Simulation

VDDmin simulation

T.Sakurai

VDD Dependence of Oscillation Frequency

0.0 0.2 0.4 0.6 0.8 1.010k

100k

1M

10M

100M

1G

10G

Osc

illat

ion

freq

uenc

y (H

z)

VDD (V)

11-stage

1001-stage

VDDmin

13 dies

VDDmin is defined as the supply voltage when the RO’s stop oscillation and no voltage transition from the output buffer are observed.VDDmin of 11-stage ring oscillator is lower than that of 1001-stage ring oscillator.

T.Sakurai

Measured Die-to-Die (D2D) variations

Frequency variations increase with reduced VDD.

11-stage

13 dies

Freq

uenc

y va

riatio

ns(=

σ/ a

vera

ge) (

%)

VDD [V]0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.00

5

10

15

1001-stage ring oscillatorsVDDmin

( )

pdpd TH

TH

DDpd

DD TH

pdTH

pd DD TH

pdTH

pd DD TH

dtt V

dVCVt

V Vt

Vt V V

tf Vf t V V

α

Δ = Δ

∝−

Δ α→ ∝ Δ

ΔΔ α→ ∝ − ∝ Δ

T.Sakurai

Measured WID VTH variation of 4000 MOSFET array

The spatial spectrum doesn’t show distinctive peaks at particular spatial frequencies, which indicates that the intra-die VTH variations are not systematic but purely random across 4mm.

D. Levacq, T. Minakawa, M. Takamiya, and T. Sakurai, "A Wide Range Spatial Frequency Analysis of Intra-Die Variations with 4-mm 4000 x 1 Transistor Arrays in 90nm CMOS," CICC, pp.257-260, Sept. 2007.

Spatial frequency (1/µm)0.001 0.01 0.1

0

1

10

100

1000 Transistor

Array

4mm

4mm

(400

0 tr

ansi

stor

s)Fourier transform of VTH

Same 90nm CMOS process

Spec

trum

of V

TH(A

.U.)

T.Sakurai

V1

V2

V3

V4

V5

V6

V7

V8

V9

V10

V11 VINV

V1~V11

VDD=85mV1 2 3 10 11

V1 V2 V3 V10 V11

1 2 3 4 5 6 7 8 9 10 110

10

20

30

40

50

60

70

80

V IN

Vof

inve

rter

V1

~ V 1

1(m

V)

Inverter number

85

FailVOUT_LOW_7 > VINV_8

What’s happening at VDDmin

T.Sakurai

Adaptive body bias to reduce VDDmin

Body bias control

VDDmin=89 mV (Initial)

VDDmin=87 mV

1 2 3 4 5 6 7 8 9 10 111 2 3 4 5 6 7 8 9 10 11

1 2 3 4 5 6 7 8 9 10 111 2 3 4 5 6 7 8 9 10 11

Simulated

The body bias of pMOS is adaptively controlled to minimize VDDmin and the body bias of nMOS is fixed. When a common body bias is applied to the 11 inverters, VDDmin improvement is only 2mV.

T.Sakurai

Fine-grained body bias to reduce VDDmin

VDDmin=85 mV

VDDmin=43 mV

Vb1 Vb2 Vb3 Vb4 Vb5 Vb6

1 2 3 4 5 6 7 8 9 10 11

Vb1 Vb2 Vb3 Vb4 Vb5 Vb6

1 2 3 4 5 6 7 8 9 10 11

Vb1Vb2

Vb3Vb4

Vb5Vb6

Vb7Vb8

Vb9Vb10

Vb11

1 2 3 4 5 6 7 8 9 10 11

Vb1Vb2

Vb3Vb4

Vb5Vb6

Vb7Vb8

Vb9Vb10

Vb11

1 2 3 4 5 6 7 8 9 10 111 2 3 4 5 6 7 8 9 10 11

Simulated

When independent body bias is applied for every 2 inverters, VDDminimprovement is only 4mV.When inverter-by-inverter body bias is applied, VDDmin is drastically reduced to 43mV. But it is impractical.

T.Sakurai

Prob

abili

ty d

istr

ibut

ion

func

tion

Variable: x

( )( )

2

2

2xx

e21xf σ

−−

σπ=

3

2

1

0σ− 2x σ−x x σ+x σ+ 2x σ+ 3x σ+ 4x σ+ 5x σ+ 6x σ+ 7x

1K

1M

1G

Worst-case distribution

( )22

ny x

x2d 1 e dy

dx 2

−−

σ−∞

⎛ ⎞⎡ ⎤⎜ ⎟⎢ ⎥⎜ ⎟⎢ ⎥πσ⎜ ⎟⎣ ⎦⎝ ⎠∫

k

10

peak ~ x 2 k (if n 10 )

~ x 2 log n

+ σ =

+ σ

101

n=0

Largest value distribution of n variables, each following Gaussian distribution

T.Sakurai “CMOS VLSI design,” 1989, Baifukan, ISBN4-563-03450-9 C3055

102

103104

106

109

105

107108

1010

( ) ( )k

1.4 1.410

1 1SD ~ (if n 10 )k 1log n 1

σ = σ =++

T.Sakurai

Sim075

Sim150

x4 inverter RO

10 100 1k 10k 100k 1M0

100

200

300

400A

vera

ge V

DD

min

(mV)

n: Number of stages

Inverter ROSim125Sim100

VDDmin of million-stage ring oscillator

Matlab (Monte Carlo simulation)Measurement

Model calculation

10

Model calculationWorst case ~ x 2 log n+ σ

Center VTH=0.22V

T. Niiyama, P. Zhe, K. Ishida, M. Murakata, M. Takamiya, and T. Sakurai, “Increasing Minimum Operating Voltage (VDDmin) with Number of CMOS Logic Gates and Experimental Verification with up to 1Mega-Stage Ring Oscillators,” ISLPED’08, pp.117-122, Aug. 2008.

T.Sakurai

Next-Generation Power-Aware Design

3D integration

Deep sub-volt design

Organic integrated circuits (Green by IT)

T.Sakurai

Emergency control

On-demand logistics

Wearable

Individual adaptive marketing

Accidentavoidance

Nano sensor-net on bodyPersonal dataSmart shelfSmart cashier

Pedestrian assistance

Productiondata tracking

Anti-theft

Life supporting robots

Remote medicalassistance

Health monitoring

Environment monitoring

Rescue support Anormaly monitoring& alarming

Physical space

Virtual space

New electronics targets physical space

Disaster prevention

Human bodyHuman body

Transportation systemTransportation system

HomeHome

EnvironmentEnvironment

Town & streetTown & street

InterfacesInterfaces

Original drawing: Professor Hiroyuki Morikawa, University of Tokyo

T.Sakurai

Frequently-mentioned features of organic IC’s

Advantages

Low-cost manufacturingMechanical flexibility

Disadvantages

Low speed (<10-3 of Si VLSI)Low density (<10-4 of Si VLSI)

T.Sakurai

Cost consideration

Cost per function(processors, memories, analog, …)

Organic Si

Cost per area(sensors, display, actuators, …)

Organic Si

Good for Green by IT

T.Sakurai

Unique manufacturing process:Printing large-area

organic transistor array

T.Sakurai

Screen printingEpoxy partitions

Courtesy of Professor Takao Someya, University of Tokyo

T.Sakurai

Inkjet printing Gate electrodes & Word line

Gate electrodes : 45 x 45Word line : 45 lines

28 x 28 cm2

3 mm

T.Sakurai

Organic transistors

Source Drain

Insulator+ -

Gate

OFF ON

Voltage

+ -Current

Pentacene

Organic semiconductors: main elements --- C & H

Organic semiconductor

T.Sakurai

-40 -30 -20 -10 00

10

20

30

VDS [V]

I DS

[µA

]

L=100µm, W=2mm

VGS=-40V

-30V

-20V

-10V

200k

200k

Level 1 SPICE MOS model

MeasurementSimulation

S

D

G

SPICE & VLSI layout tool work.

Modeling by SPICE level1

T.Sakurai

Large-area electronicsHuman-scale interfaces

E-skin

IEDM’03ISSCC’04

Sheet scanner

IEDM’04ISSCC’05

Braille display

IEDM’05ISSCC’06

IEDM’06ISSCC’07

Power sheet

Pressure sensors + OFETsPhotodetectors + OFETs

Actuators + OFETsCoils + MEMS + OFETs

IEDM’07ISSCC’08

Organics + Si co-design

Comm sheet

T.Sakurai

Efficiency ~ 0.1%

Efficiency > 60%

Large-area & high efficiency

Large coil

30x30 cm2 X 1 coil

Receiver coil

Receiver coil

1 inch2 X 64 coils

Many coils& one selected 1 inch2

Selective activation is the key.

1 inch2

Electro- magnetic induction works

T.Sakurai

Combination of MEMS and OFET

Organic FETsPosition-sensing coils Contactless

position-sensing system

Plastic MEMSswitches

Wireless power transmission system

Power transmission coils

Low lossSlow ~ 0.1s# of switching limited

21 x 21 cm2 (8 x 8 cells)

ResistiveFaster < ms# of switching unlimited

T.Sakurai

MEMS switches

~ 5mm x 10mm

T.Sakurai

Wireless power transmission sheet

Size : 21 x 21 cm2Thickness : 1 mmWeight : 50 gEfficiency : 62.3%Max received power : 29.3 WHigh power

Contactless position sensingLarge-area & Low cost

Lightweight & Printable

T.Sakurai

X’mas tree w/o a battery wirelessly powered

21 LEDs

13.56 MHz

Received power : 2 W

T.Sakurai

(Ubiquitous electronics)

Home-care robot Vacuum cleanerIn the floor

In the wall

TV on a wall

In the table

Mobile phone & PC & e-accessories(data can be wireless but USB’s wire delivers power)

Ambient illumination

Demonstration of power transmission

T.Sakurai

No electrical shock

I touched it by my hand. No problem ☺

T.Sakurai

Next-Generation Power-Aware Design

3D integration

Wireless link (0.17pJ/b 0.03pJ/b)

Deep sub-volt design

Watch out for random WID variation

Organic integrated circuit

Large-area electronics for Green by IT

All can help to realize COOL EARTH through “Green of IT” and “Green by IT”.