REGISTER-TRANSFER LEVEL


Register-Transfer Level Estimation Techniques for Switching Activity and Power Consumption

Anand Raghunathan

Dept. of EE, Princeton Univ.

Princeton, NJ 08544

Sujit Dey

C&CRL, NEC USA

Princeton, NJ 08540

Niraj K. Jha †

Dept. of EE, Princeton Univ.

Princeton, NJ 08544

In Proc. IEEE International Conference on Computer-Aided Design, pages 158-165, November 1996

Abstract

We present techniques for estimating switching activity and power consumption in register-transfer level (RTL) circuits. Previous work on this topic has ignored the presence of glitching activity at various data path and control signals, which can lead to significant underestimation of switching activity. For data path blocks that operate on word-level data, we construct piecewise linear models that capture the variation of output glitching activity and power consumption with various word-level parameters like mean, standard deviation, spatial and temporal correlations, and glitching activity at the block's inputs. For RTL blocks that operate on data that need not have an associated word-level value, we present accurate bit-level modeling techniques for glitching activity as well as power consumption. This allows us to perform accurate power estimation for control-flow intensive circuits, where most of the power consumed is dissipated in non-arithmetic components like multiplexers, registers, vector logic operators, etc. Since the final implementation of the controller is not available during high-level design iterations, we develop techniques that estimate glitching activity at control signals using control expressions and partial delay information. Experiments on example RTL designs resulted in power estimates that were within 7% of those produced by an in-house power analysis tool on the final gate-level implementation.

I. Introduction

Techniques for evaluating a design for various metrics like area, delay, and power consumption at all levels of the design hierarchy are an important part of the design process. While it is typically the case that lower-level estimation tools offer higher estimation accuracy, their use to explore architectural tradeoffs during higher-level design tends to be prohibitively time-consuming. Several efficient techniques for estimating area and delay during high-level design have been proposed [1, 2, 3, 4]. In this paper, we focus on the problem of estimating power consumption from RTL descriptions.

Designs at the architecture or RT level are characterized by the instantiation of pre-designed macro blocks including arithmetic operators, multiplexers, registers, vector logic operators, etc. In order to avoid the increase in computational complexity introduced by expanding these blocks to lower-level descriptions, it is necessary to develop power models for these library blocks. The development of such black-box models is an important part of high-level power estimation, and is typically a one-time cost incurred during library development. Another important task to be performed in RTL power estimation is the estimation of switching activity and signal statistics at various signals in the RTL circuit, which are then fed into the power models for each block to estimate power consumption.

Supported by NEC C&C Research Labs.
† Supported by NSF under Grant No. MIP-9319269.

One of the early architecture level power estimation techniques, called the power factor approximation (PFA) method [5], characterized the power consumption in architectural blocks by simulating their implementations using random input sequences. The inability of the PFA technique to account for the dependency of power consumption in embedded modules on their input statistics was addressed in [6] using a dual bit-type (DBT) model for word-level signals. Activity-sensitive capacitance models were developed for various library components, and coupled with zero-delay activity information derived from RTL simulation to estimate power consumption. As pointed out in [6], the DBT model is most applicable to data-flow intensive designs, since it assumes that each multi-bit signal can be associated with a word-level value, whose probability distribution satisfies certain assumptions. In [7, 8], the use of computational entropy and informational energy as measures of switching activity was proposed. An activity-based control model to estimate controller power consumption was presented in [9].

Most previous work on RT level power estimation has not focused on control-flow intensive designs, which have significantly different power consumption characteristics. Unlike in data-flow intensive designs, non-arithmetic components like multiplexers, registers, vector logic operators, and the controller dominate the total power consumption. Due to the complex control flow, the controller has a significant impact on the power consumption of the circuit, necessitating accurate activity analysis techniques for control signals. In addition, control-flow intensive designs often contain multi-bit signals that do not collectively have any meaning as a number and hence cannot be modeled using techniques such as the DBT model. Previous work has also ignored the presence of glitching activity at various signals in the RTL circuit, and its effect on power consumption, which can lead to significant errors in the power estimates as shown in the following section.

II. Motivation

We illustrate some of the issues involved in RTL power estimation through the analysis of an example RTL circuit shown in Figure 1, that computes the greatest common divisor (GCD) of two numbers. The RTL blocks used in the GCD data path are one subtractor, three (one less-than (



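[Figure 3 image not reproduced. Labels in the figure: inputs state[0], state[1], state[2]; internal signals x3 and x4; output contr[1]; gates G0 to G5; transition-count annotations 139.5/41.5, 88.5/88.5 and 51.5/51.5.]

Figure 3: Implementation of control signal contr[1] in the barcode RTL circuit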

the generation and use of power models differs from those presented in [6] in accounting for glitches, and the fact that we use bit-level models for RTL blocks that operate on bit-vectors that may not be associated with a word-level value (e.g. multiplexers, registers, bit-vector concatenation and splitting, vector logic operations, etc).

IV. Estimating glitching activity at the RT level

In this section, we present our models for glitch generation in and propagation through various components of the RTL circuit.

A. Glitching activity at the control signals

The controller's inputs are the status signals from the data path (typically outputs of comparators or combinations thereof), while its outputs are the control signals that feed the data path. The control logic is usually represented as control expressions during the high-level synthesis process. These control expressions are expressed in the form

$$ \mathit{contr} = \sum_{i} x_i \left( \prod_{j} C_j \right) \qquad (1) $$

where $x_i$ represents a decoded controller state variable (corresponding to controller state $s_i$), $C_j$ represents a status signal, which is typically the output (or inverted output) of a comparator from the data path, and $\sum$ and $\prod$ represent the Boolean OR and AND operations, respectively. Each product term in the control expression is derived to flag the occurrence of a particular combination of values at the status signals when the controller state is $s_i$. The status signals ($C_j$) may themselves carry glitches that propagate through the control logic, causing the control signals to be glitchy. On the other hand, the control logic can itself generate a significant amount of glitching activity.
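As a minimal sketch of how a control expression of the form of Equation (1) might be held in an RTL simulator, the following Python fragment stores each product term as a decoded state variable plus a list of status literals and evaluates the sum-of-products under zero delay. The type names, the helper functions and the example expression are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List, Tuple

# A literal is (signal name, polarity); polarity False means the inverted signal.
Literal = Tuple[str, bool]
# A product term is (decoded state variable, list of status literals).
ProductTerm = Tuple[str, List[Literal]]


def eval_product(term: ProductTerm, values: Dict[str, int]) -> int:
    """AND of the decoded state variable with every status literal of the term."""
    state_var, literals = term
    result = values[state_var]
    for name, positive in literals:
        bit = values[name]
        result &= bit if positive else 1 - bit
    return result


def eval_control(terms: List[ProductTerm], values: Dict[str, int]) -> int:
    """OR of all product terms, i.e. the sum-of-products form of Equation (1)."""
    return int(any(eval_product(term, values) for term in terms))


# Hypothetical expression (not from the paper): contr = x0 + x1*C_a + x3*(not C_b)
example_terms: List[ProductTerm] = [
    ("x0", []),
    ("x1", [("C_a", True)]),
    ("x3", [("C_b", False)]),
]
```

Because the decoded state variables are one-hot, at most one product term can be active in any cycle, which is what makes this flat list representation convenient for the per-term bookkeeping described later in this section.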

Accurate estimation of glitch generation and propagation in the control logic requires detailed information regarding the structure of the controller implementation and delays. However, the final implementation of the controller is typically not available during high-level design iterations. Hence, we estimate glitching activity at control signals using their control expressions.

Estimating Glitch Generation from Control Expressions. Glitch generation in the control logic is a result of the interaction of certain logic and temporal conditions, as illustrated by the following example.

Example 1: Let us consider a portion of an RTL circuit that is a pre-processor for a barcode reader. We focus on a particular control signal, contr[1], whose implementation is given in Figure 3. Signals state[2], state[1] and state[0] are fed by the flip-flops of the controller. Signals $x_3$, $x_4$ and control signal contr[1] are annotated with their transition counts including and excluding glitches, indicating glitch generation at gate $G_5$. Consider the partial state transition graph for the controller that is shown in Figure 4(a). The figure indicates a loop involving states $s_3$ and $s_4$, that is executed a large number of times. Figure 4(b) shows how the inputs and output of gate $G_5$ vary under these two state transitions. A transition from $s_3$ to $s_4$ causes a rising transition on $x_4$ and a falling transition on $x_3$. However, the rising transition on $x_4$ arrives later than the falling transition on $x_3$, due to the delays of inverters $G_1$ and $G_2$, resulting in a $1 \rightarrow 0 \rightarrow 1$ static hazard or glitch at the output of gate $G_5$. A similar explanation holds for the controller state transition from $s_4$ to $s_3$. The generation of glitches at $G_5$ can be attributed to the following two conditions:

[Figure 4 image not reproduced. Labels in the figure: S3, S4, x3, x4, G5, contr[1]; panels (a) and (b).]

Figure 4: (a) Partial STG for barcode controller, and (b) Generation of glitches at gate G5

• Logic: correlation between (simultaneous occurrence of) rising and falling transitions at the inputs of $G_5$.

• Temporal: the controlling² to non-controlling transition at the input of $G_5$ arrives earlier.

In general, the logic conditions necessary for glitch generation at a gate are as follows.

• There should be at least one rising and at least one falling transition at the gate's inputs.

• No input should assume a steady controlling logic value.

Assuming an inertial delay model, the temporal condition for glitch generation in an AND gate is as follows.

• The earliest falling transition arrives after the latest rising transition by an interval that is greater than the gate's inertial delay.

Similar conditions can be derived for glitch generation in other types of gates.
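The logic and temporal conditions listed above can be sketched as two small predicates. This is only an illustration under stated assumptions: the function names are hypothetical, input values are given as 0/1 lists for the previous and current cycle, and the per-input arrival times used by the temporal check are assumed to be known, which is not generally the case at the RT level.

```python
from typing import Sequence


def logic_conditions_met(prev: Sequence[int], curr: Sequence[int],
                         controlling: int) -> bool:
    """Logic conditions for glitch generation at a gate.

    prev and curr hold the gate's input values in the previous and current
    cycle. controlling is 0 for AND-type gates and 1 for OR-type gates.
    """
    pairs = list(zip(prev, curr))
    rising = any(p == 0 and c == 1 for p, c in pairs)
    falling = any(p == 1 and c == 0 for p, c in pairs)
    steady_controlling = any(p == c == controlling for p, c in pairs)
    return rising and falling and not steady_controlling


def and_gate_glitches(prev: Sequence[int], curr: Sequence[int],
                      arrival: Sequence[float], inertial_delay: float) -> bool:
    """Temporal condition for an AND gate under an inertial delay model.

    arrival[i] is the (assumed known) time at which input i switches.
    """
    if not logic_conditions_met(prev, curr, controlling=0):
        return False
    pairs = list(zip(prev, curr))
    rise_times = [arrival[i] for i, pc in enumerate(pairs) if pc == (0, 1)]
    fall_times = [arrival[i] for i, pc in enumerate(pairs) if pc == (1, 0)]
    # The earliest falling edge must trail the latest rising edge by more
    # than the inertial delay for the output pulse to survive.
    return min(fall_times) - max(rise_times) > inertial_delay
```

For a two-input AND gate whose inputs swap values, the check reports a glitch only when the rising input switches sufficiently earlier than the falling one.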

Given a control expression in a sum-of-products form (Equation (1)), during the zero-delay RTL simulation, we maintain a distinct glitch counter for each product term, and also for the OR expression combining the product terms. In each cycle, we check the previous and current values at the variables involved in an AND or OR expression to see whether the logic conditions for glitch generation are satisfied. If they are, we increment the corresponding glitch counter to indicate the possibility of glitch generation in the current simulation cycle.
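The counter bookkeeping described above might look as follows. This is a hedged sketch rather than the authors' implementation: the trace format (one dictionary of signal values per cycle), the function names and the literal encoding are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Literal = Tuple[str, bool]      # (signal name, True if non-inverted)
ProductTerm = List[Literal]     # decoded state variable plus status literals


def literal_value(lit: Literal, values: Dict[str, int]) -> int:
    name, positive = lit
    return values[name] if positive else 1 - values[name]


def logic_conditions_met(prev: List[int], curr: List[int], controlling: int) -> bool:
    pairs = list(zip(prev, curr))
    rising = any(p == 0 and c == 1 for p, c in pairs)
    falling = any(p == 1 and c == 0 for p, c in pairs)
    steady = any(p == c == controlling for p, c in pairs)
    return rising and falling and not steady


def count_possible_glitches(terms: List[ProductTerm],
                            trace: List[Dict[str, int]]) -> Tuple[List[int], int]:
    """Return (one counter per product term, counter for the OR expression)."""
    and_counts = [0] * len(terms)
    or_count = 0
    for prev_vals, curr_vals in zip(trace, trace[1:]):
        prev_out, curr_out = [], []
        for k, term in enumerate(terms):
            prev_in = [literal_value(lit, prev_vals) for lit in term]
            curr_in = [literal_value(lit, curr_vals) for lit in term]
            # AND expression: the controlling value is 0.
            if logic_conditions_met(prev_in, curr_in, controlling=0):
                and_counts[k] += 1
            prev_out.append(int(all(prev_in)))
            curr_out.append(int(all(curr_in)))
        # OR expression over the zero-delay product-term outputs: controlling value is 1.
        if logic_conditions_met(prev_out, curr_out, controlling=1):
            or_count += 1
    return and_counts, or_count
```

Note that a product term with a single literal can never satisfy the logic conditions, since it cannot see a rising and a falling input in the same cycle.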

As mentioned earlier in this section, checking whether the temporal conditions for glitch generation are satisfied in an accurate manner requires the final implementation for the control logic, which is typically not available when performing high-level design optimizations. One possible approach to tackle the lack of accurate delay information is to make a pessimistic assumption, i.e. assume that glitches are generated at a gate whenever the logic conditions for glitch generation are satisfied. However, such an assumption often leads to substantial over-estimates of glitching activity, as shown in the following example.

Example 2: We would like to estimate the glitching activity at control signal contr[2] in the GCD RTL circuit of Figure 1. The control expression for contr[2] is $x_0 + x_1 C_{11} + x_3 C_{10}$. In this case, the signals $C_{10}$ and $C_{11}$ were found to be glitch-free, simplifying the problem to that of estimating glitch generation at contr[2]. Clearly, the first product term ($x_0$) cannot generate any glitches. From the simulation traces, we counted the number of times the logic conditions for glitch generation were satisfied for the second and third product terms.

$$
\begin{aligned}
\text{Case 1}: &\quad \mathit{Count}(x_1 \downarrow \; C_{11} \uparrow) = 15 \\
\text{Case 2}: &\quad \mathit{Count}(x_1 \uparrow \; C_{11} \downarrow) = 20 \\
\text{Case 3}: &\quad \mathit{Count}(x_3 \downarrow \; C_{10} \uparrow) = 35 \\
\text{Case 4}: &\quad \mathit{Count}(x_3 \uparrow \; C_{10} \downarrow) = 30
\end{aligned}
$$

In the above equations, the symbols $\uparrow$ and $\downarrow$ denote the rising and falling transitions, respectively. From the above numbers, one could conclude that the glitching activity generated due to the second and third product terms is 35 and 65, respectively.³ The glitches generated due to each product term propagate to the output unmitigated, since the decoded state variables are mutually exclusive. From the given traces, it was observed that the logic conditions for glitch generation at an OR gate were never satisfied by the outputs of the product terms. Hence, the glitching activity at control signal contr[2] was estimated to be 100 transitions over the entire simulation period. A comparison with the glitching activity observed by CSIM for the same input traces and reported in Table 1 (72 − 20 = 52) shows that the glitching activity at contr[2] was over-estimated by as much as 92.3%!

² A controlling input value for a gate uniquely determines the value at the gate output, irrespective of the values at the gate's other inputs.
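The arithmetic behind the pessimistic estimate in Example 2 can be written out step by step. This is only a restatement of the figures quoted in the example; the CSIM reference value of 52 is taken as given from the text.

```latex
\begin{aligned}
\text{second product term} &: 15 + 20 = 35 \\
\text{third product term}  &: 35 + 30 = 65 \\
\text{pessimistic estimate} &: 35 + 65 = 100 \\
\text{over-estimate} &: \frac{100 - 52}{52} \approx 0.923 = 92.3\%
\end{aligned}
```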

Although exact arrival time information at various signals is not available, it is often possible to derive partial information about delays from RTL descriptions or during high-level synthesis. For example, the outputs of comparators can often be assumed to arrive later than the decoded present state signals, even when we do not have any knowledge of their exact arrival times. Inputs to the control logic are divided into three groups: early arriving signals, late arriving signals, and signals whose arrival time information is assumed to be unknown. We assume that each controller input that is marked as late-arriving arrives significantly later than any input signal that is marked as early-arriving. No assumption is made involving the arrival time of a signal marked unknown. Similarly, no assumption is made about the relationship between the arrival times of two signals that are both marked either early or late. When the temporal conditions for glitch generation at a gate involve signals whose arrival time relationship is unknown, we revert to the pessimistic approach of only checking logic conditions.
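One way to read the three-group scheme for an AND-type product term is sketched below. It is an interpretation under stated assumptions, not the paper's algorithm: a possible glitch is discarded only when some rising input is marked late and some falling input is marked early (so the rising edge provably trails the falling edge), and every other case falls back to the pessimistic logic-only check. The signal names and group labels are hypothetical.

```python
from typing import Dict

EARLY, LATE, UNKNOWN = "early", "late", "unknown"


def possible_glitch_and(prev: Dict[str, int], curr: Dict[str, int],
                        group: Dict[str, str]) -> bool:
    """Decide whether to count a possible glitch at an AND expression.

    prev and curr map input names to their values in the previous and
    current cycle; group maps input names to EARLY, LATE or UNKNOWN
    (inputs missing from group are treated as UNKNOWN).
    """
    rising = [n for n in curr if prev[n] == 0 and curr[n] == 1]
    falling = [n for n in curr if prev[n] == 1 and curr[n] == 0]
    steady_zero = any(prev[n] == 0 and curr[n] == 0 for n in curr)
    # Logic conditions: a rising and a falling input, no steady controlling 0.
    if not rising or not falling or steady_zero:
        return False
    # An AND gate glitches only if every rising edge precedes every falling
    # edge. A late rising edge paired with an early falling edge rules this out.
    late_rise = any(group.get(n, UNKNOWN) == LATE for n in rising)
    early_fall = any(group.get(n, UNKNOWN) == EARLY for n in falling)
    if late_rise and early_fall:
        return False
    # Condition satisfied or arrival order unknown: count it (pessimistic).
    return True
```

With a state variable marked early and a comparator output marked late, a falling state variable paired with a rising comparator output is not counted, while the opposite pairing is.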

Example 3: Let us revisit the control signal contr[2] in the GCD circuit that was used for the discussions in Example 2. Suppose we are allowed to make the assumption that the comparator output signals, $C_{10}$ and $C_{11}$, arrive after the decoded state variables, $x_0$, $x_1$ and $x_3$. Consider Case 1 ($x_1 \downarrow \; C_{11} \uparrow$) in the equations presented above. Since the rising transition arrives later than the falling transition in this case, the temporal conditions for glitch generation are not satisfied. Similarly, it can be seen that Case 3 does not satisfy the temporal conditions for glitch generation. The revised glitching activity estimate for contr[2] is 50, which represents an error of only 4% with respect to the number reported by CSIM.
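The revised figure in Example 3 follows from keeping only the cases in which the state variable rises and the comparator output falls (Cases 2 and 4, with counts 20 and 30). The short check below assumes the CSIM reference value of 52 quoted in Example 2.

```latex
\begin{aligned}
\text{revised estimate} &: 20 + 30 = 50 \\
\text{error} &: \frac{|50 - 52|}{52} \approx 0.038 \approx 4\%
\end{aligned}
```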

Glitch propagation through the control logic. Consider again the generic control expression given in Equation (1). Consider a particular comparator output, $C_1$, that we have predicted to be glitchy based on our data path glitching activity models. Let us re-write the control expression by separating the product terms into terms in which $C_1$ appears, terms in which $\overline{C_1}$ appears, and terms that do not depend on $C_1$.

$$ \mathit{contr} = C_1 \cdot \mathit{contr}_{C_1} + \overline{C_1} \cdot \mathit{contr}_{\overline{C_1}} + \sum_{\text{product terms indep. of } C_1} x_i \prod_{j} C_j \qquad (2) $$

$$ \mathit{contr}_{C_1} = \sum_{\text{product terms with } C_1} x_i \prod_{C_j \neq C_1} C_j $$

$$ \mathit{contr}_{\overline{C_1}} = \sum_{\text{product terms with } \overline{C_1}} x_i \prod_{C_j \neq \overline{C_1}} C_j $$

³ Note that each time the conditions for glitch generation at a gate are satisfied, the output undergoes two transitions. However, for compatibility with CSIM, which counts each 0 → 1 and 1 → 0 transition as half a transition, we do not multiply by the factor of two.

In order for glitches at $C_1$ to propagate to the control signal, at least one of the product terms it is involved in must have non-controlling side-inputs (i.e., 1), and the result of all other product terms should evaluate to 0. Hence, the following equation can be utilized to estimate the propagation of glitches to the control signal.

$$ Gl(C_1) \cdot P\Big\{ \big(\mathit{contr}_{C_1} = 1 \;\mathrm{OR}\; \mathit{contr}_{\overline{C_1}} = 1\big) \;\mathrm{AND}\; \sum_{\text{product terms indep. of } C_1} x_i \prod_{j} C_j = 0 \Big\} \qquad (3) $$

In the above equation, $Gl(C_1)$ represents the glitching activity at $C_1$, and is multiplied by the probability that the control signal will be sensitized to glitches at $C_1$. This probability can be computed easily during the zero-delay RTL simulation. Equation (3) assumes that the conditions for glitch generation at $C_1$ and the conditions for the glitches propagating through the control logic are uncorrelated, which can lead to errors in the estimated activities. We resolve this problem by maintaining separate data statistics for each state, and for each signal in the transitive fanin of $C_1$, and hence compute separate glitching activity estimates for each state [11].
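A minimal sketch of how the sensitization probability in Equation (3) might be measured from a zero-delay simulation trace is given below. It implements only the uncorrelated form of the equation, not the per-state refinement of [11]. The trace format, the callables standing in for the cofactor expressions, and the example expression are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Values = Dict[str, int]
Expr = Callable[[Values], int]


def sensitization_probability(trace: List[Values], contr_c1: Expr,
                              contr_c1_bar: Expr, indep_terms: Expr) -> float:
    """Fraction of simulated cycles in which a glitch on C1 could reach contr."""
    sensitized = 0
    for values in trace:
        side_inputs_on = contr_c1(values) == 1 or contr_c1_bar(values) == 1
        others_off = indep_terms(values) == 0
        if side_inputs_on and others_off:
            sensitized += 1
    return sensitized / len(trace)


def propagated_glitches(gl_c1: float, probability: float) -> float:
    """Equation (3): glitching activity at C1 scaled by the sensitization probability."""
    return gl_c1 * probability


# Hypothetical expression (not from the paper): contr = x1*C1 + x2*(not C1) + x0,
# so the cofactors are x1 and x2 and the term independent of C1 is x0.
trace = [
    {"x0": 1, "x1": 0, "x2": 0},
    {"x0": 0, "x1": 1, "x2": 0},
    {"x0": 0, "x1": 0, "x2": 1},
    {"x0": 0, "x1": 1, "x2": 0},
]
p = sensitization_probability(
    trace,
    contr_c1=lambda v: v["x1"],
    contr_c1_bar=lambda v: v["x2"],
    indep_terms=lambda v: v["x0"],
)
estimate = propagated_glitches(8.0, p)  # 8.0 is an assumed Gl(C1)
```

In this toy trace the control signal is sensitized to $C_1$ in three of four cycles, so an assumed glitching activity of 8 at $C_1$ yields an estimate of 6 at the control signal.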

[Figure 5 scatter plot not reproduced. x-axis: CSIM Activity; y-axis: RTL Activity Estimate.]

Figure 5: Scatter plot of Switching Activity at Control Signals: RTL Estimate v/s CSIM

In order to get a feel for the accuracy of our switching activity estimation techniques for control signals, we obtained a complete gate-level implementation of the GCD circuit and estimated switching activities using CSIM. The scatter plot shown in Figure 5 plots the switching activity estimated using our RTL estimation techniques (y-axis) vs. the switching activity reported by CSIM (x-axis), for each distinct control signal. As a reference, the plot also shows a solid line for the equation y = x. The figure indicates that our techniques produce estimates that are quite close to the activity numbers obtained using CSIM after a time-consuming implementation of the complete GCD controller and data path.

B. Modeling glitch generation and propagation in data path blocks

For data path blocks which operate on multi-bit input signals that are associated with a word-level value (e.g. adders, subtractors, multipliers, and various comparators), previous work [6] has shown that it is possible to construct activity-sensitive power models that utilize word-level statistics (mean, standard deviation, etc.). Several data path blocks, however, do not associate any word-level value to their multi-bit input signals. Common examples of such blocks are multiplexers, registers, vector logic operations, logic shift units, etc. We model each bit-slice of such units separately. This allows us to build more accurate glitching activity models for such blocks, and to consider the effects of bit-level statistics that may not be well reflected by word-level signal statistics in certain situations. Since multiplexers play an important role in glitch generation and propagation for control-flow intensive designs, we explain the glitching activity model for multiplexers in detail next.

