Design of a Novel Radix 4 Booth Multiplier


  • 8/6/2019 Design of a Novel Radix 4 Booth Multiplier

    1/4

    The 2004 IEEE Asia-Pacific Conference on Circuits and Systems, December 6-9, 2004

    Design of a Novel Radix-4 Booth Multiplier
    Hsin-Lei Lin, Robert C. Chang, Ming-Tsai Chan

    Department of Electrical Engineering, National Chung Hsing University, Taichung, Taiwan

    ABSTRACT
    This paper presents a novel radix-4 Booth multiplier. A conventional Booth multiplier consists of the Booth encoder, the partial-product summation tree, and the carry-propagate adder. Different schemes are addressed to improve the area and circuit speed effectively. A novel modified Booth encoder/decoder is proposed, and the summation column is compressed by the proposed MFAr. The proposed design is simulated with Synopsys and Apollo. It achieves a 20% area reduction, a 17%-24% power decrease, and a 15% reduction of the delay time of the critical path.

    1. INTRODUCTION
    The multiplier using the Booth algorithm is a well-known technique for high-speed and low-cost multipliers. There is much research on high-speed Booth multipliers, and the main technique is radix-4 Booth encoding [1-6]. Although radix-4 Booth encoding can reduce the input bits and the output bits to half, it also increases the compression time. In order to get better system performance, we have improved the circuit of the radix-4 Booth multiplier in this paper.

    Modified multiplier schemes are widely used in CPU and DSP processor design. There are several types of multiplier, such as series, parallel, array, and encoding. As far as we know, the series multiplier has the simplest structure, the parallel scheme is faster, the matrix one is more difficult to use for signed operations, and the encoding one is much more efficient for signed operations. Therefore we modify the encoder and the decoder in order to reduce the area and increase the overall speed.

    This paper is organized as follows. Section II discusses the proposed radix-4 Booth multiplier. Section III compares the proposed radix-4 Booth multiplier structure with a standard one. Section IV is the conclusion.

    2. Radix-4 Booth Multiplier
    In this section, we present a novel scheme using the modified Booth encoder/decoder (MBE) and the re-modified full-adder (MFAr). It improves on Yeh's MBE [1] and the original 4-to-2 compressor to reduce the critical path and area. Figure 1 shows the proposed radix-4 Booth multiplier, which consists of the 3-bit Booth encoder/decoders, the compressors, and the carry-propagate adder [7-11].

    The Booth encoder/decoder is the first stage of the multiplier and operates on the multiplicand and the multiplicator. Instead of the partial-product summation tree (PPST), the Booth encoder/decoder makes the calculation faster. The radix-4 MBE with 3-bit encoding is useful for the parallel multiplier if the bit number of the operation is not incredibly large. The n-bit multiplicator input, denoted as X, is divided into 3-bit groups for the Booth encoder. The encoded information is applied to the n-bit multiplicand input, denoted as Y, to obtain n/2 rows of partial products after decoding. After the decoder, we get (n+3) bits of output in the first row and (n+2) bits in each of the others. Each row after the first is shifted left by 2 bits relative to the previous row.

    A compressor is still needed to simplify the counter that calculates the binary number equal to the number of logic-1 inputs. A compressor reduces an n-bit input (n>2) to a 2-bit output [4]. The bit number of the multiplier becomes (2n+2) after the compressors. The carry-propagate adder adds the two outputs of the n+1 compressors, the carry-out and sum values, to produce the final product of X and Y.
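    The grouping of the multiplicator into overlapping 3-bit groups can be illustrated with a short behavioral sketch. It assumes the textbook radix-4 Booth recoding (digit = -2*x_{2i+1} + x_{2i} + x_{2i-1}, with x_{-1} = 0) and two's-complement operands; the function names are hypothetical, and the sketch models only the arithmetic, not the proposed encoder/decoder circuit.

```python
def booth_radix4_digits(x: int, n: int) -> list[int]:
    """Recode an n-bit two's-complement multiplicator into n/2 digits in {-2,...,2}.

    Assumption: standard radix-4 Booth recoding, n even.
    """
    assert n % 2 == 0, "n must be even for radix-4 recoding"
    bits = x & ((1 << n) - 1)      # two's-complement bit pattern of x
    padded = bits << 1             # append the implicit x_{-1} = 0
    digits = []
    for i in range(n // 2):
        group = (padded >> (2 * i)) & 0b111   # x_{2i+1}, x_{2i}, x_{2i-1}
        x_hi = (group >> 2) & 1
        x_mid = (group >> 1) & 1
        x_lo = group & 1
        digits.append(-2 * x_hi + x_mid + x_lo)
    return digits


def booth_multiply(x: int, y: int, n: int) -> int:
    """Sum the n/2 partial products; row i is weighted by 4**i (a 2-bit shift)."""
    return sum(d * y * 4**i for i, d in enumerate(booth_radix4_digits(x, n)))


if __name__ == "__main__":
    n = 8
    for x, y in [(13, 11), (-7, 25), (-128, -1), (127, -128)]:
        print(x, y, booth_radix4_digits(x, n), booth_multiply(x, y, n))
```

    Each digit selects 0, ±Y, or ±2Y, so an n-bit multiplicator yields n/2 partial-product rows, each shifted by two bit positions relative to the previous one.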

    Fig. 1 The proposed radix-4 Booth multiplier

    0-7803-8660-4/04/$20.00 ©2004 IEEE 837


    There are five outputs after the conventional radix-4 encoder. The following equation is the decoding algorithm.

    $$PP_{ij} = Z\,(M1_i\,\overline{Y_j} + M2_i\,\overline{Y_{j-1}} + P1_i\,Y_j + P2_i\,Y_{j-1}) \qquad (2)$$

    Some of the radix-4 Booth encoder/decoders have beneficial properties for area and timing [1]. Here, the new encoding circuit is designed by rebuilding the Booth encoder truth table into the new one shown in Table 1.

    Table 1 Truth table of the new MBE scheme
    X_{2i+1} | X_{2i} | X_{2i-1} | X1_b | X2_b | Neg
    0 | 0 | 0 | 1 | 1 | 0
    I I I0 1 0 I I 1 0 11 100 I I I o I o I I I o

    The novel encoding circuit is shown in Figure 2. It has a 3-bit input and generates a 3-bit output for the decoding circuit. Figure 3 shows the decoding circuit, which receives the signals from the encoding circuit and generates the partial-product results.

    Fig. 2 The novel encoding circuit
    Fig. 3 The novel decoding circuit

    2.2. The Efficient Compressor
    The compressor reduces the multi-row partial products decoded by the Booth decoder to two rows. Figure 4 shows the compressor structure for an 8-bit input. The number of bits differs from column to column, so each column has a different compression ratio. We construct the compressor with three 3-to-2 compressors and three 4-to-2 compressors. In Figure 4(a), the first column needs no compression and therefore adds no delay, the sixth column, with a 3-bit input, is compressed by a 3-to-2 compressor, and the eighth column is compressed by a 4-to-2 compressor. The sums of the outputs are generated simultaneously when the numbers of compressed bits are the same.
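    As a behavioral illustration of column compression, the sketch below models a 3-to-2 compressor as a full adder and a 4-to-2 compressor as two chained full adders. This is the conventional textbook structure, assumed here for illustration only; it is not the gate-level MFAr circuit, and the function names are hypothetical. The loop checks that the weighted outputs always equal the number of logic-1 inputs.

```python
from itertools import product


def compress_3to2(a: int, b: int, c: int) -> tuple[int, int]:
    """Full adder used as a 3-to-2 compressor: returns (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compress_4to2(i1: int, i2: int, i3: int, i4: int, cin: int) -> tuple[int, int, int]:
    """4-to-2 compressor built from two chained full adders (assumed structure).

    Returns (sum, carry, cout); carry and cout both have weight 2.
    """
    s1, cout = compress_3to2(i1, i2, i3)   # first stage: three data inputs
    s, carry = compress_3to2(s1, i4, cin)  # second stage: fourth input and carry-in
    return s, carry, cout


# The outputs must encode the count of logic-1 inputs for every input pattern.
for bits in product((0, 1), repeat=5):
    s, carry, cout = compress_4to2(*bits)
    assert sum(bits) == s + 2 * (carry + cout)
```

    Because cout depends only on the first three inputs and not on cin, 4-to-2 compressors in adjacent columns can pass carries sideways without forming a ripple chain.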

    Fig. 4 Compressors: (a) the column position of compression; (b) the 4-input and 3-input compressors

    Figure 5 shows a standard compressor. It is composed of 8 NAND gates and 4 XOR gates, which gives the multiplier a long delay time.


    Let $U = I_1 \oplus I_2 \oplus I_3$ and $X = U$; we get the carry from $U$, $I_4$, and $C_{in}$.

    $$\mathrm{Carry} = X(I_4 + C_{in}) + \overline{X}\,(I_4 C_{in}) = U(I_4 + C_{in}) + \overline{U}\,(I_4 C_{in}) = U I_4 + U C_{in} + I_4 C_{in} \qquad (3)$$

    In order to shorten the delay time, the algorithm is rewritten as follows. Let $U = I_1 \oplus I_2 \oplus I_3$ and $T = I_4 \oplus C_{in}$.

    $$\mathrm{Carry} = T U + \overline{T}\, I_4 = (\overline{I_4}\, C_{in} + I_4\, \overline{C_{in}})\, U + (I_4 C_{in} + \overline{I_4}\, \overline{C_{in}})\, I_4 = I_4 U + C_{in} U + I_4 C_{in} \qquad (4)$$

    This demonstrates that equation (4) is equal to equation (3). We restructure the compressor into the proposed one, which we name MFAr. Here, we merge one NAND gate and two NOR gates into one XOR gate, as shown in Fig. 6. This not only decreases the delay time but also lowers the cost.
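    The claimed equality of equations (3) and (4) can be checked exhaustively over the three bits involved. The sketch below treats U, the remaining data input (called `i_in` here, a hypothetical name), and Cin as independent bits and compares the sum-of-products form, the multiplexer-style form with T = i_in XOR Cin, and the three-input majority function.

```python
from itertools import product


def carry_eq3(u: int, i_in: int, cin: int) -> int:
    """Carry in the form of equation (3): U(I + Cin) + not(U)(I Cin)."""
    return (u & (i_in | cin)) | ((1 - u) & (i_in & cin))


def carry_eq4(u: int, i_in: int, cin: int) -> int:
    """Carry in the form of equation (4): T U + not(T) I, with T = I xor Cin."""
    t = i_in ^ cin
    return (t & u) | ((1 - t) & i_in)


def majority(u: int, i_in: int, cin: int) -> int:
    """Common final form: I U + Cin U + I Cin."""
    return (i_in & u) | (cin & u) | (i_in & cin)


all_equal = all(
    carry_eq3(*v) == carry_eq4(*v) == majority(*v)
    for v in product((0, 1), repeat=3)
)
print(all_equal)  # prints True
```

    The form of equation (4) reuses the XOR of the remaining input and Cin, which a compressor already needs for its sum output, so the carry reduces to a selection between U and the remaining input.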

    3. COMPARISON AND ANALYSIS
    The proposed multiplier is implemented with Synopsys and the Apollo library. Table 2 gives the comparison between the new multiplier and four other kinds of radix-4 Booth multiplier. The number of transistors of the proposed circuit is less than that of MBE-III and MBE-IV. From the delay time calculated by Synopsys, we can see that the proposed circuit is faster than MBE-I and MBE-II.

    Table 2
    Types of the modified Booth encoder | Transistor | Delay (ns) (Apollo) | Delay (ns) (Synopsys)
    MBE-I [8] | 10 | 3.0 | 1.71
    MBE-II [9] | 6 | 3.5 | 1.82
    MBE-III [10] | 20 | 2.0 | 0.97
    MBE-IV | 18 | 2.0 | 0.97
    Proposed | 16 | 2.5 | 1.18

    Fig. 5 4-to-2 compressor

    The number of partial-product rows and the column bits are decreased by the new modified Booth encoder/decoder, and the compression ratio is decreased, too. Figure 7 shows the column bits of the radix-4 Booth multipliers and the matrix multipliers. Because the Booth encoder decreases the number of bits, the speed increases. Table 3 gives the comparison between various compressors. The MFAr performs better than the others in delay time, area, and power. The delay time of the critical path is reduced by about 15%, the area cost by about 20%, and the power consumption by about 17%-24%. From the above, we know that the MFAr indeed improves the circuit performance of the compressor.

    Fig. 6 MFAr (proposed)

    Table 3
    Compressor | Delay (ns) | Area | Power (mW)
    AOI | 1.34 | 21.4 | 2.75
    MFAr | | 17.39 | 2.29
     | 15% | 20% | 17%-24%
    Note: AOI (traditional); MFA (modified full-adder into compressor); MFAr (re-modified full-adder tree)


    Fig. 7 Comparison of the matrix multipliers with the radix-4 Booth multipliers

    The area of the radix-4 Booth multiplier is compared with that of the array multiplier by gate count. The gate count of the radix-4 Booth multiplier is about ($N^2/2$ full adders (FAs)) + ($N^2/2$ decoders), and the gate count of the array multiplier is about ($N^2$ FAs) + ($N^2/2$ AND gates). Assume the input has 16 bits. If the novel modified radix-4 Booth multiplier and the array multiplier both use the same compression method, then the array multiplier uses only 108 FAs. Therefore, the area of the matrix multiplier is about 1.5 times that of the radix-4 Booth multiplier. The radix-4 Booth multiplier thus reduces not only the power consumption but also the circuit complexity.

    4. CONCLUSION
    We have shown in this paper that the use of the new Booth encoder/decoder and the proposed compressor can truly decrease the circuit area and the delay time of the critical path. Table 2 shows that the proposed design has a smaller area than MBE-III and MBE-IV and is faster than MBE-I and MBE-II. Table 3 shows that the area of the compressor is reduced to 80%, the delay time of the critical path to 85%, and the power to 76%-83%.

    6. REFERENCES
    [1] W.-C. Yeh, C.-W. Jen, High-speed Booth encoded parallel multiplier design, IEEE Transactions on Computers, vol.49, no.7, pp.692-701, July 2000.
    [2] B. Parhami, Computer Arithmetic, Oxford, 2000.
    [3] P. Bonatto, V.G. Oklobdzija, Evaluation of Booth's algorithm for implementation in parallel multipliers, Signals, Systems and Computers, Conference Record of the Twenty-Ninth Asilomar, Oct.-Nov. 1995.
    [4] V.G. Oklobdzija, Improving multiplier design by using improved column compression tree and optimized final adder in CMOS technology, IEEE Transactions on Very Large Scale Integration Systems, vol.3, no.2, pp.292-30, July 1995.
    [5] V.G. Oklobdzija, D. Villeger, S.S. Lin, A method for speed optimized partial product reduction and generation of fast parallel multipliers using an algorithmic approach, IEEE Transactions on Computers, vol.45, no.3, pp.294-306, March 1996.
    [6] D. Villeger, V.G. Oklobdzija, Analysis of Booth encoding efficiency in parallel multipliers using compressors for reduction of partial products, Proceeding of the 27th Asilomar Conference on Signals, Systems and Computers, pp.781-784, 1993.
    [7] A.R. Cooper, Parallel architecture modified Booth multiplier, IEE Proceedings, vol.135, pt.G, no.3, pp.125-128, June 1988.
    [8] G. Goto, A. Inoue, A 4.1-ns compact 54 x 54-b multiplier utilizing sign-select Booth encoders, IEEE Journal of Solid-State Circuits, vol.32, no.11, pp.416-417, Nov. 1997.
    [9] G. Goto et al., A 54 x 54-b regularly structured tree multiplier, IEEE Journal of Solid-State Circuits, vol.27, no.9, Sep. 1992.
    [10] R. Fried, Minimizing energy dissipation in high-speed multipliers, Intl Symp. Low Power Electronics and Design, pp.214-219, 1997.
    [11] F. Elguibaly, A fast parallel multiplier-accumulator using the modified Booth algorithm, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol.47, pp.902-908, Sep. 2000.

    5. ACKNOWLEDGMENT
    This work was supported by the National Science Council of Taiwan under grant NSC 92-2220-E-005-004, and Meng Yau Chip Center.
