  • ANALYSIS AND MODELING OF LOW POWER ARRAY MULTIPLIERS USING CADENCE VIRTUOSO SIMULATOR IN 45 nm TECHNOLOGY

    B. VAMSI KRISHNA¹ & K. DHANUNJAYA²

    ¹PG Student, Department of ECE, ASCET, Gudur, Andhra Pradesh, India

    ²Head of the Department, Department of ECE, ASCET, Gudur, Andhra Pradesh, India

    ABSTRACT

    The increasing demand for high-fidelity portable devices has placed emphasis on the development of low-power, high-performance systems. In next-generation processors, low-power design has to be incorporated into fundamental computation units such as multipliers. Characterizing and optimizing such low-power multipliers aids the comparison and selection of multiplier modules in system design. In this paper we present a comparative analysis of the power, delay, and power delay product (PDP) characteristics of five parallel digital multipliers implemented using a low-power 10-transistor (10T) adder. The multipliers were realized with the 10T adder because it consumed considerably less power than the Static Energy Recovery Full (SERF) adder and the static CMOS adder for all the configurations studied. The designs of five different array multipliers are presented, all modeled using Cadence tools (Virtuoso simulator). The comparison is made on the basis of three performance parameters: area, speed, and power consumption. The difference in power consumption among the 10T-based multipliers is significant at 45 nm. For smaller geometry sizes down to 45 nm, the propagation delay of the multipliers implemented with 10-transistor adders translates to a better performance measure. This can be attributed to the fast computational capability of the carry-save array (CSA) multiplier and to the 10T adder logic saving more power at deep sub-micron sizes. The proposed SERF-10T hybrid adder model multipliers consumed the least power of all the hybrid and regular models with no deterioration in performance. Taken together, these results suggest that SERF-10T hybrid model based multipliers are suited for ultra-low-power design and fast computation at smaller geometry sizes.

    KEYWORDS: CMOS 28T, SERF, PDP, Sacrificing Performance

    INTRODUCTION

    Multipliers are at the heart of digital processors such as microprocessors and DSPs, so the power consumption of a digital processor depends largely on its multiplier. If we reduce the power consumption of the multipliers, we can reduce the overall power consumption of the processor. There is always a tradeoff between power and speed: improving speed results in more power consumption, and lowering the speed reduces power consumption. Here we try to find the best tradeoff between the two. One of the most important units in a multiplier, as in many other circuits, is the full adder; if we decrease the power consumption of the adder, we remove a major part of the power consumption. Various multiplier schemes have been proposed to date, including the Bit Array multiplier, Carry-Save Array multiplier, Baugh-Wooley multiplier, Braun Array multiplier, and Vedic multiplier, each offering a good tradeoff for the problem and for design optimization.

    The options currently available offer a wide range of tradeoffs in terms of speed, complexity, and power consumption. To achieve higher processing speeds, parallel multipliers are usually adopted at the expense of high area complexity. Multiple parallel multiplication algorithms and architectures have been proposed to reduce the chip area and increase the speed of the multipliers. Various techniques have been developed to reduce the power dissipation of parallel multipliers. While several of these techniques reduce power dissipation by eliminating spurious transitions, others have focused on developing novel multiplier architectures.

    International Journal of Electronics, Communication & Instrumentation Engineering Research and Development (IJECIERD), ISSN 2249-684X, Vol. 3, Issue 4, Oct 2013, 49-64, © TJPRC Pvt. Ltd.

    In this paper we seek the best solution for achieving low power consumption, small area, and high speed in multiplier operation. This research uses an approach that significantly reduces the power consumption and the chip area of parallel multipliers without sacrificing performance. The approach is based on using low-power, minimal-transistor-count adders, which are the determining factor in the performance of the multiplier. The operation of a parallel multiplier can be divided into two parts: (a) formation of the partial products, and (b) summation of these products to form the final product of the multiplication.
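The two parts of a parallel multiplication can be illustrated with a short behavioral sketch. It is not the circuit used in this paper: the function names and the default 4-bit width are assumptions chosen for illustration, and the summation is done with ordinary integer addition rather than with an adder array.

```python
def partial_products(a: int, b: int, width: int = 4) -> list[int]:
    """Part (a): row i is the multiplicand ANDed with multiplier bit i, shifted left by i."""
    rows = []
    for i in range(width):
        b_i = (b >> i) & 1                 # i-th bit of the multiplier
        rows.append((a if b_i else 0) << i)
    return rows


def multiply(a: int, b: int, width: int = 4) -> int:
    """Part (b): sum the shifted partial products to form the final product."""
    return sum(partial_products(a, b, width))


print(partial_products(13, 11))  # rows for 1101 x 1011
print(multiply(13, 11))
```

The multiplier architectures compared later differ only in how part (b) is organized in hardware, which is why the adder cell dominates their power and delay.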

    In this study, we investigated the power and delay performance characteristics of three different adder circuits: the SERF, 10T, and CMOS 28T adders. From this comparative study, we found that the 10T adder has the better performance. Using the 10T adder, five different multipliers were realized: Bit Array, Carry-Save, Braun Array, Vedic, and Baugh-Wooley. The tradeoffs between speed and power of these multipliers were then compared. In Section 2, we describe the CMOS 28T adder, SERF, and 10T adder circuits used in our design. Section 3 describes the multiplier architectures. Section 4 describes the simulation methodology used. In Section 5 the results of the simulation study are discussed, and Section 6 presents a summary of the paper and the concluding remarks.

    ADDER MODULES

    Adders are the fundamental building blocks in all the multiplier modules. Hence employing fast and efficient full

    adders plays a key role in the performance of the entire system. In the following section we briefly describe the adder

    modules used in our design.

    Conventional CMOS 28 Transistor (28T) Full Adder

    The 28-transistor full adder is the pioneering traditional CMOS adder circuit. The schematic of this adder is shown in Figure 1. This adder cell is built using an equal number of NMOS and PMOS transistors. The complementary MOS logic was realized using Eqs. (1) and (2):

    $C_{out} = AB + BC_{in} + AC_{in}$ (1)

    $Sum = ABC_{in} + (A + B + C_{in})\,\overline{C_{out}}$ (2)

    The first 12 transistors of the circuit produce Cout and the remaining transistors produce the Sum output. Therefore the delay for computing Cout is added to the total propagation delay of the Sum output. The structure of this adder circuit is large and therefore consumes a large on-chip area.

    Figure 1: Conventional CMOS Adder with 28 Transistors. Reprinted with Permission from [2], J. M. Rabaey et al., Digital Integrated Circuits, Prentice Hall Publications (2003)
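The dependence of Sum on Cout in the 28T cell can be checked at the logic level. The sketch below is only a truth-table check, not a transistor-level model; it uses the carry expression of Eq. (1) and assumes the usual mirror-adder form of the sum, Sum = A·B·Cin + (A + B + Cin)·NOT(Cout). The function name is hypothetical.

```python
from itertools import product


def full_adder_28t(a: int, b: int, cin: int) -> tuple[int, int]:
    """Logic-level model: Cout is computed first and Sum reuses its complement."""
    cout = (a & b) | (b & cin) | (a & cin)            # Eq. (1)
    not_cout = 1 - cout
    s = (a & b & cin) | ((a | b | cin) & not_cout)    # assumed mirror-adder sum
    return s, cout


# Exhaustive check against ordinary addition for all eight input combinations.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder_28t(a, b, cin)
    assert 2 * cout + s == a + b + cin
    print(a, b, cin, "->", "sum", s, "cout", cout)
```

Because the sum term is gated by the complemented carry, the Sum output cannot settle before Cout does, which is the delay dependence described above.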

    SERF Adder

    The Static Energy Recovery Full Adder (SERF adder) circuit was developed using energy-recovery logic and a reduced number of transistors.

    Figure 2: Static Energy-Recovery Full (SERF) Adder. Reprinted with Permission from [8], R. Shalem et al., A Novel Low Power Energy Recovery Full Adder Cell, Proceedings of the Great Lakes Symposium on VLSI (1999), pp. 380-383

    The schematic of the 10-transistor SERF adder is shown in Figure 2. The basic idea in the SERF adder is to reuse the charge stored in the load capacitance during the high output to drive the control logic. In regular, non-energy-recovery adder designs, the charge applied at logic high is drained off during logic low. The energy recovery is achieved by using only one voltage source (VDD) in the circuit. As an added advantage, there is no direct path from one voltage level (VDD) to the other (GND). Eliminating the direct path to ground removes the short-circuit power component of the adder module. This reduces the total energy consumed in the circuit, making it an energy-efficient design. The SERF adder is not only energy efficient but also area efficient due to its low transistor count. The main drawback of the SERF adder is the threshold voltage drop at the output for certain input combinations. A detailed comparative study of the SERF adder with other low-power adders can be found in Ref. [4].

    Figure 3: Static Energy-Recovery Full (SERF) Adder, Layout

    10T Adder

    In the 10T adder cell, the XOR and XNOR of A and B are implemented using pass-transistor logic, and an inverter is used to complement the input signal A. This implementation results in faster XOR and XNOR outputs and also ensures that the delays at the outputs of these gates are balanced. This leads to fewer spurious transitions on the SUM and Cout signals.

    Figure 4: 10 Transistor (10T) Adder. Reprinted with Permission from [5], L. Junming et al., A Novel 10-Transistor Low-Power High-Speed Full Adder Cell, Proceedings of 6th International Conference on Solid-State and Integrated-Circuit Technology (2001), pp. 1155-1158

    The capacitance at the outputs of the XOR and XNOR gates is also reduced, as they are not loaded with an inverter. If the signal degradation at SUM and Cout is significant for deep sub-micron circuits, drivers can be used to reduce the degradation. A driver helps in generating outputs with equal rise and fall times. This results in better performance in terms of speed, power dissipation, and driving capability. If a driver is used at the output, the output voltage swing is equal to VDD. Figure 4 gives the circuit-level diagram of the 10T adder.

    Figure 5: 10 Transistor (10T) Adder, Layout

    A detailed comparative study of SERF adder with other low power adders can be found in Ref. [3].

    Power Consumption at Different Voltages and Frequencies

    Table 1: Power Consumption Comparison of SERF, 10T and 28T Adders at 0.8 V

    Supply voltage: 0.8 V; adder power in nW

    Frequency (MHz)    100      200      250      400      500
    SERF               50.23    101.1    122.8    201.1    250.3
    10T                42.5     83.12    98.04    153.2    189.4
    28T                142.8    282.4    355.1    558.4    695.4

    Table 2: Power Consumption Comparison of SERF, 10T and 28T Adders at 1 V

    Supply voltage: 1 V; adder power in nW

    Frequency (MHz)    100      200      250      400      500
    SERF               77.29    157.3    191.0    313.6    390.0
    10T                78.52    163.4    196.2    316.7    392.0
    28T                236.2    470.0    591.3    933.1    1163

    Table 3: Power Consumption Comparison of SERF, 10T and 28T Adders at 1.2 V

    Supply voltage: 1.2 V; adder power in nW

    Frequency (MHz)    100      200      250      400      500
    SERF               109.3    224.0    271.7    446.3    554.7
    10T                115.8    242.4    289.9    467.71   568.6
    28T                354.8    709.2    891.6    1411     1759

    Maximum Delays of Adders

    Table 4: Delay Comparison of SERF, 10T and 28T Adders at Different Voltages

    Maximum delay (ps)

    Supply       10T               SERF              28T
    Voltage      Sum      Cout     Sum      Cout     Sum      Cout
    0.8 V        47.35    86.41    35.05    150.5    176.9    107.1
    1.0 V        32.91    38.22    32.26    56.24    81.39    48.07
    1.2 V        11.58    24.86    29.44    5.738    54.12    31.68

    Power Delay Product (PDP) of Adders

    Table 5: PDP of Three Adders at 1 V

    Supply voltage: 1 V; PDP in units of 10⁻¹⁸ W·s (atto)

    Frequency (MHz)    100      200      250      400      500
    SERF               4.346    8.846    10.741   17.636   21.933
    10T                2.999    6.241    7.494    12.097   14.974
    28T                12.783   25.43    32.00    50.49    62.94

    Figure 6: Comparison of Power Delay Product (PDP) of Adders

    MULTIPLIER ARCHITECTURES

    Multipliers are in fact complex adder arrays. Multiplication is an operation common to a large number of applications, and the complexity of this function has led to a large amount of research directed at speeding up its execution. Multipliers can be implemented using different algorithms, and their performance characteristics vary with the algorithm used. Binary adders are an essential component in the implementation of digital multipliers. With the emergence of power as a design consideration, speed is no longer the only criterion by which the various implementations are judged. Designing multipliers with low-power, energy-efficient adders reduces the power consumption and improves the efficiency of the multipliers. In this paper we have concentrated on the design and characterization of five popular multipliers: the Carry-Save, Bit-Array, Braun Array, Baugh-Wooley, and Vedic multipliers. For the study and performance evaluation of these five parallel digital multipliers, we implemented them using the 10T adder for 4×4-bit operands.

    Carry-Save Array Multiplier[3]

    The Carry-Save Array multiplier has a very regular structure, which makes it amenable to automation. The algorithm is based on the fact that the multiplication result does not change when the output carry bits are passed diagonally downwards instead of only to the right. An extra adder, known as the vector-merging adder, is added in each stage of the multiplication so that the final result is obtained. This is called the carry-save multiplier because the carry bits are not immediately added but are instead saved for the next addition stage. In the final stage the carries and the sums are merged in a fast carry-propagate adder stage, usually a carry-lookahead adder. Due to the additional adder in each stage there is a slight increase in the area cost. However, it uses only short wires to the nearest neighboring cells, and it can be easily pipelined. Another advantage is that there is only one critical path rather than the several identical critical paths found in the generic array multiplier. The general structure of a Carry-Save multiplier is shown in Figure 7. The delay of this multiplier can be expressed [10] as

    $T = T_{and} + T_{final} + (X - 1)\,T_{carry}$ (3)

    where Tand is the delay of the pre-product generating AND gates, Tfinal is the delay of the final-stage carry-lookahead adder, X is the number of partial product stages, and Tcarry is the propagation delay between input and output carry. This equation is based on the assumption that the delay for sum generation is equal to that of the carry generation.

    Figure 7: 4×4 Carry-Save Multiplier. Reprinted with Permission from [2], J. M. Rabaey et al., Digital Integrated Circuits, Prentice Hall Publications (2003)
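The carry-save idea, keeping the carries separate and merging them only once at the end, can be sketched at word level. This is a behavioral model under simplifying assumptions, not the gate netlist of Figure 7: each loop iteration stands in for one row of full adders, and the final integer addition stands in for the vector-merging adder. The function name is hypothetical.

```python
def carry_save_multiply(a: int, b: int, width: int = 4) -> int:
    """Unsigned multiply that saves carries between rows instead of propagating them."""
    s, c = 0, 0                                        # running sum and saved-carry vectors
    for i in range(width):
        pp = (a if (b >> i) & 1 else 0) << i           # partial product row i
        # Bitwise full adders: XOR gives the sum bits, the majority function gives
        # the carry bits, which are shifted left and handed to the next row.
        s, c = s ^ c ^ pp, ((s & c) | (s & pp) | (c & pp)) << 1
    return s + c                                       # single final carry-propagate addition


print(carry_save_multiply(13, 11))
```

At every step the invariant s + c = (sum of the rows processed so far) holds, so no carry has to ripple along a row until the last addition.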

    Bit Array Multiplier

    Bit Array multipliers are essentially regular structures and are simple to expand. The structure is similar to the previously discussed Carry-Save multiplier but propagates the carry bits from the full adders in a different fashion. A simple diagram of a 4×4 multiplier is shown in Figure 8. Each partial product is generated by the multiplication of the multiplicand with one multiplier bit.

    The partial products are shifted according to their bit orders and then added. In array multiplication we need to add as many partial products as there are multiplier bits. To perform signed multiplication, the 2's complement number system is used to represent the multiplicand and the multiplier. This implies that all the adders in a particular stage should be of equal bit length.

    Figure 8: 4×4 Bit-Array Multiplier. Reprinted with Permission from [2], J. M. Rabaey et al., Digital Integrated Circuits, Prentice Hall Publications (2003)

    To achieve this, the sign bits of the partial products in the initial row and the sum and carry signals of each adder stage are extended. The extension is carried out until the signal's width matches the width of the largest absolute value signal in that stage. Also, the generation of X partial products requires X×Y two-input AND gates. A large area of the multiplier is devoted to adding the N partial products, which requires (N−1) M-bit adders [1, 10]. The shifting of the partial products for proper alignment is performed by simple routing and does not require any logic. The array structure makes it difficult to measure the propagation delay, because there is more than one critical timing path of identical length in the circuit. An approximate expression for the propagation delay, shown in Eq. (4) [10], can be obtained by a detailed study of these paths.

    $T = T_{and} + T_{sum} + [(Y - 1) + (X - 2)]\,T_{carry}$ (4)

    where Tand is the delay of the pre-product generating AND gates, Tsum is the delay between the input carry and the sum bit of the full adder, Y is the width of the multiplicand, X is the width of the multiplier, and Tcarry is the propagation delay between input and output carry.
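The two delay expressions, Eq. (3) for the Carry-Save multiplier and Eq. (4) for the Bit Array multiplier, can be compared numerically. The sketch below takes them as T = Tand + Tfinal + (X−1)·Tcarry and T = Tand + Tsum + [(Y−1) + (X−2)]·Tcarry. The gate delays are hypothetical round numbers chosen only to show how the terms scale; they are not measurements from this paper.

```python
def csa_delay(t_and: int, t_final: int, t_carry: int, x: int) -> int:
    """Eq. (3): carry-save multiplier with x partial-product stages."""
    return t_and + t_final + (x - 1) * t_carry


def array_delay(t_and: int, t_sum: int, t_carry: int, x: int, y: int) -> int:
    """Eq. (4): bit-array multiplier, y-bit multiplicand and x-bit multiplier."""
    return t_and + t_sum + ((y - 1) + (x - 2)) * t_carry


# Hypothetical gate delays in picoseconds (illustrative only).
T_AND, T_SUM, T_CARRY, T_FINAL = 10, 30, 25, 60

for n in (4, 8, 16):
    print(n, csa_delay(T_AND, T_FINAL, T_CARRY, n), array_delay(T_AND, T_SUM, T_CARRY, n, n))
```

With these expressions the carry term grows as (X−1) for the carry-save structure and as (X+Y−3) for the array structure, so the gap widens with operand width even if the final merging adder is slower than a single sum stage.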

    Baugh Wooley Multiplier

    The Baugh-Wooley multiplier is used for 2's complement multiplication. It adjusts the partial products to maximize the regularity of the multiplication array. It moves the partial products with negative signs to the last steps and adds the negation of partial products rather than subtracting them. This technique was developed in order to design regular multipliers suited for 2's complement numbers.

    A gate-level diagram of a 4-bit Baugh-Wooley multiplier is shown in Figure 9. The equation of the Baugh-Wooley algorithm for an N×N multiplication is given by Eq. (5),

    (5)

    Figure 9: 4×4 Baugh Wooley Multiplier. Reprinted with Permission from [2], J. M. Rabaey et al., Digital Integrated Circuits, Prentice Hall Publications (2003)

    where X and Y are N-bit operands, so their product is a 2N-bit number. Consequently, the most significant weight is 2N−1, and the first term, $-2^{2N-1}$, is taken into account by adding a 1 in the most significant cell of the multiplier. Each of the partial products is formed with AND gates and they are all added together. The outcome is to allow identical stages of logic in the early steps of the multiplication process and to push all the irregularities to the final stage. The delay equation for the Baugh-Wooley multiplier is similar to that of the Array multiplier.
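The idea of adding negated partial products instead of subtracting them can be sketched behaviorally. The code below assumes one common ("modified") form of the Baugh-Wooley scheme, which may differ in arrangement from the paper's Eq. (5): partial-product bits that involve exactly one sign bit are complemented, and constant 1s are added at bit positions N and 2N−1. The function name is hypothetical.

```python
def baugh_wooley(x: int, y: int, n: int = 4) -> int:
    """Signed n x n multiply using only positively weighted partial-product bits."""
    xb = [(x >> i) & 1 for i in range(n)]     # two's-complement bits of x
    yb = [(y >> i) & 1 for i in range(n)]     # two's-complement bits of y
    acc = 0
    for i in range(n):
        for j in range(n):
            bit = xb[i] & yb[j]
            # A product of one sign bit and one magnitude bit has negative weight;
            # complement it so that it can be added instead of subtracted.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            acc += bit << (i + j)
    acc += (1 << n) + (1 << (2 * n - 1))      # correction constants for the complements
    acc &= (1 << (2 * n)) - 1                 # keep the 2n-bit result
    # Reinterpret the 2n-bit pattern as a signed number.
    return acc - (1 << (2 * n)) if acc >> (2 * n - 1) else acc


print(baugh_wooley(3, -2), baugh_wooley(-8, -8), baugh_wooley(-1, 7))
```

Every row of the array is then built from the same AND (or NAND) cells and adders, which is the regularity the scheme is designed to provide.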

    Braun Array Multiplier

    The simplest parallel multiplier is the Braun array. All the partial products are computed in parallel and then collected through a cascade of carry-save adders. The completion time is limited by the depth of the carry-save array and by the carry propagation in the final adder. Note that this multiplier is suited only for positive operands. The structure of the Braun algorithm for unsigned binary multiplication is shown in Figure 10.

    Figure 10: Braun Array Multiplier Structure. Anitha R, Bagyaveereswaran V, Braun's Multiplier Implementation Using FPGA with Bypassing Techniques, International Journal of VLSI Design & Communication Systems (VLSICS), Vol. 2, No. 3, September 2011. Ref. [6]

    Vedic 4×4 Multiplier

    The method is explained below for two 2-bit numbers A and B, where A = a1a0 and B = b1b0. First, the least significant bits are multiplied, which gives the least significant bit of the final product (vertical). Then the LSB of the multiplicand is multiplied with the next higher bit of the multiplier and added to the product of the LSB of the multiplier and the next higher bit of the multiplicand (crosswise). The sum gives the second bit of the final product, and the carry is added to the partial product obtained by multiplying the most significant bits, giving a sum and a carry. The sum is the third bit and the carry becomes the fourth bit of the final product:

    $s_0 = a_0 b_0$

    $c_1 s_1 = a_1 b_0 + a_0 b_1$

    $c_2 s_2 = c_1 + a_1 b_1$

    Here $c_1 s_1$ and $c_2 s_2$ each denote a two-bit result (carry, sum).

    The 4×4-bit Vedic multiplier module is implemented using four 2×2-bit Vedic multiplier modules. Let us analyze a 4×4 multiplication, say A = A3 A2 A1 A0 and B = B3 B2 B1 B0. The output line for the multiplication result is S7 S6 S5 S4 S3 S2 S1 S0. Let us divide A and B into two parts, say A3 A2 and A1 A0 for A, and B3 B2 and B1 B0 for B. Using the fundamentals of Vedic multiplication, the product is formed two bits at a time using the 2-bit multiplier block.

    Sample Representation of 4×4 Multiplication

    Figure 11: 4×4 Vedic Multiplier. Reprinted with Permission from Sowmiya M., Nirmal Kumar R., Dr. S. Valarmathy, Karthick S., Design of Efficient Vedic Multiplier by the Analysis of Adders, International Journal for Emerging Technology and Advanced Engineering (www.ijetae.com, ISSN 2250-2459, ISO 9001:2008 Certified Journal), Volume 3, Issue 1, January 2013. Ref. [7]
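The vertical and crosswise steps and the composition of the 4×4 block from four 2×2 blocks can be sketched as follows. The 2×2 function follows the three steps given in the text. How the four 2×2 results are added together is not spelled out here, so the sketch assumes a plain shift-and-add of the four products, with integer addition standing in for the adder stage. Function names are hypothetical.

```python
def vedic_2x2(a: int, b: int) -> int:
    """2-bit x 2-bit product using the vertical and crosswise steps."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    s0 = a0 & b0                          # vertical: LSB x LSB
    cross = (a1 & b0) + (a0 & b1)         # crosswise sum, value 0..2
    s1, c1 = cross & 1, cross >> 1
    top = c1 + (a1 & b1)                  # vertical: MSB x MSB plus carry
    s2, c2 = top & 1, top >> 1
    return (c2 << 3) | (s2 << 2) | (s1 << 1) | s0


def vedic_4x4(a: int, b: int) -> int:
    """4-bit x 4-bit product assembled from four 2x2 blocks (assumed adder stage)."""
    a_lo, a_hi = a & 0b11, (a >> 2) & 0b11     # A1 A0 and A3 A2
    b_lo, b_hi = b & 0b11, (b >> 2) & 0b11     # B1 B0 and B3 B2
    low = vedic_2x2(a_lo, b_lo)                          # weight 2^0
    mid = vedic_2x2(a_hi, b_lo) + vedic_2x2(a_lo, b_hi)  # weight 2^2
    high = vedic_2x2(a_hi, b_hi)                         # weight 2^4
    return low + (mid << 2) + (high << 4)


print(vedic_2x2(3, 3), vedic_4x4(13, 11))
```

The four 2×2 products are independent of each other, so in hardware they can be generated in parallel before the adder stage combines them.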

    SIMULATION RESULTS

    In this section, the performance measurements of all five multipliers using the 10T adder are compared. These results were obtained from Virtuoso simulations (Cadence). Though low power is the objective of our design, we also wanted to measure the delay and area of these circuits, as they are indicators of good performance.

    Power

    The energy consumption of all the multipliers investigated is presented in the tables below for the 45 nm technology size. For all the operand sizes, the SERF adder based multipliers consumed considerably less energy than the CMOS adder based multipliers. In fact, the SERF based multiplier performed at least thirty-two percent better than any CMOS based version. The 10T based 4×4 Bit-Array multiplier proved to have the greatest advantage over its CMOS counterpart, with a sixty percent improvement.

    The power gain of the 10T based multipliers is smaller than that of the SERF based multipliers, so they can be used where pass-transistor logic is used. The power consumed by the Bit Array multiplier is higher than that of the Braun Array and Carry-Save Array multipliers for 4-bit operands. Power comparisons at 0.8 V for the multipliers are shown in Table 6.

    Table 6: Power Consumption Comparison of Multipliers at 0.8 V

    Supply voltage: 0.8 V; multiplier power in µW

    Frequency (MHz)    100      200      250      400      500
    Baugh Wooley       0.852    1.559    1.948    2.985    3.643
    Braun Array        0.783    1.513    1.918    2.87     3.498
    Bit Array          1.082    1.903    2.198    3.121    3.698
    Vedic              1.332    2.337    2.775    4.093    5.172
    CSA                0.736    1.374    1.786    2.768    3.404

    In this table, the Braun Array and Carry-Save Array multipliers are the most efficient at all the frequencies compared.

    Figure 12: Power Consumption Comparison Chart for Multipliers at 0.8 Volts

    Table 7: Power Consumption Comparison of Multipliers at 1 V

    Supply voltage: 1 V; multiplier power in µW

    Frequency (MHz)    100      200      250      400      500
    Baugh Wooley       1.929    3.262    4.016    6.019    7.337
    Braun Array        1.678    3.131    3.925    5.905    7.243
    Bit Array          1.723    3.116    4.036    6.001    7.095
    Vedic              2.797    4.808    5.809    8.551    10.65
    CSA                2.301    3.473    4.002    5.683    6.892

    Figure 13: Power Consumption Comparison Chart for Multipliers at 1 Volt

    Table 8: Power Consumption Comparison of Multipliers at 1.2 V

    Supply voltage: 1.2 V; multiplier power in µW

    Frequency (MHz)    100      200      250      400      500
    Baugh Wooley       3.875    6.368    7.848    11.37    13.20
    Braun Array        2.591    4.54     5.663    8.547    10.36
    Bit Array          6.541    10.53    10.72    13.79    15.28
    Vedic              7.91     10.79    12.36    16.38    19.99
    CSA                4.23     5.286    6.297    9.107    10.07

    Figure 14: Power Comparison Chart of Multipliers at 1.2 Volts

    Delay

    Propagation delay is a measure of the speed performance of a circuit, which matters even when the circuit consumes low power. Table 9 gives the delay performance characteristics of the various multipliers used in our study at the 70 nm technology size. For all the 4×4 multipliers, the delay differences between the adder cells are significant. For 4-bit operands, the delay of 10T adder based multipliers is almost 15-20% less, and for SERF adder based multipliers the delay is approximately 25% less, compared to CMOS 28T adder based multipliers. To further analyze the propagation delay of these circuits at smaller technology nodes, we performed simulations for the 45 nm technology node. The results in Table 9 and Figure 15 indicate that the propagation delay of the multipliers implemented with 10 transistors translates to better performance even at smaller technology node sizes. Even though the timing delay of CSA multipliers is substantially less than that of the other multipliers at the 70 nm technology node, the differences diminish at the 45 nm technology node.

    Table 9: Delay Comparison of Multipliers at Different Voltages

    Delay (ps)

    Supply (V)      0.8               1                 1.2
    Multiplier      Min      Max      Min      Max      Min      Max
    Baugh Wooley    43.17    960.0    22.02    659.6    21.93    657.9
    Braun Array     43.53    625.6    22.25    536.2    16.01    456.4
    Bit Array       58.23    904.6    34.04    795.0    25.87    625.9
    Vedic           29.93    552.7    20.2     426.2    19.33    379.9
    CSA             65.08    502.9    37.52    481.2    27.26    412.0

    Figure 15: Delay Comparison Chart for Multipliers at Different Voltages

    PDP Product

    To implement low-power systems, we can either reduce the power consumed by the circuits or increase the number of computations per unit energy. These two optimizations can be realized only when the design tradeoffs between power and delay are well understood. The optimal setting for the power delay product (PDP) at a particular technology node can be obtained by varying the size of the gates (W/L ratios) and the operating voltage. To understand the best PDP zone for the five multipliers tested, we simulated the multipliers with 45 nm technology MOSFETs in the Virtuoso simulator (Cadence).
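As a worked check of how the PDP figures relate to the power and delay tables, the sketch below multiplies the 0.8 V, 100 MHz power values of Table 6 by the maximum delays of Table 9. It assumes the power values are in microwatts (the unit prefix is missing from the table header) and that the maximum delay is the one used; under those assumptions the products land within rounding of the first column of Table 10.

```python
# Power at 0.8 V and 100 MHz (Table 6), assumed to be in microwatts.
power_uw = {"Baugh Wooley": 0.852, "Braun Array": 0.783, "Bit Array": 1.082,
            "Vedic": 1.332, "CSA": 0.736}
# Maximum delay at 0.8 V in picoseconds (Table 9).
max_delay_ps = {"Baugh Wooley": 960.0, "Braun Array": 625.6, "Bit Array": 904.6,
                "Vedic": 552.7, "CSA": 502.9}


def pdp_femto(p_uw: float, t_ps: float) -> float:
    """Power delay product: 1 uW x 1 ps = 1e-18 W.s, i.e. 1e-3 femto units."""
    return p_uw * t_ps * 1e-3


pdp = {name: pdp_femto(power_uw[name], max_delay_ps[name]) for name in power_uw}
for name, value in sorted(pdp.items(), key=lambda item: item[1]):
    print(f"{name:13s} {value:.3f}")
```

Sorting by PDP rather than by power alone changes the ranking: the Vedic multiplier draws the most power at this operating point but its short delay places it ahead of the Baugh Wooley and Bit Array designs.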

    Table 10: Power Delay Product Comparison of Multipliers at 0.8 V

    Supply voltage: 0.8 V; power delay product in 10⁻¹⁵ W·s (femto; µW × ps × 10⁻³)

    Frequency (MHz)    100      200      250      400      500
    Baugh Wooley       0.817    1.496    1.870    2.865    3.497
    Braun Array        0.489    0.945    1.198    1.793    2.186
    Bit Array          0.978    1.721    1.988    2.823    3.345
    Vedic              0.736    1.291    1.533    2.262    2.858
    CSA                0.370    0.690    0.898    1.392    1.711

    Figure 16: Power Delay Product Comparison Chart for Multipliers at 0.8 V

    Table 11: Power Delay Product Comparison of Multipliers at 1 V

    Supply voltage: 1 V; power delay product in 10⁻¹⁵ W·s (femto; µW × ps × 10⁻³)

    Frequency (MHz)    100      200      250      400      500
    Baugh Wooley       1.272    2.151    2.648    3.970    4.839
    Braun Array        0.899    1.678    2.104    3.166    3.883
    Bit Array          1.369    2.447    3.208    4.770    5.640
    Vedic              1.192    2.049    2.475    3.694    4.539
    CSA                1.107    1.671    1.925    2.734    3.316

    Figure 17: Power Delay Product Comparison Chart for Multipliers at 1 V

    Table 12: Power Delay Product Comparison of Multipliers at 1.2V

    Supply Voltage 1.2 V

    Frequency(MHz) 100 200 250 400 500

    Multiplier Power Delay Product( *p=f W)

    Baugh Wooley 2.549 4.189 5.163 7.480 8.684

    Braun Array 1.182 2.072 2.584 3.900 4.728

    Bit Array 4.094 6.590 6.709 8.631 9.563

    Vedic 3.005 4.099 4.695 6.222 7.594

    CSA 1.742 2.177 2.594 3.752 4.148

    Figure 18: Power Delay Product Comparison Chart for Multipliers at 1.2 V
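The comparison across Tables 10 to 12 can also be made programmatically. The sketch below is not part of the original simulation flow. It simply transcribes the tabulated PDP values and picks the lowest-PDP multiplier at each operating point. The names `FREQS_MHZ`, `PDP`, `best_multiplier` and `winners` are hypothetical helpers introduced for illustration.

```python
# PDP values transcribed from Tables 10-12 (units as given in the tables).
# All names below are illustrative assumptions, not part of the original work.
FREQS_MHZ = [100, 200, 250, 400, 500]

PDP = {
    0.8: {
        "Baugh Wooley": [0.817, 1.496, 1.870, 2.865, 3.497],
        "Braun Array":  [0.489, 0.945, 1.198, 1.793, 2.186],
        "Bit Array":    [0.978, 1.721, 1.988, 2.823, 3.345],
        "Vedic":        [0.736, 1.291, 1.533, 2.262, 2.858],
        "CSA":          [0.370, 0.690, 0.898, 1.392, 1.711],
    },
    1.0: {
        "Baugh Wooley": [1.272, 2.151, 2.648, 3.970, 4.839],
        "Braun Array":  [0.899, 1.678, 2.104, 3.166, 3.883],
        "Bit Array":    [1.369, 2.447, 3.208, 4.770, 5.640],
        "Vedic":        [1.192, 2.049, 2.475, 3.694, 4.539],
        "CSA":          [1.107, 1.671, 1.925, 2.734, 3.316],
    },
    1.2: {
        "Baugh Wooley": [2.549, 4.189, 5.163, 7.480, 8.684],
        "Braun Array":  [1.182, 2.072, 2.584, 3.900, 4.728],
        "Bit Array":    [4.094, 6.590, 6.709, 8.631, 9.563],
        "Vedic":        [3.005, 4.099, 4.695, 6.222, 7.594],
        "CSA":          [1.742, 2.177, 2.594, 3.752, 4.148],
    },
}


def best_multiplier(voltage, freq_mhz):
    """Return (name, pdp) of the lowest-PDP multiplier at one operating point."""
    col = FREQS_MHZ.index(freq_mhz)
    rows = PDP[voltage]
    name = min(rows, key=lambda n: rows[n][col])
    return name, rows[name][col]


def winners(voltage):
    """Map each tabulated frequency to the lowest-PDP multiplier at this voltage."""
    return {f: best_multiplier(voltage, f)[0] for f in FREQS_MHZ}


if __name__ == "__main__":
    for v in sorted(PDP):
        print(f"{v} V:", winners(v))
```

On the tabulated data, this selection returns CSA at every frequency for 0.8 V, Braun Array only at 100 MHz for 1 V, and Braun Array up to 250 MHz with CSA from 400 MHz onwards for 1.2 V.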

    CONCLUSIONS

    In this paper, we have presented the power and speed performance characteristics of five different
    multipliers realized using the 10T adder, which was chosen after comparing the PDP of three adders: 10T,
    SERF and static CMOS 28T. We found that the 4×4 Carry-Save multiplier has a lower PDP than all the other
    multipliers (Baugh Wooley, Bit Array, Vedic and Braun Array) at higher frequencies, whereas at lower
    frequencies the Braun Array multiplier has the lower PDP. Therefore, when the multiplier is to be used at
    lower frequencies the Braun Array multiplier is the efficient choice, while the CSA multiplier is
    efficient from 400 MHz onwards. From the above comparisons, we conclude that, depending on the operating
    frequency, the Braun Array multiplier and the Carry-Save Array multiplier are efficient at low and high
    frequencies respectively.

    REFERENCES

    1. Dhireesha Kudithipudi and Eugene John, Implementation of Low Power Digital Multipliers Using 10 Transistor
    Adder Blocks, Journal of Low Power Electronics, Vol. 1, pp. 1-11, 2005.

    2. J. M. Rabaey et al., Digital Integrated Circuits, Prentice Hall Publications (2003).

    3. J. M. Rabaey, A. Chandrakasan, and B. Nikolic, (Eds.), Digital Integrated Circuits, Prentice Hall Publications
    (2003).

    4. R. Shalem, E. John, and L. K. John, A novel low power energy recovery full adder cell. Proceedings of the Great
    Lakes Symposium.

    5. L. Junming et al., A novel 10-transistor low-power high-speed full adder cell. Proceedings of 6th International
    Conference on Solid-State and Integrated-Circuit Technology (2001), pp. 1155-1158.

    6. Anitha R, Bagyaveereswaran V, Braun's Multiplier Implementation using FPGA with Bypassing Techniques,
    International Journal of VLSI design & Communication Systems (VLSICS), Vol. 2, No. 3, September 2011.

    7. Sowmiya. M, Nirmal Kumar. R, Dr. S. Valarmathy, Karthick. S, Design of Efficient Vedic Multiplier by the
    Analysis of Adders, International Journal of Emerging Technology and Advanced Engineering (www.ijetae.com,
    ISSN 2250-2459, ISO 9001:2008 Certified Journal), Volume 3, Issue 1, January 2013.

    8. R. Shalem et al., A novel low power energy recovery full adder cell. Proceedings of the Great Lakes Symposium
    on VLSI (1999), pp. 380-383.

    9. Muhammad H. Rais, Hardware Implementation of Truncated Multipliers Using Spartan-3AN, Virtex-4 and
    Virtex-5 FPGA Devices, Am. J. Engg. & Applied Sci., 3 (1): 201-206, 2010.

    10. S. Shah, A. J. Al-Khalili, and D. Al-Khalili, Comparison of 32-bit multipliers for various performance measures.
    Proceedings of the 12th International Conference on Microelectronics (2000), pp. 75-80.

    11. I. S. Abu-Khater, A. Bellaouar, and M. I. Elmasry, Circuit techniques for CMOS low power high-performance
    multipliers. IEEE Journal of Solid-State Circuits (1996), Vol. 31, pp. 1535-1546.

    12. G.-K. Ma and F. J. Taylor, Multiplier policies for digital signal processing. IEEE ASSP Magazine (1990), pp. 6-19.

    13. T. K. Callaway and E. E. Swartzlander, Jr., Power delay characteristics of CMOS multipliers. Proceedings of the
    13th International Symposium on Computer Arithmetic (1997), pp. 26-32.

    14. G. Goto, A. Inoue, R. Ohe, S. Kashiwahra, S. Mitarai, T. Tsuru, and T. Izawa, A 4.1-ns compact 54×54 multiplier
    utilizing sign-select booth encoders. IEEE Journal of Solid-State Circuits (1997), Vol. 32, pp. 1676-82.

    15. K. Z. Pekmestzi, Multiplexer-based array multipliers. IEEE Trans. on Computers (1999), Vol. 48, pp. 15-23.

    16. D. Radhakrishnan, Low Voltage CMOS Full Adder Cells. Electronics Letters (1999), Vol. 35, pp. 1792-1794.

    17. L. Junming, S. Yan, L. Zhenghui, and W. Ling, A novel 10-transistor low-power high-speed full adder cell.
    Proceedings of 6th International Conference on Solid-State and Integrated-Circuit Technology (2001),
    pp. 1155-1158.

    18. http://public.itrs.net, International Technology Roadmap for Semiconductors.

    19. H. T. Bui, Y. Wang, and Y. Jiang, Design and analysis of low-power 10-transistor full adders using novel
    XOR-XNOR gates. IEEE Trans. on Circuits and Systems-II: Analog and Digital Signal Processing (2002), Vol. 49,
    pp. 25-30.

    20. A. M. Shams, T. K. Darwish, and M. A. Bayoumi, Performance analysis of low-power 1-bit CMOS full adder cells.
    IEEE Trans. on VLSI Systems (2002), Vol. 10, pp. 20-29.

    AUTHORS DETAILS

    Bynaboina Vamsi Krishna received the B.Tech. degree in Electronics & Communication Engineering from
    Mekhapati Rajamohan Reddy College of Engineering and Technology, Udayagiri, Nellore (Dist.), in 2010. He is
    currently pursuing the M.Tech. in VLSI at Audisankara College of Engineering and Technology, affiliated to
    JNTUA, Gudur, Nellore (Dist.), A.P. - 524101, India.

    K. Dhanunjaya received his B.Tech. degree in Electronics & Communication Engineering from G. Pulla Reddy
    Engineering College, Kurnool, A.P., in 1998, and his M.Tech. in ECE from Jawaharlal Nehru Technological
    University Kakinada in 2001. He is currently doing research in low power VLSI design at Jawaharlal Nehru
    Technological University Anantapur. He has 13 years of teaching experience and is presently working as
    Professor & Head of the Department of ECE, Audisankara College of Engineering and Technology (Autonomous),
    affiliated to JNTUA, Gudur. He is a lifetime member of IETE & ISTE and a member of IEEE.