Final Project Report 2011_new

CHAPTER 1

ADDERS COMPARISON

1.0) INTRODUCTION

Any digital filter can be realized from adders, delay elements, and multipliers, and such filters can in turn be combined to realize larger systems. To increase the speed of a system we therefore need to focus on increasing the speed of addition and multiplication.

In the conventional binary number system, when two numbers are added the carry may propagate all the way from the least significant digit to the most significant digit. The addition time therefore depends on the word length (it is linear in ripple carry adders).

In this chapter we design various adders based on different addition techniques using the Active-HDL software and study their delay and complexity characteristics. Our focus is on adders whose addition time is independent of the word length, such as the RBSD adder and the QSD adder. Carry-free addition in RBSD and QSD arithmetic is achieved by exploiting the redundancy of RBSD and QSD numbers, which allows multiple representations of any integer quantity. Carry-free addition involves two steps. The first step generates an intermediate carry and sum from the addend and augend. The second step combines the intermediate sum of the current digit with the carry from the next lower significant digit.

By designing adders in which the carry does not propagate, we can considerably increase the speed of addition. The response time of a system involving a large number of adders and multipliers can therefore be considerably improved.

1.1) FULL ADDER

1.1.1) Introduction

A full adder is a logical circuit that performs an addition operation on three binary digits. The full adder produces a sum and carry value, which are both binary digits. It can be combined with other full adders (see below) or work on its own. [1]

A full adder adds binary numbers and accounts for values carried in as well as out. A one-bit full adder adds three one-bit numbers, often written as A, B, and Cin; A and B are the operands, and Cin is a bit carried in (in theory from a past addition).

Figure 1.1: Full adder

The delay through a digital circuit is measured in gate-delays, as this allows the delay of a design to be calculated for different devices. AND and OR gates have a nominal delay of 1 gate-delay, and XOR gates have a delay of 2, because they are really made up of a combination of ANDs and ORs.

A full adder block has the following worst case propagation delays:

From A or B to Cout : 4 gate-delays (XOR → AND → OR)

From A or B to S : 4 gate-delays (XOR → XOR)

From Cin to Cout : 2 gate-delays (AND → OR)

From Cin to S : 2 gate-delays (XOR)

The worst-case propagation delay of a 1-bit full adder is therefore 4 gate delays, assuming that both the true and complemented forms of the inputs are available.
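The path delays listed above can be reproduced in simulation with a small gate-level timing model. The following is only a sketch under the stated delay model (AND/OR = 1 gate delay, XOR = 2 gate delays); the entity name fulladder_gd, the generic T, and the internal signal names are illustrative and are not part of the original design.

```vhdl
-- Gate-level timing sketch of a full adder (names are illustrative).
-- Assumed delay model from the text: AND/OR = 1 gate delay, XOR = 2 gate delays.
entity fulladder_gd is
  generic (T : time := 1 ns);  -- duration of one gate delay (assumed value)
  port (
    a, b, cin : in  bit;
    sum, cout : out bit
  );
end fulladder_gd;

architecture dataflow of fulladder_gd is
  signal p  : bit;  -- a xor b
  signal g  : bit;  -- a and b
  signal pc : bit;  -- p and cin
begin
  p    <= a xor b   after 2 * T;  -- XOR: 2 gate delays
  g    <= a and b   after T;      -- AND: 1 gate delay
  pc   <= p and cin after T;      -- AND: 1 gate delay
  sum  <= p xor cin after 2 * T;  -- A/B -> S: 2 + 2 = 4, Cin -> S: 2
  cout <= g or pc   after T;      -- A/B -> Cout: 2 + 1 + 1 = 4, Cin -> Cout: 1 + 1 = 2
end dataflow;
```

Here cout is written as g or (p and cin), which is logically equivalent to the three-term majority expression used in the listing of section 1.1.2.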

1.1.2) VHDL code of full adder

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity fullader is

port(

a, b, cin : in bit;

sum, cout : out bit

);

end fullader;

--}} End of automatically maintained section

architecture fullader of fullader is

begin

-- both outputs are modelled with a single lumped delay of 4 ns

sum <= a xor b xor cin after 4 ns;

cout <= (a and b) or (b and cin) or (a and cin) after 4 ns;

end fullader;

1.2) RIPPLE CARRY ADDER

1.2.1) Introduction

It is possible to create a logical circuit using multiple full adders to add N-bit numbers. Each full adder inputs a Cin, which is the Cout of the previous adder. This kind of adder is a ripple carry adder, since each carry bit "ripples" to the next full adder. [2]

Figure 1.2: Ripple Carry Adder

Because the carry-out of one stage is the next's input, the worst case propagation delay is then:

4 gate-delays from generating the first carry signal (A0/B0 → C1).

2 gate-delays per intermediate stage (Ci → Ci+1).

2 gate-delays at the last stage to produce both the sum and carry-out outputs (Cn-1 → Cn and Sn-1).

So for an n-bit adder, we have a total propagation delay, tp of:

tp = 4 + 2(n − 2) + 2 = 2n + 2 (1.1)

This is linear in n; a 32-bit addition would take 66 gate delays to complete. This is rather slow and restricts the practical word length of the device, so we would like to find ways to speed it up.

1.2.2) VHDL code of ripple carry adder

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity ripplecarry is

port( a, b: in bit_vector(3 downto 0); ci: in bit;

s: out bit_vector(3 downto 0); co: out bit

);

end ripplecarry;

--}} End of automatically maintained section

architecture ripplecarry of ripplecarry is

component fullader

port (a, b, cin: in bit;

cout, sum: out bit);

end component;

signal c: bit_vector(3 downto 1);

begin

fa0: fullader port map (a(0), b(0), ci, c(1), s(0));

fa1: fullader port map (a(1), b(1), c(1), c(2), s(1));

fa2: fullader port map (a(2), b(2), c(2), c(3), s(2));

fa3: fullader port map (a(3), b(3), c(3), co, s(3));

end ripplecarry;
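The 4-bit listing above fixes the word length. As a sketch of how the same structure scales to n bits, the adder can be written with a generic and a generate statement. The entity name ripplecarry_n and the generic N are assumptions introduced here for illustration; the full adder equations are written inline so that the block stands on its own.

```vhdl
-- Generic N-bit ripple carry adder (sketch; entity name and generic are illustrative).
entity ripplecarry_n is
  generic (N : positive := 8);  -- word length
  port (
    a, b : in  bit_vector(N-1 downto 0);
    ci   : in  bit;
    s    : out bit_vector(N-1 downto 0);
    co   : out bit
  );
end ripplecarry_n;

architecture dataflow of ripplecarry_n is
  signal c : bit_vector(N downto 0);  -- c(i) is the carry into stage i
begin
  c(0) <= ci;

  stages : for i in 0 to N-1 generate
    -- One full adder per bit; c(i+1) cannot settle until c(i) has settled,
    -- which is the source of the linear delay.
    s(i)     <= a(i) xor b(i) xor c(i);
    c(i + 1) <= (a(i) and b(i)) or (b(i) and c(i)) or (a(i) and c(i));
  end generate stages;

  co <= c(N);
end dataflow;
```

Because every stage waits for the carry of the stage below it, increasing N lengthens the critical path in direct proportion.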

Figure 1.3: VHDL simulation of ripple carry adder

1.3) CARRY LOOK AHEAD ADDER [2]

1.3.1) Introduction

The generate function, Gi, indicates whether that stage produces a carry-out signal Ci+1 even when no carry-in signal is present. This occurs if both addends contain a 1 in that bit:

Gi = Ai . Bi (1.2)

The propagate function, Pi, indicates whether a carry-in to the stage is passed on to the carry-out of the stage. This occurs if either of the addends has a 1 in that bit:

Pi = Ai + Bi (1.3)

Note that both these values can be calculated from the inputs in a constant time of a single gate delay. Now, the carry-out from a stage occurs if that stage generates a carry (Gi = 1) or there is a carry-in and the stage propagates the carry (Pi·Ci = 1):

Ci+1 = AiBi + AiCi + Bi Ci (1.4)

Ci+1 = AiBi + (Ai + Bi) Ci (1.5)

Ci+1 = Gi + Pi Ci (1.6)

Ci+1 = Gi + Pi (Gi-1 + Pi-1 Ci-1) (1.7)

Ci+1 = Gi + Pi Gi-1 + Pi Pi-1(Gi-2 + Pi-2 Ci-2) (1.8)

⋮

Ci+1 = Gi + Pi Gi-1 + Pi Pi-1 Gi-2 + Pi Pi-1 Pi-2 Gi-3 + … + Pi Pi-1 … P1 G0 + Pi Pi-1 … P1 P0 C0 (1.9)

Note that this does not require the carry-out signals from the previous stages, so we don't have to wait for changes to ripple through the circuit. In fact, a given stage's carry signal can be computed once the propagate and generate signals are ready with only two more gate delays (one AND and one OR). Thus the carry-out for a given stage can be calculated in constant time, and therefore so can the sum.

Operation | Required data | Gate delays
Produce stage generate and propagate signals | Addends (a and b) | 1
Produce stage carry-out signals, C1 to Cn | P and G signals, and C0 | 2
Produce sum result, S | Carry signals and addends | 3
Total | | 6
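As a worked instance of equation (1.9), assuming a 4-bit block with carry-in C0, expanding equation (1.6) stage by stage gives:

```latex
\begin{aligned}
C_1 &= G_0 + P_0 C_0 \\
C_2 &= G_1 + P_1 G_0 + P_1 P_0 C_0 \\
C_3 &= G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 \\
C_4 &= G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0
\end{aligned}
```

Each carry is a two-level AND-OR function of the P and G signals and C0, which is why all carries become available two gate delays after P and G. The number of product terms, and the widest AND gate, grow with the bit position, which accounts for the rapid growth in hardware.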

Figure 1.4: Carry Look Ahead adder

A basic carry-lookahead adder is very fast but has the disadvantage that it takes a very large amount of logic hardware to implement. In fact, the amount of hardware needed is approximately quadratic with n, and begins to get very complicated for n greater than 4.

Due to this, most CLAs are constructed out of "blocks" comprising 4-bit CLAs, which are in turn cascaded to produce a larger CLA.

1.3.2) VHDL code for carry look ahead adder

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity claadder is

port (x,y : in bit_vector (3 downto 0); cin : in bit;

s : out bit_vector (3 downto 0); cout,gout,pout : out bit);

end claadder;

architecture claadder of claadder is

component gpfulladder

port (a,b,cin : in bit;

g,p,so : out bit);

end component;

component clalogic

port (g,p : in bit_vector (3 downto 0); ci : in bit;

c : out bit_vector (3 downto 1) ; co,go,po : out bit);

end component;

signal g,p : bit_vector (3 downto 0);

signal c : bit_vector (3 downto 1);

begin

carrylogic : clalogic port map (g,p,cin,c,cout,gout,pout);

gpfa0 : gpfulladder port map ( x(0),y(0),cin,g(0),p(0),s(0));

gpfa1 : gpfulladder port map (x(1),y(1),c(1),g(1),p(1),s(1));

gpfa2 : gpfulladder port map (x(2),y(2),c(2),g(2),p(2),s(2));

gpfa3 : gpfulladder port map (x(3),y(3),c(3),g(3),p(3),s(3));

end claadder;

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity gpfulladder is

port (a,b,cin : in bit;

g,p,so : out bit);

end gpfulladder;

--}} End of automatically maintained section

architecture gpfulladder of gpfulladder is

signal p_int : bit;

begin

g <= a and b;

p <= p_int;

p_int <= a xor b;

so <= p_int xor cin;

-- enter your statements here --

end gpfulladder;

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity clalogic is

port (g,p : in bit_vector (3 downto 0); ci : in bit;

c : out bit_vector (3 downto 1) ; co,go,po : out bit);

end clalogic;

--}} End of automatically maintained section

architecture clalogic of clalogic is

signal go_int,po_int : bit;

begin

c(1) <= g(0) or (p(0) and ci);

c(2) <= g(1) or (p(1) and g(0)) or (p(1) and p(0) and ci);

c(3) <= g(2) or (p(2) and g(1)) or (p(2) and p(1) and g(0)) or (p(2) and p(1) and p(0) and ci);

po_int <= p(3) and p(2) and p(1) and p(0);

go_int <= g(3) or (p(3) and g(2)) or (p(3) and p(2) and g(1)) or (p(3) and p(2) and p(1) and g(0));

co <= go_int or (po_int and ci);

po <= po_int;

go <= go_int;

-- enter your statements here --

end clalogic;

Figure 1.5: VHDL simulation of Carry Look Ahead adder

1.4) REDUNDANT BINARY SIGNED ADDER [3]

1.4.1) Introduction

In a redundant binary signed digit (RBSD) number system, a "carry-free" addition can be performed, where the term "carry-free" means that the carry propagation is limited to a single digit position. In other words, the carry propagation length is fixed irrespective of the word length. The addition consists of two steps. In the first step, an intermediate sum si and a carry ci are generated from the operand digits xi and yi at each digit position i. This is done in parallel for all digit positions. In the second step, the summation zi = si + ci-1 is carried out to produce the final sum digit zi. The important point is that it is always possible to select the intermediate sum si and carry ci-1 such that the summation in the second step does not generate a carry. Hence, the second step can also be executed in parallel for all digit positions, yielding a fixed addition time that is independent of the word length.

Figure 1.6 shows an example of an 8-bit redundant binary addition. In the figure, X and Y are n-digit redundant binary integers, and I-Sum and I-Cin are the intermediate sum and carry-in. The final sum (F-Sum) is obtained by adding I-Sum and I-Cin. Note that no carry is generated in the addition of I-Sum and I-Cin, which satisfies the carry-free condition, and that the LSB of I-Cin is set to logic zero.

Figure 1.6: Signed addition

(1.10)

Figure 1.7: Signed adder cell [4]

If the delay of a NAND or NOR gate is taken to be to, then the delay of the circuit becomes

Tdelay = to + 2to + 2to + to + to = 7to (1.11)

1.4.2) Rules For Redundant Binary Addition

Type | Augend digit, addend digit (xi, yi) | Digits at the next lower order position (xi-1, yi-1) | Intermediate carry (ci) | Intermediate sum (si)
1 | (1, 1) | ------------------ | 1 | 0
2 | (1, 0) or (0, 1) | Both are non-negative | 1 | -1
2 | (1, 0) or (0, 1) | Otherwise | 0 | 1
3 | (0, 0) | ------------------ | 0 | 0
4 | (1, -1) or (-1, 1) | ------------------ | 0 | 0
5 | (0, -1) or (-1, 0) | Both are non-negative | 0 | -1
5 | (0, -1) or (-1, 0) | Otherwise | -1 | 1
6 | (-1, -1) | ------------------ | -1 | 0

Figure 1.8: Rules table for intermediate carry and intermediate sum

The addition of two signed digits takes place in two steps. In the first step, the intermediate carry and intermediate sum are read from the above table; in the second step, the intermediate sum and intermediate carry are added to obtain the final sum. The table is designed so that adding the intermediate sum digit and the intermediate carry digit never produces a carry.
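A small worked example may help to see the two steps in action. The operands below are chosen here purely for illustration: X = (1, -1, 0, 1) = 5 and Y = (0, 1, -1, 1) = 3, with digit positions numbered from 0 at the least significant end and the carry into position 0 taken as 0.

```latex
\begin{array}{l|rrrrr}
\text{position } i   & 4 & 3 & 2  & 1  & 0 \\ \hline
x_i                  &   & 1 & -1 & 0  & 1 \\
y_i                  &   & 0 & 1  & -1 & 1 \\ \hline
s_i                  &   & 1 & 0  & -1 & 0 \\
c_{i-1}              & 0 & 0 & 0  & 1  & 0 \\ \hline
z_i = s_i + c_{i-1}  & 0 & 1 & 0  & 0  & 0
\end{array}
```

Position 0 is a type 1 entry (carry 1, sum 0). Position 1 is type 5 with both lower digits non-negative (carry 0, sum -1). Position 2 is type 4 (carry 0, sum 0). Position 3 is type 2 with a negative lower digit, so the "otherwise" row applies (carry 0, sum 1). The result (0, 1, 0, 0, 0) equals 8, and no second-step addition produces a carry.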

Figure 1.9: Steps of RBSD addition

1.4.3) VHDL code of RBSD adder

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity rbsdadder is

port (a,b,c,d,e,f,g,h : in bit;

c2,c1,s2,s1 : out bit);

end rbsdadder;

--}} End of automatically maintained section

architecture rbsdadder of rbsdadder is

begin

-- c2: first intermediate carry output
c2 <= (e and f and ((a and (not d)) or ((not b) and c)))
   or (a and f and ((b and c) or ((not d) and g)))
   or (g and (((not b) and c and f) or (a and (not d) and (not f))))
   or (c and (((not b) and (not f) and g) or (a and b and (not h))))
   or (a and b and c and (not f));

-- c1: second intermediate carry output
c1 <= (e and f and ((a and (not d)) or ((not b) and c)))
   or (a and f and ((b and c) or ((not d) and g)))
   or (g and (((not b) and c and f) or (a and (not d) and (not f))))
   or (c and (((not b) and (not f) and g) or (a and b and (not h))))
   or (b and ((a and c and (not f)) or ((not a) and (not c) and d and f)))
   or ((not a) and b and (not c) and (not h) and (d or (not e)))
   or ((not a) and b and (not c) and (not f) and (d or (not g)))
   or ((not b) and (not c) and d and (((not e) and (not h)) or ((not f) and (not g))))
   or ((not e) and f and (not g) and (((not a) and b and (not d)) or ((not b) and (not c) and d)));

-- s2: first sum output
s2 <= (b and (not d) and (((not e) and (not h)) or ((not f) and (not g))))
   or ((not b) and d and (((not e) and (not h)) or ((not f) and (not g))))
   or ((not e) and f and (not g) and (b xor d));

-- s1: second sum output
s1 <= (f and (b xor d)) or (b and (not d) and ((not h) or (not f)))
   or ((not b) and d and ((not h) or (not f)));

end rbsdadder;

Figure 1.10: Waveform of RBSD adder cell

Figure 1.11: VHDL simulation of RBSD adder cell

1.5) HYBRID SIGNED DIGIT ADDER [5]

1.5.1) Introduction

Here, instead of insisting that every digit be a signed digit, we let some of the digits be signed and leave the others unsigned. For example, every alternate or every third or fourth digit can be signed; all the remaining ones are unsigned. We refer to this representation as a Hybrid Signed-Digit (HSD) representation. In the following, we show that such a representation can limit the maximum length of carry propagation chains to any desired value. In particular, we prove that the maximum length of a carry propagation chain equals (d + 1), where d is the longest distance between neighboring signed digits.

Figure 1.12: Signed and unsigned adder cells (unsigned digit position and signed digit position) [5]

In HSD with d = 1, where d is the distance between signed digit positions, the delay is:

(1.12)

Here, the two delays of 1.5 units in parentheses are due to the two complex gates in the lower-order signed digit cell. The last 1.5 units of delay (shown within the square brackets) are associated with the XNOR gate at the higher-order signed digit, where the carry propagation terminates. The terms in between are proportional to d, since the carry ripples through all the unsigned digit positions.

Figure 1.13: Critical path delay vs. distance between signed digits [5]

Figure 1.14: Transistor count vs. distance between signed digits [5]

Figure 1.15: Transistor count × delay vs. distance between signed digits [5]

1.5.2) VHDL code of unsigned position adder cell

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity unsigned_new is

port(

ai_1 : in STD_LOGIC;

bi_1 : in STD_LOGIC;

vi_2 : in STD_LOGIC;

wi_2 : in STD_LOGIC;

vi_1 : out STD_LOGIC;

wi_1 : out STD_LOGIC;

ei_1 : out STD_LOGIC

);

end unsigned_new;

architecture unsigned_new of unsigned_new is

component nor2

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

component and2

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

component not1

port(

a : in STD_LOGIC;

y : out STD_LOGIC );

end component;

component xnor2

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

component xor2

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

component or2

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

signal s1,s2,s3,s4,s5,s6,s7: STD_LOGIC;

begin

N1: not1 port map (wi_2,s1);

N2: xnor2 port map (ai_1,bi_1,s2);

N3: and2 port map (vi_2,s1,s3);

N4: or2 port map (s1,vi_2,s4);

N5: and2 port map (s2,s4,s5);

N6: or2 port map (s3,s5,vi_1);

N7: nor2 port map (ai_1,bi_1,wi_1);

N8: xor2 port map (vi_2,wi_2,s6);

N9: xor2 port map (ai_1,bi_1,s7);

N10: xor2 port map (s7,s6,ei_1);

end unsigned_new;

Figure 1.16: Waveform of HSD unsigned position adder cell

Figure 1.17: VHDL simulation of HSD unsigned position adder cell

1.5.3) VHDL code of signed position adder cell

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity signedaddercell is

port(

xis_c : in STD_LOGIC;

yis_c : in STD_LOGIC;

xia : in STD_LOGIC;

yia : in STD_LOGIC;

vi_1_c : in STD_LOGIC;

wi_1 : in STD_LOGIC;

vi : out STD_LOGIC;

wi : out STD_LOGIC;

zia : out STD_LOGIC;

zis_c : out STD_LOGIC

);

end signedaddercell;

--}} End of automatically maintained section

architecture signedaddercell of signedaddercell is

component nor2

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

component and2

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

component xnor2

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

component nand2

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

component xor2

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

component nor3

port(

a : in STD_LOGIC;

b : in STD_LOGIC;

c : in STD_LOGIC;

y : out STD_LOGIC

);

end component;

signal s1,s2,s3,s4,s5,s6: STD_LOGIC;

begin

N1: nand2 port map (xis_c,yis_c,wi);

N2: nor2 port map (xis_c,yis_c,s1);

N3: nor2 port map (xia,yia,s2);

N4: xor2 port map (xia,yia,s3);

N5: and2 port map (s3,wi_1,s5);

N6: xor2 port map (wi_1,s3,s6);

N7: nand2 port map (s6,vi_1_c,zis_c);

N8: xnor2 port map (vi_1_c,s6,zia);

N9: nor3 port map (s1,s2,s5,vi);

-- enter your statements here --

end signedaddercell;

Figure 1.18: Waveform of HSD signed position adder cell

Figure 1.19: VHDL simulation of HSD signed position adder cell

1.6) QUATERNARY SIGNED DIGIT NUMBERS

1.6.1) Introduction

QSD numbers are represented using 3-bit 2's complement notation. Each number can be represented by:

D = Σ xi · 4^i, for i = 0 to n-1 (1.13)

where xi can be any value from the set {-3, -2, -1, 0, 1, 2, 3}, chosen to produce the required decimal value D. In a digital implementation, additions on a large number of digits, such as 64, 128, or more, can be performed with constant delay. High-speed and area-efficient adders and multipliers can be implemented using this technique.

1.6.2) QSD ADDER

We can achieve carry-free addition by exploiting the redundancy of QSD numbers. The redundancy allows multiple representations of any integer quantity. There are two steps involved in the carry-free addition. The first step generates an intermediate carry and sum from the addend and augend. The second step combines the intermediate sum of the current digit with the carry from the next lower significant digit.

To prevent the carry from rippling further, we define two rules.

1) The magnitude of the intermediate sum must be less than or equal to 2, that is, the intermediate sum lies between -2 and 2.

2) The magnitude of the intermediate carry must be less than or equal to 1, that is, the carry lies between -1 and 1.

Consequently, the magnitude of the second-step output cannot be greater than 3, which can be represented by a single-digit QSD number; hence no further carry is required. In step 1, all possible input pairs of the addend and augend are considered. The output ranges from -6 to 6, as shown in Figure 1.20.

Figure 1.20: QSD representation

Both inputs and outputs can be encoded as 3-bit 2's complement binary numbers. The mapping between the inputs (addend and augend) and the outputs (intermediate carry and sum) is shown in binary format. Since the intermediate carry always lies between -1 and 1, it requires only a 2-bit binary representation. Finally, five 6-variable Boolean expressions can be extracted. The intermediate carry and sum circuit is shown in Figure 1.21.

Figure 1.21: The intermediate carry and sum generator

Figure 1.22: The second step QSD adder

In step 2, the intermediate carry from the lower significant digit is added to the sum of the current digit to produce the final result. The addition in this step produces no carry because the current digit can always absorb the carry-in from the lower digit.
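The two steps can be sketched as a behavioural model of one digit position. This is not the gate-level design of this report: the entity name qsd_digit_model, the use of integer ports, and the thresholds used to pick the intermediate carry (carry 1 when the digit sum is 3 or more, carry -1 when it is -3 or less) are assumptions made here; they are one choice that satisfies the two rules above.

```vhdl
-- Behavioural sketch of one QSD digit position (illustrative, not the gate-level design).
entity qsd_digit_model is
  port (
    a, b : in  integer range -3 to 3;  -- addend and augend digits
    cin  : in  integer range -1 to 1;  -- intermediate carry from the lower digit
    cout : out integer range -1 to 1;  -- intermediate carry to the higher digit
    z    : out integer range -3 to 3   -- final sum digit
  );
end qsd_digit_model;

architecture behavioural of qsd_digit_model is
begin
  process (a, b, cin)
    variable t : integer range -6 to 6;  -- digit sum before splitting
    variable c : integer range -1 to 1;  -- intermediate carry
    variable s : integer range -2 to 2;  -- intermediate sum
  begin
    -- Step 1: split a + b into carry and sum so that t = 4*c + s, |s| <= 2, |c| <= 1.
    t := a + b;
    if t >= 3 then
      c := 1;
      s := t - 4;
    elsif t <= -3 then
      c := -1;
      s := t + 4;
    else
      c := 0;
      s := t;
    end if;
    cout <= c;      -- depends only on a and b, so no carry chain is formed

    -- Step 2: absorb the lower digit's carry; |s + cin| <= 3, so no new carry.
    z <= s + cin;
  end process;
end behavioural;
```

Because cout is computed from a and b alone, connecting N such cells side by side never creates a path longer than two cells, which is the property the text relies on for constant delay.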

Figure 1.23: N-digit QSD adder

By using N cells in parallel we can make an N-digit adder. The delay of this N-digit adder is constant and is equal to the delay of a single-digit adder.

Table 1.1: The mapping between the inputs and outputs of the intermediate carry and sum

1.6.3) VHDL code of QSD adder

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity qsdadder is

port (a0,a1,a2,b0,b1,b2 : in bit;

c0,c1,s0,s1,s2 : out bit);

end qsdadder;

architecture qsdadder of qsdadder is

begin

c1 <= (a2 and b2 and(not b1)) or (a2 and (not a1) and b2) or (a2 and b2 and (not b0))or (a2 and (not a0) and b2) or (b2 and (not a1) and (not a0) and (not b1)) or (a2 and (not a1) and (not b1) and (not b0));

c0 <= (a2 and b2 and (not b1)) or (a2 and (not a1) and b2) or (a2 and b2 and (not b0)) or (a2 and (not a0) and b2) or ((not a1) and (not a0) and b2 and (not b1))or ((not a2)and a1 and (not b2) and b1) or ((not a2) and a0 and (not b2) and b1)or((not a2) and (not b2) and b1 and b0) or ((not a2) and a1 and (not b2) and b0)or (a2 and (not a1) and (not b1) and (not b0)) or ((not a2) and a1 and a0 and (not b2));

s2 <= ((not a1) and b2 and b0) or (a2 and (not a0) and (not b1)) or ((not a1) and a0 and b2 and (not b1)) or ((not a1) and (not a0) and b2 and b1) or ((not a1) and a0 and b1 and (not b0)) or ((not a1) and (not a0) and b1 and b0)or (a2 and (not a1 ) and (not b1) and b0) or (a1 and (not a0) and (not b1) and b0)or (a2 and a1 and (not b1) and (not b0)) or (a1 and a0 and (not b1 ) and (not b0))or (a2 and a1 and a0 and b2 and b1 and b0);

s1 <= ((not a1) and b1 and (not b0)) or ((not a1) and (not a0) and b1 )or (a1 and (not a0) and (not b1)) or (a1 and (not b1) and (not b0))or ( a1 and a0 and b1 and b0) or ((not a1) and a0 and (not b1) and b0);

s0 <= (a0 and (not b0)) or ((not a0) and b1 and b0) or ((not a2) and (not a0) and b0)or ((not a0 ) and (not b2) and b0);

end qsdadder;

Figure 1.24: Waveform of QSD adder cell

Figure 1.25: VHDL simulation of QSD adder cell

Simulation in Xilinx gives a delay of 13.931 ns for the QSD adder.

1.6.4) Single Digit QSD Multiplier

There are generally two methods for a multiplication operation: parallel and iterative. QSD multiplication can be implemented in both ways, requiring a QSD partial product generator and a QSD adder as basic components. A partial product Mi is the result of multiplying an n-digit input, An-1 … A0, by a single-digit input Bi, where i = 0 … n-1.

The primitive component of the partial product generator is a single-digit multiplication unit. The single-digit multiplication produces M as a result and C as a carry to be combined with the M of the next digit. The range of the output is from -9 to 9, which can be represented with M and C in QSD form. The values of M and C should lie between -2 and 2.

The mapping between inputs A (Multiplicand) and B (Multiplier) and the outputs M and C is shown in the Table 1.2.
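The decomposition behind the mapping can be written out for a few products. The symbol P for the decimal product is introduced here for illustration; C carries a weight of 4 because the digits are quaternary.

```latex
\begin{aligned}
P &= A \times B = 4C + M, \qquad |C| \le 2,\ |M| \le 2 \\
3 \times 3 &= 9 = 4(2) + 1 &&\Rightarrow\ C = 2,\ M = 1 \\
3 \times 1 &= 3 = 4(1) + (-1) &&\Rightarrow\ C = 1,\ M = -1 \\
2 \times (-3) &= -6 = 4(-1) + (-2) &&\Rightarrow\ C = -1,\ M = -2 \\
(-3) \times 3 &= -9 = 4(-2) + (-1) &&\Rightarrow\ C = -2,\ M = -1
\end{aligned}
```

Keeping M within -2 to 2 leaves room for M to absorb a carry of magnitude 1 in a later addition without leaving the QSD digit range.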

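Table 1.2: The mapping between multiplicand and multiplier

Input QSD A | Input QSD B | Input binary A | Input binary B | Decimal product | Output QSD C | Output QSD M | Output binary C | Output binary M
3 | 3 | 011 | 011 | 9 | 2 | 1 | 010 | 001
-3 | -3 | 101 | 101 | 9 | 2 | 1 | 010 | 001
3 | 2 | 011 | 010 | 6 | 1 | 2 | 001 | 010
2 | 3 | 010 | 011 | 6 | 1 | 2 | 001 | 010
-3 | -2 | 101 | 110 | 6 | 1 | 2 | 001 | 010
-2 | -3 | 110 | 101 | 6 | 1 | 2 | 001 | 010
2 | 2 | 010 | 010 | 4 | 1 | 0 | 001 | 000
-2 | -2 | 110 | 110 | 4 | 1 | 0 | 001 | 000
3 | 1 | 011 | 001 | 3 | 1 | -1 | 001 | 111
-3 | -1 | 101 | 111 | 3 | 1 | -1 | 001 | 111
1 | 3 | 001 | 011 | 3 | 1 | -1 | 001 | 111
-1 | -3 | 111 | 101 | 3 | 1 | -1 | 001 | 111
2 | 1 | 010 | 001 | 2 | 0 | 2 | 000 | 010
-2 | -1 | 110 | 111 | 2 | 0 | 2 | 000 | 010
1 | 2 | 001 | 010 | 2 | 0 | 2 | 000 | 010
-1 | -2 | 111 | 110 | 2 | 0 | 2 | 000 | 010
1 | 1 | 001 | 001 | 1 | 0 | 1 | 000 | 001
-1 | -1 | 111 | 111 | 1 | 0 | 1 | 000 | 001
3 | 0 | 011 | 000 | 0 | 0 | 0 | 000 | 000
2 | 0 | 010 | 000 | 0 | 0 | 0 | 000 | 000
1 | 0 | 001 | 000 | 0 | 0 | 0 | 000 | 000
0 | 1 | 000 | 001 | 0 | 0 | 0 | 000 | 000
0 | 2 | 000 | 010 | 0 | 0 | 0 | 000 | 000
0 | 3 | 000 | 011 | 0 | 0 | 0 | 000 | 000
0 | 0 | 000 | 000 | 0 | 0 | 0 | 000 | 000
-3 | 0 | 101 | 000 | 0 | 0 | 0 | 000 | 000
-2 | 0 | 110 | 000 | 0 | 0 | 0 | 000 | 000
-1 | 0 | 111 | 000 | 0 | 0 | 0 | 000 | 000
0 | -1 | 000 | 111 | 0 | 0 | 0 | 000 | 000
0 | -2 | 000 | 110 | 0 | 0 | 0 | 000 | 000
0 | -3 | 000 | 101 | 0 | 0 | 0 | 000 | 000
1 | -1 | 001 | 111 | -1 | 0 | -1 | 000 | 111
-1 | 1 | 111 | 001 | -1 | 0 | -1 | 000 | 111
2 | -1 | 010 | 111 | -2 | 0 | -2 | 000 | 110
-1 | 2 | 111 | 010 | -2 | 0 | -2 | 000 | 110
1 | -2 | 001 | 110 | -2 | 0 | -2 | 000 | 110
-2 | 1 | 110 | 001 | -2 | 0 | -2 | 000 | 110
3 | -1 | 011 | 111 | -3 | -1 | 1 | 111 | 001
-1 | 3 | 111 | 011 | -3 | -1 | 1 | 111 | 001
-3 | 1 | 101 | 001 | -3 | -1 | 1 | 111 | 001
1 | -3 | 001 | 101 | -3 | -1 | 1 | 111 | 001
2 | -2 | 010 | 110 | -4 | -1 | 0 | 111 | 000
-2 | 2 | 110 | 010 | -4 | -1 | 0 | 111 | 000
3 | -2 | 011 | 110 | -6 | -1 | -2 | 111 | 110
-2 | 3 | 110 | 011 | -6 | -1 | -2 | 111 | 110
-3 | 2 | 101 | 010 | -6 | -1 | -2 | 111 | 110
2 | -3 | 010 | 101 | -6 | -1 | -2 | 111 | 110
3 | -3 | 011 | 101 | -9 | -2 | -1 | 110 | 111
-3 | 3 | 101 | 011 | -9 | -2 | -1 | 110 | 111

1.6.5) VHDL code for single digit multiplier

library IEEE;

use IEEE.STD_LOGIC_1164.all;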

entity QSD_SINGLE_DIGIT_MULT is

port(

a2 : in STD_LOGIC;

a1 : in STD_LOGIC;

a0 : in STD_LOGIC;

b2 : in STD_LOGIC;

b1 : in STD_LOGIC;

b0 : in STD_LOGIC;

c2 : inout STD_LOGIC;

c1 : inout STD_LOGIC;

c0 : inout STD_LOGIC;

m2 : inout STD_LOGIC;

m1 : inout STD_LOGIC;

m0 : inout STD_LOGIC

);

end QSD_SINGLE_DIGIT_MULT;

--}} End of automatically maintained section

architecture QSD_SINGLE_DIGIT_MULT of QSD_SINGLE_DIGIT_MULT is

begin

c2<= (a2 and(not b2)and b0 and((not b1)nand a1)) or (a2 and(not b2)and b1 and(a1 nand a0)) or ((not a2)and a0 and b2 and ((not a1)nand b1)) or ((not a2) and a1 and b2 and(b1 nand b0));

c1<= c2 or ( a2 and(not a1)and b2 and(not b1)) or (a1 and a0 and (not b2)and b1 and b0);

c0<= (a1 and(a0 nor b2)and b1) or (a1 and b1 and(a2 nor b0)) or (a2 and b2 and(a1 xor b1)) or ( a2 and b2 and(a0 nor b0)) or (a2 and b1 and(a1 nor b0)) or (a1 and b2 and(a0 nor b1)) or ((a1 nor b1)and a2 and(not b2)and b0) or ((a1 nor b1)and(not a2)and a0 and b2) or (a2 and a1 and(not b2)and b1 and b0) or ((not a2)and a1 and a0 and b2 and b1) or ((a2 nor b2)and a0 and b0 and(a1 xor b1));

m2<= (a2 and b1 and(a1 nor b2)) or (a1 and b2 and(a2 nor b1)) or (a1 and a0 and b2 and(not b1)) or(a2 and(not a1)and b1 and b0) or (a0 and b2 and(a2 nor b0)) or (a0 and b0 and(a1 xor b1)) or (a2 and b0 and(a0 nor b2)) or (a2 and a0 and b1 and(b2 nor b0)) or (a1 and b2 and b0 and(a2 nor a0));

m1<= (a0 and b1 and(a1 nand b0)) or (a1 and b0 and(b1 nand a0));

m0<= a0 and b0;

-- enter your statements here –

end QSD_SINGLE_DIGIT_MULT;

Figure 1.26: Single digit QSD multiplier

Simulation of the QSD single digit multiplier in Xilinx gives a delay of 11.348 ns.

1.7) COMPARATIVE RESULT OF DIFFERENT ADDERS

Figure 1.27: Delay vs. Number of bits for addition for different adding schemes

Figure 1.28: complexity vs. number of bits for addition of different adding schemes

CHAPTER 2

ADAPTIVE FILTER [6]

2.1) INTRODUCTION

An adaptive filter is a filter that self-adjusts its transfer function according to an optimization algorithm driven by an error signal. Because of the complexity of the optimization algorithms, most adaptive filters are digital filters. By way of contrast, a non-adaptive filter has a static transfer function. Adaptive filters are required for some applications because some parameters of the desired processing operation (for instance, the locations of reflective surfaces in a reverberant space) are not known in advance. The adaptive filter uses feedback in the form of an error signal to refine its transfer function to match the changing parameters.

Generally speaking, the adaptive process involves the use of a cost function, which is a criterion for optimum performance of the filter, to feed an algorithm, which determines how to modify filter transfer function to minimize the cost on the next iteration.

As the power of digital signal processors has increased, adaptive filters have become much more common and are now routinely used in devices such as mobile phones and other communication devices, camcorders and digital cameras, and medical monitoring equipment.

[Chart data for Figures 1.27 and 1.28: the series are ripple carry addition, carry look ahead addition, redundant binary addition, hybrid signed addition, and quaternary signed digit addition, each plotted for 2-bit, 4-bit, and 8-bit operands.]

The block diagram, shown in the following figure, serves as a foundation for particular adaptive filter realizations, such as Least Mean Squares (LMS) and Recursive Least Squares (RLS). The idea behind the block diagram is that a variable filter extracts an estimate of the desired signal.

Figure 2.1: Adaptive filter

To start the discussion of the block diagram we take the following assumptions:

* The input signal is the sum of a desired signal d(n) and interfering noise v(n)

x(n) = d(n) + v(n) (2.1)

* The variable filter has a Finite Impulse Response (FIR) structure. For such structures the impulse response is equal to the filter coefficients. The coefficients for a filter of order p are defined as

wn = [wn(0), wn(1), …, wn(p)]^T (2.2)

* The error signal or cost function is the difference between the desired and the estimated signal

e(n) = d(n) − d̂(n) (2.3)

The variable filter estimates the desired signal by convolving the input signal with the impulse response. In vector notation this is expressed as

d̂(n) = wn * x(n) (2.4)

where

x(n) = [x(n), x(n-1), …, x(n-p)]^T (2.5)

is an input signal vector. Moreover, the variable filter updates the filter coefficients at every time instant

wn+1 = wn+ ∆wn (2.6)

where ∆wn is a correction factor for the filter coefficients. The adaptive algorithm generates this correction factor based on the input and error signals. LMS and RLS define two different coefficient update algorithms.

2.2) LEAST MEAN SQUARE ADAPTIVE FILTER [6]

2.2.1) Introduction

Adaptive algorithms are a mainstay of Digital Signal Processing (DSP). They are used in a variety of applications including acoustic echo cancellation, radar guidance systems, and wireless channel estimation, among many others.

An adaptive algorithm is used to estimate a time-varying signal. There are many adaptive algorithms, such as Recursive Least Squares (RLS) and Kalman filters, but the most commonly used is the Least Mean Square (LMS) algorithm. It is a simple but powerful algorithm that can be implemented to take advantage of Lattice FPGA architectures. Developed by Widrow and Hoff, the algorithm uses gradient descent to estimate a time-varying signal. The gradient descent method finds a minimum, if it exists, by taking steps in the direction of the negative gradient. It does so by adjusting the filter coefficients to minimize the error.

The LMS reference design consists of two main functional blocks: an FIR filter and the LMS algorithm. The FIR filter is implemented serially using a multiplier and an adder with feedback. The FIR result is normalized to minimize saturation. The LMS algorithm iteratively updates the coefficients and feeds them to the FIR filter. The FIR filter then uses the updated coefficients along with the input reference signal x(n) to generate the output y(n). The output y(n) is then subtracted from the desired signal d(n) to generate an error, which is used by the LMS algorithm to compute the next set of coefficients.

Figure 2.2 is a block diagram of system identification using adaptive filtering. The objective is to change (adapt) the coefficients of an FIR filter, W, to match as closely as possible the response of an unknown system, H. The unknown system and the adapting filter process the same input signal x[n] and have outputs d[n] (also referred to as the desired signal) and y[n].

Figure 2.2: Least Mean Square adaptive filter

2.2.2) GRADIENT-DESCENT ADAPTATION [6]

The adaptive filter, W, is adapted using the least mean-square algorithm, which is the most widely used adaptive filtering algorithm. First the error signal, e[n], is computed as e[n]=d[n]−y[n], which measures the difference between the output of the adaptive filter and the output of the unknown system. On the basis of this measure, the adaptive filter will change its coefficients in an attempt to reduce the error. The coefficient update relation is a function of the error signal squared and is given by

(2.7)

The term inside the parentheses represents the gradient of the squared error with respect to the i-th coefficient. The gradient is a vector pointing in the direction of the change in filter coefficients that will cause the greatest increase in the error signal. Because the goal is to minimize the error, however, Equation 2.7 updates the filter coefficients in the direction opposite the gradient; that is why the gradient term is negated. The constant μ is a step size, which controls the amount of gradient information used to update each coefficient. After repeatedly adjusting each coefficient in the direction opposite to the gradient of the error, the adaptive filter should converge; that is, the difference between the unknown and adaptive systems should get smaller and smaller. To express the gradient descent coefficient update equation in a more usable manner, we can rewrite the derivative of the squared-error term as

(2.8)

Or,

(2.9)

(2.10)

(2.11)

which in turn gives us the final LMS coefficient update,

(2.12)
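As a worked sketch of the steps numbered 2.7 to 2.12, the derivation can be written out as follows. The symbols w_i[n] (the i-th coefficient at time n) and N (the number of coefficients) are notation assumed here, and the step size is taken to multiply the negated gradient directly; some texts absorb a factor of 1/2 into μ, which changes only the constant in the final update.

```latex
\begin{align*}
w_i[n+1] &= w_i[n] + \mu\left(-\frac{\partial e^2[n]}{\partial w_i[n]}\right) \\
\frac{\partial e^2[n]}{\partial w_i[n]}
  &= 2\,e[n]\,\frac{\partial e[n]}{\partial w_i[n]}
   = 2\,e[n]\,\frac{\partial}{\partial w_i[n]}\bigl(d[n]-y[n]\bigr) \\
  &= 2\,e[n]\,\frac{\partial}{\partial w_i[n]}
     \Bigl(d[n]-\sum_{j=0}^{N-1} w_j[n]\,x[n-j]\Bigr)
   = -2\,e[n]\,x[n-i] \\
w_i[n+1] &= w_i[n] + 2\mu\,e[n]\,x[n-i]
\end{align*}
```

With this convention the result has the same form as the sample-by-sample update given in Equation 3.2.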

The step-size μ directly affects how quickly the adaptive filter will converge toward the unknown system. If μ is very small, then the coefficients change only a small amount at each update, and the filter converges slowly. With a larger step-size, more gradient information is included in each update, and the filter converges more quickly; however, when the step-size is too large, the coefficients may change too quickly and the filter will diverge. (It is possible in some cases to determine analytically the largest value of μ ensuring convergence.)

2.2.3) CONVERGENCE AND STABILITY [6]

Assume that the true filter H(n) = H is constant, and that the input signal x(n) is wide-sense stationary. Then E{W(n)} converges to H as n→∞ if and only if

(2.13)

where λmax is the greatest eigenvalue of the autocorrelation matrix R of the input. If this condition is not fulfilled, the algorithm becomes unstable and W(n) diverges.

Maximum convergence speed is achieved when

(2.14)

where λmin is the smallest eigenvalue of the autocorrelation matrix. Given that μ is less than or equal to this optimum, the convergence speed is determined by μλmin, with a larger value yielding faster convergence. This means that faster convergence can be achieved when λmax is close to λmin; that is, the maximum achievable convergence speed depends on the eigenvalue spread of the autocorrelation matrix.

A white noise signal has autocorrelation matrix R = σ²I, where σ² is the variance of the signal. In this case all eigenvalues are equal, and the eigenvalue spread is the minimum over all possible matrices. The common interpretation of this result is therefore that the LMS converges quickly for white input signals, and slowly for colored input signals, such as processes with low-pass or high-pass characteristics.

It is important to note that the above upper bound on μ only enforces stability in the mean, but the coefficients of W(n) can still grow infinitely large, i.e. divergence of the coefficients is still possible. A more practical bound is

(2.15)

where tr[R] denotes the trace of the autocorrelation matrix. This bound guarantees that the coefficients of W(n) do not diverge (in practice, the value of μ should not be chosen close to this upper bound, since it is somewhat optimistic due to approximations and assumptions made in the derivation of the bound).
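To make the role of the eigenvalue spread concrete, the following sketch evaluates the step-size limits for a filter with two coefficients. It assumes the commonly quoted forms of the bounds, μ < 2/λmax for convergence in the mean, μ = 2/(λmax + λmin) for fastest convergence and μ < 2/tr[R] for the practical bound, stated for an update of the form W(n+1) = W(n) + μ e(n) x(n). The function name and the example correlation values are illustrative only.

```python
def step_size_limits(r0, r1):
    """Step-size limits for a 2-coefficient LMS filter.

    r0 and r1 are the lag-0 and lag-1 autocorrelation values of the input,
    so R = [[r0, r1], [r1, r0]] with eigenvalues r0 + r1 and r0 - r1.
    """
    lam_max = max(r0 + r1, r0 - r1)
    lam_min = min(r0 + r1, r0 - r1)
    return {
        "lambda_max": lam_max,
        "lambda_min": lam_min,
        "spread": lam_max / lam_min,
        "mu_mean_bound": 2.0 / lam_max,           # assumed form of (2.13)
        "mu_fastest": 2.0 / (lam_max + lam_min),  # assumed form of (2.14)
        "mu_trace_bound": 2.0 / (2.0 * r0),       # assumed form of (2.15), tr[R] = 2*r0
    }


white = step_size_limits(1.0, 0.0)    # white input: R = I
colored = step_size_limits(1.0, 0.9)  # strongly low-pass input
```

For the white input the eigenvalue spread is 1, while for the low-pass input λmin drops to about 0.1, so at a comparable step size the slowest mode, governed by μλmin, converges roughly ten times more slowly.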

CHAPTER 3

IMPLEMENTATION OF LMS ADAPTIVE FILTER [7]

3.1) INTRODUCTION

In LMS the weight vector is updated from sample to sample as follows:

$h_{k+1} = h_k - \mu \nabla_k$ (3.1)

where $h_k$ and $\nabla_k$ are the weight vector and the true gradient vector, respectively, at the kth sampling instant, and μ controls the stability and the rate of convergence.

The LMS algorithm for updating the weights from sample to sample is

$h_{k+1} = h_k + 2\mu e_k x_k$ (3.2)

where

$e_k = y_k - h_k^T x_k$ (3.3)

Here $y_k$ is the desired signal sample and $x_k$ is the vector of the most recent input samples.

3.2) IMPLEMENTATION OF LMS ALGORITHM [7]

1) Initially, set each weight $h_k(i)$, for i = 0, 1, 2, ..., N-1, to an arbitrary fixed value such as 0.

For each subsequent sampling instant, k = 1, 2, ..., carry out steps (2) to (4) below.

2) Compute the filter output as

$n_k = \sum_{i=0}^{N-1} h_k(i)\, x_{k-i}$ (3.4)

3) Compute the error estimate

$e_k = y_k - n_k$ (3.5)

4) Update the filter weights for the next sampling instant

$h_{k+1}(i) = h_k(i) + 2\mu e_k x_{k-i}$ (3.6)

The LMS algorithm requires approximately 2N+1 multiplications and 2N+1 additions for each new set of input and output samples.
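The four steps above can be sketched in floating-point Python as follows. This is only an illustration of the arithmetic, not the QSD hardware implementation; the function name, the test signal and the hypothetical system being identified are assumptions made for the example.

```python
def lms_filter(x, y, num_taps, mu):
    """Run the LMS steps on input samples x and desired samples y.

    Returns the final weights and the list of error values e_k.
    """
    h = [0.0] * num_taps      # step 1: initial weights
    taps = [0.0] * num_taps   # taps[i] holds x_{k-i}
    errors = []
    for x_k, y_k in zip(x, y):
        taps = [x_k] + taps[:-1]                           # shift in the new sample
        n_k = sum(h_i * t_i for h_i, t_i in zip(h, taps))  # step 2: filter output
        e_k = y_k - n_k                                    # step 3: error estimate
        h = [h_i + 2.0 * mu * e_k * t_i                    # step 4: weight update
             for h_i, t_i in zip(h, taps)]
        errors.append(e_k)
    return h, errors


# Identify a hypothetical 2-weight system y_k = 0.5*x_k - 0.25*x_{k-1}
# from a deterministic +/-1 test sequence.
x = [1.0 if (k * 7) % 5 < 2 else -1.0 for k in range(400)]
y = [0.5 * x[k] - 0.25 * (x[k - 1] if k > 0 else 0.0) for k in range(400)]
weights, errors = lms_filter(x, y, num_taps=2, mu=0.05)
```

After the 400 samples the weights settle close to 0.5 and -0.25 and the error approaches zero. Each pass through the loop performs N multiplications for the filter output and N more for the update, plus the product 2μe_k, which matches the operation count of roughly 2N+1 quoted above.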

3.3) FLOWCHART FOR THE LMS ADAPTIVE FILTER [7]

The flowchart consists of the following steps, in order of execution:

1) Initialize hk(i) and xk-i

2) Read xk and yk from ADC

3) Filter xk: nk = ∑ wk(i).xk-i

4) Compute error: ek = yk - nk

5) Compute factor: 2μek

6) Update coefficient: wk+1 = wk + 2μekxk-i

3.4) IMPLEMENTATION OF DIFFERENT ORDERS LMS ADAPTIVE FILTER

3.4.1) Introduction [7]

The LMS algorithm is a linear adaptive filtering algorithm, which, in general, consists of two basic processes:

1) A filtering process, which involves (a) computing the output of a linear filter in response to an input signal and (b) generating an estimation error by comparing this output with a desired response.

2) An adaptive process, which involves the automatic adjustment of the parameters of the filter in accordance with the estimation error.

The combination of these two processes working together constitutes a feedback loop. First, we have a transversal filter, around which the LMS algorithm is built; this component is responsible for performing the filtering process. Second, we have a mechanism for performing the adaptive control process on the tap weights of the transversal filter, hence the name adaptive weight-control mechanism.

Figure 3.1: LMS filter

3.4.2) 1st order LMS adaptive filter

3.4.2.1) Introduction

Figure 3.2: 1st order LMS adaptive filter

dout(n) is the output of the transversal filter

y(n) is the desired signal

e(n) is the estimation error, given as

e(n) = y(n) - dout(n) (3.7)

The weights are updated as

w(n+1) = w(n) + 2μe(n)xin(n) (3.8)

where w(n+1) is the updated weight and w(n) is the previous weight.

Components required for designing the 1st order LMS adaptive filter are:

Number of delay elements required = 1

Number of multipliers in transversal filter = 2

Number of multipliers in adaptive weight control mechanism = 3

Number of adders in transversal filter = 1

Number of adders in adaptive weight control mechanism = 3

Here the total number of multipliers is 5 and the total number of adders is 4. The delay of the QSD adder is 13.931 ns and the delay of the QSD multiplier is 11.348 ns, so the total delay of the 1st order LMS adaptive filter, taken as the sum of all component delays (5 × 11.348 ns + 4 × 13.931 ns), is 112.464 ns.

3.4.2.2) VHDL implementation of 1st order LMS adaptive filter

Here we are using μ=0.5.
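With μ = 0.5 the factor 2μ in Equation 3.8 equals 1, so a single weight update can be worked through by hand. The numbers below are illustrative assumptions, not values taken from the simulation, and the error is taken as the desired signal minus the filter output, as in Equation 3.5.

```latex
\begin{align*}
w_0(n) &= 0,\quad w_1(n) = 0,\quad x_{in}(n) = 1,\quad x_{in}(n-1) = 0,\quad y(n) = 2 \\
d_{out}(n) &= w_0(n)\,x_{in}(n) + w_1(n)\,x_{in}(n-1) = 0 \\
e(n) &= y(n) - d_{out}(n) = 2 \\
w_0(n+1) &= w_0(n) + 2\mu\,e(n)\,x_{in}(n) = 0 + 1\cdot 2\cdot 1 = 2 \\
w_1(n+1) &= w_1(n) + 2\mu\,e(n)\,x_{in}(n-1) = 0 + 1\cdot 2\cdot 0 = 0
\end{align*}
```

All intermediate values stay within the single QSD digit range of -3 to 3.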

3.4.2.2.1) VHDL code for 1st order LMS adaptive filter

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity first_order_filter is
  port( x2,x1,x0: in std_logic;
        y5,y4,y3,y2,y1,y0: in std_logic;
        q2,q1,q0: in std_logic;
        w02,w01,w00: in std_logic;
        w12,w11,w10: in std_logic;
        d5,d4,d3,d2,d1,d0: inout std_logic);
end first_order_filter;

--}} End of automatically maintained section

architecture first_order_filter of first_order_filter is

  component delay_unit
    port( a,b,c: in STD_LOGIC;
          d,e,f: out STD_LOGIC );
  end component;

  component qsdadder
    port (b2,b1,b0,a2,a1,a0: in std_logic;
          c1,c0,s2,s1,s0: out std_logic);
  end component;

  component qsdadder2bit
    port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0: in std_logic;
          z5,z4,z3,z2,z1,z0: out std_logic);
  end component;

  component QSD_SINGLE_DIGIT_MULT
    port( a2,a1,a0: in STD_LOGIC;
          b2,b1,b0: in STD_LOGIC;
          c2,c1,c0: inout STD_LOGIC;
          m2,m1,m0: inout STD_LOGIC );
  end component;

  component adaptationunit_first_order
    port (d5,d4,d3,d2,d1,d0: in std_logic;
          y5,y4,y3,y2,y1,y0: in std_logic;
          q2,q1,q0: in std_logic;
          x12,x11,x10,x22,x21,x20: in std_logic;
          w12,w11,w10,w22,w21,w20: in std_logic;
          wo12,wo11,wo10,wo22,wo21,wo20: out std_logic);
  end component;

  signal xd12,xd11,xd10: std_logic;
  signal xd22,xd21,xd20: std_logic;
  signal nk02,nk01,nk00: std_logic;
  signal nk12,nk11,nk10: std_logic;
  signal nki02,nki01,nki00: std_logic;
  signal nki12,nki11,nki10: std_logic;
  signal do4,do3: std_logic;
  signal ws02,ws01,ws00,ws12,ws11,ws10: std_logic;

begin

  -- one-sample delay of the input digit
  delay1: delay_unit port map (x2,x1,x0,xd12,xd11,xd10);

  -- tap products: current input times weight 0, delayed input times weight 1
  -- (mul1 is mapped in the same pattern as mul2: b = ws02..ws00, c = nki02..nki00)
  mul1: QSD_SINGLE_DIGIT_MULT port map (x2,x1,x0,ws02,ws01,ws00,nki02,nki01,nki00,nk02,nk01,nk00);
  mul2: QSD_SINGLE_DIGIT_MULT port map (xd12,xd11,xd10,ws12,ws11,ws10,nki12,nki11,nki10,nk12,nk11,nk10);

  -- filter output: sum of the two tap products
  add1: qsdadder2bit port map (nki02,nki01,nki00,nk02,nk01,nk00,nki12,nki11,nki10,nk12,nk11,nk10,d5,d4,d3,d2,d1,d0);

  -- weight update
  adaptation: adaptationunit_first_order port map (d5,d4,d3,d2,d1,d0,y5,y4,y3,y2,y1,y0,q2,q1,q0,x2,x1,x0,xd12,xd11,xd10,
    w02,w01,w00,w12,w11,w10,ws02,ws01,ws00,ws12,ws11,ws10);

end first_order_filter;

Figure 3.3: VHDL simulation of 1st order LMS adaptive filter

3.4.2.2.2) VHDL code for one digit QSD adder

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity qsdadder is
port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end qsdadder;

--}} End of automatically maintained section

architecture qsdadder of qsdadder is begin

c1 <= (a2 and b2 and(not b1)) or (a2 and (not a1) and b2) or (a2 and b2 and (not b0))or (a2 and (not a0) and b2) or (b2 and (not a1) and (not a0) and (not b1)) or (a2 and (not a1) and (not b1) and (not b0)) after 2 ns;

c0 <= (a2 and b2 and (not b1)) or (a2 and (not a1) and b2) or (a2 and b2 and (not b0)) or (a2 and (not a0) and b2) or ((not a1) and (not a0) and b2 and (not b1))or ((not a2)and a1 and (not b2) and b1) or ((not a2) and a0 and (not b2) and b1)

or((not a2) and (not b2) and b1 and b0) or ((not a2) and a1 and (not b2) and b0)or (a2 and (not a1) and (not b1) and (not b0)) or ((not a2) and a1 and a0 and (not b2))

after 2 ns;

s2 <= ((not a1) and b2 and b0) or (a2 and (not a0) and (not b1)) or ((not a1) and a0 and b2 and (not b1)) or ((not a1) and (not a0) and b2 and b1) or ((not a1) and a0 and b1 and (not b0)) or ((not a1) and (not a0) and b1 and b0)or (a2 and (not a1 ) and (not b1) and b0) or (a1 and (not a0) and (not b1) and b0)or (a2 and a1 and (not b1) and (not b0)) or (a1 and a0 and (not b1 ) and (not b0))or (a2 and a1 and a0 and b2 and b1 and b0) after 2 ns;

s1 <= ((not a1) and b1 and (not b0)) or ((not a1) and (not a0) and b1 )or (a1 and (not a0) and (not b1)) or (a1 and (not b1) and (not b0))or ( a1 and a0 and b1 and b0) or ((not a1) and a0 and (not b1) and b0) after 2 ns;

s0 <= (a0 and (not b0)) or ((not a0) and b1 and b0) or ((not a2) and (not a0) and b0)or ((not a0 ) and (not b2) and b0) after 2 ns;

-- enter your statements here --

end qsdadder;

Figure 3.4: 1 digit QSD adder

3.4.2.2.3) VHDL code for two digit QSD adder

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity qsdadder2bit is

port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;

z5,z4,z3,z2,z1,z0:out std_logic );

end qsdadder2bit;

--}} End of automatically maintained section

architecture qsdadder2bit of qsdadder2bit is

signal ci5,ci4,ci3:std_logic ;

signal s5,s4,s3,s2,s1:std_logic ;

signal si5,si4:std_logic ;

component qsdadder

port (b2,b1,b0,a2,a1,a0 : in std_logic;

c1,c0,s2,s1,s0 : out std_logic);

end component ;

begin

ci5<='0' ;

add1: qsdadder port map ( x2,x1,x0,y2,y1,y0 ,ci4,ci3,z2,z1,z0);

add2: qsdadder port map ( x5,x4,x3,y5,y4,y3 ,s5,s4,s3,s2,s1);

add3: qsdadder port map ( s3,s2,s1,ci5,ci4,ci3,si5,si4,z5,z4,z3);

-- enter your statements here –

end qsdadder2bit;

Figure 3.5: 2 digit QSD adder
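As a behavioural cross-check for the structure above, the following sketch models carry-free QSD addition at the digit level, following the two steps described in Chapter 1: each digit position first produces an intermediate sum and carry, and the final digit is the intermediate sum plus the carry from the next lower position. It is a reference model under stated assumptions, not a transcription of the gate-level equations: the carry thresholds (±3) and the function names are chosen here for illustration, and digits are plain integers in the range -3 to 3 rather than 3-bit codes.

```python
def qsd_digit_step(a, b):
    """Step 1: split a + b into (carry, intermediate_sum).

    The pair satisfies a + b == 4*carry + intermediate_sum, with the carry
    limited to -1, 0, 1 and the intermediate sum limited to -2..2.
    """
    total = a + b
    if total >= 3:
        carry = 1
    elif total <= -3:
        carry = -1
    else:
        carry = 0
    return carry, total - 4 * carry


def qsd_add(x, y):
    """Add two QSD numbers given as digit lists, least significant digit first."""
    steps = [qsd_digit_step(a, b) for a, b in zip(x, y)]
    result = []
    carry_in = 0
    for carry, inter in steps:
        result.append(inter + carry_in)  # step 2: stays within -3..3, so no ripple
        carry_in = carry
    result.append(carry_in)              # carry out of the most significant digit
    return result


def qsd_value(digits):
    """Integer value of a least-significant-first list of radix-4 signed digits."""
    return sum(d * 4 ** i for i, d in enumerate(digits))
```

For example, `qsd_add([3, 3], [3, 3])` returns `[2, 3, 1]`, that is 2 + 3·4 + 1·16 = 30 = 15 + 15. Because the second step only ever adds a carry of magnitude at most 1 to an intermediate sum of magnitude at most 2, every digit position can be computed in parallel.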

3.4.2.2.3) VHDL code for single digit multiplier

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity QSD_SINGLE_DIGIT_MULT is

port(

a2 : in STD_LOGIC;

a1 : in STD_LOGIC;

a0 : in STD_LOGIC;

b2 : in STD_LOGIC;

b1 : in STD_LOGIC;

b0 : in STD_LOGIC;

c2 : inout STD_LOGIC;

c1 : inout STD_LOGIC;

c0 : inout STD_LOGIC;

m2 : inout STD_LOGIC;

m1 : inout STD_LOGIC;

m0 : inout STD_LOGIC

);

end QSD_SINGLE_DIGIT_MULT;

--}} End of automatically maintained section

architecture QSD_SINGLE_DIGIT_MULT of QSD_SINGLE_DIGIT_MULT is

begin

c2<= (a2 and(not b2)and b0 and((not b1)nand a1)) or (a2 and(not b2)and b1 and(a1 nand a0)) or ((not a2)and a0 and b2 and ((not a1)nand b1)) or ((not a2) and a1 and b2 and(b1 nand b0));

c1<= c2 or ( a2 and(not a1)and b2 and(not b1)) or (a1 and a0 and (not b2)and b1 and b0);

c0<= (a1 and(a0 nor b2)and b1) or (a1 and b1 and(a2 nor b0)) or (a2 and b2 and(a1 xor b1)) or ( a2 and b2 and(a0 nor b0)) or (a2 and b1 and(a1 nor b0)) or (a1 and b2 and(a0 nor b1)) or ((a1 nor b1)and a2 and(not b2)and b0) or ((a1 nor b1)and(not a2)and a0 and b2) or (a2 and a1 and(not b2)and b1 and b0) or ((not a2)and a1 and a0 and b2 and b1) or ((a2 nor b2)and a0 and b0 and(a1 xor b1));

m2<= (a2 and b1 and(a1 nor b2)) or (a1 and b2 and(a2 nor b1)) or (a1 and a0 and b2 and(not b1)) or(a2 and(not a1)and b1 and b0) or (a0 and b2 and(a2 nor b0)) or (a0 and b0 and(a1 xor b1)) or (a2 and b0 and(a0 nor b2)) or (a2 and a0 and b1 and(b2 nor b0)) or (a1 and b2 and b0 and(a2 nor a0));

m1<= (a0 and b1 and(a1 nand b0)) or (a1 and b0 and(b1 nand a0));

m0<= a0 and b0;

-- enter your statements here –

end QSD_SINGLE_DIGIT_MULT;

Figure 3.6: Single digit QSD multiplier

The Xilinx report for the QSD single digit multiplier gives a delay of 11.348 ns.

3.4.2.2.4) VHDL code for complement generator of two digit QSD number

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity complement_genrator is

port(a5,a4,a3,a2,a1,a0: in std_logic;

b5,b4,b3,b2,b1,b0: inout std_logic);

end complement_genrator;

--}} End of automatically maintained section

architecture complement_genrator of complement_genrator is

signal f5,f4,f3,f2,f1,f0: std_logic;

signal n2,n1: std_logic ;

signal n0: std_logic ;

component qsdadder

port (a0,a1,a2,b0,b1,b2 : in std_logic;

c0,c1,s0,s1,s2 : out std_logic);

end component;

begin

n2<='0';
n1<='0';
n0<='1';
f5<='0';

-- The process reads all six input bits, so all of them appear in the sensitivity list.
process (a5,a4,a3,a2,a1,a0)
begin
-- low digit (a2,a1,a0)
if a2='0' and a1='0' and a0='0' then
b2<='0'; b1<='1'; b0<='1';
end if;
if a2='0' and a1='0' and a0='1' then
b2<='0'; b1<='1'; b0<='0';
end if;
if a2='0' and a1='1' and a0='0' then
b2<='0'; b1<='0'; b0<='1';
end if;
if a2='0' and a1='1' and a0='1' then
b2<='0'; b1<='0'; b0<='0';
end if;
-- high digit (a5,a4,a3)
if a5='0' and a4='0' and a3='0' then
b5<='0'; b4<='1'; b3<='1';
end if;
if a5='0' and a4='0' and a3='1' then
b5<='0'; b4<='1'; b3<='0';
end if;
if a5='0' and a4='1' and a3='0' then
b5<='0'; b4<='0'; b3<='1';
end if;
if a5='0' and a4='1' and a3='1' then
b5<='0'; b4<='0'; b3<='0';
end if;
end process;

add1: qsdadder port map (n0,n1,n2,b0,b1,b2,f3,f4,f0,f1,f2) ;

add2: qsdadder port map (f3,f4,f5,b3,b4,b5,f4,f5,b3,b4,b5) ;

-- enter your statements here –

end complement_genrator;

Figure 3.7: Two digit QSD number complement generator

3.4.2.2.5) VHDL code for delay unit

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity delay_unit is

port(

a ,b,c: in STD_LOGIC;

d ,e,f: out STD_LOGIC

);

end delay_unit;

--}} End of automatically maintained section

architecture delay_unit of delay_unit is

begin

d<=a after 100 ns;

e<=b after 100 ns;

f<=c after 100 ns;

-- enter your statements here --

end delay_unit;

Figure 3.8: Delay unit

3.4.2.2.6) VHDL code for adaptive weight control mechanism

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity adaptationunit_first_order is

port (d5,d4,d3,d2,d1,d0:in std_logic ;

y5,y4,y3,y2,y1,y0:in std_logic ;

q2,q1,q0:in std_logic ;

x12,x11,x10,x22,x21,x20: in std_logic ;

w12,w11,w10,w22,w21,w20: in std_logic ;

wo12,wo11,wo10,wo22,wo21,wo20: out std_logic );

end adaptationunit_first_order;

--}} End of automatically maintained section

architecture adaptationunit_first_order of adaptationunit_first_order is

component complement_genrator

port(a5,a4,a3,a2,a1,a0: in std_logic;

b5,b4,b3,b2,b1,b0: inout std_logic);

end component ;

component qsdadder2bit

port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;

z5,z4,z3,z2,z1,z0:out std_logic );

end component ;

component QSD_SINGLE_DIGIT_MULT

port(

a2 : in STD_LOGIC;

a1 : in STD_LOGIC;

a0 : in STD_LOGIC;

b2 : in STD_LOGIC;

b1 : in STD_LOGIC;

b0 : in STD_LOGIC;

c2 : inout STD_LOGIC;

c1 : inout STD_LOGIC;

c0 : inout STD_LOGIC;

m2 : inout STD_LOGIC;

m1 : inout STD_LOGIC;

m0 : inout STD_LOGIC

);

end component ;

component qsdadder

port (b2,b1,b0,a2,a1,a0 : in std_logic;

c1,c0,s2,s1,s0 : out std_logic);

end component ;

signal dc5,dc4,dc3,dc2,dc1,dc0:std_logic ;

signal e5,e4,e3,e2,e1,e0:std_logic ;

signal f5,f4,f3,f2,f1,f0:std_logic ;

signal g15,g14,g13,g12,g11,g10:std_logic ;

signal g25,g24,g23,g22,g21,g20:std_logic ;

signal wo14,wo13,wo24,wo23:std_logic ;

begin

complement: complement_genrator port map ( d5,d4,d3,d2,d1,d0,dc5,dc4,dc3,dc2,dc1,dc0);

add1: qsdadder2bit port map ( dc5,dc4,dc3,dc2,dc1,dc0,y5,y4,y3,y2,y1,y0,e5,e4,e3,e2,e1,e0);

mul1: QSD_SINGLE_DIGIT_MULT port map (e2,e1,e0,q2,q1,q0,f5,f4,f3,f2,f1,f0);

mul21:QSD_SINGLE_DIGIT_MULT port map (f2,f1,f0,x12,x11,x10,g15,g14,g13,g12,g11,g10);

mul22:QSD_SINGLE_DIGIT_MULT port map (f2,f1,f0,x22,x21,x20,g25,g24,g23,g22,g21,g20) ;

add2: qsdadder port map (w12,w11,w10, g12,g11,g10 ,wo14,wo13,wo12,wo11,wo10);

add3: qsdadder port map (w22,w21,w20, g22,g21,g20, wo24,wo23,wo22,wo21,wo20);

-- enter your statements here –

end adaptationunit_first_order;

Figure 3.9: Adaptive weight control mechanism of 1st order LMS adaptive filter

3.4.3) 2nd order LMS adaptive filter

3.4.3.1) Introduction

Figure 3.10: 2nd order LMS adaptive filter

dout(n) is the output of the transversal filter

y(n) is the desired output

e(n) is the estimation error, given as

e(n) = y(n) - dout(n) (3.9)

The weights are updated as

w(n+1) = w(n) + 2μe(n)xin(n) (3.10)

where w(n+1) is the updated weight and w(n) is the previous weight.

Components required for designing the 2nd order LMS adaptive filter are:

Number of delay elements required = 2

Number of multipliers in transversal filter = 3

Number of multipliers in adaptive weight control mechanism = 4

Number of adders in transversal filter = 2

Number of adders in adaptive weight control mechanism = 4

Here the total number of multipliers is 7 and the total number of adders is 6. The delay of the QSD adder is 13.931 ns and the delay of the QSD multiplier is 11.348 ns, so the total delay of the 2nd order LMS adaptive filter, taken as the sum of all component delays (7 × 11.348 ns + 6 × 13.931 ns), is 163.022 ns.
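The delay totals quoted for the 1st and 2nd order filters follow from the same arithmetic, which the short sketch below reproduces. It assumes, as the quoted totals imply, that every multiplier and adder delay is simply summed; the function and constant names are illustrative.

```python
ADDER_DELAY_NS = 13.931       # QSD adder delay quoted in the text
MULTIPLIER_DELAY_NS = 11.348  # QSD single digit multiplier delay quoted in the text


def total_delay_ns(num_multipliers, num_adders):
    """Sum of all multiplier and adder delays, in nanoseconds."""
    return num_multipliers * MULTIPLIER_DELAY_NS + num_adders * ADDER_DELAY_NS


first_order = total_delay_ns(num_multipliers=5, num_adders=4)
second_order = total_delay_ns(num_multipliers=7, num_adders=6)
print(f"{first_order:.3f} ns, {second_order:.3f} ns")  # 112.464 ns, 163.022 ns
```

Summing every component delay treats all blocks as if they lay on one serial path, so the figure is best read as an upper estimate rather than a measured critical path.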

3.4.3.2) VHDL implementation of 2nd order LMS adaptive filter

3.4.3.2.1) VHDL code for 2nd order LMS adaptive filter

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity second_order_filter is

port( x2,x1,x0:in std_logic ;

y5,y4,y3,y2,y1,y0:in std_logic ;

q2,q1,q0:in std_logic ;

w02,w01,w00: in std_logic ;

w12,w11,w10: in std_logic;

w22,w21,w20: in std_logic;

d5,d4,d3,d2,d1,d0:inout std_logic);

end second_order_filter;

--}} End of automatically maintained section

architecture second_order_filter of second_order_filter is

component delay_unit

port(

a ,b,c: in STD_LOGIC;

d ,e,f: out STD_LOGIC

);

end component ;

component qsdadder

port (b2,b1,b0,a2,a1,a0 : in std_logic;

c1,c0,s2,s1,s0 : out std_logic);

end component ;

component qsdadder2bit

port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;

z5,z4,z3,z2,z1,z0:out std_logic );

end component ;

component QSD_SINGLE_DIGIT_MULT

port(

a2 : in STD_LOGIC;

a1 : in STD_LOGIC;

a0 : in STD_LOGIC;

b2 : in STD_LOGIC;

b1 : in STD_LOGIC;

b0 : in STD_LOGIC;

c2 : inout STD_LOGIC;

c1 : inout STD_LOGIC;

c0 : inout STD_LOGIC;

m2 : inout STD_LOGIC;

m1 : inout STD_LOGIC;

m0 : inout STD_LOGIC

);

end component ;

component adaptation_unit

port (d5,d4,d3,d2,d1,d0:in std_logic ;

y5,y4,y3,y2,y1,y0:in std_logic ;

q2,q1,q0:in std_logic ;

x12,x11,x10,x22,x21,x20,x32,x31,x30: in std_logic ;

w12,w11,w10,w22,w21,w20,w32,w31,w30: in std_logic ;

wo12,wo11,wo10,wo22,wo21,wo20,wo32,wo31,wo30: out std_logic );

end component ;

signal xd12,xd11,xd10:std_logic ;

signal xd22,xd21,xd20 :std_logic ;

signal nk02,nk01,nk00: std_logic ;

signal nk12,nk11,nk10: std_logic ;

signal nki22,nki21,nki20,nk22,nk21,nk20:std_logic ;

signal nki02,nki01,nki00: std_logic ;

signal nki12,nki11,nki10: std_logic ;

signal do4,do3: std_logic ;

signal di5,di4,di3,di2,di1,di0:std_logic ;

signal ws02,ws01,ws00,ws12,ws11,ws10,ws22,ws21,ws20:std_logic ;

begin

delay1: delay_unit port map (x2,x1,x0,xd12,xd11,xd10);

delay2: delay_unit port map (xd12,xd11,xd10,xd22,xd21,xd20);

mul1: QSD_SINGLE_DIGIT_MULT port map (x2,x1,x0,ws02,ws01,ws00,nki02,nki01,nki00,nk02,nk01,nk00);

mul2: QSD_SINGLE_DIGIT_MULT port map (xd12,xd11,xd10,ws12,ws11,ws10,nki12,nki11,nki10,nk12,nk11,nk10);

mul3: QSD_SINGLE_DIGIT_MULT port map (xd22,xd21,xd20,ws22,ws21,ws20,nki22,nki21,nki20,nk22,nk21,nk20);

add1: qsdadder2bit port map ( nki02,nki01,nki00,nk02,nk01,nk00,nki12,nki11,nki10,nk12,nk11,nk10,di5,di4,di3,di2,di1,di0);

add2: qsdadder2bit port map (di5,di4,di3,di2,di1,di0,nki22,nki21,nki20,nk22,nk21,nk20,d5,d4,d3,d2,d1,d0);

adaptation: adaptation_unit port map (d5,d4,d3,d2,d1,d0,y5,y4,y3,y2,y1,y0,q2,q1,q0,x2,x1,x0,xd12,xd11,xd10,xd22,xd21,xd20,

w02,w01,w00,w12,w11,w10,w22,w21,w20,ws02,ws01,ws00,ws12,ws11,ws10,ws22,ws21,ws20);

-- enter your statements here --

end second_order_filter;

Figure 3.11: VHDL simulation of 2nd order LMS adaptive filter

3.4.3.2.2) VHDL code for adaptive weight control mechanism of 2nd order LMS filter

library IEEE;

use IEEE.STD_LOGIC_1164.all;

entity adaptation_unit is

port (d5,d4,d3,d2,d1,d0:in std_logic ;

y5,y4,y3,y2,y1,y0:in std_logic ;

q2,q1,q0:in std_logic ;

x12,x11,x10,x22,x21,x20,x32,x31,x30: in std_logic ;

w12,w11,w10,w22,w21,w20,w32,w31,w30: in std_logic ;

wo12,wo11,wo10,wo22,wo21,wo20,wo32,wo31,wo30: out std_logic );

end adaptation_unit;

--}} End of automatically maintained section

architecture adaptation_unit of adaptation_unit is

component complement_genrator

port(a5,a4,a3,a2,a1,a0: in std_logic;

b5,b4,b3,b2,b1,b0: inout std_logic);

end component ;

component qsdadder2bit

port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;

z5,z4,z3,z2,z1,z0:out std_logic );

end component ;

component QSD_SINGLE_DIGIT_MULT

port(

a2 : in STD_LOGIC;

a1 : in STD_LOGIC;

a0 : in STD_LOGIC;

b2 : in STD_LOGIC;

b1 : in STD_LOGIC;

b0 : in STD_LOGIC;

c2 : inout STD_LOGIC;

c1 : inout STD_LOGIC;

c0 : inout STD_LOGIC;

m2 : inout STD_LOGIC;

m1 : inout STD_LOGIC;

m0 : inout STD_LOGIC

);

end component ;

component qsdadder

port (b2,b1,b0,a2,a1,a0 : in std_logic;

c1,c0,s2,s1,s0 : out std_logic);

end component ;

signal dc5,dc4,dc3,dc2,dc1,dc0:std_logic ;

signal e5,e4,e3,e2,e1,e0:std_logic ;

signal f5,f4,f3,f2,f1,f0:std_logic ;

signal g15,g14,g13,g12,g11,g10:std_logic ;

signal g25,g24,g23,g22,g21,g20:std_logic ;

signal g35,g34,g33,g32,g31,g30:std_logic ;

signal wo14,wo13,wo34,wo33,wo24,wo23:std_logic ;

begin

complement: complement_genrator port map ( d5,d4,d3,d2,d1,d0,dc5,dc4,dc3,dc2,dc1,dc0);

add1: qsdadder2bit port map ( dc5,dc4,dc3,dc2,dc1,dc0,y5,y4,y3,y2,y1,y0,e5,e4,e3,e2,e1,e0);

mul1: QSD_SINGLE_DIGIT_MULT port map (e2,e1,e0,q2,q1,q0,f5,f4,f3,f2,f1,f0);

mul21:QSD_SINGLE_DIGIT_MULT port map (f2,f1,f0,x12,x11,x10,g15,g14,g13,g12,g11,g10);

mul22:QSD_SINGLE_DIGIT_MULT port map (f2,f1,f0,x22,x21,x20,g25,g24,g23,g22,g21,g20) ;

mul23:QSD_SINGLE_DIGIT_MULT port map (f2,f1,f0,x32,x31,x30,g35,g34,g33,g32,g31,g30);

add2: qsdadder port map (w12,w11,w10, g12,g11,g10 ,wo14,wo13,wo12,wo11,wo10);

add3: qsdadder port map (w22,w21,w20, g22,g21,g20, wo24,wo23,wo22,wo21,wo20);

add4: qsdadder port map (w32,w31,w30, g32,g31,g30, wo34,wo33,wo32,wo31,wo30);

-- enter your statements here --

end adaptation_unit;

Figure 3.12: Adaptive weight control mechanism of 2nd order LMS adaptive filter

CHAPTER 4

CONCLUSION

We have implemented 1st and 2nd order adaptive filters using the LMS algorithm for the adaptive weight control mechanism. For the implementation we used the non-conventional quaternary signed digit (QSD) number system, for which we designed and implemented addition and multiplication blocks; the adaptive filters are built from these blocks. In the QSD number system the addition takes place in parallel, so the delay is constant and does not depend on the number of digits to be added: the delay of the QSD adder is 13.931 ns and the delay of the QSD multiplier is 11.348 ns. The LMS algorithm requires approximately 2N+1 multiplications and 2N+1 additions for each new set of input and output samples, where N is the number of filter weights, so the total delay depends on the number of multiplications and additions. With QSD adders and multipliers, the total delay of the 1st order LMS adaptive filter is 112.464 ns and the total delay of the 2nd order LMS adaptive filter is 163.022 ns. The delay is therefore much lower than for an implementation of the adaptive filter using conventional adders and multipliers.

APPENDIX

1. Xilinx report for QSD adder

Release 9.2i - xst J.36

Copyright (c) 1995-2007 Xilinx, Inc. All rights reserved.

--> Parameter TMPDIR set to ./xst/projnav.tmp

CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s

--> Parameter xsthdpdir set to ./xst

CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s

--> Reading design: qsdadder.prj

TABLE OF CONTENTS

1) Synthesis Options Summary

2) HDL Compilation

3) Design Hierarchy Analysis

4) HDL Analysis

5) HDL Synthesis

5.1) HDL Synthesis Report

6) Advanced HDL Synthesis

6.1) Advanced HDL Synthesis Report

7) Low Level Synthesis

8) Partition Report

9) Final Report

9.1) Device utilization summary

9.2) Partition Resource Summary

9.3) TIMING REPORT

=========================================================================

* Synthesis Options Summary *

=========================================================================

---- Source Parameters

Input File Name : "qsdadder.prj"

Input Format : mixed

Ignore Synthesis Constraint File : NO

---- Target Parameters

Output File Name : "qsdadder"

Output Format : NGC

Target Device : xc2s15-6-cs144

---- Source Options

Top Module Name : qsdadder

Automatic FSM Extraction : YES

FSM Encoding Algorithm : Auto

Safe Implementation : No

FSM Style : lut

RAM Extraction : Yes

RAM Style : Auto

ROM Extraction : Yes

Mux Style : Auto

Decoder Extraction : YES

Priority Encoder Extraction : YES

Shift Register Extraction : YES

Logical Shifter Extraction : YES

XOR Collapsing : YES

ROM Style : Auto

Mux Extraction : YES

Resource Sharing : YES

Asynchronous To Synchronous : NO

Multiplier Style : lut

Automatic Register Balancing : No

---- Target Options

Add IO Buffers : YES

Global Maximum Fanout : 100

Add Generic Clock Buffer(BUFG) : 4

Register Duplication : YES

Slice Packing : YES

Optimize Instantiated Primitives : NO

Convert Tristates To Logic : Yes

Use Clock Enable : Yes

Use Synchronous Set : Yes

Use Synchronous Reset : Yes

Pack IO Registers into IOBs : auto

Equivalent register Removal : YES

---- General Options

Optimization Goal : Speed

Optimization Effort : 1

Library Search Order : qsdadder.lso

Keep Hierarchy : NO

RTL Output : Yes

Global Optimization : AllClockNets

Read Cores : YES

Write Timing Constraints : NO

Cross Clock Analysis : NO

Hierarchy Separator : /

Bus Delimiter : <>

Case Specifier : maintain

Slice Utilization Ratio : 100

BRAM Utilization Ratio : 100

Verilog 2001 : YES

Auto BRAM Packing : NO

Slice Utilization Ratio Delta : 5

=========================================================================

=========================================================================

* HDL Compilation *

=========================================================================

Compiling vhdl file "C:/Xilinx92i/lma/adaptive.vhd" in Library work.

Architecture qsdadder of Entity qsdadder is up to date.

=========================================================================

* Design Hierarchy Analysis *

=========================================================================

Analyzing hierarchy for entity <qsdadder> in library <work> (architecture <qsdadder>).

=========================================================================

* HDL Analysis *

=========================================================================

Analyzing Entity <qsdadder> in library <work> (Architecture <qsdadder>).

Entity <qsdadder> analyzed. Unit <qsdadder> generated.

=========================================================================

* HDL Synthesis *

=========================================================================

Performing bidirectional port resolution...

Synthesizing Unit <qsdadder>.

Related source file is "C:/Xilinx92i/lma/adaptive.vhd".

Unit <qsdadder> synthesized.

=========================================================================

HDL Synthesis Report

Found no macro

=========================================================================

=========================================================================

* Advanced HDL Synthesis *

=========================================================================

Loading device for application Rf_Device from file '2s15.nph' in environment C:\Xilinx92i.

=========================================================================

Advanced HDL Synthesis Report

Found no macro

=========================================================================

=========================================================================

* Low Level Synthesis *

=========================================================================

Optimizing unit <qsdadder> ...

Mapping all equations...

Building and optimizing final netlist ...

Found area constraint ratio of 100 (+ 5) on block qsdadder, actual ratio is 3.

Final Macro Processing ...

=========================================================================

Final Register Report

Found no macro

=========================================================================

=========================================================================

* Partition Report *

=========================================================================

Partition Implementation Status

-------------------------------

No Partitions were found in this design.

-------------------------------

=========================================================================

* Final Report *

=========================================================================

Final Results

RTL Top Level Output File Name : qsdadder.ngr

Top Level Output File Name : qsdadder

Output Format : NGC

Optimization Goal : Speed

Keep Hierarchy : NO

Design Statistics

# IOs : 11

Cell Usage :

# BELS : 17

# LUT2 : 1

# LUT3 : 2

# LUT4 : 10

# MUXF5 : 3

# MUXF6 : 1

# IO Buffers : 11

# IBUF : 6

# OBUF : 5

=========================================================================

Device utilization summary:

---------------------------

Selected Device : 2s15cs144-6

Number of Slices: 7 out of 192 3%

Number of 4 input LUTs: 13 out of 384 3%

Number of IOs: 11

Number of bonded IOBs: 11 out of 86 12%

---------------------------

Partition Resource Summary:

---------------------------

No Partitions were found in this design.

---------------------------

=========================================================================

TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.

FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT

GENERATED AFTER PLACE-and-ROUTE.

Clock Information:

------------------

No clock signals found in this design

Asynchronous Control Signals Information:

----------------------------------------

No asynchronous control signals found in this design

Timing Summary:

---------------

Speed Grade: -6

Minimum period: No path found

Minimum input arrival time before clock: No path found

Maximum output required time after clock: No path found

Maximum combinational path delay: 13.931ns

Timing Detail:

--------------

All values displayed in nanoseconds (ns)

=========================================================================

Timing constraint: Default path analysis

Total number of paths / destination ports: 59 / 5

-------------------------------------------------------------------------

Delay: 13.931ns (Levels of Logic = 6)

Source: a0 (PAD)

Destination: c0 (PAD)

Data Path: a0 to c0

                                    Gate     Net
    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
    ----------------------------------------  ------------
    IBUF:I->O             10   0.776   1.980  a0_IBUF (a0_IBUF)
    LUT4:I0->O             1   0.549   1.035  c138 (c1_map15)
    LUT3:I2->O             1   0.549   1.035  c149 (c1_map17)
    LUT4:I0->O             2   0.549   1.206  c157 (c1_OBUF)
    LUT4:I3->O             1   0.549   1.035  c0 (c0_OBUF)
    OBUF:I->O                  4.668          c0_OBUF (c0)
    ----------------------------------------
    Total                     13.931ns (7.640ns logic, 6.291ns route)
                                       (54.8% logic, 45.2% route)

=========================================================================

CPU : 3.12 / 3.31 s | Elapsed : 3.00 / 3.00 s

-->

Total memory usage is 161748 kilobytes

Number of errors : 0 ( 0 filtered)

Number of warnings : 0 ( 0 filtered)

Number of infos : 0 ( 0 filtered)
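The critical-path figures in the timing detail above can be cross-checked by hand. The following is a minimal Python sketch, not part of the XST output: it sums the gate and net delays listed for the path a0 to c0 and recomputes the logic/route split. The names `PATH` and `summarize` are illustrative only, and the net delay of the output buffer is assumed to be zero because the report lists none.

```python
# Critical path a0 -> c0 of the QSD adder, copied from the timing detail.
# Each entry is (cell, gate_delay_ns, net_delay_ns).
PATH = [
    ("IBUF", 0.776, 1.980),
    ("LUT4", 0.549, 1.035),
    ("LUT3", 0.549, 1.035),
    ("LUT4", 0.549, 1.206),
    ("LUT4", 0.549, 1.035),
    ("OBUF", 4.668, 0.000),  # assumption: no net delay after the output pad
]


def summarize(path):
    """Return levels of logic, logic/route delay and their percentage split."""
    logic = round(sum(gate for _, gate, _ in path), 3)
    route = round(sum(net for _, _, net in path), 3)
    total = round(logic + route, 3)
    return {
        "levels": len(path),
        "logic_ns": logic,
        "route_ns": route,
        "total_ns": total,
        "logic_pct": round(100 * logic / total, 1),
        "route_pct": round(100 * route / total, 1),
    }


summary = summarize(PATH)
print(summary)
```

Under these assumptions the sums agree with the reported 13.931 ns total (7.640 ns logic, 6.291 ns route) and with the six levels of logic. Note that the two buffers alone contribute 5.444 ns of the logic delay.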

2. Xilinx report for QSD multiplier

Release 9.2i - xst J.36

Copyright (c) 1995-2007 Xilinx, Inc. All rights reserved.

--> Parameter TMPDIR set to ./xst/projnav.tmp

CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s

--> Parameter xsthdpdir set to ./xst

CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s

--> Reading design: QSD_SINGLE_DIGIT_MULT.prj

TABLE OF CONTENTS

1) Synthesis Options Summary

2) HDL Compilation

3) Design Hierarchy Analysis

4) HDL Analysis

5) HDL Synthesis

5.1) HDL Synthesis Report

6) Advanced HDL Synthesis

6.1) Advanced HDL Synthesis Report

7) Low Level Synthesis

8) Partition Report

9) Final Report

9.1) Device utilization summary

9.2) Partition Resource Summary

9.3) TIMING REPORT

=========================================================================

* Synthesis Options Summary *

=========================================================================

---- Source Parameters

Input File Name : "QSD_SINGLE_DIGIT_MULT.prj"

Input Format : mixed

Ignore Synthesis Constraint File : NO

---- Target Parameters

Output File Name : "QSD_SINGLE_DIGIT_MULT"

Output Format : NGC

Target Device : xc2s15-6-cs144

---- Source Options

Top Module Name : QSD_SINGLE_DIGIT_MULT

Automatic FSM Extraction : YES

FSM Encoding Algorithm : Auto

Safe Implementation : No

FSM Style : lut

RAM Extraction : Yes

RAM Style : Auto

ROM Extraction : Yes

Mux Style : Auto

Decoder Extraction : YES

Priority Encoder Extraction : YES

Shift Register Extraction : YES

Logical Shifter Extraction : YES

XOR Collapsing : YES

ROM Style : Auto

Mux Extraction : YES

Resource Sharing : YES

Asynchronous To Synchronous : NO

Multiplier Style : lut

Automatic Register Balancing : No

---- Target Options

Add IO Buffers : YES

Global Maximum Fanout : 100

Add Generic Clock Buffer(BUFG) : 4

Register Duplication : YES

Slice Packing : YES

Optimize Instantiated Primitives : NO

Convert Tristates To Logic : Yes

Use Clock Enable : Yes

Use Synchronous Set : Yes

Use Synchronous Reset : Yes

Pack IO Registers into IOBs : auto

Equivalent register Removal : YES

---- General Options

Optimization Goal : Speed

Optimization Effort : 1

Library Search Order : QSD_SINGLE_DIGIT_MULT.lso

Keep Hierarchy : NO

RTL Output : Yes

Global Optimization : AllClockNets

Read Cores : YES

Write Timing Constraints : NO

Cross Clock Analysis : NO

Hierarchy Separator : /

Bus Delimiter : <>

Case Specifier : maintain

Slice Utilization Ratio : 100

BRAM Utilization Ratio : 100

Verilog 2001 : YES

Auto BRAM Packing : NO

Slice Utilization Ratio Delta : 5

=========================================================================

=========================================================================

* HDL Compilation *

=========================================================================

Compiling vhdl file "C:/Xilinx92i/QSD_SINGLE_DIGIT_MULT/QSD_SINGLE_DIGIT_MULT.vhd" in Library work.

Entity <QSD_SINGLE_DIGIT_MULT> compiled.

Entity <QSD_SINGLE_DIGIT_MULT> (Architecture <QSD_SINGLE_DIGIT_MULT>) compiled.

=========================================================================

* Design Hierarchy Analysis *

=========================================================================

Analyzing hierarchy for entity <QSD_SINGLE_DIGIT_MULT> in library <work> (architecture <QSD_SINGLE_DIGIT_MULT>).

=========================================================================

* HDL Analysis *

=========================================================================

Analyzing Entity <QSD_SINGLE_DIGIT_MULT> in library <work> (Architecture <QSD_SINGLE_DIGIT_MULT>).

Entity <QSD_SINGLE_DIGIT_MULT> analyzed. Unit <QSD_SINGLE_DIGIT_MULT> generated.

=========================================================================

* HDL Synthesis *

=========================================================================

Performing bidirectional port resolution...

Synthesizing Unit <QSD_SINGLE_DIGIT_MULT>.

Related source file is "C:/Xilinx92i/QSD_SINGLE_DIGIT_MULT/QSD_SINGLE_DIGIT_MULT.vhd".

Found 1-bit xor2 for signal <m2$xor0000> created at line 55.

Unit <QSD_SINGLE_DIGIT_MULT> synthesized.

=========================================================================

HDL Synthesis Report

Macro Statistics

# Xors : 1

1-bit xor2 : 1

=========================================================================

=========================================================================

* Advanced HDL Synthesis *

=========================================================================

Loading device for application Rf_Device from file '2s15.nph' in environment C:\Xilinx92i.

=========================================================================

Advanced HDL Synthesis Report

Macro Statistics

# Xors : 1

1-bit xor2 : 1

=========================================================================

=========================================================================

* Low Level Synthesis *

=========================================================================

Optimizing unit <QSD_SINGLE_DIGIT_MULT> ...

Mapping all equations...

Building and optimizing final netlist ...

Found area constraint ratio of 100 (+ 5) on block QSD_SINGLE_DIGIT_MULT, actual ratio is 4.

Final Macro Processing ...

=========================================================================

Final Register Report

Found no macro

=========================================================================

=========================================================================

* Partition Report *

=========================================================================

Partition Implementation Status

-------------------------------

No Partitions were found in this design.

-------------------------------

=========================================================================

* Final Report *

=========================================================================

Final Results

RTL Top Level Output File Name : QSD_SINGLE_DIGIT_MULT.ngr

Top Level Output File Name : QSD_SINGLE_DIGIT_MULT

Output Format : NGC

Optimization Goal : Speed

Keep Hierarchy : NO

Design Statistics

# IOs : 12

Cell Usage :

# BELS : 23

# LUT2 : 1

# LUT3 : 1

# LUT4 : 14

# MUXF5 : 5

# MUXF6 : 2

# IO Buffers : 12

# IBUF : 6

# OBUF : 6

=========================================================================

Device utilization summary:

---------------------------

Selected Device : 2s15cs144-6

Number of Slices: 8 out of 192 4%

Number of 4 input LUTs: 16 out of 384 4%

Number of IOs: 12

Number of bonded IOBs: 12 out of 86 13%

---------------------------

Partition Resource Summary:

---------------------------

No Partitions were found in this design.

---------------------------

=========================================================================

TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.

FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT

GENERATED AFTER PLACE-and-ROUTE.

Clock Information:

------------------

No clock signals found in this design

Asynchronous Control Signals Information:

----------------------------------------

No asynchronous control signals found in this design

Timing Summary:

---------------

Speed Grade: -6

Minimum period: No path found

Minimum input arrival time before clock: No path found

Maximum output required time after clock: No path found

Maximum combinational path delay: 11.348ns

Timing Detail:

--------------

All values displayed in nanoseconds (ns)

=========================================================================

Timing constraint: Default path analysis

Total number of paths / destination ports: 71 / 6

-------------------------------------------------------------------------

Delay: 11.348ns (Levels of Logic = 5)

Source: a1 (PAD)

Destination: c1 (PAD)

Data Path: a1 to c1

                                    Gate     Net
    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
    ----------------------------------------  ------------
    IBUF:I->O             13   0.776   2.250  a1_IBUF (a1_IBUF)
    LUT4:I1->O             2   0.549   1.206  c2_SW0 (N21)
    LUT3:I0->O             1   0.549   0.000  c1_F (N41)
    MUXF5:I0->O            1   0.315   1.035  c1 (c1_OBUF)
    OBUF:I->O                  4.668          c1_OBUF (c1)
    ----------------------------------------
    Total                     11.348ns (6.857ns logic, 4.491ns route)
                                       (60.4% logic, 39.6% route)

=========================================================================

CPU : 3.03 / 3.21 s | Elapsed : 3.00 / 3.00 s

-->

Total memory usage is 162324 kilobytes

Number of errors : 0 ( 0 filtered)

Number of warnings : 0 ( 0 filtered)

Number of infos : 0 ( 0 filtered)
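The utilization percentages in the two device utilization summaries (QSD adder and QSD single-digit multiplier) can be checked against the raw counts. The sketch below is illustrative Python, not tool output. The names `REPORTS` and `truncated_percent` are hypothetical, and rounding down to a whole percent is an observation that fits these six figures rather than documented XST behaviour.

```python
# (used, available, reported_percent) copied from the two
# "Device utilization summary" sections for the 2s15cs144-6 device.
REPORTS = {
    "qsdadder": {
        "slices": (7, 192, 3),
        "4-input LUTs": (13, 384, 3),
        "bonded IOBs": (11, 86, 12),
    },
    "QSD_SINGLE_DIGIT_MULT": {
        "slices": (8, 192, 4),
        "4-input LUTs": (16, 384, 4),
        "bonded IOBs": (12, 86, 13),
    },
}


def truncated_percent(used, available):
    """Utilization as a whole percent, rounded down (assumed convention)."""
    return (100 * used) // available


# Compare each recomputed percentage with the figure printed in the report.
checks = {}
for design, rows in REPORTS.items():
    for resource, (used, available, reported) in rows.items():
        computed = truncated_percent(used, available)
        checks[(design, resource)] = (computed == reported)
        print(f"{design:22s} {resource:13s} {used:3d}/{available:3d} "
              f"-> {computed}% (reported {reported}%)")
```

With these counts, every recomputed value matches the reported one. For example, 11 of 86 bonded IOBs is about 12.8%, which the report shows as 12%.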

REFERENCES

1) M. Morris Mano, Digital Design, 2nd edition, pp. 119-121.

2) Charles H. Roth and Lizy Kurian John, Principles of Digital System Design, pp. 66-69 and 186-190.

3) Iljoo Choo and R.G. Deshmukh, “A Novel Fast Parallel Signed-Digit Hybrid Multiplication Scheme for Digital Systems”, IEEE, 2000 (0-7803-5957-7/00).

4) Dr. Krishna Raj and Suman Lata, “Fast Processing Using Signed Digit Number System”, International Journal of Electronics Engineering, 2(1), 2010, pp. 173-175.

5) Dhananjay S. Phatak and Israel Koren, “Hybrid Signed Digit Number Systems: A Unified Framework for Redundant Number Representation with Bounded Carry Propagation Chains”, IEEE Transactions on Computers, Vol. 43, No. 8, pp. 880-891, August 1994.

6) S. Haykin, Adaptive Filter Theory, 3rd edition, Prentice Hall, 1996, pp. 231-240.

7) Paulo S.R. Diniz, Adaptive Filtering: Algorithms and Practical Implementation, Kluwer Academic Publishers, 1997.
