
[IEEE The 3rd International IEEE-NEWCAS Conference, 2005. - Quebec City, Canada (19-22 June, 2005)] The 3rd International IEEE-NEWCAS Conference, 2005. - Multipath Greedy Algorithm-for


Multipath Greedy Algorithm for Canonical Representation of Numbers in the Double Base Number System

Guillaume Gilbert and J. M. Pierre Langlois

Department of Electrical and Computer Engineering

Royal Military College of Canada
Kingston, Ontario, Canada

[email protected], [email protected]

Abstract: The Double Base Number System (DBNS) has been used in applications such as cryptography and digital filters. Two important properties of this type of representation are high redundancy and sparseness, which are key in eliminating carry propagation in basic arithmetic operations. High redundancy poses challenges in determining the Canonical Double Base Number Representation (CDBNR) of an algebraic value. An exhaustive search for this representation can be computationally intensive, even for relatively small values. The greedy algorithm is very fast and simple to implement, but only allows for a single Near Canonical Double Base Number Representation (NCDBNR). The Multipath Greedy (MG) algorithm discussed in this paper is much faster than exhaustive search and gives better performance since it dramatically increases the likelihood of finding canonical representations. Since multiple starting points are used, this algorithm is able to find more than one NCDBNR in a single run.

I. INTRODUCTION

The Double Base Number System (DBNS) is a positional number system that has an interesting geometric representation [1] and has been applied in the areas of cryptography and digital filters [2]. As its name implies, the radix of such a system is made up of two numbers, with corresponding exponents, that are multiplied together:

$$x = \sum_{m,n} d_{mn} 3^m 2^n \quad \text{where } d_{mn} \in \{0, 1\} \tag{1}$$

For example, the number 15 can be expressed as $15 = 3^0 2^1 + 3^0 2^2 + 3^2 2^0$. From Equation 1, it is possible to construct a DBNS radix table, where a cell is the product of a power of 3 row index and a power of 2 column index as in Table I. A DBNS map is a similar table where each cell is either active or inactive. Each cell represents a radix constant that is the product of the corresponding row and column. In this way, the algebraic value of the DBNS map is the sum of the radix constants of all the active cells. The DBNS is redundant since several different DBNS maps can be used to represent the value of x where x > 2. In fact, for x = 3, there are two different representations and for x ≥ 4 there are at least three different representations. Proof of this statement is straightforward by using the row and column reduction equations outlined in [2]. For example, the number x = 15 has 9 different DBNS maps, four of which are shown in Figure 1.

TABLE I

DBNS RADIX TABLE

|           | $2^0$     | $2^1$             | $2^2$             | $\cdots$ | $2^{M-1}$             |
|-----------|-----------|-------------------|-------------------|----------|-----------------------|
| $3^0$     | 1         | 2                 | 4                 | $\cdots$ | $2^{M-1}$             |
| $3^1$     | 3         | 6                 | 12                | $\cdots$ | $3 \cdot 2^{M-1}$     |
| $3^2$     | 9         | 18                | 36                | $\cdots$ | $9 \cdot 2^{M-1}$     |
| $\vdots$  | $\vdots$  | $\vdots$          | $\vdots$          | $\ddots$ | $\vdots$              |
| $3^{N-1}$ | $3^{N-1}$ | $3^{N-1} \cdot 2$ | $3^{N-1} \cdot 4$ | $\cdots$ | $3^{N-1} 2^{M-1}$     |

[Figure 1: four DBNS maps, each drawn on a grid whose columns are labeled 1, 2, 4, 8, 16 and whose rows are labeled 1, 3, 9, 27.]

Fig. 1. DBNS maps for x = 15
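The count of DBNS maps quoted for x = 15 can be checked with a short sketch. The Python code below is an illustration only and is not part of the original work; the function names `three_smooth_up_to` and `count_dbns_maps` are hypothetical. It assumes that a DBNS map is simply a set of distinct radix constants $3^m 2^n$ whose sum is x, as stated by Equation 1, and counts such sets with a standard subset-sum recurrence.

```python
def three_smooth_up_to(x):
    """All radix constants 3^m 2^n that do not exceed x, in ascending order."""
    values = []
    p3 = 1
    while p3 <= x:
        v = p3
        while v <= x:
            values.append(v)
            v *= 2
        p3 *= 3
    return sorted(values)


def count_dbns_maps(x):
    """Count the sets of distinct radix constants whose sum is x."""
    ways = [1] + [0] * x                  # ways[s]: sets found so far with sum s
    for r in three_smooth_up_to(x):
        # Descending order guarantees each radix constant is used at most once.
        for s in range(x, r - 1, -1):
            ways[s] += ways[s - r]
    return ways[x]


print(3**0 * 2**1 + 3**0 * 2**2 + 3**2 * 2**0)  # 15, the example from the text
print(count_dbns_maps(3))                       # 2
print(count_dbns_maps(15))                      # 9
```

The recurrence visits each radix constant once, so the count is obtained without enumerating the maps themselves.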

II. CANONICAL DBNS REPRESENTATION

The canonical form of a number representation in a redundant system is the representation of a number containing the minimum number of non-zero digits. An example of a redundant number system is the signed digit representation, which can only have a single canonical form for any given algebraic value [3]. However, the DBNS can have more than one canonical representation for any given number. In geometric terms, a DBNS canonical representation of a number is a DBNS map that contains the minimum number of active cells. For example, the number x = 15 has the following two canonical representations in DBNS:

$$3^1 2^0 + 3^1 2^2 = 3 + 12 = 15$$

$$3^1 2^1 + 3^2 2^0 = 6 + 9 = 15$$

An important point to consider when examining the canonical representation is the size of the DBNS map. In order to obtain the correct result, we must ensure that the number of rows M and the number of columns N are adequate for the value x that we are trying to represent. This is done by the use of the following relationships for M and N:

$$M = \lfloor \log_3(x) \rfloor + 1 \tag{2}$$

$$N = \lfloor \log_2(x) \rfloor + 1 \tag{3}$$

Next, we define the radix set $R_{MN}$ as:

$$R_{MN} = \{3^m 2^n : 0 \le m < M,\ 0 \le n < N\} \tag{4}$$

For example, the ordered radix set for M = 3 and N = 5 is:

$$R_{35} = \{1, 2, 3, 4, 6, 8, 9, 12, 16, 18, 24, 36, 48, 72, 144\}$$

Although the $R_{35}$ radix set is capable of representing values up to x = 403, it can only be used to represent canonical values up to x = 26 since $\lfloor \log_2 26 \rfloor + 1 = 5$ and $\lfloor \log_3 26 \rfloor + 1 = 3$. Attempting to use the set for the canonical representation of x = 27 would yield an incorrect result; $3^3 2^0 = 27$ is the canonical representation of x = 27, and 27 is not part of the set $R_{35}$. The canonical representation of numbers is one crucial element in performing efficient arithmetic operations in DBNS as outlined in [2]. When attempting to find the canonical representation of x, only the radix set $R(x) = \{r \in R_{MN} : r \le x\}$ needs to be used. For example, the required radix set for x = 15 would be:

$$R(15) = \{1, 2, 3, 4, 6, 8, 9, 12\}$$

In general, very few active cells are required for the canonical representation of a number. Hence, the DBNS is not only redundant, but also sparse. Also, the problem of finding the canonical representation of a number in DBNS appears to be an NP-complete problem [2].
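The construction of the radix sets can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation, and the names `floor_log`, `radix_set` and `radix_set_for` are hypothetical. It assumes that $R_{MN}$ contains M powers of 3 (exponents 0 to M-1) and N powers of 2 (exponents 0 to N-1), which is what the 15-element example $R_{35}$ implies, and it uses integer arithmetic for the floor logarithms to avoid floating-point rounding.

```python
def floor_log(x, base):
    """Largest integer e with base**e <= x, for x >= 1 (integer arithmetic)."""
    e = 0
    while base ** (e + 1) <= x:
        e += 1
    return e


def radix_set(M, N):
    """Ordered set R_MN: 3^m 2^n for exponents m = 0..M-1 and n = 0..N-1."""
    return sorted(3 ** m * 2 ** n for m in range(M) for n in range(N))


def radix_set_for(x):
    """R(x): members of R_MN that do not exceed x, with M and N from Eqs. 2 and 3."""
    M = floor_log(x, 3) + 1
    N = floor_log(x, 2) + 1
    return [r for r in radix_set(M, N) if r <= x]


print(radix_set(3, 5))        # the 15 radix constants of R_35
print(sum(radix_set(3, 5)))   # 403, the largest value R_35 can represent
print(radix_set_for(15))      # [1, 2, 3, 4, 6, 8, 9, 12]
```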

III. SEARCH ALGORITHMS

One obvious way of finding the DBNS canonical form of an algebraic value would be to perform a brute force exhaustive search. Although such an approach is inappropriate for large values, it serves as a basis to illustrate the size of the search space. Since the digits $d_{mn}$ can only take on the values 1 or 0, the problem can be represented as a simple (yet time consuming) combinatorial problem. Considering a number x, we first determine the set R(x) and define the number of elements contained in R(x) as z. It follows that there are $2^z$ possible combinations to be checked. Once all combinations that match x have been determined, the ones with the fewest digits that are equal to 1 represent the canonical form of the number. For a relatively low number like x = 100, we have to examine $2^{20}$ = 1 048 576 different cases! This is clearly not an effective way of finding the set of canonical representations for a number.
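A small Python sketch can make the exhaustive search concrete. It is illustrative only, and the names `radix_constants` and `canonical_representations` are hypothetical. As a shortcut over checking all $2^z$ combinations, it enumerates subsets of R(x) in order of increasing size and stops at the first size that yields a match, which returns the same canonical set because canonical representations are by definition the smallest matching subsets.

```python
from itertools import combinations


def radix_constants(x):
    """R(x): all radix constants 3^m 2^n that do not exceed x, ascending."""
    values = []
    p3 = 1
    while p3 <= x:
        v = p3
        while v <= x:
            values.append(v)
            v *= 2
        p3 *= 3
    return sorted(values)


def canonical_representations(x):
    """All minimum-size sets of distinct radix constants that sum to x."""
    radices = radix_constants(x)
    for size in range(1, len(radices) + 1):
        found = [c for c in combinations(radices, size) if sum(c) == x]
        if found:
            return found      # first size with a match is the canonical size
    return []


print(2 ** len(radix_constants(100)))   # 1048576 combinations in the full search
print(canonical_representations(15))    # [(3, 12), (6, 9)]
print(canonical_representations(41))    # [(9, 32)]
```

Even with the early exit, the number of subsets examined still grows quickly with z, which is why the search is only practical for small values.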

In [1], a greedy search algorithm was proposed in order to provide near canonical representations. This type of algorithm has a computational complexity of roughly $O\left(\frac{\log x}{\log \log x}\right)$ and determines a single representation that may or may not be a canonical representation of an algebraic value. However, the representation obtained from this algorithm is an adequate solution, and an in-depth explanation of its validity is found in [2].

In order to contrast the strategies used by the different algorithms, a simple example will be used. The number x = 41 is the smallest value for which the greedy algorithm does not return a canonical representation; for x = 41 the canonical representation is 9 + 32 = 41. For x = 41, the radix set R(41) used in the composition of the number is:

$$R(41) = \{1, 2, 3, 4, 6, 8, 9, 12, 16, 18, 24, 27, 32, 36\}$$

From this set, it is immediately obvious how the greedy algorithm will proceed in attempting to determine the near canonical representation of the number 41. At each step it subtracts the largest radix constant $a_i$ that does not exceed the current remainder:

$$x_0 = 41$$

$$x_1 = 41 - 36 = 5, \quad a_1 = 36$$

$$x_2 = 5 - 4 = 1, \quad a_2 = 4$$

$$x_3 = 1 - 1 = 0, \quad a_3 = 1$$

The greedy algorithm is very fast, but it yields 1 + 4 + 36 as its best attempt at finding the canonical representation of the number 41. Even though this result is not canonical, it is still a sparse representation that is adequate for arithmetic operations. Another option that is available is the brute force exhaustive search. Since the set R(41) contains 14 elements, there are $2^{14}$ = 16 384 different combinations to compare. Although very lengthy, the exhaustive search will give us the exact canonical representation.
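The greedy steps for x = 41 can be reproduced with the following Python sketch. It is a minimal illustration of the greedy strategy described above, not the implementation from [1]; the names `largest_radix_at_most` and `greedy_dbns` are hypothetical. For each power of 3 it finds the largest power-of-2 multiple that still fits, and keeps the best of these candidates.

```python
def largest_radix_at_most(x):
    """Largest radix constant 3^m 2^n that does not exceed x (x >= 1)."""
    best = 1
    p3 = 1
    while p3 <= x:
        v = p3
        while v * 2 <= x:     # largest power-of-2 multiple of p3 that fits
            v *= 2
        best = max(best, v)
        p3 *= 3
    return best


def greedy_dbns(x):
    """Repeatedly subtract the largest radix constant that fits the remainder."""
    terms = []
    while x > 0:
        a = largest_radix_at_most(x)
        terms.append(a)
        x -= a
    return terms


print(greedy_dbns(41))   # [36, 4, 1], one term more than the canonical 9 + 32
print(greedy_dbns(15))   # [12, 3], which happens to be canonical
```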

IV. THE PROPOSED MULTIPATH GREEDY ALGORITHM

The Multipath Greedy (MG) algorithm is a simple yet more effective way of: 1) finding more than one possible canonical representation, and 2) increasing the likelihood of finding true canonical representations. The proposed MG algorithm completes in roughly $O(\log^2 x)$ times the running time of the greedy algorithm. Furthermore, it will find many possible canonical representations of a number simultaneously. This algorithm is inspired by work from [4], where the authors describe a trellis based searching algorithm. When one studies this algorithm in detail, it is more appropriate to label it multipath greedy, since the algorithm has the same behavior as the greedy algorithm, the only difference being the starting point for each greedy path.

The MG algorithm is divided into four phases:
• Initialization
• Accumulation
• Comparison and Selection
• Determination of Completion

[Figure 2: trellis with states k = 1 to 14, initial values Φ(1,1) = 1, Φ(1,2) = 2, Φ(1,3) = 3, Φ(1,4) = 4, Φ(1,5) = 6, Φ(1,6) = 8, Φ(1,7) = 9, Φ(1,8) = 12, Φ(1,9) = 16, Φ(1,10) = 18, Φ(1,11) = 24, Φ(1,12) = 27, Φ(1,13) = 32, Φ(1,14) = 36, and search steps i = 1, 2, 3.]

Fig. 2. Canonical path in Trellis

A. Initialization

In the initialization phase, the algorithm first calculates the required size of the DBNS radix table. The required size is determined by the use of Equations 2 and 3. The set $R_{MN}$ is then built according to Equation 4. Once $R_{MN}$ has been obtained, the set R(x) containing only radix constants that are less than or equal to x is constructed. The set for x = 41 is as shown in the previous section. We also define z to be the number of elements in a given set R. Thus, the number of elements in the set R(41) is z = 14, and k is an index such that 1 ≤ k ≤ z. In this manner, double base radix constants of the form $3^m 2^n$ in the set R(x) are labeled r(k), i.e. $R(x) = \{r(1), r(2), \ldots, r(z)\}$.

The MG is divided up into search steps, and each step contains z states. Visually, the algorithm will be depicted as flowing from left to right in increasing search steps. At each search step, the states are examined in ascending order. The states will be represented by the k index. Another variable needs to be defined, which will be labeled φ(i, k). This variable will represent the cumulative result at step i for state k, and its use will become apparent in the next subsection. During initialization, φ(1, k) = r(k), that is, each cumulative result is initialized as containing its corresponding radix constant.

B. Accumulation

A path in the trellis is defined by a connection from state k to state k′ in two successive search steps i and i + 1. In the accumulation phase, there is a path from state k to each state k′, which amounts to possibly z candidate paths as depicted in Figure 2. Each of these paths also has an associated candidate cumulative result φ(i + 1, k′), which is the sum of the previous cumulative result φ(i, k) and the next radix constant r(k′). This can also be seen in Figure 3. Also note that paths where the cumulative result would surpass the target cumulative sum x are forbidden; the sum of radix constants along such paths would yield numbers greater than x.

[Figure 3: the same trellis for states k = 1 to 14 and search steps i = 1, 2, 3, showing a path from state k = 12 to state k = 8 with Φ(2,8) = Φ(1,12) + 12 = 39.]

Fig. 3. Non-canonical path in Trellis

C. Comparison and Selection

Once all the candidate sums for all paths originating from a single state k to all candidate states k′ and the associated φ(i + 1, k′) have been determined, the algorithm simply chooses the path such that $\phi(i+1, k) = \max_{k'} \left[\phi(i, k) + r(k')\right]$, where the maximum is taken over the candidate paths that are not forbidden. The selected path is defined as the surviving path.

D. Determination of Completion

The processes of accumulation, comparison and selection are repeated until the algorithm has determined that a canonical representation is available. This will happen when at least one of the surviving paths has run to completion, that is φ(i, k) = x. When this situation arises, the algorithm is immediately terminated. Note that there might be many surviving paths that run to completion at the same time; in that case, all these different surviving paths are canonical representations of the number x. Furthermore, it is easy to see that there will be duplicates; the radix constants that make up duplicate paths all contain the same states, but in different order.
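The four phases can be put together in a short Python sketch. It is an illustration under stated assumptions, not the authors' Matlab implementation, and the names `radix_constants` and `multipath_greedy` are hypothetical. Each path is stored as the list of radix constants chosen so far, so φ(i, k) is the sum of the list that started at r(k). One assumption that the text does not spell out is made explicit in the code: a radix constant already on a path is not selected again, since each digit can only be 0 or 1.

```python
def radix_constants(x):
    """R(x): all 3^m 2^n <= x in ascending order, so r(k) is element k - 1."""
    values = []
    p3 = 1
    while p3 <= x:
        v = p3
        while v <= x:
            values.append(v)
            v *= 2
        p3 *= 3
    return sorted(values)


def multipath_greedy(x):
    """Return every surviving path that reaches x at the first completing step."""
    radices = radix_constants(x)
    # Initialization: phi(1, k) = r(k), one path per state k.
    paths = [[r] for r in radices]
    while True:
        # Determination of completion: stop as soon as any path sums to x.
        finished = [p for p in paths if sum(p) == x]
        if finished:
            return finished
        for path in paths:
            phi = sum(path)
            # Accumulation: candidate sums phi + r(k') that do not exceed x.
            # Assumption: a radix constant already on the path is not reused.
            candidates = [r for r in radices if phi + r <= x and r not in path]
            # Comparison and selection: keep the candidate with the largest sum.
            if candidates:
                path.append(max(candidates))


paths = multipath_greedy(41)
print(paths)                                # [[9, 32], [32, 9]]
print({tuple(sorted(p)) for p in paths})    # {(9, 32)}
```

The two completed paths for x = 41 are the duplicates mentioned above: they contain the same radix constants in a different order, so sorting each path and collecting the results in a set leaves one distinct representation.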

A useful notation to help visualize the MG process is shown in Table II, which depicts the κ matrix of surviving path radices for the number x = 185. There are a total of four canonical representations that were found for this number. The following is a list of these representations, along with the corresponding k row index for the κ matrix:

3 + 54 + 128, k = 16, 22

8 + 81 + 96, k = 19, 20

9 + 32 + 144, k = 13

9 + 48 + 128, k = 15

Also, the row k = 24 is the path equivalent to the greedy algorithm. Interestingly enough, the number x = 185 does not have four canonical representations, but five. The canonical representation missing from the list is:

32 + 72 + 81

The reason for this is obvious from the greedy behavior of the MG algorithm. When we use the value 32 as a starting point for finding the representation of x = 185, the remainder is 153. We then find the highest value in the radix set that will minimize the remainder in the next step, which is 144 and gives us a remainder of 9 (which also belongs to the radix set), allowing us to find a canonical representation. Using 72 and 81 is also a valid path, but it is never taken by the MG algorithm.

TABLE II

TABLE OF SURVIVING PATHS FOR x = 185

| k    | κ(1, k) | κ(2, k) | κ(3, k) | κ(4, k) |
|------|---------|---------|---------|---------|
| k=1  | 1       | 162     | 18      | 4       |
| k=2  | 2       | 162     | 18      | 3       |
| k=3  | 3       | 162     | 18      | 2       |
| k=4  | 4       | 162     | 18      | 1       |
| k=5  | 6       | 162     | 16      | 1       |
| k=6  | 8       | 162     | 12      | 3       |
| k=7  | 9       | 162     | 12      | 2       |
| k=8  | 12      | 162     | 9       | 2       |
| k=9  | 16      | 162     | 6       | 1       |
| k=10 | 18      | 162     | 4       | 1       |
| k=11 | 24      | 144     | 16      | 1       |
| k=12 | 27      | 144     | 12      | 2       |
| k=13 | 32      | 144     | 9       | 0       |
| k=14 | 36      | 144     | 4       | 1       |
| k=15 | 48      | 128     | 9       | 0       |
| k=16 | 54      | 128     | 3       | 0       |
| k=17 | 64      | 108     | 12      | 1       |
| k=18 | 72      | 108     | 4       | 1       |
| k=19 | 81      | 96      | 8       | 0       |
| k=20 | 96      | 81      | 8       | 0       |
| k=21 | 108     | 72      | 4       | 1       |
| k=22 | 128     | 54      | 3       | 0       |
| k=23 | 144     | 36      | 4       | 1       |
| k=24 | 162     | 18      | 4       | 1       |

[Figure 4: plot titled "MG vs Greedy Algorithm Results"; horizontal axis: number of bits (8, 12, 16, 24, 32, 48); vertical axis: P (%), from 0 to 100.]

Fig. 4. MG vs Greedy Algorithm Results
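The missed representation for x = 185 can be confirmed by comparing the MG result against an exhaustive search. The Python sketch below is illustrative only; the function names are hypothetical, the exhaustive search stops at the smallest subset size that matches (a shortcut over checking every combination), and the MG step assumes that a radix constant already on a path is not reused.

```python
from itertools import combinations


def radix_constants(x):
    """R(x): all radix constants 3^m 2^n that do not exceed x, ascending."""
    values = []
    p3 = 1
    while p3 <= x:
        v = p3
        while v <= x:
            values.append(v)
            v *= 2
        p3 *= 3
    return sorted(values)


def exhaustive_canonical(x):
    """Set of all minimum-size representations of x, as sorted tuples."""
    radices = radix_constants(x)
    for size in range(1, len(radices) + 1):
        hits = {c for c in combinations(radices, size) if sum(c) == x}
        if hits:
            return hits
    return set()


def multipath_greedy(x):
    """Distinct representations found by the MG trellis, as sorted tuples."""
    radices = radix_constants(x)
    paths = [[r] for r in radices]              # one path per starting state
    while not any(sum(p) == x for p in paths):
        for p in paths:
            # Largest unused radix constant that keeps the sum at or below x.
            fits = [r for r in radices if sum(p) + r <= x and r not in p]
            if fits:
                p.append(max(fits))
    return {tuple(sorted(p)) for p in paths if sum(p) == x}


found = multipath_greedy(185)
canonical = exhaustive_canonical(185)
print(len(found), len(canonical))    # 4 5
print(sorted(canonical - found))     # [(32, 72, 81)]
```

Every path that starts at 32, 72 or 81 takes a larger radix constant at its second step, so none of the three possible starting points leads to 32 + 72 + 81.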

V. RESULTS

The MG and greedy algorithms were both implemented in Matlab. Different data sets containing 1000 uniformly distributed random values for 8, 12, 16, 24 and 32-bit binary numbers were used (for 8-bit numbers, 255 possible cases were used). In each experiment, an attempt was made to find the canonical representation of each value in the data set with the use of the MG and the greedy algorithm. The percentage of cases P where the MG found an NCDBNR having fewer digits than the greedy algorithm was recorded, along with the average difference $D_P$ and maximum difference $D_{MAX}$ between both algorithms for those specific cases. The results can be found in Table III and Figure 4. From these results, it can be seen that as the size of the number increases, the effectiveness of the MG algorithm compared to the greedy algorithm increases dramatically, at the expense of longer computation time.

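TABLE III

RESULTS

| Number of bits | P      | $D_P$  | $D_{MAX}$ |
|----------------|--------|--------|-----------|
| 8              | 8.59 % | 1.0000 | 1         |
| 12             | 14.9 % | 1.0268 | 2         |
| 16             | 38.6 % | 1.1088 | 3         |
| 24             | 63.7 % | 1.2653 | 4         |
| 32             | 74.9 % | 1.4473 | 4         |
| 48             | 91.0 % | 1.7846 | 5         |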
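A harness in the spirit of this experiment can be sketched as follows. This Python code is a hedged illustration and not the Matlab application used to produce Table III: the names `greedy_path` and `improvement_stats` are hypothetical, every MG path is run to completion and the shortest one is kept (which gives the same digit count as stopping the trellis at the first completed path), and a radix constant is assumed not to be reused within a path. Because of these assumptions, the percentages it produces are not guaranteed to match the published figures.

```python
def radix_constants(x):
    """R(x): all radix constants 3^m 2^n that do not exceed x, ascending."""
    values = []
    p3 = 1
    while p3 <= x:
        v = p3
        while v <= x:
            values.append(v)
            v *= 2
        p3 *= 3
    return sorted(values)


def greedy_path(x, radices, start=None):
    """Greedy path to x; if start is given, it is forced as the first term."""
    path = [] if start is None else [start]
    remainder = x - sum(path)
    while remainder > 0:
        fits = [r for r in radices if r <= remainder and r not in path]
        if not fits:
            return None           # dead end: no unused radix constant fits
        path.append(max(fits))
        remainder -= path[-1]
    return path


def improvement_stats(values):
    """Return P (in percent) and the values where MG beats the greedy result."""
    values = list(values)
    improved = []
    for x in values:
        radices = radix_constants(x)
        greedy_len = len(greedy_path(x, radices))
        mg_paths = [greedy_path(x, radices, s) for s in radices]
        mg_len = min(len(p) for p in mg_paths if p is not None)
        if mg_len < greedy_len:
            improved.append(x)
    return 100.0 * len(improved) / len(values), improved


p, improved = improvement_stats(range(1, 256))   # all 8-bit values except 0
print(f"P = {p:.2f} % over {255} values")
print(improved[0])                               # 41, the first improved value
```

Since the greedy path is itself one of the MG starting points (the largest radix constant), the MG digit count can never exceed the greedy one, so P only measures strict improvements.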

VI. CONCLUSION

This paper presented the Multipath Greedy algorithm used in determining the canonical representation of a positive integer value in the DBNS. A Matlab application that implements the MG algorithm was developed and tested with uniformly distributed samples for different bit widths. The results obtained show that the MG algorithm is much more effective than the greedy algorithm at finding minimal representations as the number size increases, with an increase in computational cost of roughly $O(\log^2 x)$.

REFERENCES

[1] V. Dimitrov, S. Sadeghi-Emamchaie, G. Jullien, and W. Miller, “A near canonic double-base number system with applications in DSP,” in SPIE Conference on Signal Processing Algorithms, vol. 2846, 1996, pp. 14–25.

[2] V. Dimitrov and G. Jullien, “Loading the bases: A new number system representation with applications,” IEEE Circuits and Systems Magazine, vol. 3, no. 2, pp. 6–23, Second Quarter 2003.

[3] G. W. Reitwiesner, Advances in Computers. Academic Press, 1960, vol. 1, ch. Binary Arithmetic, pp. 231–308.

[4] C.-S. Wu and A.-Y. Wu, “A novel trellis-based searching scheme for EEAS-based CORDIC algorithm,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, May 2001, pp. 1229–1232.