

Information Processing Letters 106 (2008) 96–99

www.elsevier.com/locate/ipl

More restrictive Gray codes for necklaces and Lyndon words

Vincent Vajnovszki

LE2I - UMR CNRS, Université de Bourgogne, BP 47 870, 21078 Dijon Cedex, France

Received 22 August 2007; received in revised form 1 October 2007; accepted 24 October 2007

Available online 26 November 2007

Communicated by L. Boasson

Abstract

In recent years, the order induced by the Binary Reflected Gray Code (BRGC) or its generalizations has attracted increasing interest. In this note we show that the BRGC order induces a cyclic 2-Gray code on the sets of binary necklaces and Lyndon words and a cyclic 3-Gray code on their unlabeled counterparts. This is an improvement and a generalization to unlabeled words of the results in [V. Vajnovszki, Gray code order for Lyndon words, Discrete Math. Theoret. Comput. Sci. 9 (2) (2007) 145–152; M. Weston, V. Vajnovszki, Gray codes for necklaces and Lyndon words of arbitrary base, Pure Mathematics and Applications/Algebra and Theoretical Computer Science, in press]; however, an algorithmic implementation of our Gray codes remains an open problem. © 2007 Elsevier B.V. All rights reserved.

Keywords: Necklaces; Lyndon words; Gray codes; Combinatorial problems

1. Introduction

For a word $x = uv$, the word $vu$ is called a rotation of $x$. A necklace is a word that is lexicographically minimal under rotation, and a Lyndon word is an aperiodic necklace. An unlabeled necklace is a word that is lexicographically minimal under both rotation and permutation of the alphabet symbols, and an unlabeled Lyndon word is an aperiodic unlabeled necklace. The generation of necklaces and Lyndon words in lexicographic order has been widely studied in the literature; see, for instance, [6–9,11,14] and the references therein. We restrict our attention to binary words and denote by $N_n$, $L_n$, $UN_n$ and $UL_n$ the sets of binary length-$n$ necklaces, Lyndon words, unlabeled necklaces and unlabeled Lyndon words, respectively. These four sets are related by the inclusions:

$$
\begin{array}{ccc}
UL_n & \subset & UN_n \\
\cap & & \cap \\
L_n & \subset & N_n
\end{array}
$$

E-mail address: [email protected].

0020-0190/$ – see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.ipl.2007.10.011

A $k$-Gray code for a set of words $S$ is an ordered list for $S$ such that the Hamming distance between any two consecutive words in the list is at most $k$. If the distance between the last and the first words in the list is also bounded by $k$, then the Gray code is called cyclic; in addition, if $k$ is minimal, then we say that the Gray code is minimal.
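For small $n$ the definitions above can be checked by brute force. The following Python sketch is only an illustration: the function names (`rotations`, `four_sets`, `gray_width`, and so on) are not from the paper, words are represented as strings over '0' and '1', and the enumeration is exponential in $n$.

```python
from itertools import product

def rotations(w):
    """All rotations vu of w = uv (w itself included)."""
    return [w[i:] + w[:i] for i in range(len(w))]

def complement(w):
    """Swap the two symbols, the only non-trivial permutation of {0, 1}."""
    return w.translate(str.maketrans("01", "10"))

def is_necklace(w):
    return w == min(rotations(w))

def is_lyndon(w):
    # Aperiodic necklace: strictly smaller than each of its proper rotations.
    return all(w < r for r in rotations(w)[1:])

def is_unlabeled_necklace(w):
    return w == min(rotations(w) + rotations(complement(w)))

def four_sets(n):
    """Return (N_n, L_n, UN_n, UL_n) as lists in lexicographic order."""
    words = ["".join(bits) for bits in product("01", repeat=n)]
    N = [w for w in words if is_necklace(w)]
    L = [w for w in words if is_lyndon(w)]
    UN = [w for w in words if is_unlabeled_necklace(w)]
    UL = [w for w in UN if is_lyndon(w)]
    return N, L, UN, UL

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def gray_width(lst, cyclic=False):
    """Smallest k for which lst is a k-Gray code (cyclic if requested)."""
    pairs = list(zip(lst, lst[1:]))
    if cyclic and len(lst) > 1:
        pairs.append((lst[-1], lst[0]))
    return max((hamming(a, b) for a, b in pairs), default=0)

N, L, UN, UL = four_sets(6)
print(len(N), len(L), len(UN), len(UL))  # prints: 14 9 8 5
```

For instance, `gray_width(four_sets(3)[0])` evaluates the lexicographic list 000, 001, 011, 111, which is a 1-Gray code but not a cyclic one, since its last and first words differ in three positions.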

In [10,13], 2-Gray codes for binary necklaces with fixed density (a given number of ones) are given, and in [11,14], 3-Gray codes and generating algorithms for binary and $k$-ary necklaces and Lyndon words with no density restriction are presented. In [13] it is shown that, in general, there is no 1-Gray code for binary necklaces or Lyndon words. Here we give 2-Gray codes for length-$n$ binary necklaces and Lyndon words and 3-Gray codes for their unlabeled counterparts.

In recent years, the order induced by the Binary Reflected Gray Code (BRGC) [5] or its generalizations has attracted increasing interest [1,2,4,11,14]. For example, in [4] the authors use the BRGC order twice: first they define Gray-necklaces as binary strings that are minimal (under rotation) in BRGC order instead of lexicographic order, then they list them in BRGC order; the obtained list is almost a Gray code for necklaces in a non-standard representation. As in [11,14], our approach here is based on the BRGC order, but the obtained Gray codes for binary necklaces and Lyndon words are 2-Gray codes (and so minimal) and can be extended to unlabeled necklaces and Lyndon words; however, they seem less appropriate for an algorithmic implementation.

2. The main result

We begin by recalling the definition of the BRGC order and giving three technical 'facts' whose proofs can easily be recovered by the reader. Here we adopt the following definition of the BRGC, which is a slight modification (also called co-BRGC) of the original definition in [5].

Definition 1. Let $a = a_1 a_2 \ldots a_n$ and $b = b_1 b_2 \ldots b_n$ be words in $\{0,1\}^n$ and $i$ the rightmost position in which $a$ and $b$ differ. We say that $a$ is less than $b$ in BRGC order (denoted by $a \prec b$) if $\sum_{j=i}^{n} a_j$ is even.

Any set $X \subset \{0,1\}^n$ of length-$n$ binary words listed in $\prec$ order gives a suffix partitioned list, that is, all the words in $X$ with a common suffix are contiguous in the list. An order relation with this property is called a genlex (generalized lexicographical) order [12]. See Table 1 for the sets $N_6$, $L_6$, $UN_6$ and $UL_6$ listed in $\prec$ order.
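Definition 1 translates directly into a comparison function. The sketch below is an illustration under stated assumptions: words are Python strings over '0' and '1', the names `brgc_cmp` and `necklaces_in_brgc_order` are ours, and the necklaces are obtained by filtering all $2^n$ words rather than by any efficient generation scheme.

```python
from functools import cmp_to_key
from itertools import product

def brgc_cmp(a, b):
    """Three-way comparison for the order of Definition 1 (equal-length words)."""
    if a == b:
        return 0
    # Rightmost position where a and b differ (0-based index here).
    i = max(j for j in range(len(a)) if a[j] != b[j])
    # a precedes b exactly when the suffix a_i ... a_n holds an even number of 1s.
    return -1 if a[i:].count("1") % 2 == 0 else 1

def is_necklace(w):
    return w == min(w[i:] + w[:i] for i in range(len(w)))

def necklaces_in_brgc_order(n):
    words = ("".join(bits) for bits in product("01", repeat=n))
    return sorted((w for w in words if is_necklace(w)), key=cmp_to_key(brgc_cmp))

for w in necklaces_in_brgc_order(6):
    print(w)
```

The printed list begins with 000000 and ends with 000001, and words sharing a suffix appear contiguously, as expected for a suffix partitioned list.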

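Fact 1. Let $\alpha \in \{0,1\}^n$ and $j$, $1 \le j < n$. If $\alpha$ satisfies the following conditions: (i) $0^j$ is a prefix of $\alpha$, (ii) $\alpha$ has no other $0^j$ factor, then $\alpha$ is a Lyndon word. If, in addition, $\alpha$ has no $1^j$ factors, then $\alpha$ is an unlabeled Lyndon word. (Factors are read cyclically here, which amounts to requiring that $\alpha$ end with 1, as is the case wherever the fact is applied below; read linearly, 00110 with $j = 2$ satisfies (i) and (ii) but is not a necklace.)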
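Fact 1 can be tested exhaustively for small $n$. The sketch below makes one assumption explicit: occurrences of $0^j$ are searched linearly and the word is additionally required to end with 1, which rules out an occurrence wrapping around the end of the word (without this, 00110 with $j = 2$ would pass conditions (i) and (ii) although it is not a necklace). All function names are ours.

```python
from itertools import product

def is_lyndon(w):
    # Strictly smaller than every proper rotation.
    return all(w < w[i:] + w[:i] for i in range(1, len(w)))

def is_unlabeled_lyndon(w):
    # Lyndon, and not larger than any rotation of its complement.
    c = w.translate(str.maketrans("01", "10"))
    return is_lyndon(w) and all(w <= c[i:] + c[:i] for i in range(len(w)))

def satisfies_fact1(w, j):
    """0^j is a prefix of w, occurs nowhere else, and w ends with 1 (our assumption)."""
    zeros = "0" * j
    return w.startswith(zeros) and w.find(zeros, 1) == -1 and w.endswith("1")

def check_fact1(n):
    """Exhaustively test both parts of Fact 1 on all words of length n."""
    for bits in product("01", repeat=n):
        w = "".join(bits)
        for j in range(1, n):
            if not satisfies_fact1(w, j):
                continue
            if not is_lyndon(w):
                return False
            if "1" * j not in w and not is_unlabeled_lyndon(w):
                return False
    return True

print(all(check_fact1(n) for n in range(2, 11)))
```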

For a binary word $\alpha \in \{0,1\}^n$ let $|\alpha|_1$ denote the number of 1's in $\alpha$.

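Fact 2. For $\lambda \in \{0,1\}^n$ with $|\lambda|_1 \ge 1$ let $\lambda'$ be the binary word obtained from $\lambda$ by changing its leftmost 1 bit to 0.

• If $X \in \{N_n, UN_n\}$ and $\lambda \in X$ with $|\lambda|_1 \ge 1$, then $\lambda' \in X$.

• If $X \in \{L_n, UL_n\}$ and $\lambda \in X$ with $|\lambda|_1 \ge 2$, then $\lambda' \in X$.

Table 1. The sets of length 6 binary necklaces, Lyndon words, unlabeled necklaces and unlabeled Lyndon words in BRGC order. Every listed word belongs to $N_6$; a ✓ marks membership in the other sets.

| $N_6$ | $L_6$ | $UN_6$ | $UL_6$ |
|---|---|---|---|
| 000000 | | ✓ | |
| 000011 | ✓ | ✓ | ✓ |
| 011011 | | | |
| 001011 | ✓ | ✓ | ✓ |
| 001111 | ✓ | | |
| 111111 | | | |
| 011111 | ✓ | | |
| 010111 | ✓ | | |
| 000111 | ✓ | ✓ | ✓ |
| 000101 | ✓ | ✓ | ✓ |
| 010101 | | ✓ | |
| 001101 | ✓ | | |
| 001001 | | ✓ | |
| 000001 | ✓ | ✓ | ✓ |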

Fact 3. In $\prec$ order

• for $n \ge 1$ the first word in $N_n$ and in $UN_n$ is $0^n$;

• for $n \ge 3$ (resp., for $n \ge 4$) the first word in $L_n$ (resp., in $UL_n$) is $0^{n-2}1^2$;

• for $n \ge 2$ the last word in $N_n$, $UN_n$, $L_n$ and $UL_n$ is $0^{n-1}1$.
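The three points of Fact 3 can be observed by sorting each set in $\prec$ order and reading off its ends. This is a brute-force sketch with names of our choosing (`MEMBERSHIP`, `first_and_last`), not a method from the paper.

```python
from functools import cmp_to_key
from itertools import product

def rotations(w):
    return [w[i:] + w[:i] for i in range(len(w))]

def is_necklace(w):
    return w == min(rotations(w))

def is_lyndon(w):
    return all(w < r for r in rotations(w)[1:])

def is_unlabeled_necklace(w):
    c = w.translate(str.maketrans("01", "10"))
    return w == min(rotations(w) + rotations(c))

def brgc_cmp(a, b):
    """Order of Definition 1: parity of the 1s from the rightmost difference on."""
    if a == b:
        return 0
    i = max(j for j in range(len(a)) if a[j] != b[j])
    return -1 if a[i:].count("1") % 2 == 0 else 1

MEMBERSHIP = {
    "N": is_necklace,
    "L": is_lyndon,
    "UN": is_unlabeled_necklace,
    "UL": lambda w: is_lyndon(w) and is_unlabeled_necklace(w),
}

def first_and_last(n):
    """Map each set name to its (first, last) word in BRGC order."""
    words = ["".join(bits) for bits in product("01", repeat=n)]
    result = {}
    for name, member in MEMBERSHIP.items():
        lst = sorted(filter(member, words), key=cmp_to_key(brgc_cmp))
        result[name] = (lst[0], lst[-1])
    return result

print(first_and_last(6))
```

For $n = 3$ the same function shows why the bound for $UL_n$ is $n \ge 4$: $UL_3$ contains only 001, so its first word is not $0^{n-2}1^2 = 011$.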

Lemma 1. Let $X \in \{L_n, N_n, UL_n, UN_n\}$ and $\lambda \in X$. Suppose that $r$ is the rightmost position where $\lambda$ differs from its successor (resp., its predecessor) in $\prec$ order, assuming that $\lambda$ is not the last (resp., first) word in $X$ in this order. Then $|\lambda_1 \lambda_2 \ldots \lambda_{r-1}|_1 \le 1$.

Proof. By contradiction. Let $r$ be the rightmost position where $\lambda$ differs from $\mu$, its successor. If $|\lambda_1 \lambda_2 \ldots \lambda_{r-1}|_1 \ge 2$, then by the definition of BRGC order and by Fact 2 one of the words $\lambda'$ or $(\lambda')'$ belongs to $X$ and is larger than $\lambda$ and smaller than $\mu$ in $\prec$ order. The proof is similar when we consider the predecessor of $\lambda$. □

The next corollary is a weak version of Theorem 1 at the end of this section; a similar result is given in [11, Theorem 1] and we omit its proof.

Corollary 1. $\prec$ order induces a 3-Gray code on the sets $L_n$, $N_n$, $UL_n$ and $UN_n$.

Now we will prove that $\prec$ induces a more restrictive Gray code. Let $\alpha$ be a binary word with $|\alpha|_1 \ge 1$ and let $j$ be the leftmost position where $\alpha_j = 1$. If $\alpha_{j+1} = 0$, then define $\tilde{\alpha}$ as the binary word with $\tilde{\alpha}_i = \alpha_i$ for all $i$, except $\tilde{\alpha}_j = 0$ and $\tilde{\alpha}_{j+1} = 1$. The words $\alpha$ and $\tilde{\alpha}$ have the shape given by:

$$
\begin{aligned}
\alpha &= 0 \ldots 0\, 1\, 0\, \alpha_{j+2} \ldots \alpha_n, \\
\tilde{\alpha} &= 0 \ldots 0\, 0\, 1\, \alpha_{j+2} \ldots \alpha_n.
\end{aligned}
$$

If $\alpha \in N_n$, then $\tilde{\alpha} \in L_n$. This statement is formalized in the first point of the next lemma. This result is not true for unlabeled words. Indeed, $010101 \in UN_6$ but $001101 \notin UN_6$, since the representative of 001101 is $001011 \in UN_6$. Similarly, $00101101 \in UL_8$ but $00011101 \notin UL_8$, since the representative of 00011101 is $00010111 \in UL_8$. The second point of the next lemma gives a more restrictive unlabeled counterpart of its first point.

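Lemma 2. Let $X \subset \{0,1\}^n$, $\alpha \in X$ with $|\alpha|_1 \ge 1$, let $j$ be the leftmost position where $\alpha_j = 1$, and suppose that $\alpha_{j+1} = 0$.

(1) If $X \in \{L_n, N_n\}$, then $\tilde{\alpha} \in L_n$.

(2) If $X \in \{UL_n, UN_n\}$ and $\alpha_{j+2} = 0$, then $\tilde{\alpha} \in UL_n$.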

Proof. The length $j-1$ prefix of $\alpha$ is $0^{j-1}$ and $\alpha$ has no $0^j$ factor. The word $\tilde{\alpha}$ has a unique $0^j$ factor, which is its length $j$ prefix, and so, by Fact 1, it is a Lyndon word and the first point holds.

If $X \in \{UL_n, UN_n\}$, then $\alpha$ has no $1^j$ factor. In addition, if $\alpha_{j+2} = 0$, then $\tilde{\alpha}$ has no $1^j$ factor either. Again, by Fact 1, $\tilde{\alpha}$ is an unlabeled Lyndon word, and the second point holds. □
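The map $\alpha \mapsto \tilde{\alpha}$ and both points of Lemma 2 can be exercised on small cases, including the two counterexamples quoted before the lemma. In this sketch the helper `tilde` returns `None` when $\tilde{\alpha}$ is undefined; that convention and all function names are ours.

```python
from itertools import product

def rotations(w):
    return [w[i:] + w[:i] for i in range(len(w))]

def is_necklace(w):
    return w == min(rotations(w))

def is_lyndon(w):
    return all(w < r for r in rotations(w)[1:])

def is_unlabeled_necklace(w):
    c = w.translate(str.maketrans("01", "10"))
    return w == min(rotations(w) + rotations(c))

def is_unlabeled_lyndon(w):
    return is_lyndon(w) and is_unlabeled_necklace(w)

def tilde(a):
    """Move the leftmost 1 one step right if a 0 follows it; otherwise None."""
    j = a.find("1")
    if j == -1 or j + 1 >= len(a) or a[j + 1] != "0":
        return None
    return a[:j] + "01" + a[j + 2:]

def check_lemma2(n):
    """Exhaustively test points (1) and (2) of Lemma 2 for words of length n."""
    for bits in product("01", repeat=n):
        a = "".join(bits)
        t = tilde(a)
        if t is None:
            continue
        j = a.find("1")  # 0-based index of the leftmost 1
        if is_necklace(a) and not is_lyndon(t):
            return False
        if (is_unlabeled_necklace(a) and j + 2 < n and a[j + 2] == "0"
                and not is_unlabeled_lyndon(t)):
            return False
    return True

print(tilde("010101"), is_unlabeled_necklace(tilde("010101")))
```

Point (2) does not apply to 010101 because the symbol two places after its leftmost 1 is 1 rather than 0, which is consistent with the counterexample.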

Let $X \in \{L_n, N_n, UL_n, UN_n\}$ and $\lambda \in X$. If $\lambda$ differs from its successor in $\prec$ order in at least two positions, then the leftmost and the rightmost of these positions cannot be arbitrarily far apart. This situation is formally stated in the next lemma and summarized in the table at the end of its proof.

Lemma 3. Let $X \subset \{0,1\}^n$ and $\lambda \in X$. Suppose that $\lambda$ differs from its successor in $\prec$ order in at least two positions (assuming that $\lambda$ is not the last word in $X$ in this order). Let $\ell$ and $r$ be, respectively, the leftmost and the rightmost of these positions.

(1) If $\lambda_\ell \neq \lambda_r$ and $X \in \{L_n, N_n, UL_n, UN_n\}$, then $r = \ell + 1$.

(2) If $\lambda_\ell = \lambda_r$ and $X \in \{L_n, N_n\}$, then $r = \ell + 1$.

(3) If $\lambda_\ell = \lambda_r$ and $X \in \{UL_n, UN_n\}$, then $r = \ell + 1$ or $r = \ell + 2$. In addition, if $r = \ell + 2$, then $\lambda_{\ell+1} \neq \lambda_\ell$.

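Proof. The proof is by contradiction, supposing that $r - \ell > 1$ or $r - \ell > 2$, respectively. Let $\mu$ be the successor of $\lambda$ in $\prec$ order.

(1) Suppose that $r - \ell > 1$.

When $\lambda_\ell = 1$ and $\lambda_r = 0$, then $\mu_\ell = 0$ and $\mu_r = 1$, and by Lemma 1, $\lambda_i = 0$ for all $i$ with $1 \le i < \ell$ and $\ell < i < r$. Since $\lambda_\ell = 1$ and $\lambda_{\ell+1} = \lambda_{\ell+2} = 0$, applying Lemma 2 we have that the word $\tilde{\lambda}$ is in $X$. Moreover, it is easy to check that $\lambda \prec \tilde{\lambda} \prec \mu$, which contradicts $\mu$ being the successor of $\lambda$ in $\prec$ order.

Similarly, if $\lambda_\ell = 0$ and $\lambda_r = 1$, then $\mu_\ell = 1$ and $\mu_r = 0$. In this case, again by Lemma 1, $\mu_i = 0$ for all $i$ with $1 \le i < \ell$ and $\ell < i < r$. Since $\mu_\ell = 1$ and $\mu_{\ell+1} = \mu_{\ell+2} = 0$, applying Lemma 2 we obtain that $\tilde{\mu}$ is in $X$ and $\lambda \prec \tilde{\mu} \prec \mu$.

(2) Suppose that $r - \ell > 1$.

When $\lambda_\ell = \lambda_r = 1$, then $\mu_\ell = \mu_r = 0$, and $\lambda_i = 0$ for all $i$ with $1 \le i < \ell$ and $\ell < i < r$. Since $\lambda_\ell = 1$ and $\lambda_{\ell+1} = 0$, by the first point of Lemma 2, $\tilde{\lambda}$ is in $X$ and $\lambda \prec \tilde{\lambda} \prec \mu$.

When $\lambda_\ell = \lambda_r = 0$, then $\mu_\ell = \mu_r = 1$, $\mu_{\ell+1} = 0$ and similarly $\tilde{\mu}$ is in $X$ with $\lambda \prec \tilde{\mu} \prec \mu$.

(3) Suppose that $r - \ell > 2$.

When $\lambda_\ell = \lambda_r = 1$, then $\lambda_{\ell+1} = \lambda_{\ell+2} = 0$, and by point 2 of Lemma 2 the word $\tilde{\lambda}$ is in $X$; we have that $\lambda \prec \tilde{\lambda}$, and as above $\tilde{\lambda} \prec \mu$.

Similarly, when $\lambda_\ell = \lambda_r = 0$, we have $\lambda \prec \tilde{\mu} \prec \mu$.

In addition, if $r - \ell = 2$, then $\lambda_{\ell+1} \neq \lambda_\ell$ and $\mu_{\ell+1} \neq \mu_\ell$.

| $L_n$, $N_n$, $UL_n$, $UN_n$ | $L_n$, $N_n$ | $UL_n$, $UN_n$ |
|---|---|---|
| $\lambda_\ell \neq \lambda_r$ | $\lambda_\ell = \lambda_r$ | $\lambda_\ell = \lambda_r$ |
| $\Downarrow$ | $\Downarrow$ | $\Downarrow$ |
| $r = \ell + 1$ | $r = \ell + 1$ | $r \le \ell + 2$ |

□

Recall that a Gray code is a $k$-Gray code if successive words differ in at most $k$ positions, and cyclic if the last and the first words differ in the same way. As a consequence of the previous lemma, together with Fact 3, we can state the main result of this note.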

Theorem 1.

(1) BRGC order induces a cyclic 2-Gray code on $N_n$ and $L_n$.

(2) BRGC order induces a cyclic 3-Gray code on $UN_n$ and $UL_n$.

Moreover, if two successive words, in BRGC order, in $N_n$, $L_n$, $UN_n$ or $UL_n$ differ in more than one position, then these positions are consecutive.
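Theorem 1 can be confirmed experimentally for small $n$. The following sketch is a brute-force check, not an efficient listing algorithm (which the paper leaves open); it filters all $2^n$ words and sorts them with a comparator implementing Definition 1. The function names are ours.

```python
from functools import cmp_to_key
from itertools import product

def rotations(w):
    return [w[i:] + w[:i] for i in range(len(w))]

def is_necklace(w):
    return w == min(rotations(w))

def is_lyndon(w):
    return all(w < r for r in rotations(w)[1:])

def is_unlabeled_necklace(w):
    c = w.translate(str.maketrans("01", "10"))
    return w == min(rotations(w) + rotations(c))

def is_unlabeled_lyndon(w):
    return is_lyndon(w) and is_unlabeled_necklace(w)

def brgc_cmp(a, b):
    """Order of Definition 1: parity of the 1s from the rightmost difference on."""
    if a == b:
        return 0
    i = max(j for j in range(len(a)) if a[j] != b[j])
    return -1 if a[i:].count("1") % 2 == 0 else 1

def in_brgc_order(n, member):
    words = ("".join(bits) for bits in product("01", repeat=n))
    return sorted(filter(member, words), key=cmp_to_key(brgc_cmp))

def is_consecutive_k_gray(lst, k):
    """Cyclically successive words differ in at most k positions, all adjacent."""
    for a, b in zip(lst, lst[1:] + lst[:1]):
        d = [i for i in range(len(a)) if a[i] != b[i]]
        if len(d) > k or (d and d[-1] - d[0] + 1 != len(d)):
            return False
    return True

def check_theorem1(n):
    return (is_consecutive_k_gray(in_brgc_order(n, is_necklace), 2)
            and is_consecutive_k_gray(in_brgc_order(n, is_lyndon), 2)
            and is_consecutive_k_gray(in_brgc_order(n, is_unlabeled_necklace), 3)
            and is_consecutive_k_gray(in_brgc_order(n, is_unlabeled_lyndon), 3))

print([n for n in range(3, 11) if not check_theorem1(n)])
```

The final line prints the values of $n$ in the tested range for which the check fails, so an empty list means no counterexample was found there.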


3. Final remarks

The BRGC order $\prec$ induces a cyclic 2-Gray code on $N_n$ and $L_n$, and, as mentioned in the Introduction, it is minimal. For example, in the Gray code list for $N_6$ in Table 1 there are four 2-changes between successive words. We do not know if our Gray codes for $N_n$ and $L_n$ minimize the number of 2-changes between successive words.

Also, our Gray codes for $UN_n$ and $UL_n$ are 3-Gray codes, and in the list for $UN_6$ in Table 1 there is one 3-change, namely between the 6th and the 7th words. A more restrictive list (with no 3-changes) for $UN_6$ is (000000, 000001, 010101, 000111, 000101, 001001, 001011, 000011); this shows that our Gray codes for $UN_n$ and $UL_n$ might not be minimal.
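The claim about the alternative list for $UN_6$ is easy to check mechanically. In the sketch below, `table1_un6` is the $UN_6$ column of Table 1 read top to bottom and `alternative` is the list quoted above; the helper names are ours.

```python
from itertools import product

def is_unlabeled_necklace(w):
    c = w.translate(str.maketrans("01", "10"))
    candidates = [u[i:] + u[:i] for u in (w, c) for i in range(len(w))]
    return w == min(candidates)

def cyclic_width(lst):
    """Largest Hamming distance between cyclically successive words."""
    pairs = zip(lst, lst[1:] + lst[:1])
    return max(sum(x != y for x, y in zip(a, b)) for a, b in pairs)

un6 = ["".join(bits) for bits in product("01", repeat=6)
       if is_unlabeled_necklace("".join(bits))]

table1_un6 = ["000000", "000011", "001011", "000111",
              "000101", "010101", "001001", "000001"]
alternative = ["000000", "000001", "010101", "000111",
               "000101", "001001", "001011", "000011"]

# Both lists contain exactly the unlabeled necklaces of length 6.
print(sorted(table1_un6) == sorted(un6), sorted(alternative) == sorted(un6))
print(cyclic_width(table1_un6), cyclic_width(alternative))  # prints: 3 2
```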

The main difference between the Gray codes presented here and the ones in [11] is that the latter are prefix partitioned lists. The algorithmic implementation of the suffix partitioned Gray codes presented in this paper seems more difficult. Indeed, the recursive definitions for $N_n$ and $L_n$ [3, Theorem 2] (on which the algorithm in [11] is based), and for $UN_n$ and $UL_n$ [3, Theorem 5], work only for prefix partitioned lists. This last remark suggests:

Question 1. Can the Gray codes presented here be efficiently implemented? That is, can $N_n$, $L_n$, $UN_n$ and $UL_n$ be efficiently listed in BRGC order (defined in Definition 1)?

A natural generalization of the BRGC order to a $k$-ary alphabet is given by Weston and Vajnovszki (see [14]).

Definition. Let $k, n > 0$, let $a = a_1 a_2 \ldots a_n$ and $b = b_1 b_2 \ldots b_n$ be words in $\{0, 1, \ldots, k-1\}^n$, and let $i$ be the rightmost position in which $a$ and $b$ differ. We say that $a$ is less than $b$ in reflected Gray code order (denoted by $a \prec b$) if either

• $a_i < b_i$ and $\sum_{j=i+1}^{n} a_j = \sum_{j=i+1}^{n} b_j$ is even, or

• $a_i > b_i$ and $\sum_{j=i+1}^{n} a_j = \sum_{j=i+1}^{n} b_j$ is odd.

Question 2. Can the previous results be extended to a $k$-ary alphabet? Experimental results show that the successor, in $\prec$ order, of the 3-ary length 6 Lyndon word 001002 is 012102, and the successor of 012202 is 022212; and those are the only consecutive (in $\prec$ order) 3-ary length 6 Lyndon words which do not differ in at most two consecutive positions.
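The experiment behind Question 2 can be reproduced along the following lines. This is a sketch under our own conventions: words are tuples of integers, `rgc_cmp` implements the $k$-ary definition above, and `exceptions` collects the successive pairs whose differing positions do not fit into two consecutive positions. The loop prints whatever pairs it finds for ternary Lyndon words of length 6, so that they can be compared with the two pairs reported above.

```python
from functools import cmp_to_key
from itertools import product

def rgc_cmp(a, b):
    """Three-way comparison for the k-ary reflected Gray code order."""
    if a == b:
        return 0
    i = max(j for j in range(len(a)) if a[j] != b[j])
    # The symbols after position i agree in a and b; their sum sets the direction.
    suffix_even = sum(a[i + 1:]) % 2 == 0
    return -1 if (a[i] < b[i]) == suffix_even else 1

def is_lyndon(w):
    return all(w < w[i:] + w[:i] for i in range(1, len(w)))

def lyndon_words_in_order(k, n):
    words = product(range(k), repeat=n)
    return sorted((w for w in words if is_lyndon(w)), key=cmp_to_key(rgc_cmp))

def exceptions(lst):
    """Successive pairs that differ outside a window of two consecutive positions."""
    out = []
    for a, b in zip(lst, lst[1:]):
        d = [i for i in range(len(a)) if a[i] != b[i]]
        if d[-1] - d[0] > 1:
            out.append((a, b))
    return out

for a, b in exceptions(lyndon_words_in_order(3, 6)):
    print("".join(map(str, a)), "->", "".join(map(str, b)))
```

For $k = 2$ the comparator agrees with Definition 1, and `exceptions(lyndon_words_in_order(2, 6))` is empty, in line with Theorem 1.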

Acknowledgement

The author would like to thank one of the anonymous referees for helpful suggestions which have improved the accuracy of this paper.

References

[1] J.L. Baril, V. Vajnovszki, Minimal change list for Lucas strings and some graph theoretic consequences, Theoret. Comput. Sci. 346 (2–3) (2005) 189–199.

[2] M.W. Bunder, K.P. Tognetti, G.E. Wheeler, On binary reflected Gray codes and functions, Discrete Math., submitted for publication.

[3] K. Cattell, F. Ruskey, J. Sawada, M. Serra, C.R. Miers, Fast algorithms to generate necklaces, unlabeled necklaces, and irreducible polynomials over GF(2), J. Algorithms 37 (2) (2000) 267–282.

[4] C. Degni, A.A. Drisko, Gray-ordered binary necklaces, Electron. J. Combin. 14 (2007), #R7.

[5] F. Gray, Pulse code communication, U.S. Patent 2632058, 1953.

[6] F. Ruskey, C. Savage, T. Wang, Generating necklaces, J. Algorithms 13 (1992) 414–430.

[7] F. Ruskey, J. Sawada, Generating necklaces and strings with forbidden substrings, in: 6th Annual International Combinatorics and Computing Conference (COCOON), Sydney, Australia, July 2000, in: LNCS, vol. 1858, Springer, 2000, pp. 330–339.

[8] F. Ruskey, Combinatorial Generation, Book in preparation.

[9] J. Sawada, F. Ruskey, An efficient algorithm for generating necklaces with fixed density, SIAM J. Comput. 29 (1999) 671–684.

[10] T. Ueda, Gray codes for necklaces, Discrete Math. 219 (1–3) (2000) 235–248.

[11] V. Vajnovszki, Gray code order for Lyndon words, Discrete Math. Theoret. Comput. Sci. 9 (2) (2007) 145–152.

[12] T. Walsh, Generating Gray codes in O(1) worst-case time per word, in: Proceedings of 4th Conference on Discrete Mathematics and Theoretical Computer Science, Dijon, France, 7–12 July, Springer-Verlag, Berlin, 2003, pp. 73–78.

[13] T.M.Y. Wang, C.D. Savage, A Gray code for necklaces of fixed density, SIAM J. Discrete Math. 9 (4) (1996) 654–673.

[14] M. Weston, V. Vajnovszki, Gray codes for necklaces and Lyndon words of arbitrary base, Pure Mathematics and Applications/Algebra and Theoretical Computer Science, in press.