
INFORMATION AND CONTROL 25, 371-392 (1974)

One-Sided and Two-Sided Context in Formal Grammars

MARTTI PENTTONEN

Mathematics Department, University of Turku, 20500 Turku 50, Finland

Communicated by Arto Salomaa

A proof for the equivalence of context-sensitive and one-sided context-sensitive languages is given. This yields as a corollary the normal form A → BC, AB → AC, A → a for context-sensitive grammars.

1. INTRODUCTION

One-sided context is a very natural restriction for context-sensitive grammars. The problem of whether this restriction properly decreases the generating power has been open for many years. An easier problem is whether all context-sensitive languages can be generated by grammars with one-sided and permuting (type XY → YX) productions. The positive answer to this question can be found in Penttonen (1972) and Révész (1974).

Many examples (Havel, 1970; Samoilenko, 1969, 1971, 1973) have been found to support the equivalence of one-sided context to two-sided context, and many attempts have been made to prove the equivalence. We present a detailed proof of it. Our proof is based on the ideas of Haines (1970) and Gladkij (1973): we follow the proof strategy of Gladkij (1973) and supply the constructions lacking there.

The task is to simulate context-sensitive productions by left context-sensitive ones. Since each context-sensitive language can be derived from an infinite set of axioms by productions of the type AB → CD (cf. Lemma 3), it is sufficient to simulate only productions of this type.

The first attempt is to replace AB → CD by A → C and B → D, but this leads to parasitic derivations.
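The failure of the naive splitting can be seen with a few lines of Python (an illustrative sketch; the rule sets and the axiom word are ours, not the paper's):

```python
def successors(word, rules):
    # all words obtained by one application of some rule lhs -> rhs
    out = set()
    for lhs, rhs in rules:
        i = word.find(lhs)
        while i != -1:
            out.add(word[:i] + rhs + word[i + len(lhs):])
            i = word.find(lhs, i + 1)
    return out

def derivable(word, rules):
    # reflexive-transitive closure; terminates because the rules here
    # are length preserving over a finite alphabet
    seen, frontier = {word}, {word}
    while frontier:
        frontier = {w for v in frontier for w in successors(v, rules)} - seen
        seen |= frontier
    return seen

joint = derivable("AAB", [("AB", "CD")])            # the intended production
split = derivable("AAB", [("A", "C"), ("B", "D")])  # the naive splitting

# "CAD" rewrites the first A and the B separately, never an adjacent AB pair
assert "CAD" in split and "CAD" not in joint
```

From the axiom AAB the split rules reach CAD, a word the production AB → CD can never produce.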

There is a way to avoid this difficulty. Left context-sensitive productions can transport information from the left to the right. Each simulation begins with printing an endmarker at the beginning of the word. Information of this operation moves letter by letter to the right. In passing, each letter is marked. Before reaching the end, two successive letters A and B are replaced by C and D. Ultimately an endmarker is printed at the end. Now the marks can be erased from the letters and a new simulation cycle can begin. Those derivations are correct in which the numbers of endmarkers at the beginning and at the end are equal. But now we have a new problem with the endmarkers: the number of endmarkers may be exponential in the length of the word we want to derive.

Copyright © 1974 by Academic Press, Inc. All rights of reproduction in any form reserved.

The exponential number of endmarkers can be replaced by a linear one in the following way. Instead of a single endmarker, we introduce k ≥ 2 endmarkers. The strings of endmarkers are considered as k-ary numbers. By each cycle of simulation, one is added to the k-ary numbers (cf. Lemma 7). Only linear erasing is needed. Still we have to exclude the parasitic derivations, in which the k-ary numbers are unequal.

The simulation is length preserving. It begins from a word 0^lg(P0) P0 0^lg(P0), where P0 is an axiom (cf. Lemmas 5 and 7). During the whole simulation, the lengths of the number blocks remain equal. In this case, it is possible to test the equality of the k-ary numbers by left context-sensitive productions (cf. Lemma 6, the mirror image lemma).

So we can simulate any context-sensitive grammar by left context-sensitive productions. The only problem is the linear amount of garbage.

We know the ratio of the numbers of garbage letters and terminal letters. This helps to shift the terminals to the end (cf. Lemma 8). Then the terminals can be spread uniformly over the word (cf. Lemma 9). After that the garbage can be erased by a restricted homomorphism (cf. Lemma 2). This completes the construction. The whole process is effective but highly impractical.

2. PRELIMINARIES AND BASIC RESULTS

In this section we define basic concepts of grammars and transformations and prove some useful lemmas. We use the notations of Salomaa (1973).

Let V_N and V_T be disjoint sets and V = V_N ∪ V_T. A production is context-sensitive iff it is of the form PAQ → PRQ, where A ∈ V_N, P ∈ V*, Q ∈ V* and R ∈ V+. A production is left context-sensitive iff it is of the form PA → PR, where P, A, R are as above.

A (left) context-sensitive transformation is an ordered triple T = (V_N, V_T, F), where F is a set of (left) context-sensitive productions.

A (left) context-sensitive grammar is an ordered quadruple G = (V_N, V_T, S, F), where S ∈ V_N is the start symbol.

Given a set F of productions, the relation ⇒ on V* is defined by: P ⇒ Q iff there is P′ → Q′ in F such that P = P1 P′ P2 and Q = P1 Q′ P2. Let ⇒^m be the mth (m ≥ 0) power of ⇒ and ⇒* be the reflexive, transitive closure of ⇒, i.e. ⇒* = ∪_{m=0}^∞ ⇒^m. Let L ⊆ V_N* and T be a transformation. We define

T(L) = {Q | P ⇒* Q for some P ∈ L} ∩ V_T*.

The language generated by a grammar is defined by

L(G) = {Q | S ⇒* Q} ∩ V_T*.

A language is (left) context-sensitive iff it is generated by a (left) context-sensitive grammar.

We first give a normal form for left context-sensitive grammars. It is needed in Lemma 2 and Theorem 2.

LEMMA 1. Every left context-sensitive language can be generated by a grammar whose productions are of the form

A → B, A → BC, AB → AC, A → a,

where A, B, and C are nonterminals and a is a terminal.

Proof. Let G be a left context-sensitive grammar generating the language. We replace the productions of G by productions in the required form. We may assume that all productions containing terminals are of the form A → a. For each production of G, let X_i, Y_i, Z_i, U_i (i = 1, 2, ...) be new nonterminals.

Productions A → A1 ··· An (n ≥ 2) are replaced by

A → A1 X1, X1 → A2 X2, ..., X_{n−2} → A_{n−1} A_n.

Productions AB → AB1 ··· Bn (n ≥ 1) are replaced by

AB → AY_n, Y_n → Y_{n−1} B_n, ..., Y_2 → Y_1 B_2, AY_1 → AB_1.

Productions A1 ··· A_m B → A1 ··· A_m B1 ··· Bn (m ≥ 1, n ≥ 1) are replaced by

A1 A2 → A1 Z2, Z2 A3 → Z2 Z3, ..., Z_{m−1} A_m → Z_{m−1} Z_m,

Z_m B → Z_m U_n, U_n → U_{n−1} B_n, ..., U_2 → U_1 B_2,

Z_m U_1 → Z_m B_1, A1 Z2 → A1 A2, ..., A_{m−1} Z_m → A_{m−1} A_m.

Evidently the new grammar is equivalent to the original grammar G.
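The first replacement scheme is mechanical enough to state as code (a sketch; the function name and the representation of productions as (left, right) pairs over symbol lists are our own):

```python
def split_long_rhs(A, rhs):
    # Replace A -> A1 ... An (n >= 2) by the chain
    # A -> A1 X1, X1 -> A2 X2, ..., X_{n-2} -> A_{n-1} A_n,
    # as in the proof of Lemma 1; the X_i are fresh nonterminals.
    n = len(rhs)
    assert n >= 2
    X = [f"X{i}" for i in range(1, n - 1)]   # X1 ... X_{n-2}
    lefts = [A] + X
    out = []
    for i, left in enumerate(lefts):
        if i < len(lefts) - 1:
            out.append((left, [rhs[i], X[i]]))   # left -> A_{i+1} X_{i+1}
        else:
            out.append((left, rhs[-2:]))         # last: -> A_{n-1} A_n
    return out

# A -> A1 A2 A3 becomes A -> A1 X1, X1 -> A2 A3
assert split_long_rhs("A", ["A1", "A2", "A3"]) == [
    ("A", ["A1", "X1"]), ("X1", ["A2", "A3"])]
```

The remaining two schemes are handled analogously, working from the right end of the right-hand side.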


It can be shown that the family of left context-sensitive languages is an AFL. Here we need only the closure under restricted homomorphisms. [For definitions, see Salomaa (1973).]

LEMMA 2. Let L be a left context-sensitive language and h a homomorphism 4-restricted on L. Then h(L) − {λ} is a left context-sensitive language.

Proof. Let G be a grammar generating L. We assume that it is of the form of Lemma 1. Denote¹

L0 = {P | S ⇒* P, P ∈ V_T+, lg(P) ≤ 3},

L1 = {P | S ⇒* P, 4 ≤ lg(P) < 8},

L2 = {P | P ∈ V+, 4 ≤ lg(P) < 8},

L3 = {P | P ∈ V_T+, 4 ≤ lg(P) < 8}.

We construct a grammar G′ generating h(L) − {λ}. Let S′ be the new start symbol and for each P ∈ L2, let (P) be a new nonterminal. The productions of G′ are the following:

S′ → h(P), if P ∈ L0 and h(P) ≠ λ,

S′ → (P), if P ∈ L1,

(P) → (Q), if P, Q ∈ L2 and P ⇒_G Q,

(P) → (Q1)(Q2), if P, Q1, Q2 ∈ L2 and P ⇒_G Q1 Q2,

(P1)(P2) → (P1)(Q), if P1, P2, Q ∈ L2 and P1 P2 ⇒_G P1 Q,

(P) → h(P), if P ∈ L3.

If the brackets are erased, the derivations by the new grammar become exactly the same as the derivations by the original grammar. Therefore L(G′) = h(L) − {λ}.

The following lemma is a modification of a result in Kuroda (1964). Because of an error in the original paper, we refer to Salomaa (1969).

LEMMA 3. Every context-sensitive language L ⊆ V_T V_T+ can be represented in the form T(SI+), where S, I ∈ V_N and

(i) The productions of T are of the form

AB → CD (AB ∈ V^2 − V_T^2, CD ∈ V^2),

(ii) There is a constant k such that

(∀P ∈ L)(∃m < k^lg(P)) S I^(lg(P)−1) ⇒^m P.

¹ lg(P) is the length of the word P.


Proof. Let G = (V_N, V_T, S, F) be a linear bounded grammar (cf. Salomaa, 1969) generating L. Let T = (V_N ∪ {I}, V_T, F′) be the context-sensitive transformation, where F′ is defined as follows. Each production AB → CD in F is also in F′. If A → B is in F, then all productions XA → XB and AX → BX (X ∈ V_N ∪ {I}) are in F′. For each production A → BC in F, AI → BC is in F′. In addition to these, F′ contains all productions XI → IX and IX → XI, where X ∈ V_N. Clearly L = T(SI+) and (i) is satisfied. Condition (ii) is satisfied, because there are only k^lg(P) words of the length lg(P), where k is the cardinality of V_T ∪ V_N ∪ {I}.
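The assembly of F′ from F can be sketched as follows (an illustrative helper of our own; productions are (lhs, rhs) pairs over symbol lists and "I" is the padding letter of the proof):

```python
def build_F_prime(F, VN):
    # F' as in the proof of Lemma 3
    ctx = list(VN) + ["I"]
    Fp = []
    for lhs, rhs in F:
        if len(lhs) == 2:                  # AB -> CD is kept as it is
            Fp.append((lhs, rhs))
        elif len(rhs) == 1:                # A -> B gets one-letter contexts
            A, B = lhs[0], rhs[0]
            for X in ctx:
                Fp.append(([X, A], [X, B]))
                Fp.append(([A, X], [B, X]))
        else:                              # A -> BC consumes one padding I
            Fp.append((lhs + ["I"], rhs))
    for X in VN:                           # I commutes with every nonterminal
        Fp.append(([X, "I"], ["I", X]))
        Fp.append((["I", X], [X, "I"]))
    return Fp

Fp = build_F_prime([(["A"], ["B", "C"])], ["A", "B", "C"])
assert (["A", "I"], ["B", "C"]) in Fp      # the length-increasing case
```

Every production of F′ has a two-letter left side and a two-letter right side, as condition (i) requires.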

In order to keep the size of the constructions reasonable, we prove the following lemma. It allows us to split our construction into several transformations.

LEMMA 4. If L is a left context-sensitive language and T a left context-sensitive transformation, then T(L) is a left context-sensitive language.

Proof. Let G1 = (V1, V_T, S, F1) be a left context-sensitive grammar and T = (V2, V_T, F2) a left context-sensitive transformation. We assume that V1 and V2 are disjoint. Let V2′ = {X′ | X ∈ V2} be a set of new letters. When we replace in all productions of F2 the letters from V2 by the corresponding primed ones, we get a new set F2′ of productions. Let F′ be the set of productions X → X′, where X ranges over V2. Then T(L) = L(G), where G is the left context-sensitive grammar (V1 ∪ V2 ∪ V2′, V_T, S, F1 ∪ F′ ∪ F2′).

When we simulate context-sensitive transformations by left context-sensitive ones, we need the following example. It is due to Samoilenko (1969), but we present another proof.

LEMMA 5. {#a^n b^n c^n | n ≥ 2} is a left context-sensitive language.

Proof. We first construct a left context-sensitive transformation T = T^{XYZ}_{xyz} such that

(0) T(X+YY+ZZ+) = {x^k y^m z^n | k ≥ 1, m ≥ n ≥ 2}.

The transformation is length preserving. During a derivation, it is tested whether m ≥ n in the axiom X^k Y^m Z^n. In the positive case X, Y and Z can be replaced correspondingly by x, y and z; otherwise the derivation does not terminate. The checking takes place by changing the nonterminals Y and Z to U and V so that for each change of Z to V there is at least one change of Y to U. The productions of T^{XYZ}_{xyz} are the following:


(1) (a) XY → XU″,
    (b) UY → UU″,
    (c) U″Y → U″Y′,
    (d) Y′Y → Y′Y′,

(2) (a) Y′Z → Y′V″,
    (b) Y′V → Y′V′,
    (c) U″V → U″V′,
    (d) V′V → V′V′,
    (e) V′Z → V′V″,

(3) (a) U″ → U,
    (b) UY′ → UY,
    (c) YY′ → YY,

(4) (a) YV″ → YV,
    (b) YV′ → YV,
    (c) UV′ → UV,
    (d) VV′ → VV,
    (e) VV″ → VV,

(5) (a) X → x,
    (b) U → y,
    (c) V → z.


Each word x^k y^m z^n (k ≥ 1, d = m − n ≥ 1, n ≥ 2) can be derived from X^k Y^m Z^n by the control word

(1a)(3a)((1b)(3a))^(d−1)

· (1b)(1c)(1d)^(n−2)(2a)(3a)(3b)(3c)^(n−2)(4a)

· ∏_{i=1}^{n−2} ((1b)(1c)(1d)^(n−2−i)(2b)(2d)^(i−1)(2e)(3a)(3b)(3c)^(n−2−i)(4b)(4d)^(i−1)(4e))

· (1b)(2c)(2d)^(n−2)(2e)(3a)(4c)(4d)^(n−2)(4e)

· (5a)^k (5b)^m (5c)^n.

If d = 0, then the first line is omitted and (1b) on the second line is replaced by (1a). Hence the right side of (0) is included in the left side.
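A control word is just a prescription of which production to apply at each step. A minimal interpreter (our sketch: it always applies a production at the leftmost occurrence of its left-hand side, which suffices for the derivations used here) makes the notion concrete:

```python
def run_control(word, rules, control):
    # rules: label -> (lhs, rhs); control: a sequence of labels.
    # Each production is applied at the leftmost occurrence of its
    # left-hand side; the derivation fails if there is none.
    for label in control:
        lhs, rhs = rules[label]
        i = word.find(lhs)
        if i < 0:
            raise ValueError(f"production {label} not applicable to {word}")
        word = word[:i] + rhs + word[i + len(lhs):]
    return word

# a toy three-production system, not the one of Lemma 5
rules = {"1": ("XY", "Xy"), "2": ("yY", "yy"), "3": ("X", "x")}
assert run_control("XYY", rules, ["1", "2", "3"]) == "xyy"
```

Exponents in a control word, such as (1d)^(n−2), simply denote repetition of the corresponding label.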


When we prove the reverse inclusion, we may assume that termination by (5) takes place at the end of the derivation. For every word P derivable from X+YY+ZZ+, let k(P) be the number of the subwords U″Y, Y′Y, U″V, Y′V, Y′Z, V′V and V′Z in P. Each time the number of letters V increases [by (2a) or (2e)], k(P) decreases. On the other hand, k(P) increases only by (1a) or (1b), when also the number of U's increases. Since the last word before termination contains no primed nonterminals, the number of U's in that word must be greater than or equal to the number of V's. Thus only words from {x^k y^m z^n | k ≥ 1, m ≥ n ≥ 2} are derivable. This completes the proof of (0).

The context-free language

L0 = {#A′^m B′^n C′^m | m ≥ 2, n ≥ 2}

is left context-sensitive. Let T1 be the transformation of (0) taken with (X, Y, Z) = (#, A′, B′) and (x, y, z) = (#′, A, B), extended with the production C′ → C, and let T2 be the transformation of (0) taken with (X, Y, Z) = (A, B, C) and (x, y, z) = (a, b, c), extended with #′ → #. Then

T2(T1(L0)) = T2({#′A^m B^n C^m | m ≥ n ≥ 2}) = {#a^n b^n c^n | n ≥ 2}

is a left context-sensitive language by Lemma 4.

3. SIMULATION

By Lemma 3, it is sufficient to simulate productions of the type AB → CD by left context-sensitive ones. This will be done in Lemma 7, where Lemma 6 is needed to eliminate parasitic derivations.

In this section we use some standard notations. We use the language of Lemma 5 as the axiom set. The transformations will be length preserving. The descendants of a and b (correspondingly b and c) will always be letters from different alphabets. Thus all derivable words will be of the form W = #PQR, where lg(P) = lg(Q) = lg(R) and P and Q (correspondingly Q and R) are words over disjoint alphabets.

LEMMA 6. Let V1 and V2 be disjoint alphabets and

L ⊆ {#PQR | P ∈ V1+, Q ∈ V2+, R ∈ V1+, lg(P) = lg(Q) = lg(R)}

be a left context-sensitive language. Then also²

L ∩ {#PQ mi(P) | P ∈ V1+, Q ∈ V2+}

is a left context-sensitive language.

² mi(P) means the mirror image of P.



Proof. Firstly, L is transformed to

L′ = {#′a1′ ··· an′ f1′ ··· fn′ b1′ ··· bn′ | a_i ∈ V1, f_i ∈ V2, b_i ∈ V1, #a1 ··· bn ∈ L}

by the transformation

# → #′, #′a → #′a′, a′b → a′b′, a′e → a′e′,

e′f → e′f′, e′a → e′a′, f′b → f′b′,

where a and b range over V1 and e and f range over V2. Secondly, L′ is transformed to the intersection by the following transformation T. In the productions a and b range over V1, c and d range over V1 ∪ V2 ∪ {ā | a ∈ V1}, and e and f range over V2.

(1) (a) #′a′ → #′ā,
    (b) ab′ → ab̄,
    (c) āc′ → āc_a,
    (d) c_a d′ → c_a d_a,
    (e) c_a a′ → c_a a,

(2) (a) ā → a,
    (b) ac_a → ac′,
    (c) c′c_a → c′c′,

(3) (a) #′ → #,
    (b) ae′ → ae,
    (c) ef′ → ef.

Every word #a1 ··· an f1 ··· fn an ··· a1 in the intersection can be derived from #′a1′ ··· an′ f1′ ··· fn′ an′ ··· a1′ in L′ by the productions of T. Indeed,

(1a)(1c)(1d)^(n+2(n−1)−1)(1e)(2a)(2b)(2c)^(n+2(n−1)−1)

· ∏_{i=2}^{n} ((1b)(1c)(1d)^(n+2(n−i)−1)(1e)(2a)(2b)(2c)^(n+2(n−i)−1))

· (3a)(3b)(3c)^(n−1)

gives one such derivation. Thus we have proved

(4) L ∩ {#PQ mi(P) | P ∈ V1+, Q ∈ V2+} ⊆ T(L′).


To complete the proof, we have to show that

(5) T(L′) ⊆ L ∩ {#PQ mi(P) | P ∈ V1+, Q ∈ V2+}.

In all terminating words derivable from L′, all terminals are either in the beginning or in the end, and all nonterminals (with the exception of #′) are in the middle. Each word contains at most one ā, which is the first letter after the first terminal block. We need some auxiliary concepts. A nonterminal c_a is called active iff it is immediately followed by a nonterminal of the type d′ or d_b. To every word W derivable from L′ we associate the following words:

B(W) is the terminal prefix P′ of the P-block, except in the case that P′ is followed by āc_a. Then B(W) = P′a.

E(W) is the terminal suffix of the R-block.

M(W) = a1 ··· am, where c_{a1}, ..., c_{am} are the active letters appearing in this order.

We show that in every word W of any terminating derivation,

(6) mi(B(W)) = M(W) E(W).

Since the R-block can be terminated only by (1e), there are lg(R) applications of (1e). By each application of (1e), the number of active letters decreases. The number of active letters increases only by (1c), which can and must be applied exactly lg(P) = lg(R) times. Thus by each application of (1c), M(W) increases. These are the only productions changing M(W). But these are also the only productions changing B(W) or E(W). Indeed, all productions that might change B(W) or E(W) are (1a), (1b), (1c), (1e) and (2a). Since (1a) is applied only in the beginning of the derivation, when there are no letters c_a, B(W) cannot change by it. If B(W) increased by (1b), then ab̄ would be followed by some c_b; for this b̄ there would be no application of (1c), contradicting the existence of lg(P) applications of (1c). If B(W) increased by (2a), then ā would be followed by some c′, and after the preceding application of (1a) or (1b) there was no application of (1c): the same contradiction.

Thus (1c) and (1e) are the only productions changing B(W), M(W) or E(W). Trivially (6) holds for the words of L′. By each application of (1c), the same a is joined both to the end of B(W) and to the beginning of M(W). An application of (1e) erases some a from the end of M(W) and joins the same a to the beginning of E(W). Thus we have proved that (6) holds in every terminating derivation.
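The bookkeeping behind (6) can be replayed schematically (a sketch of the invariant only, not of the transformation T; in a real derivation the two phases are interleaved):

```python
def transport(P):
    # B grows by (1c), M records the indices in transit (newest first),
    # E grows by (1e); mi(B) = M E holds after every step.
    B, M, E = "", "", ""
    for a in P:                      # the lg(P) applications of (1c)
        B, M = B + a, a + M
        assert B[::-1] == M + E
    while M:                         # the lg(R) applications of (1e)
        M, E = M[:-1], M[-1] + E
        assert B[::-1] == M + E
    return E                         # the only suffix that can terminate

assert transport("abc") == "cba"     # E ends up as mi(P)
```

Once no active letters remain, M is empty and (6) forces R = mi(P).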

Consider any terminal word #PQR derived from L′. Since (6) holds and there are no active letters, we have R = mi(P). Clearly #PQR ∈ L, for the productions operate only on the primes and indices. Thus (5) holds. Lemma 6 now follows from Lemma 4.

In the following lemma we need the concept of a k-ary number. Let {x0, ..., x_{k−1}} (k ≥ 2) be an alphabet. We define functions g and h by

g(x_{i1} ··· x_{im}) = i1 k^(m−1) + ··· + im,

h(x_{i1} ··· x_{im}) = i1 + ··· + im k^(m−1).

Later we shall have many alphabets whose letters are supplied with primes or double primes. For g and h only the subindices are significant.
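In Python, with a k-ary digit string given as a list of subindices (an illustrative sketch):

```python
def g(digits, k):
    # most significant digit first: g(x_{i1}...x_{im}) = i1*k^(m-1) + ... + im
    n = 0
    for d in digits:
        n = n * k + d
    return n

def h(digits, k):
    # least significant digit first; equivalently h(P) = g(mi(P))
    return g(list(reversed(digits)), k)

assert g([1, 0, 1], 2) == 5
assert h([1, 0, 1], 2) == 5          # palindromic digit strings agree
assert h([1, 2], 3) == 1 + 2 * 3     # = g(mi([1, 2]), 3)
```

Note that h reads a number as the mirror image of the way g reads it; this is what makes the endmarker blocks of #PQ mi(P) carry equal values.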

We now recall the transformation of Lemma 3.

LEMMA 7. Let L ⊆ V_T V_T+ be a context-sensitive language and T the transformation satisfying the conditions of Lemma 3. Let {e0, ..., e_{k−1}} be an alphabet, where k is the constant given in Lemma 3. Then

L1 = {#PQ mi(P) | Q ∈ L, P ∈ {e0, ..., e_{k−1}}^lg(Q), S I^(lg(Q)−1) ⇒^(g(P))_T Q}

is a left context-sensitive language.

Proof. By Lemma 5, {#a^n b^n c^n | n ≥ 2} is a left context-sensitive language. Let c_i, c_i′, c_i″, d_i, d_i′, d_i″ (i = 0, ..., k−1) be new letters. When we translate {#a^n b^n c^n | n ≥ 2} by the left context-sensitive transformation

# → #′, #′a → #′c0, c0 a → c0 c0,

c0 b → c0 S, Sb → SI, Ib → II,

Ic → Id0, d0 c → d0 d0,

we get a left context-sensitive language

L0 = {#′c0^n S I^(n−1) d0^n | n ≥ 2}.

We shall construct a left context-sensitive transformation T1 such that

L1 = T1(L0) ∩ {#PQ mi(P) | P ∈ {e0, ..., e_{k−1}}+, Q ∈ V_T+}.

Then L1 will be left context-sensitive by Lemmas 4 and 6. Let V̄_T = {ā | a ∈ V_T} be a set of new terminals and V̄ = V_N ∪ V̄_T. When we replace every a in T by ā, we get a new transformation T̄ = (V_N, V̄_T, F̄). We label the productions in F̄ by the numbers n = 1, ..., N:

n: AB → CD (n = 1, ..., N).

For each n, let E_n and F_n be new nonterminals, and for each X in V̄, let X′ and X″ be new nonterminals. We define T1 by the following list of productions, where i and j range over {0, ..., k−1}, X and Y range over V̄, and n ranges over {1, ..., N}.

(1) (a) #′c_i → #′c_i′,
    (b) #′c_i → #′c_{i+1}″, (i < k−1)
    (c) c_i′c_j → c_i′c_j′,
    (d) c_i′c_j → c_i′c_{j+1}″, (j < k−1)
    (e) c_i″c_{k−1} → c_i″c_0″,

(2) (a) c_i″X → c_i″X′,
    (b) c_i″A → c_i″E_n, (n: AB → CD ∈ F̄)
    (c) X′Y → X′Y′,
    (d) X′A → X′E_n, (n: AB → CD ∈ F̄)
    (e) E_n B → E_n F_n, (n: AB → CD ∈ F̄)
    (f) F_n X → F_n X″,
    (g) X″Y → X″Y″,

(3) (a) F_n d_j → F_n d_{j+1}″, (j < k−1)
    (b) X″d_j → X″d_{j+1}″, (j < k−1)
    (c) F_n d_{k−1} → F_n d_0′,
    (d) X″d_{k−1} → X″d_0′,
    (e) d_0′d_j → d_0′d_{j+1}″, (j < k−1)
    (f) d_0′d_{k−1} → d_0′d_0′,

(4) (a) #′c_i′ → #′c_i,
    (b) #′c_i″ → #′c_i,
    (c) c_i c_j′ → c_i c_j,
    (d) c_i c_j″ → c_i c_j,

(5) (a) c_i X′ → c_i X,
    (b) c_i E_n → c_i C, (n: AB → CD ∈ F̄)
    (c) XY′ → XY,
    (d) XE_n → XC, (n: AB → CD ∈ F̄)
    (e) CF_n → CD, (n: AB → CD ∈ F̄)
    (f) XY″ → XY,


(6) (a) Xd_i′ → Xd_i,
    (b) Xd_i″ → Xd_i,
    (c) d_i d_j′ → d_i d_j,
    (d) d_i d_j″ → d_i d_j,

(7) (a) #′ → #,
    (b) #c_i → #e_i,
    (c) e_i c_j → e_i e_j,
    (d) e_i ā → e_i a,
    (e) ab̄ → ab,
    (f) ad_i → ae_i,
    (g) e_i d_j → e_i e_j.

We first show that

(8) L1 ⊆ T1(L0) ∩ {#PQ mi(P) | P ∈ {e0, ..., e_{k−1}}+, Q ∈ V_T+}.

We describe how a production n: AB → CD can be simulated by T1. Each word contains three blocks: the c-block, the V̄-block and the d-block. The first and the third block are interpreted as k-ary numbers by g and h. The simulation of T̄ takes place in the second block.

A correct simulation cycle is begun by adding one to the first block. This is done by (1a)(1c)*(1d) or by (1b) alone. Then the derivation goes on by (1e) to the second block. The simulation in the second block begins either with (2a)(2c)*(2d) or with (2b) alone. Then (2e) is applied. Herewith we have tested the applicability of AB → CD. The simulation goes on with (2f)(2g)* to the end of the second block, except in the case that F_n already is at the end. Then one is added to the third block. If F_n ends the second block, the simulation is continued by (3a) or (3c)(3f)*(3e); otherwise it is continued by (3b) or (3d)(3f)*(3e). Now we have done the essential part of a simulation cycle and we can return to the basic state. The primes and the double primes in the first block are erased by (4). The possible primes in the beginning of the second block are erased by (5a)(5c)*. Now (5b) or (5d) is applied to E_n, and (5e) to F_n. Then the double primes are erased by (5f)*. In the third block, the primes and the double primes are erased by the productions (6). This completes the simulation cycle of n: AB → CD.
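The arithmetic performed on the third block in one cycle is an ordinary left-to-right carry, because h reads the least significant digit first. A sketch of the step that the productions (3a)-(3f) implement:

```python
def add_one_lsd(digits, k):
    # digits: least significant first, as the d-block is read by h
    out = list(digits)
    for i, d in enumerate(out):
        if d < k - 1:
            out[i] = d + 1       # increment; the carry chain ends here
            return out
        out[i] = 0               # k-1 becomes 0 and the carry moves right
    raise ValueError("every digit was k-1: the derivation blocks")

assert add_one_lsd([2, 2, 0], 3) == [0, 0, 1]   # 8 + 1 = 9
assert add_one_lsd([0, 1], 2) == [1, 1]         # 2 + 1 = 3
```

The first block, read most significant digit first by g, needs the nondeterministic guess of the increment position instead, which is exactly what the two alternatives in (1) provide.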

During the cycle, one has been added both to the first and to the third block, while in the second block AB has been replaced by CD. When we want to derive #PQ mi(P), we begin from #′c0^lg(Q) S I^(lg(Q)−1) d0^lg(Q). We know that Q can be derived from S I^(lg(Q)−1) in g(P) steps. We simulate this derivation in the way described. After g(P) cycles we have #′P_c Q̄ mi(P_d), where Q̄ differs from Q in having bars on its letters, and P_c and P_d differ from P in having c_i and d_i instead of e_i. After the application of the productions (7), we have #PQ mi(P). Thus (8) holds.

The inclusion

(9) T1(L0) ∩ {#PQ mi(P) | P ∈ {e0, ..., e_{k−1}}+, Q ∈ V_T+} ⊆ L1

is more difficult to prove. As usual, the problem lies in unintended derivations. Without loss of generality we may assume that termination by (7) takes place at the end of the derivation. Furthermore, it suffices to consider only those derivations that terminate in words with the property P = mi(R). However, in intermediate words, g(P) may be greater than h(R). As a matter of fact, when the application of (1) is unfinished, g(P) may be even greater than in the terminal word. When the application of (3) is unfinished, h(R) is too small. At these steps the numbers may not be read directly from P and R; we must take the primes and the indices into consideration. Therefore we need some auxiliary concepts.

To every word W = #′PQR derivable from L0, we associate the following numbers:

k1(W) = k^(i1) + ··· + k^(im), if P = P0 c_{j1}″c_{k−1} P1 ··· c_{jm}″c_{k−1} Pm,

where i_v = 1 + lg(P_v ··· c_{jm}″c_{k−1} P_m) (v = 1, ..., m) and these m appearances of c_j″c_{k−1} are the only ones. (Thus k1(W) expresses how much g(P) is too great.)

k2(W) is the number of the subwords c_i″X, X′Y, E_n X, F_n X, F_n d_j, X″Y, X″d_j in W. (Thus k2(W) expresses how many cycles of simulation are going on in the Q-block.)

k3(W) = k^(i1) + ··· + k^(im), if R = R0 d_0′d_{j1} R1 ··· d_0′d_{jm} Rm,

where i_v = 1 + lg(R0 d_0′d_{j1} ··· R_{v−1}) (v = 1, ..., m) and these m are all the appearances of d_0′d_j. (Thus k3(W) expresses how much h(R) is too small.)

Now we shall prove the following:

Assertion 1. If #′c0^lg(Q) S I^(lg(Q)−1) d0^lg(Q) ⇒^t #′PQR, where g(P) = h(R), then for each word W_i = #′P_i Q_i R_i (i = 0, ..., t) of the derivation

(10) g(P_i) = k1(W_i) + k2(W_i) + k3(W_i) + h(R_i) (i = 0, ..., t).

In the case i = 0, (10) holds trivially, because all terms are equal to zero.


It suffices to prove that none of the productions decreases the difference g(P_i) − k1(W_i) − k2(W_i) − k3(W_i) − h(R_i). We consider the productions that might decrease this difference.

By (1b), some power of k is added to g(P_i), while the same power of k or nothing is added to k1(W_i).

By (1d), some power of k, say k^(r−1), is added to g(P_i). In the case r > 1, k1(W_i) increases by k^(r−1) or remains unaltered according to whether the next letter is c_{k−1}. In the case r = 1, k2(W_i) increases by 1 or remains unaltered according to whether the first letter of Q is unprimed or not.

By (1e), (k−1)k^(r−1) is subtracted from g(P_i). If r > 1, then k^r − k^(r−1) or k^r is subtracted from k1(W_i) depending on whether the next letter is c_{k−1} or not. In the case r = 1, k1(W_i) decreases by k^r and k2(W_i) increases by 1, if the first letter of Q is unprimed; otherwise k1(W_i) decreases by k^r and k2(W_i) remains unaltered.

By the productions (2), k2(W_i) possibly decreases, while the other terms remain unaltered.

By (3a) and (3b), 1 is subtracted from k2(W_i) and added to h(R_i).

By (3c) and (3d), 1 is subtracted from k2(W_i) and k−1 from h(R_i). Simultaneously k is added to k3(W_i), if the second letter of R_i is unprimed; otherwise k3(W_i) remains unaltered.

By (3e), the same power of k is subtracted from k3(W_i) and added to h(R_i).

By (3f), k^r − k^(r−1) is added to k3(W_i), if the next letter is unprimed; otherwise k^(r−1) is subtracted from k3(W_i). Simultaneously (k−1)k^(r−1) is subtracted from h(R_i).

The productions (4), (5) and (6) leave g(P_i) and h(R_i) unaltered, while k1(W_i), k2(W_i) or k3(W_i) possibly decrease.

We have seen that none of the productions decreases the difference g(P_i) − k1(W_i) − k2(W_i) − k3(W_i) − h(R_i). Hence the sequence of the differences is monotonous. Now (10) follows from the fact that the first and the last members of the sequence are equal to zero.

The following assertion completes the proof of Lemma 7.

Assertion 2. If #′c0^lg(Q) S I^(lg(Q)−1) d0^lg(Q) ⇒*_{T1} #′PQR, where Q ∈ V̄+ and g(P) = h(R), then S I^(lg(Q)−1) ⇒*_{T̄} Q.

The proof is by induction on the number of applications of (1b) and (1d). If there are no applications of these productions, then none of the productions (2) is applicable and there are no changes in the second block. In this case the assertion holds trivially.

Consider an arbitrary derivation which contains at least one application of (1b) or (1d). Consider the last application of (1b) or (1d) that creates a c_j″ such that no c_i′ is to the left of this c_j″. Let this be the i1-th step. By Assertion 1, W_{i1} is of the form #′P̃ c_j″ c_{k−1} P̂ Q R or #′P̃ c_j″ X Q̂ R. Again by Assertion 1, when the c_j″ in question is replaced, its right neighbor cannot be c_i or X. Thus there is an application of (1e), (2a) or (2b) to this c_j″, say at the i2-th step. The appearing letter will be c_0″, X′ or E_n. In the two former cases we continue in the same way until we come to the situation where E_n is the appearing letter. Let this be the i_p-th step. Thus W_{i_p} is of the form #′P_{i_p} E_n Q̂ R_{i_p} or #′P_{i_p} Q̃ X′E_n Q̂ R_{i_p}, where E_n is the descendant of the letter A in the production n: AB → CD of T̄. Again by Assertion 1, Q̂ begins with a letter Y, and this Y must be replaced before the replacement of E_n. Thus Y = B, and the applied production is (2e), say at the i_{p+1}-th step. Now we proceed in the same way until d_j″ appears. Let this be the i_q-th step. Let j_1, ..., j_q be the steps at which (4), (5) and (6) are applied to the letters created at the steps i_1, ..., i_q. By the choice of i_1, Q = Q_t is of the form Q̃CDQ̂, where C and D are descendants of A and B.

Consider now the derivation that we get by omitting the steps i_1, ..., i_q, j_1, ..., j_q of the original derivation. Let #′P′Q′R′ be the last word of this derivation. Now g(P′) = g(P) − 1 = h(R) − 1 = h(R′), Q′ = Q̃ABQ̂, and the number of applications of (1b) or (1d) has decreased by one. By the induction hypothesis and the application of AB → CD we get

S I^(lg(Q)−1) ⇒*_{T̄} Q̃ABQ̂ ⇒_{T̄} Q̃CDQ̂ = Q.

This completes the proof of Assertion 2 and of the inclusion (9).

With Lemma 7 we have almost reached our goal. Linear erasing is needed to erase the number blocks from L1. By Lemma 2, the family of left context-sensitive languages is closed under 4-restricted homomorphisms. Therefore it suffices to mix the blocks uniformly. This is done in the remaining lemmas.

LEMMA 8. If L ⊆ V_T V_T+ is a context-sensitive language, then L′ = {#e^(2 lg(Q)) Q | Q ∈ L} is a left context-sensitive language.

Proof. By Lemma 7,

L1 = {#PQ mi(P) | Q ∈ L, P ∈ {e0, ..., e_{k−1}}^lg(Q), S I^(lg(Q)−1) ⇒^(g(P))_T Q}

is a left context-sensitive language. Note that by Lemma 3, every Q ∈ L has a T-derivation of length g(P) for some P ∈ {e0, ..., e_{k−1}}^lg(Q). Therefore, by applying the transformation

# → #′, #′e_i → #′e′, e′e_j → e′e′,

e′a → e′a′, a′b → a′b′, a′e_i → a′e″, e″e_j → e″e″,

where a and b range over V_T and e_i, e_j range over {e0, ..., e_{k−1}}, we see that

L2 = {#′e′^n a1′ ··· an′ e″^n | a1 ··· an ∈ L}

is a left context-sensitive language.

is a left context-sensitive language. By the following list of productions, we define a left context-sensitive

transformation T such that L ' = T(L2) . In the productions a and b range over V r , and x and y range over V T' u V~..

(1) (a) e′a′ → e′ā,
    (b) āx → āx_a,
    (c) x_a y → x_a y_a,
    (d) x_a e″ → x_a ā,

(2) (a) ā → e′,
    (b) e′x_a → e′x,
    (c) xy_a → xy,
    (d) xā → xa″,

(3) (a) #′ → #,
    (b) #e′ → #e,
    (c) ee′ → ee,
    (d) ea″ → ea,
    (e) ab″ → ab.

The words of L2 are composed of three blocks. The transformation shifts the letters of the second block one by one into the third block. The equal lengths of the blocks guarantee that all incorrect derivations fail.

The word #e^(2n) a1 ··· an of L′ can be derived from #′e′^n a1′ ··· an′ e″^n by the control word

((1a)(1b)(1c)^(n−1)(1d)(2a)(2b)(2c)^(n−1)(2d))^n

· (3a)(3b)(3c)^(2n−1)(3d)(3e)^(n−1).


This proves

L′ ⊆ T(L2).

The proof of the converse inclusion

T(L2) ⊆ L′

is somewhat similar to the corresponding proof in Lemma 6. Without loss of generality, we may assume that termination by (3) takes place at the end of the derivation. We call a letter x_a active iff it is immediately followed by some y or e″. To each word W = #′PQR derivable from L2 we associate the following words over V_T.

B(W) is obtained from the Q-block by erasing all primes, indices, bars and letters e′, except in the case that āx_a is a subword; then also this a is erased.

M(W) consists of the indices of the active letters.

E(W) is obtained from the R-block by erasing all primes, indices, bars and letters e″.

We show that

(4) E(W) mi(M(W)) B(W)

stays constant during the whole derivation. Consider any terminating derivation that begins from #′e′^n a1′ ··· an′ e″^n. One can immediately see that it terminates in a word #e^(2n) R, where R ∈ V_T+ and lg(R) = n. There are n applications of (1d). By each application of (1d) the number of active letters decreases. The only production by which the number of active letters increases is (1b). Thus there are n applications of (1b), and by each of them the number of active letters increases. On the other hand, there are no further applications of (1b), since there are only n letters a′. No other productions change the number of active letters. Neither (1b) nor (1d) changes the length of (4). As the other productions cannot increase the lengths of B(W) or E(W), we conclude that the length sequence of (4) is monotonous. But it is constant, since the lengths of (4) for the axiom and the terminal word are equal. Thus the length of (4) is changed by none of the productions. Therefore (4) cannot change at all by (1a), (1c), (2a), (2b) or (2c). By (1b), one a is erased from the beginning of B(W) and joined to the beginning of M(W). By (1d), one a is erased from the end of M(W) and joined to the end of E(W). By these productions (4) does not change. The remaining productions cannot change B(W), M(W) or E(W). Therefore (4) stays constant during the derivation, and the terminal word is #e^(2n) a1 ··· an. This completes the proof of the second inclusion.

Page 18: One-sided and two-sided context in formal grammars

388 MARTTI PENTTONEN

LEMMA 9. If L ⊆ V_T V_T^+ is a context-sensitive language, then L″ = {#a_1ee ⋯ a_nee | a_1 ⋯ a_n ∈ L} is a left context-sensitive language.

Proof. By Lemma 8, L' = {#'e'^{2n} a_1' ⋯ a_n' | a_1 ⋯ a_n ∈ L} is a left context-sensitive language. We construct a left context-sensitive transformation T such that L″ = T(L'). Let T contain the following productions:

(1) (a) #' → #,

    (b) #e' → #a″,

    (c) aeee' → aeeb″,

    (d) a″e' → a″e,

    (e) a″ee' → a″ee,

    (f) a″ea' → a″ee,

(2) (a) a″eee' → a″eee^a,

    (b) e^a e' → e^a e^a,

    (c) e^a a' → e^a e″,

(3) (a) a″ → a,

    (b) aeee^a → aeee',

    (c) e'e^a → e'e',

    (d) e'e″ → e'e'.

The significant part of a word in L' is the R-block. We want to spread this part uniformly over the whole word. This is done by guessing letters by (1) and then testing the guesses by (2). The success in spreading is based on the fact that the lengths of the blocks are known.

Again, we prove

L″ ⊆ T(L')

by giving control words corresponding to words in L″. The word #a_1ee ⋯ a_nee

can be derived from #'e'^{2n} a_1' ⋯ a_n' by the control word

(1a)(1b)_{a_1}(1d)(1e)(2a)(2b)^{2n-4}(2c)(3a)(3b)(3c)^{2n-4}(3d)

· ∏_{i=2}^{n-1} ((1c)_{a_i}(1d)(1e)(2a)(2b)^{2n-2(i+1)}(2c)(3a)(3b)(3c)^{2n-2(i+1)}(3d))

· (1c)_{a_n}(1d)(1f)(3a),

where the subindices express the choice of the production.


We prove the inclusion

T(L') ⊆ L″

by the same method as in the preceding lemmas. A letter e^a is active iff it is immediately followed by e' or a'. To every word W derivable from L' we associate the following words:

B(W) is composed of the terminals a ∈ V_T appearing in W and, in the case when a″eee^a is a subword or a″ee is a suffix, also of the letter a in question.

M(W) is composed of the inverses ā of the indices of the active letters e^a. E(W) is the maximal word a_{j_1} ⋯ a_{j_k} such that a_{j_1}' ⋯ a_{j_k}' is a suffix of W. Informally, B(W) contains the guesses, M(W) is the memory for the testing of the guesses, and E(W) contains the original letters for which no tests have been performed.

Multiplication between the letters a and ā is defined by aā = āa = Λ. We show that for any terminating derivation

(4) B(W) M(W) E(W)

stays constant. Consider an arbitrary terminating derivation beginning from #'e'^{2n} a_1' ⋯ a_n'. There are n applications of (2c), by each of which the number of active letters decreases. There must be n applications of (2a), by which the number of active letters increases. These are all and the only productions by which the number of active letters changes. By (1b), (1c), (1d) or (1e), B(W) cannot change, since e' is never followed by e or e^a. The production (1f) is applicable only to words W for which M(W) = Λ. Then (4) does not change, since the same a is joined to B(W) and erased from E(W). By the application of (2a), a is joined to the end of B(W) and ā to the beginning of M(W). This does not change (4). The application of (2b) has no effect on B(W), M(W) or E(W). By (2c), ā is erased from M(W) and a from E(W). When (3a) is applied, there is an eee^a immediately following a″; otherwise there would be no application of (2a) for this a, which is a contradiction. Thus B(W) cannot change by (3a). The remaining productions have no effect on B(W), M(W) or E(W). Thus (4) remains constant after all applications, and the terminal word is #a_1ee ⋯ a_nee. This completes the proof of the second inclusion.
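The cancellation aā = āa = Λ and the resulting invariance of B(W) M(W) E(W) can be illustrated with a small sketch. The snapshot below is a hypothetical configuration chosen for illustration, not one taken from an actual derivation of the transformation T:

```python
def cancel(word):
    """Reduce a word under the rule a*abar = abar*a = empty word.
    A letter is a pair (symbol, barred); barred letters model a-bar."""
    stack = []
    for sym, barred in word:
        if stack and stack[-1] == (sym, not barred):
            stack.pop()  # adjacent letter and its inverse annihilate
        else:
            stack.append((sym, barred))
    return stack

def a(sym):
    return (sym, False)

def abar(sym):
    return (sym, True)

# Hypothetical snapshot of the three associated words.
B = [a("a1")]                 # confirmed guesses
M = [abar("a2")]              # memory: inverses of the active indices
E = [a("a2"), a("a3")]        # original letters not yet tested

before = cancel(B + M + E)

# Effect of production (2a): the guessed letter a2 is joined to the
# end of B(W) and its inverse to the front of M(W).
B2 = B + [a("a2")]
M2 = [abar("a2")] + M

after = cancel(B2 + M2 + E)
assert before == after  # the product B(W) M(W) E(W) is unchanged
```

The new pair a·ā sits exactly at the seam between B(W) and M(W), so it cancels and the reduced product is untouched, which is the mechanism behind the invariance argument above.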

4. RESULTS

By means of the preceding lemmas, we can now get some characterizations for context-sensitive and recursively enumerable languages.


THEOREM 1. Every context-sensitive language is left context-sensitive.

Proof. The theorem follows by Lemmas 9 and 2 because the short words of the language can be derived directly from the start symbol.

THEOREM 2. Every context-sensitive language can be generated by a grammar whose productions are of the form

A → BC, AB → AC, A → a,

where A, B, C are nonterminals and a is a terminal.

Proof. By Theorem 1 and Lemma 1, we have a normal form

(1) A → B, A → BC, AB → AC, A → a.

As in the proof of Theorem 1, we can assume that L ⊆ V_T V_T^+. Thus L can be represented in the form

L = ⋃_{a ∈ V_T} a L_a,

where each L_a is a context-sensitive language. By Theorem 1, each L_a is generated by a grammar G_a = (V_{N,a}, V_{T,a}, S_a, F_a) in the form (1). We assume that the nonterminal alphabets are pairwise disjoint. Let S be the new start symbol and, for every a ∈ V_T, let #_a be a new nonterminal. Let G be the grammar with the following productions:

S → #_a S_a for every a ∈ V_T,

P → Q, if P → Q ∈ ⋃_a F_a is not of the type A → B,

XA → XB, if A → B ∈ F_a and X ∈ V_{N,a} ∪ {#_a}, for some a ∈ V_T,

#_a → a for every a ∈ V_T.

It is easy to see that L = L(G). The grammar is of the required form.
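To see the normal form in action, the following sketch checks derivability by brute force for grammars whose productions have the three permitted forms. The grammars P and P2 below are illustrative examples chosen here, not grammars produced by the construction above. Since none of the three production types shortens a word, any sentential form longer than the target may be pruned:

```python
from collections import deque

def derivable(productions, start, target):
    """Breadth-first search over sentential forms.  The forms
    A -> BC, AB -> AC and A -> a never shrink a word, so every
    sentential form longer than the target is discarded."""
    seen = {start}
    queue = deque([start])
    while queue:
        w = queue.popleft()
        if w == target:
            return True
        for lhs, rhs in productions:
            i = w.find(lhs)
            while i != -1:
                nxt = w[:i] + rhs + w[i + len(lhs):]
                if len(nxt) <= len(target) and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                i = w.find(lhs, i + 1)
    return False

# Uppercase letters are nonterminals, lowercase letters terminals.
# P generates {a^n b^n : n >= 1} using only A -> BC and A -> a.
P = [("S", "AB"), ("S", "AT"), ("T", "SB"), ("A", "a"), ("B", "b")]
assert derivable(P, "S", "aabb") and not derivable(P, "S", "aab")

# P2 also exercises one production of the left-context type AB -> AC.
P2 = [("S", "AB"), ("AB", "AC"), ("A", "a"), ("B", "b"), ("C", "c")]
assert derivable(P2, "S", "ab") and derivable(P2, "S", "ac")
```

The pruning bound len(nxt) <= len(target) is the same workspace restriction that characterizes context-sensitive languages via linear bounded automata, which is why the search is guaranteed to terminate.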

THEOREM 3. There is a linear context-free language L_lin such that every context-sensitive language L can be represented in the form L = T(L_lin), where T is a length-preserving left context-sensitive transformation.

Proof. As above, we may assume L ⊆ V_T V_T^+. By the constructions of Lemmas 5-9, L' = {#a_1ee ⋯ a_nee | a_1 ⋯ a_n ∈ L} (#, e ∉ V_T) can be represented in the form L' = T'(L_0), where L_0 = {#'A^m B^n C^n | m ≥ 2, n ≥ 2}


and T' = (V_N', V_T ∪ {#, e}, F') is a length-preserving left context-sensitive transformation such that the productions in F' are of the form PA → PB, where lg(P) ≤ 3.

Recall the construction in the proof of Lemma 2. Denote

V = V_N' ∪ V_T ∪ {e} and L_2 = #V^3 ∪ V^3.

For every P ∈ L_2, let (P) be a new letter. Define a homomorphism h by h((P)) = P. Clearly L_lin = h^{-1}(L_0) is a linear language. Let T be the transformation containing the productions

(P) → (Q) for all P, Q ∈ L_2 such that P ⇒_{T'} Q,

(P)(Q) → (P)(R) for all P, Q, R ∈ L_2 such that PQ ⇒_{T'} PR,

(#aee) → a for every a ∈ V_T,

(aee) → a for every a ∈ V_T.

Then L = T(L_lin) is the required representation.
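The block encoding behind L_lin can be sketched as follows. The helper names encode and h are hypothetical, and the block lengths are the ones read off from the productions (#aee) → a and (aee) → a above, so each word of L' splits into one 4-symbol block carrying the marker # followed by 3-symbol blocks:

```python
def encode(word):
    """Cut a word #a1ee a2ee ... anee into the blocks that serve as the
    new letters (P): one block of the form #aee, then blocks aee."""
    head, rest = word[:4], word[4:]
    assert head.startswith("#") and len(rest) % 3 == 0
    return [head] + [rest[i:i + 3] for i in range(0, len(rest), 3)]

def h(block_letters):
    """The homomorphism h maps each block letter (P) back to the word P,
    so decoding a block word is plain concatenation."""
    return "".join(block_letters)

w = "#aee" + "bee" + "cee"   # corresponds to the terminal word abc
assert encode(w) == ["#aee", "bee", "cee"]
assert h(encode(w)) == w     # h undoes the block encoding
```

Since every context PA → PB in F' has lg(P) ≤ 3, such a context always fits inside the block containing A together with the block immediately to its left, which is why productions over at most two block letters (P)(Q) → (P)(R) suffice.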

Remark. In the above construction, T has only productions of the types A → B, AB → AC, A → a. Here again, the productions of the type A → B can be eliminated by the method of the proof of Theorem 2.

Remark. Katz (1974) has shown that the linear language L_lin in Theorem 3 can be replaced by a regular language.

From Theorem 2 we get the following normal form for type 0 grammars:

THEOREM 4. Every recursively enumerable language can be generated by a grammar whose productions are of the form

A → BC, AB → AC, A → a, A → λ.

Proof. Let L be a recursively enumerable language and b and c new terminals. By Theorem 9.9 in Salomaa (1973), there is a context-sensitive language L_1 ⊆ b*cL such that for each P in L, L_1 contains at least one word b^i cP. By Theorem 2, L_1 can be generated by a grammar whose productions are of the form A → BC, AB → AC, A → a. Consider now b and c nonterminals and add the productions b → λ and c → λ. Clearly this new grammar generates L.
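On a toy scale, the effect of the added λ-productions can be illustrated with a bounded brute-force search. The grammar below is a contrived example of the construction's shape (B and C stand in for the auxiliary letters b and c, erased at the end), and the explicit bounds are needed because erasing rules destroy the length argument used for Theorem 2: membership for type-0 grammars is only semi-decidable in general.

```python
from collections import deque

def derivable_with_erasing(productions, start, target, max_len, max_steps):
    """Bounded search for grammars that include erasing rules A -> λ
    (an empty right-hand side).  Sentential forms may grow before they
    shrink, so both the form length and the number of explored forms
    are capped."""
    seen = {start}
    queue = deque([start])
    steps = 0
    while queue and steps < max_steps:
        w = queue.popleft()
        steps += 1
        if w == target:
            return True
        for lhs, rhs in productions:
            i = w.find(lhs)
            while i != -1:
                nxt = w[:i] + rhs + w[i + len(lhs):]
                if len(nxt) <= max_len and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                i = w.find(lhs, i + 1)
    return False

# B and C play the roles of the auxiliary nonterminals b and c:
# the derivation first builds BCA, then erases B and C by λ-rules.
P = [("S", "BT"), ("T", "CA"), ("A", "a"), ("B", ""), ("C", "")]

assert derivable_with_erasing(P, "S", "a", max_len=4, max_steps=10_000)
assert not derivable_with_erasing(P, "S", "aa", max_len=4, max_steps=10_000)
```

With the caps in place the search is a decision procedure only for the bounded fragment it explores; a negative answer within the bounds does not settle membership for an arbitrary type-0 grammar.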

RECEIVED: April 11, 1974


REFERENCES

GLADKIJ, A. V. (1973), "Formal Grammars and Languages," Izdatelstvo Nauka, Moscow.

HAINES, L. (1970), Representation theorems for context-sensitive languages, un- published data.

HAVEL, I. (1970), On one-sided context-sensitive grammars, in "Automatentheorie und Formale Sprachen" (J. Dörr and G. Hotz, Eds.), pp. 221-225, Bibliographisches Institut, Mannheim.

KATZ, B. E. (1974), Synchronized left context-sensitive transformations of regular languages, Nauch. Tekh. Inform. Ser. 2 1974 (4), 36.

KURODA, S.-Y. (1964), Classes of languages and linear bounded automata, Inform. Contr. 7, 131.

PENTTONEN, M. (1972), A normal form for context-sensitive grammars, Ann. Univ. Turku. Ser. AI 156, 1.

RÉVÉSZ, G. (1974), Comment on the paper "Error detection in formal languages," J. Comput. System Sci. 8, 238.

SALOMAA, A. (1969), "Theory of Automata," Pergamon, Oxford.

SALOMAA, A. (1973), "Formal Languages," Academic Press, New York.

SAMOILENKO, L. G. (1969), On a class of context-sensitive grammars, Kibernetika 1969 (2), 94.

SAMOILENKO, L. G. (1971), On a method of constructing left context-sensitive grammars and languages, Kibernetika 1971 (6), 8.

SAMOILENKO, L. G. (1973), On a method of construction and properties of context-sensitive grammars and languages, in "Mathematical Foundations of Computer Science 1973" (I. M. Havel, Ed.), pp. 307-311, Computing Res. Centre, Bratislava.