
Tree Automata

Page 1: Tree Automata

Tree Automata

First: A reminder on Automata on words

Typing semistructured data

Page 2: Tree Automata

Finite state automata on words

$(\Sigma, Q, q_0, F, \delta)$

• Alphabet: $\Sigma$
• States: $Q$
• Initial state: $q_0 \in Q$
• Accepting states: $F \subseteq Q$
• Transitions: $\delta : \Sigma \times Q \to P(Q)$
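A minimal sketch of how such an automaton can be run, assuming no ε-transitions and a transition table stored as a dictionary. The function name `accepts` and the example automaton (words over {a, b} ending in "ab") are illustrative assumptions, not taken from the slides.

```python
from typing import Dict, Set, Tuple

# delta maps (letter, state) to the set of possible next states.
Delta = Dict[Tuple[str, str], Set[str]]

def accepts(delta: Delta, q0: str, final: Set[str], word: str) -> bool:
    """Return True if some run of the automaton on `word` ends in a final state."""
    current = {q0}  # every state reachable after the letters read so far
    for letter in word:
        current = {r for q in current for r in delta.get((letter, q), set())}
    return bool(current & final)

# Hypothetical example automaton: accepts the words that end with "ab".
DELTA: Delta = {
    ("a", "q0"): {"q0", "q1"},  # nondeterministic choice on a
    ("b", "q0"): {"q0"},
    ("b", "q1"): {"q2"},
}

print(accepts(DELTA, "q0", {"q2"}, "abab"))  # True
print(accepts(DELTA, "q0", {"q2"}, "aba"))   # False
```

Tracking the whole set of reachable states avoids backtracking over the nondeterministic choices.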

Page 3: Tree Automata

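Nondeterministic automaton: Example

$\Sigma = \{a, b\}$
$Q = \{q_0, q_1, q_2, q_3\}$
$F = \{q_2\}$

Transitions:
$a, q_0 \to q_0, q_1$
$b, q_1 \to q_0$
Three more transitions, $q_1 \to q_2$, $q_2 \to q_3$ and $q_3 \to q_3$ [input symbol not recoverable]

[Figure: runs of the automaton, starting in $q_0$, on the words a b a a b and a b a; the first is marked KO, the second is marked OK and reaches $q_2$.]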

Page 4: Tree Automata

Reminder

• Deterministic
  – No $\varepsilon$-transitions
  – No alternative transitions such as $a, q_0 \to q_0, q_1$
• Determinization
  – It is possible to obtain an equivalent deterministic automaton
  – State of the new automaton = set of states of the original one
  – Possible exponential blow-up
• Minimization
• Limitations
  – Cannot recognize $\{a^n b^n \mid n \in \mathbb{N}\}$
  – Context-free languages
• Essential tool, e.g., lexical analysis
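The determinization step can be sketched as the classical subset construction, assuming an automaton without ε-transitions stored as a dictionary from (letter, state) to a set of states. The names `determinize` and `run`, and the example automaton, are hypothetical.

```python
def determinize(alphabet, delta, q0, final):
    """Subset construction: each new state is a frozenset of original states."""
    start = frozenset({q0})
    dfa_delta = {}
    seen, todo = {start}, [start]
    while todo:
        subset = todo.pop()
        for letter in alphabet:
            target = frozenset(
                r for q in subset for r in delta.get((letter, q), ())
            )
            dfa_delta[(letter, subset)] = target
            if target not in seen:  # only reachable subsets are built
                seen.add(target)
                todo.append(target)
    dfa_final = {s for s in seen if s & final}
    return dfa_delta, start, dfa_final

def run(dfa_delta, start, dfa_final, word):
    """Run the deterministic automaton: exactly one state at each step."""
    state = start
    for letter in word:
        state = dfa_delta[(letter, state)]
    return state in dfa_final

# Hypothetical nondeterministic automaton: words over {a, b} ending with "ab".
NFA = {("a", "q0"): {"q0", "q1"}, ("b", "q0"): {"q0"}, ("b", "q1"): {"q2"}}
DFA_DELTA, START, DFA_FINAL = determinize("ab", NFA, "q0", {"q2"})
REACHABLE = {state for (_, state) in DFA_DELTA}
print(len(REACHABLE))
print(run(DFA_DELTA, START, DFA_FINAL, "abab"))
```

Only the reachable subsets are constructed; in the worst case there are still exponentially many of them, which is the blow-up mentioned above.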

Page 5: Tree Automata

Reminder (2)

• $L(A)$ = set of words accepted by automaton $A$
• Regular languages
• Can be described by regular expressions, e.g. a(b+c)*d
• Closed under complement: $\Sigma^* - L(A)$
• Closed under union and intersection: $L(A) \cup L(B)$, $L(A) \cap L(B)$
  – Product automaton with states (s, s') where s is from A and s' is from B
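The product construction can be sketched as follows, assuming two complete deterministic automata given as dictionaries from (letter, state) to a state. The function names and the two example automata (even number of a's; last letter is b) are illustrative assumptions.

```python
def product(alphabet, delta_a, start_a, final_a, delta_b, start_b, final_b):
    """Build the product automaton for the intersection of L(A) and L(B)."""
    start = (start_a, start_b)
    delta, seen, todo = {}, {start}, [start]
    while todo:
        s, t = todo.pop()
        for letter in alphabet:
            # Both automata read the same letter in lockstep.
            nxt = (delta_a[(letter, s)], delta_b[(letter, t)])
            delta[(letter, (s, t))] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    # For union, replace `and` by `or` in the next line.
    final = {(s, t) for (s, t) in seen if s in final_a and t in final_b}
    return delta, start, final

def run(delta, start, final, word):
    state = start
    for letter in word:
        state = delta[(letter, state)]
    return state in final

# Hypothetical automaton A: even number of a's.
A = {("a", "even"): "odd", ("a", "odd"): "even",
     ("b", "even"): "even", ("b", "odd"): "odd"}
# Hypothetical automaton B: the last letter is b.
B = {("a", "no"): "no", ("a", "yes"): "no",
     ("b", "no"): "yes", ("b", "yes"): "yes"}

P_DELTA, P_START, P_FINAL = product("ab", A, "even", {"even"}, B, "no", {"yes"})
print(run(P_DELTA, P_START, P_FINAL, "aab"))  # True
print(run(P_DELTA, P_START, P_FINAL, "ab"))   # False
```

Complement is even simpler for a complete deterministic automaton: keep the transitions and swap accepting and non-accepting states.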

Page 6: Tree Automata

Automata on words versus trees

[Figure: the word a b b a can be read left to right or right to left: no difference. A tree with nodes labeled a and b can be processed bottom-up or top-down: differences.]

Page 7: Tree Automata

Automata

Automata on ranked trees

Page 8: Tree Automata

Binary tree automata

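$(\Sigma, Q, F, \delta)$

• Parallel evaluation
• For leaves: $\delta : \Sigma \to P(Q)$
• For other nodes: $\delta : \Sigma \times Q \times Q \to P(Q)$

[Figure: a binary tree with nodes labeled a and b, evaluated bottom-up; a node labeled b whose children are in states q and q' moves to state q''.]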

Page 9: Tree Automata

Bottom-up tree automata

• Bottom-up: if a node labeled a has its children in states q, q', then the node moves nondeterministically to state r or r':
  $a, q, q' \to r, r'$

• Accepts if the root is in some state in F

• Not deterministic if there are alternatives or $\varepsilon$-transitions:
  $a, q, q' \to \{r, r'\}$ and $\varepsilon, r \to r'$
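A minimal sketch of nondeterministic bottom-up evaluation on binary trees, assuming no ε-transitions. Trees are written as a leaf label (a string) or a tuple (label, left, right). The automaton below is a hypothetical example that accepts the trees with at least one leaf labeled a by guessing that leaf.

```python
def states(tree, leaf, node):
    """Return the set of states the root of `tree` can reach."""
    if isinstance(tree, str):  # a leaf
        return set(leaf.get(tree, ()))
    label, left, right = tree
    left_states = states(left, leaf, node)
    right_states = states(right, leaf, node)
    return {
        r
        for q in left_states
        for q2 in right_states
        for r in node.get((label, q, q2), ())
    }

def accepts(tree, leaf, node, final):
    """Accept if the root can be in some state of `final`."""
    return bool(states(tree, leaf, node) & final)

# Hypothetical automaton: state y means "the guessed a-leaf is below this node".
LEAF = {"a": {"n", "y"}, "b": {"n"}}
NODE = {}
for label in ("a", "b"):
    NODE[(label, "n", "n")] = {"n"}
    NODE[(label, "y", "n")] = {"y"}
    NODE[(label, "n", "y")] = {"y"}

T1 = ("b", "b", ("a", "a", "b"))
T2 = ("a", "b", "b")
print(accepts(T1, LEAF, NODE, {"y"}))  # True
print(accepts(T2, LEAF, NODE, {"y"}))  # False
```

Computing the set of reachable states at each node is the tree analogue of the subset construction.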

Page 10: Tree Automata

Example: deterministic bottom-up

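$\Sigma = \{0, 1, \wedge, \vee\}$
$Q = \{q_0, q_1\}$
$F = \{q_1\}$

Leaves:
$0 \to q_0$
$1 \to q_1$

Other nodes:
$\wedge, q_1, q_1 \to q_1$
$\wedge, q_0, q_0$; $\wedge, q_1, q_0$; $\wedge, q_0, q_1 \to q_0$
$\vee, q_0, q_0 \to q_0$
$\vee, q_1, q_1$; $\vee, q_1, q_0$; $\vee, q_0, q_1 \to q_1$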

Page 11: Tree Automata

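Boolean circuit evaluation

$\wedge, q_1, q_1 \to q_1$
$\wedge, q_0, q_0 \to q_0$
$\wedge, q_1, q_0 \to q_0$
$\wedge, q_0, q_1 \to q_0$
$\vee, q_0, q_0 \to q_0$
$\vee, q_1, q_1 \to q_1$
$\vee, q_1, q_0 \to q_1$
$\vee, q_0, q_1 \to q_1$
$0 \to q_0$
$1 \to q_1$

[Figure: a Boolean circuit drawn as a binary tree with leaves 0 and 1; the states are computed bottom-up and the root ends in $q_1$: OK.]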
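The Boolean circuit example can be sketched as a deterministic bottom-up evaluation. The encoding is an assumption: a tree is a leaf label "0" or "1", or a tuple (label, left, right) with label "and" or "or" standing for the two gate symbols.

```python
# Leaf transitions: the constant 0 goes to q0, the constant 1 goes to q1.
LEAF = {"0": "q0", "1": "q1"}

# Transitions for inner nodes: (label, left state, right state) -> state.
NODE = {
    ("and", "q1", "q1"): "q1",
    ("and", "q0", "q0"): "q0",
    ("and", "q1", "q0"): "q0",
    ("and", "q0", "q1"): "q0",
    ("or", "q0", "q0"): "q0",
    ("or", "q1", "q1"): "q1",
    ("or", "q1", "q0"): "q1",
    ("or", "q0", "q1"): "q1",
}
FINAL = {"q1"}

def state(tree):
    """Compute the unique state of the root, children first."""
    if isinstance(tree, str):
        return LEAF[tree]
    label, left, right = tree
    return NODE[(label, state(left), state(right))]

def accepts(tree):
    return state(tree) in FINAL

CIRCUIT = ("or", ("and", "1", "0"), "1")
print(state(CIRCUIT))                         # q1
print(accepts(("and", "0", ("or", "1", "1"))))  # False
```

The automaton accepts exactly the circuits that evaluate to true; the two subtrees of a node can be evaluated independently, which is the parallel evaluation mentioned earlier.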

Page 12: Tree Automata

Regular tree language = set of trees accepted by a bottom-up tree automaton

Page 13: Tree Automata

Regular tree languages

Theorem: the following are equivalent
– L is a regular tree language
– L is accepted by a nondeterministic bottom-up automaton
– L is accepted by a deterministic bottom-up automaton
– L is accepted by a nondeterministic top-down automaton

Deterministic top-down is weaker

Page 14: Tree Automata

Top-down tree automata

• Top-down: if a node labeled a is in state q'', then its left child moves to state q and its right child to state q':
  $a, q'' \to q, q'$

• Accepts if all leaves are in states in F
• Not deterministic if
  $a, q'' \to (q, q'), (r, r')$

Page 15: Tree Automata

Why is deterministic top-down weaker?

• Consider the language
  – L = { <r><a/><b/></r>, <r><b/><a/></r> }
• It can be accepted by a bottom-up TA
  – Exercise: write a BUTA A such that L = L(A)
• Suppose that B is a deterministic top-down TA that accepts both trees in L
  – Exercise: show that B also accepts <r><a/><a/></r>
  – A contradiction

Fact: No deterministic top-down tree automaton accepts exactly L
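One possible way to check both exercises mechanically is sketched below. It rests on an assumption about the top-down model: a deterministic top-down automaton sends one fixed state to the left child of the root and one fixed state to the right child, so all that matters is which leaf labels are accepted on each side. All names are hypothetical.

```python
from itertools import product

# Trees are (label, left, right) with leaf labels "a" and "b".
T_AB, T_BA, T_AA = ("r", "a", "b"), ("r", "b", "a"), ("r", "a", "a")

# A bottom-up automaton for L: leaves go to qa or qb, the root checks the pair.
LEAF = {"a": "qa", "b": "qb"}
NODE = {("r", "qa", "qb"): "ok", ("r", "qb", "qa"): "ok"}

def bu_accepts(tree):
    label, left, right = tree
    return NODE.get((label, LEAF[left], LEAF[right])) == "ok"

def td_accepts(ok_left, ok_right, tree):
    """Deterministic top-down: each side is checked independently of the other."""
    _, left, right = tree
    return left in ok_left and right in ok_right

# Try every choice of accepted leaf labels on the left and on the right.
SUBSETS = [set(), {"a"}, {"b"}, {"a", "b"}]
COUNTEREXAMPLES = [
    (l, r)
    for l, r in product(SUBSETS, SUBSETS)
    if td_accepts(l, r, T_AB) and td_accepts(l, r, T_BA)
    and not td_accepts(l, r, T_AA)
]

print(bu_accepts(T_AB), bu_accepts(T_BA), bu_accepts(T_AA))  # True True False
print(COUNTEREXAMPLES)  # []
```

The empty list reflects the argument: accepting both trees of L forces both labels to be accepted on both sides, so the tree with two a leaves is accepted as well.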

Page 16: Tree Automata

Ranked tree automata: Properties

• Like for words
• Determinization
• Minimization
• Closed under
  – Complement
  – Intersection
  – Union

Page 17: Tree Automata

But…

• XML documents are unranked: book (intro, section*, conclusion)

Page 18: Tree Automata

Automata

Automata on unranked trees

Page 19: Tree Automata

Unranked tree automata

And, t → t; And, tt → t; And, ttt → t; ...
And, f → f; And, tf → f; And, ft → f; ...
Or, t → t; Or, tf → t; Or, ft → t; ...
Or, f → f; Or, ff → f; Or, fff → f; ...

Issue: represent an infinite set of transitions
Solution: a regular language

Page 20: Tree Automata

Unranked tree automata (2)

• Rule: $a, L(Q) \to r_1, \ldots, r_m$
• Meaning: if the states of the children of some node labeled a form a word in L(Q), this node moves to some state in {r1,…,rm}

And, $L(Q) \to t$ where $Q = t^+$
And, $L(Q) \to f$ where $Q = (t+f)^* f (t+f)^*$
Or, $L(Q) \to t$ where $Q = (t+f)^* t (t+f)^*$
Or, $L(Q) \to f$ where $Q = f^+$
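A minimal sketch of an unranked tree automaton whose rules carry regular expressions over the states of the children, using Python's `re` module. The encoding is an assumption: a tree is a leaf "0" or "1", or a tuple (label, child, child, ...); the leaf rules (1 goes to t, 0 goes to f) and the use of `t+` and `f+` for gates with at least one child are also assumptions.

```python
import re

# Each rule: (label, regular expression over child states, resulting state).
RULES = [
    ("And", r"t+", "t"),
    ("And", r"[tf]*f[tf]*", "f"),
    ("Or", r"[tf]*t[tf]*", "t"),
    ("Or", r"f+", "f"),
]
LEAF = {"1": "t", "0": "f"}  # assumed leaf rules

def state(tree):
    """Compute the state of the root from the word of its children's states."""
    if isinstance(tree, str):
        return LEAF[tree]
    label, *children = tree
    word = "".join(state(child) for child in children)
    for rule_label, pattern, result in RULES:
        if rule_label == label and re.fullmatch(pattern, word):
            return result
    raise ValueError(f"no rule applies to {label} with children {word!r}")

WIDE_AND = ("And", "1", "1", "1", "1")
NESTED = ("Or", "0", ("And", "1", "0", "1"), "0")
print(state(WIDE_AND))  # t
print(state(NESTED))    # f
```

Four rules cover gates of any fan-in, which is the point of replacing the infinite list of transitions by a regular language.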

Page 21: Tree Automata

Building on ranked trees

[Figure: an unranked tree with nodes labeled a and b, shown next to its encoding as a ranked (binary) tree.]

Ranked tree: FirstChild-NextSibling

• F: encoding into a ranked tree
• F is a bijection
• $F^{-1}$: decoding
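The FirstChild-NextSibling encoding can be sketched as follows. The representation is an assumption: an unranked tree is a pair (label, list of children), and a binary node is a triple (label, first child, next sibling) with `None` for a missing child. The names `F` and `F_inv` mirror the slide; the helper names are hypothetical.

```python
def encode(forest):
    """Encode a list of sibling trees as one binary tree (or None if empty)."""
    if not forest:
        return None
    (label, children), *rest = forest
    # Left pointer: first child. Right pointer: next sibling.
    return (label, encode(children), encode(rest))

def decode(node):
    """Inverse of encode: rebuild the list of sibling trees."""
    if node is None:
        return []
    label, first_child, next_sibling = node
    return [(label, decode(first_child))] + decode(next_sibling)

def F(tree):
    return encode([tree])

def F_inv(binary):
    return decode(binary)[0]

SMALL = ("a", [("b", []), ("c", [])])
BIG = ("a", [("b", []), ("b", [("a", []), ("b", [])]), ("b", [])])
print(F(SMALL))
print(F_inv(F(BIG)) == BIG)  # True
```

In this sketch the root of an encoded tree never has a next sibling, so the bijection is with the binary trees of that shape.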

Page 22: Tree Automata

Building on bottom-up ranked trees (2)

• For each unranked TA A, there is a ranked TA accepting F(L(A))

• For each ranked TA A, there is an unranked TA accepting $F^{-1}(L(A))$

• Both are easy to construct

Consequence: Unranked TA are closed under union, intersection, complement

Determinization is also possible, but a bit more tricky