Tree Automata
First: A reminder on Automata on words
Typing semistructured data
Finite state automata on words
$A = (\Sigma, Q, q_0, F, \delta)$
• Alphabet: $\Sigma$
• States: $Q$
• Initial state: $q_0 \in Q$
• Accepting states: $F \subseteq Q$
• Transitions: $\delta : \Sigma \times Q \to P(Q)$
Nondeterministic automaton: Example
$\Sigma = \{a, b\}$
$Q = \{q_0, q_1, q_2, q_3\}$
$F = \{q_2\}$
$\delta(a, q_0) = \{q_0, q_1\}, \ldots$
[Figure: runs of the automaton on the words a b a and a b a a b; one word is rejected (KO), the other is accepted (OK)]
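A nondeterministic run can be simulated by tracking the set of states reachable after each letter. The sketch below is a minimal illustration under assumptions: only $\delta(a, q_0) = \{q_0, q_1\}$ comes from the slide, while the other transitions, and the three-state automaton as a whole, are hypothetical and chosen so that the automaton accepts the words ending in "ab".

```python
# Hypothetical NFA over {a, b}. Only delta(a, q0) = {q0, q1} is taken from
# the slide; the remaining transitions are assumptions for illustration.
DELTA = {
    ("a", "q0"): {"q0", "q1"},
    ("b", "q0"): {"q0"},
    ("b", "q1"): {"q2"},
}
INITIAL = "q0"
ACCEPTING = {"q2"}


def accepts(word):
    """Return True if some run of the NFA on `word` ends in an accepting state."""
    current = {INITIAL}
    for letter in word:
        # Follow every possible transition from every currently reachable state.
        current = {
            target
            for state in current
            for target in DELTA.get((letter, state), set())
        }
    return bool(current & ACCEPTING)


print(accepts("aba"))    # no run ends in q2
print(accepts("abaab"))  # some run ends in q2
```

A missing entry in `DELTA` means the run dies, which is how "no transition" is modeled here.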
Reminder
• Deterministic
  – No $\varepsilon$-transition
  – No alternative transitions such as $\delta(a, q_0) = \{q_0, q_1\}$
• Determinization
  – It is possible to obtain an equivalent deterministic automaton
  – State of the new automaton = set of states of the original one
  – Possible exponential blow-up
• Minimization
• Limitations
  – Cannot recognize $\{a^n b^n \mid n \in \mathbb{N}\}$
  – Cannot recognize context-free languages in general
• Essential tool, e.g., lexical analysis
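The determinization step (state of the new automaton = set of states of the original one) can be sketched as a subset construction that only explores reachable sets. The NFA below is a hypothetical example (words over {a, b} ending in "ab"), not the automaton from the slides, and the function names are illustrative.

```python
from collections import deque

# Hypothetical NFA (assumption): accepts the words over {a, b} ending in "ab".
ALPHABET = ["a", "b"]
NFA_DELTA = {
    ("a", "q0"): {"q0", "q1"},
    ("b", "q0"): {"q0"},
    ("b", "q1"): {"q2"},
}
NFA_INITIAL = "q0"
NFA_ACCEPTING = {"q2"}


def determinize():
    """Subset construction, restricted to the reachable sets of NFA states."""
    start = frozenset({NFA_INITIAL})
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for letter in ALPHABET:
            # The successor is the union of the NFA successors of each member.
            target = frozenset(
                t for q in subset for t in NFA_DELTA.get((letter, q), set())
            )
            dfa_delta[(letter, subset)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is accepting if it contains at least one accepting NFA state.
    accepting = {s for s in seen if s & NFA_ACCEPTING}
    return start, dfa_delta, accepting, seen


def dfa_accepts(word):
    """Run the determinized automaton on a word over ALPHABET."""
    start, dfa_delta, accepting, _ = determinize()
    state = start
    for letter in word:
        state = dfa_delta[(letter, state)]
    return state in accepting
```

In the worst case the number of subsets is exponential in the number of NFA states, which is the blow-up mentioned above; exploring only reachable subsets often keeps the result much smaller.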
Reminder (2)
• $L(A)$ = set of words accepted by automaton $A$
• Regular languages
• Can be described by regular expressions, e.g. a(b+c)*d
• Closed under complement: $\Sigma^* - L(A)$
• Closed under union and intersection: $L(A) \cup L(B)$, $L(A) \cap L(B)$
  – Product automaton with states $(s, s')$ where $s$ is from $A$ and $s'$ is from $B$
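The product construction can be sketched for two complete deterministic automata: the product runs both in lockstep on pairs of states, and only the choice of accepting pairs differs between intersection and union. The two automata below (even number of a's, words ending in b) are hypothetical examples, and the dictionary layout is an assumption made for this sketch.

```python
from itertools import product

ALPHABET = ["a", "b"]

# Hypothetical complete DFA A: words with an even number of a's.
A = {
    "states": {"even", "odd"},
    "initial": "even",
    "accepting": {"even"},
    "delta": {("a", "even"): "odd", ("a", "odd"): "even",
              ("b", "even"): "even", ("b", "odd"): "odd"},
}
# Hypothetical complete DFA B: words ending in b.
B = {
    "states": {"no", "yes"},
    "initial": "no",
    "accepting": {"yes"},
    "delta": {("a", "no"): "no", ("a", "yes"): "no",
              ("b", "no"): "yes", ("b", "yes"): "yes"},
}


def product_automaton(x, y, mode):
    """Build the product of two complete DFAs; mode is 'intersection' or 'union'."""
    states = set(product(x["states"], y["states"]))
    delta = {
        (c, (s, t)): (x["delta"][(c, s)], y["delta"][(c, t)])
        for c in ALPHABET
        for (s, t) in states
    }
    if mode == "intersection":
        accepting = {(s, t) for (s, t) in states
                     if s in x["accepting"] and t in y["accepting"]}
    else:
        accepting = {(s, t) for (s, t) in states
                     if s in x["accepting"] or t in y["accepting"]}
    return {"states": states, "initial": (x["initial"], y["initial"]),
            "accepting": accepting, "delta": delta}


def accepts(dfa, word):
    """Run a complete DFA on a word over ALPHABET."""
    state = dfa["initial"]
    for letter in word:
        state = dfa["delta"][(letter, state)]
    return state in dfa["accepting"]


both = product_automaton(A, B, "intersection")
either = product_automaton(A, B, "union")
print(accepts(both, "aab"), accepts(either, "ab"))
```

The union case relies on both automata being complete; with partial transition functions a run could die in one component while the other would still accept.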
Automata on words versus trees
[Figure: the word a b b a can be read left to right or right to left (no difference); a tree with nodes labeled a and b can be processed bottom-up or top-down (differences)]
Automata
Automata on ranked trees
Binary tree automata
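$A = (\Sigma, Q, F, \delta)$
• Parallel evaluation
• For leaves: $\delta : \Sigma \to P(Q)$
• For other nodes: $\delta : \Sigma \times Q \times Q \to P(Q)$
[Figure: bottom-up evaluation of a binary tree with nodes labeled a and b; states are assigned from the leaves upward]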
Bottom-up tree automata
• Bottom-up: if a node labeled $a$ has its children in states $q$, $q'$, then the node moves nondeterministically to state $r$ or $r'$: $\delta(a, q, q') = \{r, r'\}$
• Accepts if the root is in some state in $F$
• Not deterministic if there are alternatives, e.g. $\delta(a, q, q') = \{r, r'\}$ with $r \neq r'$, or $\varepsilon$-transitions
Example: deterministic bottom-up
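$\Sigma = \{0, 1, \wedge, \vee\}$
$Q = \{q_0, q_1\}$
$F = \{q_1\}$
Leaves: $\delta(0) = q_0$, $\delta(1) = q_1$
$\delta(\wedge, q_1, q_1) = q_1$
$\delta(\wedge, q_0, q_1) = \delta(\wedge, q_1, q_0) = \delta(\wedge, q_0, q_0) = q_0$
$\delta(\vee, q_0, q_0) = q_0$
$\delta(\vee, q_1, q_0) = \delta(\vee, q_0, q_1) = \delta(\vee, q_1, q_1) = q_1$
Boolean circuit evaluation
[Figure: bottom-up evaluation of a Boolean circuit with $\wedge$ and $\vee$ gates over leaves 0 and 1; the root reaches state $q_1$, so the tree is accepted (OK)]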
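Boolean circuit evaluation with a deterministic bottom-up automaton can be sketched as a recursive function that computes the state of each node from the states of its two children. The tuple encoding of trees and the ASCII names "and" / "or" for the gates are assumptions made for this illustration.

```python
# States q0 / q1 stand for "evaluates to 0" / "evaluates to 1".
LEAF_DELTA = {"0": "q0", "1": "q1"}
NODE_DELTA = {
    ("and", "q1", "q1"): "q1",
    ("and", "q0", "q1"): "q0",
    ("and", "q1", "q0"): "q0",
    ("and", "q0", "q0"): "q0",
    ("or", "q0", "q0"): "q0",
    ("or", "q1", "q0"): "q1",
    ("or", "q0", "q1"): "q1",
    ("or", "q1", "q1"): "q1",
}
ACCEPTING = {"q1"}


def run(tree):
    """Return the state reached at the root of `tree`.

    A tree is either a leaf "0" / "1" or a triple (gate, left, right).
    """
    if isinstance(tree, str):
        return LEAF_DELTA[tree]
    gate, left, right = tree
    # The two subtrees are independent, so they could be evaluated in parallel.
    return NODE_DELTA[(gate, run(left), run(right))]


def accepts(tree):
    """A tree is accepted when its root state is in the accepting set."""
    return run(tree) in ACCEPTING


circuit = ("or", ("and", "1", "0"), "1")
print(accepts(circuit))
```

Because the automaton is deterministic and complete, every tree reaches exactly one state, so `run` returns a single state rather than a set.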
Regular tree languages
Regular tree language = set of trees accepted by a bottom-up tree automaton
Theorem: the following are equivalent
– L is a regular tree language
– L is accepted by a nondeterministic bottom-up automaton
– L is accepted by a deterministic bottom-up automaton
– L is accepted by a nondeterministic top-down automaton
Deterministic top-down is weaker
Top-down tree automata
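• Top-down: if a node labeled $a$ is in state $q''$, then its left child moves to state $q$ and its right child to state $q'$: $(q, q') \in \delta(a, q'')$
• Accepts if all leaves are in states in $F$
• Not deterministic if $\delta(a, q'') = \{(q, q'), (r, r')\}$ with $(q, q') \neq (r, r')$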
Why is deterministic top-down weaker?
• Consider the language
  – L = { <r><a/><b/></r>, <r><b/><a/></r> }
• It can be accepted by a bottom-up TA
  – Exercise: write a BUTA A such that L = L(A)
• Suppose that B is a deterministic top-down TA that accepts both trees in L
  – Exercise: show that B also accepts <r><a/><a/></r>
  – A contradiction
Fact: no deterministic top-down tree automaton accepts exactly L
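One possible bottom-up automaton for L can be sketched as follows: each leaf state records the leaf label, and the root accepts only the two mixed combinations. The state names `qa`, `qb`, `qok` and the tuple encoding of trees are assumptions made for this sketch, which is one answer among several to the first exercise.

```python
# Leaf states remember which label was seen.
LEAF_DELTA = {"a": "qa", "b": "qb"}
# Only the two trees of L reach the accepting state at the root.
NODE_DELTA = {
    ("r", "qa", "qb"): "qok",
    ("r", "qb", "qa"): "qok",
}
ACCEPTING = {"qok"}


def run(tree):
    """Return the root state, or None when no transition applies.

    A tree is either a leaf label ("a" or "b") or a triple (label, left, right).
    """
    if isinstance(tree, str):
        return LEAF_DELTA.get(tree)
    label, left, right = tree
    return NODE_DELTA.get((label, run(left), run(right)))


def accepts(tree):
    """A tree is accepted when its root state is in the accepting set."""
    return run(tree) in ACCEPTING


print(accepts(("r", "a", "b")), accepts(("r", "a", "a")))
```

The bottom-up automaton sees both children before deciding at the root. A deterministic top-down automaton must instead fix the states of the left and right child from the root label alone, so it cannot correlate the two leaves.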
Ranked tree automata: Properties
• Like for words
• Determinization
• Minimization
• Closed under
  – Complement
  – Intersection
  – Union
But…
• XML documents are unranked: book (intro, section*, conclusion)
Automata
Automata on unranked trees
Unranked tree automata
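$\delta(\mathrm{Or}, f) = f,\ \delta(\mathrm{Or}, ff) = f,\ \delta(\mathrm{Or}, fff) = f, \ldots$
$\delta(\mathrm{Or}, t) = t,\ \delta(\mathrm{Or}, tf) = t,\ \delta(\mathrm{Or}, ft) = t, \ldots$
$\delta(\mathrm{And}, f) = f,\ \delta(\mathrm{And}, tf) = f,\ \delta(\mathrm{And}, ft) = f, \ldots$
$\delta(\mathrm{And}, t) = t,\ \delta(\mathrm{And}, tt) = t,\ \delta(\mathrm{And}, ttt) = t, \ldots$
Issue: represent an infinite set of transitions
Solution: a regular language
Unranked tree automata (2)
• Rule: $\delta(a, L(Q)) = \{r_1, \ldots, r_m\}$
• Meaning: if the states of the children of some node labeled $a$ form a word in $L(Q)$, this node moves to some state in $\{r_1, \ldots, r_m\}$
$\delta(\mathrm{Or}, \mathrm{Or}_0) = f$ where $\mathrm{Or}_0 = f^+$
$\delta(\mathrm{Or}, \mathrm{Or}_1) = t$ where $\mathrm{Or}_1 = (t+f)^*\, t\, (t+f)^*$
$\delta(\mathrm{And}, \mathrm{And}_0) = f$ where $\mathrm{And}_0 = (t+f)^*\, f\, (t+f)^*$
$\delta(\mathrm{And}, \mathrm{And}_1) = t$ where $\mathrm{And}_1 = t^+$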
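Rules whose left-hand side is a regular language over child states can be sketched with ordinary regular expressions: the states of the children are concatenated into a word and matched against the pattern of each rule. The leaf rules for 0 and 1, the single-character state names, and the `(label, children)` tree encoding are assumptions made for this illustration.

```python
import re

# Each rule: (node label, regular expression over child states, resulting state).
RULES = [
    ("And", r"t+", "t"),
    ("And", r"[tf]*f[tf]*", "f"),
    ("Or", r"[tf]*t[tf]*", "t"),
    ("Or", r"f+", "f"),
    # Assumed leaf rules: a leaf has the empty word of child states.
    ("1", r"", "t"),
    ("0", r"", "f"),
]
ACCEPTING = {"t"}


def states(tree):
    """Return the set of states the root of `tree` can reach.

    A tree is a pair (label, list_of_children).
    """
    label, children = tree
    # Build every word of child states, one letter per child, left to right.
    words = {""}
    for child in children:
        words = {w + s for w in words for s in states(child)}
    return {
        result
        for (rule_label, pattern, result) in RULES
        if rule_label == label
        for w in words
        if re.fullmatch(pattern, w)
    }


def accepts(tree):
    """Accept when some reachable root state is accepting."""
    return bool(states(tree) & ACCEPTING)


circuit = ("And", [("1", []), ("Or", [("0", []), ("1", []), ("0", [])]), ("1", [])])
print(accepts(circuit))
```

Each of the four gate rules stands for infinitely many transitions, one per word in its language, which is how the regular language solves the representation issue.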
Building on ranked trees
[Figure: an unranked tree with nodes labeled a and b, and its encoding as a ranked (binary) tree]
Ranked tree: FirstChild-NextSibling
• $F$: encoding into a ranked tree
• $F$ is a bijection
• $F^{-1}$: decoding
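The FirstChild-NextSibling encoding and its inverse can be sketched as two mutually inverse recursive functions on forests. The representations are assumptions made for this illustration: an unranked tree is a pair `(label, children)`, and a binary node is a triple `(label, first_child, next_sibling)` with `None` marking a missing child.

```python
def encode_forest(trees):
    """Encode a list of unranked trees as one binary tree (or None if empty)."""
    if not trees:
        return None
    (label, children), *siblings = trees
    # Left pointer: first child of this node. Right pointer: its next sibling.
    return (label, encode_forest(children), encode_forest(siblings))


def decode_forest(node):
    """Inverse of encode_forest: rebuild the list of unranked trees."""
    if node is None:
        return []
    label, first_child, next_sibling = node
    return [(label, decode_forest(first_child))] + decode_forest(next_sibling)


def encode(tree):
    """F: encode a single unranked tree."""
    return encode_forest([tree])


def decode(node):
    """F^-1: decode a binary tree whose root has no next sibling."""
    return decode_forest(node)[0]


tree = ("a", [("b", []), ("b", [("a", []), ("b", [])]), ("b", [])])
print(encode(tree))
print(decode(encode(tree)) == tree)
```

In this sketch the bijection is between unranked trees and binary trees whose root has no next sibling; a binary tree with a sibling at the root decodes to a forest rather than a single tree.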
Building on bottom-up ranked trees (2)
• For each Unranked TA A, there is a Ranked TA accepting F(L(A))
• For each Ranked TA A, there is an Unranked TA accepting $F^{-1}(L(A))$
• Both are easy to construct
Consequence: Unranked TA are closed under union, intersection, complement
Determinization is also possible, but a bit more tricky