Upload
guido-wachsmuth
View
3.649
Download
2
Embed Size (px)
DESCRIPTION
This lecture lays the theoretic foundations for declarative syntax formalisms and syntax-based language processors, which we will discuss later in the course. We introduce the notions of formal languages, formal grammars, and syntax trees, starting from Chomsky's work on formal grammars as generative devices. We start with a formal model of languages and investigate formal grammars and their derivation relations as finite models of infinite productivity. We further discuss several classes of formal grammars and their corresponding classes of formal languages. In a second step, we introduce the word problem, analyse its decidability and complexity for different classes of formal languages, and discuss consequences of this analysis on language processing. We conclude the lecture with a discussion about parse tree construction, abstract syntax trees, and ambiguities.
Citation preview
IN4303 2014/15 Compiler Construction
Declarative Syntax Definition grammars and trees
Guido Wachsmuth
Grammars and Trees 2
3 * +7 21
3 * +7 21
Exp ExpExp
Exp
Exp
parser
Grammars and Trees 3
syntax definition
errors
parse generate
check
parse table
generic parser
Grammars and Trees
Stratego
NaBL TS
SDF3
ESV editor
SPT tests
4
syntax definition
concrete syntax
abstract syntax
static semantics
name binding
type system
dynamic semantics
translation
interpretation
Grammars and Trees
Stratego
NaBL TS
SDF3
ESV editor
SPT tests
5
syntax definition
concrete syntax
abstract syntax
static semantics
name binding
type system
dynamic semantics
translation
interpretation
Grammars and Trees
Stratego
NaBL TS
SDF3
ESV editor
SPT tests
6
syntax definition
concrete syntax
abstract syntax
static semantics
name binding
type system
dynamic semantics
translation
interpretation
Grammars and Trees 7
P A R E N T A L
ADVISORYTHEORETICAL CONTENT
Grammars and Trees 8
formal languages
Grammars and Trees 9
infinite productivity
Grammars and Trees 10
finite models
Grammars and Trees 11
Philosophy
Linguistics
lexicology
grammar
morphology
syntax
phonology
semantics
Interdisciplinary
Computer Science
syntax
semantics
Grammars and Trees 12
Philosophy
Linguistics
lexicology
grammar
morphology
syntax
phonology
semantics
Interdisciplinary
Computer Science
syntax
semantics
Grammars and Trees 13
Grammars and Trees 13
vocabulary Σ
finite, nonempty set of elements (words, letters)
alphabet
Grammars and Trees 13
vocabulary Σ
finite, nonempty set of elements (words, letters)
alphabet
string over Σ
finite sequence of elements chosen from Σ
word, sentence, utterance
Grammars and Trees 13
vocabulary Σ
finite, nonempty set of elements (words, letters)
alphabet
string over Σ
finite sequence of elements chosen from Σ
word, sentence, utterance
formal language λ
set of strings over a vocabulary Σ
λ ⊆ Σ*
Grammars and Trees 14
formal grammars
Grammars and Trees 15
formal grammar G
derivation relation ⇒G
formal language L(G) ⊆ Σ*
L(G) = {w∈Σ* | S ⇒G* w}
Grammars and Trees 16
G = (N, Σ, P, S)
Num → Digit Num Num → Digit Digit → “0” Digit → “1” Digit → “2” Digit → “3” Digit → “4” Digit → “5” Digit → “6” Digit → “7” Digit → “8” Digit → “9”
decimal numbers morphology
Grammars and Trees 17
G = (N, Σ, P, S)
Num → Digit Num Num → Digit Digit → “0” Digit → “1” Digit → “2” Digit → “3” Digit → “4” Digit → “5” Digit → “6” Digit → “7” Digit → “8” Digit → “9”
decimal numbers morphology
Σ: finite set of terminal symbols
Grammars and Trees 18
G = (N, Σ, P, S)
Num → Digit Num Num → Digit Digit → “0” Digit → “1” Digit → “2” Digit → “3” Digit → “4” Digit → “5” Digit → “6” Digit → “7” Digit → “8” Digit → “9”
decimal numbers morphology
Σ: finite set of terminal symbols
N: finite set of non-terminal symbols
Grammars and Trees 19
G = (N, Σ, P, S)
Num → Digit Num Num → Digit Digit → “0” Digit → “1” Digit → “2” Digit → “3” Digit → “4” Digit → “5” Digit → “6” Digit → “7” Digit → “8” Digit → “9”
decimal numbers morphology
Σ: finite set of terminal symbols
N: finite set of non-terminal symbols
S∈N: start symbol
Grammars and Trees 20
G = (N, Σ, P, S)
Num → Digit Num Num → Digit Digit → “0” Digit → “1” Digit → “2” Digit → “3” Digit → “4” Digit → “5” Digit → “6” Digit → “7” Digit → “8” Digit → “9”
decimal numbers morphology
Σ: finite set of terminal symbols
N: finite set of non-terminal symbols
S∈N: start symbol
P⊆N×(N∪Σ)*: set of production rules
Grammars and Trees 21
Num ⇒
Digit Num ⇒
4 Num ⇒
4 Digit Num ⇒
4 3 Num ⇒
4 3 Digit Num ⇒
4 3 0 Num ⇒
4 3 0 Digit ⇒
4 3 0 3
Num → Digit Num ⇒
Digit → “4” ⇒
Num → Digit Num ⇒
Digit → “3” ⇒
Num → Digit Num ⇒
Digit → “0” ⇒
Num → Digit ⇒
Digit → “3” ⇒
decimal numbers production
Grammars and Trees 21
Num ⇒
Digit Num ⇒
4 Num ⇒
4 Digit Num ⇒
4 3 Num ⇒
4 3 Digit Num ⇒
4 3 0 Num ⇒
4 3 0 Digit ⇒
4 3 0 3
Num → Digit Num ⇒
Digit → “4” ⇒
Num → Digit Num ⇒
Digit → “3” ⇒
Num → Digit Num ⇒
Digit → “0” ⇒
Num → Digit ⇒
Digit → “3” ⇒
decimal numbers production
leftmost derivation
Grammars and Trees 22
Num ⇒
Digit Num ⇒
Digit Digit Num ⇒
Digit Digit Digit Num ⇒
Digit Digit Digit Digit ⇒
Digit Digit Digit 3 ⇒
Digit Digit 0 3 ⇒
Digit 3 0 3 ⇒
4 3 0 3
Num → Digit Num ⇒
Num → Digit Num ⇒
Num → Digit Num ⇒
Num → Digit ⇒
Digit → “3” ⇒ ⇒
Digit → “0” ⇒ ⇒
Digit → “3” ⇒
Digit → “4” ⇒ ⇒
decimal numbers production
Grammars and Trees 22
Num ⇒
Digit Num ⇒
Digit Digit Num ⇒
Digit Digit Digit Num ⇒
Digit Digit Digit Digit ⇒
Digit Digit Digit 3 ⇒
Digit Digit 0 3 ⇒
Digit 3 0 3 ⇒
4 3 0 3
Num → Digit Num ⇒
Num → Digit Num ⇒
Num → Digit Num ⇒
Num → Digit ⇒
Digit → “3” ⇒ ⇒
Digit → “0” ⇒ ⇒
Digit → “3” ⇒
Digit → “4” ⇒ ⇒
decimal numbers production
rightmost derivation
Grammars and Trees 23
G = (N, Σ, P, S)
Exp → Num Exp → Exp “+” Exp Exp → Exp “−” Exp Exp → Exp “*” Exp Exp → Exp “/” Exp Exp → “(” Exp “)”
binary expressions syntax
Grammars and Trees 24
G = (N, Σ, P, S)
Exp → Num Exp → Exp “+” Exp Exp → Exp “−” Exp Exp → Exp “*” Exp Exp → Exp “/” Exp Exp → “(” Exp “)”
binary expressions syntax
Σ: finite set of terminal symbols
Grammars and Trees 25
G = (N, Σ, P, S)
Exp → Num Exp → Exp “+” Exp Exp → Exp “−” Exp Exp → Exp “*” Exp Exp → Exp “/” Exp Exp → “(” Exp “)”
binary expressions syntax
Σ: finite set of terminal symbols
N: finite set of non-terminal symbols
Grammars and Trees 26
G = (N, Σ, P, S)
Exp → Num Exp → Exp “+” Exp Exp → Exp “−” Exp Exp → Exp “*” Exp Exp → Exp “/” Exp Exp → “(” Exp “)”
binary expressions syntax
Σ: finite set of terminal symbols
N: finite set of non-terminal symbols
S∈N: start symbol
Grammars and Trees 27
G = (N, Σ, P, S)
Exp → Num Exp → Exp “+” Exp Exp → Exp “−” Exp Exp → Exp “*” Exp Exp → Exp “/” Exp Exp → “(” Exp “)”
binary expressions syntax
Σ: finite set of terminal symbols
N: finite set of non-terminal symbols
S∈N: start symbol
P⊆N×(N∪Σ)*: set of production rules
Grammars and Trees 28
Exp ⇒
Exp + Exp ⇒
Exp + Exp * Exp ⇒
3 + Exp * Exp ⇒
3 + 4 * Exp ⇒
3 + 4 * 5
Exp → Exp “+” Exp ⇒
Exp → Exp “*” Exp ⇒
Exp → Num ⇒
Exp → Num ⇒
Exp → Num ⇒
binary expressions production
Grammars and Trees 29
formal grammar G = (N, Σ, P, S)
nonterminal symbols N
terminal symbols Σ
production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
start symbol S∈N
Grammars and Trees 29
formal grammar G = (N, Σ, P, S)
nonterminal symbols N
terminal symbols Σ
production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
start symbol S∈N
nonterminal symbol
Grammars and Trees 29
formal grammar G = (N, Σ, P, S)
nonterminal symbols N
terminal symbols Σ
production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
start symbol S∈N
context
Grammars and Trees 29
formal grammar G = (N, Σ, P, S)
nonterminal symbols N
terminal symbols Σ
production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
start symbol S∈N
replacement
Grammars and Trees 29
formal grammar G = (N, Σ, P, S)
nonterminal symbols N
terminal symbols Σ
production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
start symbol S∈N
Grammars and Trees 30
type-0, unrestricted: P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
type-1, context-sensitive: (a A c, a b c)
type-2, context-free: P ⊆ N × (N∪Σ)*
type-3, regular: (A, x) or (A, xB)
Grammars and Trees 31
formal grammars
context-sensitive
context-free
regular
Grammars and Trees 32
formal grammar G
derivation relation ⇒G
formal language L(G) ⊆ Σ*
L(G) = {w∈Σ* | S ⇒G* w}
Grammars and Trees 33
formal grammar G = (N, Σ, P, S)
derivation relation ⇒G ⊆ (N∪Σ)* × (N∪Σ)*
w ⇒G w’ ⇔
∃(p, q)∈P: ∃u,v∈(N∪Σ)*:
w=u p v ∧ w’=u q v
formal language L(G) ⊆ Σ*
L(G) = {w∈Σ* | S ⇒G* w}
Grammars and Trees 34
formal languages
context-sensitive
context-free
regular
Grammars and Trees 35
3 * +7 21
3 * +7 21
Exp ExpExp
Exp
Exp
parser
Derivation is about productivity. But what about parsing?
Grammars and Trees 36
word problem
Grammars and Trees 37
word problem χL: Σ*→ {0,1}
w → 1, if w∈L
w → 0, else
theoretical computer science decidability & complexity
Grammars and Trees 37
word problem χL: Σ*→ {0,1}
w → 1, if w∈L
w → 0, else
decidability
type-0: semi-decidable
type-1, type-2, type-3: decidable
theoretical computer science decidability & complexity
Grammars and Trees 37
word problem χL: Σ*→ {0,1}
w → 1, if w∈L
w → 0, else
decidability
type-0: semi-decidable
type-1, type-2, type-3: decidable
complexity
type-1: PSPACE-complete
type-2, type-3: P
theoretical computer science decidability & complexity
Grammars and Trees 37
word problem χL: Σ*→ {0,1}
w → 1, if w∈L
w → 0, else
decidability
type-0: semi-decidable
type-1, type-2, type-3: decidable
complexity
type-1: PSPACE-complete
type-2, type-3: P
theoretical computer science decidability & complexity
PSPACE⊇NP⊇P
Grammars and Trees 38
formal grammars
context-sensitive
context-free
regular
theoretical computer science decidability & complexity
Grammars and Trees 38
formal grammars
context-sensitive
context-free
regular
theoretical computer science decidability & complexity
Grammars and Trees 38
formal grammars
context-sensitive
context-free
regular
theoretical computer science decidability & complexity
Grammars and Trees 39
!Exp → Num Exp → Exp “+” Exp Exp → Exp “−” Exp Exp → Exp “*” Exp Exp → Exp “/” Exp Exp → “(” Exp “)”
context-free grammars production vs. reduction rules
Grammars and Trees 39
!Exp → Num Exp → Exp “+” Exp Exp → Exp “−” Exp Exp → Exp “*” Exp Exp → Exp “/” Exp Exp → “(” Exp “)”
context-free grammars production vs. reduction rules
productive form
Grammars and Trees 39
!Exp → Num Exp → Exp “+” Exp Exp → Exp “−” Exp Exp → Exp “*” Exp Exp → Exp “/” Exp Exp → “(” Exp “)”
!Num → Exp Exp “+” Exp → Exp Exp “−” Exp → Exp Exp “*” Exp → Exp Exp “/” Exp → Exp “(” Exp “)” → Exp
context-free grammars production vs. reduction rules
productive form
Grammars and Trees 39
!Exp → Num Exp → Exp “+” Exp Exp → Exp “−” Exp Exp → Exp “*” Exp Exp → Exp “/” Exp Exp → “(” Exp “)”
!Num → Exp Exp “+” Exp → Exp Exp “−” Exp → Exp Exp “*” Exp → Exp Exp “/” Exp → Exp “(” Exp “)” → Exp
context-free grammars production vs. reduction rules
productive form reductive form
Grammars and Trees 40
3 + 4 + 5 ⇒
Exp + 4 + 5 ⇒
Exp + Exp + 5 ⇒
Exp + Exp * Exp ⇒
Exp + Exp ⇒
Exp
Num → Exp ⇒
Num → Exp ⇒
Num → Exp ⇒
Exp “*” Exp → Exp ⇒
Exp “+” Exp → Exp ⇒⇒
binary expressions reduction
Grammars and Trees 41
3 * +7 21
3 * +7 21
Exp ExpExp
Exp
Exp
parser
The word problem is about membership. But what about structure?
Grammars and Trees 42
syntax trees
Grammars and Trees 43
Exp
Num
Exp
+Exp Exp
Exp
−Exp Exp
Exp
*Exp Exp
Exp
/Exp Exp
Exp
Exp( )
context-free grammars tree construction rules
Grammars and Trees 44
binary expressions tree construction
3 + *4 5
Grammars and Trees 44
binary expressions tree construction
Exp
3 + *4 5
Grammars and Trees 44
binary expressions tree construction
Exp Exp
3 + *4 5
Grammars and Trees 44
binary expressions tree construction
Exp Exp
3 + *4 5
Exp
Grammars and Trees 44
binary expressions tree construction
Exp
Exp
Exp
3 + *4 5
Exp
Grammars and Trees 44
binary expressions tree construction
Exp
Exp
Exp
3 + *4 5
Exp
Exp
Grammars and Trees 45
binary expressions ambiguity
Exp
Exp
Exp
3 + *4 5
Exp
Exp
Exp
Exp
Exp
3 + *4 5
Exp
Exp
Grammars and Trees 46
syntax trees
different trees for same sentence
derivations
different leftmost derivations for same sentence
different rightmost derivations for same sentence
NOT just different derivations for same sentence
context-free grammars ambiguity
Grammars and Trees 47
parse trees
parent node: nonterminal symbol
child nodes: terminal symbols
abstract syntax trees (ASTs)
abstract over terminal symbols
convey information at parent nodes
abstract over injective production rules
syntax trees parse trees & abstract syntax trees
Grammars and Trees 48
binary expressions parse tree & abstract syntax tree
Exp
Exp
Exp Exp
Exp
(3 + *4 5 )
Exp
Grammars and Trees 48
binary expressions parse tree & abstract syntax tree
Const
Mul
Const
3 4 5
Const
Add
Exp
Exp
Exp Exp
Exp
(3 + *4 5 )
Exp
Grammars and Trees 49
Except where otherwise noted, this work is licensed under
Grammars and Trees 50
attribution
slide title author license
1 The Pine, Saint Tropez Paul Signac public domain
2, 3, 35, 41 PICOL icons Melih Bilgil CC BY 3.0
9 Writing Caitlin Regan CC BY 2.0
10 Latin Grammar Anthony Nelzin
13, 15, 29-34, 38 Noam Chomsky Fellowsisters CC BY-NC-SA 2.0