A language is a set of strings.
Example: The set of all valid C++ programs is a language. Each program is a string of symbols called tokens: identifiers, keywords, numeric constants, operator symbols, and various punctuation marks.
Problem: Given a string and a language, determine whether the string belongs to the language.
A mechanism to describe, or recognize, the strings that make up a language is a grammar.
A grammar consists of
1. A set of non-terminal symbols
2. A set of terminal symbols – strings in the language are formed by the terminal symbols.
3. A specified non-terminal symbol called the start symbol.
4. A set of rules, called productions, for rewriting the start symbol into a string in the language.
A string of terminal symbols belongs to the language defined by the grammar if, beginning with the start symbol and repeatedly replacing a non-terminal symbol with the right-hand side of a production for that non-terminal, we eventually obtain the specified string.
This process is called a derivation.
Example: Consider the following grammar for arithmetic expressions involving additions and subtractions.
The non-terminal symbols are: <L>, <int>
The terminal symbols are: all integers, +, -
The start symbol is: <L>
The rules are
<L> → <int> + <L>
<L> → <int> - <L>
<L> → <int>
<int> → any integer
Example of a string: 4 - 2 + 5
<L> ⇒ <int> - <L> ⇒ 4 - <L> ⇒ 4 - <int> + <L>
⇒ 4 - 2 + <L> ⇒ 4 - 2 + <int> ⇒ 4 - 2 + 5
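The derivation steps above can be replayed mechanically. The following is a minimal sketch, not part of the original slides: the names Form, Production, applyLeftmost, and deriveExample are hypothetical, and each use of "<int> → any integer" is written out as a production for one specific integer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A sentential form: a sequence of terminal and non-terminal symbols.
using Form = std::vector<std::string>;

// Hypothetical representation of one production, lhs -> rhs.
struct Production {
    std::string lhs;  // the non-terminal being replaced
    Form rhs;         // the symbols that replace it
};

// Replace the leftmost occurrence of p.lhs in form with p.rhs.
// Returns false (and leaves form unchanged) if p.lhs does not occur.
bool applyLeftmost(Form& form, const Production& p) {
    for (std::size_t i = 0; i < form.size(); ++i) {
        if (form[i] == p.lhs) {
            form.erase(form.begin() + i);
            form.insert(form.begin() + i, p.rhs.begin(), p.rhs.end());
            return true;
        }
    }
    return false;
}

// Join the symbols of a form with single spaces.
std::string toString(const Form& form) {
    std::string out;
    for (std::size_t i = 0; i < form.size(); ++i) {
        if (i > 0) out += ' ';
        out += form[i];
    }
    return out;
}

// Replays the derivation of 4 - 2 + 5 and returns every sentential form.
std::vector<std::string> deriveExample() {
    const std::vector<Production> steps = {
        {"<L>", {"<int>", "-", "<L>"}},
        {"<int>", {"4"}},
        {"<L>", {"<int>", "+", "<L>"}},
        {"<int>", {"2"}},
        {"<L>", {"<int>"}},
        {"<int>", {"5"}},
    };
    Form form = {"<L>"};
    std::vector<std::string> trace = {toString(form)};
    for (const Production& p : steps) {
        if (!applyLeftmost(form, p)) break;  // production does not apply
        trace.push_back(toString(form));
    }
    return trace;
}
```

Calling deriveExample() yields the seven sentential forms of the derivation, from <L> to 4 - 2 + 5.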
This derivation verifies that the original string, or arithmetic expression, belongs to the language, but it doesn’t tell the whole story. We need to look at the parse tree for this derivation.
The root of a parse tree is the start symbol. The children of any node that contains a non-terminal are the symbols (terminals and non-terminals) on the right-hand side of a production for that non-terminal.
<L>
├── <int>
│   └── 4
├── -
└── <L>
    ├── <int>
    │   └── 2
    ├── +
    └── <L>
        └── <int>
            └── 5
Evaluating this tree from the bottom up gives 2 + 5 = 7, then 4 - 7 = -3. In other words, 4 - 2 + 5 = -3. What’s wrong with this?
In “conventional” arithmetic, 4 - 2 + 5 = 7.
The additions and subtractions in expressions recognized by this grammar are right associative; that is, they are performed from right to left: the right-most operation is performed first, then the right-most of those remaining, and so on.
In ordinary arithmetic, if several additions and subtractions appear in an expression, they are performed from left to right. The operations are said to be left associative.
Example: Revise the productions in the previous example so the operations are left associative.
<L> → <L> + <int> | <L> - <int> | <int>
<int> → any integer
The symbol “|” represents “or”
<L>
├── <L>
│   ├── <L>
│   │   └── <int>
│   │       └── 4
│   ├── -
│   └── <int>
│       └── 2
├── +
└── <int>
    └── 5
In this grammar the associativity is consistent with ordinary arithmetic: evaluating the tree from the bottom up gives 4 - 2 = 2, then 2 + 5 = 7.
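The difference between the two grammars can be checked numerically. The sketch below is an illustration only: the Expr struct and the function names are hypothetical, and it assumes the expression has already been split into a non-empty list of operands with one operator between each adjacent pair.

```cpp
#include <cstddef>
#include <vector>

// An expression such as 4 - 2 + 5, stored as operands {4, 2, 5} and
// operators {'-', '+'}; ops[i] sits between operands[i] and operands[i + 1].
struct Expr {
    std::vector<int> operands;
    std::vector<char> ops;
};

// Apply one '+' or '-' operation.
int applyOp(int lhs, char op, int rhs) {
    return op == '+' ? lhs + rhs : lhs - rhs;
}

// Right associative: mirrors <L> -> <int> + <L> | <int> - <L> | <int>.
// Everything to the right of the first operator is evaluated first.
int evalRight(const Expr& e, std::size_t i = 0) {
    if (i + 1 == e.operands.size()) return e.operands[i];
    return applyOp(e.operands[i], e.ops[i], evalRight(e, i + 1));
}

// Left associative: mirrors <L> -> <L> + <int> | <L> - <int> | <int>.
// Operations are folded in from left to right.
int evalLeft(const Expr& e) {
    int value = e.operands[0];
    for (std::size_t i = 0; i < e.ops.size(); ++i) {
        value = applyOp(value, e.ops[i], e.operands[i + 1]);
    }
    return value;
}
```

For operands {4, 2, 5} and operators {'-', '+'}, evalRight returns -3 and evalLeft returns 7, matching the two parse trees.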
Example: The language L is defined by the grammar with
non-terminals: <S>, which is also the start symbol
terminals: a, b, c
Any string of a’s, b’s, and c’s is a candidate to belong to this language.
productions: <S> → a <S> b | c
It’s not too hard to see that L = { a^n c b^n : n >= 0 }
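Because L has this closed form, membership can also be tested by counting, without using the grammar at all. This is a minimal sketch for comparison; the function name inL is hypothetical.

```cpp
#include <cstddef>
#include <string>

// True if s has the form a^n c b^n for some n >= 0.
bool inL(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size() && s[i] == 'a') ++i;  // skip the leading a's
    const std::size_t n = i;                  // number of a's seen

    if (i == s.size() || s[i] != 'c') return false;  // a single c must follow
    ++i;

    std::size_t m = 0;
    while (i < s.size() && s[i] == 'b') {     // count the trailing b's
        ++i;
        ++m;
    }
    // Nothing may remain, and the counts must agree.
    return i == s.size() && m == n;
}
```

For example, inL("aacbb") and inL("c") are true, while inL("aacb") and inL("") are false.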
Example: Does the string aacbb belong to the language L?
<S> ⇒ a <S> b ⇒ aa <S> bb ⇒ aacbb
<S>
├── a
├── <S>
│   ├── a
│   ├── <S>
│   │   └── c
│   └── b
└── b
To write a program that recognizes strings in a particular language, we rely heavily on the grammar for the language.
There is a function for each non-terminal symbol.
Consider the previous example with productions
<S> → a <S> b | c
Write a function corresponding to the non-terminal symbol <S> which takes note of the current character in the specified string, and performs the actions indicated by the right hand side of the productions for <S>.
The function bool S ( ) is called in main via
    input a string
    ch = getNextChar ( )    // the first character in the string
    if ( S ( ) )
        display: the string is recognized by the grammar
    else
        display: the string is not recognized by the grammar
bool S ( ) {
    if ( ch == 'a' ) {    // applies the production <S> → a <S> b
        st.push ( ch );
        if ( EOS ) return false;    // nothing is left to match this a
        ch = getNextChar ( );
        if ( !S ( ) ) return false;
        if ( ch == 'b' ) {
            if ( st.empty ( ) ) return false;
            st.pop ( );    // this b matches the most recent unmatched a
            if ( st.empty ( ) && EOS ) return true;
            if ( st.empty ( ) xor EOS ) return false;
            ch = getNextChar ( );
            return true;
        }
        return false;    // a b was expected after <S>
    }
    else if ( ch == 'c' ) {    // applies the production <S> → c
        if ( st.empty ( ) && EOS ) return true;
        if ( st.empty ( ) xor EOS ) return false;
        ch = getNextChar ( );
        return true;
    }
    else    // something bad happened
        return false;
}    // the end of S
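For readers who want to run the recognizer, here is one possible self-contained version. It is a sketch under stated assumptions rather than the slides' own code: the Recognizer class and the recognize function are hypothetical wrappers, getNextChar is assumed to return the next unread character (or '\0' when none is left), and EOS is assumed to mean that every character has already been read into ch.

```cpp
#include <cstddef>
#include <stack>
#include <string>

// Hypothetical wrapper for the state the pseudocode treats as global:
// the input string, the current character ch, and the stack st.
class Recognizer {
public:
    explicit Recognizer(const std::string& input) : input_(input) {}

    // Mirrors main: read the first character, then call S.
    bool run() {
        ch_ = getNextChar();
        return S();
    }

private:
    // Assumption: returns the next unread character, or '\0' if none is left.
    char getNextChar() {
        return pos_ < input_.size() ? input_[pos_++] : '\0';
    }

    // Assumption: EOS is true once every character has been read into ch_.
    bool eos() const { return pos_ >= input_.size(); }

    bool S() {
        if (ch_ == 'a') {  // applies <S> -> a <S> b
            st_.push(ch_);
            if (eos()) return false;
            ch_ = getNextChar();
            if (!S()) return false;
            if (ch_ != 'b') return false;
            if (st_.empty()) return false;
            st_.pop();  // this b matches the most recent unmatched a
            if (st_.empty() && eos()) return true;
            if (st_.empty() != eos()) return false;  // same test as xor
            ch_ = getNextChar();
            return true;
        }
        if (ch_ == 'c') {  // applies <S> -> c
            if (st_.empty() && eos()) return true;
            if (st_.empty() != eos()) return false;
            ch_ = getNextChar();
            return true;
        }
        return false;  // ch_ cannot start any production for <S>
    }

    std::string input_;
    std::size_t pos_ = 0;
    char ch_ = '\0';
    std::stack<char> st_;
};

// Convenience function: true if s is recognized by the grammar.
bool recognize(const std::string& s) { return Recognizer(s).run(); }
```

With these assumptions, recognize("aacbb") returns true, while recognize("aacb") and recognize("acbb") return false. Each Recognizer object is meant to be used for a single call to run().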
Example:
In main: input string: aacbb, ch = 'a'
Call 1 to S: ch = 'a', push, st: a, ch = 'a', call S
Call 2 to S: ch = 'a', push, st: a a, ch = 'c', call S
Call 3 to S: ch = 'c', st is not empty and not EOS, ch = 'b', return true
Back in call 2: ch = 'b', pop, st: a, not EOS, ch = 'b', return true
Back in call 1: ch = 'b', pop, st: empty, EOS, return true
In main: S returned true, so the string is recognized by the grammar