Upload
dinhthu
View
221
Download
6
Embed Size (px)
Citation preview
009
YEAR
20 7
32
Software ToolsANTLR
CO
MP
SC
I ANTLR
eala
nd Part II - Lecture 8
klan
d | N
ew Z
eve
rsity
of A
uck
The
Uni
v
1
Today’s Outline009
y
YEAR
20 7
32
• Introduction to ANTLRP i A ti
CO
MP
SC
I • Parsing Actions• Generators
eala
ndkl
and
| New
Ze
vers
ity o
f Auc
kTh
e U
niv
2
009
YEAR
20 7
32C
OM
PS
CI
Introduction to ANTLR
eala
nd
Introduction to ANTLR
klan
d | N
ew Z
eve
rsity
of A
uck
The
Uni
v
3
ANTLR009 • Parser/lexer generator: takes a grammar and
YEAR
20 7
32
g ggenerates a LL(k) lexer and/or parser for you– Written in Java, open source software
CO
MP
SC
I
– Can generate Java, C#, C++, Python, …– Uses the regular expression / grammar syntax
eala
nd
g p g ythat we have learned in the last lecture
– Grammar files have suffix .g
klan
d | N
ew Z
e
• Besides simple LL(k), ANTLR supports backtracking:– If it is unclear which rule alternative to apply,
vers
ity o
f Auc
k
alternatives are applied speculatively– If a choice turns out to be wrong, backtracking is
d d h l i i i d
The
Uni
v used and another alternative is tried4
ANTLR Example: Java.g009
p ggrammar Java;options { backtrack true; memoize true; }
YEAR
20 7
32
options { backtrack=true; memoize=true; }
compilationUnit:( (annotations)? packageDeclaration )?
CO
MP
SC
I (importDeclaration)* (typeDeclaration)* ;
packageDeclaration: 'package' qualifiedName ';' ;
importDeclaration:
eala
nd
importDeclaration:'import' ('static')? IDENTIFIER '.' '*' ';'
| 'import' ('static')?IDENTIFIER ('.' IDENTIFIER)+ ('.' '*')? ';' ;
klan
d | N
ew Z
e
• The “Java” grammar uses backtracking
( ) ( ) ; ;
qualifiedImportName: IDENTIFIER ('.' IDENTIFIER)* ; // ...
vers
ity o
f Auc
k The Java grammar uses backtracking• Some grammar rules define simple tokens directly,
e.g. 'import', 'static'
The
Uni
v
• Grammar rules also refer to tokens of the lexer, which is defined later on in Java.g 5
The Lexer in Java.g009
gLONGLITERAL: IntegerNumber LongSuffix ;
INTLITERAL: IntegerNumber ;
YEAR
20 7
32
INTLITERAL: IntegerNumber ;
fragment IntegerNumber: '0' // number zero| '1'..'9' ('0'..'9')* // decimal numbers| '0' ('0' '7')+ // octal n mbe s
CO
MP
SC
I | '0' ('0'..'7')+ // octal numbers| HexPrefix HexDigit+ ; // hexadecimal numbers
fragment HexPrefix: '0x' | '0X' ;
eala
nd
fragment HexDigit: ('0'..'9'|'a'..'f'|'A'..'F') ;
fragment LongSuffix: 'l' | 'L' ;
ABSTRACT ' b t t' //
klan
d | N
ew Z
e
• The lexer rules come right after the parser rules( h i l l N )
ABSTRACT: 'abstract' ; // ...
vers
ity o
f Auc
k (some grammars have an optional lexer lexerName; )• Lexer rules use essentially the same syntax as parser rules
L l s s s b l s (f t l s) th t d t
The
Uni
v • Lexer rules can use subrules (fragment rules) that do not define tokens themselves but are used by other rules 6
Generating and Using L x rs nd P rs rs
009
Lexers and Parsers• Generate Java classes for parser and lexer by executing
YEAR
20 7
32
p y gclass org.antlr.Tool with command line arguments:-Xconversiontimeout 100000 -o src\pdstore\java Java.g
(timeout for backtracking) (output folder) (input)
CO
MP
SC
I (timeout for backtracking) (output folder) (input)• This generates classes JavaLexer and JavaParser, which
can be used from other classes, e.g.
eala
nd
import org.antlr.runtime.*; // ...public class Import {
public static void main(String[] args) { // ...
klan
d | N
ew Z
e public static void main(String[] args) { // ...CharStream input = new ANTLRFileStream(args[0]); JavaLexer lexer = new JavaLexer(input);CommonTokenStream tokens = new CommonTokenStream();
vers
ity o
f Auc
k ();tokens.setTokenSource(lexer);JavaParser parser = new JavaParser(tokens);
// start parsing at the compilationUnit rule
The
Uni
v
7
// start parsing at the compilationUnit ruleparser.compilationUnit(); // ...
} }
009
YEAR
20 7
32C
OM
PS
CI
Parsing Actions
eala
nd
Parsing Actions
klan
d | N
ew Z
eve
rsity
of A
uck
The
Uni
v
8
Parsing Actions009
g• We want to do things with the source code we parse
YEAR
20 7
32
• Idea: whenever we have recognized part of the language, execute some code (“action”)
CO
MP
SC
I • Actions can be at beginning (@init{ }), end (@after{ }) or anywhere else in the rule body ({ })
eala
nd
compilationUnit@init { System.out.println("Rule application has begun");}@after{ System.out.println("Rule application has ended");}
klan
d | N
ew Z
e @after{ System.out.println( Rule application has ended );}: ((annotations)? packageDeclaration
{ System.out.println("Parsed packageDeclaration");})?
vers
ity o
f Auc
k )(importDeclaration{ System.out.println("Parsed importDeclaration"); })*
The
Uni
v
9(typeDeclaration { ….println("Parsed typeDeclaration"); })*
;
Accessing and Returning V lu s fr m Rul s
009
Values from Rules• Rules are used to generate methods that can return
YEAR
20 7
32
gvalues: add returns [Type varName] after rule name
• To access return values, assign a variable d f ld
CO
MP
SC
I
var=ruleName or var=TOKEN and access its fields• The variable is declared for you by ANTLR and can be
d i ti ith $
eala
nd
accessed in actions with $var• Tokens have their text string in $var.text
klan
d | N
ew Z
e
packageDeclaration: 'package' name=qualifiedName{ System.out.println("qualifiedName="+$name.value); }
vers
ity o
f Auc
k ';' ;
qualifiedName returns [String value]: id=IDENTIFIER { $value = $id.text; }
The
Uni
v
10
: id IDENTIFIER { $value $id.text; }('.' id=IDENTIFIER { $value += "." + $id.text; } )* ;
Example: Accessing and R turnin V lu s
009
Returning ValuesThe following rule prints out the source code it parses:
YEAR
20 7
32
g p p
importDeclaration@init{ String s = "import "; }
CO
MP
SC
I
@ { g p ; }: 'import' ('static' { s += "static "; } )?
id=IDENTIFIER '.' '*' ';'{ S t t i tl ( + $id t t + " * ") }
eala
nd
{ System.out.println(s + $id.text + ".*;"); }| 'import' ('static' { s += "static "; } )?
id=IDENTIFIER
klan
d | N
ew Z
e
{ s += $id.text; }('.' id=IDENTIFIER { s += "." + $id.text; } )+(' ' '*' { s += " *"; } )?
vers
ity o
f Auc
k ( . { s += . ; } )?';'{ System.out.println(s + ";"); }
The
Uni
v
;11
Debugging Parsing Actions009
gg g g• ANTLR will not check the Java code in the actions,
YEAR
20 7
32
i.e. the generated class might contain errors• Eclipse’s compiler will show you syntax errors after
l di h d j fil (F5 f l d)
CO
MP
SC
I reloading the generated .java file (F5 for reload)• For each rule, ANTLR will generate a method with the
l
eala
nd
rule namepublic final PDJavaImport importDeclaration() throws
importDeclaration returns [PDJavaImport value]
klan
d | N
ew Z
e importDeclaration() throws RecognitionException {…if (state.backtracking==0 )
[PDJavaImport value] …: 'import' ('static')?id=IDENTIFIER '.' '*' ';'{ ANTLR
vers
ity o
f Auc
k ( g ){PDJavaPackage package = …
}
{PDJavaPackage package = …
}…
The
Uni
v
12…
}; Error: package
is a Java keyword
Example:Buildin n AST
009
Building an ASTIdea: each rule returns AST node and gets the returned AST
n d s f th l s it s sYEAR
20 7
32
nodes of the rules it usescompilationUnit returns [JavaCompilationUnit value]@init {
CO
MP
SC
I @init {$value = new JavaCompilationUnit();
} : ( (annotations)?
eala
nd
packageDecl=packageDeclaration{ $value.setPackage($packageDecl.value); }
)?
klan
d | N
ew Z
e )?(importDecl=importDeclaration{ $value.addImports($importDecl.value); }
vers
ity o
f Auc
k )*(typeDecl=typeDeclaration{ $value.addTypeDefinitions($typeDecl.value); }
The
Uni
v
)* ;13
Building an AST Cont’d009
gimportDeclaration returns [JavaImport value]@init {
YEAR
20 7
32
$value = new JavaImport();
String name = null;boolean isPackage false;
CO
MP
SC
I boolean isPackage = false;} : …| 'import' ('static')? id=IDENTIFIER { name = $id.text; }
eala
nd
('.' id=IDENTIFIER { name += "." + $id.text; } )+('.' '*' { isPackage = true; } )? ';'{
klan
d | N
ew Z
e {if (isPackage) {JavaPackage p = new JavaPackage();
vers
ity o
f Auc
k p.setName(name); $value.setPackage(p);} else {JavaType type = new JavaType();
The
Uni
v
type.setName(name); $value.setType(type);} } ;
14
009
YEAR
20 7
32C
OM
PS
CI
Generators
eala
nd
Generators
klan
d | N
ew Z
eve
rsity
of A
uck
The
Uni
v
15
Writing Generators009
g
• Generators traverse the AST that was generated by YEAR
20 7
32
g ythe parser
• For each AST node, they generate some output
CO
MP
SC
I
• Easy way to implement:– For important AST node types, write a generator
eala
nd
p yp gmethod
– Method for AST node type X calls other methods
klan
d | N
ew Z
e
to do generation for child node types of X• Examples:
vers
ity o
f Auc
k
– Source code printer– Source code converter (i.e. print another language)
The
Uni
v
16
Java Printer009 public class JavaPrinter {
PrintStream s;YEAR
20 7
32
PrintStream s;
public JavaPrinter(OutputStream out) {s = new PrintStream(out);
CO
MP
SC
I
}
public void printCompilationUnit(Ja aCompilationUnit compilationUnit) {
eala
nd
JavaCompilationUnit compilationUnit) {s.println("package " +
compilationUnit.getPackage().getName() + ";");
klan
d | N
ew Z
e
s.println(); // use separate method to print importsfor (JavaImport i : compilationUnit.getImports())printImport(i);
vers
ity o
f Auc
k printImport(i);s.println(); // use separate method to print typesfor (JavaType t : compilationUnit.getTypeDefinitions())
The
Uni
v printType(t);} … 17
Java Printer Cont.009 public void printImport(JavaImport javaImport) {
if (javaImport getPackage() != null)YEAR
20 7
32
if (javaImport.getPackage() != null)s.println("import " + javaImport.getPackage().getName()
+ ".*;");
CO
MP
SC
I else if (javaImport.getType() != null)s.println("import " + javaImport.getType().getName()
+ ";");
eala
nd
}
public void printType(JavaType type) {
klan
d | N
ew Z
e
if (type.getJavaInterface() != null)s.println("interface " + type.getJavaInterface().getName()
+ " { }");
vers
ity o
f Auc
k + { ... } );else if (type.getJavaClass() != null)s.println("class " + type.getJavaClass().getName()
The
Uni
v + " { ... }");} 18
Using the Java Printer009
gpublic class Import {public static void main(String[] args) {
YEAR
20 7
32
public static void main(String[] args) { …// Create a parser that reads from the token streamJavaParser parser = new JavaParser(tokens);
CO
MP
SC
I
// start parsing at the compilationUnit ruleJavaCompilationUnit compilationUnit =
eala
nd
parser.compilationUnit();
// set up a JavaPrinter that prints to the standard outputi i i ( )
klan
d | N
ew Z
e JavaPrinter printer = new JavaPrinter(System.out);
// print the ASTprinter printCompilationUnit(compilationUnit);
vers
ity o
f Auc
k printer.printCompilationUnit(compilationUnit);…
} }
The
Uni
v
19
Today’s Summary009
y y
• ANTLR is a tool that can generate LL(k) parsers and YEAR
20 7
32
g ( ) plexers in Java
• By adding actions to a parser rule, Java code can be
CO
MP
SC
I executed after something has been parsed• Actions can create ASTs
eala
nd
• Generators traverse an AST and produce output recursively for each AST node
klan
d | N
ew Z
e
References:– ANTLR Homepage with Online Documentation.
vers
ity o
f Auc
k p ghttp://www.antlr.org/
– Scott Stanchfield. An ANTLR 2.0 Tutorial.http://javadude com/articles/antlrtut/
The
Uni
v
21
http://javadude.com/articles/antlrtut/