ANTLR 4 Actions and Attributes

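1 Actions and Attributes

An action is a block of text enclosed in curly braces. When it executes depends on where it appears in the grammar: an action runs after the grammar element that precedes it has been matched.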

// Print a message whenever a valid declaration is matched
decl: type ID ';' {System.out.println("found a decl");} ;
type: 'int' | 'float' ;

// Access the attributes of tokens and rule references
decl: type ID ';'
      {System.out.println("var "+$ID.text+":"+$type.text+";");}
    | t=ID id=ID ';'
      {System.out.println("var "+$id.text+":"+$t.text+";");}
    ;

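2 Token Attributes

Every token has a set of predefined, read-only attributes. An action can reference a token directly by name, as in $ID.text. When the same token type appears more than once in an alternative, label the references to avoid ambiguity and write $label.attribute.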

r : INT {int x = $INT.line;}
    ( ID {if ($INT.line == $ID.line) ...;} )?
    a=FLOAT b=FLOAT {if ($a.line == $b.line) ...;}
  ;

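Note: the action inside the parenthesized subrule can still see the INT token matched by the enclosing rule.

A token reference only needs to be unique within a single alternative, so the same token name can be referenced directly in different alternatives.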

r : ... ID {System.out.println($ID.text);}
| ... ID {System.out.println($ID.text);}
;

To access a token matched by a literal, you must use a label.

stat: r='return' expr ';' {System.out.println("line="+$r.line);} ;

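An action can also access the Token object itself, which aggregates all of the attributes. This is useful, for example, to test whether an optional token was matched at all.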

stat: 'if' expr 'then' stat (el='else' stat)?
      {if ( $el!=null ) System.out.println("found an else");}
    | ...
    ;

Here el is the label for the literal 'else'.

The predefined token attributes are:

| Attribute | Type | Description |
|---|---|---|
| text | String | The text matched for the token; translates to a call to getText. Example: $ID.text. |
| type | int | The token type (nonzero positive integer) of the token such as INT; translates to a call to getType. Example: $ID.type. |
| line | int | The line number on which the token occurs, counting from 1; translates to a call to getLine. Example: $ID.line. |
| pos | int | The character position within the line at which the token’s first character occurs, counting from zero; translates to a call to getCharPositionInLine. Example: $ID.pos. |
| index | int | The overall index of this token in the token stream, counting from zero; translates to a call to getTokenIndex. Example: $ID.index. |
| channel | int | The token’s channel number. The parser tunes to only one channel, effectively ignoring off-channel tokens. The default channel is 0 (Token.DEFAULT_CHANNEL), and the default hidden channel is Token.HIDDEN_CHANNEL. Translates to a call to getChannel. Example: $ID.channel. |
| int | int | The integer value of the text held by this token; it assumes that the text is a valid numeric string. Handy for building calculators and so on. Translates to Integer.valueOf(text-of-token). Example: $INT.int. |
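The mapping from attribute references to getter calls can be pictured with a small stand-alone sketch. It does not use the ANTLR runtime: SimpleToken, intValue, and sameLine are hypothetical names invented here, and only the getter names are taken from the table above.

```java
/** Sketch only: SimpleToken is a hypothetical stand-in, not the ANTLR Token class. */
final class TokenAttributesSketch {

    static final class SimpleToken {
        private final String text;
        private final int type, line, charPositionInLine, tokenIndex, channel;

        SimpleToken(String text, int type, int line, int charPositionInLine,
                    int tokenIndex, int channel) {
            this.text = text;
            this.type = type;
            this.line = line;
            this.charPositionInLine = charPositionInLine;
            this.tokenIndex = tokenIndex;
            this.channel = channel;
        }

        String getText() { return text; }                           // $ID.text
        int getType() { return type; }                              // $ID.type
        int getLine() { return line; }                              // $ID.line
        int getCharPositionInLine() { return charPositionInLine; }  // $ID.pos
        int getTokenIndex() { return tokenIndex; }                  // $ID.index
        int getChannel() { return channel; }                        // $ID.channel
    }

    /** The table describes $INT.int as Integer.valueOf(text-of-token). */
    static int intValue(SimpleToken t) {
        return Integer.valueOf(t.getText());
    }

    /** Roughly the check written as {if ($a.line == $b.line) ...;} in an action. */
    static boolean sameLine(SimpleToken a, SimpleToken b) {
        return a.getLine() == b.getLine();
    }

    public static void main(String[] args) {
        SimpleToken a = new SimpleToken("42", 1, 3, 0, 0, 0);
        SimpleToken b = new SimpleToken("7", 1, 3, 3, 1, 0);
        System.out.println(intValue(a) + intValue(b)); // prints 49
        System.out.println(sameLine(a, b));            // prints true
    }
}
```

As with $INT.int, intValue throws a NumberFormatException when the token text is not a valid numeric string.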

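3 Parser Rule Attributes

ANTLR predefines a number of read-only attributes associated with parser rule references that are available to actions.

Note that an action can only access attributes of rule references that precede it.

You can reference a rule by name or through a label, or write the bare attribute name (for example $start) to get an attribute of the currently executing rule.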

// Reference the rule directly
returnStat : 'return' expr {System.out.println("matched "+$expr.text);} ;
// Use a label
returnStat : 'return' e=expr {System.out.println("matched "+$e.text);} ;
// Reference an attribute of the currently executing rule
returnStat : 'return' expr {System.out.println("first token "+$start.getText());} ;

The rule attributes are:

| Attribute | Type | Description |
|---|---|---|
| text | String | The text matched for a rule or the text matched from the start of the rule up until the point of the $text expression evaluation. Note that this includes the text for all tokens including those on hidden channels, which is what you want because usually that has all the whitespace and comments. When referring to the current rule, this attribute is available in any action including any exception actions. |
| start | Token | The first token to be potentially matched by the rule that is on the main token channel; in other words, this attribute is never a hidden token. For rules that end up matching no tokens, this attribute points at the first token that could have been matched by this rule. When referring to the current rule, this attribute is available to any action within the rule. |
| stop | Token | The last nonhidden channel token to be matched by the rule. When referring to the current rule, this attribute is available only to the after and finally actions. |
| ctx | ParserRuleContext | The rule context object associated with a rule invocation. All of the other attributes are available through this attribute. For example, $ctx.start accesses the start field within the current rule’s context object. It’s the same as $start. |
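How text, start, and stop relate to hidden-channel tokens can be illustrated with a stand-alone sketch. It does not use the ANTLR runtime: Tok, firstOnMainChannel, lastOnMainChannel, and ruleText are hypothetical names, and the value chosen for HIDDEN_CHANNEL is an assumption made for illustration only.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch only: models a token stream as a list of (text, channel) pairs. */
final class RuleTextSketch {
    static final int DEFAULT_CHANNEL = 0; // the default channel, as stated for tokens
    static final int HIDDEN_CHANNEL = 1;  // assumed value, for illustration only

    /** Hypothetical stand-in for a token: just its text and its channel. */
    static final class Tok {
        final String text;
        final int channel;
        Tok(String text, int channel) { this.text = text; this.channel = channel; }
    }

    /** Like start: index of the first main-channel token at or after 'from', or -1. */
    static int firstOnMainChannel(List<Tok> tokens, int from) {
        for (int i = from; i < tokens.size(); i++) {
            if (tokens.get(i).channel == DEFAULT_CHANNEL) return i;
        }
        return -1;
    }

    /** Like stop: index of the last main-channel token at or before 'from', or -1. */
    static int lastOnMainChannel(List<Tok> tokens, int from) {
        for (int i = from; i >= 0; i--) {
            if (tokens.get(i).channel == DEFAULT_CHANNEL) return i;
        }
        return -1;
    }

    /** Like text: every token from start to stop, hidden ones included. */
    static String ruleText(List<Tok> tokens, int start, int stop) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i <= stop; i++) sb.append(tokens.get(i).text);
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Tok> tokens = Arrays.asList(
            new Tok(" ", HIDDEN_CHANNEL), new Tok("return", DEFAULT_CHANNEL),
            new Tok(" ", HIDDEN_CHANNEL), new Tok("x", DEFAULT_CHANNEL),
            new Tok(" ", HIDDEN_CHANNEL));
        int start = firstOnMainChannel(tokens, 0);
        int stop = lastOnMainChannel(tokens, tokens.size() - 1);
        System.out.println("[" + ruleText(tokens, start, stop) + "]"); // prints [return x]
    }
}
```

The whitespace between the two main-channel tokens is kept in the text, while the leading and trailing whitespace is not, because start and stop never point at hidden tokens. This only applies when whitespace is sent to a hidden channel; tokens discarded with -> skip never reach the token stream.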

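4 Dynamically-Scoped Attributes

Programming languages usually rely on lexical scoping to limit where a local variable is visible.

Dynamic scoping instead lets a rule access local variables defined earlier by rules higher up in the invocation chain.

Use the syntax $r::x, where r is the name of a rule in the current invocation chain and x is an attribute of that rule. The current rule can refer to its own attributes the same way.

The following example checks that a variable is defined before it is used. The rules decl and stat both reference the symbols variable declared in block.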

grammar DynScope;

@header {
import java.util.*; // List and ArrayList are used by the locals of rule block
}

prog: block ;

block
    /* List of symbols defined within this block */
    locals [
        List<String> symbols = new ArrayList<String>()
    ]
    : '{' decl* stat+ '}'
      // print out all symbols found in block
      // $block::symbols evaluates to a List as defined in scope
      {System.out.println("symbols="+$symbols);}
    ;

/** Match a declaration and add identifier name to list of symbols */
decl: 'int' ID {$block::symbols.add($ID.text);} ';' ;

/** Match an assignment then test list of symbols to verify
 *  that it contains the variable on the left side of the assignment.
 *  Method contains() is List.contains() because $block::symbols
 *  is a List.
 */
stat: ID '=' INT ';'
      {
      if ( !$block::symbols.contains($ID.text) ) {
          System.err.println("undefined variable: "+$ID.text);
      }
      }
    | block
    ;

ID : [a-z]+ ;
INT : [0-9]+ ;
WS : [ \t\r\n]+ -> skip ;

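A dynamically-scoped attribute differs from a field defined in the @members action in that it is a local variable: every invocation of the rule gets its own copy, and the copy belonging to an inner invocation hides the same-named attribute of an outer invocation.

As a result, $block::symbols only sees the symbols of the most recently invoked block. To reach the same-named attribute of an invocation farther up the call chain, start from the current context object $ctx and walk upward by calling getParent repeatedly.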
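The difference between an innermost-only lookup and a walk up the invocation chain can be sketched without the ANTLR runtime. BlockContext below is a hypothetical stand-in for the context object of one block invocation, and both lookup methods are names invented here. As a simplifying assumption, each block context links directly to the enclosing block; in a real parse tree the parent chain would also contain the contexts of intermediate rules such as stat, which would have to be skipped.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: emulates per-invocation copies of the symbols attribute. */
final class DynamicScopeSketch {

    /** Hypothetical stand-in for the context of one invocation of rule block. */
    static final class BlockContext {
        final BlockContext parent;                       // enclosing block, or null
        final List<String> symbols = new ArrayList<>();  // one copy per invocation

        BlockContext(BlockContext parent) { this.parent = parent; }
        BlockContext getParent() { return parent; }
    }

    /** Innermost-only lookup, analogous to $block::symbols.contains(name). */
    static boolean definedInInnermost(BlockContext ctx, String name) {
        return ctx.symbols.contains(name);
    }

    /** Walks outward via getParent until some enclosing block defines the name. */
    static boolean definedInAnyEnclosing(BlockContext ctx, String name) {
        for (BlockContext c = ctx; c != null; c = c.getParent()) {
            if (c.symbols.contains(name)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        BlockContext outer = new BlockContext(null);
        outer.symbols.add("x");                        // { int x; ...
        BlockContext inner = new BlockContext(outer);
        inner.symbols.add("y");                        //   { int y; x = 1; } }

        System.out.println(definedInInnermost(inner, "x"));    // prints false
        System.out.println(definedInAnyEnclosing(inner, "x")); // prints true
    }
}
```

With the innermost-only lookup, an assignment to x inside the nested block would be reported as undefined even though the outer block declares it; walking the parents resolves it.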
