ANTLR4 literal string handling


I have the following ANTLR4 grammar:

    grammar squirrel;

    program: globalstatement+;

    globalstatement: globalvardef | classdef | functiondef;

    globalvardef: IDENT '=' constantexpr ';';

    classdef: CLASS IDENT '{' classstatement+ '}';

    functiondef: FUNCTION IDENT '(' parameterlist ')' functionbody;

    constructordef: CONSTRUCTOR '(' parameterlist ')' functionbody;

    parameterlist: IDENT (',' IDENT)* | ;

    functionbody: '{' statement* '}';

    classstatement: globalvardef | functiondef | constructordef;

    statement: expression ';';

    expression:
        IDENT # ident |
        IDENT '=' expression # assignment |
        IDENT ('.' IDENT)+ # lookupchain |
        constantexpr # constant |
        IDENT '(' expressionlist ')' # functioncall |
        expression '+' expression # addition;

    constantexpr: INTEGER | STRING;

    expressionlist: expression (',' expression)* | ;

    CONSTRUCTOR: 'constructor';
    CLASS: 'class';
    FUNCTION: 'function';
    COMMENT: '//' .* [\n];
    STRING: '"' CHAR* '"';
    CHAR: [ a-zA-Z0-9];
    INTEGER: [0-9]+;
    IDENT: [a-zA-Z]+;
    WS: [ \t\r\n]+ -> skip;

Now, if I parse this file:

    z = "global variable";

    class base
    {
        z = 10;
    }

everything is fine:

    @0,0:0='z',<16>,1:0
    @1,2:2='=',<1>,1:2
    @2,4:20='"global variable"',<14>,1:4
    @3,21:21=';',<2>,1:21
    @4,26:30='class',<11>,3:0
    @5,32:35='base',<16>,3:6
    @6,38:38='{',<3>,4:0
    @7,42:42='z',<16>,5:1
    @8,44:44='=',<1>,5:3
    @9,46:47='10',<15>,5:5
    @10,48:48=';',<2>,5:7
    @11,51:51='}',<4>,6:0
    @12,56:55='<EOF>',<-1>,8:0

But with this file:

    z = "global variable";

    class base
    {
        z = "10";
    }

I get this:

    @0,0:0='z',<16>,1:0
    @1,2:2='=',<1>,1:2
    @2,4:49='"global variable";\r\n\r\nclass base\r\n{\r\n\tz = "10"',<14>,1:4
    @3,50:50=';',<2>,5:9
    @4,53:53='}',<4>,6:0
    @5,58:57='<EOF>',<-1>,8:0

So it seems that everything between the first " and the last " in the file gets matched as one string literal.

How can I prevent this?

Note that the string rule is matching from the first quote to the last possible quote.

By default, the Kleene operator (*) in ANTLR is greedy. So, change

    STRING: '"' CHAR* '"';

to

    STRING: '"' CHAR*? '"';

to make it non-greedy.
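The difference between the two forms can be reproduced outside ANTLR. The following is a minimal sketch using Python's `re` module, which is only an analogy for the ANTLR lexer and not how ANTLR is implemented. It assumes the character rule is able to match quotes and line breaks (as the token dump in the question suggests), so `.` with `re.DOTALL` stands in for it; the variable names are made up for illustration.

```python
import re

# Input mirroring the second sample file from the question
# (CRLF line endings and a tab-indented member, as in the token dump).
source = 'z = "global variable";\r\n\r\nclass base\r\n{\r\n\tz = "10";\r\n}\r\n'

# Greedy: '.*' consumes as much as possible, so the match runs from the
# first quote to the LAST quote in the input.
greedy = re.compile(r'".*"', re.DOTALL)

# Non-greedy: '.*?' consumes as little as possible, so each match ends at
# the FIRST closing quote that follows the opening one.
lazy = re.compile(r'".*?"', re.DOTALL)

greedy_matches = greedy.findall(source)
lazy_matches = lazy.findall(source)

print(len(greedy_matches), "greedy match(es)")
print(len(lazy_matches), "non-greedy match(es):", lazy_matches)
```

With the greedy pattern the whole span between the outermost quotes comes back as a single match, which corresponds to the single oversized string token in the question; the non-greedy pattern yields the two separate literals.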

