Pascalito Veni, Vidi, Vomi ! | Bonjour,
Je débute en ANTLR et je compte sur vous pour éclaircir quelques points durs...
Mon but est de parser des fichiers C pour en récupérer les typedef et structures...
J'ai donc récupéré la grammaire C.g sur ANTLR que vous pouvez trouver en fin de message.
Je travaille en C#.
[B][U]Question 1 :[/U][/B]
J 'ai quelques warnings à la génération des fichiers Lexer et Parser. Voici les logs :
Code :
- ANTLR Parser Generator Version 3.0 (May 17, 2007) 1989-2007
- warning(200): C.g:468:38: Decision can match input such as "'else'" using multiple alternatives: 1, 2
- As a result, alternative(s) 2 were disabled for that input
- warning(200): C.g:517:4: Decision can match input such as "{'U', 'u'}{'L', 'l'}" using multiple alternatives: 1, 2
- As a result, alternative(s) 2 were disabled for that input
- warning(200): C.g:522:9: Decision can match input such as "'0'..'9'{'E', 'e'}{'+', '-'}'0'..'9'{'D', 'F', 'd', 'f'}" using multiple alternatives: 3, 4
- As a result, alternative(s) 4 were disabled for that input
|
Quelqu'un a une idée ?
[B][U]Question 2 :[/U][/B]
J'aimerais afficher à chaque itération la chaine sur la laquelle on travaille. Comment puis je faire ?
Il faut rajouter un {Console.WriteLIne()} dans la grammaire mais où et avec quoi comme paramètre ?
[B][U]Question 3 :[/U][/B]
Quelqu'un pourrait-il m'expliquer les deux extraitrs suivants car là je suis dépassé...
[U]Extrait 1 :[/U]
Code :
- translation_unit
- : external_declaration+
- ;
- external_declaration
- options {k=1;}
- : ( declaration_specifiers? declarator declaration* '{' )=> function_definition
- | declaration
- ;
|
[U]Extrait 2 :[/U]
Là il commence à y avoir des ? et je ne comprends plus le fond de la règle...
Code :
- declaration
- scope {
- bool isTypedef;
- }
- @init {
- $declaration::isTypedef = false;
- }
- : 'typedef' declaration_specifiers? {$declaration::isTypedef=true;}
- init_declarator_list ';' // special case, looking for typedef
- | declaration_specifiers init_declarator_list? ';'
- ;
- declaration_specifiers
- : ( storage_class_specifier
- | type_specifier
- | type_qualifier
- )+
- ;
- init_declarator_list
- : init_declarator (',' init_declarator)*
- ;
|
Merci beaucoup !
Pascal
[U]Annexe :[/U]
Grammaire utilisée :
Code :
- /** ANSI C ANTLR v3 grammar
-
- Translated from Jutta Degener's 1995 ANSI C yacc grammar by Terence Parr
- July 2006. The lexical rules were taken from the Java grammar.
-
- Jutta says: "In 1985, Jeff Lee published his Yacc grammar (which
- is accompanied by a matching Lex specification) for the April 30, 1985 draft
- version of the ANSI C standard. Tom Stockfisch reposted it to net.sources in
- 1987; that original, as mentioned in the answer to question 17.25 of the
- comp.lang.c FAQ, can be ftp'ed from ftp.uu.net,
- file usenet/net.sources/ansi.c.grammar.Z.
- I intend to keep this version as close to the current C Standard grammar as
- possible; please let me know if you discover discrepancies. Jutta Degener, 1995"
-
- Generally speaking, you need symbol table info to parse C; typedefs
- define types and then IDENTIFIERS are either types or plain IDs. I'm doing
- the min necessary here tracking only type names. This is a good example
- of the use of the global scope (called Symbols). Every rule that declares its usage
- of Symbols pushes a new copy on the stack effectively creating a new
- symbol scope. Also note rule declaration declares a rule scope that
- lets any invoked rule see isTypedef boolean. It's much easier than
- passing that info down as parameters. Very clean. Rule
- direct_declarator can then easily determine whether the IDENTIFIER
- should be declared as a type name.
-
- I have only tested this on a single file, though it is 3500 lines.
-
- This grammar requires ANTLR v3 (3.0b3 or higher)
-
- Terence Parr
- July 2006
-
- ANTLR C# version - Kunle Odutola, November 2006
- */
- grammar C;
- options {
- language=CSharp;
- backtrack=true;
- memoize=true;
- k=2;
- }
- scope Symbols {
- IDictionary types;
- }
- @header {
- }
- @members {
- bool isTypeName(string name)
- {
- for (int i = Symbols_stack.Count-1; i>=0; i--)
- {
- Symbols_scope scope = (Symbols_scope)Symbols_stack[i];
- if ( scope.types.Contains(name) )
- {
- return true;
- }
- }
- return false;
- }
- bool isfinStruct(string token)
- {
- if (token.ToString() == "}" )
- {
- return true;
- }
- return false;
- }
- }
- translation_unit
- scope Symbols; // entire file is a scope
- @init {
- $Symbols::types = new Hashtable();
- }
- : external_declaration+
- ;
- /** Either a function definition or any other kind of C decl/def.
- * The LL(*) analysis algorithm fails to deal with this due to
- * recursion in the declarator rules. I'm putting in a
- * manual predicate here so that we don't backtrack over
- * the entire function. Further, you get a better error
- * as errors within the function itself don't make it fail
- * to predict that it's a function. Weird errors previously.
- * Remember: the goal is to avoid backtrack like the plague
- * because it makes debugging, actions, and errors harder.
- *
- * Note that k=1 results in a much smaller predictor for the
- * fixed lookahead; k=2 made a few extra thousand lines. ;)
- * I'll have to optimize that in the future.
- */
- external_declaration
- options {k=1;}
- : ( declaration_specifiers? declarator declaration* '{' )=> function_definition
- | declaration
- ;
- function_definition
- scope Symbols; // put parameters and locals into same scope for now
- @init {
- $Symbols::types = new Hashtable();
- }
- : declaration_specifiers? declarator
- ( declaration+ compound_statement // K&R style
- | compound_statement // ANSI style
- )
- ;
- declaration
- scope {
- bool isTypedef;
- }
- @init {
- $declaration::isTypedef = false;
- }
- : 'typedef' declaration_specifiers? {$declaration::isTypedef=true;}
- init_declarator_list ';' // special case, looking for typedef
- | declaration_specifiers init_declarator_list? ';'
- ;
- declaration_specifiers
- : ( storage_class_specifier
- | type_specifier
- | type_qualifier
- )+
- ;
- init_declarator_list
- : init_declarator (',' init_declarator)*
- ;
- init_declarator
- : declarator ('=' initializer)?
- ;
- storage_class_specifier
- : 'extern'
- | 'static'
- | 'auto'
- | 'register'
- ;
- type_specifier
- : 'void'
- | 'char'
- | 'short'
- | 'int'
- | 'long'
- | 'float'
- | 'double'
- | 'signed'
- | 'unsigned'
- | struct_or_union_specifier
- | enum_specifier
- | type_id
- ;
- type_id
- : {isTypeName(input.LT(1).Text)}? IDENTIFIER
- {Console.Out.WriteLine("\t" + input.LT(-1).Text + " " + input.LT(1).Text);}
- ;
- struct_or_union_specifier
- options {k=3;}
- scope Symbols; // structs are scopes
- @init {
- $Symbols::types = new Hashtable();
- }
- : struct_or_union IDENTIFIER? '{' struct_declaration_list '}'
- | struct_or_union IDENTIFIER
- ;
- struct_or_union
- : 'struct' {Console.Out.WriteLine("\r\nStruct\r\n{" );}
- | 'union' {Console.Out.WriteLine("\r\nUnion\r\n{" );}
- ;
- struct_declaration_list
- : struct_declaration+
- ;
- struct_declaration
- : specifier_qualifier_list struct_declarator_list ';'
- ;
- specifier_qualifier_list
- : ( type_qualifier | type_specifier )+
- ;
- struct_declarator_list
- : struct_declarator (',' struct_declarator)*
- ;
- struct_declarator
- : declarator (':' constant_expression)?
- | ':' constant_expression
- ;
- enum_specifier
- options {k=3;}
- : 'enum' '{' enumerator_list '}'
- | 'enum' IDENTIFIER '{' enumerator_list '}'
- | 'enum' IDENTIFIER
- ;
- enumerator_list
- : enumerator (',' enumerator)*
- ;
- enumerator
- : IDENTIFIER ('=' constant_expression)?
- ;
- type_qualifier
- : 'const'
- | 'volatile'
- ;
- declarator
- : pointer? direct_declarator
- | pointer
- ;
- direct_declarator
- : ( IDENTIFIER
- {
- if ($declaration.Count>0 && $declaration::isTypedef) {
- $Symbols::types[$IDENTIFIER.Text] = $IDENTIFIER.Text;
- Console.Out.WriteLine("using " + input.LT(-1).Text + " = System." + input.LT(-2).Text + ";" );
- }
- }
- | '(' declarator ')'
- )
- declarator_suffix*
- ;
- declarator_suffix
- : '[' constant_expression ']'
- | '[' ']'
- | '(' parameter_type_list ')'
- | '(' identifier_list ')'
- | '(' ')'
- ;
- pointer
- : '*' type_qualifier+ pointer?
- | '*' pointer
- | '*'
- ;
- parameter_type_list
- : parameter_list (',' '...')?
- ;
- parameter_list
- : parameter_declaration (',' parameter_declaration)*
- ;
- parameter_declaration
- : declaration_specifiers (declarator|abstract_declarator)*
- ;
- identifier_list
- : IDENTIFIER (',' IDENTIFIER)*
- ;
- type_name
- : specifier_qualifier_list abstract_declarator?
- ;
- abstract_declarator
- : pointer direct_abstract_declarator?
- | direct_abstract_declarator
- ;
- direct_abstract_declarator
- : ( '(' abstract_declarator ')' | abstract_declarator_suffix ) abstract_declarator_suffix*
- ;
- abstract_declarator_suffix
- : '[' ']'
- | '[' constant_expression ']'
- | '(' ')'
- | '(' parameter_type_list ')'
- ;
- initializer
- : assignment_expression
- | '{' initializer_list ','? '}'
- ;
- initializer_list
- : initializer (',' initializer)*
- ;
- // E x p r e s s i o n s
- argument_expression_list
- : assignment_expression (',' assignment_expression)*
- ;
- additive_expression
- : (multiplicative_expression) ('+' multiplicative_expression | '-' multiplicative_expression)*
- ;
- multiplicative_expression
- : (cast_expression) ('*' cast_expression | '/' cast_expression | '%' cast_expression)*
- ;
- cast_expression
- : '(' type_name ')' cast_expression
- | unary_expression
- ;
- unary_expression
- : postfix_expression
- | '++' unary_expression
- | '--' unary_expression
- | unary_operator cast_expression
- | 'sizeof' unary_expression
- | 'sizeof' '(' type_name ')'
- ;
- postfix_expression
- : primary_expression
- ( '[' expression ']'
- | '(' ')'
- | '(' argument_expression_list ')'
- | '.' IDENTIFIER
- | '*' IDENTIFIER
- | '->' IDENTIFIER
- | '++'
- | '--'
- )*
- ;
- unary_operator
- : '&'
- | '*'
- | '+'
- | '-'
- | '~'
- | '!'
- ;
- primary_expression
- : IDENTIFIER
- | constant
- | '(' expression ')'
- ;
- constant
- : HEX_LITERAL
- | OCTAL_LITERAL
- | DECIMAL_LITERAL
- | CHARACTER_LITERAL
- | STRING_LITERAL
- | FLOATING_POINT_LITERAL
- ;
- /////
- expression
- : assignment_expression (',' assignment_expression)*
- ;
- constant_expression
- : conditional_expression
- ;
- assignment_expression
- : lvalue assignment_operator assignment_expression
- | conditional_expression
- ;
- lvalue
- : unary_expression
- ;
- assignment_operator
- : '='
- | '*='
- | '/='
- | '%='
- | '+='
- | '-='
- | '<<='
- | '>>='
- | '&='
- | '^='
- | '|='
- ;
- conditional_expression
- : logical_or_expression ('?' expression ':' conditional_expression)?
- ;
- logical_or_expression
- : logical_and_expression ('||' logical_and_expression)*
- ;
- logical_and_expression
- : inclusive_or_expression ('&&' inclusive_or_expression)*
- ;
- inclusive_or_expression
- : exclusive_or_expression ('|' exclusive_or_expression)*
- ;
- exclusive_or_expression
- : and_expression ('^' and_expression)*
- ;
- and_expression
- : equality_expression ('&' equality_expression)*
- ;
- equality_expression
- : relational_expression (('=='|'!=') relational_expression)*
- ;
- relational_expression
- : shift_expression (('<'|'>'|'<='|'>=') shift_expression)*
- ;
- shift_expression
- : additive_expression (('<<'|'>>') additive_expression)*
- ;
- // S t a t e m e n t s
- statement
- : labeled_statement
- | compound_statement
- | expression_statement
- | selection_statement
- | iteration_statement
- | jump_statement
- ;
- labeled_statement
- : IDENTIFIER ':' statement
- | 'case' constant_expression ':' statement
- | 'default' ':' statement
- ;
- compound_statement
- scope Symbols; // blocks have a scope of symbols
- @init {
- $Symbols::types = new Hashtable();
- }
- : '{' declaration* statement_list? '}'
- ;
- statement_list
- : statement+
- ;
- expression_statement
- : ';'
- | expression ';'
- ;
- selection_statement
- : 'if' '(' expression ')' statement (options {k=1; backtrack=false;}:'else' statement)?
- | 'switch' '(' expression ')' statement
- ;
- iteration_statement
- : 'while' '(' expression ')' statement
- | 'do' statement 'while' '(' expression ')' ';'
- | 'for' '(' expression_statement expression_statement expression? ')' statement
- ;
- jump_statement
- : 'goto' IDENTIFIER ';'
- | 'continue' ';'
- | 'break' ';'
- | 'return' ';'
- | 'return' expression ';'
- ;
- IDENTIFIER
- : LETTER (LETTER|'0'..'9')*
- ;
- fragment
- LETTER
- : '$'
- | 'A'..'Z'
- | 'a'..'z'
- | '_'
- ;
- CHARACTER_LITERAL
- : '\'' ( EscapeSequence | ~('\''|'\\') ) '\''
- ;
- STRING_LITERAL
- : '"' ( EscapeSequence | ~('\\'|'"') )* '"'
- ;
- HEX_LITERAL : '0' ('x'|'X') HexDigit+ IntegerTypeSuffix? ;
- DECIMAL_LITERAL : ('0' | '1'..'9' '0'..'9'*) IntegerTypeSuffix? ;
- OCTAL_LITERAL : '0' ('0'..'7')+ IntegerTypeSuffix? ;
- fragment
- HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ;
- fragment
- IntegerTypeSuffix
- : ('u'|'U')? ('l'|'L')
- | ('u'|'U') ('l'|'L')?
- ;
- FLOATING_POINT_LITERAL
- : ('0'..'9')+ '.' ('0'..'9')* Exponent? FloatTypeSuffix?
- | '.' ('0'..'9')+ Exponent? FloatTypeSuffix?
- | ('0'..'9')+ Exponent FloatTypeSuffix?
- | ('0'..'9')+ Exponent? FloatTypeSuffix
- ;
- fragment
- Exponent : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
- fragment
- FloatTypeSuffix : ('f'|'F'|'d'|'D') ;
- fragment
- EscapeSequence
- : '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
- | OctalEscape
- ;
- fragment
- OctalEscape
- : '\\' ('0'..'3') ('0'..'7') ('0'..'7')
- | '\\' ('0'..'7') ('0'..'7')
- | '\\' ('0'..'7')
- ;
- fragment
- UnicodeEscape
- : '\\' 'u' HexDigit HexDigit HexDigit HexDigit
- ;
- WS : (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;}
- ;
- COMMENT
- : '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
- ;
- LINE_COMMENT
- : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
- ;
- // ignore #line info for now
- LINE_COMMAND
- : '#' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
- ;
|
|