Generating Intermediate Code Using Syntax-Directed Translation in ANTLR


This question is less about a specific bug and more about a gap in my understanding.

I have this ANTLR grammar (a combined parser and lexer):

grammar Compiler;



prog

: Class Program '{' field_decls method_decls '}'

;



field_decls returns [String s1]

: field_decls field_decl ';'

{

  $s1 = $field_decl.s2;

}

| field_decls inited_field_decl ';'

|

;



field_decl returns [String s2]

: field_decl ',' Ident

| field_decl ',' Ident '[' num ']'

| Type Ident

{

  System.out.println($Ident.text);

  $s2 = $Ident.text;

}

| Type Ident '[' num ']'

{

  System.out.println($Ident.text+"["+"]");

  $s2 = $Ident.text;

}

;



inited_field_decl

: Type Ident '=' literal

;



method_decls

: method_decls method_decl

|

;



method_decl

: Void Ident '(' params ')' block

| Type Ident '(' params ')' block

;



params

: Type Ident nextParams

|

;



nextParams

: ',' Type Ident nextParams

|

;



block

: '{' var_decls statements '}'

;



var_decls

: var_decls var_decl

|

;



var_decl

: Type Ident ';'

;



statements

: statement statements

|

;



statement

: location eqOp expr ';'

| If '(' expr ')' block

| If '(' expr ')' block Else block

| While '(' expr ')' statement

| Switch expr '{' cases '}'

| Ret ';'

| Ret '(' expr ')' ';'

| Brk ';'

| Cnt ';'

| block

| methodCall ';'

;



cases

: Case literal ':' statements cases

| Case literal ':' statements

;



methodCall

: Ident '(' args ')'

| Callout '(' Str calloutArgs ')'

;



args

: someArgs

|

;



someArgs

: someArgs ',' expr

| expr

;



calloutArgs

: calloutArgs ',' expr

| calloutArgs ',' Str

|

;



expr

: literal

| location

| '(' expr ')'

| SubOp expr

| '!' expr

| expr AddOp expr

| expr MulDiv expr

| expr SubOp expr

| expr RelOp expr

| expr AndOp expr

| expr OrOp expr

| methodCall

;



location

:Ident

| Ident '[' expr ']'

;



num

: DecNum

| HexNum

;



literal

: num

| Char

| BoolLit

;



eqOp

: '='

| AssignOp

;



//-----------------------------------------------------------------------------------------------------------

fragment Delim

: ' '

| '\t'

| '\n'

;



fragment Letter

: [a-zA-Z]

;



fragment Digit

: [0-9]

;



fragment HexDigit

: Digit

| [a-f]

| [A-F]

;



fragment Alpha

: Letter

| '_'

;



fragment AlphaNum

: Alpha

| Digit

;



WhiteSpace

: Delim+ -> skip

;



Char

: '\'' ~('\\') '\''

| '\'\\' . '\''

;



Str

:'"' ((~('\\' | '"')) | ('\\'.))* '"'

;



Class

: 'class'

;



Program

: 'Program'

;



Void

: 'void'

;



If

: 'if'

;



Else

: 'else'

;



While

: 'while'

;



Switch

: 'switch'

;



Case

: 'case'

;



Ret

: 'return'

;



Brk

: 'break'

;



Cnt

: 'continue'

;



Callout

: 'callout'

;



DecNum

: Digit+

;



HexNum

: '0x'HexDigit+

;



BoolLit

: 'true'

| 'false'

;



Type

: 'int'

| 'boolean'

;



Ident

: Alpha AlphaNum*

;



RelOp

: '<='

| '>='

| '<'

| '>'

| '=='

| '!='

;



AssignOp

: '+='

| '-='

;



MulDiv

: '*'

| '/'

| '%'

;



AddOp

: '+'

;



SubOp

: '-'

;



AndOp

: '&&'

;



OrOp

: '||'

;

Basically, we need to generate intermediate code using syntax-directed translation. As far as I know, this means adding semantic rules (actions) to the parser grammar. The generated output then has to be written to .csv files.

There are three files: symbols.csv, symtable.csv and instructions.csv.

In symbols.csv, the format of each row is:

int id; //serial no. of symbol, unique

int tabid; //id no. of symbol table

string name; //symbol name

enum types {INT, CHAR, BOOL, STR, VOID, LABEL, INVALID} ty; //symbol type

enum scope {GLOBAL, LOCAL, CONST, INVALID} sc; //symbol scope

boolean isArray; //is it an array variable

int arrSize; //array size, if applicable

boolean isInited; //is initialized

union initVal {

    int i;

    boolean b;

} in; //initial value, if applicable
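A minimal Java sketch of one symbols.csv row, assuming the fields listed above map one-to-one onto a plain class. The class name `Symbol`, the method `toCsvRow` and the choice to store the `initVal` union as a single `int` (booleans as 0/1) are assumptions for illustration, not part of the assignment.

```java
import java.util.StringJoiner;

/** One row of symbols.csv; field names follow the row format listed above. */
class Symbol {
    enum Type { INT, CHAR, BOOL, STR, VOID, LABEL, INVALID }
    enum Scope { GLOBAL, LOCAL, CONST, INVALID }

    final int id;            // serial no. of symbol, unique
    final int tabid;         // id of the owning symbol table
    final String name;       // symbol name (or literal text for constants)
    final Type ty;
    final Scope sc;
    final boolean isArray;
    final int arrSize;
    final boolean isInited;
    final int initVal;       // assumption: Java has no union, so booleans are stored as 0/1

    Symbol(int id, int tabid, String name, Type ty, Scope sc,
           boolean isArray, int arrSize, boolean isInited, int initVal) {
        this.id = id;
        this.tabid = tabid;
        this.name = name;
        this.ty = ty;
        this.sc = sc;
        this.isArray = isArray;
        this.arrSize = arrSize;
        this.isInited = isInited;
        this.initVal = initVal;
    }

    /** Formats the row with ", " between fields and a trailing comma, as in the sample output. */
    String toCsvRow() {
        StringJoiner row = new StringJoiner(", ", "", ",");
        row.add(String.valueOf(id))
           .add(String.valueOf(tabid))
           .add(name)
           .add(ty.name())
           .add(sc.name())
           .add(String.valueOf(isArray))
           .add(String.valueOf(arrSize))
           .add(String.valueOf(isInited))
           .add(String.valueOf(initVal));
        return row.toString();
    }
}

class SymbolDemo {
    public static void main(String[] args) {
        Symbol w = new Symbol(4, 0, "w", Symbol.Type.INT, Symbol.Scope.GLOBAL,
                              false, 0, true, 0);
        System.out.println(w.toCsvRow()); // 4, 0, w, INT, GLOBAL, false, 0, true, 0,
    }
}
```

Keeping the formatting in one method means the grammar actions only have to construct `Symbol` objects; the CSV layout can change without touching the grammar.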

In symtable.csv, the format of each row is:

int id; //symbol table serial no., unique

int parent; //parent symbol table serial no.
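The table/parent pairs can be produced by a stack of open scopes. The sketch below is one possible shape for that, not a prescribed design: `SymTable`, `SymStack`, `enterScope`, `exitScope`, `declare` and `lookup` are all hypothetical names. It assumes a new table is opened whenever a nested scope starts and that name lookup searches from the innermost scope outwards.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One row of symtable.csv plus the names declared in that scope. */
class SymTable {
    final int id;
    final int parent;                                   // -1 for the outermost table
    final Map<String, Integer> symbolIds = new HashMap<>();

    SymTable(int id, int parent) {
        this.id = id;
        this.parent = parent;
    }

    String toCsvRow() {
        return id + ", " + parent + ",";
    }
}

/** Hypothetical "symstack": the currently open scopes, innermost on top. */
class SymStack {
    private final Deque<SymTable> open = new ArrayDeque<>();
    private final List<SymTable> all = new ArrayList<>();

    /** Opens a new scope whose parent is the current innermost scope. */
    SymTable enterScope() {
        int parent = open.isEmpty() ? -1 : open.peek().id;
        SymTable table = new SymTable(all.size(), parent);
        all.add(table);
        open.push(table);
        return table;
    }

    void exitScope() {
        open.pop();
    }

    void declare(String name, int symbolId) {
        open.peek().symbolIds.put(name, symbolId);
    }

    /** Innermost declaration wins; returns -1 when the name is undeclared. */
    int lookup(String name) {
        for (SymTable table : open) {                   // ArrayDeque iterates from the top of the stack
            Integer id = table.symbolIds.get(name);
            if (id != null) {
                return id;
            }
        }
        return -1;
    }

    List<SymTable> tables() {
        return all;
    }
}

class SymStackDemo {
    public static void main(String[] args) {
        SymStack scopes = new SymStack();
        scopes.enterScope();                            // table 0: class body
        scopes.declare("x", 0);
        scopes.enterScope();                            // table 1: method parameters and body
        scopes.declare("n", 6);
        scopes.declare("a", 7);
        scopes.enterScope();                            // table 2: nested block
        scopes.declare("n", 9);
        System.out.println(scopes.lookup("n"));         // inner n shadows the parameter
        scopes.exitScope();
        System.out.println(scopes.lookup("n"));         // back to the parameter
        for (SymTable table : scopes.tables()) {
            System.out.println(table.toCsvRow());
        }
    }
}
```

Tables are kept in `all` after their scope closes, so every table can still be written out at the end of the parse.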

In instructions.csv, the format of each row is:

int id; //serial no., unique

int res; //serial no. of result symbol

enum opcode {ADD, SUB, MUL, DIV, NEG, READ, WRITE, ASSIGN, GOTO, LT, GT, LE, GE, EQ, NE, PARAM, CALL, RET, LABEL} opc; //operation type

int op1; //serial no. of first operand symbol

int op2; //serial no. of second operand symbol
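An instruction row can be modelled the same way as a symbol row. This is only a sketch: the class name `Instruction`, the extra `comment` field and the `toCsvRow` method are assumptions, with the trailing `#` comment added purely for readability.

```java
/** One row of instructions.csv; field names follow the row format listed above. */
class Instruction {
    enum Opcode {
        ADD, SUB, MUL, DIV, NEG, READ, WRITE, ASSIGN, GOTO,
        LT, GT, LE, GE, EQ, NE, PARAM, CALL, RET, LABEL
    }

    final int id;          // serial no., unique
    final int res;         // serial no. of result symbol, or -1 if unused
    final Opcode opc;
    final int op1;         // serial no. of first operand symbol, or -1
    final int op2;         // serial no. of second operand symbol, or -1
    final String comment;  // assumption: free-text note appended after '#'

    Instruction(int id, int res, Opcode opc, int op1, int op2, String comment) {
        this.id = id;
        this.res = res;
        this.opc = opc;
        this.op1 = op1;
        this.op2 = op2;
        this.comment = comment;
    }

    String toCsvRow() {
        return id + ", " + res + ", " + opc + ", " + op1 + ", " + op2 + ", #" + comment;
    }
}

class InstructionDemo {
    public static void main(String[] args) {
        Instruction add = new Instruction(5, 9, Instruction.Opcode.ADD, 7, 10, "n = a + 1");
        System.out.println(add.toCsvRow()); // 5, 9, ADD, 7, 10, #n = a + 1
    }
}
```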

As an example, let's say we have this input:

class Program {

    int x;

    int y, z;

    int w = 0;

    void main (int n) {

        int a;

        a = 0;

        while (a < n) {

            int n;

            n = a + 1;

            a = n;

        }

        callout("printf", "n = %d\n", n);

        return n;

    }

}

symbols.csv should look like this:

0, 0, x, INT, GLOBAL, false, 0, false, 0,

1, 0, y, INT, GLOBAL, false, 0, false, 0,

2, 0, z, INT, GLOBAL, false, 0, false, 0,

3, 0, 0, INT, CONST, false, 0, false, 0,

4, 0, w, INT, GLOBAL, false, 0, true, 0,

5, 0, main, LABEL, GLOBAL, false, 0, false, 0,

6, 1, n, INT, LOCAL, false, 0, false, 0,

7, 1, a, INT, LOCAL, false, 0, false, 0,

8, 1, 0, INT, CONST, false, 0, false, 0,

9, 2, n, INT, LOCAL, false, 0, false, 0,

10, 2, 1, INT, CONST, false, 0, false, 0,

11, 1, "printf", STR, CONST, false, 0, false, 0,

12, 1, "n = %d\n", STR, CONST, false, 0, false, 0,

13, 1, 2, INT, CONST, false, 0, false, 0,

symtable.csv should look like this:

0, -1,

1, 0,

2, 1,

instructions.csv should look like this:

0, 4, ASSIGN, 3, -1, #w = 0

1, 5, LABEL, -1, -1, #main:

2, 7, ASSIGN, 8, -1, #a = 0

3, 5, LT, 7, 6, #if a<n goto 5

4, 8, GE, 7, 6, #iffalse a<n goto 8

5, 9, ADD, 7, 10, #n = a + 1

6, 7, ASSIGN, 9, -1, #a = n

7, 2, GOTO, -1, -1, #goto 3

8, -1, PARAM, 12, -1, #"n = %d\n"

9, -1, PARAM, 6, -1, #n

10, -1, CALL, 11, 13, #callout("printf", "n = %d\n", n);

11, -1, RET, 6, -1, # return n
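In the rows above, the conditional jumps for the `while` loop store their target instruction number in the `res` column, and the exit target is only known once the loop body has been emitted. One common way to handle that is backpatching: emit the jump with a placeholder, then fill in the target later. The sketch below reproduces the loop rows under that assumption. `Quad`, `CodeBuffer`, `emit` and `patch` are hypothetical names, and the backward jump targets instruction 3, following that row's `#goto 3` comment.

```java
import java.util.ArrayList;
import java.util.List;

/** A mutable instruction; res is patched once a jump target is known. */
class Quad {
    int res;
    final String opc;
    final int op1;
    final int op2;

    Quad(int res, String opc, int op1, int op2) {
        this.res = res;
        this.opc = opc;
        this.op1 = op1;
        this.op2 = op2;
    }
}

/** Hypothetical instruction list with serial numbers equal to list positions. */
class CodeBuffer {
    private final List<Quad> code = new ArrayList<>();

    /** Serial number the next emitted instruction will receive. */
    int nextId() {
        return code.size();
    }

    /** Appends an instruction and returns its serial number. */
    int emit(int res, String opc, int op1, int op2) {
        code.add(new Quad(res, opc, op1, op2));
        return code.size() - 1;
    }

    /** Fills in the jump target of an already emitted instruction. */
    void patch(int id, int target) {
        code.get(id).res = target;
    }

    String row(int id) {
        Quad q = code.get(id);
        return id + ", " + q.res + ", " + q.opc + ", " + q.op1 + ", " + q.op2 + ",";
    }
}

class WhileDemo {
    /** Emits the example program up to the end of the while loop. */
    static CodeBuffer build() {
        CodeBuffer buf = new CodeBuffer();
        buf.emit(4, "ASSIGN", 3, -1);              // w = 0
        buf.emit(5, "LABEL", -1, -1);              // main:
        buf.emit(7, "ASSIGN", 8, -1);              // a = 0

        int head = buf.nextId();                   // loop condition starts here
        int ifTrue = buf.emit(-1, "LT", 7, 6);     // target unknown yet
        int ifFalse = buf.emit(-1, "GE", 7, 6);    // target unknown yet
        buf.patch(ifTrue, buf.nextId());           // true branch: first body instruction

        buf.emit(9, "ADD", 7, 10);                 // n = a + 1
        buf.emit(7, "ASSIGN", 9, -1);              // a = n
        buf.emit(head, "GOTO", -1, -1);            // jump back to the condition

        buf.patch(ifFalse, buf.nextId());          // false branch: first instruction after the loop
        return buf;
    }

    public static void main(String[] args) {
        CodeBuffer buf = build();
        for (int id = 0; id < buf.nextId(); id++) {
            System.out.println(buf.row(id));
        }
    }
}
```

The same two-step pattern (emit with a placeholder, patch when the target is known) would also cover `if`/`else`, and `break`/`continue` if a list of pending jumps is kept per loop.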

Simply put, I am not sure where to start. I understand that I must add semantic rules to my parser grammar so that it produces output like the files shown above. From my own research, I also gather that I need to create Java classes for the symbols, the symbol tables and a symbol-table stack (symstack). I am very new to ANTLR and would appreciate it if someone experienced with it could point me in the right direction.
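Whatever classes end up holding the symbols, tables and instructions, the last step is writing one formatted row per line. A minimal sketch of that step using only the standard library is shown here. The class `CsvDump`, its methods and the temporary output directory are assumptions; the rows are passed in as already formatted strings.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class CsvDump {
    /** Writes one row per line (UTF-8), replacing any existing file, and returns the file path. */
    static Path write(Path dir, String fileName, List<String> rows) {
        try {
            return Files.write(dir.resolve(fileName), rows);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the sample symbol-table rows to a temporary directory and reads them back. */
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("ir-out");
            List<String> tableRows = Arrays.asList("0, -1,", "1, 0,", "2, 1,");
            Path file = write(dir, "symtable.csv", tableRows);
            return Files.readAllLines(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Collecting rows in memory and writing each file once after the parse finishes keeps file handling out of the grammar actions.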

Thank you in advance for any help.

P.S. My lexer and parser are based on the tiny C-like language specified below.

Tiny C-Like Language:

program

:'class Program {'field_decl* method_decl*'}'



field_decl

: type (id | id'['int_literal']') ( ',' id | id'['int_literal']')*';'

| type id '=' literal ';'



method_decl

: (type | 'void') id'('( (type id) ( ','type id)*)? ')'block



block

: '{'var_decl* statement*'}'



var_decl

: type id(','id)* ';'



type

: 'int'

| 'boolean'



statement

: location assign_op expr';'

| method_call';'

| 'if ('expr')' block ('else' block  )?

| 'switch' expr '{'('case' literal ':' statement*)+'}'

| 'while (' expr ')' statement

| 'return' ( expr )? ';'

| 'break ;'

| 'continue ;'

| block



assign_op

: '='

| '+='

| '-='



method_call

: method_name '(' (expr ( ',' expr )*)? ')'

| 'callout (' string_literal ( ',' callout_arg )* ')'



method_name

: id



location

: id

| id '[' expr ']'



expr

: location

| method_call

| literal

| expr bin_op expr

| '-' expr

| '!' expr

| '(' expr ')'



callout_arg

: expr

| string_literal



bin_op

: arith_op

| rel_op

| eq_op

| cond_op



arith_op

: '+'

| '-'

| '*'

| '/'

| '%'



rel_op

: '<'

| '>'

| '<='

| '>='



eq_op

: '=='

| '!='



cond_op

: '&&'

| '||'



literal

: int_literal

| char_literal

| bool_literal



id

: alpha alpha_num*



alpha

: ['a'-'z''A'-'Z''_']



alpha_num

: alpha

| digit 



digit

: ['0'-'9']



hex_digit

: digit

| ['a'-'f''A'-'F']



int_literal

: decimal_literal

| hex_literal



decimal_literal

: digit+



hex_literal

: '0x' hex_digit+



bool_literal

: 'true'

| 'false'



char_literal

: '‘'char'’'



string_literal

: '“'char*'”'

asked Nov 24 '18 at 6:40

J.Khelly


csv compiler-construction antlr


1 Answer

This depends on which version of ANTLR you're using:

In ANTLR 3:
- The most common approach is to use tree construction instructions to create a (modified) parse tree / AST, then walk that tree as needed.
- A less common approach is to embed actions (in the target language) directly into grammar rules to capture and interpret the parsed input.

In ANTLR 4, you use a Listener or a Visitor to process the parsed input.
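To illustrate the visitor idea without depending on generated code, here is a sketch over a tiny hand-written expression tree. `ExprNode`, `Leaf`, `BinOp` and `ExprTranslator` are hypothetical stand-ins: with a real ANTLR 4 grammar named `Compiler`, running the tool with `-visitor` generates a `CompilerBaseVisitor<T>` whose `visit...` methods would play the role of `visitLeaf` and `visitBinOp`. Each visit returns the serial number of the symbol that holds the subexpression's value, and binary operators allocate a fresh temporary symbol (an assumption; the sample output in the question never needs temporaries).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a generated parse-tree node. */
abstract class ExprNode {
    abstract int accept(ExprTranslator visitor);
}

/** A location or literal that already has a symbol id. */
class Leaf extends ExprNode {
    final int symbolId;

    Leaf(int symbolId) {
        this.symbolId = symbolId;
    }

    int accept(ExprTranslator visitor) {
        return visitor.visitLeaf(this);
    }
}

/** A binary operation such as ADD or MUL. */
class BinOp extends ExprNode {
    final String opc;
    final ExprNode left;
    final ExprNode right;

    BinOp(String opc, ExprNode left, ExprNode right) {
        this.opc = opc;
        this.left = left;
        this.right = right;
    }

    int accept(ExprTranslator visitor) {
        return visitor.visitBinOp(this);
    }
}

/** Post-order translation: operands first, then the instruction that combines them. */
class ExprTranslator {
    final List<String> rows = new ArrayList<>();
    private int nextTemp;

    ExprTranslator(int firstFreeSymbolId) {
        this.nextTemp = firstFreeSymbolId;
    }

    int visitLeaf(Leaf node) {
        return node.symbolId;
    }

    int visitBinOp(BinOp node) {
        int left = node.left.accept(this);
        int right = node.right.accept(this);
        int res = nextTemp++;                        // fresh temporary for the result
        rows.add(rows.size() + ", " + res + ", " + node.opc + ", " + left + ", " + right + ",");
        return res;
    }
}

class VisitorDemo {
    public static void main(String[] args) {
        // (a + 1) * n with a = symbol 7, constant 1 = symbol 10, n = symbol 6
        ExprNode expr = new BinOp("MUL",
                new BinOp("ADD", new Leaf(7), new Leaf(10)),
                new Leaf(6));
        ExprTranslator translator = new ExprTranslator(14);
        int result = expr.accept(translator);
        translator.rows.forEach(System.out::println);
        System.out.println("result symbol: " + result);
    }
}
```

The embedded-action style from the question's grammar can express the same thing with `returns [int id]` on the `expr` rule; the visitor keeps the translation code in ordinary Java files instead of inside the grammar.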

answered Nov 25 '18 at 21:09

Jiri Tousek

10.6k52240


