5/5 - (1 vote)

Assignment 4: Static Semantics (or Contextual Analysis)

COMP3131/9102: Programming Languages and Compilers

Term 1, 2023

Worth: 30/100 marks Due: 8:00pm Wednesday 5 April 2023

Revision Log Nothing at this stage. End Of Revision Log

FAQs

Please read the FAQs for this assignment.
Speciﬁcation

You are to implement a semantic or contextual analyser that checks that the program conforms to the source language’s context-sensitive constraints (i.e., static semantics) according to the VC Language De ﬁnition. This part of the compilation process is referred to as the semantic or contextual analysis.

There are two types of context-sensitive constraints:

Scope rules: These are the rules governing declarations (deﬁned occurrences of identiﬁers) and applied occurrences of identiﬁers.

Type rules: These are the rules that allow us to infer the types of language constructs and to decide whether each construct has a valid type.

Therefore, the semantic analysis consists of two subphases:

Identiﬁcation: applying the scope rules to relate each applied occurrence of an identiﬁer to its declaration, if any.

Type checking: applying the type rules to infer the type of each construct and comparing that type with the expected type in the context.

This assignment involves developing a visitor class (named Checker) that implements the set of visitor methods in the interface VC.ASTs.Visitor.java. Your semantic analyser will be a visitor object that performs both identiﬁcation and type checking in one pass by visiting the AST for the program being compiled in the depth-ﬁrst traversal.

In the case of ill-typed constructs, appropriate error messages as speciﬁed below must be reported.

As before, if no lexical, syntactic or semantic error is found, your compiler should announce success by printing:

Compilation was successful.

Otherwise, the following message should be printed:

Compilation was unsuccessful.

Identiﬁcation

This subphase of the semantic analyser has been implemented for you. Identiﬁcation relates each applied occurrence of an identiﬁer to its declaration, if any, by applying the VC’s scope rules. The standard method of implementing this subphase is to employ a symbol table that associates identiﬁers with their attributes. In the VC compiler, the attribute for an identiﬁer is represented by a pointer (an inherited attribute) to the subtree that represents the declaration (GlobalVarDecl, LocalVarDecl or FuncDecl) of the identiﬁer. This attribute is represented by the instance variable decl in VC.ASTs.Ident.java:

package VC.ASTs;

import VC.Scanner.SourcePosition; public class Ident extends Terminal {

public AST decl;

public Ident(String value , SourcePosition position) { super (value, position);

decl = null;

}

public Object visit(Visitor v, Object o) { return v.visitIdent(this, o);

}

There is only one symbol table organised as a stack for storing the identiﬁers in all scopes. Two classes are used:

VC.Checker.IdEntry.java: deﬁning what a symbol table entry looks like.

VC.Checker.SymbolTable.java: deﬁning all methods required for symbol table management.

The symbol table methods are called at the following visitor methods of the class Checker:

visitGlobalVarDecl, visitLocalVarDecl and and visitFuncDecl: whenever the semantic analyser visits the declaration at a subtree, it will call insert to enter the identiﬁer, its scope level and a pointer to the subtree into the symbol table.

To detect duplicate declarations using the same identiﬁer, you call the method retrieveOneLevel. This method returns a pointer to the identiﬁer entry if the identiﬁer was declared before in the current scope and null otherwise.

visitIdent: whenever the semantic analyser visits an applied occurrence of an identiﬁer I, it will call retrieve with the identiﬁer I and thus retrieves the pointer to the subtree representing its declaration. It will then decorate the identiﬁer node for I by establishing a link to this declaration. This link is null if no declaration is found. This fact will be used by you to detect undeclared variables.

visitCompoundStmt: whenever the semantic analyser visits a block, it calls openScope at the start of the block to open a new scope and closeScope at the end of the block to close the current scope.

The symbol table is not empty before the semantic analysis for the program begins. Many languages contain a standard collection of pre- deﬁned constants, variables, types and functions that the programmer can use without having to introduce themselves. The VC standard environment includes only 11 built-in functions and a few primitive types. The “declarations” of these functions do not appear in the AST for the program being compiled! In order to make it possible for a link from a call, say, putIntLn, to be established with its “declaration”, the following AST for the function is constructed by the VC compiler:

The name of the parameter is insigniﬁcant and is thus set to “”.

The ASTs for the other eight built-in functions are similarly constructed.

Before analysing the program, the semantic analyser initialises the symbol table with the identiﬁers for the 11 functions as follows:

Ident	Level	Attr
getInt	1	ptr to the getInt AST
putInt	1	ptr to the putInt AST
putIntLn	1	ptr to the putIntLn AST
getFloat	1	ptr to the getFloat AST
putFloat	1	ptr to the putFloat AST
putFloatLn	1	ptr to the putFloatLn AST
putBool	1	ptr to the putBool AST
putBoolLn	1	ptr to the putBoolLn AST
putString	1	ptr to the putString AST
putStringLn	1	ptr to the putStringLn AST
putLn	1	ptr to the putLn AST
the identiﬁers in the program

You are required to read

VC.Checker.IdEntry.java,

VC.Checker.SymbolTable.java,

VC.StdEnvironment.java, and

the method establishEnvironemnt in AST.Checker.Checker.java to ensure your understanding of the identiﬁcation subphase.

Error Messages

On detecting some semantic errors, your checker must print some error messages.

Your error messages must be taken from the following array that is already deﬁned for you in Checker.java:

private String errMesg[] = {

“*0: main function is missing”,

“*1: return type of main is not int”,

// defined occurrences of identifiers

// for global, local and parameters “*2: identifier redeclared”,

“*3: identifier declared void”, “*4: identifier declared void[]”,

// applied occurrences of identifiers “*5: identifier undeclared”,

// assignments

“*6: incompatible type for =”,

“*7: invalid lvalue in assignment”,

// types for expressions

“*8: incompatible type for return”,

“*9: incompatible type for this binary operator”, “*10: incompatible type for this unary operator”,

// scalars

“*11: attempt to use an array/fuction as a scalar”,

// arrays

“*12: attempt to use a scalar/function as an array”, “*13: wrong type for element in array initialiser”,

“*14: invalid initialiser: array initialiser for scalar”, “*15: invalid initialiser: scalar initialiser for array”, “*16: excess elements in array initialiser”,

“*17: array subscript is not an integer”, “*18: array size missing”,

// functions

“*19: attempt to reference a scalar/array as a function”,

// conditional expressions in if, for and while “*20: if conditional is not boolean”,

“*21: for conditional is not boolean”, “*22: while conditional is not boolean”,

// break and continue

“*23: break must be in a while/for”, “*24: continue must be in a while/for”,

// parameters

“*25: too many actual parameters”, “*26: too few actual parameters”,

“*27: wrong type for actual parameter”,

// reserved for errors that I may have missed “*28: misc 1”,

“*29: misc 2”,

// the following are not required “*30: statement(s) not reached”, “*31: missing return statement”,

};

The error messages 28 — 29 are reserved for some errors that I might have missed. They can be added in these slots later. If there there is a type error detected at the subtree rooted at “ast”, you can report an error message as follows:

reporter.reportError(“errMesg[index] +. blah blah “, “”, ast.position);

See ErrorReporter.java regarding how the position information will be printed.

It is also possible to pass a non-empty string as the 2nd argument so that it is printed in the position marked by % as follows:

reporter.reportError(errMesg[index] + “: %”, “blah blah”, ast.position);

Again you can read ErrorReporter.java to ﬁnd out how this works.

You are not allowed to modify the error message array errMesg. On detecting a semantic error, the error message your checker reports must contain one of the error strings deﬁned in errMesg as a substring. As demonstrated in the supplied solution ﬁles, t1.sol and t2.sol, you can certainly add more “words” in an oﬃcial error message to make it more informative.

To avoid printing a cascade of spurious error messages, you are advised to use the simple error recovery technique explained in Solution 2 to Question 2 in Week 9 Tutorial. Essentially, the compiler assigns StdEnvironment.errorType to every ill-typed expression and prints an error message. However, the compiler will refrain from printing any error messages for an expression if any of its subexpressions has the type StdEnvironment.errorType.

Personally, the Java compiler’s error handling is very good. It is helpful to run Java on similar test cases to examine how various semantic errors are detected and reported.
Writing Your Type Checker

Set up your compiling environment as speciﬁed in Ass ignment 1 spec. Download and install the supporting classes for this assignment as follows:
1. Copy ~cs3131/VC/Checker/Checker.zip into your VC directory
2. Set your current working directory as VC.
3. Extract the bundled files in the zip file as follows: unzip Checker.zip
  
  The ﬁles bundled in this zip archive are listed below. If you have trouble in handling Checker.zip, you can also download the supporting classes individually all from ~cs3131/VC/Checker and install them into the respective directories (i.e., packages) as speciﬁed below:
  
  The Checker package:
  
  ====================
  
  Checker.java: semantic analyser skeleton IdEntry.java: symbol table entry SymbolTable.java: symbol table management
  
  Test Files: t1.vc and t2.vc Solution Files: t1.sol and t2.sol
  
  The VC package:
  
  ===============
  
  StdEnvironment.java: The VC language environment
  
  vc.java: main compiler module (different from that in Assignment 3)
  
  Your static analyser will use ErrorReporter.java you installed in your VC directory in Assignment 1. If you have not done so or have lost the ﬁle, copy it from ~cs3131/VC.
  
  You need to read the VC Language De ﬁnition to ﬁnd out all context-sensitive constraints that should be enforced. Here is a list of typical checks:
  1. All identiﬁers must be declared before used.
  2. An identiﬁer cannot be declared more than once in a block. 3. No identiﬁer can be declared to have the type void or void []. 4. Operands must be type compatible with operators.
Assignment must be type compatible.
A function must be called with the right number of arguments, and in addition, the type of an actual parameter must be assignment compatible with the type of the corresponding formal parameter.
The type of a returned value must be assignment compatible with the result type of the corresponding function.
The “conditional expression” in a for/if/while statement must evaluate to a boolean value. Therefore, the following program

while (1)

// do something

should cause the error message numbered 22 to be printed.
break and continue must be contained in a while/for. By introducing an instance variable to record the nesting depth of a while statement, both checks can be done in a few lines.
A array name itself can only be used as an argument in a function call:

void f(int x[]) { } int main() {

f(x); // OK

x + 1; // ERROR

return 0;

}
An array variable can only be indexed by integer expressions.
Array initialisers must be used for arrays only, by following the rules given in the VC language speciﬁcation.

In addition, if a return statement is immediately followed by a statement (other than “;”), you may report an error. But this is optional.

It is also optional to detect if a function returning a non-void type always contains a return statement in every possible execution path. Run the Java compiler on some test cases to see how well this is done.

In the case of expressions, your type checker will infer the type of an expression and store the type (as a synthesised attribute) in the corresponding expression node. This synthesised attribute is represented by the instance variable type deﬁned in the abstract class VC.ASTs.Expr.java and inherited in all its concrete expression classes and the instance variable type deﬁned in the abstract class VC.ASTs.Var.java and inherited in its concrete class VC.ASTs.SimpleVar.java.

The synthesised attribute type for variables and expressions will be evaluated bottom-up. You can pass the attribute bottom-up by letting all corresponding visitor methods return ast.type.

Note that type is a synthesised attribute, which is computed in the depth-ﬁrst traversal of the AST. All the following six AST classes:

IntType FloatType BooleanType StringType VoidType ErrorType

contain the methods equals and assignable, which will be used to compare if two types are identical and if two types are assignment compatible, respectively. Let e1Type and e2Type be the types of two expressions. Then

e1Type.equals(e2Type) returns true iﬀ both types are identical.

e1Type.assignable(e2Type) returns true iﬀ e2Type is assignment compatible to e1Type .

In addition, both tests return true if e1Type or e2Type is errorType. This tactic avoids generating too many spurious errors.

In the case when an array name is passed as an argument in a function call, don’t rely on the method assignable! You need to handle this as a special case.

Accordingly, the standard environment contains the six pre-deﬁned types:

StdEnvironment.intType StdEnvironment.floatType StdEnvironment.booleanType StdEnvironment.stringType StdEnvironment.voidType StdEnvironment.errorType

They are declared in VC.StdEnvironment.java and initialised in the method establishEnvironment of the class Checker. The ﬁrst ﬁve are already used in the partially ﬁnished class Checker.

You are given only two test ﬁles, which covers all semantic errors deﬁned in the error message table given in Section 3.

Checker.java does not compile. The Java compiler will complain its being an abstract class unless have have implemented all the missing visitor methods.

You need to add roughly 500 lines of code to obtain a static analyser that works beautifully for the VC language. You are free to modify the supplied visitor methods in Checker.java, although most of them should work already. However, it is not necessary to modify the constructor and the method establishEnvironment.

Decorating ASTs

The results of semantic analysis is recorded by decorating the AST as explained above. In summary, the following decorations are used:

Each Ident node is decorated by establishing a link to its declaration if any and to null otherwise.

Each SimpleVar node or any of expression nodes is decorated by setting its type ﬁeld to the type of the expression.
Type Coercions

In addition to performing the identiﬁcation and type checking, the semantic analyser also handles type coercions to facilitate the ﬁnal code generation.

In the language, type coercions will go only from int to ﬂoat. Let x:T denote the fact that the variable or expression x is of type T. In the following four cases, the expression e must be converted to ﬂoat:

an assignment v:ﬂoat = e:int.

a mixed-mode binary expression e1:int <op> e2:ﬂoat (or e1:ﬂoat <op> e2:int.)

a call expression foo(… e:int …), where the corresponding parameter declaration is void/int/float foo(… float f …) { … }.

an expression in a return statement return e:int, where the return type for the corresponding function is ﬂoat.

Type coercions are achieved by introducing extra nodes into the program’s AST in some visitor methods (e.g., visitAssignExpr and visitBinaryExpr). A special unary operator i2f is used to convert an integer value to a ﬂoating-point value. For example, assignment coercions can be realised by including the following code in visitBinaryExpr:

Operator op = new Operator(“i2f”, dummyPos); UnaryExpr eAST = new UnaryExpr(op, ast.E, dummyPos); eAST.type = StdEnvironment.floatType;

ast.E = eAST;

This will change the original BinaryExpr AST from

BinaryExpr

/

/

v ExprNode for e

to

BinaryExpr

/

/

v UnaryExpr

/

/

i2f ExprNode for e

The other three kinds of coercions are implemented using similar code.

In addition to performing the type coercions discussed above, your static analyser is required to replace each overloaded operator with an appropriate non-overloaded operator to indicate whether the intended operation is an integer or a ﬂoating-point operation.

Some operators such as + and – are overloaded in the sense that they can be applied to either a pair of integers or a pair of ﬂoating-point numbers. Every operator <op> is associated with two non-overloaded operators: i<op> for integer operations and f<op> for ﬂoating-point operations. For example, the two non-overloaded operators for + are i+ and f+, the two non-overloaded operators for <= are i<= and f<=, and so on.

In JVM, the boolean values are represented by integer values. Therefore, an operator <op> acting on boolean values is represented using

i<op>. The operators that can act on boolean values are &&, ||, !, == and !=. It is straightforward to resolve the overloaded operators:
1. &&, || and !
  
  These operators can act only on boolean values and will always be replaced by i&&, i|| and i!, respectively. The code required is:
  
  ast.O.spelling = “i” + ast.O.spelling;
2. + – * / < <= > >= (where + and – are both unary and binary)
  
  These operators must be replaced with appropriate non-overloaded operators. A given expression is evaluated using the ﬂoating-point operation as long as one of the operands is of type ﬂoat. The code required is:
  
  ast.O.spelling = “f” + ast.O.spelling; or
  
  ast.O.spelling = “i” + ast.O.spelling;
  
  whichever is appropriate.
3. == and !=
These two operators can be applied to either a pair of boolean values or a pair of integers or a pair of ﬂoating-point numbers. In the ﬁrst case, the integer operations should be used ((after type coercion, if required, as explained in Section 7.1 of the VC Spec). The other two cases are handled similarly as in Case 2.

Type coercions will not be assessed for this assignment. But it will be accessed in Assignment 5 — your code generator would not work properly if type coercions are incorrect.

For an ill-typed expression such as “i + true“, you can leave the operator `+’ unchanged in the AST, as all the programs that manage to reach the code generator will be free of lexical, syntactical and semantic errors.

The total number of lines required for type coercions is about 25 lines, with the same ﬁve lines repeated a few times.

The following example is used to illustrate type coercions for assignment statements. Here are the ASTs before and after type coercions are performed:

int main() { float f; int i;

f = i + 1; return 0;

}

Here are the ASTs before and after type coercions are performed for a more complex program:

int main() { float x; boolean b;

if (x != 0 && b == true)

x = (+1.0 + 2) * (2 + 3);

return 0;

}
More on Arrays

As explained in Question 1 of this assignment’s FAQs, type coercions should also be done to the expressions inside initialisers.

Type checking for arrays that are passed as parameters proceeds exactly as in C. This is demonstrated by the way the four calls in t2.vc are type-checked.

For an array declaration with an initialiser but without a size parameter, your checker should calculate the exact size of the array from the initialiser and then modify the the AST from the parser to ﬁll the missing size parameter. This is demonstrated below by an example:

int main() {

int x[] = {1, 2, 3, 4};

}

Note that EmptyExpr for ArrType (highlighted in red) has been replaced by IntExpr –> 4 (also highlighted in red), where 4 is the size of the array x.
The Parser

If you want to use our parser in case yours does not work properly, copy ~cs3131/VC/RefSols/Parser.zip to your Parser directory and type:

unzip Parser.zip

This installs Parser.class and its two associated classses, Parser$TypeAndIdent.class and Parser$a.class, under package VC.Parser.

It is not necessary to understand how this parser works. Your type checker will only work on the AST constructed for the program by the parser.
Testing Your Type Checker

For this assignment, there does not seem to be a need to produce by default both a linearised AST and a reconstructed VC program every time when the compiler is run. The compiler options have been changed slightly as follows:

[jxue@daniel Checker]$ java VC.vc

======= The VC compiler ======= [# vc #]: no input file

Usage: java VC.vc [-options] filename where options include:

-d [1234] display the AST (without SourcePosition)

1: the AST from the parser (without SourcePosition) 2: the AST from the parser (with SourcePosition)

3: the AST from the checker (without SourcePosition) 4: the AST from the checker (with SourcePosition)

-t [file] print the (non-annotated) AST into

(or filename + “t” if is unspecified)

-u [file] unparse the (non-annotated) AST into

(or filename + “u” if is unspecified)
Marking Criteria

Your type checker will be assessed only by examining how it handles various semantically legal and illegal programs. Only syntactically legal programs will be used.

Your type checker will not be marked up or down for how well it recovers from semantic errors and how well it avoids spurious error messages.

Small test cases will be designed so that, in general, each test case has only one semantic error. In the case of multiple semantic errors in a test case, these errors are designed to be independent of each other, as exempliﬁed by the supplied test cases t1.vc and t2.vc. If necessary, the error messages from your type checker will also be examined manually.

Therefore, we will have to use “fgrep” rather than “diﬀ” to mark this assignment. As an example, on the following program:

int main() { int i; float j;

i = j = 1.0 + true; return 0;

}

the output from our checker is:

% java VC.vcchecker y.vc

======= The VC compiler =======

Pass 1: Lexical and syntactic Analysis Pass 2: Semantic Analysis

ERROR: 5(9)..5(18): *9: incompatible type for this binary operator: + Compilation was unsuccessful.

Two more error messages are possible:

j = 1.0 + true: incompatible type for = i = j: incompatible type for =

Our checker regards these as spurious errors and has refrained from reporting them. Whether your checker chooses to report them or not will not be marked. In this particular case, error message number 9 must be reported.

The positional information will not be assessed. As before, there are no subjective marks.
Submitting Your Checker

give cs3131 checker Checker.java (and vc.java if you have modified it)
Late Penalties

This assignment is worth 30 marks (out of 100). You are strongly advised to start early and do not wait until the last minute. You will lose 6 marks for each day the assignment is late.

Given the nature of our ﬁve programming assignments (as mentioned in the course line and also our ﬁrst lecture), this course adopts a non- standard penalty rate (approved by the school). Given their tight deadlines, all programming assignments will usually be marked 1 or 2 days after their due dates in order to provide some feedback to you quickly to help you start working on the following assignment almost immediately.

Extensions will not be granted unless you have legitimate reasons and have let the LIC know ASAP, preferably one week before its due date.
Plagiarism

As you should be aware, UNSW has a commitment to detecting plagiarism in assignments. In this particular course, we run a special program that detects similarity between assignment submissions of diﬀerent students, and then manually inspect those with high similarity to guarantee that the suspected plagiarism is apparent.

If you receive a written letter relating to suspected plagiarism, please contact the LIC with the speciﬁed deadline. While those students can collect their assignments, their marks will only be ﬁnalised after we have reviewed the explanation regarding their suspected plagiarism.

This year, CSE will adopt a uniform set of penalties for the programming assignments in all CSE courses. There will be a range of penalties, ranging from “0 marks for the assessment item”, “negative marks for the value of the assessment item” to “failure of course with 0FL.”

Here is a statement of UNSW on plagiarism:

Have fun!

Jingling Xue

Last updated 03/19/2023 06:33:26

Whatsapp Us

[SOLVED] COMP3131/9102: Assignment 4

Reviews

Whatsapp Us

[SOLVED] COMP3131/9102: Assignment 4

Reviews

Related products

[Solved] A java program to Remove duplicates from given array of N elemets

[Solved] Assignment 1 Inventory Management System

[Solved] A stack machine is a simple implementation model for executing computer programs

[SOLVED] CST2335 Lab 3 Mobile Application

[Solved] A program to remove duplicates from given array of N elemets

[Solved] Arbitrage