Your goal is to finish a predictive parser and write a type checker for a given language. The input to your project willbe a program and the output will be either a) error messages if there is a type mismatch or syntax error or b) lists ofsymbols with equivalent types if there is no error. Your type checker will enforce semantic checks on the inputprogram, and will be described in the following. First we specify the grammar of our language.1. Grammar Descriptionprogram decl bodydecl type_decl_section var_decl_sectiontype_decl_section TYPE type_decl_listtype_decl_section type_decl_list type_decl type_decl_listtype_decl_list type_decltype_decl id_list COLON type_name SEMICOLONtype_name REALtype_name INTtype_name BOOLEANtype_name STRINGtype_name LONGtype_name IDvar_decl_section VAR var_decl_listvar_decl_section var_decl_list var_decl var_decl_listvar_decl_list var_declvar_decl id_list COLON type_name SEMICOLONid_list ID COMMA id_listid_list IDbody LBRACE stmt_list RBRACEstmt_list stmt stmt_liststmt_list stmtstmt assign_stmtstmt while_stmtstmt do_stmtstmt switch_stmtassign_stmt ID EQUAL expr SEMICOLONwhile_stmt WHILE condition bodydo_stmt DO body WHILE condition SEMICOLONswitch_stmt SWITCH ID LBRACE case_list RBRACEcase_list case case_listcase_list casecase CASE NUM COLON bodyexpr term PLUS exprexpr termterm factor MULT termterm factor DIV termterm factorfactor LPAREN expr RPARENfactor NUMfactor REALNUMfactor IDcondition IDcondition primary relop primaryprimary IDprimary NUMprimary REALNUMrelop GREATERrelop GTEQrelop LESSrelop NOTEQUALrelop LTEQThe tokens used in the grammar description are:TYPE = TYPEVAR = VARREAL = REALINT = INTBOOLEAN = BOOLEANSTRING = STRINGLONG = LONGWHILE = WHILEDO = DOSWITCH = SWITCHCASE = CASECOMMA = ,COLON = :SEMICOLON = ;LBRACE = {RBRACE = }LPAREN = (RPAREN = )EQUAL = =PLUS = +MULT = *DIV = /GREATER = >GTEQ = >=LESS = <LTEQ = <=NOTEQUAL = <>ID = letter (letter + digit)*NUM = 0 + (pdigit digit*)REALNUM = NUM . digit digit*2. Language SemanticsAs can be seen from the grammar, in this language types are first declared, then variables are declared, then thebody of the program follows.2.1. TypesThe language has five built-in types: INT , REAL , BOOLEAN , STRING , and LONG .Programmers can declare types either explicitly or implicitly.Explicit types are names that are not built-in types and that have their first appearance in the program as partof the id_list of a type_decl .Implicit types are not built-in types and not explicit programmer-declared types. Implicit types have their firstappearance as a type_name in a var_decl or a type_decl .ExampleConsider the following program written in our language:TYPEa : INT;b : a;VARx : b;y : c;{y = x;}There are three types declared by the programmer in this example, a , b , and c , where a and b are explicittypes and c is an implicit type.2.2. VariablesProgrammers can declare variables either explicitly or implicitly.Explicit variables are declared in an id_list of a var_decl .A variable is declared implicitly if it is not declared explicitly but it appears in the program body.ExampleConsider the following program written in our language:TYPEa : INT;b : a;VARx : b;y : c;{y = x;z = 10;w = z * 5;}This program has four variables declared: x , y , z , and w , with x and y explicitly declared and z and wimplicitly declared. Note that the implicitly declared variables z and w also have an implicitly declared type.2.3. Declaration vs. UseAny appearance of a name (type or variable) in the program is either a declaration or a use.The following lists all possible declarations of a name:1. Any appearance of a name in the id_list part of a type_decl2. Any appearance of a name in the id_list part of a var_decl3. The first appearance of a name in the entire program, if the name appears as type_name in a type_decl4. The first appearance of a name in the entire program, if the name appears as type_name in a var_decl5. The first appearance of a name in the entire program, if the name appears inside the body of the programAny other appearance of a name is considered a use of that name.Note that the above definitions exclude the built-in type names.Given the following example (the line numbers are not part of the input):01 TYPE02 a : INT;03 b : a;04 VAR05 x : b;06 y : c;07 {08 y = x;09 z = 10;10 w = z * 5;11 }We can categorize all appearances of names as declaration or use:Line 2, the appearance of name a is a declarationLine 3, the appearance of name b is a declarationLine 3, the appearance of name a is a useLine 5, the appearance of name x is a declarationLine 5, the appearance of name b is a useLine 6, the appearance of name y is a declarationLine 6, the appearance of name c is a declarationLine 8, the appearance of name y is a useLine 8, the appearance of name x is a useLine 9, the appearance of name z is a declarationLine 10, the appearance of name w is a declarationLine 10, the appearance of name z is a use2.4. Type SystemOur language uses structural equivalence for checking type equivalence.Implicit types (in variable declarations or on implicitly declared variables) will be inferred from the usage (in asimplified form of Hindley-Milner type inference).Here are all the type rules/constraints that your type checker will enforce (constraints are labeled from C1 to C5 forreference):C1: The left hand side of an assignment should have the same type as the right hand side of that assignmentC2: The operands of an operation ( PLUS , MINUS , MULT , and DIV ) should have the same type (it can be anytype, including STRING and BOOLEAN )C3: The operands of a relational operator (see relop in grammar) should have the same type (it can be anytype, including STRING and BOOLEAN )C4: condition should be of type BOOLEANC5: The variable that follows the SWITCH keyword in switch_stmt should be of type INTThe type of an expr is the same as the type of its operandsThe result of p1 relop p2 is of type BOOLEAN (assuming that p1 and p2 have the same type)NUM constants are of type INTREALNUM constants are of type REALIf two types cannot be determined to be the same according to the above rules, the two types are different3. Incomplete ParserThe provided parser is incomplete, as it is missing an implementation for some of the non-terminals. You mustfinish the given parser so that it can parse any valid input according to our grammar. If you detect a syntax error inthe input, you should output the following message and exit:Syntax ErrorYou can start coding by finishing the parser first and then move on to implementing the type checking part. Youshould make sure that your parser generates a syntax error message if the input program does not follow theproper syntax.We recommend that you check your code on the submission website to make sure it passes all the test cases in theparsing category before moving on to implementing the type checking part.Our grammar is not LL(1) i.e. it does not satisfy the conditions for predictive parser, however, it is still possible towrite a predictive parser by looking at more than one token. A notable case is when parsing condition .4. OutputYour program will check for the following semantic errors and output the correct message when it encounters thaterror. Note that there will only be at most one error per test case.4.1. Duplication Errors1. Errors involving programmer-defined types:Programmer-defined type declared more than once:Explicit type redeclared explicitly (error code 1.1)An explicitly declared type can be declared again explicitly by appearing as part of an id_list in a typedeclaration.Implicit type redeclared explicitly (error code 1.2)An implicitly declared type can be declared again explicitly by appearing as part of an id_list in a typedeclaration.Note that a previously declared type name (either implicit or explicit) cannot be declared again implicitly. Sinceit has already been introduced, the new reference to the name (as type_name in a type_decl or var_decl )would be a use and not a declaration.Programmer-defined type redeclared as variable (error code 1.3)If a previously declared type appears again in an id_list of a variable declaration, the type is redeclared as avariable.Programmer-defined type used as variable (error code 1.4)If a previously declared type appears in the body of the program, the type is used as a variable.2. Errors involving variable declarations:Variable declared more than once (error code 2.1)An explicitly declared variable can be declared again explicitly by appearing as part of an id_list in avariable declaration.Variable used as a type (error code 2.2)If an explicitly declared variable is used as type_name in a variable declaration, the variable is used as a type.Note that an explicitly declared variable cannot be declared again implicitly, appearances of the name in theprogram body are uses. In the same way, an implicitly declared variable cannot be declared again, because alllater appearances are uses.Also note that if a built-in type is redeclared or used in the body of the program, it should result in a syntax error.For these errors, you should output one line in the following format:ERROR CODE <code> <symbol_name>in which <code> should be replaced with the proper code (see the error codes listed above) and <symbol_name>should be replaced with the name of the type or variable related to the error.4.2. Type MismatchIf any of the type constraints (listed in the Type System section above) is violated in the input program, then theoutput of your program should be:TYPE MISMATCH <line_number> <constraint>Where <line_number> is replaced with the line number that the violation occurs and <constraint> should bereplaced with the label of the violated type constraint (possible values are C1 through C5, see section on TypeSystem for details of each constraint). Note that you can assume that anywhere a violation can occur it will be on asingle line.4.3. No Semantic ErrorsIf there are no semantic errors in the program, then your program should output lists of types and variables that aretype-equivalent. The symbols should be listed in the order they appear in the program and built-in types should belisted first in the following order: BOOLEAN , INT , LONG , REAL , STRING . Each list must be on a single line of theoutput and each symbol in the list should be separated by a single space character. Each list must be terminated bya # character.The following pseudo-code should explain the output format more precisely:for each built-in type T:{output Toutput all names that are type-equivalent with T in order of their appearancemark outputted names to avoid re-printing them lateroutput #
}if there are unprinted names left:{for each unprinted name N in order of appearance:{output Noutput all other names that are type-equivalent with N in order of their appearanceoutput #
}}The phrase in order of appearance in the above pseudo-code means that names that appear before others inthe program should be printed first. This order should be easy to maintain since it is the natural order of storingnames in your symbol table.5. ExamplesGiven the following:TYPEa, b, c, b : INT;VARx : a;{x = 10;}The output will be the following:ERROR CODE 1.1 bGiven the following:TYPEa : INT;VARx : INT;b, a : STRING;{x = 10;}The output should be the following:ERROR CODE 1.3 aGiven the following:VARx1 : INT;x2, x3, x1 : a;{x1 = 0;}The output should be the following:ERROR CODE 2.1 x1Given the following:VARx, y : STRING;z : x;{y = x;}The output should be the following:ERROR CODE 2.2 xGiven the following:VARx100 : INT;y : STRING;{x100 = y;}The output should be the following:TYPE MISMATCH 5 C1Given the following:VARx : INT;{x = 100;y = 20.10;y = x;}The output should be the following:TYPE MISMATCH 6 C1Given the following:VARx, y : a1;{WHILE x <> 10{x = x + y;y = y * 1.0;}}The output should be the following:TYPE MISMATCH 7 C2Given the following:TYPEa, b : INT;c : a;d : STRING;VARx : e;y : c;test : d;{a1 = 100;b1 = a1 + (10 * 50);foo = b1 / 50;SWITCH foo{CASE 1:{foo = 0;}CASE 2:{test = test * test;}}h = x;}The output should be the following:BOOLEAN #INT a b c y a1 b1 foo #LONG #REAL #STRING d test #x e h #6. EvaluationYour submission will be graded on passing the automated test cases.The test cases (there will be multiple test cases in each category, each with equal weight) will be broken down inthe following way (out of 105 points):Parsing: 37 pointsErrors involving programmer-defined types (error codes 1.x): 18 pointsErrors involving variable declarations (error codes 2.x): 10 pointsType mismatch errors and no semantic error cases: 40 pointsThe parsing category is not partially graded, you need to pass all test cases in that category to get the 37 points. Allother categories are partially graded.
CSE340
[SOLVED] CSE340 Project 3: Type Checking
$25
File Name: CSE340_Project_3:_Type_Checking.zip
File Size: 292.02 KB
CSE340_Project3_v2
Only logged in customers who have purchased this product may leave a review.
Reviews
There are no reviews yet.