20191030 Assignment 3parser: Computer Systems 20007081 Combined
Assignment 3 | parser
Description
You must complete the implementation of the parser program in the file parser.cpp.
The program reads a Jack class from standard input and writes an XML representation of its abstract syntax tree to standard output. It uses the tokeniser functions described in tokeniser.h to parse the class and constructs the abstract syntax tree using the functions described in abstract-syntax-tree.h. The main function is responsible for calling the jack_parser function and passing the result to the function ast_print_as_xml, which writes an XML representation of the abstract syntax tree to standard output.
Compiling and Running parser
When the Makefile attempts to compile the program parser, it will use the file parser.cpp and any other .cpp files it can find whose names start with parser. For example, if we have our own class abc that we want to use when implementing parser, we would name the extra files parserabc.cpp and parserabc.h
and sharedxyz.h.
The program can be compiled using the command:
make parser
The suite of provided tests can be run using the command:
make testparser
The test scripts do not show the program outputs, just passed or failed, but they do show you the commands being used to run each test. You can copy and paste these commands if you want to run a particular test yourself and see all of the output.
Note: Do not modify the provided Makefile or the subdirectories bin, includes or lib. These will be replaced during testing by the web submission system.
Tokeniser
The tokeniser to be used by the parser program is described in includes/tokeniser.h. This lists all the tokens that are recognised as well as 9 extra token kinds that can be used with the functions mustbe and have to simplify writing the parser. For example, a call of have(tk_statement) will return true if the next token in the input can start one of the Jack statements. Similarly, have(tk_infix_op) can be used to see if a token is an infix operator and have(tk_relop) can be used to see if a token is a relational operator.
The mustbe function, which raises an error if the next token is not what is expected, always returns the token that it matched. This can be very useful when the next input is an identifier or operator because we need to remember the particular token that was found. For example, if the next token in the input is an identifier token, mustbe(tk_identifier) will return the identifier token but still advance to the next token in the input. Similarly, if the next token in the input is an infix operator, a call to mustbe(tk_infix_op) will return the token for that operator but still advance the input to the next token.
Notes:
All tokeniser errors are reported by returning the end of input token tk_eoi.
Once a tk_eoi token is returned, all future attempts to read a token will return tk_eoi.
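The have and mustbe pattern can be illustrated with a toy token stream. This is only a sketch: the names Kind, Token, ToyTokens and parse_operators are hypothetical stand-ins invented for this example, and the real token kinds and function signatures are the ones in the provided tokeniser header. It shows a grouping kind that matches several tokens, mustbe returning the matched token while advancing, and every read past the end yielding the end of input token.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds; the real list lives in the provided tokeniser header.
enum class Kind { identifier, add, sub, less_than, infix_op, eoi };

struct Token {
    Kind kind;
    std::string spelling;
};

class ToyTokens {
public:
    explicit ToyTokens(std::vector<Token> tokens) : tokens_(std::move(tokens)) {}

    // True if the next token matches `expected`; nothing is consumed.
    bool have(Kind expected) const { return matches(current().kind, expected); }

    // Return the matched token and advance, or signal a syntax error.
    Token mustbe(Kind expected) {
        if (!have(expected)) throw std::runtime_error("syntax error");
        Token matched = current();
        if (pos_ < tokens_.size()) ++pos_;
        return matched;
    }

private:
    // Past the end, every read yields the end of input token.
    Token current() const {
        assert(pos_ <= tokens_.size());  // the cursor never runs past the end
        return pos_ < tokens_.size() ? tokens_[pos_] : Token{Kind::eoi, ""};
    }

    static bool matches(Kind actual, Kind expected) {
        if (expected == Kind::infix_op)  // grouping kind: matches any infix operator
            return actual == Kind::add || actual == Kind::sub || actual == Kind::less_than;
        return actual == expected;
    }

    std::vector<Token> tokens_;
    std::size_t pos_ = 0;
};

// Parses: identifier (infix_op identifier)* and returns the operators seen, in order.
std::string parse_operators(ToyTokens& in) {
    in.mustbe(Kind::identifier);
    std::string operators;
    while (in.have(Kind::infix_op)) {
        operators += in.mustbe(Kind::infix_op).spelling;  // keep the matched token
        in.mustbe(Kind::identifier);
    }
    return operators;
}
```

Because mustbe hands back the specific token it matched, the caller can ask for the grouping kind and still learn which operator was actually present.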
jack_parser
A parser goes over the tokenised text and emits output indicating that it understood the text's grammatical structure. In order to do so, the parser must include functions that look for canonical structures in a certain language, in our case Jack, and then emit these structures in some agreed upon formalism. The structure of your jack_parser function should follow the one developed in workshops rather than the structure in the textbook.
The grammar of the Jack language that must be recognised can be found in the comments near the beginning of the startup file parser.cpp. It has been reorganised slightly so that it is now LL(1), that is, you can always tell what to parse next by looking at the next token. In particular, the production rule for a term has been changed so that all options starting with a variable are now in a separate production rule named varterm. Since a large part of the initial work in writing a parser is expanding the grammar into matching code, the startup file parser.cpp includes an empty function for each production rule. This will save you a lot of typing. However, you do not need to keep this structure and are free to change it as much as you like.
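To make the idea of one function per production rule concrete, here is a minimal recursive descent sketch over a toy grammar. The grammar, the ToyParser type and all of its member names are assumptions made up for this illustration; they are not the Jack grammar or the function names in parser.cpp. Each function decides what to do by inspecting only the next token, which is what the LL(1) property allows.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Toy grammar for illustration only (not the grammar from parser.cpp):
//   expr    ::= term ('+' term)*
//   term    ::= integer | varterm
//   varterm ::= identifier ('[' expr ']')?
struct ToyParser {
    std::vector<std::string> tokens;
    std::size_t pos = 0;

    // The one token of lookahead; an empty string marks end of input.
    std::string next() const { return pos < tokens.size() ? tokens[pos] : std::string(); }
    bool have(const std::string& t) const { return next() == t; }
    bool have_integer() const {
        const std::string t = next();
        return !t.empty() && std::isdigit(static_cast<unsigned char>(t[0])) != 0;
    }
    bool have_identifier() const {
        const std::string t = next();
        return !t.empty() && std::isalpha(static_cast<unsigned char>(t[0])) != 0;
    }
    std::string advance() {
        std::string t = next();
        if (pos < tokens.size()) ++pos;
        return t;
    }
    void mustbe(const std::string& t) {
        if (!have(t)) throw std::runtime_error("syntax error: expected " + t);
        advance();
    }

    // One function per production rule; each returns a bracketed rendering.
    std::string parse_expr() {
        std::string left = parse_term();
        while (have("+")) {
            advance();
            std::string right = parse_term();
            left = "(add " + left + " " + right + ")";
        }
        return left;
    }
    std::string parse_term() {
        if (have_integer()) return "(int " + advance() + ")";
        if (have_identifier()) return parse_varterm();
        throw std::runtime_error("syntax error: expected a term");
    }
    std::string parse_varterm() {
        assert(have_identifier());  // the caller already checked the lookahead
        std::string name = advance();
        if (!have("[")) return "(var " + name + ")";
        advance();
        std::string index = parse_expr();
        mustbe("]");
        return "(index " + name + " " + index + ")";
    }
};
```

Moving every alternative that starts with an identifier into parse_varterm keeps parse_term a simple choice on the next token, which mirrors the reason the handout gives for the separate varterm rule.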
Notes:
All input must be read using the tokeniser functions described in includes/tokeniser.h. You should use the symbol table functions described in includes/symbols.h.
No output should be produced; the parser simply returns an abstract syntax tree.
Do not modify the main function.
The Abstract Syntax Tree
The abstract syntax tree returned by the jack_parser function should contain one node for each production rule of the Jack grammar described in the parser.cpp file with a few exceptions. These exceptions must match those in the supplied test data and may include:
Jack source files only contain a single class so the root node must be an ast_class node.
Static variables, field variables, parameters and local variables are all represented using ast_var_dec nodes. The order of these nodes must match the order in which their variables are declared.
There are two ifStatement nodes, one with an else statement and one without.
There are two return nodes, one for a void return and one for returning the value of an expression.
There are no varterm nodes, they are replaced by what would have been their child nodes.
When creating nodes to represent a subroutine call in a do statement or expression where no varName or className has been provided, the subroutine is assumed to be a method of the class being parsed. Therefore, an ast_this node will need to be created.
The abstract syntax tree nodes are immutable (you cannot change them), so a node's subtrees must be created first. In cases when a node has a variable number of children you may need to create a vector<ast> variable to hold the subtrees until you are ready to create the required node.
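The children-first order can be sketched with a toy immutable node type. Node, Ast, make_node and make_statements are hypothetical names used only for this example; the real node type and its constructor functions are those declared in the provided abstract syntax tree header. The sketch collects the subtrees in a vector and creates the parent only once all of them exist.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type; the real constructors come from the provided header.
struct Node {
    std::string kind;
    std::vector<std::shared_ptr<const Node>> children;
};
using Ast = std::shared_ptr<const Node>;  // pointer-to-const: no changes after creation

Ast make_node(std::string kind, std::vector<Ast> children = {}) {
    for (const Ast& child : children) {
        assert(child != nullptr);  // every subtree must already exist
    }
    return std::make_shared<Node>(Node{std::move(kind), std::move(children)});
}

// The number of statements is not known in advance, so the subtrees are
// gathered in a vector and the parent node is created last.
Ast make_statements(const std::vector<std::string>& statement_kinds) {
    std::vector<Ast> subtrees;
    for (const std::string& kind : statement_kinds) {
        subtrees.push_back(make_node(kind));
    }
    return make_node("statements", std::move(subtrees));
}
```

In a real parser the loop body would call the parsing function for one statement rather than make_node, but the shape is the same: push each finished subtree, then build the parent.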
Errors to Catch
There are lots of different kinds of errors that a compiler may be able to detect. However, for the purposes of this assignment we are only interested in detecting the following errors:
Syntax errors. If at any point in the parsing you cannot find the next symbol that must be present you have detected a syntax error.
Declarations of more than one variable with the same name in the same context. That is, no two static, field, parameter or local variables can have the same name, no static variable can have the same name as a field variable and no parameter can have the same name as a local variable.
Attempting to use an undeclared variable. Not all such errors can be detected because in a subroutine call we cannot tell the difference between an undeclared variable and the name of another class.
Attempting to return a value from a void function or method, failing to return a value from a non-void function or method, or returning something other than this from a constructor.
A constructor, function or method that might not execute a return statement.
A constructor declared with a return type that is not its own class.
Attempts by a function to access a field of its class. The field is treated as an undeclared variable.
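The variable-related checks in this list can be sketched with two toy symbol tables, one for the class (static and field variables) and one for the current subroutine (parameters and local variables). Scope, declare, kind_of and resolve are hypothetical names for this illustration, not the functions in the provided symbols header. The lookup order, subroutine scope first and then class scope, is also an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a symbol table; the provided one is in the symbols header.
class Scope {
public:
    // Records a declaration. Returns false if the name is already declared
    // in this scope, which the parser would report as an error.
    bool declare(const std::string& name, const std::string& kind) {
        assert(!name.empty());
        return table_.emplace(name, kind).second;
    }

    // Returns the declared kind, or an empty string if the name is unknown.
    std::string kind_of(const std::string& name) const {
        auto found = table_.find(name);
        return found == table_.end() ? std::string() : found->second;
    }

private:
    std::map<std::string, std::string> table_;
};

// Resolves a variable: subroutine scope first, then class scope.
// Returns an empty string for a variable that must be treated as undeclared.
std::string resolve(const Scope& subroutine_vars, const Scope& class_vars,
                    const std::string& name, bool in_function) {
    std::string kind = subroutine_vars.kind_of(name);
    if (!kind.empty()) return kind;
    kind = class_vars.kind_of(name);
    if (in_function && kind == "field") return std::string();  // functions have no object
    return kind;
}
```

Keeping static and field variables in one table, and parameters and local variables in another, makes the duplicate checks fall out of a single failed insertion, and a fresh subroutine table can be started for each constructor, function or method.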
Errors to Ignore
The following semantic errors will be ignored and the parsing allowed to complete:
Attempts to declare more than one constructor, function or method with the same name or to call a constructor, function or method that does not exist. Detecting errors in naming subroutines will be deferred to the assembler when the final VM code version of a program is translated into assembly language.
Attempts to apply operators, infix or unary, to values of the wrong types. This is a potentially significant error that we will ignore.
Attempts to return a value of a different type from the declared return type of a function or method. This is a potentially significant error that we will ignore except in the case of constructors.
Attempts to call a subroutine without specifying a varName or className inside a function. The call is assumed to be to a method of the class being parsed, but a function has no object to operate on. This is a potentially significant error that we will ignore.
https:myuni.adelaide.edu.aucourses44936pagesassignment37Cparser 33