5/5 - (1 vote)

Project Step 1 Scanner

Due: September 6th

Name: [SOLVED] ECE 468 Project Step 1 Scanner
Brand: Assignment Chef
SKU: 3109866590
Price: 25 USD
Availability: InStock
Rating: 5 (1 reviews)

The first real step of the project is building the first phase of a compiler: the scanner (sometimes called a tokenizer). A scanners job is to convert a series of characters in an input file into a sequence of tokens the words in the program. So, for example, the input

A := B + 4

Would translate into the following tokens:

IDENTIFIER (Value = "A")OPERATOR (Value = ":=")IDENTIFIER (Value = "B")OPERATOR (Value = "+")INTLITERAL (Value = "4")

The way that we define tokens in a programming language is with regular expressions . For example, a regular expression that defines an integer literal token looks like:

[0-9]+ (read: 1 or more digits),

while a regular expression that defines a float literal token looks like:

[0-9]+.[0-9]* | .[0-9]+ (read: Either 1 or more digits followed by a decimal followed by 0 or more digits; or a decimal followed by 1 or more digits)

(See the notes for Lecture 2 for details).

While you can write a scanner by hand, it is very tedious. Instead, we typically use tools to help us automatically generate scanners. The tools we recommend you to use are either flex (if youre planning on writing your compiler in C or C++) or ANTLR (if youre planning on writing your compiler in Java). flex is available on most Unixes/Linux (including the ecegrid machines), while ANTLR requires a download . If you want to use other tools to generate your scanner, feel free, but we will be able to provide less help.

If you use ANTLR, you will find its API documentation . If you use flex , the manual is a good resource, and a working example using flex and its accompanying tool bison (which you will use in step 2) is available here

Token definitions

We will be building a compiler for a simple language called MICRO in this class. The token definitions (written in plain English) are as follows:

an IDENTIFIER token will begin with a letter, and be followed by any number of letters and numbers.IDENTIFIERS are case sensitive. INTLITERAL: integer number ex) 0, 123, 678FLOATLITERAL: floating point number available in two different formatyyyy.xxxxxx or .xxxxxxxex) 3.141592 , .1414 , .0001 , 456.98STRINGLITERAL: any sequence of characters except '"' between '"' and '"' ex) "Hello world!" , "***********" , "this is a string"COMMENT:Starts with "--" and lasts till the end of lineex) -- this is a commentex) -- any thing after the "--" is ignored KeywordsPROGRAM,BEGIN,END,FUNCTION,READ,WRITE,IF,ELSE,FI,FOR,ROF,RETURN,INT,VOID,STRING,FLOATOperators:= + - * / = != < > ( ) ; , <= >=

What you need to do

You should build a scanner that will take an input file and output a list of all the tokens in the program. For each token, you should output the token type (e.g., OPERATOR ) and its value (e.g., + ).

There are sample inputs and outputs here. These are the only inputs we will test your compiler on. Your outputs need to match our outputs exactly (we will be comparing them using diff , though we will ignore whitespace).

Hints

Note that even though our sample outputs combine together a bunch of different tokens as a single type (e.g., all keywords have the token type KEYWORD ), you will be better served by defining every keyword and operator as a different token type (so your scanner will have different tokens for, say, := and < ), and then writing a little bit of extra code to print the output we expect for that token type.

While it might seem weird, you will need to define a token that eats up any whitespace in your program (recall that your compiler really only sees a list of characters; it has no reason to think that a tab character isnt an important character). Make sure that when you recognize a whitespace token, you just silently drop it, rather than printing it out.

What you need to submit

All of the necessary code for your compiler that you wrote yourself. You do not need to include the ANTLR jar files if you are using ANTLR.
A Makefile with the following targets:
1. compiler: this target will build your compiler
2. clean: this target will remove any intermediate files that were created to build the compiler
3. team: this target will print the same team information that you printed in step 0.
A shell script (this mustbe written in bash, which is located at /bin/bashon the ecegrid machines) called runmethat runs your scanner. This script should take in two arguments: first, the input file to the scanner and second, the filename where you want to put the scanners output. You can assume that we will have run make compilerbefore running this script.

While you may create as many other directories as you would like to organize your code or any intermediate products of the compilation process, both your Makefile and your runme script should be in the root directory of your repository.

Do not submit any binaries . Your git repo should only contain source files; no products of compilation.

You should tag your step 1 submission as step1-submission

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Whatsapp Us

[SOLVED] ECE 468 Project Step 1 Scanner

Due: September 6th

Token definitions

What you need to do

Hints

What you need to submit

Reviews

Related products

[Solved] Car.py

[Solved] Program that reads in the file climate_data_2017_numeric.csv

[SOLVED] Project 9-1: Monthly Payment Calculator

[Solved] Program6_1.py

[SOLVED] ITEC136 Python Program

[Solved] Assignment 9 Changing Sort Keys