Note: This is a 1-week assignment (due 1 week after release).
In this homework, you’ll implement string support and file handling. You’ll get more practice dealing with data on the heap and learn how to extend your compiler’s functionality via the C runtime.
At the end of this homework assignment, your interpreter and compiler should support the following grammar (we’ve highlighted what you’ll be adding):
  <expr> ::= <num>
           | <id>
+          | <string>
           | true
           | false
+          | stdin
+          | stdout
           | (<z-prim>)
           | (<un_prim> <expr>)
           | (<bin_prim> <expr> <expr>)
           | (if <expr> <expr> <expr>)
           | (let ((<id> <expr>)) <expr>)
           | (do <expr> <expr> ...)

  <z-prim> ::= read-num
             | newline

  <un_prim> ::= add1
              | sub1
              | zero?
              | num?
              | not
              | pair?
              | left
              | right
              | print
+             | open-in
+             | open-out
+             | close-in
+             | close-out

  <bin_prim> ::= +
               | -
               | =
               | <
               | pair
+              | input
+              | output
We will NOT be grading your tests for this homework.
However, it will still be the case that when you submit your implementation to Gradescope (to the assignment hw5), your suite of examples in the examples/ directory will be run against the reference interpreter and compiler. If the reference implementation fails on any of your examples, Gradescope will show you how its output differed from the expected output of your example (if you wrote a .out file for it).
You can do this as many times as you want. We encourage you to use this option to develop a good set of examples before you start working on your interpreter and compiler!
Your programs can now read from standard input, using both the read-num operator defined in class and the input operator you’ll write on this assignment. For testing, you should provide inputs by writing a .in file for each .lisp file that expects input. For example, if we defined a program like this in examples/read-num.lisp:
(print (pair (read-num) (pair (read-num) (read-num))))
We could define its input in examples/read-num.in:
8 13 21
The testing system will provide this as the input to both the interpreter and the compiler.
Additionally, the testing framework handles rows in examples/examples.csv of the form
<PROGRAM>,<INPUT1>,<OUTPUT1>,<INPUT2>,<OUTPUT2>,...
For instance, to test a program that echoes single characters read from stdin to stdout, you could write:
(output stdout (input stdin 1)), a, a, b, b
When testing reading/writing to files, it can be easy for state to get mixed up between tests (e.g. one test creates a file and writes some things to it, while the test that runs after it expects the file not to exist).
Therefore, we’ve set this homework up such that when you run dune runtest -f, your code will run in a directory that contains a subdirectory called tmp. (As an example path, you could have some tests that read and write to the path tmp/hello.txt.) Reading and writing files in the tmp directory will work both locally and on Gradescope.
This is a great place to read/write files in your tests, but please do not store anything important in this directory—it WILL get erased on each test run! (As a consequence, you will likely not be able to see the actual files that get created and accessed when you use the testing framework.)
If you want to see the files that your program reads from and writes to, we recommend running your program manually in the same manner as previous homeworks (i.e., using dune exec). Since your tests will likely be reading from and writing to paths like tmp/hello.txt, we recommend creating a tmp folder in whatever directory you run your code manually from.
In this task, you will implement support for strings. For now, this just means adding support for string literals (i.e., s-expressions built with the Str constructor). Here are some pointers:
- String literals are sequences of characters enclosed in double-quotes. Strings should be displayed in the same double-quote-enclosed representation.
- Strings may contain characters with special meaning (namely, newlines and double-quotes). When displaying strings, these should be escaped as \n and \" to ensure the resulting expression is well-formed. For instance, a string containing only a double-quote should be displayed as "\"" and not """. You do not need to support special characters besides quote and newline.
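The escaping rule above can be sketched in C, for example as it might appear in a runtime-side printer. This is only an illustration under stated assumptions: the function name display_string and its buffer-based interface are hypothetical and are not part of the provided runtime.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the provided runtime): writes the
   displayed form of the NUL-terminated string s into out, escaping only
   newline and double-quote. Returns the number of bytes written, not
   counting the NUL terminator, or 0 if out (of capacity cap) is too small. */
size_t display_string(const char *s, char *out, size_t cap) {
  size_t n = 0;
  if (cap < 3) return 0;                 /* room for two quotes and NUL */
  out[n++] = '"';                        /* opening quote */
  for (; *s != '\0'; s++) {
    /* Worst case: two bytes for this character, then closing quote, NUL. */
    if (n + 4 > cap) return 0;
    if (*s == '\n') {
      out[n++] = '\\';
      out[n++] = 'n';
    } else if (*s == '"') {
      out[n++] = '\\';
      out[n++] = '"';
    } else {
      out[n++] = *s;                     /* ordinary character, copied as-is */
    }
  }
  out[n++] = '"';                        /* closing quote */
  out[n] = '\0';
  return n;
}
```

With this sketch, a string holding a single double-quote is rendered as the four characters "\"" (quote, backslash, quote, quote).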
Task 1.1 (ungraded): Write tests for strings in the examples/ directory.
Task 1.2: Add support for strings to the interpreter.
Hint: Add a constructor to value and extend the interpreter’s display_value function to display strings (use String.escaped to escape special characters).
Task 1.3: Add support for strings to the compiler. We’ll represent strings like C does: NUL-terminated sequences of characters. The runtime value for a string should be a pointer to the first character of the sequence tagged with 0b011.
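To make the tagged representation concrete, here is a minimal C sketch of how a runtime might recognize and untag a string value. It assumes that tags occupy the low three bits of a 64-bit value (so the mask is 0b111) and that the string's address is 8-byte aligned; the function names are hypothetical, not names from the provided runtime.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumption: tags live in the low three bits of a 64-bit runtime value. */
#define STRING_TAG 0x3u   /* 0b011 */
#define TAG_MASK   0x7u   /* 0b111 */

/* True if the low three bits of value are the string tag. */
bool is_string_value(uint64_t value) {
  return (value & TAG_MASK) == STRING_TAG;
}

/* Tags a pointer to the first character of a string.
   Precondition: s is 8-byte aligned, so its low three bits are zero. */
uint64_t tag_string(const char *s) {
  return (uint64_t)(uintptr_t)s | STRING_TAG;
}

/* Clears the tag bits to recover the pointer to the first character. */
const char *untag_string(uint64_t value) {
  return (const char *)(uintptr_t)(value & ~(uint64_t)TAG_MASK);
}
```

Because the characters are NUL-terminated, the pointer returned by untag_string can be handed directly to ordinary C string functions.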
Hint: Since we’re only concerned with string literals for now, you can implement string support using DqString, which will embed a string literal into the compiled program as data. As with DqLabel, you should be sure that program execution never runs this directive—it’s just data, not instructions.
Hint: String literals embedded in this way must be placed at 8-byte aligned addresses, since you will need to tag the pointers with 0b011. You can use the Align directive to ensure the next directive will be aligned properly.
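The reason alignment matters can be checked with a little arithmetic. The C sketch below is only an illustration: it models addresses as plain integers, assumes tags occupy the low three bits, and uses hypothetical helper names. It does not show how the Align directive itself is implemented.

```c
#include <stdint.h>
#include <stdbool.h>

/* Rounds addr up to the next multiple of 8 (unchanged if already aligned). */
uint64_t align_up_8(uint64_t addr) {
  return (addr + 7) & ~(uint64_t)7;
}

/* Tags addr with 0b011, then clears the low three bits again, and reports
   whether the original address was recovered. This holds only when the
   low three bits of addr were zero to begin with. */
bool tag_round_trips(uint64_t addr) {
  uint64_t tagged = addr | 0x3;
  uint64_t untagged = tagged & ~(uint64_t)7;
  return untagged == addr;
}
```

For example, the unaligned address 0x1005 becomes 0x1007 when tagged and 0x1000 when untagged, so the original address is lost; after rounding up to 0x1008, tagging and untagging recover it exactly.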
Hint: The ASCII character NUL has ASCII value 0 and is written in C as