Introduction
The program you will write in this assignment is a simple Unix shell. The shell is the basic text interface of the Unix world, and you will use shells for most of the remaining projects in this course (not the one you wrote!). While building this shell, you should gain a much better understanding of how real Unix shells work.
Finally, you will get a chance to practice process creation and control.
Overview of Shells
A command-line interpreter (CLI), or a shell, is a program that runs other programs on the behalf of its user. A shell repeatedly prints a prompt, waits for input from the user, and then executes the actions that were requested. The shell you implement in this project will be similar to, but much simpler than, the shell you use when you log in to a typical Unix system.
The input provided by the user takes the form of a sequence of command lines. A command line is a sequence of ASCII words separated by whitespace. The first word in the command line is either the name of a built-in command or the name of an executable file. The remaining words are command-line arguments.
Shells will usually execute the requested actions by running other programs. However, some actions simply cannot be executed by other programs. For these, the shell provides a set of commands known as built-in commands, and it processes these differently. You will implement these details later in the project.
Shells also offer other convenience features. For example, the user may want to capture the output of a program in a file, for later analysis. The user may also want to be able to run several child processes at the same time. You will implement both these features in this project.
At the end of this project, you will have a fully-functioning (albeit relatively simple) Unix shell.
Typographical Conventions
Vocabulary/terminology that you should know will be italicized.
Things that need to be emphasized will be in bold font.
Filenames, code, and terminal output (generally, anything you might expect to see in the terminal when working on this project) will be in teletype. Usually, if it's prefaced by unix>, this is something you should type in your regular shell, while things prefaced by utcsh> should be given as input to your project.
🐚 The shell emoji will be used when we wish to point out a difference between how this project works and how most real-world shells (e.g. bash, zsh) work.
Getting Started with Your Partner
Begin by talking with your partner and agree on how you will collaborate. If you need to work remotely, you can find some ideas in the remote collaboration guide.
Set up a repository on the UTCS GitLab server in accordance with the Git Instructions.
As you get started this week, please submit your answers to the questions in our Group Planning document this Friday evening. On the second Friday of the project, we will ask you to submit a Group Reflections document.
Getting Started with the Shell Project
We provide starter code for this project. Get it from the class web page either by running the following command from the command line:
unix> wget https://www.cs.utexas.edu/%7Eans/classes/cs439/projects/shell_project/shell_project.tar.gz
or by downloading it in your browser.
Put the file shell_project.tar.gz in the protected directory (the project directory) in which you plan to do your work. Then do the following:
- Type the command tar xvzf shell_project.tar.gz to expand the tar archive. Once the command has finished expanding the archive, you should see the following files:
Files:
    Makefile        # Compiles your shell program and runs the tests
    README.shell    # Used for submission
    # Files for Part 0
    fib.c           # Implement fibonacci here
    # Files for Part 1/2
    argprinter.c    # A test program which can be used to debug execv issues
    util.c          # Instructor-provided utility functions
    util.h          # The header file for util.c
    utcsh.c         # Implement your shell here
    tests           # A directory of tests for your shell
    examples        # A directory with example shell scripts in it
- Type the command make to ensure that your compiler can build the skeleton code.
- Fill out the requested information in README.shell, where applicable.
- Download and read over the questions in the design document.
- Log your time (now and every time you work!) in the Pair Programming Log.
Read this entire handout and consider the overall design of your shell before writing any code. To help you consider your design, please look over the questions in the design document. If you do not do this, you may discover that you need to rewrite major portions of your code as you progress!
Part 0: fork()/wait()
In this phase of the project, you will learn about the fork() and wait() system calls that you will use in the rest of the project.
Part 0.1: Reading
Sections 5.4 and 5.6 of OSTEP may be helpful to read before starting this project. You may also wish to consult the class resources (e.g. on C programming and shell usage) before starting.
Part 0.2: Fibonacci
Update fib.c so that if invoked on the command line with some integer argument n, where n is less than or equal to 13, it recursively computes the nth Fibonacci number.
To ensure that you learn about fork()/wait(), there are restrictions on how your fib program must be written. The full rules are below, but the general idea is that the program will be written in a recursive manner, and each recursive call must be made by a new process (i.e. by calling fork() followed by doFib()). The child then returns its result to the parent, which waits until the child is done.
Example outputs:
unix> fib 3
2
unix> fib 10
55
Your fib program must conform to the following rules:
- The output when given an argument n between 0 and 13 (inclusive) must be the n-th Fibonacci number.
- Fibonacci numbers must be computed recursively, with each recursive call occurring in a new process.
- The final result must be printed by the original process.
- You are allowed to modify the body of doFib().
- You must not modify the number of parameters or return values of doFib().
- You are allowed to create helper methods, as long as the computation of the Fibonacci number is done through creation of child processes.
- If given invalid arguments, your fib program should print the usage message and exit.
- Your program must not create a fork bomb when run. Any fork bomb behavior will result in a zero for this part of the project.
- Your output must exactly match the examples given above!
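The mechanism these rules rely on, a child computing a value and handing it back to the waiting parent through its exit status, can be sketched on a different problem. The example below is a minimal sketch, not the fib solution: `square_in_child` is a hypothetical helper that is not part of the starter code, and it makes a single fork() rather than a recursive chain.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: compute n * n in a child process and return the
 * result to the parent through the child's exit status.
 * Returns -1 if fork() or waitpid() fails or the child ends abnormally. */
int square_in_child(int n)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                 /* fork failed: no child was created */
    }
    if (pid == 0) {
        exit(n * n);               /* child: the exit status carries the result */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {   /* parent: block until the child is done */
        return -1;
    }
    if (!WIFEXITED(status)) {
        return -1;                 /* child was killed by a signal */
    }
    return WEXITSTATUS(status);    /* only the low 8 bits survive (0..255) */
}
```

An exit status holds only 8 bits, so values above 255 wrap around. That is consistent with the limit on n in this part: the 13th Fibonacci number is 233, while the 14th is 377.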
Part 1: Shell Skeleton
In this part of the assignment, you will start building the basic framework of your shell. At the end of this section, you should have a basic functioning shell framework, which you will extend and upgrade in future sections.
Your basic shell will be called utcsh. [1]
Note: Parts 1 and 2 will walk you through a recommended implementation order for the shell. You are not required to implement everything in this order; however, you must implement all functionality in both parts to receive full credit.
Remember to read this entire document before implementing anything!
Part 1.1: Reading
Before you implement anything, you should read this entire document, as well as the design document template. Yes, that is a lot of reading. Yes, you should do it anyways (at least skim the documents!).
No additional external reading is required, though if you have not worked with command line interfaces before, you may wish to read Ubuntu's command line tutorial.
You may also find it helpful to read the manpages for strtok, strcmp, and execv, though this will not be required until later.
Part 1.2: REPL
The core of any shell is the REPL, or the read-evaluate-print loop. This is a loop that does the following three actions repeatedly:
- Read input from the user (or from a script the user specifies).
- Evaluate the input, figuring out what the user wants to do and doing it.
- Print any output associated with the requested action.
Implement a REPL in the main() function in utcsh.c. Print utcsh> at the start of the line, then read the user's input.
For now, your REPL should ignore most inputs. The only command it will respond to is the built-in command exit, which will cause the shell to exit by calling exit(0). You should also call exit(0) if you reach the end-of-file while reading input.
For reading lines of input, you should use getline(). Use man getline to learn more about this function.
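The reading half of the loop can be sketched as follows. This is a minimal sketch under stated assumptions: `read_until_exit` is a hypothetical name, it takes the input stream as a parameter so it can be tried on a file, and it only counts lines where a real REPL would evaluate them.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes getline() under strict -std modes */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: read lines from `in` until "exit" or end-of-file.
 * Returns the number of lines consumed before stopping. A real REPL would
 * evaluate each line instead of just counting it. */
int read_until_exit(FILE *in)
{
    char *line = NULL;    /* getline() allocates and grows this buffer */
    size_t cap = 0;
    int count = 0;

    while (getline(&line, &cap, in) != -1) {
        line[strcspn(line, "\n")] = '\0';   /* drop the trailing newline */
        if (strcmp(line, "exit") == 0) {
            break;
        }
        count++;
    }

    free(line);           /* the same buffer is reused on every iteration */
    return count;
}
```

Because getline() reuses and resizes one buffer, a single free() after the loop is enough. Note that the buffer's contents are overwritten by the next call, so anything you need to keep must be copied first.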
🐚 An interesting point is that, in C, 0 is false and nonzero is true, while in the world of UNIX exit codes, 0 indicates success and nonzero indicates failure. This turns out to be very useful, but can be a bit hard to keep track of when you're initially learning how to work with shells.
Part 1.3: Parsing and Built-in Commands
Recall from the introduction that a command line consists of ASCII words separated by whitespace. Implement some way to split the command line so that you can recover these words, e.g. you should be able to immediately tell that word 4 of "path a b c d e" (counting from 0) is "d".
We recommend that you use strtok() for this. Read the manpage for this function carefully. Careless use of this function has been known to lead to many hours of debugging.
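One way to use strtok() for this is sketched below. The name `split_words` and its signature are hypothetical, chosen for illustration rather than taken from the starter code.

```c
#include <string.h>

/* Hypothetical helper: split `line` in place on spaces, tabs, and newlines.
 * Stores up to max_words pointers in words[] and returns how many were
 * found. strtok() overwrites delimiters with '\0', so `line` must be
 * writable (not a string literal) and must outlive words[]. */
int split_words(char *line, char *words[], int max_words)
{
    int n = 0;
    for (char *tok = strtok(line, " \t\n");
         tok != NULL && n < max_words;
         tok = strtok(NULL, " \t\n")) {
        words[n++] = tok;
    }
    return n;
}
```

Two properties of strtok() cause most of the debugging pain: it modifies its input, and it keeps hidden state between calls, so starting a second tokenization before the first one finishes silently abandons the first.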
Expand your shell's ability to process built-in commands by adding error checking to exit and implementing two new built-in commands:
- exit: You implemented this built-in in the last section. Now add error handling: It is an error to pass any arguments to exit. See the next section (1.4) for details about how to handle errors.
- cd: cd always takes exactly one argument (any other number is an error). This should call the chdir() system call with the user-supplied argument. If the chdir fails, that is also an error.
- path: the path command takes zero or more arguments, with each argument separated by whitespace from the others. A typical usage might look like this:
utcsh> path /bin /usr/bin
This command will be used in Part 2.1. For now, just worry about being able to separate the arguments of this command without crashing.
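As an illustration of the argument checking described for cd, here is a minimal sketch. `builtin_cd` is a hypothetical name, and reporting failure through a -1 return value (leaving the caller to print the error message) is an assumption of this sketch, not a requirement of the handout.

```c
#include <unistd.h>

/* Hypothetical helper: argv[0] is "cd" and argc counts argv[0] too.
 * Returns 0 on success, -1 if the argument count is wrong or chdir() fails. */
int builtin_cd(int argc, char *argv[])
{
    if (argc != 2) {
        return -1;            /* cd takes exactly one argument */
    }
    if (chdir(argv[1]) != 0) {
        return -1;            /* e.g. the directory does not exist */
    }
    return 0;
}
```

Built-ins like this one have to run inside the shell process itself: a chdir() performed in a forked child would change only the child's working directory.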
Part 1.4: Handling Errors
The previous section has our first encounter with errors. For ease of implementation, your shell will only ever have one error message: An error has occurred.
Whenever an error occurs, your shell should print the error message on stderr and continue. The only time your shell should exit in response to an error is described in Part 1.6. [2]
An example snippet for how to print the error is given below. If the snippet does not meet your needs, you should write your own function:
    char emsg[30] = "An error has occurred\n";
    ssize_t nbytes_written = write(STDERR_FILENO, emsg, strlen(emsg));
    if (nbytes_written != (ssize_t)strlen(emsg)) {
        exit(2); // Shouldn't really happen -- if it does, error is unrecoverable
    }
It is never acceptable to crash, segfault, or otherwise break the shell in response to bad user input. Your shell must always exit gracefully, i.e. by calling exit() or returning from main().
🐚 Of course, most real world shells implement a huge variety of error messages to help the user figure out where something went wrong.
Part 1.5: Executing External Commands
If the command given is not one of the three built-in commands, it should be treated as the path to an external executable program.
For these external commands, execute the program using the fork-and-exec method discussed in class. Here are some hints to help you out:
For the child process: The child process must execute the given command by using the execv() call. You may not call system() to run a command. Remember that if execv() returns, there was an error (usually caused by incorrect arguments or the file not existing).
For the parent process: The parent should use wait() or waitpid() to wait on the child. Note that the parent does not care about what happens to the child. As long as fork() succeeds, the parent considers the process launch to have been a success.
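The fork-and-exec pattern described in these hints can be sketched as follows. `launch_and_wait` is a hypothetical name, and the sketch assumes argv[0] already holds the full path of the program.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv[1..], where argv is
 * NULL-terminated. Returns 0 once the child has been started and reaped,
 * -1 if fork() or waitpid() itself fails. */
int launch_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                /* no child was created */
    }
    if (pid == 0) {
        execv(argv[0], argv);     /* on success this call never returns */
        exit(1);                  /* reached only if execv() failed; a real
                                     shell would print the error message first */
    }

    /* Parent: wait for the child, but do not inspect how it ended. */
    if (waitpid(pid, NULL, 0) != pid) {
        return -1;
    }
    return 0;
}
```

Note that the function returns 0 even when the program does not exist: the failure happens in the child after fork() has already succeeded, which matches the rule that the parent does not care what happens to the child.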
🐚 Typical shells will collect the exit code of the child to communicate information to the programmer. For example, the exit code of the diff program can tell you not just whether two files were the same, but how they differed. For simplicity, utcsh does not worry about this.
Part 1.6: Reading A Script
Sometimes, it is very annoying to have to type in commands one at a time. One common solution for this is to create a script by putting a related sequence of commands into a file and using the shell to run that file.
Implement a script system: if utcsh is invoked with one argument, instead of reading commands from stdin, it assumes that its argument is a filename and attempts to read commands one at a time from that file instead of from stdin.
You can find example scripts in the examples/ directory. say_hello.utcsh is the most basic script and consists of a bunch of external commands. There is also the more advanced say_hello_path.utcsh, which relies on the path feature (which you will implement in 2.1).
There are two other important changes when operating in script mode:
- The utcsh> prompt should not be printed in script mode.
- If the input file is invalid, or there is more than one argument, utcsh should print an error message and exit with an error code, i.e. call exit(1). This is the only situation in which an error should cause utcsh to exit.
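Choosing between interactive mode and script mode can be isolated in one small function. The sketch below is one possible arrangement; `choose_input` and its `interactive` out-parameter are hypothetical names, not part of the starter code.

```c
#include <stdio.h>

/* Hypothetical helper: pick the input stream from main()'s arguments.
 * Sets *interactive to 1 when commands come from stdin (so the prompt
 * should be printed) and to 0 in script mode.
 * Returns NULL if there are too many arguments or the file cannot be
 * opened; the caller should then print the error message and exit(1). */
FILE *choose_input(int argc, char *argv[], int *interactive)
{
    *interactive = 0;
    if (argc == 1) {
        *interactive = 1;
        return stdin;
    }
    if (argc != 2) {
        return NULL;
    }
    return fopen(argv[1], "r");   /* NULL if the script is missing or unreadable */
}
```

Since both modes yield a FILE *, the rest of the REPL can read from whichever stream was chosen without caring where the commands come from.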
Note that until you finish this section, you will not be able to run the automated test suite. Once you have finished this section, you may check Section 4 for details on running the tests.
🐚 To show you what a script looks like for bash, we've included two bash scripts in the examples directory. One does the same thing as the say_hello scripts and can be run with bash examples/say_hello.bash. The other can be run with bash examples/file_exists.bash <filename> and will tell you whether <filename> exists, and if so, if it is a regular file or a directory.
At this point, you have a basic shell that can run both built-in and external commands, both from a script and from stdin (keyboard input). For example, you should be able to run the say_hello.utcsh script in the examples directory.
Now might be a good time to make a git commit, if you haven't done so already!
Part 2: Advanced Shell Features
Part 2.1: Paths
When you implemented external program execution, you assumed that the 0-th argument was the path to an executable file. Unfortunately, this is annoying for users, because nobody wants to type /usr/local/bin/ls every time they want to run the ls command.
The solution to this is a PATH: a set of user-specified directories to search for external programs. When the shell is given a command it does not recognize, it looks for this program in its PATH.
Note that, for the rest of this document, path will refer to a string with slashes in it which is used to locate a file, while PATH will be used to refer to a list of paths used to search for binary files. [3]
If the program you're given is not an absolute path, i.e. a path which starts from /, you should search for your program in each directory in the PATH. For example, if your PATH is "/bin" "/usr/bin", you would search for /bin/ls and /usr/bin/ls, executing the first one you found (and returning an error if neither exists). You can check that the file exists and is executable using the functions we provide in the skeleton code. If the file does not exist, or it is not executable, this is an error.
The user can set the PATH with the path command. Each argument to the path command corresponds to an entry in the shell's PATH. The path command completely overwrites the existing PATH; it does not append entries. If the PATH is empty because the user executed a path command with no arguments, utcsh cannot execute any external programs unless the full path to the program is provided.
A variable for the PATH is already provided for you in the skeleton code, called shell_paths. You can manipulate this variable directly, or by using the helper functions in util.c/util.h.
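The search itself can be sketched as below. Several things here are assumptions made for illustration: `resolve_command` is a hypothetical name, the directory list is passed in as a NULL-terminated array rather than read from shell_paths (whose exact type is not shown in this handout), and access() stands in for the skeleton's checking functions. Be aware that access() with X_OK also succeeds for directories, which the provided helpers may handle differently.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: resolve `name` to an executable path.
 * `dirs` is a NULL-terminated list of directories.
 * On success writes the path into out[0..outsz) and returns 0; else -1. */
int resolve_command(const char *name, char *const dirs[], char *out, size_t outsz)
{
    if (name[0] == '/') {                       /* absolute path: no search */
        int n = snprintf(out, outsz, "%s", name);
        if (n < 0 || (size_t)n >= outsz) {
            return -1;
        }
        return access(out, X_OK) == 0 ? 0 : -1;
    }
    for (int i = 0; dirs[i] != NULL; i++) {     /* first match wins */
        int n = snprintf(out, outsz, "%s/%s", dirs[i], name);
        if (n < 0 || (size_t)n >= outsz) {
            continue;                           /* candidate would not fit */
        }
        if (access(out, X_OK) == 0) {
            return 0;
        }
    }
    return -1;                                  /* empty PATH or not found */
}
```

With an empty list the loop body never runs, so only absolute paths can succeed, which is the behavior described above for an empty PATH.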
🐚 Real shells also let you specify relative paths to programs, e.g. you can type bin/myprog to run a program relative to your current working directory. You do not need to worry about this for utcsh: the program name will either be an absolute path or the name of a program to be searched for in shell_paths.
Reminder: the shell itself does not implement ls or any other program; it simply looks them up in the PATH and executes them.
Part 2.2: Redirection
Many times, a shell user prefers to send the output of a program to a file rather than to the screen. Usually, a shell provides this nice feature with the > character. Formally this is called redirection of output. Your shell should include this feature.
For example, if a user types ls -al /tmp > output, nothing should be printed to the screen. Instead, the standard output and standard error of the program should be rerouted to the file output.
If the output file already exists, you should overwrite and truncate it. Look through the flags in man 2 open to find out how to do this.
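A minimal sketch of the redirection mechanics follows, assuming the command and the output file have already been parsed. `run_redirected` is a hypothetical name, and the file mode 0644 is an assumption of this sketch; the handout does not specify one.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv (NULL-terminated, argv[0] is a full path)
 * with stdout and stderr both sent to `outfile`.
 * Returns 0 if the child was started and reaped, -1 otherwise. */
int run_redirected(char *const argv[], const char *outfile)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: create the file if needed, truncate it if it exists. */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            exit(1);
        }
        if (dup2(fd, STDOUT_FILENO) < 0 || dup2(fd, STDERR_FILENO) < 0) {
            exit(1);
        }
        close(fd);                 /* fds 1 and 2 now refer to the file */
        execv(argv[0], argv);
        exit(1);                   /* reached only if execv() failed */
    }
    return waitpid(pid, NULL, 0) == pid ? 0 : -1;
}
```

The open() and dup2() calls happen in the child only, so the shell's own stdout and stderr are untouched and the next prompt still appears on the screen.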
Here are some rules about the redirection operator:
- Multiple redirects in a command are an error, e.g. ls > file1 > file2.
- A redirect without a corresponding command is an error, e.g. > file1.
- A redirect without a corresponding file is an error, e.g. ls >.
- There will always be spaces around a redirect, e.g. ls>file1 is requesting command execution of a file called ls>file1, not a redirection.
- You do not need to worry about redirection for built-in commands, e.g. we will not test what happens when you type path /bin > file.
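These rules can be checked on an already-tokenized command before anything is executed. The sketch below uses a hypothetical helper, `find_redirect`. Treating extra words after the file name (e.g. ls > a b) as an error is an assumption of this sketch, since the rules above do not cover that case.

```c
#include <string.h>

/* Hypothetical helper: look for a ">" token in words[0..n).
 * On success returns the number of words that belong to the command and
 * sets *outfile to the target file, or to NULL if there is no redirect.
 * Returns -1 for the error cases listed in the rules. */
int find_redirect(char *words[], int n, char **outfile)
{
    *outfile = NULL;
    for (int i = 0; i < n; i++) {
        if (strcmp(words[i], ">") != 0) {
            continue;
        }
        if (i == 0) {
            return -1;                    /* "> file": no command */
        }
        if (i != n - 2) {
            return -1;                    /* no file, extra words, or a second ">" */
        }
        if (strcmp(words[i + 1], ">") == 0) {
            return -1;                    /* "ls > >": the file is missing */
        }
        *outfile = words[i + 1];
        return i;
    }
    return n;                             /* no redirect present */
}
```

Because only a token that is exactly ">" counts, a word such as ls>file1 passes through as an ordinary command name, as the rules require.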
🐚 Real shells usually allow multiple redirects and redirect stdout and stderr separately, and allow you to redirect them to each other, e.g. you can direct stdout into stderr.
Part 2.3: Concurrent Commands
Your shell will allow the user to launch concurrent commands. Remember: when two things are concurrent, they appear to execute at the same time whether they actually run simultaneously or not (logical parallelism). In UTCSH, this is accomplished with the ampersand operator:
utcsh> cmd1 & cmd2 & cmd3 args1
Instead of running cmd1, waiting for it to finish, and then running cmd2, your shell should run cmd1, cmd2, and cmd3 (with whatever args were passed) before waiting for any of them to complete.
Then, once all processes have been started, you must use wait() or waitpid() to make sure that all processes have completed before moving on.
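The two phases, start everything and then wait for everything, can be sketched as follows. `run_all` is a hypothetical name, and the sketch assumes each command is already a NULL-terminated argv with a full path in position 0.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: cmds[i] is a NULL-terminated argv for command i.
 * Starts every command first, then waits for all of them.
 * Returns the number of children that were started. */
int run_all(char *const *cmds[], int ncmds)
{
    int started = 0;

    for (int i = 0; i < ncmds; i++) {        /* phase 1: launch, do not wait */
        pid_t pid = fork();
        if (pid < 0) {
            continue;                        /* a real shell would report this */
        }
        if (pid == 0) {
            execv(cmds[i][0], cmds[i]);
            exit(1);                         /* reached only if execv() failed */
        }
        started++;
    }

    for (int i = 0; i < started; i++) {      /* phase 2: reap every child */
        if (wait(NULL) < 0) {
            break;                           /* no children left to wait for */
        }
    }
    return started;
}
```

Calling wait() inside the first loop would turn this back into sequential execution. Using wait() rather than waitpid() here reaps children in whatever order they finish; storing the pids and calling waitpid() on each is an equally valid design.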
Each individual command may optionally have its own redirection, e.g.
utcsh> cmd1 > file1 & cmd2 arg1 arg2 > file2 & cmd3 > file3
Unlike the redirection operator, the ampersand operator might not have spaces around it. For example cmd1 arg1&cmd2 > file2 is a valid command line, and requests the execution of two commands. In addition, some or all of the commands on either side of the ampersand may be blank. This means that, for example, &&&&&&& is a valid command line.
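Splitting on & and then on whitespace means running one tokenization inside another, which plain strtok() cannot do because it has only one hidden position. One possible approach, sketched here with the hypothetical helper `command_names`, is the reentrant strtok_r(), which keeps its position in a variable the caller supplies.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes strtok_r() under strict -std modes */
#include <string.h>

/* Hypothetical helper: split `line` in place on '&', then take the first
 * word of each non-blank command. Stores up to max_names pointers in
 * names[] and returns how many were stored. Blank commands (as in "&&&")
 * contribute nothing. */
int command_names(char *line, char *names[], int max_names)
{
    int n = 0;
    char *outer = NULL;                       /* position of the '&' split */

    for (char *cmd = strtok_r(line, "&\n", &outer);
         cmd != NULL && n < max_names;
         cmd = strtok_r(NULL, "&\n", &outer)) {
        char *inner = NULL;                   /* position of the word split */
        char *first = strtok_r(cmd, " \t", &inner);
        if (first != NULL) {
            names[n++] = first;               /* a real shell would keep all words */
        }
    }
    return n;
}
```

A two-pass design, where all & splitting finishes before any whitespace splitting starts, avoids the nesting problem as well and works with strtok() alone.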
As you process these commands, there are a number of special cases to consider. In doing so, you may assume the following:
- If a command line has multiple concurrent commands that are all external, the current spec applies.
- If a command line has multiple concurrent commands that are all built-in, the shell should execute them sequentially from left-to-right.
- You may assume that we will not test command lines that have mixed concurrent external/internal built-in commands. Your shell should not crash if this happens, but otherwise, there are no requirements on what it must do.
🐚 In most bash-like shells, & is actually appended to the end of a command to instruct it to run in the background. You can search for Bash Job Control if you want to learn more, but don't try to use this syntax in your actual shell to run jobs in parallel, or weird things might happen!
Part 3: Hints
General Hints
- Remember that C does not have strings in the same way that Java does; you will need to be careful about string handling. You can read more about string handling in these notes. For a refresher on general C principles, please see our guide to C basics.
- Always, always check the return codes of all system calls, from the very beginning of your work. This will often catch mistakes in how you're using these functions.
- USE GIT. Make a commit every time you have working code, or have implemented a small part of a milestone, not just when you need to get the current version to your partner. Committing more frequently lets you try things out much more easily, since it's easy to revert changes if you screw things up.
- For more information on using git, check out our guide to version control and git. In this class, we'll be using the UTCS GitLab server. For help getting started, check out our guide to getting started with UTCS GitLab.
- You are allowed to modify any file you want in this project. You don't need to ask for permission, or leave a note for us, or say anything on Canvas, just do it! However, note that we use our own copy of tests (any files present under the tests directory), so any changes you make to the test cases or testing framework will be reverted before we grade.
- Errors can be debugged with printf and gdb. For general debugging help, check out our debugging FAQ for CS 439.
Hints for Part 0
- The waitpid(), fork(), and exit functions will come in handy. Use man to learn about them. Remember, you can use man man to learn about man.
- The WEXITSTATUS macro described in the waitpid manpage may be useful.
Hints for Part 1
- A real concern in any text processing program in C is how much memory to allocate for text handling. In order to simplify your shell implementation, you are allowed to limit the size of your inputs according to the macros defined in util.h. When interpreting these limits, remember that a command line may consist of multiple commands. If the input violates these limits, you may print the error message and continue processing. Do not crash the shell if these limits are violated. It is possible to write the shell in a way that these limits are not needed, but it is slightly more challenging.
- Think carefully about how you design your tokenization routines. Right now, you only have to deal with one command. In Part 2, you're going to deal with multiple commands, possibly each with their own redirects, and each of which can error independently of the others. Make sure your design can grow to accommodate this. A good basic design is to allocate an array of char*, then use strtok to fill it up one element at a time. At the end of this procedure, array[0] should be the 0-th argument, array[1] should be the 1st, and so on. You should then store this information in a way that allows multiple copies (i.e. not in a global structure, which tends to be a bad idea anyways).
- Be extremely careful about doing a == b or a = b when a and b are char*. This likely does not do what you think it does. In order to do the operations, look into strcmp(), strcpy(), and strncpy() in string.h.
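The pointer-comparison pitfall from the last hint is easy to see in a small example. `same_text` below is a hypothetical wrapper written only to make the contrast explicit.

```c
#include <string.h>

/* Returns 1 if the two C strings hold the same characters, 0 otherwise.
 * `a == b` would instead ask whether both pointers hold the same address. */
int same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Two separate buffers that both contain "exit" live at different addresses, so comparing them with == reports a mismatch even though same_text() reports a match. Likewise, a = b copies the pointer, not the characters, so both names end up referring to the same buffer.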
Hints for Part 2
- Output redirection can be achieved by using a combination of open() and dup2(). Check the man pages for more details. You should make these calls after the child process has forked, but before the call to execv(). Check https://www.cs.utexas.edu/~theksong/2020/243/Using-dup2-to-redirect-output/ for an example of dup2() usage.
- If you've been using the recommended functions so far, you might need to add a new function or data structure to make concurrent commands work easily.
- Don't be afraid to modify the skeleton code. We are not checking that your skeleton code is unmodified, we're checking that your program works and is well-written. If you have to modify, add to, or remove from the skeleton to achieve this, do it!
Line Count Hints
We are providing the rough number of lines of code used in the reference solution as a rough hint for you, so you can see how much work is needed for each function. These numbers have been rounded to the nearest multiple of 10.
| Function | Lines of Code |
|---|---|
| tokenize_command_line | 50 lines |
| parse_command | 60 lines |
| eval | 60 lines |
| try_exec_builtin | 60 lines |
| exec_external_cmd | 30 lines |
| main | 50 lines |
Part 4: Checking Your Work
To help you check your work, we've provided a small test suite, along with some tools to help you run it.
Each test in the test suite will check three things from your shell:
- The exit code
- The standard output
- The standard error
If any of these differ, the test suite will print an error and tell you what part of the output was wrong, along with commands you can run to see the difference.
In order to make this easier on you, we've included some helper rules in the Makefile to let you run tests easily.
- To run the full test suite, run make check.
- To run an individual test, run make testcase id=#, e.g. make testcase id=15.
- To get a description of a test, run make describe id=#. This can be useful if you're not sure what a test does or want to get commands to run it yourself (e.g. to run it under a debugger).
No test should run for more than 10 seconds without either passing or failing. If your test runs for longer than this, you likely have an infinite loop in your code.
In general, you should not look directly at the test files themselves unless you want to understand the test suite or modify the tests. If you want to run the command that the test runs, use make describe.
Part 5: On Programming and Logistics
Makefile
Your code will be tested and graded with the output of make utcsh or make (the two rules are equivalent in the provided Makefile). To aid you in debugging, two additional rules have been created in the makefile:
- make debug will create a binary which is not heavily-optimized and has more debugging information than the default build. If you want to feed your program into a debugger like gdb, valgrind, or rr, you should use this rule to generate it.
- make asan will create a binary with sanitizers. Think of these as extra error checking code that the compiler adds to the program for you. When you run a program that has been compiled with sanitizers, the binary itself will warn you about memory leaks, invalid pointer dereferences, and other such issues. We do not enable the sanitizers by default because they can turn an otherwise-correct program into an incorrect one, e.g. if your program is correct except for a small memory leak, the sanitized binary will still exit with an error.
You may use these rules to quickly generate programs for debugging, but keep in mind that your grade will be based on the binary generated by make utcsh.
Design Document
As part of this project, you will submit a design document, where you will describe your design to us. Please note that this document is a set of questions that you will answer and is not free form. Your group will submit one design document.
General
- You must work in two-person teams on this project. Failure to do so will result in a 0 for the project. Once you have contacted your assigned partner, do the following:
- exchange first and last names, EIDs, and CS logins
- fill out the README.shell distributed with the project
- register in Canvas as a Shell Group. (Add yourselves to an empty group of your choosing. Feel free to change the name to something more creative! Keep it clean.)
- Create a private GitLab repo and invite your partner at maintainer level or higher. See our git and version control guide for details on how to do this.
You must follow the pair programming guidelines set forth for this class. If you have not registered a group by the group registration deadline, we will assign you to a group. Please see the Grading Criteria to understand how failure to follow the pair programming guidelines OR fill out the README.shell will affect your grade.
- You must follow the guidelines laid out in the C Style Guide or you will lose points. This includes selecting reasonable names for your files and variables.
- This project will be graded on the UTCS public linux machines. Although you are welcome to do testing and development on any platform you like, we cannot assist you in setting up other environments, and you must test and do final debugging on the UTCS public linux machines. The statement "It worked on my machine" will not be considered in the grading process.
- The execution of your solution shell will be evaluated using the test cases that are included in your project directory. To receive credit for the test cases, your shell should pass the provided test cases, as determined by make clean && make utcsh && make check.
- Your code must compile without any additions or adjustments, or you will receive a 0 for the test cases portion of your grade.
- Do not use _exit() for this assignment; use exit() instead.
- You are encouraged to not use linux.cs.utexas.edu for development. Instead, please find another option using the department's list of public UNIX hosts.
- You are encouraged to reuse your own code that you might have developed in previous courses to handle things such as queues, sorting, etc. You are also encouraged to use code provided by a public library such as the GNU library.
- You may not look at the written work of any student other than your partner. This includes, for example, looking at another student's screen to help them debug or looking at another student's print-out. See the syllabus for additional details.
- If you find that the problem is underspecified, please make reasonable assumptions and document them in the README.shell file. Any clarifications or revisions to the assignment will be posted to EdStem.
Submitting Your Work
- After you finish your code, use make turnin to submit a compressed tarball named turnin.tar.gz for submission. It may be a good idea to unpack this tarball into a clean directory on a UTCS linux system to make sure it still compiles. You should then upload the file to the Project 0 Test Cases assignment on Canvas. Make sure you have included the necessary information in the README.shell and placed your pair programming log in the project directory.
- Once you have completed your design document, please submit it to the Project 0 Design and Documentation assignment in Canvas. Make sure you have included your name, CS login, and UT EID in the design document. The purpose of the design document is to explain and defend your design to us. Its grade will reflect both your answers to the questions and the correctness and completeness of the implementation of your design. It is possible to receive partial credit for speculating on the design of portions you do not implement, but your grade will be reduced due to the lack of implementation.
Grading
Code will be evaluated based on its correctness, clarity, and elegance according to the Grading Criteria. Strive for simplicity. Think before you code.
The most important factor in grading your code design and documentation will be code inspection and evaluation of the descriptions in the write-ups. Remember, if your code does not follow the standards, it is wrong. If your code is not clear and easy to understand, it is wrong.
Footnotes
Project adapted from one used in OSTEP. Many thanks to the Drs. Arpaci-Dusseau for permission to use their work.
[1]: This is both an homage to the UTCS department and a play on the name of the popular tcsh shell. ↩
[2]: Note that we check the return value of the write call in spite of the fact that all we can do if it's wrong is exit. This is good programming practice, and you should be sure to always check the return codes of any system or library call that you make. ↩
[3]: Sometimes you hear PATH referred to as the path, but in most real-world contexts, you will need to deduce which one is meant from context. ↩
