[Solved] CSC230 Exercise 9- Parsing File Input

Parsing File Input

For this exercise, you're going to write a program that reads input from a file and writes a report to standard output. You'll find a partial implementation, fields.c, along with a sample input file and expected output files on the course homepage. You can also download these files using the following curl commands.

curl -O https://www.csc2.ncsu.edu/courses/csc230/exercise/exercise09/fields.c
curl -O https://www.csc2.ncsu.edu/courses/csc230/exercise/exercise09/input.txt
curl -O https://www.csc2.ncsu.edu/courses/csc230/exercise/exercise09/expected.txt
curl -O https://www.csc2.ncsu.edu/courses/csc230/exercise/exercise09/expected-stderr.txt

Input Format

Your program will use fopen() to open an input file named input.txt for reading.
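As a minimal sketch of that first step, the helper below opens a file for reading and reports a failure instead of continuing with a null pointer. The function name open_input is made up for illustration; the partial implementation in fields.c may already handle this differently.

```c
#include <stdio.h>

/* Hypothetical helper (not part of fields.c): open a file for reading.
   Returns NULL on failure after printing the reason to standard error. */
FILE *open_input(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        perror(path);   /* message is prefixed with the file name */
    }
    return fp;
}
```

A caller would typically do something like FILE *fp = open_input("input.txt"); and exit with a failure status if the result is NULL.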

You'll read lines from the input file using fscanf(), taking advantage of the format string syntax we went over in class (e.g., field widths, character classes, literal characters, skipping whitespace). I just made up the format for this input file, trying to include lots of opportunities for using parts of the scanf() format string syntax. Each valid line from input.txt is formatted like the following example:

Venus Jensen 33770530841 [email protected] FRNO 624-771-4676 SIJ SBE WHV TVW

  • The first field is a person's first name, consisting of at most 11 characters, and not containing any spaces. You can parse this like a string, using the field width to prevent buffer overflow.
  • The next field is a person's last name, also consisting of at most 11 characters, and not containing any spaces.
  • The third field is a sequence of 1 or more decimal digits. You'll need to read past this field, but it's not used in the output, so you can parse over it without saving its value anywhere.

Remember, the scanf() format string supports a syntax for parsing a conversion specification, but not saving it anywhere. Use that syntax to parse this sequence of digits but not save it.

  • The fourth field is an email address, consisting of a string of up to 8 lower-case letters (an ID, like your unity ID), followed by an @ sign, then a sequence of lower case letters and dots. You only need the part before the @ sign, so you can save just that and skip over the rest.

Remember, the scanf() format string supports a character class syntax for matching a string consisting of any set of characters you want. Use this syntax to match everything up to (but not including) the @ sign. Then, tell scanf() to skip over the rest of the email address, without saving it anywhere.

Remember that literal characters in the format string can be used to match copies of themselves in the input. You could use this to match the @ in the email address, then use a character class to match the rest of the email address, without saving it anywhere.
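The three ideas above (assignment suppression, character classes, and literal characters) can be tried out on a string with sscanf(), which accepts the same format syntax as fscanf(). This is only a sketch: the helper name parse_id and the address used to exercise it are made up, and ranges such as a-z inside a character class are implementation-defined in the C standard, although common libraries treat them as ranges.

```c
#include <stdio.h>

/* Hypothetical helper: skip a run of digits, then save the part of an
   email address before the '@'. id needs room for 8 letters plus the
   terminator. Returns 1 if the whole pattern matched, 0 otherwise. */
int parse_id(const char *text, char id[9])
{
    int end = -1;

    /* %*[0-9]   one or more digits, matched but not stored
       %8[a-z]   up to 8 lower-case letters, stored in id
       @         a literal '@' that must appear in the input
       %*[a-z.]  letters and dots after the '@', matched but not stored
       %n        stores the number of characters consumed so far; it is
                 only reached if everything before it matched */
    int matched = sscanf(text, " %*[0-9] %8[a-z]@%*[a-z.]%n", id, &end);

    /* Suppressed conversions and %n do not count toward the return value,
       so a full match reports exactly one assignment. */
    return matched == 1 && end >= 0;
}
```

For example, parse_id("33770530841 vbjensen@example.com", id) leaves "vbjensen" in id. The %n check is one way to notice that the tail of the pattern failed, since the return value alone cannot tell you whether the suppressed parts matched.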

  • The fifth field is a code consisting of capital letters. You don't need to report this field in the output, so your format string can parse it without saving it anywhere.
  • The sixth field is a phone number, consisting of three decimal digits, followed by a dash, then three more decimal digits, then another dash, then four more decimal digits. The phone number may or may not have spaces on either side of the dash characters (either way is fine). In the output, you'll need to report the phone number in a slightly different format, so you'll want to save the three numeric parts to three different strings, skipping over the dash characters and possible spaces in between.
  • After the phone number, there's a sequence of zero or more codes, each consisting of capital letters.

You don't need these codes in the output, so you can just skip over all of them (up to the end-of-line).

Fields in the input are separated by one or more spaces.
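Putting the field descriptions together, one possible format string looks like the sketch below. It uses sscanf() on a string so it is easy to experiment with; fscanf() takes the same format. The Person struct, the function name parse_person, and the example.com address in the usage note are all made up for illustration, and this is not necessarily how the instructor's solution is written.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding only the fields the report needs. */
typedef struct {
    char first[12];   /* at most 11 characters plus terminator */
    char last[12];
    char id[9];       /* at most 8 letters plus terminator */
    char area[4];     /* the three parts of the phone number */
    char prefix[4];
    char line[5];
} Person;

/* Parse the fields at the start of one input line.
   Returns 1 if the line matches the format, 0 otherwise.
   Any trailing codes are simply left unparsed. */
int parse_person(const char *text, Person *p)
{
    /* A space in the format matches any amount of whitespace, which is
       what separates the fields. %[ does not skip whitespace on its own,
       so each character class is preceded by a space. */
    int n = sscanf(text,
        " %11s %11s"                      /* first and last name        */
        " %*[0-9]"                        /* digits, discarded          */
        " %8[a-z]@%*[a-z.]"               /* ID saved, rest discarded   */
        " %*[A-Z]"                        /* capital code, discarded    */
        " %3[0-9] - %3[0-9] - %4[0-9]",   /* optional spaces near '-'   */
        p->first, p->last, p->id, p->area, p->prefix, p->line);

    /* A field width is only an upper bound, so check the lengths too. */
    return n == 6 && strlen(p->area) == 3
                  && strlen(p->prefix) == 3
                  && strlen(p->line) == 4;
}
```

Called on a line such as "Venus Jensen 33770530841 vbjensen@example.com FRNO 624-771-4676 SIJ SBE WHV TVW", this fills in Venus, Jensen, vbjensen, 624, 771 and 4676. One caveat when the same format is used with fscanf() directly on the file: whitespace in the format also matches newlines, so a line with too few fields can cause parsing to run on into the next line.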

Output Format

Optional (everything below about dealing with invalid input is not required): If an input line isn't in the format described above, it's considered an invalid line. For each of these, we'll print a line with the following error message to standard error, then skip over that line of input and continue processing the remaining lines. Your error message should look like the following, where the number at the end is the input line number where the error occurred (counting from 1). To report this message, you'll have to count input lines as you process them. The partial implementation already has a variable for doing this.

Invalid input, line 7

For each valid line of input, your program will print an output line like the following.

vbjensen       Venus      Jensen (624)771-4676

This gives the ID of the person, right-justified in an 8-column field. This is followed by the first name and the last name, each printed right-justified in an 11-column field. Finally, the phone number is printed as shown above, with parentheses around the area code and a dash between the last two parts of the phone number (called the prefix and the line number, I think). Fields in the output are separated by a single space (so, for example, there's a space right after the ID, before the start of the 11-column field containing the first name).
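The layout described above maps directly onto printf() field widths, since a plain width such as %8s right-justifies by default. The sketch below uses snprintf() into a buffer so the result can be inspected; printf() accepts the same format string. The helper name format_report and its parameter names are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: build one report line in buf (size bytes).
   Returns what snprintf() returns: the length the line would have. */
int format_report(char *buf, size_t size,
                  const char *id, const char *first, const char *last,
                  const char *area, const char *prefix, const char *line)
{
    /* %8s and %11s pad on the left, giving right-justified columns.
       The single spaces between conversions are the field separators. */
    return snprintf(buf, size, "%8s %11s %11s (%s)%s-%s\n",
                    id, first, last, area, prefix, line);
}
```

With the arguments "yhhart", "Young", "Hart", "608", "370", "2757", the buffer starts with two spaces of padding before yhhart, because the ID is only six characters wide.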

Skipping Lines

You're going to need to write code to skip input lines. For a valid input line, you can use this to skip the sequence of arbitrary codes at the end of the line. Optional: For an invalid line, this is even more important, since parsing will stop as soon as fscanf() encounters input that's inconsistent with your format string. You'll need to discard the rest of the bad input line, just to move ahead to the next line in the input.

Note to class: Instead of the approach below, it's fine if you simply call fgetc() on the input file (the counterpart of getchar() for a file opened with fopen()) and discard the characters read until the program reaches (and discards) the '\n' at the end of the line; that's way simpler than using fscanf().

Skipping input lines is easy, with a little help from fscanf(). I'd like you to figure out some of this yourself, but here are some thoughts to get you started. Remember, you can define a character class using ^ to tell fscanf() which characters not to match (rather than which characters to match). With this, you could easily make a format string that says to parse everything up to the newline.
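The character-at-a-time alternative from the note to class could look like the sketch below. It uses fgetc() because the input comes from a FILE * rather than standard input; the function name skip_line is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: discard characters up to and including the next
   newline. Returns 1 if a newline was consumed, 0 if end-of-file was
   reached first. */
int skip_line(FILE *fp)
{
    int ch;   /* int rather than char, so EOF is distinguishable */

    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n') {
            return 1;
        }
    }
    return 0;
}
```

Because it stops right after the first newline it sees, a following blank line is left in the input rather than being swallowed.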

After you've told fscanf() to parse and discard everything up to (but not including) the newline, you still have to tell it to read past the newline itself. There are a few ways to do this, but remember that something like fscanf() with a format string of "\n" may not do exactly what you want. Any whitespace (including a newline) in the format string tells fscanf() to skip any amount of whitespace in the input. So, fscanf() with a "\n" format would skip any number of blank lines. Instead, maybe think of it like this: If you've already read everything in the current line, up to the newline character at the end, you should just have to read one more character to finish reading the line.

In my solution, I handled this with two calls to fscanf(), one to discard all the characters remaining on the current line, then one more call to read the newline character at the end. If you try to do both of these things in the same call to fscanf(), you could run into problems with blank input lines or cases when you've already read everything up to the newline (so, only the newline remains).
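One way the two-call idea might be written is sketched below. This is a guess at the general shape, not the instructor's actual code, and the function name skip_line_scanf is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: skip the rest of the current line using two
   fscanf() calls. Returns 1 if a character (the newline) was consumed
   by the second call, 0 if end-of-file was reached first. */
int skip_line_scanf(FILE *fp)
{
    /* First call: discard everything that is not a newline. If only the
       newline remains, this is a matching failure (return value 0), which
       is harmless. EOF here means there was no input left at all. */
    if (fscanf(fp, "%*[^\n]") == EOF) {
        return 0;
    }

    /* Second call: read exactly one more character and discard it.
       Unlike a "\n" in the format, %*c never skips extra whitespace,
       so following blank lines stay in the input. */
    return fscanf(fp, "%*c") != EOF;
}
```

Combining both conversions into a single format such as "%*[^\n]%*c" runs into the problem described above: on a blank line the first conversion fails to match, fscanf() stops there, and the newline is never consumed.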

Sample Execution

When your program is done, you should be able to run it as follows.

$ ./fields
  yhhart       Young        Hart (608)370-2757
vbjensen       Venus      Jensen (624)771-4676
jbnorris      Jasmin      Norris (582)502-1033
mbblackw     Micheal   Blackwell (497)640-3345
  auford       Abram        Ford (729)740-2360
rroconne   Rigoberto    Oconnell (396)554-1359
Invalid input, line 7
krhester     Kristin      Hester (508)026-8319
Invalid input, line 9
mzchurch     Monique      Church (760)516-6023

If you want to make sure your program is producing exactly the right output, you can capture your standard output and standard error streams to two different files, then compare what you got against what we're expecting:

$ ./fields > output.txt 2> stderr.txt

$ diff output.txt expected.txt

$ diff stderr.txt expected-stderr.txt
