, , ,

[SOLVED] Cs/coe 1501 projects 1 to 5 solution

$25

File Name: Cs_coe_1501_projects_1_to_5_solution.zip
File Size: 339.12 KB

5/5 - (1 vote)

Goal:
To gain a better understanding of search algorithms and symbol tables by
implementing an autocompletion engine.

## Background:
In order to make typing on a mobile device quick and easy, an autocomplete
feature is commonly implemented to try to guess what word a user wishes to type
before they are finished. Such a feature requires extensive use of search
algorithms.

## Specifications:
* First, download a copy of the English dictionary that you will use for this
project from [here](https://people.cs.pitt.edu/~nlf4/cs1501/handouts/dictionary.txt)
and place it in your repository folder
* **Do not** add this file to your git repository, however! (i.e., do not run
`git add dictionary.txt`).
* You must implement a De La Briandais (DLB) trie data structure (as described
in lecture) to use in your project.
* When your program is run, it should first create a new DLB trie and add all
of the words in `dictionary.txt` to that trie (this will be the dictionary
trie).
* Once the dictionary trie is established, you should prompt the user
to start typing a word. For this project, you will be writing only a simple
program to test autocomplete functionality. Due to complexities with
gathering single character input in Java, you should accept a single character at a time, each followed by `Enter`. After each character, present the user with the list of predictions of the word that they are trying to type.
If the user types a number (1-5), you should consider the user to have selected
the corresponding prediction, and restart the process for a new word. Consider `$` to be a terminator character; if the user enters a
`$`, consider the characters input so far to be the full word as intended (regardless of suggestions). Finally, if
the user enters a `!` at any point, your program should exit.
* To generate the list of predictions, your program should not only consult
the dictionary trie, but also keep track of what words the user has entered
in the past. If the user has previously entered the same sequence of
characters as a prefix to a word, you should prioritize the words that most frequently resulted from this sequence
previously. If the user has never entered the current sequence before, or
has entered fewer than 5 words with the current seequence as a prefix
(i.e., not enough words to complete the list of 5 predictions), your program
should suggest words from dictionary.txt that have the current sequence as a
prefix.
* You program should propose at most 5 suggestions at a time. If there are
fewer than 5 suggestions available from the user’s history and the dictionary
trie, then only print out the available suggestions.
* If the current sequence of characters has not been entered by the user
before and does not appear in `dictionary.txt`, you should display a
message to the user stating that no predicions were found, and allow
the user to continue entering characters one at a time. Once the user
enters a “$”, you should consider the word to be finished and add it
to the user’s history so that you can predict it in the future.
* Your program should run in a case sensitive manner in order to allow for
easier prediction of proper nouns and acronyms.
* The design of the data structure that keeps track of a user’s previously
entered words is entirely up to you. You must create a file named
`approach.txt` that both describes your approach to implementing this symbol
table and justifies your decision to take this approach. Note that this file
does not need to be extensive, just a few lines so the TA is aware of what to
look for in your code and why you chose this approach.
* The history of the user’s entered words should persist across runs of your
program. To enable this, your program should save a representation of
this data structure to the file `user_history.txt` before exiting.
* Do not add this file to your git repository! (i.e., do not run `git
add user_history.txt`) It should be generated by your program
if it does not exist.
* Each time the user enters a character, you should use Java’s
[`System.nanoTime()`]
(https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#nanoTime–)
to calculate how long your program takes to find the predictions. You should
display this time along with the list of predictions.
* After the user enters `!`, your program should output the average time
that was required to produce a list of predictions.

An example run of the program would proceed as follows:

“`
Enter your first character: t

(0.000251 s)
Predictions:
(1) t (2) ta (3) tab (4) tab’s (5) tabbed

Enter the next character: h

(0.000159 s)
Predictions:
(1) thalami (2) thalamus (3) thalamus’s (4) thalidomide (5) thalidomide’s

Enter the next character: e

(0.000052 s)
Predictions:
(1) the (2) theater (3) theater’s (4) theatergoer (5) theatergoer’s

Enter the next character: r

(0.000225 s)
Predictions:
(1) therapeutic (2) therapeutically (3) therapeutics (4) therapeutics’s (5) therapies

Enter the next character: e

(0.000182 s)
Predictions:
(1) there (2) there’s (3) thereabout (4) thereabouts (5) thereafter

Enter the next character: 3

WORD COMPLETED: thereabout

Enter first character of the next word: t

(0.000128 s)
Predictions:
(1) thereabout (2) t (3) ta (4) tab (5) tab’s

Enter the next character: h

(0.000094 s)
Predictions:
(1) thereabout (2) thalami (3) thalamus (4) thalamus’s (5) thalidomide

Enter the next character: e

(0.000085 s)
Predictions:
(1) thereabout (2) the (3) theater (4) theater’s (5) theatergoer

Enter the next character: r

(0.000145 s)
Predictions:
(1) thereabout (2) therapeutic (3) therapeutically (4) therapeutics (5) therapeutics’s

Enter the next character: e

(0.000130 s)
Predictions:
(1) thereabout (2) there (3) there’s (4) thereabouts (5) thereafter

Enter the next character: !

Average time: 0.000145 s
Bye!
“`

## Submission Guidelines:
* **DO NOT** add `dictionary.txt` to your git repository.
* **DO NOT** add `user_history.txt` to your git repository, it must be
generated by your program.
* **DO NOT SUBMIT** any IDE package files.
* You must name your main program file `ac_test.java`.
* You must be able to compile your program by running
`javac ac_test.java`.
* You must be able to run your program by running `java ac_test`.
* You must fill out `info_sheet.txt`.
* Be sure to remember to push the latest copy of your code back to your GitHub
repository before the the assignment is due. At the deadline, the
repositories will automatically be copied for grading. Whatever is present
in your GitHub repository at that time will be considered your submission for
this assignment.

## Additional Notes:
* You are free to use any data structures written by you, the textbook authors,
or in the Java standard library to implement the user history symbol table.
However, if you use code that you do not write yourself, you must research
the runtime of that particular implementation and discuss that in your
`approach.txt`.
* Note that if your user history predictions contain a word that is also
contained in the dictionary predictions, this word should not be presented
as a suggestion to the user twice in the same prompt (e.g., the final list
of predictions in the example above).
* You do not need to implement any sort of autocorrect. You can assume
that each character entered by the user is intentional.
* You can assume that the user will not try to enter any numerical characters
aside from selecting a prediction.

## Grading Rubric:
* DLB trie implemented as described in class: 20
* Dictionary predictions are correctly populated: 25
* User history predictions are correctly populated: 30
* Sufficient justification of user history approach: 10
* User interaction works as specified: 5
* Timing data presented: 5
* Assignment info sheet/submission: 5

Goal:
To understand the innerworkings and implementation of the LZW compression algorithm, and to gain a better understanding of the performance it offers.

## High-level description:
As we discussed in lecture, LZW is a compression algorithm that was created in 1984 by Abraham Lempel, Jacob Ziv, and Terry Welch.
In its most basic form, it will output a compressed file as a series of fixed-length codewords.
This is the approach implemented in the LZW code provided by the authors of the textbook.
As we discussed in class, *variable-width* codewords can be used to increase the size of codewords output as the dictionary fills up.
Further, once the dictionary fills up, the algorithm can either stop adding patterns and continue compression with only the patterns already discovered, or the algorithm can reset the codebook to find new patterns.
The LZW code provided by the textbook authors simply continues to used patterns added to the codebook.

For this project, you will be modifying the LZW source code provided by the authors of the text book to use variable-width codewords, and to optionally reset the codebook under certain conditions.
With these changes in hand, you will then compare the performance of your modified LZW code with the provided LZW code, and further with the performance of a widely used compression application of your choice.

## Specifications:
1. First download copies of the 14 example files from [https://people.cs.pitt.edu/~nlf4/cs1501/handouts/example_files/](https://people.cs.pitt.edu/~nlf4/cs1501/handouts/example_files/) and place them in the `example_files` folder of your repository.
* Do not add these files to your git repository, however! (i.e., do not run `git add …` for any of these files).
1. Make a copy of `LZW.java` named `MyLZW.java`. You will be modifying this file for your assignment. Note that `LZW.java` is the example LZW code provided by the textbook.
1. Before making the required changes to `MyLZW.java`, you will need to read through the code, and run example compressions/expansions to understand how it is currently working. Note that `LZW.java` (and hence your `MyLZW.java`) requires the following library files (also developed by the textbook authors): `BinaryStdIn.java`, `BinaryStdOut.java`, `TST.java`, `Queue.java`, `StdIn.java`, and `StdOut.java`. These files have already been added to your repository.
1. With a firm understanding of the provided code in hand, you can proceed to make the following changes to `MyLZW.java`:
* Make it so that the algorithm will vary the size of the output/input codewords from 9 to 16 bits.
* The codeword size should be increased when all of the codewords of a previous size have been used
* Modify the code to have three options when the codebook is filled up (i.e., all 16 bit codewords have been used):
1. **Do Nothing mode** Do nothing and continue to use the full codebook (this is already implemented by `LZW.java`).
1. **Reset mode** Reset the dictionary back to its initial state so that new codewords can be added. Be careful to reset at the appropriate place for both compression and expansion, so that the algorithms remain in sync. This is very tricky and may require alot of planning in order to get it working correctly.
1. **Monitor mode** Initially do nothing (keep using the full codebook) but begin monitoring the *compression ratio* whenever you fill the codebook. Define the compression ratio to be the size of the uncompressed data that has been processed/generated so far divided by the size of the compressed data generated/processed so far (for compression/expansion, respectively). If the compression ratio degrades by more than a set threshold from the point when the last codeword was added then reset the dictionary back to its initial state. To determine the threshold for resetting you will take a ratio of compression ratios [(old ratio)/(new ratio)], where old ratio is the ratio recorded when your program last filled the codebook, and new ratio is the current compression ratio. If the ratio of ratios exceeds 1.1 then you should reset. For example, if the compression ratio when you start monitoring is 2.5 and the compression ratio at some later point is 2.3, the ratio of ratios at that point would be 2.5/2.3 = 1.087, so you should not reset the dictionary. Continuing, if your compression ratio drops to 2.2, the ratio of ratios would become 2.5/2.2 or 1.136. This means that your ratio of ratios has exceeded the threshold of 1.1 and you should now reset the dictionary. Be very careful to coordinate the code for both compression and expansion so that it works correctly.
* You cannot encode either a codeword size switch or a codebook reset into the compressed file useing a delimiter. You must have your compression/expansion code detect when to switch codeword sizes and reset based on the state of the codebook.
* Which mode should be used should be chosen by the program during compression. Whichever mode is used to compress a file should also be used to expand the file. However, you should not require the user to state the mode to use for expansion. The mode used to compress a file should be stored at the beginning of the output file, so that it can be automatically retrieved during expansion. To establish the mode to be used during compression, your program should accept 3 new command line arguments:
* “n” for Do Nothing mode
* “r” for Reset mode
* “m” for Monitor mode
* Note that the provided LZW code already accepts a command line argument to determine whether compression or expansion should be performed (“-” and “+”, respectively), and that input/output files are provided via standard I/O redirection (“<” to indicate an input file and “>” to indicate an output file). Hence, your new arguments should be handled in addition to what is provided. For example, to compress the file foo.txt to generate foo.lzw using Reset mode, you should be able to run:
“`
java MyLZW – r < foo.txt > foo.lzw
“`
Similarly to expand foo.lzw into foo2.txt, you should run:
“`
java MyLZW + < foo.lzw > foo2.txt
“`
Note that this example does not overwrite foo.txt.
This is a good approach to take in testing your programs so that you can compare foo.txt and foo2.txt to ensure that they are the same file.
1. Once all of the required changes have been made to `MyLZW.java`, you should evaluate its performance on the 14 provided example files: `all.tar`, `assig2.doc`, `bmps.tar`, `code.txt`, `code2.txt`, `edit.exe`, `frosty.jpg`, `gone_fishin.bmp`, `large.txt`, `Lego-big.gif`, `medium.txt`, `texts.tar`, `wacky.bmp`, and `winnt256.bmp`. Specifically, for each of the provided example files, measure the original file size, compressed file size, and compression ratio (original file size / compressed file size) when compressed using the following techniques:
* The unmodified `LZW.java` program (i.e., 12 bit codewords)
* Your `MyLZW.java` (variable width codewords) using Do Nothing mode
* Your `MyLZW.java` (variable width codewords) using Reset mode
* Your `MyLZW.java` (variable width codewords) using Monitor mode
* Another existing compression application of your choice (e.g., 7zip, WinZIP, gzip, bzip2)
You should organize your results of these compressions/expansions into a table in a text file named `results.txt` and submit it along with your code.

## Submission Guidelines:
* **DO NOT SUBMIT** any IDE package files.
* You must name the primary driver for your program `MyLZW.java`.
* You must be able to compile your game by running `javac MyLZW.java`.
* You must be able to run your program as shown in the above example.
* You must fill out info_sheet.txt.
* Be sure to remember to push the latest copy of your code back to your GitHub repository before the the assignment is due. At the deadline, the repositories will automatically be copied for grading. Whatever is present in your GitHub repository at that time will be considered your submission for this assignment.

## Additional Notes/Hints:
* In the author’s code the bits per codeword (W) and number of codewords (L) values are constants. However, in your version you will need them to be variables. Clearly, as the bits per codeword value increases, so does the number of codewords value.
* The TST the author uses can grow dynamically, so it does not matter how large the dictionary will be. However, for the `expand()` method an array of String is used for the dictionary. Make sure this is large enough to accommodate the maximum possible number of codewords.
* Carefully trace what your code is doing as you modify it. You only have to write a few lines of code for this program, but it could still require a substantial amount of time to get to work properly. Clearly the trickiest parts occur when the bits per codeword values are increased and when the dictionary is reset. I recommend tracing these portions of code, either on paper or with output statements to make sure your compression and expansion sections are treating them correctly. One idea is the have an extra output file for each of the `compress()` and `expand()` methods to output any trace code. Printing out (codeword, string) pairs in the iterations just before and after a bit change or reset is done can help you a lot to synchronize your code properly.
* Be especially careful with the dictionary reset and monitor compression ratio options. These are very tricky and take alot of thought to get to work. Think about what happens when the dictionary is reset and what is necessary to do in the `compress()` and `expand()` methods. I recommend getting the variable width codeword part of the program to work first and then moving on to implementing Reset mode and Monitor mode.
* Start on this project early! Not only will the implementation be tricky, but you will need to finish the programming portion of your project with enough time left over to gather results using your code to compress the example files.
* Note that `LZW.java` (and consequently your `MyLZW.java`) rely on redirecting standard in and standard out to the input and output files (respectively). An overview of I/O redirection can be found [here](https://www.tldp.org/LDP/abs/html/io-redirection.html). Note that a consequence of this is that any text printed to standard out (i.e., via `System.out.println()`) will be redirected to the output file instead of the terminal. Standard error, however, should still be displayed to the terminal, and hence, you can use `System.err.println()` to output debugging information. This I/O redirection may also complicate running MyLZW from some IDEs. If you are having trouble running MyLZW from your IDE, please try to run your program from the command line.
* Consider the notes in `LZW.java` (and `TST.java`) concerning the speed of the `substring()` function. In order to run your experiments faster, you may want to edit LZW.java (MyLZW.java) and TST.java to remove all calls to `substring()`. There is no penalty for continuing to use `substring()` for this assignment, but you will experience noticeably slow performance.

## Grading Rubric:
* Command line arguments are interpreted as specified: 10
* Variable width keywords (9-16 bits) working properly: 25
* Reset mode implemented and working properly: 20
* Monitor mode implemented and working properly: 20
* Results:
* For unmodified `LZW.java`: 4
* For variable width codewords (`MyLZW.java`) with Do Nothing mode: 4
* For variable width codewords (`MyLZW.java`) with Reset mode: 4
* For variable width codewords (`MyLZW.java`) with Monitor mode: 4
* For an appropriate popular compression application: 4
* Assignment info sheet/submission: 5

Goal:
To explore an advanced application of priority queues in order to gain a deeper understanding of the data structure.

## High-level description:
You will be writing a basic application to help a user select a car to buy.
You will write a menu-based user interface driver program (to be run in the terminal, no GUI), but most of the logic will be in implementing a priority queue-based data structure.
You should write a PQ-based data structure that stores objects according to the relative priorities of two of their attributes, making it efficient to retrieve objects with the minimum value of either attribute.
Your data structure should further be indexable to allow for efficient updates of entered items.
You will want users to be able to enter details about cars that they are considering buying.
The user should then be able to efficiently retrieve the car with the lowest mileage or lowest price.
These retrievals should be possible on the set of all entered cars or on the set of all cars of a specific make and model (e.g., “lowest price Ford Fiesta”, “lowest mileage Cadillac Escalade”).

## Specifications:
1. First you must create a class to store data about cars to buy
Specifically, this class must contain the following information:
* A unique VIN number (17 character string of numbers and capital letters (but no I (i), O (o), or Q (q) to avoid confusion with numerals 1 and 0)
* The car’s make (e.g., Ford, Toyota, Honda)
* The car’s model (e.g., Fiesta, Camry, Civic)
* The price to purchase (in dollars)
* The mileage of the car
* The color of the car
1. You must write a terminal menu-based driver program (again, no GUI).
Specifically, your driver must present the user with the following options:
1. Add a car
* This will (one at a time) prompt the user for all of the above-listed attributes of a car to keep track of
1. Update a car
* This option will prompt the user for the VIN number of a car to update, and then ask the user if they would like to update 1) the price of the car, 2) the mileage of the car, or 3) the color of the car
1. Remove a specific car from consideration
* This option will prompt the user for the VIN number of a car to remove from the data structure (e.g., if it is no longer for sale)
* Note that this mean you will need to support removal of cars other than the minimum (price or mileage) car
1. Retrieve the lowest price car
1. Retrieve the lowest mileage car
1. Retrieve the lowest price car by make and model
* This option will prompt the user to enter (one at a time) a make and model and then return the car with the minimum price for that make and model
1. Retrieve the lowest mileage car by make and model
* This option will prompt the user to enter (one at a time) a make and model and then return the car with the minimum mileage for that make and model
1. Retrieval operations should not remove the car with minimum price or mileage from the datastructure, just return information about that car.
Cars should only be removed via the “remove a specific car from consideration” menu option.
1. To aid in the testing of your application, you will find an example file with some test data store in this repository (`cars.txt`).
Your progam should read in the contents of this file to initialize your data structure each time it is run.
You can assume that this file will already exist, and you do not need to write an updated version of the data sturcture back to the file.
1. To ensure efficiency of operations, you must base your data structure around the use of heaps with indirection (making them indexable).
Note that operations on either attribute (e.g., retrieve minimum price, retrieve minimum mileage) should have a logarthmic runtime (both for all cars and for a specific make and model).
Updates and removals should also have a logarithmic runtime.
Take care in selecting your approach to the indirection data structure to account for the types of keys you will need to store and the type and number operations that you will need to perform on them.
1. Because this project requires you to make a number of decisions about how to implement its requirements, you will need to write a documentation file explaining your implementation, and justifying your decisions.
Name this file `documentation.txt`.
Be sure to describe your carefully document your approach to ease the effort required to trace through your code for grading.
Be sure to include descriptions of the runtime and space requirements of your approach and use them in your justification of why you think your approach is the best way to go.

## Submission Guidelines:
* **DO NOT SUBMIT** any IDE package files.
* You must name the primary driver for your program `CarTracker.java`.
* You must be able to compile your game by running `javac CarTracker.java`.
* You must be able to run your program with `java CarTracker`.
* You must document and justify your approach in `documentation.txt`.
* You must fill out `info_sheet.txt`.
* Be sure to remember to push the latest copy of your code back to your GitHub repository before the the assignment is due. At the deadline, the repositories will automatically be copied for grading. Whatever is present in your GitHub repository at that time will be considered your submission for this assignment.

## Additional Notes/Hints:
* You are free to use code provided by the book authors in implementing your solution. It is up to you to decide if it would be easier to modify the provided code to meet the requirements of this project or if it would be easier to start with a clean slate with all of your own code.
* Your program does not need to enforce that users enter properly formatted VIN numbers, but you must design your data structure to operate efficiently on VIN numbers as specified here. This should make testing your program much easier.

## Grading Rubric:
* Adding a car works properly: 10
* Updating a car works properly: 10
* Removing a car works properly: 15
* Retrieval for all cars works properly: 10
* Retrieval for a given make/model works properly: 15
* Operations on either attribute are efficient due to heap-backed data structure: 15
* Validity of justitifications: 15
* Menu-based driver program works properly and has appropriately labeled options: 5
* Assignment info sheet/submission: 5

Goal:
To gain a better understanding of graphs and graph algorithms through practical implementation.

## High-level description:
Your program will analyze a given graph representing a computer network according to several specified metrics.
The vertices of these graphs will represent switches in the network, while the edges represent either fiber optic or copper cables run between the switches.
Your program should operate entirely via a console interface menu (no GUI).

## Specifications:
1. Your program should accept a single command line argument that specifies the name of a file containing a description of a graph. Two such files are provided (`network_data1.txt` and `network_data2.txt`). The format of these files is as follows:
* The first line contains a single int stating the number of vertices in the graph. These vertices will be numbered 0 to v-1.
* Each following line will describe a single edge in the graph, with each of the following data items listed separated by spaces.
* First, two integers specify the endpoints of the edge.
* Next, a string describes the type of cable that edge represents (either “optical” or “copper”).
* Next, an integer states the bandwidth of the cable in megabits per second.
* Finally, an integer states the length of the cable in meters.
* E.g., the line `0 5 optical 10000 25` describes an edge between vertex 0 and vertex 5 that represents a 25 meter long optical cable with bandwidth of 10 gigabits per second.
* You should assume that all cables are full duplex and hence represent connections in both directions (e.g., in the example above data can flow from vertex 0 to vertex 5 at 10 gigabits per second and from vertex 5 to vertex 0 at 10 gigabits per second simultaneously).
1. You must internally represent the graph as an adjacency list.
1. After loading the graph from the specified file, your program should present the user with a menu with the following options:
1. Find the __lowest latency path__ between any two points, and give the bandwidth available along that path.
1. First, your program should prompt the user for the two vertices that they wish to find the lowest latency path between.
1. Then, your program should output the edges that comprise this lowest latency path in order from the first user-specified vertex to the second.
1. You must find the path between these vertices that will require the least amount of time for a single data packet to travel. For this project, we will simply compute the time required to travel along a path through the graph as the sum of the times required to travel each edge, where the time to travel each edge is computed as the length of the cable represented by that edge divided by the speed at which data can be send along a connection of that type.
* A single data packet can be sent along a copper cable at a speed of 230000000 meters per second.
* A single data packet can be sent along a fiber optic cable at a speed of 200000000 meters per second.
1. Your program should also output the bandwidth that is available along the resulting path (minimum bandwidth of all edges in the path).
1. Determine whether or not the graph is __copper-only connected__, or whether it is connected considering only copper links (i.e., ignoring fiber optic cables).
1. Find the __lowest average latency spanning tree__ for the graph (i.e., a spanning tree with the lowest average latency per edge).
1. Determine whether or not the graph would remain connected if __any two vertices in the graph were to fail__.
1. Note that you are not prompting the users for two vertices that could fail, you will need to determine whether the failure of *any pair* of vertices would cause the graph to become disconnected.
1. Quit the program.

## Submission Guidelines:
* **DO NOT SUBMIT** any IDE package files.
* You must name the primary driver for your program NetworkAnalysis.java.
* You must be able to compile your program by running `javac NetworkAnalysis.java`.
* You must be able to run your program with `java NetworkAnalysis data_filename.txt` (e.g., `java NetworkAnalysis network_data1.txt`).
* You must fill out info_sheet.txt.
* Be sure to remember to push the latest copy of your code back to your GitHub repository before the the assignment is due. At the deadline, the repositories will automatically be copied for grading. Whatever is present in your GitHub repository at that time will be considered your submission for this assignment.

## Additional Notes/Hints:
* Though code for the algorithms used in the assignment has been provided by the authors of your text book, note that use of this code will require extensive adaptations to account for the needs of this project.
* The assumed calculation of network latency used here is a drastic simplification for this project. Interested students are encouraged to investigate a more detailed study of computer networks independently (recommended reading: _Computer Networks: A Systems Approach_ by Peterson and Davie).

## Grading Rubric
* Menu interface is user-friendly: 5
* Graph is properly read and represented: 5
* Queries:
* Lowest latency path computation: 15
* Bandwidth correctly reported: 5
* Copper-only connectivity: 15
* Minimum average latency spanning tree: 20
* Surviving 2 vertex failures: 30
* Assignment info sheet/submission: 5

Goal:To get hands on experience with algorithms to perform mathematical operations on large integers, using RSA as an example.

Note that the result of this project should NEVER be used for any security applications. It is purely instructive. Always use trusted and tested crypto libraries.

## High-level description:
You will be writing two programs. The first will generate a 512-bit RSA keypair and store the public and private keys in files named `pubkey.rsa` and `privkey.rsa`, respectively.
The second will generate and verify digital signatures using a SHA-256 hash. You will use Java’s [MessageDigest](https://docs.oracle.com/javase/8/docs/api/java/security/MessageDigest.html) class to complete this project.
In order for either of these programs to work, however, you will need to complete an implementation of a class to process large integers.

## Specifications:
1. You are provided with the start of a class to process large integers called LargeInteger. LargeIntegers are represented internally as [two’s-complement](https://en.wikipedia.org/wiki/Two%27s_complement) _raw integers_ using byte arrays (i.e., instances of `byte[]`).
1. Currently, LargeInteger has the following operations implemented:
* A constructor to generate an n-bit random, positive, probably prime, integer using a specified source of randomness. This constructor uses a probabilistic primality test to ensure that it is probably prime (with 2^-100 chance of being composite).
* A constructor that creates a new LargeInteger based off of a provided `byte[]`.
* A method to compute the sum of two LargeIntegers.
* A method to determine the negation of a LargeInteger.
* A method to compute the difference of two LargeIntegers.
* Several other helper methods.
1. Due to the use of a two’s complement representation of the integers, LargeIntegers should always have at least one leading 0 bit (indicating that the integer is positive) in their `byte[]` representation. This property may cause the array to be bigger than expected (e.g., a 1024-bit generated prime will be represented using a length 129 byte array).
1. LargeIntegers are also be represented using a _big-endian_ byte-order, so the most significant byte is at the 0<sup>th</sup> index of the `byte[]`.
1. In order to generate RSA keys and perform RSA encryptions and decryptions, you will further need to implement the following functions:
* `LargeInteger multiply(LargeInteger other)`
* `LargeInteger[] XGCD(LargeInteger other)`
* `LargeInteger modularExp(LargeInteger y, LargeInteger n)`
* Any additional helper functions that you deem necessary.
1. You may *not* use any calls the Java API class `java.math.BigInteger`, or any other JCL class in finishing LargeInteger. The probably-prime LargeInteger constructor does call BigInteger’s probablePrime method, however, this can be the only call to BigInteger in your LargeInteger code.
1. Once LargeInteger is complete, write a program named `RsaKeyGen` to generate a new RSA keypair.
1. To generate a keypair, follow the following steps, as described in lecture.
1. Pick p and q to be random primes of an appropriate size to generate a 512-bit key
1. Calculate n as p*q
1. Calculate φ(n) as (p-1)*(q-1)
1. Choose an e such that 1 < e < φ(n) and gcd(e, φ(n)) = 1 (e must not share a factor with φ(n))
1. Determine d such that d = e⁻¹ mod φ(n)
1. After generating e, d, and n, save e and n to `pubkey.rsa`, and d and n to `privkey.rsa`.
1. Once you have your RSA keys generated, write a second program named `RsaSign` to sign files and verify signatures. This program should accept two command-line arguments: a flag to specify whether to sign or verify (`s` or `v`), and the name of the file to sign/verify.
1. If called to sign (e.g., `java RsaSign s myfile.txt`) your program should:
1. Generate a SHA-256 hash of the contents of the specified file (e.g., `myfile.txt`).
1. “Decrypt” this hash value using the private key stored in `privkey.rsa` (i.e., raise the hash value to the d<sup>th</sup> power mod n).
* Your program should exit and display an error if `privkey.rsa` is not found in the current directory.
1. Write out the signature to a file named as the original, with an extra `.sig` extension (e.g., `myfile.txt.sig`).
1. If called to verify (e.g., `java RsaSign v myfile.txt`) your program should:
1. Read the contents of the original file (e.g., `myfile.txt`).
1. Generate a SHA-256 hash of the contents of the original file.
1. Read the signed hash of the original file from the corresponding `.sig` file (e.g., `myfile.txt.sig`).
* Your program should exit and display an error if the `.sig` file is not found in the current directory.
1. “encrypt” this value with the key from `pubkey.rsa` (i.e., raise it to the e<sup>th</sup> power mod n).
* Your program should exit and display an error if `pubkey.rsa` is not found in the current directory.
1. Compare the hash value that was generated from `myfile.txt` to the one that was recovered from the signature. Print a message to the console indicating whether the signature is valid (i.e., whether the values are the same).

## Submission Guidelines:
* **DO NOT SUBMIT** any IDE package files.
* You must name your key generation program `RsaKeyGen.java`, and your signing/verification program `RsaSign.java`. Thus:
* You must be able to compile your program by running `javac RsaKeyGen.java` and `javac RsaSign.java`.
* You must be able to run your key generation program by running `java RsaKeyGen`, and your signing/verification program with `java RsaSign s <filename>` and `java RsaSign v <filename>`.
* You must fill out `info_sheet.txt`.
* Be sure to remember to push the latest copy of your code back to your GitHub repository before the the assignment is due. At the deadline, the repositories will automatically be copied for grading. Whatever is present in your GitHub repository at that time will be considered your submission for this assignment.

## Additional Notes/Hints:
* An example of using `java.security.MessageDigest` to generate the SHA-256 hash of a file is provided in `HashEx.java`
* You may find the creation of `pubkey.rsa`, `privkey.rsa`, and signature files to be most easily accomplished through the use of `java.io.ObjectOutputStream`. The format of your key and signature files is up to you.
* **NEVER USE CODE FROM THIS PROJECT IN PRODUCTION CODE.** This is purely instructive. Always use trusted and tested crypto libraries.

## Grading Rubric
* LargeInteger
* `multiply` works properly: 20
* `XGCD` works properly: 25
* `modularExp`: 10
* Key generation
* p and q are generated appropriately: 3
* n and φ(n) computed appropriately: 3
* e is selected appropriately: 4
* d is selected appropriately: 5
* Key files are generated appropriately: 5
* Signing
* Hash is generated correctly: 2
* Hash is “decrypted” (signed) correctly: 5
* Signature file is generated appropriately: 3
* Verification
* Hash is re-generated correctly: 2
* Signature is “encrypted” (verified) correctly: 5
* Signed files are appropriated verified: 3
* Assignment info sheet/submission: 5

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[SOLVED] Cs/coe 1501 projects 1 to 5 solution[SOLVED] Cs/coe 1501 projects 1 to 5 solution
$25