[Solved] Programming Assignment #4: Tries COP 3502

30 $

File Name: Programming_Assignment_#4:_Tries_COP_3502.zip
File Size: 386.22 KB

SKU: [Solved] Programming Assignment #4: Tries COP 3502 Category: Tag:

Or Upload Your Assignment Here:


5/5 – (2 votes)

In this programming assignment, you will gain experience with an advanced tree data structure that is used to store strings, and which allows for efficient insertion and lookup: a trie!By completing this assignment and reflecting upon the awesomeness of tries, you will fortify your knowledge of algorithms and data structures and solidify your mastery of many C programming topics you have been practicing all semester: dynamic memory allocation, file I/O, processing command line arguments, dealing with structs and pointers to structs, and so much more.In the end, you will have implemented a tremendously useful data structure that has many applications in text processing and corpus linguistics.

AttachmentsTrie.h, corpus{01-05}.txt, input{01-05}.txt, output{01-05}.txt, printTrie.txtDeliverablesTrie.c1. Overview: TriesWe have seen in class that the trie data structure can be used to store strings. It provides for efficient string insertion and lookup; insertion into a trie is O(k) (where k is the length of the string being inserted), and searching for a string is an O(k) operation (worst-case). In a trie, a node does not store the string it represents; rather, the edges taken to reach that node from the root indicate the string it represents. Each node contains a flag (or count) variable that indicates whether the string represented by that node has been inserted into the trie (or how many times it has been inserted). For example:

Figure 1:This is a trie that codifies the words “and,” “cherry,” “chocolate,” “like,” and “pie.” The string “pie” is represented (or counted) twice. All other strings are counted only once.00 0 1 0 00010201ahdpiei likec0n0 0e o0 00 01 0rrycol001ateroot“pie” (2)“like” (1)“chocolate” (1)“cherry” (1)“and” (1)1.1. TrieNode Struct (Trie.h)In this assignment, you will insert words from a corpus (that is, a body of text from an input file) into a trie. The struct you will use for your trie nodes is as follows:typedef struct TrieNode{// number of times this string occurs in the corpusint count;// 26 TrieNode pointers, one for each letter of the alphabetstruct TrieNode *children[26];// the co-occurrence subtrie for this stringstruct TrieNode *subtrie;} TrieNode;You must use this trie node struct, which is specified in Trie.h without any modifications. You should#include the header file from Trie.c like so:#include “Trie.h”Notice that the trie node, because it only has 26 children, represents strings in a case insensitive way(i.e., “apple” and “AppLE” are treated the same in this trie).1.2. Co-occurrence SubtriesWords that appear together in the same sentence are said to “co-occur.” Consider, for example, the following sentence:I like cherry pie and chocolate pie .In the example sentence, the word “chocolate” co-occurs with the following terms (with counts in parentheses): and (1), cherry (1), i (1), like (1), pie (2).For the purposes of this assignment, we will never consider a word to co-occur with itself. So, in the example sentence above, “pie” co-occurs with: and (1), cherry (1), chocolate (1), i (1), like (1). Notice that “pie” itself is not in the list of words co-occurring with “pie.”If we place these terms (and their associated counts) into their own tries, those tries will look like this:Figure 2:Co-occcurrence subtries for “chocolate” (left) and “pie” (right), based on the sentence, “I like cherry pie and chocolate pie.”In Figure 2, the trie on the left is what we will call the co-occurrence subtrie for the word “chocolate,” based on the example sentence above (I like cherry pie and chocolate pie). Its root should be stored in the subtrie pointer field of the node marked “chocolate” in the original trie diagram in Figure 1 (see page 2 above).Similarly, the trie on the right in Figure 2 is the co-occurrence subtrie for the word “pie.” Its root should be stored in the subtrie pointer field of the node marked “pie” in the original trie diagram in Figure 1 (see page 2, above).Within these subtries, all subtrie pointers should be initialized to NULL, because we will NOT produce sub-subtries in this assignment!00 0 1 0 00010201ahdpiei likec0n0e001rryroot 00 0 1 000101ahdi likec0n0e001rryroot00c000olao01et2. Input Files and Output Format2.1. Command Line ArgumentsYour program will take two command line arguments, specifying two files to be read at runtime:./a.out corpus01.txt input01.txtThe first filename specifies a corpus that will be used to construct your trie and subtries. The second filename specifies an input file with strings to be processed based on the contents of your trie.For more information on processing command line arguments with argc and argv, see the instructions from the Program #3 PDF. You can assume that I will always specify valid input files when I run your program.2.2. Corpus File2.2.1 FormatThe corpus file contains a series of sentences. Each line contains a single sentence with at least 1 word and no more than 20 words. Each word contains fewer than 1024 characters. Each sentence is terminated with a single period, and that period is always preceded by a space. For example:corpus01.txt:I like cherry pie and chocolate pie .2.2.2 Building the Main TrieFirst and foremost, each word from the corpus file should be inserted into your trie. If a word occurs multiple times in the corpus, you should increment count variables accordingly. For example, the trie inFigure 1 (see page 2, above) corresponds to the text given in the corpus01.txt file above.

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[Solved] Programming Assignment #4: Tries COP 3502
30 $