[Solved] COP4530 Project6-Word, Number, and Character Usage Statistics


Practice selecting and making use of appropriate STL containers and algorithms to perform tasks.

Statement of Work: Implement a program that collects the statistics of word, number, and character usage in a file (redirected as the standard input).

Requirements:

  1. Write a program that reads from standard input until end of file is reached and counts the number of times each word, number, and character appears in the input.
    • A word is defined as a consecutive sequence of letters (a..z or A..Z).
    • Words are case insensitive (AA, Aa, aA, and aa are the same).
    • A number is defined as any consecutive sequence of digits (0..9).
    • Note that both words and numbers can have length 1, that is, contain a single letter or a single digit, respectively.
    • Different sequences represent different numbers. For example, number 001 is different from number 1.
    • Words are separated by any non-letter characters.
    • Numbers are separated by any non-digit characters.
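The definitions above can be sketched as a single pass that splits the input into maximal runs of letters and maximal runs of digits. This is only an illustration under stated assumptions, not the reference solution: the names `Tokens`, `Kind`, `is_letter`, `is_digit`, and `tokenize` are hypothetical, and the whole input is assumed to be available as one `std::string`.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical container for the tokens found in the input.
struct Tokens {
    std::vector<std::string> words;    // letter runs, stored in lower case
    std::vector<std::string> numbers;  // digit runs, kept verbatim ("001" != "1")
};

// Kind of run currently being collected.
enum class Kind { None, Word, Number };

// Explicit ASCII checks, matching the a..z / A..Z / 0..9 definitions.
bool is_letter(unsigned char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
bool is_digit(unsigned char c) { return c >= '0' && c <= '9'; }

// Split text into maximal letter runs (words) and digit runs (numbers).
Tokens tokenize(const std::string& text) {
    Tokens out;
    std::string current;
    Kind kind = Kind::None;

    // Store the run collected so far (if any) and start over.
    auto flush = [&]() {
        if (kind == Kind::Word) out.words.push_back(current);
        else if (kind == Kind::Number) out.numbers.push_back(current);
        current.clear();
        kind = Kind::None;
    };

    for (unsigned char ch : text) {
        Kind next = is_letter(ch) ? Kind::Word
                  : is_digit(ch)  ? Kind::Number
                                  : Kind::None;
        if (next != kind) flush();  // a run ends whenever the kind changes
        if (next == Kind::Word) current += static_cast<char>(std::tolower(ch));
        else if (next == Kind::Number) current += static_cast<char>(ch);
        kind = next;
    }
    flush();  // the input may end in the middle of a run
    return out;
}
```

Because a digit ends a word and a letter ends a number, an input such as `x9y` yields the words `x` and `y` and the number `9` in this sketch.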

Output specifications:

    • Your program should track the number of times each word, number, and character appears.
    • The program should then output the ten most used characters, the ten most used numbers, and the ten most used words, along with the number of times each of these characters/numbers/words is used.
    • Since words are case insensitive, the program should only output the words in lower case.
    • The characters, numbers and words should be printed in descending order based on the number of times they are used.
    • Breaking ties (for the Top Tens):
      • When two characters occur the same number of times, the character with the smaller ASCII value should be considered as being used more frequently.
      • When two words (or numbers) occur the same number of times, the word (or number) that occurs earlier in the input should be considered as being used more frequently.
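One possible way to apply the tie-breaking rule for words and numbers is to record, next to each count, the position at which the token first appeared, and then order by count (descending) and first position (ascending). The sketch below assumes the tokens are already available in input order; `Entry`, `count_tokens`, and `top_tokens` are hypothetical names, and `std::unordered_map` with `std::partial_sort` is just one reasonable container/algorithm choice among several.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical record: how often a token occurred and where it first appeared.
struct Entry {
    std::size_t count = 0;
    std::size_t first_seen = 0;  // index of the first occurrence in the input
};

typedef std::pair<std::string, Entry> Item;

// Count tokens, remembering the position of each token's first occurrence.
std::unordered_map<std::string, Entry>
count_tokens(const std::vector<std::string>& tokens) {
    std::unordered_map<std::string, Entry> table;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        Entry& e = table[tokens[i]];         // default-constructed on first use
        if (e.count == 0) e.first_seen = i;  // first time this token is seen
        ++e.count;
    }
    return table;
}

// Return up to k entries: highest count first, earlier first occurrence on ties.
std::vector<Item>
top_tokens(const std::unordered_map<std::string, Entry>& table, std::size_t k = 10) {
    std::vector<Item> items(table.begin(), table.end());
    k = std::min(k, items.size());
    std::partial_sort(items.begin(), items.begin() + k, items.end(),
        [](const Item& a, const Item& b) {
            if (a.second.count != b.second.count)
                return a.second.count > b.second.count;
            return a.second.first_seen < b.second.first_seen;
        });
    items.erase(items.begin() + k, items.end());
    return items;
}
```

With this layout, counting is expected O(1) per token on average, and selecting the top k of n distinct tokens costs O(n log k) rather than the O(n log n) of a full sort.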
  2. An example executable of the program is provided to you (see below). Your program's output should match that of this sample executable. When printing characters, use \t for tab and \n for newline. All other characters should be printed normally.
  3. Write a makefile for your project that compiles an executable called x
  4. Make use of any appropriate C++ STL containers and algorithms. You should also use the C++ string class instead of default C-strings. Here are a few good reference links for the library lookups:

Note that you should select whatever container(s) will make YOUR program's algorithms the most efficient in terms of growth rate (i.e. Big-O complexity analysis).
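As an illustration of how container choice affects growth rate, characters can be counted in a fixed 256-entry table indexed by character code, which gives O(1) updates and a constant-size top-ten step. The sketch below is one possible approach, not the required one. It assumes 8-bit characters and assumes that tab and newline are displayed as the two-character sequences `\t` and `\n`; `count_chars`, `display`, and `top_chars` are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<unsigned char, std::size_t> CharCount;

// Count every byte of the input in a table indexed by character code.
std::array<std::size_t, 256> count_chars(const std::string& text) {
    std::array<std::size_t, 256> counts{};  // zero-initialized
    for (unsigned char ch : text) ++counts[ch];
    return counts;
}

// Display form of a character: tab and newline become escape sequences.
std::string display(unsigned char ch) {
    if (ch == '\t') return "\\t";
    if (ch == '\n') return "\\n";
    return std::string(1, static_cast<char>(ch));
}

// Up to k most frequent characters; ties go to the smaller character code.
std::vector<CharCount>
top_chars(const std::array<std::size_t, 256>& counts, std::size_t k = 10) {
    std::vector<CharCount> used;
    for (std::size_t c = 0; c < counts.size(); ++c)
        if (counts[c] > 0)
            used.emplace_back(static_cast<unsigned char>(c), counts[c]);
    // Entries start in ascending code order; stable_sort preserves that
    // order among equal counts, which implements the tie-breaking rule.
    std::stable_sort(used.begin(), used.end(),
        [](const CharCount& a, const CharCount& b) { return a.second > b.second; });
    if (used.size() > k) used.resize(k);
    return used;
}
```

Since the table never holds more than 256 entries, the sort here is bounded by a constant, so the overall cost is dominated by the O(n) pass over the input.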

  5. In a file called txt, write up your complexity analysis of the important algorithms and procedures in your program. Note that your analysis will be based not only on the code YOU write, but also on the STL containers you choose for managing your data. It needs to cover at least (but is not limited to) each of these necessary tasks:
    • Reading the input set
    • Storing the characters / words / numbers in your chosen containers, and setting their tracking values
    • Looking up the final tracking info on your character / word / number frequencies
    • Deciding on (and accessing for output) your Top Ten most frequent list for each case
    • Any other important algorithm/tasks you perform to complete the job
  6. While not a program requirement for submission, it is recommended that you verify your analysis of your program elements by testing larger input sets and measuring the actual run times of those test runs. You can do this easily in a program by using the ctime library: capture the return value of the clock() function before and after an algorithm, then subtract the two clock times to get the difference. Convert to seconds by dividing by the constant CLOCKS_PER_SEC. On linprog, you can look up more details in the manual page for clock (man clock).
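The clock()-based measurement described above might look like the following minimal sketch. The helper name `cpu_seconds` is hypothetical; it wraps any callable and reports the processor time consumed, which is what clock() measures (not wall-clock time).

```cpp
#include <ctime>

// Run `work` once and return the processor time it consumed, in seconds.
template <typename Work>
double cpu_seconds(Work work) {
    const std::clock_t start = std::clock();  // clock ticks before the algorithm
    work();
    const std::clock_t stop = std::clock();   // clock ticks after the algorithm
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

A call such as `cpu_seconds([&] { /* algorithm under test */ })` can then be repeated for inputs of increasing size to compare the observed growth against the predicted Big-O behavior. For very short runs the result may be 0 because of the clock's resolution.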

Example executable, some test cases

Download a set of 4 sample test files at this link. This is a tar file containing 4 test files (test0, test1, test2, test3). You will need to unpack this tar file in your project directory.

When you create your own executable, you'll need to redirect any test files as the standard input to your program, like this:

proj6.x < test0

The provided example executable can be run from linprog, at the location ~myers/dsprog/proj6.x . So, for example, you can run the same test file as in the example above with:

~myers/dsprog/proj6.x < test0

ABET Assessment

This is our assignment that will satisfy the ABET assessment requirement that students in this course are able to do algorithm complexity analysis on chosen solutions to problems.
