[Solved] Programming Project 8



Assignment Overview

The Problem

Computer-based translation is one of the most important and challenging problems facing computer science. Because of the significant differences between the semantics of different languages, approaching this problem involves much more than translating each word and concatenating the resulting translations. However, in this assignment we will take that simple approach: simply translate word by word. In this project you are supposed to translate a paragraph originally written in English to 7 other languages. We have selected some of the most widely used languages according to Wikipedia: https://en.wikipedia.org/wiki/List_of_languages_by_total_number_of_speakers

Objectives

The main goal of this project is to help you understand the notion of dictionaries versus the notion of lists in Python (and the corresponding data structures in any other programming language). After completing your part of the code, you should be able to pinpoint performance differences between the two of them.
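To make the list-versus-dictionary difference concrete, here is a minimal sketch that times one lookup in each structure. The data is made up (6,000 hypothetical word pairs, roughly the size of one of the provided databases), and the helper names lookup_in_list and lookup_in_dict are illustrative only; they are not part of the provided project files.

```python
import timeit

# Hypothetical data: 6,000 made-up (word, translation) pairs.
pairs = [(f"word{i}", f"mot{i}") for i in range(6000)]
as_dict = dict(pairs)


def lookup_in_list(word, list_db):
    """Scan the list pair by pair until the word is found."""
    for source, translation in list_db:
        if source == word:
            return translation
    return word  # not found: return the word unchanged


def lookup_in_dict(word, dict_db):
    """Look the word up directly by its key."""
    return dict_db.get(word, word)


# Worst case for the list: the word sits at the very end.
target = "word5999"
list_time = timeit.timeit(lambda: lookup_in_list(target, pairs), number=200)
dict_time = timeit.timeit(lambda: lookup_in_dict(target, as_dict), number=200)
print(f"list: {list_time * 1000:.2f} ms, dict: {dict_time * 1000:.2f} ms")
```

The list version has to compare against entries one at a time, so its cost grows with the size of the database, while the dictionary version is a hash lookup that takes roughly constant time on average. Exact timings will differ from machine to machine.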

Assignment Deliverable

The deliverable for this assignment is the following file:

proj08.py - the source code for your Python program

Be sure to use the specified file name and to submit it for grading via the handin system before the project deadline.

Background

The Text Encoding Initiative (TEI) is a consortium which collectively develops and maintains a standard for the representation of texts in digital form. They have developed the TEI format which has become the standard method of storing texts in digital form. You can read more about TEI here: http://www.tei-c.org

One of the organizations using this format is http://www.freedict.org which offers an extensive and free set of bilingual dictionaries. For our project we will use 7 of their dictionaries to convert from English to Arabic, Spanish, Hindi, Russian, French, German and Italian! All their dictionaries are available for download at https://github.com/freedict/fd-dictionaries, but we have copied the 7 we need into the project directory. They are provided both individually and collected together into a zip file for your convenience. They are large.

Project Description / Specification

Overview: we provide a working program based on lists; you code a solution based on dictionaries.

Most of the code for this project has already been done for you, except for the most important part! Completing the missing code is your task. Our TEI parsing logic is encapsulated in tei_parser.py. Feel free to take a look, but do not modify anything (you do not need to), as we will be using our own version of the file while grading.

This code uses two forms of data representation (data structures) for its operation. The first representation is list-based. Each database (DB) is represented using only lists; the term ‘database’ is used loosely in our context to refer to the data structure holding the words along with their translations. The code for this section is completely done for you. Without modifying anything, just running the file will display 7 translations of your original text.

We claim, however, that this is not an efficient way of approaching this problem! At the end of your output you will see a ‘Running-Time Summary’ section. On our machine it took approximately 200 milliseconds to translate a small paragraph. We also claim that using another data representation can drastically decrease the time required for translation. (Do not be deceived by the 0.00 milliseconds displayed at the end of your output. We cannot take it down that far, of course! This happens only because you have not done your part yet.)

Our second form of representation is a dictionary-based structure. This is your part. You need to convert the already existing list-based representation into a dictionary-based representation.

1) getDictFormattedDatabases(listDatabases). Your first task is to implement this function by replacing the filler statement pass with Python code.
The parameter is a list of tuples where the second item in each tuple is a list of tuples of the form (word, translation). That is, the parameter listDatabases is a list of tuples, structured as follows:

    [(lang1-name, list-based-DB1), (lang2-name, list-based-DB2) ...]

For example:

    [("Arabic", arabicListBasedDb), ("Hindi", hindiListBasedDb) ...]

A single list-based DB (the second element in each of the tuples above) is again a list of tuples, as follows:

    [(word1, word1-translation), (word2, word2-translation) ...]

For example:

    [('have', 'avoir'), ('baby', 'bébé') ...]

The function returns a list of tuples where the second item in each tuple is no longer a list, but has been converted to a dictionary of items in the form word:translation. That is, the following form:

    [(lang1-name, dict-based-DB1), (lang2-name, dict-based-DB2) ...]

For example:

    [("Arabic", arabicDictBasedDb), ("Hindi", hindiDictBasedDb) ...]

A single dict-based DB (the second element in each of the tuples above) takes the following form:

    {word1: word1-translation, word2: word2-translation ...}

For example:

    {'have': 'avoir', 'baby': 'bébé' ...}

2) translateFromDict(text, dictBasedDb). This function should return the translation of the first parameter (text, a string) into the target language defined by the second parameter (dictBasedDb). Think of the parameter text as a word to be translated. Simply do a word-to-word translation. Special care should be given to words followed by a full stop (a.k.a. period). Notice that, due to the initial preprocessing of the input text, a full stop will always be adjacent to its preceding word and at least one space away from its subsequent word. If a word has a period attached, its translation must have a period attached. Return a string that is a translation of the parameter text.
If text is not in dictBasedDb, simply return text unchanged.

3) performDictBasedTranslations(preprocessedText, dictBasedDatabases). Finally, complete the implementation of the function performDictBasedTranslations. The parameter preprocessedText is the complete text to be translated; dictBasedDatabases is a list of all the language translation databases. This function should loop over all your dict-based databases to translate preprocessedText into each of the languages. Display the name of the language and the translation using the provided display function.

4) You may use other functions; I didn’t for this project.

5) In the original language DB there may be multiple translations for one English word. We ignore that, so any one of those translations is accepted; we don’t care which.

Hints

As with most problems, break this problem down into smaller pieces. Like the last project, it is best to start with a smaller set of input data. In this case you can “easily” create it.

a. Begin by not using the provided proj08.py. That is, build a separate test program starting from scratch.

b. Create your own listDatabases as described in the project description above for getDictFormattedDatabases. The original “ListBasedDb” have 6K or so entries (tuples). For testing you only need two or three (see next item). Instead of 7 languages, only use two or three. Constructing this is an excellent example of something to do with others because this will not be part of your final project; it is only for testing. Note that each tuple has an English word paired with that word translated into the language of choice, e.g. for Spanish, ('hello', 'hola'). (Note that the translated word could be gibberish, e.g. ('hello', 'xxxx').)

c. Create your own preprocessedText. Use something really simple such as 'hello world.' (note that I included a period). If you use 'hello world', then include those two words in your listDatabases.

d. Now you can write your own getDictFormattedDatabases function.
It will take your listDatabases (the list-based objects) and return the same information with dictionaries (actually a list of tuples containing dictionaries).

e. In a similar way you can create the other specified functions.

f. Once you get them all working with your small listDatabases, you can copy your code into the provided proj08.py to complete and test the project.

Sample Output

There is no sample output. The proj08.py we provide generates all the translations using lists. Your program will generate the same output; that is, the same translation will be printed twice. At the end of the program some run times will be printed. Different computers will generate different times.

========================================

Educational Research

When you have completed the project, insert the 5-line comment specified below.

For each of the following statements, please respond with how much they apply to your experience completing the programming project, on the following scale:

1 = Strongly disagree / Not true of me at all
2
3
4 = Neither agree nor disagree / Somewhat true of me
5
6
7 = Strongly agree / Extremely true of me

***Please note that your responses to these questions will not affect your project grade, so please answer as honestly as possible.***

Q1: Upon completing the project, I felt proud/accomplished
Q2: While working on the project, I often felt frustrated/annoyed
Q3: While working on the project, I felt inadequate/stupid
Q4: Considering the difficulty of this course, the teacher, and my skills, I think I will do well in this course.

Please insert your answers at the bottom of your project program as a comment, formatted exactly as follows (so we can write a program to extract them).

# Questions
# Q1: 5
# Q2: 3
# Q3: 4
# Q4: 6
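As an illustration of the stand-alone test program suggested in the hints, the three functions from the specification could be sketched roughly as follows. This is only one possible approach under stated assumptions, not the reference solution: the two-language test data is made up, print() stands in for the provided display function (whose signature is not given here), and the return value of performDictBasedTranslations is added purely so the result can be checked.

```python
def getDictFormattedDatabases(listDatabases):
    """Turn [(language, [(word, translation), ...]), ...] into
    [(language, {word: translation, ...}), ...]."""
    # dict() keeps the last translation when a word appears more than once,
    # which is fine here because any one translation is accepted.
    return [(language, dict(pairs)) for language, pairs in listDatabases]


def translateFromDict(text, dictBasedDb):
    """Translate a single word, preserving a trailing period."""
    if text.endswith("."):
        word = text[:-1]
        return dictBasedDb.get(word, word) + "."
    # Unknown words are returned unchanged.
    return dictBasedDb.get(text, text)


def performDictBasedTranslations(preprocessedText, dictBasedDatabases):
    """Translate the whole text into every language, word by word."""
    results = []
    for language, dictBasedDb in dictBasedDatabases:
        words = preprocessedText.split()
        translation = " ".join(translateFromDict(w, dictBasedDb) for w in words)
        # Assumption: print() stands in for the provided display function.
        print(f"{language}: {translation}")
        results.append((language, translation))
    return results  # returned only to make this sketch easy to check


# Tiny made-up test data, as suggested in the hints.
listDatabases = [
    ("Spanish", [("hello", "hola"), ("world", "mundo")]),
    ("French", [("hello", "bonjour"), ("world", "monde")]),
]
dictDatabases = getDictFormattedDatabases(listDatabases)
translations = performDictBasedTranslations("hello world.", dictDatabases)
```

Using dict.get with the word itself as the default covers the "return the text if it is not in the database" rule without a separate membership test.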
