CS471 Assignment 2 - Search Engine Helper


Write a parallel program that searches a given corpus and returns the most relevant search results. The corpus is the Aristo Mini Corpus (https://www.kaggle.com/allenai/aristominicorpus).

Aristo Mini Corpus:

The Aristo Mini Corpus contains 1,197,377 science-relevant sentences drawn from public data. It provides simple science-relevant text that may be useful for answering elementary science questions. You will work with only 1,500 sentences, divided across 50 files of 30 lines each.

Input: a query in the form of a sentence or a question.

Output: search results that contain all the words of the query.

Example:

Search query:

Capital of Egypt

If the corpus has the following sentences:

File1:

There is a capital for each country.

Capital of Egypt is Cairo.

File2:

The Capital of Egypt is Cairo.

You can visit the country you want.

Output should be:

Capital of Egypt is Cairo.

The Capital of Egypt is Cairo.

Pseudocode of the search steps applied to each file:

For each Sentence in File:
    Match = true;
    For each Word in the Query:
        IF Word not in Sentence:
            Match = false;
    IF Match is true:
        Store Sentence;
        ResultsFound += 1;
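The per-sentence test in the pseudocode can be sketched in C as follows. This is a minimal sketch, not the required solution: the function names `contains_word` and `sentence_matches_query` are hypothetical, and two choices are assumptions not stated in the assignment, namely that matching is case-insensitive and that a query word must match a whole word of the sentence (punctuation is treated as a separator).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if `word` (length `len`) appears in `sentence` as a whole word,
   compared case-insensitively (an assumption); returns 0 otherwise. */
static int contains_word(const char *sentence, const char *word, size_t len)
{
    const char *p = sentence;
    while (*p) {
        /* Skip separators: anything that is not a letter or digit. */
        while (*p && !isalnum((unsigned char)*p)) p++;
        const char *start = p;
        while (*p && isalnum((unsigned char)*p)) p++;
        size_t tok_len = (size_t)(p - start);
        if (tok_len == len && tok_len > 0) {
            size_t i = 0;
            while (i < len &&
                   tolower((unsigned char)start[i]) ==
                   tolower((unsigned char)word[i]))
                i++;
            if (i == len) return 1;
        }
    }
    return 0;
}

/* Returns 1 if every word of `query` occurs in `sentence`, else 0.
   An empty query matches every sentence. */
int sentence_matches_query(const char *sentence, const char *query)
{
    assert(sentence != NULL && query != NULL);
    const char *q = query;
    while (*q) {
        while (*q && !isalnum((unsigned char)*q)) q++;
        const char *start = q;
        while (*q && isalnum((unsigned char)*q)) q++;
        size_t len = (size_t)(q - start);
        if (len > 0 && !contains_word(sentence, start, len))
            return 0; /* one missing word rules the sentence out */
    }
    return 1;
}
```

With the example above, `sentence_matches_query("Capital of Egypt is Cairo.", "Capital of Egypt")` returns 1, while "There is a capital for each country." is rejected because it lacks "of" and "Egypt". Returning early on the first missing word is equivalent to the Match flag in the pseudocode but skips the remaining query words.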

Parallel Scenario:

  • Use the master-slave paradigm.
  • The master distributes the corpus files among the slaves.
  • Each slave searches its assigned part of the corpus.
  • Each slave returns the number of search results it found and the corresponding relevant sentences.
  • The master collects the search results and their count and writes them to a file.
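One way the master might decide how many of the 50 files each slave receives is sketched below. The helper name `split_files` is hypothetical, and the even-split policy (earlier workers take one extra file when the division is not exact) is an assumption; the assignment does not prescribe it. The resulting `counts` and `displs` arrays have the shape that MPI_Scatterv and MPI_Gatherv expect for their count and displacement arguments, should you choose to use those calls.

```c
#include <assert.h>

/* Splits n_files across n_workers as evenly as possible.
   counts[i] = number of files for worker i,
   displs[i] = index of the first file assigned to worker i.
   The first (n_files % n_workers) workers receive one extra file. */
void split_files(int n_files, int n_workers, int *counts, int *displs)
{
    assert(n_files >= 0 && n_workers > 0);
    int base = n_files / n_workers;
    int extra = n_files % n_workers;
    int offset = 0;
    for (int i = 0; i < n_workers; i++) {
        counts[i] = base + (i < extra ? 1 : 0);
        displs[i] = offset;
        offset += counts[i];
    }
}
```

For 50 files and 4 workers this gives counts of 13, 13, 12, 12 and starting indices 0, 13, 26, 38.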

Expected input/output format:

Enter your query: sunlight energy nutrients

Output File:

Search Results Found = 2

Chlorophyll can make food the plant can use from carbon dioxide, water, nutrients, and energy from sunlight.

A process by which a plant produces its food using energy from sunlight, carbon dioxide from the air,and water and nutrients from the soil.
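The output file layout shown above could be produced by a small routine like the one below. This is a sketch under assumptions: the name `write_results` is hypothetical, and it writes one sentence per line directly after the count line, which may differ from the exact spacing the graders expect.

```c
#include <assert.h>
#include <stdio.h>

/* Writes the result count followed by one matching sentence per line.
   Returns 0 on success, -1 if a write fails. */
int write_results(FILE *out, const char *const *sentences, int count)
{
    assert(out != NULL && count >= 0);
    if (fprintf(out, "Search Results Found = %d\n", count) < 0)
        return -1;
    for (int i = 0; i < count; i++) {
        if (fprintf(out, "%s\n", sentences[i]) < 0)
            return -1;
    }
    return 0;
}
```

In the parallel scenario only the master would call this, after it has collected the sentences and counts from all slaves.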

Requirements:

  • Study the MPI lab on the scatter and gather methods.
  • You have one week for questions about the assignment and the lab (22 Mar. to 28 Mar.).
  • Use all the functions you have learned so far in the MPI library (using Allreduce and Allgather is not a must).
  • Choose your functions carefully: if a value should be sent to all slaves, use MPI_Bcast; if values are to be reduced with a specific operator, use MPI_Reduce; and so on.
  • Calculate the running time of the parallel program.
  • Run your code on the attached test cases to ensure your results are correct.
