[Solved] INFORMATION RETRIEVAL-CS 121 / INF 141 Project 2 The Spacetime Crawler


Project 2 The Spacetime Crawler

This assignment is to be done in groups of up to 3. You can use text-processing code that you or any classmate in your team wrote for the previous assignment. You cannot use crawler code written by classmates outside your group. Use code found on the Internet at your own peril: it may not do exactly what the assignment requests. If you do end up using code you find on the Internet, you must disclose the origin of the code. As stated in the course policy document, concealing the origin of a piece of code is plagiarism. Use the Discussion Board on Piazza for general questions whose answers can benefit you and everyone.

Your crawler is standalone but shares data with the rest of the crawlers in the course. Each crawler has its own frontier on a server in ICS, and can manage this frontier. The frontier does a lot of the heavy work. Your crawler will be given 1 URL at a time and should proceed to download and process it.

Implementing your Project

Step 1 Getting the project

git clone https://github.com/Mondego/spacetime-crawler

Step 2 Installing the dependencies

Make sure you do not have conflicting libraries by issuing the command:

python -m spacetime version

You should see the following output:

Spacetime Version is 2.0

Rtypes Version is 2.0

If the output does not match, or if the command returns the error unrecognized argument: version, uninstall the old spacetime and rtypes packages by issuing the commands:

python -m pip uninstall spacetime

python -m pip uninstall rtypes

Then pull the latest version of the spacetime-crawler repository to get the latest versions of spacetime and rtypes. Both packages are included with the assignment.
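The version check in this step can be automated by comparing the command's output against the two expected lines. The following is only a minimal sketch: the function name versions_match and the choice to ignore surrounding whitespace are assumptions made for illustration, not part of the assignment code.

```python
# Expected lines, taken from the output shown in Step 2.
EXPECTED_LINES = ("Spacetime Version is 2.0", "Rtypes Version is 2.0")


def versions_match(output):
    """Return True when every expected version line appears in the output.

    `output` is the text printed by the version command. Leading and
    trailing whitespace on each line is ignored (an assumption of this sketch).
    """
    seen = set(line.strip() for line in output.splitlines())
    return all(expected in seen for expected in EXPECTED_LINES)


if __name__ == "__main__":
    sample = "Spacetime Version is 2.0\nRtypes Version is 2.0\n"
    print(versions_match(sample))
```

If the function returns False for the text your installation prints, follow the uninstall instructions in this step before continuing.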

Step 3 Writing the required classes, functions, and parameters

You must set this up correctly to get credit for the project. If we can't trace your crawler in our logs, it's equivalent to you not doing the project.

1. Write out the details of your teammates and you in the team.txt file: The details are a comma-separated list of your UCInetID and student number. Each team member is written on a new line. E.g.:

panteater,12345678

peterant,87654321

2. Run generate_crawler_application.py: It will generate files in two folders: applications/search and datamodel/search with the crawler code in them customized by the details in team.txt. If the details
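Because credit depends on team.txt being correct, it may help to sanity-check the file before running the generator. The sketch below is an illustration only: the function name parse_team, the assumption that a UCInetID is alphanumeric, and the tolerance for blank lines are not stated in the assignment. It does not reproduce whatever checks generate_crawler_application.py itself performs.

```python
import re

# Assumed line shape, inferred from the example in step 1:
# "<UCInetID>,<student number>" with no spaces.
LINE_PATTERN = re.compile(r"([A-Za-z0-9]+),([0-9]+)")
MAX_TEAM_SIZE = 3  # the assignment allows groups of up to 3


def parse_team(text):
    """Return a list of (ucinetid, student_number) tuples, or raise ValueError."""
    members = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are skipped (an assumption of this sketch)
        match = LINE_PATTERN.fullmatch(line)
        if match is None:
            raise ValueError(
                "line {}: expected 'UCInetID,student number', got {!r}".format(
                    line_number, line
                )
            )
        members.append((match.group(1), match.group(2)))
    if not 1 <= len(members) <= MAX_TEAM_SIZE:
        raise ValueError(
            "expected 1 to {} team members, found {}".format(
                MAX_TEAM_SIZE, len(members)
            )
        )
    return members


if __name__ == "__main__":
    print(parse_team("panteater,12345678\n"))
```

To check a real file, read it with open("team.txt").read() and pass the text to parse_team; a ValueError names the first line that does not fit the assumed format.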
