
Comp 370 homework 7 – web scraping


In this homework, we’ll be using web scraping to collect the 5 trending stories from the Montreal Gazette. Our objective is to collect the title, publication date, author, and opening “blurb” (as seen in the screen capture here).

We want a script “collect_trending.py” that does all the work (i.e., we don’t have to specify the trending articles one by one or in a list). The script goes, figures out which they are, and then grabs them off the website.

So to do this, you’ll have to write a scraper for two different page templates.
– To get the trending stories (and links to them), you’ll need to first scrape the homepage of Montreal Gazette (https://montrealgazette.com/category/news/).
– Then once you have links to the trending stories, you’ll need to scrape the key information off the article page itself.
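The first of the two steps above (finding the trending links on the homepage) might be sketched as follows, using only the standard library. The container class name "trending-list" and the sample HTML are assumptions made up for illustration; the real markup of the Montreal Gazette page has to be inspected in a browser, and the class name adjusted to match it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

HOMEPAGE_URL = "https://montrealgazette.com/category/news/"


class TrendingLinkParser(HTMLParser):
    """Collect <a href> values found inside an element with a given class."""

    def __init__(self, container_class, base_url):
        super().__init__()
        self.container_class = container_class  # hypothetical class name
        self.base_url = base_url
        self.container_tag = None  # tag name of the container we are inside
        self.depth = 0             # nesting depth of that tag name
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if self.container_tag is None:
            # Not inside the container yet: check whether this tag opens it.
            classes = (attr_map.get("class") or "").split()
            if self.container_class in classes:
                self.container_tag = tag
                self.depth = 1
            return
        if tag == self.container_tag:
            self.depth += 1
        if tag == "a" and attr_map.get("href"):
            url = urljoin(self.base_url, attr_map["href"])
            if url not in self.links:  # skip duplicate links
                self.links.append(url)

    def handle_endtag(self, tag):
        if self.container_tag is not None and tag == self.container_tag:
            self.depth -= 1
            if self.depth == 0:
                self.container_tag = None  # left the container


def extract_trending_links(html, container_class="trending-list",
                           base_url=HOMEPAGE_URL, limit=5):
    """Return up to `limit` absolute article URLs from the homepage HTML."""
    parser = TrendingLinkParser(container_class, base_url)
    parser.feed(html)
    return parser.links[:limit]


# Made-up HTML standing in for the real homepage.
SAMPLE_HTML = """
<div class="nav"><a href="/about">About</a></div>
<div class="trending-list">
  <div><a href="/news/story-one">One</a></div>
  <div><a href="/news/story-two">Two</a><img src="x.png"></div>
</div>
<a href="/footer">Footer</a>
"""

trending_links = extract_trending_links(SAMPLE_HTML)
```

The second step would follow the same pattern: feed each article page to a parser that picks out the title, publication date, author, and blurb. A third-party library such as BeautifulSoup is a common alternative to writing the parser by hand.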
collect_trending.py is run as follows:
python collect_trending.py -o trending.json
Such that trending.json has the format:
[
  {
    "title": "article title",
    "publication_date": "date",
    "author": "author",
    "blurb": "blurb"
  },
  {
    … article info
  },

]

For both page templates, use caching to avoid overly taxing the Montreal Gazette website.
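One possible way to combine the caching requirement with the -o option and the JSON output is sketched below. The cache directory name, the hashing scheme for cache file names, the User-Agent header, and all function names are assumptions chosen for illustration, not part of the assignment.

```python
import argparse
import hashlib
import json
import urllib.request
from pathlib import Path

CACHE_DIR = Path("cache")  # hypothetical cache location


def download(url):
    """Fetch a page over the network and return its text."""
    # The User-Agent value is an assumption; some sites reject the default one.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_cached(url, cache_dir=CACHE_DIR, fetch=download):
    """Return the page at `url`, using the network only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so that any URL maps to a safe file name.
    file_name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    cache_file = cache_dir / file_name
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    html = fetch(url)
    cache_file.write_text(html, encoding="utf-8")
    return html


def write_articles(articles, output_path):
    """Write the list of article dictionaries as a JSON array."""
    with open(output_path, "w", encoding="utf-8") as handle:
        json.dump(articles, handle, indent=2, ensure_ascii=False)


def parse_args(argv=None):
    """Parse the command line: python collect_trending.py -o trending.json"""
    parser = argparse.ArgumentParser(
        description="Collect trending Montreal Gazette stories.")
    parser.add_argument("-o", dest="output", required=True,
                        help="path of the JSON file to write")
    return parser.parse_args(argv)
```

In this sketch, both the homepage and every article page would be requested through fetch_cached, so rerunning the script while debugging the parsers reads from disk instead of hitting the website again. Since the trending list changes over time, deleting the cache directory forces a fresh download.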
Submission Instructions
– Submit all your code in a zip file hw7.zip
