
In this homework, we’ll use web scraping to collect the 5 trending stories from the Montreal Gazette. Our
objective is to collect the title, publication date, author, and opening “blurb” (as seen in the screen capture
here).

We want a script, “collect_trending.py”, that does all the work (i.e., we don’t have to specify the trending articles
one by one or in a list; the script figures out which they are and then grabs them off the website).
To do this, you’ll have to write a scraper for two different page templates.
– To get the trending stories (and links to them), you’ll first need to scrape the homepage of the Montreal
Gazette (https://montrealgazette.com/category/news/).
– Then, once you have links to the trending stories, you’ll need to scrape the key information off the article
page itself.

collect_trending.py is run as follows:
python collect_trending.py -o trending.json
Such that trending.json has the format:
[
  {
    "title": "article title",
    "publication_date": "date",
    "author": "author",
    "blurb": "blurb"
  },
  {
    … article info
  },
  …
]

For both page templates, use caching to avoid overly taxing the Montreal Gazette website.

Submission Instructions
– Submit all your code in a zip file hw7.zip
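
The caching requirement and the -o output option can be sketched as below. This is only a minimal sketch under stated assumptions, not a reference solution: the helper names (download, fetch_cached, write_articles, parse_args), the cache directory layout, and the User-Agent header are all hypothetical choices that the assignment does not prescribe. The actual parsing of the homepage and article templates is left out, since it depends on the site's current HTML.

```python
import argparse
import hashlib
import json
from pathlib import Path
from urllib.request import Request, urlopen


def download(url):
    """Fetch a page over HTTP and return its HTML as text."""
    # The User-Agent value is an assumption; some sites reject the default one.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_cached(url, cache_dir, fetch=download):
    """Return the HTML for url, calling fetch only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it becomes a safe, fixed-length file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    cache_file = cache_dir / name
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    html = fetch(url)
    cache_file.write_text(html, encoding="utf-8")
    return html


def write_articles(articles, output_path):
    """Write the list of article dicts as a JSON array."""
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(articles, f, indent=2, ensure_ascii=False)


def parse_args(argv=None):
    """Parse the -o option shown in the run command."""
    parser = argparse.ArgumentParser(description="Collect trending stories.")
    parser.add_argument("-o", dest="output", required=True,
                        help="path of the JSON file to write")
    return parser.parse_args(argv)
```

Both page templates can go through the same fetch_cached call, so a repeated run reads the homepage and each article page from disk instead of requesting them again. Passing the downloader in as a parameter also makes it possible to try the cache logic without touching the website.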

[SOLVED] Comp 370 homework 7 – web scraping
