
In this assignment, we’re interested in the main topics discussed on the /r/mcgill subreddit vs. the /r/concordia subreddit. We’ll do this using human annotation … and you’re the annotator.

First, let’s collect some Reddit posts (using the /new.json endpoint). We’ll collect two data files: one from the McGill subreddit and one from the Concordia subreddit. For the purpose of this assignment, collect them manually: in a web browser, get the JSON dump and download it to a file. You should have a mcgill.json file and a concordia.json file.

Write a script extract_to_tsv.py that accepts one of the files you collected from Reddit and outputs a random selection of posts from that file to a TSV (tab-separated values) file. It should function like this:

python3 extract_to_tsv.py -o <out_file> <json_file> <num_posts_to_output>

If <num_posts_to_output> is greater than the number of posts in the file, then the script should just output all of them. If there are more posts than <num_posts_to_output> (which is likely the case), then it should randomly select num_posts_to_output (the parameter you passed to the script) of them and just output those.

The output format (written to out_file) is three tab-separated columns: Name, title, coding.
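The script described above could be sketched roughly as follows. This is a minimal illustration, not a reference solution, and it rests on several assumptions: that the downloaded file is a standard Reddit listing (posts under data → children → data, each with name and title fields), that the first row of the output is a Name/title/coding header, that the coding column is left blank for the annotator to fill in, and that the positional arguments are called json_file and num_posts_to_output. The helper names load_posts, select_posts, and write_tsv are hypothetical.

```python
import argparse
import csv
import json
import random


def load_posts(json_path):
    """Return the list of post dicts from a saved Reddit listing."""
    with open(json_path, encoding="utf-8") as f:
        listing = json.load(f)
    # Assumption: standard listing layout, data -> children -> data.
    return [child["data"] for child in listing["data"]["children"]]


def select_posts(posts, num_posts, rng=random):
    """Return every post if num_posts covers them all, else a random subset."""
    if num_posts >= len(posts):
        return list(posts)
    # random.sample draws without replacement, so no post is repeated.
    return rng.sample(posts, num_posts)


def write_tsv(posts, out_path):
    """Write a header row, then one tab-separated row per post."""
    with open(out_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(["Name", "title", "coding"])
        for post in posts:
            # Assumption: coding stays empty until the annotator fills it in.
            writer.writerow([post["name"], post["title"], ""])


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Sample Reddit posts from a JSON dump into a TSV file."
    )
    parser.add_argument("-o", dest="out_file", required=True)
    parser.add_argument("json_file")
    parser.add_argument("num_posts_to_output", type=int)
    args = parser.parse_args(argv)

    posts = load_posts(args.json_file)
    write_tsv(select_posts(posts, args.num_posts_to_output), args.out_file)
```

To use this as a command-line script, end the file with the usual `if __name__ == "__main__": main()` guard. Using csv.writer instead of joining strings by hand means a title that happens to contain a tab or a quote character is quoted rather than silently breaking the column layout.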


[SOLVED] Comp 370 homework 8 – data annotation
