[Solved] CS584-Assignment 1-classify text paragraphs into three categories

  1. Document Classification (100 points). In this homework, you will classify text paragraphs into three categories by author (Fyodor Dostoyevsky, Arthur Conan Doyle, and Jane Austen) by building your own classifiers. The data provided is from Project Gutenberg. Follow the steps below:
  • (5pts) Preprocess data: remove punctuation, irrelevant symbols, common words (stop words), etc.
  • (5pts) Construct examples: Divide each document into multiple paragraphs. Each paragraph will be one example.
  • (5pts) Data split: Sample these paragraphs into training and testing data.
  • (5pts) Feature extraction: Build a vocabulary to represent each paragraph using only training data. Consider TF-IDF features for each input example.
  • (60pts) Build two classifiers (described below).
  • (5pts) Plot training loss and validation loss at each epoch.
  • (5pts) Using cross-validation on the training data, report the validation error (or accuracy) for each category.
  • (10pts) Compare both classifiers and provide an analysis for the results.
  • Classifier 1: Implement a Logistic Regression model using both gradient descent and stochastic gradient descent.
  • Classifier 2: Implement a Multilayer Perceptron (MLP) model using backpropagation and mini-batch gradient descent.
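
The feature-extraction and logistic-regression steps above can be sketched in plain Python as follows. This is only a minimal illustration under stated assumptions, not the assignment's reference solution: the stop-word list, the toy sentences (made up here, not Project Gutenberg text), the smoothed IDF formula, the learning rate, and all function names are hypothetical choices. It covers only the stochastic gradient descent variant of a softmax (three-class) logistic regression; the full-batch variant and the MLP are not shown.

```python
import math
import random
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real list is much longer).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "was"}

def tokenize(text):
    """Lowercase, keep alphabetic runs only, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def fit_tfidf(train_docs):
    """Build the vocabulary and smoothed IDF weights from training paragraphs only."""
    df = Counter()
    for doc in train_docs:
        df.update(set(tokenize(doc)))
    vocab = {w: i for i, w in enumerate(sorted(df))}
    n = len(train_docs)
    idf = [0.0] * len(vocab)
    for w, i in vocab.items():
        idf[i] = math.log((1 + n) / (1 + df[w])) + 1.0
    return vocab, idf

def transform(doc, vocab, idf):
    """Return an L2-normalised sparse TF-IDF vector as {feature_index: weight}."""
    counts = Counter(t for t in tokenize(doc) if t in vocab)
    vec = {vocab[w]: c * idf[vocab[w]] for w, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {i: v / norm for i, v in vec.items()}

def predict_proba(x, W, b):
    """Softmax over the per-class linear scores of a sparse input x."""
    scores = [b[k] + sum(W[k][i] * v for i, v in x.items()) for k in range(len(W))]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, W, b):
    p = predict_proba(x, W, b)
    return max(range(len(p)), key=p.__getitem__)

def train_sgd(X, y, n_classes, n_features, lr=0.5, epochs=50, seed=0):
    """Softmax regression trained with one example per update (SGD).
    Returns the weights, biases, and mean cross-entropy loss per epoch."""
    rng = random.Random(seed)
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    order, losses = list(range(len(X))), []
    for _ in range(epochs):
        rng.shuffle(order)
        total = 0.0
        for j in order:
            p = predict_proba(X[j], W, b)
            total += -math.log(p[y[j]] + 1e-12)
            for k in range(n_classes):
                g = p[k] - (1.0 if k == y[j] else 0.0)  # dLoss/dScore_k
                b[k] -= lr * g
                for i, v in X[j].items():
                    W[k][i] -= lr * g * v
        losses.append(total / len(X))
    return W, b, losses

# Made-up toy paragraphs; labels: 0 = Doyle, 1 = Austen, 2 = Dostoyevsky.
train_docs = [
    "The detective examined the footprint and the revolver.",
    "Holmes examined the cigar ash near the footprint.",
    "She hoped the ball would bring an agreeable marriage.",
    "An agreeable marriage was her mother's fondest hope.",
    "His soul suffered in torment and guilt.",
    "Guilt and torment consumed his feverish soul.",
]
train_labels = [0, 0, 1, 1, 2, 2]

vocab, idf = fit_tfidf(train_docs)
X = [transform(d, vocab, idf) for d in train_docs]
W, b, losses = train_sgd(X, train_labels, n_classes=3, n_features=len(vocab))
preds = [predict(x, W, b) for x in X]
held_out = predict(transform("The detective found a footprint", vocab, idf), W, b)
```

The `losses` list holds one value per epoch, which is what the loss-plotting step needs. A full-batch gradient descent variant would accumulate the same per-example gradients over all training paragraphs before applying a single update per epoch.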
