Assignment 4: News Headlines classification using Naive Bayes classifier
Problem statement: Given a news headline, the objective is to predict the category of the news item. (Note: Use only the headline as input to find the category.) For example:
Short description:
Headline: Why Keeping a Food Journal Is Better Than Going on a Diet
Date:
Link:
Authors:
Category: HEALTHY LIVING
Consider the following categories only: Business, Comedy, Sports, Crime, Religion, Healthy Living, Politics
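As a starting point, the filtering step described above could be sketched as follows. This is a minimal sketch, not a reference solution: it assumes the dataset stores one JSON object per line with `headline` and `category` keys and upper-case category names (as in the example record), which should be checked against the actual file. The function name `load_headlines` and the second sample record are hypothetical.

```python
import json

# Categories named in the assignment, upper-cased to match the example record.
CATEGORIES = {"BUSINESS", "COMEDY", "SPORTS", "CRIME",
              "RELIGION", "HEALTHY LIVING", "POLITICS"}

def load_headlines(lines, categories=CATEGORIES):
    """Return (headlines, labels), keeping only records in the allowed categories.

    Assumes each non-empty line is a JSON object with 'headline' and 'category' keys.
    """
    headlines, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        category = record.get("category", "").upper()
        headline = record.get("headline", "").strip()
        if category in categories and headline:
            headlines.append(headline)  # the headline is the only input used
            labels.append(category)
    return headlines, labels

# Two in-memory records stand in for the file; the second one is made up.
sample = [
    '{"category": "HEALTHY LIVING", "headline": "Why Keeping a Food Journal Is Better Than Going on a Diet"}',
    '{"category": "TRAVEL", "headline": "A made-up headline outside the seven categories"}',
]
headlines, labels = load_headlines(sample)
```

With the real file, the same function could be fed an open file handle, for example `with open("news_category_dataset.json", encoding="utf-8") as f: headlines, labels = load_headlines(f)`, provided the one-object-per-line assumption holds.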
- Dataset: news_category_dataset.json
- Classification Algorithm:
Naive Bayes
- Features:
Train the classifier using the following features.
- Bag-of-words
- TF-IDF
- Create your own custom feature vectors.
For example, a feature vector can contain the following features:
- Current word (Unigram)
- POS tag of current word
- Position of the word
- Length of the news instance
Here, a total of three models must be trained: one using Bag-of-words features, one using TF-IDF features, and one using custom feature vectors.
For more information on feature selection, refer to the following paper:
Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan: Thumbs up? Sentiment Classification using Machine Learning Techniques. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002).
- Evaluation:
Perform 3-fold cross-validation for each model and report:
- Overall precision, recall and F1-score
- Category-wise precision, recall and F1-score
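The evaluation step can be illustrated with a small plain-Python sketch. It rests on assumptions the assignment does not fix: folds are formed by index modulo 3, held-out predictions from all folds are pooled before scoring, and the "overall" figures are macro averages of the category-wise scores (micro or weighted averaging would be equally defensible). The helper names and the majority-class stand-in model are hypothetical; any of the three Naive Bayes models could be plugged in through the `fit_predict` argument.

```python
from collections import Counter

def k_fold_indices(n_items, k=3):
    """Yield (train_idx, test_idx); item i is held out in fold i % k."""
    for fold in range(k):
        test = [i for i in range(n_items) if i % k == fold]
        train = [i for i in range(n_items) if i % k != fold]
        yield train, test

def per_category_scores(y_true, y_pred):
    """Return {category: (precision, recall, f1)}."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1
            fn[truth] += 1
    scores = {}
    for cat in sorted(set(y_true) | set(y_pred)):
        p = tp[cat] / (tp[cat] + fp[cat]) if tp[cat] + fp[cat] else 0.0
        r = tp[cat] / (tp[cat] + fn[cat]) if tp[cat] + fn[cat] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores[cat] = (p, r, f1)
    return scores

def macro_average(scores):
    """Unweighted mean of precision, recall and F1 over categories."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))

def cross_validate(texts, labels, fit_predict, k=3):
    """Pool held-out predictions over k folds, then score them once."""
    y_true, y_pred = [], []
    for train, test in k_fold_indices(len(texts), k):
        preds = fit_predict([texts[i] for i in train],
                            [labels[i] for i in train],
                            [texts[i] for i in test])
        y_true.extend(labels[i] for i in test)
        y_pred.extend(preds)
    scores = per_category_scores(y_true, y_pred)
    return scores, macro_average(scores)

def majority_baseline(train_texts, train_labels, test_texts):
    """Hypothetical stand-in model: always predict the most frequent training label."""
    top = Counter(train_labels).most_common(1)[0][0]
    return [top] * len(test_texts)

# Toy data (made up): six SPORTS headlines followed by three CRIME headlines.
texts = [f"headline {i}" for i in range(9)]
labels = ["SPORTS"] * 6 + ["CRIME"] * 3
scores, overall = cross_validate(texts, labels, majority_baseline)
```

On real data the headlines should be shuffled (or the folds stratified by category) before splitting, since the modulo split only gives balanced folds when the input order is not grouped by category in an unlucky way.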
Implementation notes:
- You can use existing libraries to implement the Naive Bayes algorithm and other tasks such as feature extraction, POS tagging, cross-validation, and calculation of metrics.
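
Although libraries may be used, it can help to see roughly what they compute. The sketch below builds the three feature sets with the standard library only and trains one multinomial Naive Bayes model per feature set. Everything in it is an assumption made for illustration: whitespace tokenization, a smoothed IDF formula, add-one smoothing, the "first/later" position buckets, the 8-token length threshold, and the four toy training headlines. POS-tag features are left out because they need an external tagger; they could be added to `custom_features` in the same way as the other entries.

```python
import math
from collections import Counter, defaultdict

def tokenize(headline):
    return headline.lower().split()

def bag_of_words(headline):
    """Map each token to its raw count in the headline."""
    return Counter(tokenize(headline))

def make_tfidf(train_headlines):
    """Return a function computing TF-IDF weights, with IDF estimated on training data."""
    n_docs = len(train_headlines)
    doc_freq = Counter()
    for headline in train_headlines:
        doc_freq.update(set(tokenize(headline)))

    def tfidf(headline):
        counts = Counter(tokenize(headline))
        # Smoothed IDF keeps the weight finite for tokens unseen in training.
        return {tok: tf * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1)
                for tok, tf in counts.items()}
    return tfidf

def custom_features(headline):
    """Unigram, unigram-with-position and headline-length features (no POS tags)."""
    tokens = tokenize(headline)
    feats = Counter()
    for i, tok in enumerate(tokens):
        position = "first" if i == 0 else "later"
        feats["word=" + tok] += 1
        feats["word@" + position + "=" + tok] += 1
    feats["length=" + ("short" if len(tokens) <= 8 else "long")] += 1
    return feats

class NaiveBayes:
    """Multinomial Naive Bayes over {feature: weight} dicts with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, feature_dicts, labels):
        self.class_counts = Counter(labels)
        self.totals = defaultdict(Counter)  # label -> summed feature weights
        for feats, label in zip(feature_dicts, labels):
            self.totals[label].update(feats)
        self.vocab = set().union(*self.totals.values())
        return self

    def predict(self, feats):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            totals = self.totals[label]
            denom = sum(totals.values()) + self.alpha * len(self.vocab)
            score = math.log(count / n_docs)  # log prior
            for feat, weight in feats.items():
                if feat in self.vocab:  # ignore features never seen in training
                    score += weight * math.log((totals[feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Made-up toy headlines, only to exercise the three models.
train = [("team wins championship game", "SPORTS"),
         ("coach praises team after game", "SPORTS"),
         ("police arrest robbery suspect", "CRIME"),
         ("suspect charged after robbery", "CRIME")]
train_texts = [text for text, _ in train]
train_labels = [label for _, label in train]

extractors = {"bow": bag_of_words,
              "tfidf": make_tfidf(train_texts),
              "custom": custom_features}
predictions = {}
for name, extract in extractors.items():
    model = NaiveBayes().fit([extract(t) for t in train_texts], train_labels)
    predictions[name] = [model.predict(extract(t))
                         for t in ("team game", "police charged suspect")]
```

Feeding fractional TF-IDF weights into a multinomial model is a common practical shortcut rather than something the model's count-based derivation strictly covers, so the TF-IDF variant is best read as a heuristic.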
