[Solved] CSE802-Project

$25

File Name: CSE802-Project.zip
File Size: 131.88 KB

SKU: [Solved] CSE802-Project Category: Tag:
5/5 - (1 vote)

Dataset

  • Select a dataset that has many samples
  • Important to consider the number of classes (c) and the number of features (d) in the dataset
  • Ideally, you want to select a dataset which has at least 4 classes and

50 dimensions

  • This is an arbitrary number the point is to work on a realistic problem rather than a toy problem
  • Please review D2L for dataset suggestions

Training Set, Validation Set, Test Set

  • Partition your dataset into training set, validation set and test set
  • Use the training set to develop the classifier and the validation set to select suitable parameters for the classifier (fine-tuning parameters)
  • Use the test set to evaluate the performance of the tuned classifier
  • Present your results using a confusion matrix besides reporting the overall classification accuracy
  • Which classes are often confused?
  • What is the variance in classification accuracy when the partitioning exercise is repeated multiple times?
  • What is the impact of imbalanced classes on classification accuracy?
  • .

Pre-processing the Dataset

  • Data normalization:
  • Experiment with different feature normalization schemes
  • For example, should features be normalized in the range [0, 1] Or, should they be normalized using the z-score technique
  • Is normalization needed at all?
  • Dimensionality reduction:
  • Experiment with feature extraction (projection) and feature selection (e.g., SFFS) techniques
  • You can use already available software to test these options

Experimenting with Different Classifiers

  • Test the performance of different classifiers on the dataset
  • See D2L for software packages that can be used
  • Which classifier results in the best accuracy?
  • Write your own code to design a Bayesian Classifier based on the maximum a posteriori principle
  • Estimate class-conditional PDFs assuming a Multi-Variate Gaussian density function (or any other multi-variate density function)
  • Estimate class-conditional PDFs assuming that features are independent and that every feature variable can be modeled using a Gaussian (or any other function)
  • Use non-parametric density estimation schemes to determine the class-conditional density values of a test sample

The Goal

  • Gain insight into the dataset you selected
  • Test at least
  • 2 dimensionality reduction schemes
  • 3 classifiers from existing software packages
  • Bayesian classifiers that youd designed in homework #2, #3, #4
  • Test the impact of changing the training data do this multiple times in order to obtain the mean and variance of the error rate
  • Draw some conclusions about the dataset, the features used, stability of various classifiers, etc.
  • Submit the report by May 1, 2020
  • Grade will be based on the degree of analysis conducted

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[Solved] CSE802-Project
$25