For each data set, your project will be evaluated as follows:
- You will get more points for larger/messier data sets:
  - 0-5 pts: <30K
  - 6-10 pts: >=30K
- Data cleaning:
  - provide a link to where you found the data
  - describe the steps you took to clean the data (more points for messier data that needed cleaning)
- Data exploration:
  - use at least 5 R functions for data exploration
  - create at least 2 informative R graphs for data exploration
- Run at least 3 ML algorithms on each data set, using at least 5 algorithms in all.
  - this portion of your R script should include:
    - code to run the algorithms
    - commentary on the feature selection you performed and why
    - code to compute your evaluation metrics, as well as commentary discussing the results
- Run at least one ensemble method, such as Random Forest or XGBoost.
  - this portion of your R script should include:
    - code to run the algorithms
    - commentary on the feature selection you performed and why
    - code to compute your evaluation metrics, as well as commentary discussing the results
- Results analysis:
  - this portion of your R script should include:
    - a ranking of the algorithms from best to worst performing on your data
    - commentary on the performance of the algorithms
    - your analysis of why the best performing algorithm worked best on that data
    - commentary on what your script was able to learn from the data (big picture) and whether this is likely to be useful
- Project depth:
  - 0-3: project minimally meets requirements
  - 4-6: project exceeds minimum requirements
  - 7-10: project went well above the requirements
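
The exploration, modeling, evaluation and ranking items above could be laid out in an R script roughly as follows. This is only a minimal sketch under stated assumptions, not a model answer: the `Pima.tr`/`Pima.te` data frames from `MASS` stand in for your own (much larger) data set, the three algorithms and the accuracy metric are illustrative choices, and the recommended packages `MASS`, `rpart` and `class` are assumed to be installed. It does not cover the data cleaning or ensemble items.

```r
# Assumption: MASS, rpart and class (recommended packages) are installed.
library(MASS)   # Pima.tr / Pima.te, used here only as stand-in data
library(rpart)  # decision tree
library(class)  # knn()

train <- Pima.tr  # stand-in training set; target column is the factor `type`
test  <- Pima.te  # stand-in test set

# --- Data exploration: five base R functions ---
str(train)
summary(train)
dim(train)
head(train)
colSums(is.na(train))  # missing values per column

# --- Data exploration: two graphs ---
if (!interactive()) pdf(NULL)  # avoid writing Rplots.pdf under Rscript
boxplot(glu ~ type, data = train,
        main = "Glucose by diabetes status", xlab = "type", ylab = "glu")
hist(train$bmi, main = "Distribution of BMI", xlab = "bmi")

# --- Metric: proportion of correct predictions ---
accuracy <- function(pred, truth) mean(pred == truth)

# --- Algorithm 1: logistic regression ---
fit_glm  <- glm(type ~ ., data = train, family = binomial)
prob_glm <- predict(fit_glm, newdata = test, type = "response")
pred_glm <- ifelse(prob_glm > 0.5, "Yes", "No")  # "Yes" is the second factor level

# --- Algorithm 2: decision tree ---
fit_tree  <- rpart(type ~ ., data = train, method = "class")
pred_tree <- predict(fit_tree, newdata = test, type = "class")

# --- Algorithm 3: kNN on predictors scaled with the training statistics ---
predictors <- setdiff(names(train), "type")
x_train <- scale(train[, predictors])
x_test  <- scale(test[, predictors],
                 center = attr(x_train, "scaled:center"),
                 scale  = attr(x_train, "scaled:scale"))
set.seed(1)  # knn() breaks ties at random
pred_knn <- knn(x_train, x_test, cl = train$type, k = 5)

# --- Results: rank the algorithms from best to worst ---
results <- data.frame(
  algorithm = c("logistic regression", "decision tree", "kNN (k = 5)"),
  accuracy  = c(accuracy(pred_glm,  test$type),
                accuracy(pred_tree, test$type),
                accuracy(pred_knn,  test$type))
)
results <- results[order(results$accuracy, decreasing = TRUE), ]
print(results)

# Confusion matrix for one model, to support the written commentary
print(table(predicted = pred_glm, actual = test$type))
```

In a real submission, each block would be followed by comments explaining the feature selection and interpreting the metrics, since the rubric asks for commentary as well as code.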
