In class we covered the Decision Tree and K-nearest neighbor algorithms. This time we use these different classifiers/regressors to analyze a data set and compare their performance.
Problem
- In this assignment you need to use Decision Tree and K-nearest neighbor to analyze the data set.
- You need to submit your code and a report. The report should include your results, analyzed using different performance metrics. You also need to discuss your ideas and conclusions about the results (e.g., you can explain why one classifier is better or worse than another).
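Because the output variable (area) is continuous, one possible set of performance metrics for the report is mean absolute error, root mean squared error, and R². The following is a minimal sketch, assuming the predictions of each model are already available as plain Python lists. The function names and the sample numbers are illustrative only and are not results from the data set.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared error; penalizes large misses more."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def r2_score(y_true, y_pred):
    """1 minus residual sum of squares over total sum of squares.

    Undefined when all targets are equal; this sketch returns 0.0 in that case.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0

# Made-up numbers purely for illustration.
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred_a = [0.5, 1.0, 1.5, 3.0]   # hypothetical model A
y_pred_b = [1.5, 1.5, 1.5, 1.5]   # hypothetical model B: always predicts the mean

for name, y_pred in [("A", y_pred_a), ("B", y_pred_b)]:
    print(name,
          mean_absolute_error(y_true, y_pred),
          root_mean_squared_error(y_true, y_pred),
          r2_score(y_true, y_pred))
```

A model that always predicts the mean of the targets scores an R² of 0 on those targets, which makes it a useful baseline when discussing why one model is better or worse than another.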
Data set
Split the data randomly into training data and test data (70% / 30%), then do your analysis.
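The random 70% / 30% split can be sketched as follows. This is one possible approach using only the standard library; the helper name `split_train_test`, the fixed seed, and the stand-in rows are assumptions made for illustration.

```python
import random

def split_train_test(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and return (train, test).

    A fixed seed keeps the split reproducible between runs.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in for the data set rows; real rows would come from the data file.
rows = list(range(10))
train, test = split_train_test(rows)
print(len(train), len(test))
```

If scikit-learn is available, its `train_test_split` function with `test_size=0.3` does the same job.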
Use the Forest Fires Data Set.
Attribute Information:
- X x-axis spatial coordinate within the Montesinho park map: 1 to 9
- Y y-axis spatial coordinate within the Montesinho park map: 2 to 9
- month month of the year: jan to dec
- day day of the week: mon to sun
- FFMC FFMC index from the FWI system: 18.7 to 96.20
- DMC DMC index from the FWI system: 1.1 to 291.3
- DC DC index from the FWI system: 7.9 to 860.6
- ISI ISI index from the FWI system: 0.0 to 56.10
- temp temperature in Celsius degrees: 2.2 to 33.30
- RH relative humidity in %: 15.0 to 100
- wind wind speed in km/h: 0.40 to 9.40
- rain outside rain in mm/m2: 0.0 to 6.4
- area the burned area of the forest (in ha): 0.00 to 1090.84. This output variable is very skewed towards 0.0, so it may make sense to transform it using the logarithm.
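The logarithm adjustment for area might look like the sketch below. The attribute description only says "the logarithm"; using ln(area + 1) is an assumption made here so that the many rows with an area of 0.0 remain defined. The function names are illustrative.

```python
import math

def to_log_area(area):
    """Compress the skewed target with ln(area + 1); 0.0 maps to 0.0."""
    return math.log1p(area)

def from_log_area(log_area):
    """Invert the transform so predictions can be reported in hectares."""
    return math.expm1(log_area)

# Sample values spanning the documented range of the attribute.
areas = [0.0, 0.5, 10.0, 1090.84]
log_areas = [to_log_area(a) for a in areas]
restored = [from_log_area(v) for v in log_areas]
print(log_areas)
```

If the models are trained on the transformed target, predictions should be mapped back with the inverse before computing errors in hectares.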
If you want to know what features 5-8 (FFMC, DMC, DC, ISI) are, you can read this website:
http://cwfis.cfs.nrcan.gc.ca/background/summary/fwi