2. Environment
- OS: Windows, Mac OS, or Linux
- Languages: C, C++, C#, Java, or Python (any version is ok)
- Goal: Build a decision tree, and then classify the test set using it
3. Requirements
The program must meet the following requirements:
- Execution file name: exe
- Execute the program with three arguments: training file name, test file name, output file name
- Example:
- Training file name=dt_train.txt, test file name=dt_test.txt, output file name=dt_result.txt
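The three-argument invocation described above could be handled as in the following minimal Python sketch. The function name `parse_args` and the script name `dt.py` are hypothetical, chosen only for illustration; the assignment itself does not prescribe them.

```python
def parse_args(argv):
    """Return (training file, test file, output file) from a raw argument list.

    argv is expected to look like sys.argv: program name followed by
    exactly three file names.
    """
    if len(argv) != 4:
        program = argv[0] if argv else "program"
        raise SystemExit(
            f"usage: {program} <training file> <test file> <output file>"
        )
    return argv[1], argv[2], argv[3]


# Example with the file names from the assignment ("dt.py" is a made-up name).
train_path, test_path, output_path = parse_args(
    ["dt.py", "dt_train.txt", "dt_test.txt", "dt_result.txt"]
)
```

In a real program the list passed in would be `sys.argv`, so a wrong number of arguments stops the run with a usage message instead of a traceback.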
- Dataset
- We provide you with 2 datasets
- Buy_computer: dt_train.txt, dt_test.txt
- Car_evaluation: dt_train1.txt, dt_test1.txt
- Your program must be able to handle both datasets
- We will evaluate your program with other datasets whose attributes are like those of the car_evaluation dataset
- File format for a training set
[attribute_name_1]\t[attribute_name_2]\t...\t[attribute_name_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
- [attribute_name_1] ~ [attribute_name_n]: n attribute names
- [attribute_1] ~ [attribute_n-1]
- n-1 attribute values of the corresponding tuple
- All the attributes are categorical (not continuous-valued)
- [attribute_n]: a class label that the corresponding tuple belongs to
- Example 1 (dt_train.txt):
Figure 1. An example of the first training set.
- Example 2 (dt_train1.txt):
Figure 2. An example of the second training set.
- Title: car evaluation database
- Attribute values
- Buying: vhigh, high, med, low
- Maint: vhigh, high, med, low
- Doors: 2, 3, 4, 5more
- Persons: 2, 4, more
- Lug_boot: small, med, big
- Safety: low, med, high
- Class labels: unacc, acc, good, vgood
- Number of instances: training set 1,382; test set 346
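The training-set format above (a header line of attribute names, then one tab-separated tuple per line, with the class label in the last column) could be read as in this sketch. The function names and the three sample lines are invented for illustration and are not taken from the provided datasets.

```python
def parse_table(lines):
    """Split tab-separated lines into (header, rows), skipping blank lines."""
    rows = [line.rstrip("\r\n").split("\t") for line in lines if line.strip()]
    return rows[0], rows[1:]


def split_features_and_labels(header, rows):
    """Separate the last column (class label) from the other attributes."""
    attribute_names = header[:-1]
    class_name = header[-1]
    features = [row[:-1] for row in rows]
    labels = [row[-1] for row in rows]
    return attribute_names, class_name, features, labels


# Made-up sample in the documented layout (not one of the real datasets).
sample_lines = [
    "outlook\twindy\tplay\n",
    "sunny\tyes\tno\n",
    "rainy\tno\tyes\n",
]
header, rows = parse_table(sample_lines)
names, class_name, features, labels = split_features_and_labels(header, rows)
# names is ["outlook", "windy"], class_name is "play",
# features is [["sunny", "yes"], ["rainy", "no"]], labels is ["no", "yes"]
```

Passing an open file object instead of `sample_lines` works the same way, since iterating a text file yields its lines.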
- Attribute selection measure: information gain, gain ratio, or Gini index
- File format for a test set
[attribute_name_1]\t[attribute_name_2]\t...\t[attribute_name_n-1]
[attribute_1]\t[attribute_2]\t...\t[attribute_n-1]
[attribute_1]\t[attribute_2]\t...\t[attribute_n-1]
[attribute_1]\t[attribute_2]\t...\t[attribute_n-1]
- The test set does not have [attribute_name_n] (class label)
- Example 1 (dt_test.txt):
Figure 3. An example of the first test set.
- Example 2 (dt_test1.txt):
Figure 4. An example of the second test set.
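Of the attribute selection measures this section allows, information gain is perhaps the simplest to sketch: it is the entropy of the class labels minus the weighted entropy remaining after partitioning the tuples on one categorical attribute. The code below is only an illustration under that standard definition; the function names and the four toy tuples are assumptions, not part of the assignment.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(
        (count / total) * log2(total / count)
        for count in Counter(labels).values()
    )


def information_gain(features, labels, index):
    """Entropy reduction obtained by splitting on the attribute at `index`."""
    partitions = {}
    for row, label in zip(features, labels):
        partitions.setdefault(row[index], []).append(label)
    total = len(labels)
    remainder = sum(
        (len(part) / total) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


# Toy data (invented): attribute 0 determines the class, attribute 1 does not.
toy_features = [["sunny", "yes"], ["sunny", "no"], ["rainy", "yes"], ["rainy", "no"]]
toy_labels = ["no", "no", "yes", "yes"]
print(information_gain(toy_features, toy_labels, 0))  # prints 1.0
print(information_gain(toy_features, toy_labels, 1))  # prints 0.0
```

A tree builder would typically evaluate every remaining attribute this way at each node and split on the one with the largest gain; gain ratio and the Gini index can be swapped in by replacing `information_gain`.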
- Output file format
[attribute_name_1]\t[attribute_name_2]\t...\t[attribute_name_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
- Output file name: txt (for 1st dataset), dt_result1.txt (for 2nd dataset)
- You must print the following values:
- [attribute_1] ~ [attribute_n-1]: given attribute values in the test set
- [attribute_n]: a class label predicted by your model for the corresponding tuple
- Please DO NOT CHANGE the order of the tuples in each test set.
- You should print your outputs to match the order of correct answers.
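The output format above amounts to the test file with one extra column appended, in the original tuple order. A minimal sketch follows, assuming the class column name is taken from the training file header; `format_output`, `write_output`, and the sample values are hypothetical.

```python
def format_output(test_header, test_rows, predicted_labels, class_name):
    """Build the output text: test rows in their original order, each
    followed by the predicted class label as a final tab-separated column."""
    if len(test_rows) != len(predicted_labels):
        raise ValueError("need exactly one predicted label per test tuple")
    lines = ["\t".join(list(test_header) + [class_name])]
    for row, label in zip(test_rows, predicted_labels):
        lines.append("\t".join(list(row) + [label]))
    return "\n".join(lines) + "\n"


def write_output(path, text):
    """Write the formatted result to the output file given on the command line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


# Invented example: two test tuples and two predictions.
result_text = format_output(
    ["outlook", "windy"],
    [["sunny", "yes"], ["rainy", "no"]],
    ["no", "yes"],
    "play",
)
# result_text == "outlook\twindy\tplay\nsunny\tyes\tno\nrainy\tno\tyes\n"
```

Because the rows are zipped with the predictions rather than sorted or grouped, the tuple order of the test set is preserved.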
4. Note
- This is a competition project
- The higher the accuracy of your model, the higher your score
- We will first give a minimum score of 70 if (1) you submit your program before the deadline, (2) your program runs correctly without any errors, and (3) all requirements for this project are satisfied.
- Then, we will assign an additional score from 0 to 30 based on your rank.
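The note does not say how accuracy is computed. A common reading, assumed here, is the fraction of test tuples whose predicted label equals the correct label at the same position. The helper below is a hypothetical self-check under that assumption, not the grading script.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true label."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Invented labels drawn from the car evaluation classes; 3 of 4 match.
print(accuracy(["unacc", "acc", "good", "unacc"],
               ["unacc", "acc", "vgood", "unacc"]))  # prints 0.75
```

Holding out part of the training set and scoring predictions on it this way gives a rough estimate before submission, which is also why the output order has to match the test set.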
