- Environment
  - OS: Windows, Mac OS, or Linux
  - Languages: C++, Java, or Python (any version is OK)
- Goal: Build a decision tree, then use it to classify the test set
- The program must meet the following requirements:
  - Executable file name: dt.exe
  - Execute the program with three arguments: training file name, test file name, output file name
    - Example: training file name = dt_train.txt, test file name = dt_test.txt, output file name = dt_result.txt
  - If you use Python, you may submit a dt.py file instead of dt.exe.
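For a Python submission, the three-argument interface could be handled roughly as follows. This is only a sketch: the function name `parse_args` and the usage message are illustrative assumptions, not part of the assignment.

```python
def parse_args(argv):
    """Return (training file, test file, output file) from a command line.

    argv is expected to look like sys.argv: the script name followed by
    exactly three file names, in the order the assignment lists them.
    """
    if len(argv) != 4:
        # Hypothetical usage message; the assignment does not prescribe one.
        raise SystemExit(
            f"usage: {argv[0]} <training file> <test file> <output file>"
        )
    return argv[1], argv[2], argv[3]


# Example call with the file names used in the assignment text.
train_path, test_path, output_path = parse_args(
    ["dt.py", "dt_train.txt", "dt_test.txt", "dt_result.txt"]
)
print(train_path, test_path, output_path)
```

In a real dt.py, the call would typically be `parse_args(sys.argv)` under the usual `if __name__ == "__main__":` guard.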
- Dataset
  - We provide two datasets:
    - Buy_computer: dt_train.txt, dt_test.txt
    - Car_evaluation: dt_train1.txt, dt_test1.txt
  - Your program must be able to handle any dataset in this format.
  - We will evaluate your program with other datasets.
- File format for a training set
[attribute_name_1]\t[attribute_name_2]\t...\t[attribute_name_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
  - [attribute_name_1] ~ [attribute_name_n]: n attribute names
  - [attribute_1] ~ [attribute_n-1]: n-1 attribute values of the corresponding tuple
    - All the attributes are categorical (not continuous-valued)
  - [attribute_n]: a class label that the corresponding tuple belongs to
  - Example 1 (dt_train.txt):
Figure 1. An example of the first training set.
  - Example 2 (dt_train1.txt):
Figure 2. An example of the second training set.
    - Title: car evaluation database
    - Attribute values
      - Buying: vhigh, high, med, low
      - Maint: vhigh, high, med, low
      - Doors: 2, 3, 4, 5more
      - Persons: 2, 4, more
      - Lug_boot: small, med, big
      - Safety: low, med, high
    - Class labels: unacc, acc, good, vgood
    - Number of instances: training set 1,382; test set 346
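Reading a file in the training-set format amounts to splitting each line on tab characters and treating the first line as the header. A minimal sketch, assuming UTF-8 text; the function names `read_table` and `load_table` are hypothetical, and the sample data is made up (it is not the contents of dt_train.txt):

```python
import io


def read_table(lines):
    """Split tab-separated lines into (header, rows); blank lines are skipped."""
    table = [line.rstrip("\r\n").split("\t") for line in lines if line.strip()]
    return table[0], table[1:]


def load_table(path):
    """Read a training or test file from disk (UTF-8 is an assumption)."""
    with open(path, encoding="utf-8") as f:
        return read_table(f)


# Made-up sample in the training-set layout, for illustration only.
sample = io.StringIO("outlook\twindy\tplay\nsunny\tyes\tno\nrainy\tno\tyes\n")
header, rows = read_table(sample)

# In a training set the last column is the class label.
attribute_names, class_name = header[:-1], header[-1]
print(attribute_names, class_name)
print(rows)
```

Because the header is read rather than hard-coded, the same loader works for datasets with any number of attributes, which matters since the program is evaluated on other datasets.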
- Attribute selection measure: information gain, gain ratio, or gini index
- File format for a test set
[attribute_name_1]\t[attribute_name_2]\t...\t[attribute_name_n-1]
[attribute_1]\t[attribute_2]\t...\t[attribute_n-1]
[attribute_1]\t[attribute_2]\t...\t[attribute_n-1]
[attribute_1]\t[attribute_2]\t...\t[attribute_n-1]
  - The test set does not have [attribute_name_n] (class label)
  - Example 1 (dt_test.txt):
Figure 3. An example of the first test set.
  - Example 2 (dt_test1.txt):
Figure 4. An example of the second test set.
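With categorical training tuples whose last column is the class label, and test tuples without it, the core of the task can be sketched as below. This is one possible ID3-style approach using information gain (one of the permitted measures), not a reference solution. The names `entropy`, `information_gain`, `build_tree`, and `classify`, the dictionary tree layout, and the majority-class fallback for unseen attribute values are all assumptions made for illustration, and the rows are made-up data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attr_index):
    """Entropy of the class column minus the weighted entropy after the split."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr_index], []).append(row[-1])
    remainder = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    return entropy([row[-1] for row in rows]) - remainder


def build_tree(rows, attr_indices):
    """Return a class label (leaf) or a dict describing an internal node."""
    labels = [row[-1] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attr_indices:
        return majority
    best = max(attr_indices, key=lambda i: information_gain(rows, i))
    remaining = [i for i in attr_indices if i != best]
    children = {}
    for value in {row[best] for row in rows}:
        subset = [row for row in rows if row[best] == value]
        children[value] = build_tree(subset, remaining)
    # "default" is used when a test tuple has a value not seen at this node.
    return {"attr": best, "children": children, "default": majority}


def classify(tree, row):
    """Follow the branches matching the row's values until a leaf is reached."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree


# Made-up training rows: two attributes followed by the class label.
train_rows = [
    ["sunny", "yes", "no"],
    ["sunny", "no", "no"],
    ["rainy", "yes", "yes"],
    ["rainy", "no", "yes"],
]
tree = build_tree(train_rows, [0, 1])
print(classify(tree, ["rainy", "no"]))
```

Gain ratio or the gini index could be swapped in by replacing `information_gain` in the `max(...)` call; the rest of the sketch would stay the same.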
- Output file format
[attribute_name_1]\t[attribute_name_2]\t...\t[attribute_name_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
[attribute_1]\t[attribute_2]\t...\t[attribute_n]
  - Output file name: dt_result.txt (for the 1st dataset), dt_result1.txt (for the 2nd dataset)
  - You must print the following values:
    - [attribute_1] ~ [attribute_n-1]: the attribute values given in the test set
    - [attribute_n]: the class label predicted by your model for the corresponding tuple
  - Please DO NOT CHANGE the order of the tuples in each test set.
    - Your output must be in the same order as the correct answers.
  - Be sure to use \t to separate your attributes.
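The output rules above (header with the class attribute name appended, one tab-separated line per test tuple, original order preserved) could be implemented along these lines. The function name `write_predictions` and its parameters are hypothetical, and the rows and predictions are made up; the class attribute name is assumed to come from the last column of the training-set header.

```python
import io


def write_predictions(out, test_header, test_rows, predictions, class_name):
    """Write the header plus one line per test tuple, in the given order."""
    out.write("\t".join(test_header + [class_name]) + "\n")
    # zip keeps the test tuples in their original order.
    for row, label in zip(test_rows, predictions):
        out.write("\t".join(row + [label]) + "\n")


# Made-up test rows and predictions, written to an in-memory buffer.
buffer = io.StringIO()
write_predictions(
    buffer,
    ["outlook", "windy"],
    [["sunny", "yes"], ["rainy", "no"]],
    ["no", "yes"],
    "play",
)
print(buffer.getvalue(), end="")
```

For a real run, `out` would be the output file opened for writing instead of an in-memory buffer.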
- Submission
  - Please submit the program files and the report to GitLab
  - Report
    - File format must be *.pdf.
    - Guideline
      - Summary of your algorithm
      - Detailed description of your code (for each function)
      - Instructions for compiling your source code on the TA's computer (e.g., screenshots) (Important!!)
      - Any other specification of your implementation and testing
  - Program and code
    - An executable file
      - In either of the following two cases, please submit alternative files (e.g., .py file, .jar file, makefile):
        - You cannot meet the .exe requirement of the programming assignment because of your computing environment (e.g., Mac OS or Linux)
        - You are using Python to implement your program
      - You MUST SUBMIT instructions for compiling your source code. If the TAs follow your instructions but cannot compile your program, you will receive a penalty. Please write the instructions carefully.
    - All source files
- Testing program
  - Put the following files in the same directory: the testing program, your output files (dt_result.txt, dt_result1.txt), and the attached answer files (dt_answer.txt, dt_answer1.txt)
  - Execute the testing program with two arguments (answer file name and your output file name)
  - Check your score for the input file
    - Score: the number of your correct predictions / the number of correct answers
  - The testing program was built with Mono, so even if you are using Mac or Linux instead of Windows, you can run dt_test.exe using Mono.
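The score reported by the testing program is the number of correct predictions divided by the number of correct answers. For a quick self-check, the same ratio can be approximated as below. This is not the provided dt_test.exe; it is a sketch that assumes the answer file has the same layout as the output file (a header line, then one tab-separated tuple per line with the class label last). The function names and sample lines are made up.

```python
def last_column(lines):
    """Return the class label (last field) of every non-header line."""
    table = [line.rstrip("\r\n").split("\t") for line in lines if line.strip()]
    return [row[-1] for row in table[1:]]


def score(answer_lines, output_lines):
    """Return (correct predictions, number of correct answers)."""
    answers = last_column(answer_lines)
    predicted = last_column(output_lines)
    # Tuples are compared position by position, so order must be preserved.
    correct = sum(a == p for a, p in zip(answers, predicted))
    return correct, len(answers)


# Made-up answer and output contents.
answer = ["outlook\tplay\n", "sunny\tno\n", "rainy\tyes\n", "sunny\tno\n"]
output = ["outlook\tplay\n", "sunny\tno\n", "rainy\tno\n", "sunny\tno\n"]
correct, total = score(answer, output)
print(f"{correct} / {total}")
```

Because the comparison is positional, an output file whose tuples are reordered would score poorly even if every prediction were right, which is why the tuple order must not change.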
