Problem 1. Using information gain as the criterion for deciding how to split each node, learn the decision tree to depth 2 (i.e., the root should have grandchildren) using the data provided in Table 1. Show your work for the first splitting decision (i.e., how you decide which feature to use for the first split).
Table 1: Previous tennis playing decisions based on outlook, temperature, humidity, and wind
| Day | Outlook | Temperature | Humidity | Wind | Played Tennis? |
|-----|----------|-------------|----------|--------|----------------|
| 01 | Sunny | Hot | High | Weak | No |
| 02 | Sunny | Hot | High | Strong | No |
| 03 | Overcast | Hot | High | Weak | Yes |
| 04 | Rain | Mild | High | Weak | Yes |
| 05 | Rain | Cool | Normal | Weak | Yes |
| 06 | Rain | Cool | Normal | Strong | No |
| 07 | Overcast | Cool | High | Strong | Yes |
| 08 | Sunny | Cool | High | Weak | No |
| 09 | Sunny | Mild | Normal | Weak | Yes |
| 10 | Rain | Hot | Normal | Weak | Yes |
| 11 | Sunny | Mild | Normal | Strong | Yes |
| 12 | Overcast | Mild | High | Strong | Yes |
| 13 | Overcast | Hot | Normal | Weak | Yes |
| 14 | Rain | Mild | High | Strong | No |
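If you want to check your hand computation, here is a minimal pure-Python sketch (standard library only) that scores each feature of Table 1 by information gain; the function names are just illustrative.

```python
from collections import Counter
from math import log2

# Table 1 as (Outlook, Temperature, Humidity, Wind, label) tuples.
data = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "High", "Strong", "Yes"),
    ("Sunny", "Cool", "High", "Weak", "No"),
    ("Sunny", "Mild", "Normal", "Weak", "Yes"),
    ("Rain", "Hot", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]
FEATURES = ["Outlook", "Temperature", "Humidity", "Wind"]

def entropy(rows):
    """Shannon entropy of the class label over a set of rows."""
    counts = Counter(r[-1] for r in rows)
    total = len(rows)
    return -sum(c / total * log2(c / total) for c in counts.values())

def info_gain(rows, i):
    """Information gain from splitting `rows` on feature index `i`."""
    total = len(rows)
    values = Counter(r[i] for r in rows)
    conditional = sum(n / total * entropy([r for r in rows if r[i] == v])
                      for v, n in values.items())
    return entropy(rows) - conditional

for i, name in enumerate(FEATURES):
    print(f"IG({name}) = {info_gain(data, i):.4f}")
```

The feature with the largest printed gain is the one to use at the root; reapply the same scoring to each child's subset for the depth-2 splits.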
Problem 2 [10 points]. Using the method of Lagrange multipliers, find all potential extremal points of f(x, y) = xy subject to the constraint x^2 + 2y^2 = 6. Which ones yield a maximum? What is the maximum value?
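For reference, the standard setup is the Lagrangian and its first-order conditions; solving the resulting system is the exercise itself.

```latex
\mathcal{L}(x, y, \lambda) = xy - \lambda \left( x^2 + 2y^2 - 6 \right),
\qquad
\frac{\partial \mathcal{L}}{\partial x} = y - 2\lambda x = 0,
\quad
\frac{\partial \mathcal{L}}{\partial y} = x - 4\lambda y = 0,
\quad
x^2 + 2y^2 = 6.
```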
Problem 3 [20 points]. Consider building a linear classifier and then an SVM for the following two-class training data:
Positive class: [-1 3]^T, [0 2]^T, [0 1]^T, [0 0]^T
Negative class: [1 5]^T, [1 6]^T, [3 3]^T
a) [10 points]. Compute a linear classifier using a = 0.8 and using the samples in the following order: [-1, 3], [1, 6], [0, 1], [1, 5], [0, 2], [0, 0], [3, 3]. Show your work for the first seven iterations and also provide the final w = (w_0, w_1, w_2).
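A sketch to sanity-check part a), assuming a = 0.8 is the learning rate of a fixed-increment perceptron on augmented samples (x mapped to (1, x1, x2), negative-class samples negated) starting from w = 0; if your class uses a different update rule or initialization, adjust accordingly.

```python
import numpy as np

a = 0.8  # assumed learning rate
pos = [(-1, 3), (0, 2), (0, 1), (0, 0)]
order = [(-1, 3), (1, 6), (0, 1), (1, 5), (0, 2), (0, 0), (3, 3)]

def augmented(p):
    """Augment with a bias component; negate negative-class samples."""
    x = np.array([1.0, p[0], p[1]])
    return x if p in pos else -x

w = np.zeros(3)                   # assumed start: (w0, w1, w2) = (0, 0, 0)
for epoch in range(100):          # cycle through the samples until a clean pass
    updated = False
    for p in order:
        y = augmented(p)
        if w @ y <= 0:            # misclassified (or on the boundary)
            w = w + a * y         # fixed-increment update
            updated = True
        print(epoch, p, w)        # the first seven lines are the first seven iterations
    if not updated:
        break
print("final w =", w)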
- [3 points]. Plot the training points and, by inspection, draw a linear classifier that separates the data with maximum margin.
- [3 points]. This linear SVM is parameterized by h(x) = w^T x + b. Estimate the parameters w and b for the classifier that you have drawn.
- [4 points]. Suppose you observe an additional set of points, all from the positive class: [2 0]^T, [2 1]^T, [2 3]^T, [1 0]^T, [-1 1]^T, [0 0]^T.
What is the linear SVM (in terms of w and b) now?
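To verify your hand-drawn SVM, here is a quick check using scikit-learn (an assumption: any SVM solver works), approximating the hard margin with a large C.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[-1, 3], [0, 2], [0, 1], [0, 0],   # positive class
              [1, 5], [1, 6], [3, 3]])           # negative class
y = np.array([1, 1, 1, 1, -1, -1, -1])

# A hard-margin linear SVM is approximated by a very large C.
clf = SVC(kernel="linear", C=1e6).fit(X, y)
print("w =", clf.coef_[0], "b =", clf.intercept_[0])
print("support vectors:", clf.support_vectors_)
```

For the last part, append the six additional positive points to X (with label +1) and refit; comparing the two solutions shows how the new points move the support vectors and the margin.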
Problem 4. What is the VC-dimension of (a) two half-planes (i.e., two linear separators)? (b) the inside of a triangle? Explain.
Problem 5. Carry out policy iteration over the MDP example covered in class, with R given in Table 2 and γ = 0.9. For a state s, if R(s) = +1, s is a terminal state. For the transition model, assume that the agent has 0.9 probability of moving in the intended direction and 0.1 probability of moving to the left of the intended direction. For example, if the agent is at the lower-left corner (coordinates (1, 1)) and intends to go right, then it will reach (2, 1) with 0.9 probability and (1, 2) with 0.1 probability. If a target cell is not reachable, then the corresponding probability mass stays with the current cell. For example, if the agent is at (3, 3) and is trying to go up, then with 0.1 probability it goes to (2, 3) and with 0.9 probability it is stuck at (3, 3). For your answer you should provide:
- [15 points]. The first two iterations of your computation.
- [15 points]. The converged values and the extracted policy. For this problem, you need to provide the last two iterations, showing that the value changes are within 0.001 for all cells.
Table 2: Reward R for a 4 × 3 grid world (OBS denotes an obstacle cell)
| -0.05 | -0.05 | -0.05 | +1 |
| -0.05 | OBS | -0.05 | -1 |
| -0.05 | -0.05 | -0.05 | -0.05 |
As a suggestion, complete the first part manually to make sure you are able to do so, for obvious reasons :). For the second part, it is probably better to use a program, e.g., Python or Excel.
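In that spirit, a minimal policy-iteration sketch in Python, under the stated transition model with γ = 0.9. It assumes coordinates (x, y) with (1, 1) at the lower left, that both the +1 and -1 cells are absorbing, and that a terminal state's value equals its reward; the names and helpers are illustrative, not a prescribed implementation.

```python
GAMMA = 0.9
# Rewards from Table 2; the obstacle (2, 2) is simply absent from the map.
R = {(1, 1): -0.05, (2, 1): -0.05, (3, 1): -0.05, (4, 1): -0.05,
     (1, 2): -0.05,                (3, 2): -0.05, (4, 2): -1.0,
     (1, 3): -0.05, (2, 3): -0.05, (3, 3): -0.05, (4, 3): +1.0}
TERMINAL = {(4, 3), (4, 2)}  # assumption: both the +1 and -1 cells absorb
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
LEFT_OF = {"up": "left", "left": "down", "down": "right", "right": "up"}

def step(s, a):
    """Transition distribution as [(prob, next_state), ...]."""
    out = []
    for prob, d in [(0.9, a), (0.1, LEFT_OF[a])]:
        dx, dy = ACTIONS[d]
        t = (s[0] + dx, s[1] + dy)
        out.append((prob, t if t in R else s))  # blocked => stay put
    return out

def evaluate(policy, V, tol=1e-6):
    """Iterative policy evaluation until values stop changing."""
    while True:
        delta = 0.0
        for s in R:
            if s in TERMINAL:
                V[s] = R[s]
                continue
            v = R[s] + GAMMA * sum(p * V[t] for p, t in step(s, policy[s]))
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V

policy = {s: "up" for s in R if s not in TERMINAL}  # arbitrary initial policy
V = {s: 0.0 for s in R}
while True:
    V = evaluate(policy, V)
    stable = True
    for s in policy:  # greedy policy improvement
        best = max(ACTIONS, key=lambda a: sum(p * V[t] for p, t in step(s, a)))
        if best != policy[s]:
            policy[s], stable = best, False
    if stable:
        break
print(V)
print(policy)
```

To report the required iterations, print V after each evaluation/improvement round and loosen `tol` to match the 0.001 criterion.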