Problem 1: Consider a classification problem with a binary class label Y and a single continuous feature X that takes values in (-4, -2) ∪ (2, 4). Suppose (X, Y) is generated by choosing Y at random with P(Y = 1) = P(Y = 2) = 1/2, and then drawing X conditional on Y according to uniform distributions. Specifically, assume that the class-conditional densities for X are
p(x | Y = 1) = 1/2 for x in (-4, -2), and p(x | Y = 2) = 1/2 for x in (2, 4) (each density is 0 elsewhere).
In what follows we use 0-1 loss; that is, the risk of a classifier is its probability of error.
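For concreteness, here is a minimal sketch of the data-generating process, assuming the class-conditional densities stated above; the helper name sample and the use of NumPy are illustrative choices, not part of the problem.

```python
# Sketch of the generative model: Y uniform on {1, 2}, then
# X | Y = 1 ~ Uniform(-4, -2) and X | Y = 2 ~ Uniform(2, 4).
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Draw n i.i.d. pairs (X, Y) from the model above."""
    y = rng.integers(1, 3, size=n)  # P(Y = 1) = P(Y = 2) = 1/2
    x = np.where(y == 1,
                 rng.uniform(-4, -2, size=n),  # kept where y == 1
                 rng.uniform(2, 4, size=n))    # kept where y == 2
    return x, y
```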
- What is the marginal distribution of X? What is the conditional distribution of Y given X?
- What is the Bayes rule f_B(x), and what is its risk P(Y ≠ f_B(X))? Explain!
- Let f_1(x; S) be the 1-nearest-neighbor classifier based on a training sample S = {(x_1, y_1), ..., (x_n, y_n)} of i.i.d. observations of (X, Y). What is the risk P(Y ≠ f_1(X; S))? Explain. (Here the risk is computed by integrating over both the training data and a new independent pair (X, Y).)
- Under the same scenario, calculate the risk of the 3-nearest-neighbor classifier.
- Which method, 1-nearest neighbor or 3-nearest neighbor, has the smaller risk in this problem? (A simulation sketch for checking your answers appears after this list.)
- ISLR Section 8.4 Problem 3
- ISLR Section 8.4 Problem 9 (a)–(g)
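As referenced in the nearest-neighbor questions above, the following is a hedged sanity-check sketch, not a solution: it forms Monte Carlo estimates of the 1-NN and 3-NN risks under the model above, averaging over both the training sample S and a new independent pair (X, Y). The sample size n and the trial count are arbitrary illustrative choices.

```python
# Monte Carlo check of the k-NN risks; assumes the class-conditional
# densities stated above. Compare the printed estimates with your
# analytic answers.
import numpy as np

rng = np.random.default_rng(1)

def sample(m):
    """Draw m i.i.d. pairs (X, Y): Y uniform on {1, 2}, then X | Y uniform."""
    y = rng.integers(1, 3, size=m)
    x = np.where(y == 1, rng.uniform(-4, -2, size=m), rng.uniform(2, 4, size=m))
    return x, y

def knn_risk(k, n, trials=50_000):
    """Estimate P(Y != f_k(X; S)), integrating over S and a new pair (X, Y)."""
    errors = 0
    for _ in range(trials):
        x_train, y_train = sample(n)                   # training sample S
        x_new, y_new = sample(1)                       # independent test pair
        nearest = np.argsort(np.abs(x_train - x_new[0]))[:k]
        votes = y_train[nearest]
        pred = 1 if 2 * np.sum(votes == 1) > k else 2  # majority vote
        errors += int(pred != y_new[0])
    return errors / trials

n = 5  # a small training sample keeps the k-NN error visibly nonzero
print("1-NN risk estimate:", knn_risk(1, n))
print("3-NN risk estimate:", knn_risk(3, n))
```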