Exercise 1: Bayes' rule. Suppose that 5% of competitive athletes use performance-enhancing drugs and that a particular drug test has a 2% false-positive rate and a 1.5% false-negative rate.
- (3 points) Athlete A tests positive for drug use. What is the probability that Athlete A is using drugs?
- (3 points) Athlete B tests negative for drug use. What is the probability that Athlete B is not using drugs?
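As a sanity check, here is a short numerical sketch of the Bayes'-rule computation using the rates given above (variable names are ours):

```python
# Bayes' rule sanity check for Exercise 1.
p_user = 0.05   # prior: P(user)
p_fp = 0.02     # false-positive rate: P(test+ | not user)
p_fn = 0.015    # false-negative rate: P(test- | user)

p_pos = (1 - p_fn) * p_user + p_fp * (1 - p_user)            # P(test+)
p_user_given_pos = (1 - p_fn) * p_user / p_pos               # Athlete A
p_clean_given_neg = (1 - p_fp) * (1 - p_user) / (1 - p_pos)  # Athlete B

print(f"P(user | +)  = {p_user_given_pos:.4f}")   # ~0.7216
print(f"P(clean | -) = {p_clean_given_neg:.4f}")  # ~0.9992
```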
Exercise 2: Bayesian decision theory: losses and risks. Consider a classification problem with $K$ classes, using a loss $\lambda_{ik} \ge 0$ if we choose class $i$ when the input actually belongs to class $k$, for $i, k \in \{1, \dots, K\}$.
- (2 points) Write the expression for the expected risk $R_i(\mathbf{x})$ of choosing class $i$ for a pattern $\mathbf{x}$, and the rule for choosing the class for $\mathbf{x}$.
Consider a two-class problem with losses given by the matrix $\Lambda = \begin{pmatrix} 0 & \lambda_{12} \\ \lambda_{21} & 0 \end{pmatrix}$, where $\lambda_{ik}$ is the loss of choosing class $i$ when the true class is $k$ (so correct decisions incur no loss).
- (3 points) Give the optimal decision rule in the form $p(C_1|\mathbf{x}) > \theta$, with the threshold $\theta$ a function of $\lambda_{12}$ and $\lambda_{21}$.
- (3 points) Now suppose both misclassification errors are equally costly. For what values of $p(C_1|\mathbf{x})$ is class 1 chosen?
- (3 points) Imagine we want to be very conservative when choosing class 2 and we seek a rule of the form $p(C_2|\mathbf{x}) > 0.99$ (i.e., choose class 2 only when its posterior probability exceeds 99%). What should $\lambda_{21}$ be, in terms of $\lambda_{12}$?
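For reference, a worked sketch of the risk comparison implied by this loss matrix (standard Bayesian decision theory):

```latex
\begin{align*}
  R_1(\mathbf{x}) &= \lambda_{12}\, p(C_2|\mathbf{x}), \qquad
  R_2(\mathbf{x}) = \lambda_{21}\, p(C_1|\mathbf{x}), \\
  \text{choose } C_1 &\iff R_1(\mathbf{x}) < R_2(\mathbf{x})
  \iff p(C_1|\mathbf{x}) > \frac{\lambda_{12}}{\lambda_{12}+\lambda_{21}} \,.
\end{align*}
```

Equal losses $\lambda_{12} = \lambda_{21}$ give a threshold of $1/2$, and a threshold of $0.99$ on $p(C_2|\mathbf{x})$ corresponds to $\lambda_{21} = 99\,\lambda_{12}$.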
Exercise 3: association rules. Given the following data of transactions at a supermarket, calculate the support and confidence values of the following association rules: meat → avocado, avocado → meat, yogurt → avocado, avocado → yogurt, meat → yogurt, yogurt → meat. What is the best rule to use in practice?
| transaction # | items in basket |
| --- | --- |
| 1 | meat, avocado |
| 2 | yogurt, avocado |
| 3 | meat |
| 4 | yogurt, meat |
| 5 | avocado, meat, yogurt |
| 6 | meat, avocado |
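A small script to compute these values (a sketch; the transaction data and rule list are taken directly from the exercise):

```python
# Support and confidence for the association rules in Exercise 3.
transactions = [
    {"meat", "avocado"},
    {"yogurt", "avocado"},
    {"meat"},
    {"yogurt", "meat"},
    {"avocado", "meat", "yogurt"},
    {"meat", "avocado"},
]
N = len(transactions)

def support(itemset):
    """Fraction of transactions containing every item in itemset."""
    return sum(itemset <= t for t in transactions) / N

def rule_stats(antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    both = support({antecedent, consequent})
    return both, both / support({antecedent})

for x, y in [("meat", "avocado"), ("avocado", "meat"),
             ("yogurt", "avocado"), ("avocado", "yogurt"),
             ("meat", "yogurt"), ("yogurt", "meat")]:
    s, c = rule_stats(x, y)
    print(f"{x} -> {y}: support {s:.2f}, confidence {c:.2f}")
```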
Exercise 4: true- and false-positive rates. Consider the following table, where $\mathbf{x}_n$ is a pattern, $y_n$ its ground-truth label (1 = positive class, 2 = negative class) and $p(C_1|\mathbf{x}_n)$ the posterior probability produced by some probabilistic classification algorithm:
| $n$ | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- |
| $y_n$ | 1 | 2 | 2 | 1 | 2 |
| $p(C_1 \mid \mathbf{x}_n)$ | 0.6 | 0.7 | 0.5 | 0.9 | 0.2 |
We use a classification rule of the form $p(C_1|\mathbf{x}) > \theta$, where $\theta \in [0,1]$ is a threshold.
- (8 points) Give, for all possible values of $\theta \in [0,1]$, the predicted labels and the corresponding confusion matrix and classification error.
- (2 points) Plot the corresponding pairs $(fp, tp)$ as an ROC curve.
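A quick threshold sweep over the table's values (a sketch; only the five posteriors are distinct cut points, so a handful of $\theta$ values covers every possible rule):

```python
import numpy as np

# Data from the Exercise 4 table (label 1 = positive, 2 = negative).
y = np.array([1, 2, 2, 1, 2])
p = np.array([0.6, 0.7, 0.5, 0.9, 0.2])

# One theta per interval between consecutive posterior values.
for theta in [0.0, 0.2, 0.5, 0.6, 0.7, 0.9]:
    pred = np.where(p > theta, 1, 2)
    tp = np.sum((pred == 1) & (y == 1)) / np.sum(y == 1)  # true-positive rate
    fp = np.sum((pred == 1) & (y == 2)) / np.sum(y == 2)  # false-positive rate
    err = np.mean(pred != y)                              # classification error
    print(f"theta={theta:.1f}: tp={tp:.2f}, fp={fp:.2f}, error={err:.2f}")
```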
Exercise 5: ROC curves. Imagine we have a classifier A with false-positive and true-positive rates $fp_A, tp_A \in [0,1]$ such that $fp_A > tp_A$ (that is, this classifier is below the diagonal in ROC space). Now consider a classifier B that negates the decision of A: whenever A predicts the positive class, B predicts the negative class, and vice versa. Compute the false-positive and true-positive rates $fp_B, tp_B$ of classifier B. Where is this point in ROC space?
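A short derivation sketch: B predicts positive exactly when A predicts negative, so, conditioning on the true class,

```latex
\begin{align*}
  tp_B &= p(\hat{y}_B = + \mid y = +) = 1 - tp_A, \\
  fp_B &= p(\hat{y}_B = + \mid y = -) = 1 - fp_A .
\end{align*}
```

The point $(fp_B, tp_B)$ is $(fp_A, tp_A)$ reflected through the center $(1/2, 1/2)$ of the ROC square; since $fp_A > tp_A$, we get $fp_B < tp_B$, so B lies above the diagonal.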
Exercise 6: least-squares regression (14 points). Consider the following model, with parameters $\Theta = \{\theta_1, \theta_2, \theta_3\} \subset \mathbb{R}$ and an input $x \in \mathbb{R}$:

$$h(x;\Theta) = \theta_1 + \theta_2 \sin 2x + \theta_3 \sin 4x \;\in \mathbb{R}.$$
- (2 points) Write the general expression of the least-squares error function of a model $h(x;\Theta)$ with parameters $\Theta$ given a sample $\{(x_n, y_n)\}_{n=1}^{N}$.
- (2 points) Apply it to the above model, simplifying it as much as possible.
- (6 points) Find the least-squares estimate for the parameters.
- (4 points) Assume the $x_n$ values are uniformly distributed in the interval $[0, 2\pi]$. Can you find a simpler, approximate way to find the least-squares estimate $\Theta$? Hint: approximate the sum over the sample by an integral.
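A numerical sketch contrasting the exact solution with the integral approximation, on synthetic data (the closed-form coefficients follow from the orthogonality of $1$, $\sin 2x$, $\sin 4x$ over $[0, 2\pi]$):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
x = rng.uniform(0, 2 * np.pi, N)
y = 1.5 + 0.8 * np.sin(2 * x) - 0.3 * np.sin(4 * x) + 0.1 * rng.standard_normal(N)

# Exact least squares via the design matrix [1, sin 2x, sin 4x].
X = np.column_stack([np.ones(N), np.sin(2 * x), np.sin(4 * x)])
theta_exact, *_ = np.linalg.lstsq(X, y, rcond=None)

# Integral approximation: for uniform x on [0, 2pi] the basis is orthogonal,
# so X.T @ X / N ~ diag(1, 1/2, 1/2) and the normal equations decouple.
theta_approx = np.array([y.mean(),
                         2 * np.mean(y * np.sin(2 * x)),
                         2 * np.mean(y * np.sin(4 * x))])

print(theta_exact)   # close to [1.5, 0.8, -0.3]
print(theta_approx)  # nearly identical for large N
```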
Exercise 7: maximum likelihood estimate. A discrete random variable $x \in \{0, 1, 2, \dots\}$ follows a Poisson distribution if it has the following probability mass function:

$$p(x;\lambda) = \frac{\lambda^x e^{-\lambda}}{x!},$$

where the parameter $\lambda > 0$.
- (2 points) Verify that $\sum_{x=0}^{\infty} p(x;\lambda) = 1$.
- (2 points) Write the general expression of the log-likelihood of a probability mass function $p(x;\Theta)$ with parameters $\Theta$ for an iid sample $x_1, \dots, x_N$.
- (5 points) Apply it to the above distribution, simplifying it as much as possible.
- (6 points) Find the maximum likelihood estimate for the parameter $\lambda$.
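For reference, a sketch of the standard derivation under the pmf above:

```latex
\begin{align*}
  L(\lambda) &= \sum_{n=1}^{N} \log p(x_n;\lambda)
             = \log\lambda \sum_{n=1}^{N} x_n \;-\; N\lambda \;-\; \sum_{n=1}^{N} \log x_n! \,, \\
  \frac{dL}{d\lambda} &= \frac{1}{\lambda}\sum_{n=1}^{N} x_n - N = 0
  \quad\Longrightarrow\quad \hat\lambda = \frac{1}{N}\sum_{n=1}^{N} x_n \,.
\end{align*}
```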
Exercise 8: multivariate Bernoulli distribution. Consider a multivariate Bernoulli distribution, where $\boldsymbol\theta \in [0,1]^D$ are the parameters and $\mathbf{x} \in \{0,1\}^D$ the binary random vector:

$$p(\mathbf{x};\boldsymbol\theta) = \prod_{d=1}^{D} \theta_d^{x_d} (1-\theta_d)^{1-x_d}.$$

- (5 points) Compute the maximum likelihood estimate for $\boldsymbol\theta$ given a sample $X = \{\mathbf{x}_1, \dots, \mathbf{x}_N\}$.
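As a sketch (the log-likelihood decouples across dimensions):

```latex
\begin{align*}
  L(\boldsymbol\theta) &= \sum_{n=1}^{N}\sum_{d=1}^{D}
    \bigl( x_{nd}\log\theta_d + (1-x_{nd})\log(1-\theta_d) \bigr), \\
  \frac{\partial L}{\partial\theta_d} = 0
  &\;\Longrightarrow\; \hat\theta_d = \frac{1}{N}\sum_{n=1}^{N} x_{nd}
  \quad (\text{the fraction of vectors with bit } d \text{ on}).
\end{align*}
```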
Let us do document classification using a $D$-word dictionary (element $d$ of $\mathbf{x}_n$ is 1 if word $d$ appears in document $n$ and 0 otherwise), using a multivariate Bernoulli model for each class. Assume we have $K$ document classes for which we have already obtained, by maximum likelihood, the optimal parameters $\boldsymbol\theta_k = (\theta_{k1}, \dots, \theta_{kD})^T$ and prior probabilities $p(C_k) = \pi_k$, for $k = 1, \dots, K$.
- (2 points) Write the discriminant function $g_k(\mathbf{x})$ for a probabilistic classifier in general (not necessarily Bernoulli), and the rule to make a decision.
- (5 points) Apply it to the multivariate Bernoulli case with $K$ classes. Show that $g_k(\mathbf{x})$ is linear in $\mathbf{x}$, i.e., it can be written as $g_k(\mathbf{x}) = \mathbf{w}_k^T \mathbf{x} + w_{k0}$, and give the expressions for $\mathbf{w}_k$ and $w_{k0}$.
- (3 points) Consider $K = 2$ classes. Show that the decision rule can be written as "if $\mathbf{w}^T \mathbf{x} + w_0 > 0$ then choose class 1", and give the expressions for $\mathbf{w}$ and $w_0$.
- (5 points) Compute the numerical values of $\mathbf{w}$ and $w_0$ for a two-word dictionary ($D = 2$) where $\pi_1 = 0.7$, $\boldsymbol\theta_1 = (\dots)$ and $\boldsymbol\theta_2 = (\dots)$. Plot in 2D all the possible values of $\mathbf{x} \in \{0,1\}^2$ and the boundary corresponding to this classifier.
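A sketch of the linear-discriminant computation; since the exercise's actual $\boldsymbol\theta_1$, $\boldsymbol\theta_2$ values are missing from the statement, the values below are hypothetical placeholders (only $\pi_1 = 0.7$ is given):

```python
import numpy as np

pi1, pi2 = 0.7, 0.3
theta1 = np.array([0.8, 0.3])   # hypothetical placeholder
theta2 = np.array([0.2, 0.6])   # hypothetical placeholder

# Per-class linear discriminant g_k(x) = w_k.x + w_k0, obtained by expanding
# log p(x|C_k) + log pi_k for the multivariate Bernoulli model.
def linear_params(theta, pi):
    w = np.log(theta / (1 - theta))               # w_kd = log(theta_kd/(1-theta_kd))
    w0 = np.sum(np.log(1 - theta)) + np.log(pi)   # bias term
    return w, w0

w1, w10 = linear_params(theta1, pi1)
w2, w20 = linear_params(theta2, pi2)
w, w0 = w1 - w2, w10 - w20   # two-class rule: w.x + w0 > 0  =>  choose class 1

for x in [np.array(b) for b in [(0, 0), (0, 1), (1, 0), (1, 1)]]:
    print(x, "-> class", 1 if w @ x + w0 > 0 else 2)
```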
Exercise 9: Gaussian classifiers. Consider a binary classification problem for $\mathbf{x} \in \mathbb{R}^D$ where we use Gaussian class-conditional probabilities $p(\mathbf{x}|C_1) = \mathcal{N}(\boldsymbol\mu, \sigma_1^2 I)$ and $p(\mathbf{x}|C_2) = \mathcal{N}(\boldsymbol\mu, \sigma_2^2 I)$ with $\sigma_1 \ne \sigma_2$. That is, they have the same mean and the covariance matrices are isotropic but different. Compute the expression for the class boundary. What shape is it?
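A derivation sketch, assuming equal priors (the statement does not give them): setting $\log p(\mathbf{x}|C_1) = \log p(\mathbf{x}|C_2)$ and cancelling the common $2\pi$ constant,

```latex
\begin{align*}
  -\frac{\|\mathbf{x}-\boldsymbol\mu\|^2}{2\sigma_1^2} - \frac{D}{2}\log\sigma_1^2
  &= -\frac{\|\mathbf{x}-\boldsymbol\mu\|^2}{2\sigma_2^2} - \frac{D}{2}\log\sigma_2^2 \\
  \Longrightarrow\quad \|\mathbf{x}-\boldsymbol\mu\|^2
  &= \frac{D\,\log(\sigma_1^2/\sigma_2^2)}{1/\sigma_2^2 - 1/\sigma_1^2}\,,
\end{align*}
```

a constant, so the boundary is a hypersphere centered at $\boldsymbol\mu$ (unequal priors would only shift the constant).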
