# Problem 1 [25%]

It is mentioned in Chapter 7 of ISL that a cubic regression spline with one knot at *ξ* can be obtained using a basis of the form *x*, *x*^{2}, *x*^{3}, [*x − ξ*]^{3}_{+}, where [*x − ξ*]^{3}_{+} = (*x − ξ*)^{3} if *x > ξ* and equals 0 otherwise. We will now show that a function of the form

*f*(*x*) = *β*_{0} + *β*_{1}*x* + *β*_{2}*x*^{2} + *β*_{3}*x*^{3} + *β*_{4}[*x − ξ*]^{3}_{+}

is indeed a cubic regression spline, regardless of the values of *β*_{0},*β*_{1},*β*_{2}, *β*_{3},*β*_{4}.

- Find a cubic polynomial

*f*_{1}(*x*) = *a*_{1} + *b*_{1}*x* + *c*_{1}*x*^{2} + *d*_{1}*x*^{3}

such that *f*(*x*) = *f*_{1}(*x*) for all *x ≤ ξ*. Express *a*_{1}, *b*_{1}, *c*_{1}, *d*_{1} in terms of *β*_{0}, *β*_{1}, *β*_{2}, *β*_{3}, *β*_{4}.

- Find a cubic polynomial

*f*_{2}(*x*) = *a*_{2} + *b*_{2}*x* + *c*_{2}*x*^{2} + *d*_{2}*x*^{3}

such that *f*(*x*) = *f*_{2}(*x*) for all *x > ξ*. Express *a*_{2}, *b*_{2}, *c*_{2}, *d*_{2} in terms of *β*_{0}, *β*_{1}, *β*_{2}, *β*_{3}, *β*_{4}. We have now established that *f*(*x*) is a piecewise polynomial.

- Show that *f*_{1}(*ξ*) = *f*_{2}(*ξ*). That is, *f*(*x*) is continuous at *ξ*.
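The steps above can be sanity-checked numerically. The following is a minimal Python sketch, not a replacement for the algebraic argument: it expands (*x − ξ*)^{3} = *x*^{3} − 3*ξx*^{2} + 3*ξ*^{2}*x* − *ξ*^{3} to obtain candidate coefficients for *f*_{2}, then compares both cubics with *f* on each side of the knot and at the knot itself. The function names and the sample values of *β* and *ξ* are arbitrary illustrations and do not come from the problem statement.

```python
def truncated_cubic(x, xi):
    """Truncated power basis function [x - xi]^3_+."""
    return (x - xi) ** 3 if x > xi else 0.0


def f(x, beta, xi):
    """f(x) = b0 + b1*x + b2*x^2 + b3*x^3 + b4*[x - xi]^3_+."""
    b0, b1, b2, b3, b4 = beta
    return b0 + b1 * x + b2 * x**2 + b3 * x**3 + b4 * truncated_cubic(x, xi)


def cubic(x, coeffs):
    """Evaluate a + b*x + c*x^2 + d*x^3."""
    a, b, c, d = coeffs
    return a + b * x + c * x**2 + d * x**3


def left_coefficients(beta):
    """Coefficients of f1: the truncated term vanishes for x <= xi."""
    return tuple(beta[:4])


def right_coefficients(beta, xi):
    """Coefficients of f2, obtained by expanding b4 * (x - xi)^3."""
    b0, b1, b2, b3, b4 = beta
    return (b0 - b4 * xi**3, b1 + 3 * b4 * xi**2, b2 - 3 * b4 * xi, b3 + b4)


# Arbitrary illustrative values (assumptions, not given in the problem).
beta = (1.0, -2.0, 0.5, 0.3, 4.0)
xi = 1.5

left = left_coefficients(beta)
right = right_coefficients(beta, xi)

print("f1(xi) =", cubic(xi, left))
print("f2(xi) =", cubic(xi, right))
print("f(1.0) - f1(1.0) =", f(1.0, beta, xi) - cubic(1.0, left))
print("f(2.0) - f2(2.0) =", f(2.0, beta, xi) - cubic(2.0, right))
```

A check at one set of values does not prove the claim for all *β*; it only helps catch algebra slips before writing up the general argument.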

# Problem 2 [25%]

Fit the linear, cubic, and natural regression splines investigated in Chapter 7 of ISL to the Auto data set. Is there evidence for non-linear relationships in this data set? Create some informative plots to justify your answer.
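As a starting point, here is a minimal pure-Python sketch of one way to look for non-linearity: fit a straight line and a linear regression spline by least squares and compare their residual sums of squares. It makes several assumptions that are not part of the assignment: the data are synthetic (they are not the Auto data), the single knot at 5.0 is chosen by hand, and all helper names are hypothetical. In practice a statistics library would be used for the fitting and the plots; only the linear spline is shown here.

```python
def hinge(x, knot):
    """Truncated linear basis function [x - knot]_+."""
    return max(x - knot, 0.0)


def design_row(x, knots):
    """One design-matrix row: intercept, x, and one hinge term per knot."""
    return [1.0, x] + [hinge(x, k) for k in knots]


def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def fit(xs, ys, knots):
    """Least-squares coefficients via the normal equations X'X coef = X'y."""
    rows = [design_row(x, knots) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


def rss(xs, ys, knots, coef):
    """Residual sum of squares of the fitted model."""
    total = 0.0
    for x, y in zip(xs, ys):
        fitted = sum(c * v for c, v in zip(coef, design_row(x, knots)))
        total += (y - fitted) ** 2
    return total


# Synthetic, noise-free data with a kink at x = 5 (NOT the Auto data).
xs = [float(i) for i in range(11)]
ys = [20.0 - 2.0 * x + 1.5 * hinge(x, 5.0) for x in xs]

linear_coef = fit(xs, ys, [])        # straight line, no knots
spline_coef = fit(xs, ys, [5.0])     # linear spline with one knot

rss_linear = rss(xs, ys, [], linear_coef)
rss_spline = rss(xs, ys, [5.0], spline_coef)
print("RSS, straight line:", rss_linear)
print("RSS, linear spline:", rss_spline)
```

A large drop in residual sum of squares when knots are added is one hint of a non-linear relationship; on real data it should be backed by plots of the fitted curves and by a comparison that accounts for the extra parameters.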

# Problem 3 [25%]

You will now derive the Bayesian connection to the lasso, as discussed in Section 6.2.2 of ISL.

- Suppose that *y*_{i} = *β*_{0} + ∑_{j=1}^{p} *x*_{ij}*β*_{j} + *ε*_{i}, where *ε*_{1},…,*ε*_{n} are independent and identically distributed from a normal distribution *N*(0,1). Write out the likelihood for the data as a function of the values *β*.

- Assume that the prior for *β*: *β*_{1},…,*β*_{p} is that they are independent and identically distributed according to a *Laplace* distribution with mean zero and variance *c*. Write out the posterior for *β* in this setting using Bayes' theorem.

- Argue that the lasso estimate is the value of *β* with maximal probability under this posterior distribution. Compute the log of the probability in order to make this point. *Hint*: The denominator (= the probability of the data) can be ignored in computing the maximum probability.

- Suppose that *ε*_{1},…,*ε*_{n} are independent and identically distributed according to the Laplace distribution. What are the maximum likelihood/MAP estimates of *β*_{j} under this assumption?

  *Hint*: See https://en.wikipedia.org/wiki/Least_absolute_deviations
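The link between the posterior and the lasso objective can be checked numerically. The sketch below is illustrative only and rests on stated assumptions: normal errors with unit variance, a Laplace prior with variance *c* (so its scale is √(*c*/2), since a Laplace distribution with scale *b* has variance 2*b*^{2}), no prior on the intercept *β*_{0}, and a tiny made-up data set. All function names are hypothetical. It verifies that twice the negative log of likelihood times prior differs from RSS + *λ*∑|*β*_{j}| only by a constant that does not depend on *β*, with *λ* = 2/scale.

```python
import math


def residual_sum_of_squares(beta0, beta, xs, ys):
    """Sum over i of (y_i - beta0 - sum_j x_ij * beta_j)^2."""
    return sum(
        (y - beta0 - sum(b * x for b, x in zip(beta, row))) ** 2
        for row, y in zip(xs, ys)
    )


def neg_log_posterior(beta0, beta, xs, ys, c):
    """Negative log of likelihood * prior (the evidence term is dropped)."""
    n, p = len(ys), len(beta)
    scale = math.sqrt(c / 2.0)  # Laplace variance is 2 * scale^2
    neg_log_lik = 0.5 * n * math.log(2.0 * math.pi) + 0.5 * residual_sum_of_squares(
        beta0, beta, xs, ys
    )
    neg_log_prior = p * math.log(2.0 * scale) + sum(abs(b) for b in beta) / scale
    return neg_log_lik + neg_log_prior


def lasso_objective(beta0, beta, xs, ys, lam):
    """RSS plus lam times the L1 norm of the slope coefficients."""
    rss = residual_sum_of_squares(beta0, beta, xs, ys)
    return rss + lam * sum(abs(b) for b in beta)


# Tiny made-up data set and prior variance (assumptions for illustration).
xs = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]]
ys = [1.0, 0.0, 2.5]
c = 2.0
lam = 2.0 / math.sqrt(c / 2.0)


def gap(beta0, beta):
    """2 * negative log posterior minus the lasso objective."""
    return 2.0 * neg_log_posterior(beta0, beta, xs, ys, c) - lasso_objective(
        beta0, beta, xs, ys, lam
    )


gap_a = gap(0.1, [0.5, -0.3])
gap_b = gap(-1.0, [2.0, 0.7])
print("gap at first beta: ", gap_a)
print("gap at second beta:", gap_b)
```

Because the gap is the same for every *β*, minimising the lasso objective and maximising the posterior select the same coefficients under these assumptions.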


# Problem 4 [25%]

*Based on a true story, according to*: The Drunkard’s Walk: How Randomness Rules Our Lives, Leonard Mlodinow

Suppose that you applied for life insurance and underwent a physical exam. The bad news is that your application was rejected because you tested positive for HIV. The test’s *sensitivity* is 99.7% and its *specificity* is 98.5% [https://en.wikipedia.org/wiki/Diagnosis_of_HIV/AIDS#Accuracy_of_HIV_testing]. However, after studying the CDC website, you find that in your ethnic group (age, gender, race, …) only one in 10,000 people is infected. What is the probability that you actually have HIV?
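The calculation is an application of Bayes' theorem. A minimal Python sketch follows, assuming the one-in-10,000 figure is the prior probability of infection and that the quoted sensitivity and specificity apply to the test you took; the function name is hypothetical.

```python
def posterior_given_positive(sensitivity, specificity, prevalence):
    """P(infected | positive test) by Bayes' theorem."""
    # P(positive and infected)
    true_positive = sensitivity * prevalence
    # P(positive and not infected)
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


p_infected = posterior_given_positive(
    sensitivity=0.997,
    specificity=0.985,
    prevalence=1.0 / 10_000,
)
print(f"P(HIV | positive test) = {p_infected:.4f}")
```

With these inputs the posterior probability comes out at roughly 0.66%, because false positives among the many uninfected people far outnumber true positives among the few infected.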
