1 Bayesian Linear Regression
Given the training inputs X = (x_1, …, x_N) and the corresponding target set t = (t_1, …, t_N), we want to predict the target t of a new test point x. In other words, we wish to evaluate the predictive distribution p(t | x, X, t).
A linear regression function can be expressed as below, where φ(x) is a vector of basis functions:
y(x, w) = w^T φ(x)
In order to make a prediction of t for new test data x from the learned w, we will:
- Multiply the likelihood of the new observation, p(t | x, w), by the posterior distribution over w given the training inputs and targets, p(w | X, t).
- Take the integral over w to find the predictive distribution:
p(t | x, X, t) = ∫ p(t | x, w) p(w | X, t) dw.
Now, please answer the following questions:
- Why do we need the basis function φ(x) in linear regression? What is the benefit of applying basis functions compared to linear regression on the raw inputs?
- Prove that the predictive distribution just mentioned has the form
p(t | x, X, t) = N(t | m(x), s^2(x)),
where
m(x) = β φ(x)^T S Σ_{n=1}^{N} φ(x_n) t_n,
s^2(x) = β^{-1} + φ(x)^T S φ(x).
Here, the matrix S is given by
S^{-1} = α I + β Σ_{n=1}^{N} φ(x_n) φ(x_n)^T.
(Hint: p(w | X, t) ∝ p(t | X, w) p(w), and you may use the formulas shown on page 93.)
- Could we use a linear regression function for classification? Why or why not? Explain!
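As a numerical companion to the derivation above, the predictive mean m(x) and variance s^2(x) can be computed directly from their definitions. This is a minimal sketch on a synthetic 1-D toy problem with a polynomial basis; the data, the basis choice, and the precision values α and β are all assumptions for illustration, not part of the assignment:

```python
import numpy as np

def phi(x, M=3):
    """Polynomial basis vector [1, x, x^2, ..., x^(M-1)]."""
    return np.array([x**m for m in range(M)])

# Toy training data (assumed for illustration only)
rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, size=20)
t_train = np.sin(np.pi * x_train) + 0.1 * rng.standard_normal(20)

alpha, beta = 2.0, 25.0                      # prior precision, noise precision (assumed)
Phi = np.stack([phi(x) for x in x_train])    # N x M design matrix

# S^{-1} = alpha * I + beta * sum_n phi(x_n) phi(x_n)^T
S_inv = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi
S = np.linalg.inv(S_inv)

def predictive(x_new):
    """Predictive mean m(x) and variance s^2(x) of Bayesian linear regression."""
    p = phi(x_new)
    m = beta * p @ S @ Phi.T @ t_train       # m(x) = beta * phi^T S sum_n phi(x_n) t_n
    s2 = 1.0 / beta + p @ S @ p              # s^2(x) = 1/beta + phi^T S phi
    return m, s2

m, s2 = predictive(0.5)
print(m, s2)
```

Note that s^2(x) is always at least the noise floor 1/β; the second term adds the uncertainty about w itself.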
2 Linear Regression
In this homework, you need to predict the chance of being admitted based on relevant student résumé data. The following two approaches need to be implemented respectively:
- Maximum likelihood approach (ML)
- Maximum a posteriori approach (MAP)
The dataset provides a total of 500 students with 7 features. Can you use these features to predict the chance of admission to your own dream school?
One might consider the following steps to start the work:
- Download and inspect the dataset.
- Create a new Colab or Jupyter notebook file.
- Divide the dataset into training and validation sets.
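The steps above can be sketched as follows. Since the real CSV files are not bundled here, a synthetic stand-in with the same shape (500 rows, 7 features) is generated in place of `pd.read_csv("X.csv")` / `pd.read_csv("T.csv")`; the column names and the 80/20 split ratio are assumptions, not requirements of the assignment:

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_csv("X.csv") and pd.read_csv("T.csv"); swap in the real
# files once downloaded. Column names here are guesses based on the description.
rng = np.random.default_rng(42)
X = pd.DataFrame(rng.uniform(0, 1, size=(500, 7)),
                 columns=["GRE", "TOEFL", "Rating", "SOP", "LOR", "CGPA", "Research"])
T = pd.DataFrame({"Chance of Admit": rng.uniform(0, 1, 500)})

# Shuffle once, then hold out 20% for validation (the split ratio is a choice,
# not specified by the assignment).
idx = rng.permutation(len(X))
n_train = int(0.8 * len(X))
train_idx, valid_idx = idx[:n_train], idx[n_train:]

X_train, X_valid = X.values[train_idx], X.values[valid_idx]
t_train, t_valid = T.values[train_idx].ravel(), T.values[valid_idx].ravel()
print(X_train.shape, X_valid.shape)   # (400, 7) (100, 7)
```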
Dataset Description
- dataset X.csv contains 7 different résumé features, serving as the input:
GRE Score, TOEFL Score, University Rating, SOP, LOR, CGPA, Research
- dataset T.csv contains the chance of admission, regarded as the target: Chance of Admit
Specification
- For those problems with Code Result at the end, you must show your result in your .ipynb file or you will get no points.
- For those problems with Explain at the end, you must give a clear explanation or you will get low points.
- You are also encouraged to include some discussion on the problems that are not marked as Explain.
- Feature selection
In real-world applications, the dimension of data is usually more than one. In the training stage, please fit the data by applying a polynomial function of the form
y(x, w) = w_0 + Σ_{i=1}^{D} w_i x_i + Σ_{i=1}^{D} Σ_{j=1}^{D} w_ij x_i x_j    (M = 2)
and minimizing the error function.
- In the feature selection stage, please apply polynomials of order M = 1 and M = 2 to the dimension D = 7 input data. Please evaluate the corresponding RMS error on both the training set and the validation set. (15%) Code Result
- How will you analyze the weights of the polynomial model with M = 1 and select the most contributive feature? Code Result, Explain (10%)
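One possible sketch of the feature-selection step: build the M = 1 and M = 2 design matrices, fit by ordinary least squares, and report RMS errors. The data here is synthetic (the real admission data should be substituted), and the quadratic terms use i ≤ j, which is equivalent to the double sum above up to re-parameterization of the symmetric weights:

```python
import numpy as np
from itertools import combinations_with_replacement

def design_matrix(X, M):
    """Columns: bias, x_i (M >= 1), and x_i * x_j for i <= j (M = 2)."""
    cols = [np.ones(len(X))]
    if M >= 1:
        cols += [X[:, i] for i in range(X.shape[1])]
    if M >= 2:
        cols += [X[:, i] * X[:, j]
                 for i, j in combinations_with_replacement(range(X.shape[1]), 2)]
    return np.column_stack(cols)

def rms_error(Phi, w, t):
    return np.sqrt(np.mean((Phi @ w - t) ** 2))

# Synthetic stand-in for the admission data (D = 7 features, 500 samples)
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 7))
t = X @ rng.uniform(-1, 1, 7) + 0.05 * rng.standard_normal(500)

for M in (1, 2):
    Phi = design_matrix(X[:400], M)
    w, *_ = np.linalg.lstsq(Phi, t[:400], rcond=None)   # minimize sum-of-squares error
    Phi_v = design_matrix(X[400:], M)
    print(f"M={M}: train RMS={rms_error(Phi, w, t[:400]):.4f}, "
          f"valid RMS={rms_error(Phi_v, w, t[400:]):.4f}")
```

For the feature-analysis question, a natural starting point is to standardize each feature before fitting, so that the magnitudes of the M = 1 weights become comparable across features.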
- Maximum likelihood approach
- Which basis function will you use to further improve your regression model: polynomial, Gaussian, sigmoidal, or a hybrid? Explain (5%)
- Introduce the basis function you chose in (a) into the linear regression model and analyze the result you get. (Hint: you might want to discuss the phenomenon that occurs when the model becomes too complex.) Code Result, Explain (10%)
φ(x) = [φ_1(x), φ_2(x), …, φ_N(x), φ_bias(x)]
- Apply N-fold cross-validation in your training stage to select at least one hyperparameter (order, number of parameters, …) for your model, and include some discussion (underfitting, overfitting). Code Result, Explain (10%)
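The two sub-problems above can be combined in one sketch: a Gaussian basis expansion (one of the options listed in (a)) with N-fold cross-validation over the number of basis centers. The data is synthetic, the basis width `s` and the choice of centers are assumptions, and the hyperparameter grid is only illustrative:

```python
import numpy as np

def gaussian_design(X, centers, s=0.5):
    """phi_k(x) = exp(-||x - mu_k||^2 / (2 s^2)), plus a bias column."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.column_stack([np.ones(len(X)), np.exp(-d2 / (2 * s**2))])

def cv_rms(X, t, n_basis, n_folds=5):
    """Average held-out RMS over n_folds for a given number of Gaussian centers."""
    folds = np.array_split(np.arange(len(X)), n_folds)
    errs = []
    for k in range(n_folds):
        va = folds[k]
        tr = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        centers = X[tr][:n_basis]          # simple choice: first n_basis training points
        Phi_tr = gaussian_design(X[tr], centers)
        w, *_ = np.linalg.lstsq(Phi_tr, t[tr], rcond=None)
        Phi_va = gaussian_design(X[va], centers)
        errs.append(np.sqrt(np.mean((Phi_va @ w - t[va]) ** 2)))
    return np.mean(errs)

# Synthetic stand-in for the admission data
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(500, 7))
t = np.sin(X.sum(axis=1)) + 0.05 * rng.standard_normal(500)

for n_basis in (5, 20, 100):
    print(f"{n_basis} centers: CV RMS = {cv_rms(X, t, n_basis):.4f}")
```

Plotting training RMS against the cross-validated RMS as the number of centers grows makes the underfitting/overfitting discussion concrete: the gap between the two curves widens as the model becomes too complex.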
- Maximum a posteriori approach
- What is the key difference between the maximum likelihood approach and the maximum a posteriori approach? Explain (5%)
- Use the maximum a posteriori approach to retest the model you designed in 2. You could choose a Gaussian distribution as the prior. Code Result (10%)
- Compare the results of the maximum likelihood approach and the maximum a posteriori approach. Are they consistent with your conclusion in (a)? Explain (5%)
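With a zero-mean Gaussian prior on w, the MAP estimate reduces to regularized least squares: w_MAP = (λI + Φ^T Φ)^{-1} Φ^T t with λ = α/β, while ML is ordinary least squares. A minimal side-by-side sketch on a deliberately overparameterized synthetic problem (all data and the value of λ are assumed for illustration):

```python
import numpy as np

def fit_ml(Phi, t):
    """Maximum likelihood: ordinary least squares."""
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

def fit_map(Phi, t, lam):
    """MAP with a zero-mean Gaussian prior: w = (lam*I + Phi^T Phi)^{-1} Phi^T t."""
    M = Phi.shape[1]
    return np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)

# Small, noisy synthetic problem where ML tends to overfit (assumed data)
rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(30, 1))
t = np.sin(np.pi * X[:, 0]) + 0.2 * rng.standard_normal(30)
Phi = np.column_stack([X[:, 0] ** m for m in range(10)])   # degree-9 polynomial basis

w_ml = fit_ml(Phi, t)
w_map = fit_map(Phi, t, lam=1e-2)
print("||w_ML|| =", np.linalg.norm(w_ml), " ||w_MAP|| =", np.linalg.norm(w_map))
```

The prior pulls the MAP weights toward zero, which is exactly the regularization effect that the comparison question in (a) is probing.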