
[SOLVED] Indr 450/550 homework 4


1. (35 points) Regression based methods
(a) Split the data into a training set (months 5 to 84) and a test set
(months 85 to 108). Fit a least squares regression to the training
data with the following predictors:
$$y_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \beta_3 x_{1t} + \beta_4 x_{2t} + \dots + \beta_{13} x_{11t} + \beta_{14} d_{1t} + \beta_{15} d_{2t} + \beta_{16} d_{3t} + \epsilon_t$$
where:
$x_{it}$ is an indicator (dummy) for month i (i = 1, 2, …, 11). Note
that we only need 11 of the 12 monthly indicators in the
regression, so month 12 is skipped.
$d_{it}$ is the difference at lag i (i = 1, 2, 3), $d_{it} = y_{t-i} - y_{t-i-1}$.
(b) (5 points) Comment on the significance of the predictors and the $R^2$
value. Compute the MSE and RMSE of the fit on the training
data.
(c) (5 points) Compute the MSE and RMSE of the fit on the test
data.
(d) (25 points) Now fit a lasso regression to shrink the full model. Experiment with different penalty parameters (referred to as 'alpha'
in the sklearn library) and compute the test RMSE for a few cases.
Report the most important predictors based on the lasso optimization.
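The steps of question 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution: the series `sales` is synthetic stand-in data, the helper names `build_design` and `rmse` are hypothetical, month t = 1 is assumed to be calendar month 1, and standardizing the predictors before the lasso is a choice made here rather than something the assignment prescribes. Significance tests for part (b) need a package that reports standard errors (for example statsmodels), which this sketch leaves out.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler


def build_design(y):
    """Return a frame with y and the 16 predictors for months t = 1..len(y)."""
    df = pd.DataFrame({"y": np.asarray(y, dtype=float)})
    df["t"] = np.arange(1, len(df) + 1)
    df["t2"] = df["t"] ** 2
    month = (df["t"] - 1) % 12 + 1  # assumption: t = 1 is calendar month 1
    for i in range(1, 12):  # month 12 is the omitted baseline
        df[f"x{i}"] = (month == i).astype(int)
    step = df["y"].diff()  # y_t - y_{t-1}
    for i in range(1, 4):
        df[f"d{i}"] = step.shift(i)  # y_{t-i} - y_{t-i-1}
    return df.dropna().reset_index(drop=True)  # first complete row is t = 5


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))


# Synthetic stand-in for the sales series (NOT the assignment data).
rng = np.random.default_rng(0)
months = np.arange(1, 109)
sales = (1000 + 5 * months + 100 * np.sin(2 * np.pi * months / 12)
         + rng.normal(0, 30, 108))

data = build_design(sales)
features = [c for c in data.columns if c != "y"]
train = data[data["t"].between(5, 84)]
test = data[data["t"].between(85, 108)]

# Full least squares regression.
ols = LinearRegression().fit(train[features], train["y"])
print("OLS train RMSE:", rmse(train["y"], ols.predict(train[features])))
print("OLS test RMSE:", rmse(test["y"], ols.predict(test[features])))

# Standardizing is a choice made here so the penalty treats all
# predictors on a comparable scale; the scaler is fit on training data only.
scaler = StandardScaler().fit(train[features])
X_train = scaler.transform(train[features])
X_test = scaler.transform(test[features])
for alpha in (0.1, 1.0, 10.0):
    lasso = Lasso(alpha=alpha, max_iter=100_000).fit(X_train, train["y"])
    kept = [f for f, c in zip(features, lasso.coef_) if c != 0]
    score = rmse(test["y"], lasso.predict(X_test))
    print(f"alpha={alpha}: test RMSE={score:.1f}, kept={kept}")
```

Dropping rows with missing lagged differences leaves months 5 to 108, which is why the training window can start at month 5. The predictors whose lasso coefficients stay nonzero as `alpha` grows are the ones the shrinkage retains.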
2. (50 points) Tree based methods
(a) (10 points) Fit a regression tree using all predictors. Experiment
with a few different values for the depth of the tree and report
the train and test MSE and RMSE.
(b) (10 points) Fit a bagged regression tree using
all predictors. Report the train and test MSE and RMSE.
(c) (15 points) Fit a random forest. Experiment with different numbers of features and report the train and test MSE and RMSE.
(d) Plot the importance of the predictors for the best random forest.
(e) (15 points) Fit a boosted tree. Experiment with different depths
and learning rates and report the train and test MSE and RMSE.
(f) Plot the importance of the predictors for the best boosted tree.
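The tree-based fits in question 2 can be sketched with scikit-learn as below. This is only an illustrative sketch: the data are synthetic stand-ins with placeholder predictor names `f0` to `f15`, the helper names `evaluate` and `best` are hypothetical, the grids of depths, feature counts and learning rates are arbitrary, and "best" is taken here to mean lowest test RMSE among the settings tried.

```python
import numpy as np
from sklearn.ensemble import (BaggingRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the assignment data): 104 rows, 16 predictors,
# first 80 rows as the training set and the last 24 as the test set.
rng = np.random.default_rng(0)
X = rng.normal(size=(104, 16))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=104)
X_train, X_test, y_train, y_test = X[:80], X[80:], y[:80], y[80:]
names = [f"f{j}" for j in range(X.shape[1])]  # placeholder predictor names

# label -> (fitted model, train MSE, train RMSE, test MSE, test RMSE)
results = {}


def evaluate(label, model):
    """Fit the model and store its train/test MSE and RMSE under the label."""
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[label] = (model, train_mse, np.sqrt(train_mse),
                      test_mse, np.sqrt(test_mse))


for depth in (2, 4, 6):
    evaluate(f"tree depth={depth}",
             DecisionTreeRegressor(max_depth=depth, random_state=0))

# BaggingRegressor uses a decision tree as its default base estimator.
evaluate("bagged tree", BaggingRegressor(n_estimators=200, random_state=0))

for m in (4, 8, 16):
    evaluate(f"forest max_features={m}",
             RandomForestRegressor(n_estimators=200, max_features=m,
                                   random_state=0))

for depth in (1, 2, 3):
    for rate in (0.01, 0.1):
        evaluate(f"boosted depth={depth} rate={rate}",
                 GradientBoostingRegressor(n_estimators=200, max_depth=depth,
                                           learning_rate=rate, random_state=0))

for label, (_, tr_mse, tr_rmse, te_mse, te_rmse) in results.items():
    print(f"{label:32s} train MSE={tr_mse:.3f} RMSE={tr_rmse:.3f} | "
          f"test MSE={te_mse:.3f} RMSE={te_rmse:.3f}")


def best(prefix):
    """Return the label with the lowest test RMSE among matching labels."""
    return min((k for k in results if k.startswith(prefix)),
               key=lambda k: results[k][4])


# Importance ranking for the best forest and the best boosted model.
for prefix in ("forest", "boosted"):
    label = best(prefix)
    importances = results[label][0].feature_importances_
    ranked = sorted(zip(names, importances), key=lambda p: -p[1])
    print(label, [(n, round(float(v), 3)) for n, v in ranked[:4]])
```

The ranked importances can be passed to any bar-chart routine for parts (d) and (f). A random forest that considers all 16 predictors at every split behaves like bagging, which makes the `max_features=16` row a useful cross-check against the bagged tree.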
3. (15 points) Interpreting the results.
(a) (10 points) Complete the results summary table below (Table 1). For the full
regression, note the predictors that are statistically significant
at the 5% level. For lasso, list the predictors that remain after
shrinkage. For random forest and boosted tree, list
the predictors in order of the importance measure. Also list
the specifications (maximum depth, maximum number of features,
etc.) whenever they apply.
(b) (5 points) Using the summary table, explain which three
or four predictors are the most important for predicting the
sales of Audis. Which of the methods would you recommend for
prediction with this data?
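Table 1: Comparison Table
| Method          | Train RMSE | Test RMSE | Predictors | Spec. |
|-----------------|------------|-----------|------------|-------|
| Full Regression |            |           |            |       |
| Lasso           |            |           |            |       |
| Regression Tree |            |           | –          |       |
| Bagged Tree     |            |           | –          |       |
| Random Forest   |            |           |            |       |
| Boosted Tree    |            |           |            |       |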
