University of Toronto Mississauga
STA302 Fall 2019 Assignment # 3
Due Date: Tuesday, November 19th 2019, during lecture.
Last Name / Surname (please print):
First Name (please print):
Student Number:
Tutorial Section (circle one):
Instructor: Al Nosedal
T0101 18-19 Julian Braganza
INSTRUCTIONS and POLICIES:
Answer each of the questions.
Please attach a printed version of your code and plots to your assignment. Failure to provide a printed copy of R code and graphs may result in receiving no marks.
Recall that missed assignments earn a mark of zero; no exceptions.
Medical certificates and/or other valid documentation are not accepted.
Late submissions will not be accepted. A hard copy of the assignment should be handed in during lecture on the due date; email submissions are not accepted.
Question | Value | Mark Earned
1        | 20    |
TOTAL    | 20    |
GOOD LUCK !
STA302 Fall 2019: Assignment # 3 Page 1 of 6

Problem 1. (20 marks) A Canadian supermarket carried out a survey among its customers to predict the factors that influenced the visit frequency to the store. Number of visits refers to the number of times a consumer visited the store in the past 30 days. All the remaining ratings are on a seven-point scale. For example, 7 on satisfaction means that the consumer is very satisfied with the supermarket while 1 means the customer is not at all satisfied with the supermarket.
Suppose a researcher wants to develop a regression model to predict the number of visits to the supermarket on the basis of consumer ratings on the remaining variables.
Dataset is available at
supermarket_url = https://mcs.utm.utoronto.ca/~nosedal/data/supermarket.txt
Developing regression models involves at least two considerations. The first is to develop a regression model that accounts for the most variation of the dependent variable. At the same time, the regression model should be as simple and economical as possible. How might researchers conduct regression analysis so that they can examine several models and then choose the most attractive one? The answer is to use search procedures.
Search procedures are processes by which more than one multiple regression model is developed for a given database, and the models are compared and sorted by different criteria, depending on the given procedure.
Perhaps the most widely known and used of the search procedures is stepwise regression. Stepwise regression is a step-by-step process that begins by developing a regression model with a single predictor variable and adds and deletes predictors one step at a time, examining the fit of the model at each step until no more significant predictors remain outside of the model.
STEP 1. In Step 1 of a stepwise regression procedure, the k independent variables are examined one at a time by developing a simple regression model for each independent variable to predict the dependent variable. The model containing the largest absolute value of t for an independent variable is selected, and the independent variable associated with the model is selected as the best single predictor of y at the first step. If the first independent variable selected at step 1 is denoted x1, the model appears in the form:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1$
If, after examining all possible single-predictor models, it is concluded that none of the independent variables produces a t value that is significant at level α, then the search procedure stops at Step 1 and recommends no model.
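The Step 1 screening described above can be sketched in R as follows. This is only an illustration under stated assumptions: the data are simulated, and the names `sim`, `visits`, `x_a`, `x_b`, and `x_c` are hypothetical stand-ins, not the columns of the supermarket file.

```r
# Simulated stand-in data; column names are hypothetical,
# not those of the supermarket file.
set.seed(302)
n <- 60
sim <- data.frame(x_a = rnorm(n), x_b = rnorm(n), x_c = rnorm(n))
sim$visits <- 5 + 2.0 * sim$x_a + 0.8 * sim$x_b + rnorm(n)

predictors <- c("x_a", "x_b", "x_c")

# Fit one simple regression per predictor and keep its t statistic,
# p-value, and R-squared.
step1 <- do.call(rbind, lapply(predictors, function(p) {
  fit <- lm(reformulate(p, response = "visits"), data = sim)
  s <- summary(fit)
  data.frame(predictor = p,
             t_value   = s$coefficients[p, "t value"],
             p_value   = s$coefficients[p, "Pr(>|t|)"],
             r_squared = s$r.squared,
             stringsAsFactors = FALSE)
}))
print(step1)

# The Step 1 candidate is the predictor with the largest |t|;
# it enters only if its p-value is below the chosen alpha.
best <- step1$predictor[which.max(abs(step1$t_value))]
```

In a simple regression the ranking by |t| and the ranking by R-squared agree, because t^2 = (n - 2) R^2 / (1 - R^2).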
STEP 2. In Step 2, the stepwise procedure examines all possible two-predictor regression models with x1 as one of the independent variables in the model and determines which of the other k - 1 independent variables in conjunction with x1 produces the highest absolute t value in the model. If this other variable selected from the remaining independent variables is denoted x2 and is included in the model selected at Step 2 along with x1, the model appears in the form:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$
At this point, stepwise regression pauses and examines the t value of the regression coefficient for x1. Occasionally, the regression coefficient for x1 will become statistically non-significant when x2 is entered into the model. In that case, stepwise regression will drop x1 out of the model and go back and examine which of the other k - 2 independent variables, if any, will produce the largest significant absolute t value when that variable is included in the model along with x2. If no other variables show significant t values, the procedure halts.
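A minimal R sketch of Step 2, again on simulated data with hypothetical names (`sim`, `visits`, `x_a`, `x_b`, `x_c`). It assumes, purely for illustration, that `x_a` was the Step 1 winner and that the significance level is 0.10.

```r
# Simulated stand-in data; column names are hypothetical.
set.seed(302)
n <- 60
sim <- data.frame(x_a = rnorm(n), x_b = rnorm(n), x_c = rnorm(n))
sim$visits <- 5 + 2.0 * sim$x_a + 0.8 * sim$x_b + rnorm(n)

x1 <- "x_a"                 # assumed Step 1 winner, for illustration only
others <- c("x_b", "x_c")   # the k - 1 remaining candidates

# Fit each two-predictor model that contains x1 and record both t statistics.
step2 <- do.call(rbind, lapply(others, function(p) {
  fit <- lm(reformulate(c(x1, p), response = "visits"), data = sim)
  co <- summary(fit)$coefficients
  data.frame(x1 = x1, x2 = p,
             t_x1 = co[x1, "t value"], p_x1 = co[x1, "Pr(>|t|)"],
             t_x2 = co[p, "t value"],  p_x2 = co[p, "Pr(>|t|)"],
             r_squared = summary(fit)$r.squared,
             stringsAsFactors = FALSE)
}))
print(step2)

alpha <- 0.10
cand <- step2[which.max(abs(step2$t_x2)), ]
enter_x2 <- cand$p_x2 < alpha   # does the best second variable enter?
keep_x1  <- cand$p_x1 < alpha   # is x1 still significant once x2 is in?
```

The `keep_x1` check corresponds to the pause described above, where the coefficient of x1 is re-examined after x2 enters.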
STEP 3. Step 3 begins with independent variables x1 and x2 (the variables that were finally selected at Step 2) in the model. At this step, a search is made to determine which of the k - 2 remaining independent variables in conjunction with x1 and x2 produces the largest significant absolute t value in the regression model. Let us denote the one that is selected as x3. If no significant t values are found at this step, the process stops here and the model determined in Step 2 is the final model. At Step 3, the model appears in the form:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3$
In a manner similar to Step 2, stepwise regression now goes back and examines the t values of the regression coefficients of x1 and x2 in this Step 3 model. If either or both of the t values are now nonsignificant, the variables are dropped out of the model and the process calls for a search through the remaining k - 3 independent variables to determine which, if any, in conjunction with x3 produce the largest significant t values in this model. The stepwise regression process continues step by step until no significant independent variables remain that are not in the model.
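The whole procedure can be written as a short loop. The function below is a sketch under assumptions, not a reference implementation: `stepwise_t` is a hypothetical helper name, the same significance level is used for entry and removal, and the iteration cap is an added safeguard against cycling that is not part of the description above. The data are simulated.

```r
# Hypothetical helper: stepwise selection driven by coefficient t tests.
stepwise_t <- function(data, response, candidates, alpha = 0.10) {
  selected <- character(0)
  # Iteration cap: a guard against cycling, not part of the procedure.
  for (iter in seq_len(2 * length(candidates))) {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # Entry: p-value of each remaining variable when added to the model.
    # All candidate models have the same residual df, so the smallest
    # p-value corresponds to the largest |t|.
    p_enter <- sapply(remaining, function(v) {
      fit <- lm(reformulate(c(selected, v), response = response), data = data)
      summary(fit)$coefficients[v, "Pr(>|t|)"]
    })
    if (min(p_enter) >= alpha) break
    newest <- names(which.min(p_enter))
    selected <- c(selected, newest)
    # Removal: drop earlier variables that are no longer significant.
    fit <- lm(reformulate(selected, response = response), data = data)
    p_in <- summary(fit)$coefficients[selected, "Pr(>|t|)"]
    selected <- selected[p_in < alpha | selected == newest]
  }
  selected
}

# Simulated stand-in data; column names are hypothetical.
set.seed(302)
n <- 60
sim <- data.frame(x_a = rnorm(n), x_b = rnorm(n), x_c = rnorm(n))
sim$visits <- 5 + 2.0 * sim$x_a + 0.8 * sim$x_b + rnorm(n)

chosen <- stepwise_t(sim, "visits", c("x_a", "x_b", "x_c"), alpha = 0.10)
if (length(chosen) > 0) {
  final <- lm(reformulate(chosen, response = "visits"), data = sim)
  print(coef(final))   # estimated intercept and slopes of the chosen model
}
```

An empty result from `stepwise_t` corresponds to the case where the search stops at Step 1 and recommends no model.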
Now, I would like you to use a stepwise regression search procedure on the supermarket data to find a regression model.
Use R to answer each of the following questions. You have to show all your work to get full credit. Answers, even if correct, with no justifications will not receive any marks.
a) Using R, examine each of the independent variables, one at a time, to determine the strength of each predictor in a simple regression model. Report your results in the following table:
Dependent Variable | Independent Variable | t statistic | R2
Number of Visits   |                      |             |
Number of Visits   |                      |             |
Number of Visits   |                      |             |
Let x1 be the independent variable that produces the largest absolute t value. Provide R code and computer output for these models.

b) Using R, conduct a search among the two remaining independent variables to determine which of those variables in conjunction with x1 produces the largest significant t value. Report your results in the following table:
Dependent Variable | Independent Variable (x1) | Independent Variable (x2) | t statistic (x1) | t statistic (x2) | R2
Number of Visits   |                           |                           |                  |                  |
Number of Visits   |                           |                           |                  |                  |
Provide R code and computer output for the Step 2 regression models.

c) Regardless of results from Step 2, fit a regression model with all three independent variables. Report your results in the following table:
Dependent Variable | Independent Variable (x1) | Independent Variable (x2) | Independent Variable (x3) | t statistic (x1) | t statistic (x2) | t statistic (x3) | R2
Number of Visits   |                           |                           |                           |                  |                  |                  |
Provide R code and computer output for the Step 3 regression model.
d) Use the technique of stepwise regression with α = 0.10 to choose a linear regression model. Provide its equation explicitly.
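Note that base R's `step()` selects models by AIC rather than by t tests, so it does not directly apply a significance level. One possible way to check a single entry step with built-in tools is `add1()` with `test = "F"`: for a term with one degree of freedom, the F statistic equals the square of the t statistic and the two p-values coincide. The sketch below uses simulated data with hypothetical column names.

```r
# Simulated stand-in data; column names are hypothetical.
set.seed(302)
n <- 60
sim <- data.frame(x_a = rnorm(n), x_b = rnorm(n), x_c = rnorm(n))
sim$visits <- 5 + 2.0 * sim$x_a + 0.8 * sim$x_b + rnorm(n)

# Start from a one-predictor model and test each candidate addition.
base <- lm(visits ~ x_a, data = sim)
tab <- add1(base, scope = ~ x_a + x_b + x_c, test = "F")
print(tab)

# Cross-check: the F value for x_b equals the squared t value of x_b
# in the corresponding two-predictor model.
fit2 <- update(base, . ~ . + x_b)
t_b <- summary(fit2)$coefficients["x_b", "t value"]
f_b <- tab["x_b", "F value"]
```

The companion function `drop1()` with `test = "F"` gives the matching test for removing a variable that is already in the model.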