F-test in Multiple Regression
Comparing Nested Models
Richard Ressler 2019-11-20

Learning Outcomes
• Develop and apply tests for including multiple variables at the same time.
• Compare nested models using F-tests and anova() in R.
• References: Section 10.3 in the book.

Case Study: Kentucky Derby
• Speed vs Year and $\text{Year}^2$.
library(Sleuth3)
data(ex0920)
head(ex0920)
##   Year        Winner Starters NetToWinner   Time Speed T
## 1 1896     Ben Brush        8        4850 127.75 35.23 D
## 2 1897    Typhoon II        6        4850 132.50 33.96 H
## 3 1898       Plaudit        4        4850 129.00 34.88
## 4 1899        Manuel        5        4850 132.00 34.09
## 5 1900 Lieut. Gibson        7        4850 126.25 35.64
## 6 1901  His Eminence        5        4850 127.75 35.23

Year vs Speed
library(ggplot2)
qplot(Year, Speed, data = ex0920) + geom_smooth(se = FALSE)
[Figure: scatterplot of Speed (y-axis) against Year (x-axis) with a smoothed trend line]

Goal
• Get a p-value for the association between year and speed.
• A linear model looks okay, but a model with a quadratic term might be better.
• $\mu(\text{Speed} \mid \text{Year}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2$
• So to see if $\text{Year}^2$ is important, we need to test:
• $H_0: \beta_2 = 0$ given $\beta_1 \neq 0$
• $H_A: \beta_2 \neq 0$ given $\beta_1 \neq 0$

Full and Reduced Models:
• Reduced Model: $\mu(\text{Speed} \mid \text{Year}) = \beta_0 + \beta_1 \text{Year}$
• Full Model: $\mu(\text{Speed} \mid \text{Year}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2$
• Use the F-test strategy to run this hypothesis test:
1. Fit both full and reduced models.
2. Calculate the sum of squared residuals under both models and the corresponding degrees of freedom.
3. Calculate the F-statistic.
4. Compare to the theoretical F-distribution under $H_0$.
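Steps 2 and 3 can be written as a single formula. The following is the standard extra-sum-of-squares form of the F-statistic, where RSS is the residual sum of squares and df the residual degrees of freedom of each fitted model (the subscript names are chosen here for illustration):

```latex
F = \frac{\left(\mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}}\right) / \left(\mathrm{df}_{\text{reduced}} - \mathrm{df}_{\text{full}}\right)}
         {\mathrm{RSS}_{\text{full}} / \mathrm{df}_{\text{full}}}
```

Under $H_0$, and assuming the usual regression conditions hold, this statistic follows an F-distribution with $\mathrm{df}_{\text{reduced}} - \mathrm{df}_{\text{full}}$ and $\mathrm{df}_{\text{full}}$ degrees of freedom. The p-value is the upper-tail area beyond the observed F.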

Fit Under Reduced – Simple Linear Regression
[Figure: Speed (y-axis) against Year (x-axis) with the fitted line from the reduced model]

Residuals under Reduced
[Figure: Speed (y-axis) against Year (x-axis) showing the residuals from the reduced-model fit]

Residuals against Fit for Reduced
[Figure: resid(lmreduced) (y-axis) against fitted(lmreduced) (x-axis)]

Fit under Full
[Figure: Speed (y-axis) against Year (x-axis) with the fitted curve from the full model]

Residuals under Full
[Figure: Speed (y-axis) against Year (x-axis) showing the residuals from the full-model fit]

Residuals against Fit for Full
[Figure: resid(lmfull) (y-axis) against fitted(lmfull) (x-axis)]

Running the F Test to compare the models in R
• First, fit both reduced and full models.
• Save the output to two different variables.
ex0920$Year2 <- ex0920$Year ^ 2
lmfull <- lm(Speed ~ Year + Year2, data = ex0920)
lmreduced <- lm(Speed ~ Year, data = ex0920)
• Run anova() with the reduced model as the first argument.
anova(lmreduced, lmfull)
## Analysis of Variance Table
##
## Model 1: Speed ~ Year
## Model 2: Speed ~ Year + Year2
##   Res.Df    RSS Df Sum of Sq    F    Pr(>F)
## 1    114 41.837
## 2    113 33.083  1    8.7539 29.9 2.757e-07 ***
## ---

What is that Table?
## Analysis of Variance Table
##
## Model 1: Speed ~ Year
## Model 2: Speed ~ Year + Year2
##   Res.Df    RSS Df Sum of Sq    F    Pr(>F)
## 1    114 41.837
## 2    113 33.083  1    8.7539 29.9 2.757e-07 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1

How to read each column:
Res.Df       RSS           Df         Sum of Sq   F        Pr(>F)
df_reduced   RSS_reduced
df_full      RSS_full      df_extra   ESS         F-stat   p-value
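As a check on how the columns fit together, the F-statistic and p-value can be recomputed by hand from the Res.Df and RSS columns of the table. This is a minimal sketch in base R. The variable names (rss_reduced, df_extra, and so on) are illustrative, and small differences from the printed table are expected because the inputs are rounded.

```r
# Values copied from the anova() table (rounded as printed there).
rss_reduced <- 41.837; df_reduced <- 114
rss_full    <- 33.083; df_full    <- 113

df_extra <- df_reduced - df_full      # extra parameters in the full model
ess      <- rss_reduced - rss_full    # extra sum of squares ("Sum of Sq")
f_stat   <- (ess / df_extra) / (rss_full / df_full)

# Upper-tail area of the F(df_extra, df_full) distribution
p_value <- pf(f_stat, df_extra, df_full, lower.tail = FALSE)

print(c(df_extra = df_extra, ess = ess, f_stat = f_stat, p_value = p_value))
```

The results should agree with the Df, Sum of Sq, F, and Pr(>F) entries in row 2 of the table, up to rounding.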

F-test
• We can use the F-test for any two nested models.
• Nested: The reduced model is a special case of the full model created by setting constraints on some of the parameters of the full.
• e.g., set one or more parameters to zero.
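To illustrate nesting outside the Derby data, here is a small sketch on simulated data. The data, seed, and names (x, y, fit_reduced, fit_full) are made up for illustration and are not from the case study. The reduced model is the full model with the coefficient on x^2 constrained to zero, so anova() applies, and its F value can be compared with a manual calculation.

```r
set.seed(1)                                # arbitrary seed, for reproducibility only
n <- 100
x <- runif(n, 0, 10)
y <- 1 + 0.5 * x - 0.03 * x^2 + rnorm(n)   # simulated response, not real data

fit_reduced <- lm(y ~ x)                   # full model with the x^2 coefficient set to 0
fit_full    <- lm(y ~ x + I(x^2))

tab <- anova(fit_reduced, fit_full)

# Recompute the F-statistic from residual sums of squares and degrees of freedom
rss_r <- deviance(fit_reduced); df_r <- df.residual(fit_reduced)
rss_f <- deviance(fit_full);    df_f <- df.residual(fit_full)
f_manual <- ((rss_r - rss_f) / (df_r - df_f)) / (rss_f / df_f)

print(tab)
print(all.equal(tab$F[2], f_manual))       # compares anova()'s F with the manual value
```

For lm fits, deviance() returns the residual sum of squares, which is why it is used here in place of summing squared residuals directly.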

Another Example: Starters Variable “Marginal Fit”
[Figure: Speed (y-axis) against Starters (x-axis)]

Consider adding variables for Starters and Starters2
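• $\mu(\text{Speed} \mid \text{Year}, \text{Starters}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2 + \beta_3 \text{Starters} + \beta_4 \text{Starters}^2$
• $H_0: \beta_3 = \beta_4 = 0$
• $H_A$: either $\beta_3 \neq 0$ or $\beta_4 \neq 0$
• Full Model: $\mu(\text{Speed} \mid \text{Year}, \text{Starters}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2 + \beta_3 \text{Starters} + \beta_4 \text{Starters}^2$
• Reduced Model: $\mu(\text{Speed} \mid \text{Year}, \text{Starters}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2$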

Fit and Run F Test in R
ex0920$Starters2 <- ex0920$Starters ^ 2
lmfull <- lm(Speed ~ Year + Year2 + Starters + Starters2, data = ex0920)
lmreduced <- lm(Speed ~ Year + Year2, data = ex0920)
anova(lmreduced, lmfull)
## Analysis of Variance Table
##
## Model 1: Speed ~ Year + Year2
## Model 2: Speed ~ Year + Year2 + Starters + Starters2
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)
## 1    113 33.083
## 2    111 30.903  2    2.1803 3.9156 0.02274 *
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1

Example of a non-nested model
• Model 1: $\mu(\text{Speed} \mid \text{Year}, \text{Starters}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2$
• Model 2: $\mu(\text{Speed} \mid \text{Year}, \text{Starters}) = \beta_0 + \beta_1 \text{Starters} + \beta_2 \text{Starters}^2$
• Cannot use an F-test to compare these two models.
• Why? Neither model is a special case of the other, and mathematical theory only guarantees the F-distribution when the models are nested.
• When models are not nested, use other methods to compare them, e.g., adjusted $R^2$, $C_p$, AIC, or BIC from Section 12.4 (more on this later).
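As a rough sketch of one such alternative, base R's AIC() and BIC() accept several fitted models at once, nested or not. The data below are simulated placeholders standing in for Year, Starters, and Speed (they are not the ex0920 values), and the names sim, model1, and model2 are illustrative, so only the mechanics carry over to the case study.

```r
set.seed(2)                                  # arbitrary seed, for reproducibility only
n <- 120
sim <- data.frame(Year     = 1896:(1896 + n - 1),
                  Starters = sample(4:20, n, replace = TRUE))
# Simulated response: depends on Year only (an assumption made for this sketch)
sim$Speed <- 34 + 0.02 * (sim$Year - 1896) + rnorm(n, sd = 0.5)

model1 <- lm(Speed ~ Year + I(Year^2), data = sim)          # Year-only model
model2 <- lm(Speed ~ Starters + I(Starters^2), data = sim)  # Starters-only model

# Lower AIC/BIC indicates a better trade-off between fit and complexity
aic_tab <- AIC(model1, model2)
bic_tab <- BIC(model1, model2)
adj_r2  <- c(model1 = summary(model1)$adj.r.squared,
             model2 = summary(model2)$adj.r.squared)

print(aic_tab)
print(bic_tab)
print(adj_r2)
```

Because the simulated Speed was generated from Year alone, the Year model should come out ahead on all three criteria here. With real data the criteria need not agree with one another.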
