[SOLVED] graph statistic Exercise 5

Exercise 5
Your colleague is again trying to conduct some statistical analyses for her projects, but she hasn't taken a stats course in a while. She turns to you again for help. First, she's interested in a research question that has hours slept as an outcome variable. Her predictors are (1) scores on a psychometric assessment for anxiety and (2) geographic location.

Your colleague is interested in identifying how North, South, and East participants differ from West participants in hours slept.
What type of coding is optimal for this?

What steps should she follow in order to run the regression including this categorical predictor?
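
One way the steps could look in R is sketched below. The data are simulated purely for illustration: the names `fakedata`, `sleep`, and `region` mirror the output shown later in this exercise, but the values, the sample size, and the choice to leave the anxiety score out of the model are assumptions, not her actual data or model.

```r
# Hypothetical data: simulated values, not the colleague's real data set.
set.seed(1)
fakedata <- data.frame(
  region = sample(c("N", "S", "E", "W"), 200, replace = TRUE),
  sleep  = rnorm(200, mean = 45, sd = 9)
)

# Step 1: store the categorical predictor as a factor.
fakedata$region <- factor(fakedata$region)

# Step 2: make West the reference level, so every coefficient
# is "that region minus West".
fakedata$region <- relevel(fakedata$region, ref = "W")

# Step 3: inspect the dummy (treatment) contrasts R will use.
print(contrasts(fakedata$region))

# Step 4: fit the regression and read the summary.
fit <- lm(sleep ~ region, data = fakedata)
print(summary(fit))
```

With this coding the intercept is the mean of the West group, and each `region` coefficient is the difference between that region's mean and the West mean.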

She'd also like to compare the groups to see if they're different from the overall mean. She thinks that in the population all of the categories are the same size (i.e., each category size in the population is total population size / number of groups).
Is there another coding method she could use? Explain why it is preferable for this research question.

Can you provide her the steps to code and run this analysis?
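
A minimal sketch of one possible approach, unweighted effect (sum-to-zero) coding via `contr.sum`, is shown below. The data are again simulated and hypothetical. The level order, which decides which region is left without its own coefficient, is an arbitrary choice made for the example.

```r
# Hypothetical data: simulated values, not the colleague's real data set.
set.seed(1)
fakedata <- data.frame(
  region = factor(sample(c("N", "S", "E", "W"), 200, replace = TRUE),
                  levels = c("N", "S", "E", "W")),
  sleep  = rnorm(200, mean = 45, sd = 9)
)

# Step 1: replace the default dummy contrasts with sum-to-zero contrasts.
# The last level (W here) is coded -1 in every column.
contrasts(fakedata$region) <- contr.sum(4)
print(contrasts(fakedata$region))

# Step 2: fit the model. Coefficients are labelled region1..region3 (N, S, E).
fit_effect <- lm(sleep ~ region, data = fakedata)
print(coef(fit_effect))

# Step 3: check the interpretation against the group means.
group_means <- tapply(fakedata$sleep, fakedata$region, mean)
print(mean(group_means))                # equals the intercept
print(group_means - mean(group_means))  # deviations from the unweighted mean
```

Under this coding the intercept is the unweighted mean of the four group means, and each coefficient is one region's deviation from it. The deviation for the omitted level is minus the sum of the other three.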

She ran the analysis from question 1b herself and came back with the results.

Call:
lm(formula = sleep ~ region, data = fakedata)

Residuals:
    Min      1Q  Median      3Q     Max
-23.055  -5.458   0.972   7.000  18.800

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    46.46       1.84   25.22  < 2e-16 ***
regionN        11.54       3.29    3.51  0.00055 ***
regionS         1.74       2.73    0.64  0.52461
regionE         7.60       1.99    3.82  0.00018 ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 9.03 on 196 degrees of freedom
Multiple R-squared:  0.107,  Adjusted R-squared:  0.0934
F-statistic: 7.83 on 3 and 196 DF,  p-value: 5.78e-05

She wants to know how to interpret the regionE coefficient, how to interpret the (Intercept) coefficient, and what it means when it says they're significant.

She has several predictors in another model and would like your advice on the output. She satisfied all the assumptions and successfully fit the model in R.
Explain to her how to interpret the coefficient for X3 from the result below.
Call:
lm(formula = y ~ X1 + X2 + X3, data = fakedata2)

Residuals:
    Min      1Q  Median      3Q     Max
-3.0311 -1.6688  0.2475  1.2705  3.9498

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 61.726514  74.881448   0.824   0.4369
X1           1.565139   0.799297   1.958   0.0911 .
X2           0.507857   0.772417   0.657   0.5319
X3           0.165630   0.894589   0.185   0.0504 .
X4           0.127194   0.763562   0.167   0.8724

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.61 on 7 degrees of freedom
Multiple R-squared:  0.9824,  Adjusted R-squared:  0.9699
F-statistic: 78.34 on 5 and 7 DF,  p-value: 0.000005452

Is X3 a meaningful predictor of y? Why or why not?

Determine if there are any issues with collinearity from the following output, which is based on the results found in question 4a.
Call:
imcdiag(x = design.matrix, y = linear.model$model[1])

All Individual Multicollinearity Diagnostics Result

        VIF    TOL       Wi       Fi Leamer   CVIF Klein
X1  38.9453 0.0257  75.8905 113.8358 0.1602 0.3558     0
X2 254.5083 0.0039 507.0167 760.5250 0.0627 2.3251     1
X3  57.8406 0.0173 113.6811 170.5217 0.1315 0.5284     1
X4 287.7631 0.0035 573.5261 860.2892 0.0589 2.6290     1

R-square of y on all x: 0.9824

Explain (1) what collinearity is and (2) why it is or isn't a problem for the results in question 3.
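
As a rough aid for reading the diagnostics, the sketch below recomputes tolerance from the VIF values printed in the table, using the standard relationships TOL = 1 / VIF and VIF_j = 1 / (1 - R_j^2). The cutoff of 10 is a common rule of thumb and an assumption here; the exercise itself does not specify a threshold.

```r
# VIF values copied from the imcdiag table above.
vif <- c(X1 = 38.9453, X2 = 254.5083, X3 = 57.8406, X4 = 287.7631)

# Tolerance is the reciprocal of VIF; it should reproduce the TOL column.
tol <- 1 / vif

# VIF_j = 1 / (1 - R_j^2), so R_j^2 = 1 - 1 / VIF_j is the share of each
# predictor's variance that the other predictors explain.
r2_j <- 1 - tol

print(round(cbind(VIF = vif, TOL = tol, R2_j = r2_j), 4))

# Rule-of-thumb flag (assumed threshold, not from the exercise).
print(vif > 10)
```

A very large VIF means the predictor is almost a linear combination of the others, which inflates its standard error by roughly the square root of the VIF.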

In a final model, she predicted a dichotomous outcome using a set of predictors. You recognize this as logistic regression.

Explain how logistic regression is different from OLS regression with a continuous outcome variable. Be sure to define the link function.
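
For reference, the logit link in its usual textbook form is written out below. The symbols p (probability that the outcome equals 1), x_1, ..., x_k (predictors), and beta_0, ..., beta_k (coefficients) are notation introduced here, not taken from the exercise.

```latex
\[
\operatorname{logit}(p) = \ln\!\left(\frac{p}{1-p}\right)
  = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k,
\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
\]
```

The model is linear in the log-odds rather than in the outcome itself, so a one-unit increase in x_j multiplies the odds by e^{beta_j}, holding the other predictors fixed.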

Interpret the Age coefficient from her output below. Express the result in change in odds of earning more than 50K. The dichotomous outcome variable is earn more than 50K, where 1 = more than 50K; Age is a discrete interval value.

summary(logitMod)
Call:
glm(formula = ABOVE50K ~ AGE + CAPITALGAIN + OCCUPATION +
    EDUCATIONNUM, family = binomial, data = fakerdata)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.8380  -0.5319   0.0073   0.6267   3.2847

Coefficients:
              Estimate Std. Error z value            Pr(>|z|)
(Intercept) 4.57657130 0.24641856  18.572 <0.0000000000000002 ***
AGE         2.27712854 0.07205131  31.604 <0.0000000000000002 ***

    Null deviance: 15216.0  on 10975  degrees of freedom
Residual deviance:  8740.9  on 10953  degrees of freedom
AIC: 8786.9

Number of Fisher Scoring iterations: 8

Provide the 95% CI for the change in odds for Age (note: the Zcrit is 1.96).
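
The arithmetic can be sketched in R as follows. It takes the AGE estimate and standard error exactly as printed in the output above and assumes the estimate is positive as shown. The interval is built on the log-odds scale and then exponentiated.

```r
# Values copied from the AGE row of the output above.
b     <- 2.27712854   # estimate (log-odds per one-unit increase in AGE)
se    <- 0.07205131   # standard error
zcrit <- 1.96         # critical value given in the question

# Odds ratio: multiplicative change in odds per one-unit increase in AGE.
odds_ratio <- exp(b)

# 95% CI on the log-odds scale, then exponentiated to the odds scale.
ci_logodds <- b + c(-1, 1) * zcrit * se
ci_odds    <- exp(ci_logodds)

print(round(odds_ratio, 2))
print(round(ci_odds, 2))
```

Under these assumptions the odds ratio comes out at about 9.75, with an interval of roughly 8.46 to 11.23. The interval is not symmetric around the odds ratio because it is symmetric on the log scale.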

You notice that she has some missing data. She doesn't know why the data is missing and she decided to use the default method for missing data handling: casewise deletion.
Explain to her the problems with using casewise deletion when she doesn't know why the data is missing.
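
To make the default behaviour concrete, the sketch below shows how `lm` silently drops incomplete rows. The data frame `d` and its variables are hypothetical and simulated. The missing values here are removed completely at random, which is the favourable case and exactly what she cannot assume about her own data.

```r
# Hypothetical data: simulated, with 15 predictor values set to missing.
set.seed(2)
d <- data.frame(x = rnorm(100), y = rnorm(100))
d$x[sample(100, 15)] <- NA

# With R's default na.action (na.omit), incomplete rows are dropped
# before fitting, without a warning.
fit <- lm(y ~ x, data = d)

print(nrow(d))                   # rows supplied
print(sum(!complete.cases(d)))   # rows with at least one missing value
print(nobs(fit))                 # rows actually used in the fit
```

Comparing `nrow(d)` with `nobs(fit)` shows how much of the sample was lost. If the dropped cases differ systematically from the retained ones, the estimates can be biased as well as less precise.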
