, , , ,

[SOLVED] Soc-ga 2332 intro to stats lab 11

$25

File Name: Soc_ga_2332_intro_to_stats_lab_11.zip
File Size: 310.86 KB

Categories: , , , , Tags: , , , ,
5/5 - (1 vote)

Wenhao Jiang
Part 1: Bi-variate Associations (Contingency Tables)
For today, we will use a similar dataset about same-sex marriage support. But now we have three support levels (1 = Oppose, 2 = Neutral, 3 = Support) instead of a binary outcome.
In R, you can create a contingency table by using the table() function and input the two categorical variables you are interested in. To conduct a chi-square test of independence, simply use the function chisq.test(your_contingency_table).
## create variables for contingency tables support_df <- support_df %>% mutate(## convert dummies to categorical variables gender = ifelse(female == 0, “male”, “female”), race = ifelse(black == 1, “black”, “white”))
## simple contingency table and chi-square test for support levels and race t1 <- table(support_df$support_level, support_df$race) t1
##
## black white
## 1 72 180 ## 2 74 161 ## 3 148 365
chisq.test(t1)
##
## Pearson’s Chi-squared test
##
## data: t1
## X-squared = 0.65239, df = 2, p-value = 0.7217
Part 1 Exercise
Recall that the χ2 statistic is defined as:
χ2 = X (fo −efe)2,
f
where fo is the observed frequency and fe is the expected frequency.
You are given the following contingency table of support levels and gender:
Cell Contents
|————————-|
| N | | Expected N | | N / Row Total | | N / Col Total | | N / Table Total |
|————————-|
Total Observations in Table: 1000
| support_df$gender
support_df$support_level | female | male | Row Total |
————————-|———–|———–|———–|
1 | 105 | 147 | 252 |
| 123.228 | 128.772 | | | 0.417 | 0.583 | 0.252 |
| 0.215 | 0.288 | | | 0.105 | 0.147 | |
————————-|———–|———–|———–|
2 | 109 | 126 | 235 |
| 114.915 | 120.085 | | | 0.464 | 0.536 | 0.235 |
| 0.223 | 0.247 | | | 0.109 | 0.126 | |
————————-|———–|———–|———–|
3 | 275 | 238 | 513 |
| 250.857 | 262.143 | | | 0.536 | 0.464 | 0.513 |
| 0.562 | 0.466 | | | 0.275 | 0.238 | |
————————-|———–|———–|———–|
Column Total | 489 | 511 | 1000 |
| 0.489 | 0.511 | |
————————-|———–|———–|———–|
1. How do you calculate the expected frequency for each cell? Verify your answer with the expected frequencies in the table.
2. State you null and alternative hypotheses of the χ2 test;
3. Calculate the χ2 statistic using the formula above;
4. Calculate the p-value of your test statistic. Hint: (a) recall that the degree of freedom is calculated by df = (nrow − 1) · (ncol − 1), (b) search pchisq
## your code here
Part 2: Ordered Logit Regression Model
2.1 Model Setup
• The cumulative probability for individual i’s choice up to response level j is given by:
j
Ci,j = Pr(yi ≤ j) = XPrj = 1,2,…,J
k=1
• This specific form of cumulative probability stems from the Sigmoid function:
f(x) = 1 + exp1 (−x)
which is monotonically increasing w.r.t. x
• Here we replace x by the linear combination of category-specific cutpoints φj and individual-specific characteristics and their coefficients
– Intuitively, we would use as in binary logistic model
– We use because the model was specified in this way at the time of invention – path dependence
– It only changes the sign of β, but not its magnitude
• As the Sigmoid function is monotonically increasing, we will have:
– φ0 < φ1 < … < φJ
– φ0 to be −∞ and φJ to be ∞.
• The probability of being in response category j for the same individual i is:
Pr
• θj and β are estimated using Maximum Likelihood Estimation (MLE). In R, you can estimate a ordered logit model using the polr() function from the MASS package.
## estimate ordered logit model
ologit1 <- polr(support_level ~ eduy, data = support_df, method=”logistic”) ologit2 <- polr(support_level ~ eduy + age, data = support_df, method=”logistic”) ologit3 <- polr(support_level ~ eduy + age + female, data = support_df, method=”logistic”) ologit4 <- polr(support_level ~ eduy + age + female + black, data = support_df, method=
## stargazer
stargazer(ologit1, ologit2, ologit3, ologit4, type=”text”)
“logistic”)
##
## ===================================================
## Dependent variable:
## ————————————–
## support_level
## (1) (2) (3) (4)
## —————————————————
## eduy 0.639*** 0.817*** 0.864*** 0.864*** ## (0.034) (0.043) (0.045) (0.045)
##
## age -0.221*** -0.227*** -0.227***
## (0.016) (0.016) (0.016)
##
## female 1.041*** 1.044***
## (0.165) (0.165)
##
## black -0.048
## (0.171)
##
## —————————————————
## Observations 1,000 1,000 1,000 1,000
## ===================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
2.2 Coefficients Interpretation
• In ordered logit models, the coefficients capture the effect on the log odds of moving to the “higher rank”. The exponentiated coefficients indicate the ratio between the odds after and before the given predictor increased by one unit.
• The odds here is defined as the probability of being in a higher category divided by the probability of being in the current or lower category.
Pr(yi>j|Xk+1) 1−Pr(yi≤j|Xk+1)
j k k
= exp(βk)
• To get these odds ratios in R, use exp(coef(your_model_object)) (same as the code you use for getting odds ratio for logistic models).
## odds Ratio exp(coef(ologit4))
## eduy age female black
## 2.3726365 0.7965623 2.8403564 0.9533769
• Note that the term exp(βk) does not depend on j. In other words, the effect of Xk on yi moving to a higher category is the same across j. The effect of moving from the lowest category to higher ones is the same as from the second highest category to the highest category. This is the proportional odds assumption/parallel regression assumption
2.3 Plot Predicted Probability
## dataframe for prediction
predicted_ord <- as.data.frame(Effect(c(“eduy”),
ologit4, xlevels = list(
eduy = seq(3, 24, by = 0.5), age = mean(support_df$age), black = mean(support_df$black), female = mean(support_df$female))
), level=95)
## get predicted yhat, pivot to long form predicted_y_ord <- predicted_ord %>% dplyr::select(eduy, prob.X1, prob.X2, prob.X3) %>% pivot_longer(!eduy, names_to = “level_y”, values_to = “yhat”)
## get predicted upper CI of yhat, pivot to long form predicted_upr_ord <- predicted_ord %>% dplyr::select(eduy, U.prob.X1, U.prob.X2, U.prob.X3) %>% pivot_longer(!eduy, names_to = “level_upr”, values_to = “upr”) %>%

dplyr::select(-eduy, -level_upr)
## get predicted lower CI of yhat, pivot to long form predicted_lwr_ord <- predicted_ord %>% dplyr::select(eduy, L.prob.X1, L.prob.X2, L.prob.X3) %>% pivot_longer(!eduy, names_to = “level_lwr”, values_to = “lwr”) %>% dplyr::select(-eduy, -level_lwr)
## combine to one df for plotting predicted_plot_ord <- cbind(predicted_y_ord, predicted_upr_ord, predicted_lwr_ord)
## plot
figure1 <- predicted_plot_ord %>%
ggplot(aes(x = eduy, y = yhat,
ymax = upr, ymin = lwr, fill = as.factor(level_y), linetype = as.factor(level_y))) +
geom_line() +
geom_ribbon(alpha = 0.3) + labs(title = “Ordered Logit”, x = “Years of Education”, y = “Predicted Probability”) + scale_fill_manual(name = “”, values = c(“#3182bd”, “#31a354”, “#de2d26”), label = c(“Disagree”, “Neutral”, “Agree”)) +
scale_linetype_manual(name = “”, values = c(“dashed”, “dotdash”, “solid”), label = c(“Disagree”, “Neutral”, “Agree”)) +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5)) figure1
Ordered Logit

Part 3: Multinomial Logit Regression Model
3.1 Model Setup
• Multinomial logit model can be used to predict the probability of a response falling into a certain category among K categories that are not ordered.
• We think of the problem as fitting K − 1 independent binary logit models, where one of the possible outcomes is defined as a pivot, and the K − 1 outcomes are compared with the pivot outcome.
– A binary logistic model is a special case of multinomial logit model – Recall a binary model has the form:
logit(Pr(Yi = 1)) = log Pr(Y = 1)
= logPrPr((YYii = 1)= 0)
= α1 + β1Xi
– which essentially compares the outcome (Yi = 1) with the pivot, reference outcome (Yi = 0)
• Now in parallel, with multiple K outcomes, the K −1 non-pivotal outcomes is assumed to be (assuming the first group is the pivot one):
logPrPr((YYii = 2)= 1) = α2 + β2Xi
logPrPr((YYii = 3)= 1) = α3 + β3Xi

logPrPr((YYii== 1)K) = αK + βKXi • Rewriting each probability, we get:
Pr(Yi = 2) = exp(α2 + β2Xi) × Pr(Yi = 1) Pr(Yi = 3) = exp(α3 + β3Xi) × Pr(Yi = 1)

Pr(Yi = K) = exp(αK + βKXi) × Pr(Yi = 1) • As PKk=1 Pr(Yi = k) = 1, we get:
Pr(Yi = 1) = 1
k k i
Pr(Yi = 2) = 1 + Pexp(Kk=2αexp(k +αβk2X+i)βkXi)

• Multinomial logit model can be estimated using the multinom() function from the nnet package.
## estimate multinomial logit models mlogit1 <- multinom(support_level ~ eduy, data = support_df)
## # weights: 9 (4 variable) ## initial value 1098.612289
## iter 10 value 725.903623
## final value 725.896424
## converged
mlogit2 <- multinom(support_level ~ eduy + age, data = support_df)
## # weights: 12 (6 variable)
## initial value 1098.612289
## iter 10 value 600.038827 ## iter 20 value 599.386590 ## iter 20 value 599.386589
## iter 20 value 599.386589
## final value 599.386589
## converged
mlogit3 <- multinom(support_level ~ eduy + age + female, data = support_df)
## # weights: 15 (8 variable)
## initial value 1098.612289
## iter 10 value 587.970856
## final value 578.365184
## converged
mlogit4 <- multinom(support_level ~ eduy + age + female + black, data = support_df)
## # weights: 18 (10 variable)
## initial value 1098.612289
## iter 10 value 604.789528 ## iter 20 value 578.283131 ## iter 20 value 578.283129
## iter 20 value 578.283129
## final value 578.283129
## converged
stargazer(mlogit1, mlogit2, mlogit3, mlogit4, type=”text”)
##
## =================================================================================================
## Dependent variable:
## ——————————————————————————-
## 2 3 2 3 2 3 2 3 ## (1) (2) (3) (4) (5) (6) (7) (8)
## ————————————————————————————————-
## eduy 0.384*** 0.930*** 0.556*** 1.283*** 0.599*** 1.381*** 0.599*** 1.382*** ## (0.047) (0.057) (0.058) (0.079) (0.061) (0.085) (0.062) (0.085)
##
## age -0.160*** -0.346*** -0.166*** -0.360*** -0.167*** -0.360***
## (0.021) (0.027) (0.022) (0.028) (0.022) (0.028)
##
## female 0.721*** 1.681*** 0.723*** 1.687***
## (0.223) (0.274) (0.224) (0.274)
##
## black -0.019 -0.099
## (0.233) (0.280)
##
## Constant -3.854*** -9.815*** 1.140 0.314 0.658 -1.007 0.670 -0.979
## (0.470) (0.637) (0.788) (0.950) (0.810) (1.000) (0.817) (1.006)
##
## ————————————————————————————————## Akaike Inf. Crit. 1,459.793 1,459.793 1,210.773 1,210.773 1,172.730 1,172.730 1,176.566 1,176.566 ## =================================================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
3.2 Coefficients Interpretation
• The exponentiated regression coefficients from the multinomial logit model can be interpreted in terms of relative risk ratios. This makes the interpretation of the coefficients a bit more intuitive, compared to the coefficients from either binary or ordinal logistic regression.
Pr(Yi=k|xi+1)
Relative Risk Ratio = PrPr(Y(iY=1i=|kx|ix+1)i)
Pr(Yi=1|xi)
=
= exp(βk)
• The interpretation for βk is: holding others at constant, for one unit increase of the predictor, the relative risk of falling into the category k, compared with falling into the baseline category, increases by a factor of exp(βk).
• To get the relative risk ratios in R, use exp(coef(your_model_object)) (same as the code you use for getting odds ratio for logistic models).
## get relative risk ratios for the 4th model exp(coef(mlogit4))
## (Intercept) eduy age female black
## 2 1.9535790 1.820853 0.8464970 2.060538 0.9811196
## 3 0.3756124 3.983080 0.6977276 5.400878 0.9056941
3.3 Plot Predicted Probabilities
• We can also plot the predicted effect for multinomial logistic models. For example, we can plot the predicted probabilities for the three possible outcomes (support, neutral, oppose) using the Effect() function.
# Get predicted y values
predicted_mul <- as.data.frame(Effect(c(“eduy”),
mlogit4, xlevels = list(
eduy = seq(3, 24, by = 0.5), age = mean(support_df$age), black = mean(support_df$black), female = mean(support_df$female))
), level=95)
# Get predicted yhat, pivot to long form predicted_y_mul <- predicted_mul %>% dplyr::select(eduy, prob.X1, prob.X2, prob.X3) %>% pivot_longer(!eduy, names_to = “level_y”, values_to = “yhat”)
# Get predicted upper CI of yhat, pivot to long form predicted_upr_mul <- predicted_mul %>% dplyr::select(eduy, U.prob.X1, U.prob.X2, U.prob.X3) %>% pivot_longer(!eduy, names_to = “level_upr”, values_to = “upr”) %>% dplyr::select(-eduy, -level_upr)
# Get predicted lower CI of yhat, pivot to long form predicted_lwr_mul <- predicted_mul %>% dplyr::select(eduy, L.prob.X1, L.prob.X2, L.prob.X3) %>% pivot_longer(!eduy, names_to = “level_lwr”, values_to = “lwr”) %>% dplyr::select(-eduy, -level_lwr)

# Combine to one df for plotting predicted_plot_mul <- cbind(predicted_y_mul, predicted_upr_mul, predicted_lwr_mul)
# Plot
figure2 <- predicted_plot_mul %>%
ggplot(aes(x = eduy, y = yhat,
ymax = upr, ymin = lwr, fill = as.factor(level_y), linetype = as.factor(level_y))) +
geom_line() +
geom_ribbon(alpha = 0.3) + labs(title = “Multinomial Logit”, x = “Years of Education”, y = “Predicted Probability”) + scale_fill_manual(name = “”, values = c(“#3182bd”, “#31a354”, “#de2d26”), label = c(“Disagree”, “Neutral”, “Agree”)) +
scale_linetype_manual(name = “”, values = c(“dashed”, “dotdash”, “solid”), label = c(“Disagree”, “Neutral”, “Agree”)) +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5))
ggarrange(figure1, figure2, ncol=2, common.legend = TRUE,
legend=”right”)

Part 4: Conditional Logit Regression Model
4.1 Model Setup
• The probability of person i selecting option j is:
πij = PQexpexp(βT(XβTijX) iq)
q=1
* The Softmax function has several advantages, including the linearity of the relative probability between πij and πiq
• log( πij ) = βTXij − βTXiq
πiq
4.2 Real-world Applications
• The model frequently appear in the Machine Learning literature, commonly referred as a Softmax function.
– In ML, it is most commonly used as the last activation layer of a neural network to normalize the output of a network to a probability distribution over predicted output classes

Figure 1: A Simple Neural Network Model with Softmax Activation Layer
– The intuition of the model is to maximize the probability that the target word wt+j appears in the context window of the central word wt, expressed as P(wt+j|wt)
– Both target and central words are n-dimensional vectors (e.g., 300-dimension) summarized by parameter θ. The “utility” of two words to co-occur is uTwt+jvwt. SGNS models the probability as:
expuTwt+jvwt
P(wt+j|wt,θ) = P exp uTk vwt
k∈V
– where the total “options” of words given the central word is the whole vocabulary list

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[SOLVED] Soc-ga 2332 intro to stats lab 11[SOLVED] Soc-ga 2332 intro to stats lab 11
$25