---
title: Problem Set 2
output:
  html_document:
    theme: lumen
    highlight: pygments
    number_sections: TRUE
---
*The assignment is worth **100 points**. There are **26 questions**. You should have the following packages installed:*
```{r setup, results='hide', message=FALSE, warning=FALSE}
library(tidyverse)
library(cowplot)
library(lfe)
library(stargazer)
```
In this problem set you will summarize the paper [Do Workers Work More if Wages Are High? Evidence from a Randomized Field Experiment](https://www.aeaweb.org/articles?id=10.1257/aer.97.1.298) (Fehr and Goette, AER 2007) and recreate some of its findings.
# Big picture
**1. What is the main question asked in this paper?**
**2. Recall the taxi cab studies where reference dependence is studied using observational data. What can an experimental study do that an observational study can't?**
**3. Summarize the field experiment design.**
**4. Summarize the laboratory experiment design. Why was it included with the study?**
**5. Summarize the main results of the field experiment.**
**6. Summarize the main results of the laboratory experiment.**
**7. Why are these results valuable? What have we learned? Motivate your discussion with a real-world example.**
# Theory
**Suppose the messenger's utility function is**
$$
v(e_t, x_t) = \gamma w_t e_t - g(e_t, x_t)
$$
**where $w_t$ is the wage rate in time $t$, $e_t$ is the messenger's effort, $\gamma$ is the marginal utility of lifetime wealth, and $g(e_t, x_t) = \frac{(\theta - x_t)e^2}{2}$ is the messenger's cost function with constant disutility of effort $\theta$ and exogenous disutility shock $x_t \sim N(0,\sigma^2)$. Since $\mathbb{E}[x_t] = 0$, you can assume $x_t = 0$.**
**8. Show that the messenger chooses a level of effort so that the marginal benefit of working equals the marginal cost.**
**9. Show that the messenger in equilibrium responds to higher wages with higher effort.**
**10. Write an R function that calculates $e_t^*$ for different levels of $w_t$. Set default values of $\theta=\gamma=1$. Then use `curve()` to plot the labor supply for $w_t \in [0,10]$.**
**11. Now suppose utility is given by**
$$
v(e_t, x_t) =
\begin{cases}
\gamma(w_t e_t - r) - g(e_t, x_t) &\quad \text{if} \quad w_t e_t \geq r \\[1em]
\lambda\gamma(w_t e_t - r) - g(e_t, x_t) &\quad \text{if} \quad w_t e_t < r
\end{cases}
$$
**12. Show that how the messenger in equilibrium responds to higher wages depends on the reference point $r$. (Hint: recall there are three cases to consider.)**
**13. Once more write an R function that calculates $e_t^*$ for different levels of $w_t$. Set default values of $\theta=\gamma=1$, $\lambda = 2$ and $r=3$. Then use `curve()` to plot the labor supply for $w_t \in [0,10]$.**
# Replication
*Use `theme_classic()` for all plots.*
## Correlations in revenues across firms
*For this section please use `dailycorrs.csv`.*
**14. The authors show that earnings at Veloblitz and Flash are correlated. Show this with a scatter plot with a regression line and no confidence interval. Title your axes and the plot appropriately. Do not print the plot but assign it to an object called `p1`.**
```{r}
# your code here
```
**15. Next plot the kernel density estimates of revenues for both companies. Overlay the distributions and make the densities transparent so they are easily seen. Title your axes and the plot appropriately. Do not print the plot but assign it to an object called `p2`.**
```{r}
# your code here
```
**16. Now combine both plots using cowplot and label the plots with letters.**
```{r}
# your code here
```
## Tables 2 and 3
*For this section please use `tables1to4.csv`.*
### Table 2
**On page 307 the authors write:**
>Table 2 controls for **individual fixed effects** by showing how, on average, the messengers' revenues deviate from their person-specific mean revenues. Thus, a positive number here indicates a positive deviation from the person-specific mean; a negative number indicates a negative deviation.
**17. Fixed effects are a way to control for *heterogeneity* across individuals that is *time invariant.* Why would we want to control for fixed effects? Give one way in which bike messengers could differ from each other, and explain why that difference might not vary over time.**
**18. Create a variable called `totrev_fe` and add it to the dataframe. This requires you to subtract each individual's average revenue from their revenue in a block: $x_i^{fe} = x_{it} - \bar{x}_i$ where $x_i^{fe}$ is the fixed effect revenue for $i$.**
```{r}
# your code here
```
**19. Use `summarise()` to recreate the findings in Table 2 for Participating Messengers using your new variable `totrev_fe`. (You do not have to calculate the differences in means.) In addition to calculating the fixed-effect controlled means, also calculate the standard errors. Recall the standard error is $\frac{s_{jt}}{\sqrt{n_{jt}}}$ where $s_{jt}$ is the standard deviation for treatment $j$ in block $t$ and $n_{jt}$ is the corresponding number of observations. (Hint: use `n()` to count observations.) Assign each calculation to a new variable. Assign the resulting dataframe to a new dataframe called `df_avg_revenue`.**
```{r}
# your code here
```
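As a minimal sketch of the standard-error formula $s/\sqrt{n}$ computed inside `summarise()`, the following uses a made-up data frame. The data frame `toy` and its columns `group`, `period` and `value` are hypothetical and are not part of `tables1to4.csv`.

```r
library(dplyr)

# Hypothetical toy data: two groups observed in two periods, two rows per cell
toy <- data.frame(
  group  = rep(c("a", "b"), each = 4),
  period = rep(c(1, 1, 2, 2), times = 2),
  value  = c(1, 3, 2, 6, 10, 14, 5, 5)
)

toy_summary <- toy %>%
  group_by(group, period) %>%
  summarise(
    avg = mean(value),            # cell mean
    se  = sd(value) / sqrt(n()),  # standard error: s / sqrt(n)
    .groups = "drop"
  )

toy_summary
```

Each row of `toy_summary` is one group-period cell, so the same pattern extends to any pair of grouping variables.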
**20. Plot `df_avg_revenue`. Use points for the means and error bars for standard errors of the means. Note the following:**
* To dodge the points and size them appropriately, use `geom_point(position=position_dodge(width=0.5), size=4)`
* To place the error bars use `geom_errorbar(aes(x=block, ymin = [MEAN] - [SE], ymax = [MEAN] + [SE]),width = .1,position=position_dodge(width=0.5))`
    + You need to replace `[MEAN]` with whatever you named your average revenues and `[SE]` with whatever you named your standard errors.
```{r}
# your code here
```
**21. Interpret the plot.**
### Table 3
**22. Recreate the point estimates in Model (1) in Table 3 by hand (you don't need to worry about the standard errors). Assign it to object `m1`. Recreating this model requires you to control for individual fixed effects and estimate the following equation:**
$$
y_{ijt} - \bar{y}_{ij} = \beta_1 (\text{H}_{ijt} - \bar{\text{H}}_{ij}) + \beta_2 (\text{B2}_{ijt} - \bar{\text{B2}}_{ij}) + \beta_3 (\text{B3}_{ijt} - \bar{\text{B3}}_{ij}) + (\varepsilon_{ijt} - \bar{\varepsilon}_{ij})
$$
**where $\text{H}$ is the variable `high`, $\text{B2}$ is the second block (`block == 2`) and $\text{B3}$ is the third block (`block == 3`).**
```{r}
# your code here
```
**23. Now recreate the same point estimates (ignoring the standard errors again) using `lm` and assign it to object `m2`. You are estimating**
$$
y_{ijt} = \beta_0 + \beta_1 \text{H}_{ijt} + \beta_2 \text{B2}_{ijt} + \beta_3 \text{B3}_{ijt} + \sum_{i=1}^{n} \alpha_i \text{F}_i + \varepsilon_{ijt}
$$
**where $\text{F}_i$ is the dummy variable for each messenger (`fahrer`).**
```{r}
# your code here
```
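Demeaning by individual and adding one dummy per individual are two routes to the same slope estimates. The sketch below illustrates this on simulated data using base R only. The data frame `sim` and its columns `id`, `x` and `y` are hypothetical and unrelated to the assignment data.

```r
set.seed(1)

# Hypothetical panel: 5 individuals, 6 observations each
sim <- data.frame(id = rep(1:5, each = 6))
sim$x <- rnorm(nrow(sim)) + sim$id                  # regressor correlated with the individual effect
sim$y <- 2 * sim$x + 3 * sim$id + rnorm(nrow(sim))  # true slope on x is 2

# Within transformation: subtract each individual's mean from y and x
sim$y_dm <- sim$y - ave(sim$y, sim$id)
sim$x_dm <- sim$x - ave(sim$x, sim$id)
within_fit <- lm(y_dm ~ x_dm - 1, data = sim)

# Dummy-variable approach: one indicator per individual
dummy_fit <- lm(y ~ x + factor(id), data = sim)

# The two slope estimates agree up to numerical precision
c(within = unname(coef(within_fit)["x_dm"]),
  dummy  = unname(coef(dummy_fit)["x"]))
```

The standard errors reported by the two fits differ, because `lm()` on the demeaned data does not count the individual means as estimated parameters.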
**24. Now use the function [`felm()`](https://www.rdocumentation.org/packages/lfe/versions/2.8-3/topics/felm) from the `lfe` package to recreate Model (1), including the standard errors. Assign your estimates to the object `m3`. You are estimating**
$$
y_{ijt} = \alpha_i + \beta_1 \text{H}_{ijt} + \beta_2 \text{B2}_{ijt} + \beta_3 \text{B3}_{ijt} + \varepsilon_{ijt}
$$
**where $\alpha_i$ is the individual intercept (i.e. the individual fixed effect).**
**Note that the function call works as follows: `felm([y]~[x] | [grouping variable] | 0 | [clustering variable], [name of data])`**
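To make the four-part formula concrete, here is a small sketch of a `felm()` call on simulated data. The data frame `sim` and its columns `id`, `x` and `y` are hypothetical. The grouping variable and the clustering variable are both set to `id` here as an assumption for illustration only.

```r
library(lfe)

set.seed(1)

# Hypothetical panel: 10 individuals, 8 observations each
i <- rep(1:10, each = 8)
sim <- data.frame(id = factor(i))
sim$x <- rnorm(nrow(sim)) + i                  # regressor correlated with the individual effect
sim$y <- 2 * sim$x + 3 * i + rnorm(nrow(sim))  # true slope on x is 2

# y ~ regressors | fixed effects | instruments (0 = none) | cluster variable
fe_fit <- felm(y ~ x | id | 0 | id, data = sim)

summary(fe_fit)  # coefficient on x with standard errors clustered by id
```

The individual intercepts are projected out rather than reported, so the coefficient table contains only the regressors named before the first `|`.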
```{r}
# your code here
```
**25. Compare the estimates in `m1`, `m2` and `m3`. What is the same? What is different? What would you say is the main advantage of using `felm()`?**
**26. Recreate all of Table 3 and use [`stargazer()`](https://cran.r-project.org/web/packages/stargazer/vignettes/stargazer.pdf) to print the results.**
```{r}
# your code here
```