1. (60pts total, equally weighted) Consider a three-component mixture of normal distributions with a common prior on the mixture component means, the error variance, and the variance of the mixture component means. The prior on the mixture weights w is a three-component Dirichlet distribution. (The data for this problem can be found in Mixture.csv.)
p(Yi | µ1, µ2, µ3, w1, w2, w3, ε²) = ∑_{j=1}^{3} wj N(µj, ε²)
µj | µ0, σ0² ∼ N(µ0, σ0²)
µ0 ∼ N(0, 3)
σ0² ∼ IG(2, 2)
(w1, w2, w3) ∼ Dirichlet(1, 1, 1)
ε² ∼ IG(2, 2),
for i = 1, …, n. Specifically,
• w1, w2 and w3 are the mixture weights of mixture components 1, 2 and 3, respectively;
• µ1, µ2 and µ3 are the means of the mixture components;
• ε² is the variance parameter of the error term around the mixture components.

Since we're building a hierarchical model for the means of the individual components, we have a common hyperprior, where µ0 is the mean parameter of this hyperprior and σ0² is its variance parameter. Both of these have priors as well, but the parameters of those priors are fixed: µ0 has a Normal prior with mean 0 and variance 3, and σ0² has an Inverse-Gamma prior with shape and rate parameters of (2, 2), respectively. Similarly, ε² has an Inverse-Gamma prior with shape and rate parameters of (2, 2); while the two have the same parametrisation, they do not share a prior. The mixture weights w1, w2, w3 jointly come from a Dirichlet distribution with parameter vector (1, 1, 1). The quantities w1, w2, w3, µ1, µ2, µ3, ε², µ0 and σ0² are all random variables that we will estimate when we fit the model.
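For intuition, the generative process the model describes can be simulated directly. The following is a minimal sketch of that process (the names n, K, and rng are illustrative, not part of the problem):

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 500, 3

# hyperpriors
mu0 = rng.normal(0.0, np.sqrt(3.0))      # mu0 ~ N(0, 3)
sig02 = 1.0 / rng.gamma(2.0, 1.0 / 2.0)  # sig0^2 ~ IG(2, 2); Gamma uses scale = 1/rate
eps2 = 1.0 / rng.gamma(2.0, 1.0 / 2.0)   # eps^2 ~ IG(2, 2)

# component-level parameters
mu = rng.normal(mu0, np.sqrt(sig02), size=K)  # mu_j | mu0, sig0^2
w = rng.dirichlet(np.ones(K))                 # (w1, w2, w3) ~ Dirichlet(1, 1, 1)

# observations: pick a component, then add N(0, eps^2) noise around its mean
z = rng.choice(K, size=n, p=w)
y = rng.normal(mu[z], np.sqrt(eps2))
```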
(a) Let τ = 1/ε² and ϕ0 = 1/σ0². Derive the joint posterior p(w1, w2, w3, µ1, µ2, µ3, ε², µ0, σ0² | Y1, …, YN) up to a normalizing constant.
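For reference, the unnormalized posterior factorizes according to the model specification; writing out each density kernel (using τ and ϕ0 for the normal kernels) is the substance of the exercise:

p(w1, w2, w3, µ1, µ2, µ3, ε², µ0, σ0² | Y1, …, YN)
  ∝ [ ∏_{i=1}^{N} ∑_{j=1}^{3} wj N(Yi; µj, ε²) ] × [ ∏_{j=1}^{3} N(µj; µ0, σ0²) ]
  × N(µ0; 0, 3) × IG(σ0²; 2, 2) × IG(ε²; 2, 2) × Dirichlet(w1, w2, w3; 1, 1, 1).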
(b) Derive the full conditionals for all the parameters up to a normalizing constant.
– p(w1, w2, w3 | µ1, µ2, µ3, ε², Y1, …, YN) ∝
– p(µ1 | µ2, µ3, w1, w2, w3, Y1, …, YN, ε², µ0, σ0²) ∝
– p(µ2 | µ1, µ3, w1, w2, w3, Y1, …, YN, ε², µ0, σ0²) ∝
– p(µ3 | µ1, µ2, w1, w2, w3, Y1, …, YN, ε², µ0, σ0²) ∝
– p(ε² | µ1, µ2, µ3, w1, w2, w3, Y1, …, YN) ∝
– p(µ0 | µ1, µ2, µ3, σ0²) ∝
– p(σ0² | µ0, µ1, µ2, µ3) ∝

(c) Since neither the joint posterior nor any of the full conditionals involving the likelihood are of a form that is easy to sample, we introduce a data augmentation scheme. A common solution is to introduce an additional set of auxiliary random variables {Zi}_{i=1}^{N} that assign each observation to one of the mixture components, with the probability of assignment being the respective mixture weight. Re-derive the full conditionals under the data augmentation scheme.

(d) In task (c) you derived all the full conditionals, and due to the data augmentation scheme they are all in a form that is easy to sample. Use these full conditionals to implement Gibbs sampling using the data from “Mixture.csv”.
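For orientation, here is a minimal sketch of such a sampler, assuming Mixture.csv holds a single column of observations with a header row. The conjugate forms noted in the comments are the standard semiconjugate updates for this model; check them against your own derivations from (c) before relying on them:

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.loadtxt("Mixture.csv", delimiter=",", skiprows=1)  # adjust to the file's layout
N, K = len(y), 3
n_iter = 5000

w = np.ones(K) / K  # initial values
mu = rng.normal(size=K)
eps2, mu0, sig02 = 1.0, 0.0, 1.0
draws = {"w": [], "mu": [], "eps2": [], "mu0": [], "sig02": []}

for t in range(n_iter):
    # Z_i | rest: categorical with P(Z_i = j) proportional to w_j N(y_i; mu_j, eps2)
    logp = np.log(w) - 0.5 * (y[:, None] - mu) ** 2 / eps2
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    Z = (rng.uniform(size=(N, 1)) < p.cumsum(axis=1)).argmax(axis=1)

    # w | Z: Dirichlet(1 + n_1, 1 + n_2, 1 + n_3), with n_j = #{i : Z_i = j}
    counts = np.bincount(Z, minlength=K)
    w = rng.dirichlet(1 + counts)

    # mu_j | rest: Normal with precision n_j*tau + phi0 (tau = 1/eps2, phi0 = 1/sig02)
    tau, phi0 = 1.0 / eps2, 1.0 / sig02
    for j in range(K):
        prec = counts[j] * tau + phi0
        mean = (tau * y[Z == j].sum() + phi0 * mu0) / prec
        mu[j] = rng.normal(mean, np.sqrt(1.0 / prec))

    # eps2 | rest: IG(2 + N/2, 2 + SSE/2); numpy Gamma uses scale = 1/rate
    sse = ((y - mu[Z]) ** 2).sum()
    eps2 = 1.0 / rng.gamma(2 + N / 2, 1.0 / (2 + sse / 2))

    # mu0 | mu, sig02: Normal; the N(0, 3) prior contributes precision 1/3
    prec0 = K / sig02 + 1.0 / 3.0
    mu0 = rng.normal((mu.sum() / sig02) / prec0, np.sqrt(1.0 / prec0))

    # sig02 | mu, mu0: IG(2 + K/2, 2 + sum_j (mu_j - mu0)^2 / 2)
    sig02 = 1.0 / rng.gamma(2 + K / 2, 1.0 / (2 + ((mu - mu0) ** 2).sum() / 2))

    for key, val in zip(draws, (w, mu.copy(), eps2, mu0, sig02)):
        draws[key].append(val)
```

Re-running with different starting values, as part (e) asks, only requires changing the seed and the initial values of w, mu, eps2, mu0 and sig02.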
(e) Given tasks (c)-(d), show traceplots for all estimated parameters, and compute means and 95% credible intervals for the marginal posterior distributions of all the parameters except the auxiliary variables. Now suppose you re-run the sampler using 3 different sets of starting values for the parameters: are your results the same? Justify your reasoning with visualizations.
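Continuing the (hypothetical) names from the sampler sketch above, posterior summaries and traceplots for part (e) could be produced along these lines; the same pattern extends to the weights, eps2, mu0 and sig02:

```python
import numpy as np
import matplotlib.pyplot as plt

burn = 1000  # discard burn-in draws
mus = np.asarray(draws["mu"])[burn:]

# posterior means and 95% credible intervals for the component means
for j in range(3):
    lo, hi = np.quantile(mus[:, j], [0.025, 0.975])
    print(f"mu_{j + 1}: mean = {mus[:, j].mean():.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")

# traceplots: one line per component mean
plt.plot(mus)
plt.xlabel("iteration")
plt.ylabel("mu_j")
plt.show()
```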
2. (30pts total, equally weighted) PH Exercise 9.1: The file swim.dat contains data on the amount of time, in seconds, it takes each of four high school swimmers to swim 50 yards. Each swimmer has six times, taken on a biweekly basis.

(a) Perform the following data analysis for each swimmer separately: write down a linear regression model with swimming time as the response and week as the explanatory variable. Complete the prior specification by using the information that competitive times for this age group generally range from 22 to 24 seconds.

(b) Implement a Gibbs sampler to fit each of the models. For each swimmer j, obtain a posterior predictive distribution for Y*_j, the time of swimmer j if they were to swim two weeks from the last recorded time.
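A sketch of one possible implementation, assuming swim.dat is whitespace-delimited with one row of six times per swimmer and weeks coded 0, 2, …, 10; the semiconjugate prior below is one illustrative reading of the 22-24 second information, not the required choice:

```python
import numpy as np

rng = np.random.default_rng(1)
times = np.loadtxt("swim.dat")            # assumed shape (4, 6): one row per swimmer
weeks = np.arange(0, 12, 2, dtype=float)  # weeks 0, 2, ..., 10 (biweekly)
X = np.column_stack([np.ones(6), weeks])

beta0 = np.array([23.0, 0.0])              # intercept centered in the 22-24 s range
Sigma0_inv = np.diag([1 / 0.25, 1 / 0.1])  # one illustrative prior precision choice
nu0, s20 = 1.0, 0.25                       # weak prior on the error variance

def gibbs_predict(yj, n_iter=5000):
    """Semiconjugate Gibbs sampler; returns predictive draws of Y*_j at week 12."""
    sigma2, preds = 1.0, []
    xstar = np.array([1.0, 12.0])  # two weeks after the last recorded time
    for _ in range(n_iter):
        # beta | sigma2, y: multivariate normal (standard semiconjugate update)
        V = np.linalg.inv(Sigma0_inv + X.T @ X / sigma2)
        m = V @ (Sigma0_inv @ beta0 + X.T @ yj / sigma2)
        beta = rng.multivariate_normal(m, V)
        # sigma2 | beta, y: inverse gamma; numpy Gamma uses scale = 1/rate
        sse = ((yj - X @ beta) ** 2).sum()
        sigma2 = 1.0 / rng.gamma((nu0 + 6) / 2, 2.0 / (nu0 * s20 + sse))
        # posterior predictive draw at x*
        preds.append(rng.normal(xstar @ beta, np.sqrt(sigma2)))
    return np.array(preds)

# joint predictive draws: one column per swimmer
preds = np.column_stack([gibbs_predict(times[j]) for j in range(4)])
```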
(c) The coach has to decide which swimmer should compete in a swimming meet in two weeks. Using your posterior predictive distributions, compute P(Y*_j = max{Y*_1, …, Y*_4} | Y) for each swimmer j, and based on this make a recommendation to the coach.
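Continuing from the preds array in the sketch above, each probability is a simple Monte Carlo fraction:

```python
import numpy as np

# fraction of joint predictive draws in which swimmer j records the largest time
probs = np.bincount(preds.argmax(axis=1), minlength=4) / len(preds)
print({f"swimmer {j + 1}": round(p, 3) for j, p in enumerate(probs)})
```

Since smaller times are faster, keep the direction of the comparison in mind when turning these probabilities into a recommendation.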