MATH70093– Computational Statistics
Assessed coursework —Autumn 2024
Preparing your Coursework
1. Hand-out date: December 2nd 2024 – 9am.
2. Hand-in date: December 13th 2024 – noon.
3. Please use the Rmarkdown template file on the Statistics MSc 2024-2025 Blackboard page to write your report. Code should be provided in the appendix (which will be done automatically by the template provided). Ensure your submitted file has tidy and well documented code chunks.
4. The report should be properly structured, and should be written using complete sentences. Marks are given both for the content of the report (correctness of code, numerical answers, etc.) and the quality of the presentation (clarity of plots, explanations, etc.).
5. At the beginning of your report you must include the following statement:
“I, [insert CID], certify that this assessed coursework is my own work, unless otherwise acknowledged, and includes no plagiarism. I have not discussed my coursework with anyone else except when seeking clarification with the module lecturer via email or on MS Teams. I have not shared any code underlying my coursework with anyone else prior to submission. ”
6. All coding should be done in ‘Rmarkdown’ . Please ensure your submitted file has tidy and well documented code chunks in appendix.
Submitting your coursework
1. Submit via Blackboard before the deadline a pdf version of your report.
2. The name of your submitted file should begin with CW followed by your CID, e.g. if I were to submit a coursework, I would submit: CW 00830053.pdf.
1. (50% of total marks) Consider the undirected graph in Figure 1 modelling part of the tube network, where the random variables X1, . . . , X5 are independent uniformly distribution with Xi ~ U [0, ai] for a1 = 1, a2 = 2, a3 = 3, a4 = 1, a5 = 2, which represent the (random) travel time along each line.
Figure 1: Tube Network from A (Earl’s Court) to B (South Kensington)
(a) The shortest path from A to B and its associated travel time T will be random, depending on the specific values of the random variables X1, . . . , X5 . Let I = E[T] be the expected minimum travel time from A to B. Show that this can be written as
I = E[H(U)],
with
H(U) = min{a1 U1 + a4 U4 , a1 U1 + a3 U3 + a5 U5 , a2 U2 + a3 U3 + a4 U4 , a2 U2 + a5 U5 }, where U = (U1, . . . , U5 ) with U1, . . . , U5 ~ U [0, 1] are independent random variables.
(b) Using a standard Monte Carlo estimator, approximate the expected length I. Implement this estimator in R. For diferent values of n, compute the estimated mean I , and the estimated 95% confidence intervals, and plot them. Compare with the exact value I = .
(c) We shall attempt to reduce the variance of this estimator using importance sampling, by choosing as importance sampling distributions U1 , U4 ~ Beta(v, 1), and U2 , U3 , U5 ~ U [0, 1] independent, and then introducing appropriate importance sampling weights. Develop such a scheme and implement it in R. Using numerical experiments find a value of v in the range (1, 1.5) which provides a significant reduction in variance. Plot the estimated mean I, along with the estimated 95% confidence intervals as a function of the number of samples n.
(d) Let G(U) = min{a1 U1 + a4 U4 , a2 U2 + a5 U5 }. We know that
a control variate –
H(U) = H(U) + QC(U),
for some C and write down expressions which approximate the optimal value Q = Q* to give minimal variance. Implement this estimator in R. Plot the estimated mean I, along with the estimated 95% confidence intervals as a function of the number of samples n.
(e) Compare the performance of the three approaches in terms of variance and discuss the advantages and disadvantages of each respective approach.
2. (50% of total marks) Download your individual dataset D = {xn }n(5)1 corresponding to your CID and available
on Blackboard. Please state the last two digits of your CID at the beginning of this question. Consider a mixture model of Poisson distributions, i.e., a model of the form.
P (X = x|{λk , {wk = wk ψ(x; λk ) for any x ∈ N
where ψ(·; λ) is the probability mass function of a Poisson distribution with parameter λ; and Σ wk =
1. Assume the following prior distribution on the model parameters: for a fixed K, (w1, . . . wK ) follows a Dirichlet distribution with concentration parameter (1/K, . . . , 1/K) and λ 1, . . . , λK are i.i.d. following a uniform distribution between 0 and 20.
The aim of this question is to perform Bayesian inference of the model parameters (λ1, . . . λK , w1, . . . wK ) given the dataset D and to identify whether a model with two (K = 2) or three (K = 3) components best explain the dataset.
(a) Consider a mixture model of Poisson distributions with K ∈ {2, 3} components. Devise and implement a Metropolis-within-Gibbs algorithm that takes K as an input and produces samples from the posterior distribution p(λ1 , λ2 , w1 , w2 |D, K = 2) if K = 2 or p(λ1 , λ2 , λ3 , w1 , w2 , w3 |D, K = 3) if K = 3.
Precisely describe the algorithm with equations so that one could implement it without looking at your code. For each value of K ∈ {2, 3}, run the algorithm multiple times with diferent starting values and verify graphically that the MCMC algorithm has reached convergence.
(b) Using the outputs of the Metropolis-within-Gibbs algorithm implemented in question (a), estimate the following posterior predictive distributions for K = 2 and K = 3
P(X = x* |D, K) = ∫ · · · ∫ P (X = x* |{λk , {wk p ({λk , {wk dλ 1 . . . dλK dw1 . . . dwK
for all x* ∈ {0, 1, 2, . . . 30}. Write down the equation of your estimates. Produce a plot featuring P(X = x* |D, K = 2) and P(X = x* |D, K = 3) for x* ∈ {0, 1, 2, . . . 30} along with the distribution of the data in D. Comment on the relative suitability of each model to explain your dataset D.
(c) Devise and implement a Reversible Jump Process to jointly infer the number of components, K, and the model parameters given your dataset. You only need to consider a model with two components and one with three components, i.e., K ∈ {2, 3}. Precisely describe the proposed algorithm. Produce graphs to assess the convergence of the MCMC algorithm. From the output of the algorithm, report the posterior probability of the model with two component and the one with three components, i.e. P(K = 2|D) and P(K = 3|D). How does this compare to your answer in question (b)?
Tips: Many between-model moves can be considered but some might lead to a better mixing than others. In the article attached on Blackboard, Richardson and Green proposed between-models move in the context of a mixture of Gaussian distributions. Here is a suggestion of between-model moves inspired by this article and adapted to the mixture of Poisson distributions:
• The combine move starts with a model with 3 components and proposes to move to a model with 2 components. This is done by choosing a pair of components (j1 , j2 ) at random, that are adjacent in terms of the current values of their means (i.e. λj1 < λj2 and the third component is not in [λj1 , λj2 ]). These two components are merged into one component labelled j * such that:
wj * = wj1 + wj2 and λj * = (wj1 λj1 + wj2 λj2 )/wj * . (1)
• The reverse split proposal (from a model with 3 components to a model with 2 components) consists in randomly selecting one component j * (among the three components) and split it into two, labelled j1 and j2 , with weights and means that satisfy (1). This can be done for example by sampling u1 ~ Beta(2, 2) and u2 ~ N(0, 1) and setting wj1 = u1 wj * , wj2 = (1—u1 )wj * , λj1 = λj * —u2 √wj2 /wj1 and λj2 = λj * + u2 √wj1 /wj2 . If the adjacency condition is not satisfied, the move should be rejected.
.
Reviews
There are no reviews yet.