
# -*- coding: utf-8 -*-
hw5_yourNameHere.ipynb

Automatically generated by Colaboratory.

Original file is located at
https://colab.research.google.com/drive/1K2sTulUsq1nnqRkDpHwXBuFndLic3yHj

# IFT6269 Homework 5 Sampling Methods
**Due:** Tuesday, December 17, 2019

#### Name:
#### Student ID:
#### Collaborators:

import numpy as np
np.set_printoptions(precision=5, suppress=True)
import matplotlib.pyplot as plt
plt.style.use('seaborn-white')

#
# Code for plotting the results
# ! DO NOT MODIFY THESE FUNCTIONS !
#

def show_matrix(A):
    matfig = plt.figure(figsize=(5, 5))
    plt.matshow(A, cmap=plt.cm.gray, fignum=matfig.number)
    plt.colorbar()
    plt.show()

def plot_list(data, title=''):
    plt.figure(figsize=(5, 2))
    plt.plot(data)
    plt.title(title)
    plt.show()

## Tasks

0. Get your own copy of this file via File > Save a copy in Drive,
1. Fill in your personal information and collaborators at the top of this assignment, and rename the notebook accordingly, e.g., `hw5_thomasBayes.ipynb`,
2. Read the instructions provided on each section and cell carefully,
3. Complete the coding exercises in sections **Gibbs sampling** and **Mean field**.

**Important**: You are allowed to collaborate with other students on both the math and coding parts of this assignment. However, the answers provided here must reflect your individual work. For that reason, you are not allowed to share this notebook, except for your submission to the TA for grading. **Don't forget to pin and save the version of the notebook you want to be graded on!**

## Problem setting

Consider the Ising model with binary variables $X_s \in \{0,1\}$ and a factorization of the form:
$$
p(x; \eta) = \frac{1}{Z_p} \exp\left( \sum_{s \in V} \eta_s x_s + \sum_{\{s,t\} \in E} \eta_{st} x_s x_t \right).
$$
We consider the $7 \times 7$ two-dimensional grid as shown below. Note that we use toroidal (donut-like) boundary conditions to make the problem symmetric. We will consider approximate inference methods to approximate the node marginal moments $\mu_s := p(X_s = 1)$ in this model.

*[Figure: the 7 × 7 grid with toroidal boundary conditions and nodes numbered 1 to 49.]*
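For implementers, the toroidal boundary means every node has exactly four neighbours, which modular arithmetic captures directly. A minimal sketch of a hypothetical helper (not part of the assignment skeleton; it assumes row-major ordering so that `X[0, 0]` is the variable numbered 1, as in the skeletons below):

# Hypothetical helper (an assumption, not given by the assignment): the four
# toroidal neighbours of grid cell (i, j) on an n x n grid.
def neighbors(i, j, n=7):
    return [((i - 1) % n, j), ((i + 1) % n, j),
            (i, (j - 1) % n), (i, (j + 1) % n)]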

## Gibbs sampling

**Implementation**

Using the update equations you found in the theoretical part of the assignment, implement the Gibbs sampling algorithm (with cyclic sequential traversal of the nodes) for $\eta_{st} = 0.5$ for all edges, and $\eta_s = (-1)^s$ for all $s \in \{1, \ldots, 49\}$ (using the node ordering in the figure).
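For reference, with this factorization the single-site conditional (which the update equations from the theoretical part should match) takes a logistic form, where $N(s)$ denotes the four toroidal neighbours of node $s$:
$$
p(X_s = 1 \mid x_{-s}) = \sigma\left( \eta_s + \sum_{t \in N(s)} \eta_{st} x_t \right),
\qquad \sigma(u) := \frac{1}{1 + e^{-u}}.
$$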

def gibbs_sampling(X0, burn_in, num_epochs):
    """Performs Gibbs sampling on the UGM

    Inputs:
      X0: [7 x 7] matrix representing the initial state of the grid shown
        in the picture. X[0, 0] is the variable numbered 1.
      burn_in: [int] burn-in period
      num_epochs: [int] number of epochs (one epoch amounts to updating
        *each* node once)

    Returns:
      samples: [num_epochs x 7 x 7] tensor of generated samples for each
        of the epochs after the burn-in period. Here this corresponds to
        num_epochs matrices of size 7 x 7.
    """
    # TODO: Don't forget to include these. It's okay to hardcode values here.
    # No need to pass them as function parameters.
    eta_s = None
    eta_st = None

    # TODO
    samples = np.random.rand(num_epochs, 7, 7)

    return samples
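As orientation only, one possible shape for the inner loop implied by the logistic conditional above. This is an illustrative sketch, not the graded solution: it assumes the hypothetical `neighbors` helper from the problem-setting section, `eta_s` stored as a [7 x 7] matrix of node potentials, and `eta_st` a scalar.

# Illustrative sketch of one Gibbs epoch (cyclic sequential sweep); NOT the
# required solution. Updates are in place, so each node sees the freshest
# values of its neighbours.
def gibbs_epoch(X, eta_s, eta_st):
    for i in range(7):
        for j in range(7):
            # Logistic conditional p(X_s = 1 | x_rest) at node s = (i, j).
            field = eta_s[i, j] + eta_st * sum(X[a, b] for a, b in neighbors(i, j))
            p1 = 1.0 / (1.0 + np.exp(-field))
            X[i, j] = 1.0 * (np.random.rand() < p1)
    return X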

**Execution**

Run a *burn-in period* of 1000 epochs (where **one epoch amounts to updating each node once**). For each of the 5000 *subsequent epochs*, collect a sample vector. Use the 5000 samples to form Monte Carlo estimates $\hat{\mu}_s$ of the moments $\mathbb{E}[X_s]$ at each node. Create a $7 \times 7$ matrix of the estimated moments.

**Note:** We mentioned in class that every update of a node yields a *different* sample in theory, and that one should normally use *all* the available samples (after sufficient mixing) for a Monte Carlo estimate. In this case, that would be $49$ (one for each individual node update) times $5000$ epochs. But note that using all these samples would give almost exactly the same estimates, only differing from the boundary conditions during the first and last epoch. Do NOT use the $49 \times 5000$ different samples!
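Since each retained epoch contributes one $7 \times 7$ sample, the Monte Carlo estimate reduces to an average over the first axis of the returned tensor. A one-line sketch, assuming `samples` has the shape documented in `gibbs_sampling`:

# Per-node empirical mean of the 5000 post-burn-in samples: estimates E[X_s].
mu_hat = samples.mean(axis=0)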

# Keep this initialization scheme
X0 = 1.0 * (np.random.rand(7, 7) > 0.5)

# We run your Gibbs sampling function
samples = gibbs_sampling(X0, burn_in=1000, num_epochs=5000)

# TODO: Use the samples to perform the Monte Carlo estimates for hat{mu}_s
mu_hat = np.random.rand(7, 7) # this is dummy code!

# Do NOT change these last two lines
print(mu_hat)
show_matrix(mu_hat)

Repeat the experiment $10$ times and output a $7 \times 7$ matrix of the empirical standard deviation of your estimate at each node (this gives an idea of the variability of your estimates).

for trial in range(10):
    pass  # TODO: do stuff here

# TODO: use the result of your 10 trials to compute the empirical standard deviation
# of your estimate at each node
mu_hat_std = np.random.rand(7, 7)

# Do NOT change this last line
print(mu_hat_std)
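A minimal sketch of one way to organize the ten trials, under the assumption that each trial redraws the initialization and reruns the full burn-in-plus-sampling procedure (all names here are hypothetical, not prescribed by the skeleton):

# Illustrative sketch, not the required structure: stack ten per-trial
# moment estimates, then take the per-node standard deviation across the
# trial axis.
trial_estimates = []
for trial in range(10):
    X0_t = 1.0 * (np.random.rand(7, 7) > 0.5)
    samples_t = gibbs_sampling(X0_t, burn_in=1000, num_epochs=5000)
    trial_estimates.append(samples_t.mean(axis=0))
mu_hat_std_sketch = np.array(trial_estimates).std(axis=0)  # [7 x 7]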

## Mean field

**Implementation**

Implement the mean field updates you derived in the theoretical part for the same model. Use the same $\eta_s$ and $\eta_{st}$ as above. Recall we use the notation $q(X_s = 1) = \tau_s$. More specifically, do cyclic coordinate descent on $KL(q \,\|\, p)$, sequentially updating the parameter $\tau_s \in [0,1]$ for $s \in \{1, \ldots, 49\}$.
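For reference, with this parameterization the coordinate update your derivation should yield has the same logistic form as the Gibbs conditional, with the neighbours' parameters $\tau_t$ in place of their sampled values:
$$
\tau_s \leftarrow \sigma\left( \eta_s + \sum_{t \in N(s)} \eta_{st} \tau_t \right).
$$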

Let $d(\tau, \tau') := \frac{1}{49} \sum_{s=1}^{49} |\tau_s - \tau'_s|$ be the average $\ell_1$ distance between two parameters. Use $d(\tau^{(k-1)}, \tau^{(k)}) < 0.001$ as a stopping criterion for convergence, where $k$ counts the number of **epochs**. Compute $KL(q \,\|\, p) - \log(Z_p)$ as a function of the number of epochs, both for debugging purposes and to monitor progress.

def distance(tau, tau_prime):
    """Computes the average l_1 distance between two parameter configurations"""
    # TODO
    return 0

def kl_minus_log(tau):
    """Computes KL(q(tau) || p) - log(Z_p)"""
    # TODO
    return 0

def mean_field(tau0, dist_tol):
    """Mean field approximation for the UGM

    Inputs:
      tau0: [7 x 7] matrix representing the initial value of the parameters
        tau_s for each state of the grid shown in the picture. tau[0, 0]
        is the parameter associated to the variable numbered 1.
      dist_tol: [float] tolerance between epochs. If change in parameter
        between two subsequent *epochs* is less than dist_tol, stop.

    Returns:
      tau: [7 x 7] matrix of parameters at convergence.
      d_hist: [list] list of parameter distance between subsequent epochs.
        d_hist[0] = d(tau_0, tau_1) and so on.
      kl_hist: [list] list containing KL(q||p) - log(Z_p).
        kl_hist[0] = KL(q(tau_0)||p) - log(Z_p) and so on.
    """
    # TODO: Don't forget to include these. It's okay to hardcode values here.
    # No need to pass them as function parameters.
    eta_s = None
    eta_st = None

    # TODO
    tau = tau0
    d_hist = []
    kl_hist = [kl_minus_log(tau)]

    return tau, d_hist, kl_hist

**Execution**

Run your algorithm and plot the mean field estimated moments. Plot the behavior of $d(\tau^{(k-1)}, \tau^{(k)})$ and $KL(q(\tau^{(k)}) \,\|\, p) - \log(Z_p)$.

# Keep this initialization for reproducibility
tau0 = 0.5 * np.ones((7, 7))

# We run your mean field function
tau, d_hist, kl_hist = mean_field(tau0, dist_tol=0.001)

# Do NOT change these last lines
print(tau)
show_matrix(tau)
plot_list(d_hist, 'Distance')
plot_list(kl_hist, 'KL(q||p) - log(Z_p)')

Try $5$ different initializations for $\tau$ and compute $d(\hat{\tau}_s, \hat{\mu}_s)$ between the mean field estimated moments $\hat{\tau}_s$ and the Gibbs estimates $\hat{\mu}_s$.

for trial in range(5):
    # 1. Random tau init
    # 2. Run mean_field with said initialization
    # 3. Print distance between final (mean field) tau and (Gibbs estimate) hat{mu}
    pass

**Question:** Is the mean field a good approximation here? Does it get stuck in different local minima?

**Answer:**
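As a closing aside, a minimal sketch of the two helpers and one coordinate-descent sweep under the logistic update above. Names and structure are illustrative assumptions (including the hypothetical `neighbors` helper from the problem-setting section), not the graded solution:

# Illustrative sketch, not the required solution.
def distance_sketch(tau, tau_prime):
    # Average l_1 distance over the 49 parameters.
    return np.abs(tau - tau_prime).mean()

def mean_field_epoch(tau, eta_s, eta_st):
    # One cyclic sweep, updating each tau_s to its logistic fixed point.
    for i in range(7):
        for j in range(7):
            field = eta_s[i, j] + eta_st * sum(tau[a, b] for a, b in neighbors(i, j))
            tau[i, j] = 1.0 / (1.0 + np.exp(-field))
    return tau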
