1 Generator: real inference
The model has the following form:
Y = f(Z; W) + \epsilon,   (1)
Z \sim \mathrm{N}(0, I_d), \quad \epsilon \sim \mathrm{N}(0, \sigma^2 I_D), \quad d < D.   (2)
f(Z; W) maps the latent factors into the image Y, where W collects all the connection weights and bias terms of the ConvNet.
Adopting the language of the EM algorithm, the complete-data model is given by
\log p(Y, Z; W) = \log\left[ p(Z)\, p(Y \mid Z, W) \right]   (3)
= -\frac{1}{2\sigma^2} \| Y - f(Z; W) \|^2 - \frac{1}{2} \| Z \|^2 + \mathrm{const}.   (4)
The observed-data model is obtained by integrating out Z: p(Y; W) = \int p(Z)\, p(Y \mid Z, W)\, dZ. The posterior distribution of Z is given by p(Z \mid Y, W) = p(Y, Z; W)/p(Y; W) \propto p(Z)\, p(Y \mid Z, W), viewed as a function of Z.
We want to maximize the observed-data log-likelihood, which is L(W) = \sum_{i=1}^{n} \log p(Y_i; W). The gradient of L(W) can be calculated according to the following well-known fact that underlies the EM algorithm:
\frac{\partial}{\partial W} \log p(Y; W) = \frac{1}{p(Y; W)} \frac{\partial}{\partial W} \int p(Y, Z; W)\, dZ   (5)
= \mathrm{E}_{p(Z \mid Y, W)} \left[ \frac{\partial}{\partial W} \log p(Y, Z; W) \right].   (6)
The expectation with respect to p(Z \mid Y, W) can be approximated by drawing samples from p(Z \mid Y, W) and then computing the Monte Carlo average.
The Langevin dynamics for sampling Z \sim p(Z \mid Y, W) iterates
Z_{\tau+1} = Z_\tau + \frac{s^2}{2} \frac{\partial}{\partial Z} \log p(Z_\tau \mid Y, W) + s\, U_\tau,   (7)
where \tau denotes the time step of the Langevin sampling, s is the step size, and U_\tau denotes a random vector that follows \mathrm{N}(0, I_d).
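As a concrete illustration of equation (7), the following sketch performs one Langevin update of Z using PyTorch autograd on the log joint density. This is not the provided GenNet.py skeleton (which may use a different framework); the names generator, sigma, and step_size are assumptions made for the example.

```python
# Illustrative sketch only: one Langevin update of the latent factors Z under
# p(Z | Y, W). `generator`, `sigma`, and `step_size` are assumed names.
import torch

def langevin_step_z(z, y, generator, sigma=0.3, step_size=0.1):
    """One step of equation (7): z <- z + (s^2/2) d/dz log p(z | y, W) + s U."""
    z = z.clone().detach().requires_grad_(True)
    y_hat = generator(z)                                   # f(Z; W)
    # log p(Y, Z; W) up to an additive constant:
    #   -||Y - f(Z; W)||^2 / (2 sigma^2) - ||Z||^2 / 2
    log_joint = -((y - y_hat) ** 2).sum() / (2 * sigma ** 2) - (z ** 2).sum() / 2
    grad = torch.autograd.grad(log_joint, z)[0]            # d/dz log p(z | y, W)
    noise = torch.randn_like(z)                            # U ~ N(0, I_d)
    return (z + 0.5 * step_size ** 2 * grad + step_size * noise).detach()
```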
The stochastic gradient algorithm can be used for learning, where in each iteration, for each Z_i, only a single copy of Z_i is sampled from p(Z_i \mid Y_i, W) by running a finite number of steps of Langevin dynamics starting from the current value of Z_i, i.e., a warm start. With {Z_i} sampled in this manner, we can update the parameters W based on the gradient L'(W), whose Monte Carlo approximation is:
L'(W) \approx \sum_{i=1}^{n} \frac{\partial}{\partial W} \log p(Y_i, Z_i; W)   (8)
= \sum_{i=1}^{n} \frac{\partial}{\partial W} \left[ -\frac{1}{2\sigma^2} \| Y_i - f(Z_i; W) \|^2 \right]   (9)
= \sum_{i=1}^{n} \frac{1}{\sigma^2} \big( Y_i - f(Z_i; W) \big) \frac{\partial}{\partial W} f(Z_i; W).   (10)
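Note that equation (10) is simply the gradient of the negative squared reconstruction error with {Z_i} held fixed, so the learning step can be implemented as an ordinary gradient step on that loss. The sketch below assumes the same hypothetical generator and sigma as above, together with a standard optimizer object; it is not the provided code.

```python
# Illustrative sketch of the learning step implied by equation (10): with the
# sampled {Z_i} held fixed, descending the reconstruction loss ascends L(W).
import torch

def learning_step_w(z_batch, y_batch, generator, optimizer, sigma=0.3):
    """Gradient step on the reconstruction loss; equivalent to ascending (10)."""
    optimizer.zero_grad()
    recon = generator(z_batch)                              # f(Z_i; W)
    loss = ((y_batch - recon) ** 2).sum() / (2 * sigma ** 2)
    loss.backward()                                         # d loss / dW = -L'(W)
    optimizer.step()                                        # descend loss = ascend L(W)
    return loss.item()
```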
Algorithm 1 describes the details of the learning and sampling algorithm.
Algorithm 1 Generator: real inference
Input:
- training examples {Y_i, i = 1, ..., n},
- number of Langevin steps l,
- number of learning iterations T.
Output:
- learned parameters W,
- inferred latent factors {Z_i, i = 1, ..., n}.
1: Let t ← 0, initialize W.
2: Initialize Z_i, for i = 1, ..., n.
3: repeat
4: Inference step: For each i, run l steps of Langevin dynamics to sample Z_i ∼ p(Z_i | Y_i, W) with warm start, i.e., starting from the current Z_i, each step following equation (7).
5: Learning step: Update W ← W + γ_t L'(W), where L'(W) is computed according to equation (10), with learning rate γ_t.
6: Let t ← t + 1.
7: until t = T
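A minimal end-to-end sketch of Algorithm 1 follows, assuming the two helper functions sketched above and a small top-down ConvNet generator mapping z ∈ R^2 to an image tensor. Adam is used here purely for convenience in place of the plain learning rate γ_t; the provided GenNet.py may be organized quite differently.

```python
# Minimal sketch of Algorithm 1, reusing langevin_step_z and learning_step_w
# from the sketches above. Not the provided GenNet.py skeleton.
import torch

def train_generator(y_train, generator, n_iter=200, n_langevin=20,
                    step_size=0.1, lr=1e-4, sigma=0.3):
    """Alternates inference (step 4) and learning (step 5) as in Algorithm 1."""
    n = y_train.shape[0]
    z = torch.randn(n, 2)                                   # step 2: initialize Z_i, d = 2
    optimizer = torch.optim.Adam(generator.parameters(), lr=lr)
    losses = []
    for t in range(n_iter):                                 # steps 3-7
        for _ in range(n_langevin):                         # step 4: warm-start Langevin inference
            z = langevin_step_z(z, y_train, generator, sigma, step_size)
        losses.append(learning_step_w(z, y_train, generator, optimizer, sigma))  # step 5
    return z, losses
```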
1.1 TO DO
For the lion-tiger category, learn a model with a 2-dimensional latent factor vector. Fill in the blank parts of ./GenNet/GenNet.py. Show:
- Reconstructed images of the training images, using the z inferred from the training images.
- Randomly generated images, using randomly sampled z.
- Generated images with latent factors linearly interpolated from (−2, −2) to (2, 2). For example, interpolate 8 points from −2 to 2 in each dimension of z; you will then get an 8 × 8 panel of images, in which you should be able to see the tigers gradually change into lions (see the grid-construction sketch after this list).
- Plot of loss over iterations.
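One simple way to build the latent grid for the interpolation panel is sketched below, assuming the intended range is −2 to 2 in each latent dimension; generator is the same hypothetical model as above.

```python
# Sketch of the 8 x 8 grid of latent vectors for the interpolation panel,
# assuming the range -2 to 2 in each of the two latent dimensions.
import torch

ticks = torch.linspace(-2.0, 2.0, steps=8)                            # 8 points per dimension
z_grid = torch.stack([torch.stack([a, b]) for a in ticks for b in ticks])   # shape (64, 2)
# images = generator(z_grid)   # decode, then tile the 64 outputs into an 8 x 8 panel
```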
2 Descriptor: real sampling
The descriptor model is as follows:
p(Y; \theta) = \frac{1}{Z(\theta)} \exp\left[ f(Y; \theta) \right] p_0(Y),   (11)
where p_0(Y) is a reference distribution such as Gaussian white noise,
p_0(Y) \propto \exp\left( -\frac{\|Y\|^2}{2\sigma^2} \right).   (12)
The scoring function f(Y; \theta) is defined by a bottom-up ConvNet whose parameters are denoted by \theta. The normalizing constant Z(\theta) = \int \exp[f(Y; \theta)]\, p_0(Y)\, dY is analytically intractable. The energy function is
\mathcal{E}(Y; \theta) = -f(Y; \theta) + \frac{\|Y\|^2}{2\sigma^2}.   (13)
p(Y; \theta) is an exponential tilting of p_0.
Suppose we observe training examples {Y_i, i = 1, ..., n} from an unknown data distribution P_{\mathrm{data}}(Y). Maximum likelihood learning seeks to maximize the log-likelihood function
L(\theta) = \frac{1}{n} \sum_{i=1}^{n} \log p(Y_i; \theta).   (14)
If the sample size n is large, the maximum likelihood estimator minimizes the Kullback-Leibler divergence \mathrm{KL}(P_{\mathrm{data}} \,\|\, p_\theta) from the data distribution P_{\mathrm{data}} to the model distribution p_\theta. The gradient of L(\theta) is
L'(\theta) = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial}{\partial \theta} f(Y_i; \theta) - \mathrm{E}_\theta \left[ \frac{\partial}{\partial \theta} f(Y; \theta) \right],   (15)
where \mathrm{E}_\theta denotes the expectation with respect to p(Y; \theta). The key to the above identity is that \frac{\partial}{\partial \theta} \log Z(\theta) = \mathrm{E}_\theta \left[ \frac{\partial}{\partial \theta} f(Y; \theta) \right].
The expectation in equation (15) is analytically intractable and has to be approximated by MCMC, such as Langevin dynamics, which iterates the following step:
Y_{\tau+1} = Y_\tau - \frac{s^2}{2} \frac{\partial}{\partial Y} \mathcal{E}(Y_\tau; \theta) + s\, U_\tau,   (16)
where \tau indexes the time steps of the Langevin dynamics, s is the step size, and U_\tau \sim \mathrm{N}(0, I) is Gaussian white noise. The Langevin dynamics relaxes Y to a low-energy region, while the noise term provides randomness and variability. A Metropolis-Hastings step may be added to correct for the finite step size s. We can also use Hamiltonian Monte Carlo for sampling from the generative ConvNet.
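As an illustration of equation (16), the sketch below performs one Langevin update of a synthesized image, obtaining the energy gradient by autograd. It is not the provided DesNet.py; descriptor is an assumed name for the bottom-up ConvNet computing f(Y; θ), and sigma and step_size are illustrative values.

```python
# Illustrative sketch only: one Langevin update of a synthesized image under
# the descriptor. `descriptor`, `sigma`, and `step_size` are assumed names.
import torch

def langevin_step_y(y, descriptor, sigma=1.0, step_size=0.002):
    """One step of equation (16): y <- y - (s^2/2) dE/dY + s U."""
    y = y.clone().detach().requires_grad_(True)
    # E(Y; theta) = -f(Y; theta) + ||Y||^2 / (2 sigma^2), summed over the batch
    energy = -descriptor(y).sum() + (y ** 2).sum() / (2 * sigma ** 2)
    grad = torch.autograd.grad(energy, y)[0]                # dE/dY
    noise = torch.randn_like(y)                             # U ~ N(0, I)
    return (y - 0.5 * step_size ** 2 * grad + step_size * noise).detach()
```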
We can run n parallel chains of Langevin dynamics according to equation (16) to obtain the synthesized examples {\tilde{Y}_i, i = 1, ..., n}. The Monte Carlo approximation to L'(\theta) is
L'(\theta) \approx \frac{1}{n} \sum_{i=1}^{n} \frac{\partial}{\partial \theta} f(Y_i; \theta) - \frac{1}{n} \sum_{i=1}^{n} \frac{\partial}{\partial \theta} f(\tilde{Y}_i; \theta),   (17)
which is used to update \theta.
To make the Langevin sampling easier, we use mean images of the training images as the starting point for sampling. That is, we down-sample each training image to a 1 × 1 patch and up-sample this patch back to the size of the training image. We use a cold start for the Langevin sampling, i.e., at each learning iteration, sampling starts from the mean images.
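A possible implementation of this mean-image initialization is sketched below, assuming the training images are stored as an (n, c, h, w) tensor named y_train; the actual skeleton may do this differently.

```python
# Sketch of the cold-start initialization: down-sample each training image to a
# 1 x 1 patch (its per-channel mean) and up-sample it back to the original size.
import torch.nn.functional as F

def mean_images(y_train):
    """y_train: (n, c, h, w) tensor of training images."""
    n, c, h, w = y_train.shape
    patches = F.adaptive_avg_pool2d(y_train, output_size=1)    # per-channel means, (n, c, 1, 1)
    return patches.expand(n, c, h, w).contiguous()              # constant images at original size
```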
Algorithm 2 describes the details of the learning and sampling algorithm.
Algorithm 2 Descriptor: real sampling
Input:
- training examples {Y_i, i = 1, ..., n},
- number of Langevin steps l,
- number of learning iterations T.
Output:
- estimated parameters θ,
- synthesized examples {Ỹ_i, i = 1, ..., n}.
1: Let t ← 0, initialize θ.
2: repeat
3: For i = 1, ..., n, initialize Ỹ_i to be the mean image of Y_i.
4: Run l steps of Langevin dynamics to evolve Ỹ_i, each step following equation (16).
5: Update θ_{t+1} = θ_t + γ_t L'(θ_t), with learning rate γ_t, where L'(θ_t) is computed according to equation (17).
6: Let t ← t + 1.
7: until t = T
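Putting the pieces together, a minimal sketch of Algorithm 2 is given below, reusing the langevin_step_y and mean_images helpers sketched above. As before, Adam stands in for the plain learning rate γ_t, and none of this is the provided DesNet.py.

```python
# Minimal sketch of Algorithm 2, reusing langevin_step_y and mean_images
# from the sketches above. Not the provided DesNet.py skeleton.
import torch

def train_descriptor(y_train, descriptor, n_iter=100, n_langevin=10,
                     step_size=0.002, lr=1e-3, sigma=1.0):
    """Alternates cold-start synthesis and the update of theta as in Algorithm 2."""
    optimizer = torch.optim.Adam(descriptor.parameters(), lr=lr)
    losses = []
    for t in range(n_iter):
        y_syn = mean_images(y_train)                         # step 3: cold start from mean images
        for _ in range(n_langevin):                          # step 4: Langevin synthesis
            y_syn = langevin_step_y(y_syn, descriptor, sigma, step_size)
        optimizer.zero_grad()
        # step 5: the negative gradient of this loss is the estimate in equation (17),
        # so gradient descent on it ascends L(theta)
        loss = descriptor(y_syn).mean() - descriptor(y_train).mean()
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return y_syn, losses
```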
2.1 TO DO
For the egret category, learn a descriptor model. Fill in the blank parts of ./DesNet/DesNet.py. Show:
- Synthesized images.
- Plot of training loss over iterations.
