
COMP2610/COMP6261
Tutorial 5 Sample Solutions

Tutorial 5: Probabilistic inequalities and Mutual Information

Young Lee and Bob Williamson
Tutors: Debashish Chakraborty and Zakaria Mhammedi

Week 5 (21st–25th August), Semester 2, 2017

1. Consider a discrete variable X taking on values from the set X. Let p_i be the probability of each state, with
i = 1, . . . , |X|. Denote the vector of probabilities by p. We saw in lectures that the entropy of X satisfies:

H(X) ≤ log|X|,

with equality if and only if p_i = 1/|X| for all i, i.e. p is uniform. Prove the above statement using Gibbs' inequality,

which says

Σ_{i=1}^{|X|} p_i log2(p_i / q_i) ≥ 0

for any probability distributions p, q over |X| outcomes, with equality if and only if p = q.
Solution.
Gibbs' inequality tells us that for any two probability vectors p = (p_1, . . . , p_|X|) and q = (q_1, . . . , q_|X|):

Σ_{i=1}^{|X|} p_i log(p_i / q_i) ≥ 0

with equality if and only if p = q. If we take q to be the vector representing the uniform distribution
q_1 = . . . = q_|X| = 1/|X|, then we get

0 ≤ Σ_{i=1}^{|X|} p_i log(p_i / (1/|X|)) = Σ_{i=1}^{|X|} p_i log p_i + Σ_{i=1}^{|X|} p_i log|X| = −H(p) + log|X|,

with equality if and only if p is the uniform distribution. Moving H(p) to the other side gives the inequality H(p) ≤ log|X|.
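This is not part of the original solution, but the argument is easy to verify numerically. The sketch below (plain Python, standard library only) checks Gibbs' inequality and the entropy bound on a hypothetical 4-outcome distribution:

```python
import math

def entropy(p, base=2):
    """Shannon entropy H(p) = -sum p_i log p_i (with 0 log 0 := 0)."""
    return -sum(pi * math.log(pi, base) for pi in p if pi > 0)

def gibbs(p, q, base=2):
    """Left-hand side of Gibbs' inequality: sum p_i log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)

n = 4
uniform = [1 / n] * n
skewed = [0.7, 0.1, 0.1, 0.1]  # an arbitrary non-uniform example

# Gibbs' inequality: strictly positive for p != q, zero for p = q
assert gibbs(skewed, uniform) > 0
assert abs(gibbs(uniform, uniform)) < 1e-12

# H(p) <= log|X|, with equality only for the uniform distribution
assert entropy(skewed) < math.log2(n)
assert abs(entropy(uniform) - math.log2(n)) < 1e-12
```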


2. Let X be a discrete random variable. Show that the entropy of a function of X is less than or equal to the entropy
of X by justifying the following steps:

H(X, g(X)) =(a) H(X) + H(g(X)|X) =(b) H(X);

H(X, g(X)) =(c) H(g(X)) + H(X|g(X)) ≥(d) H(g(X)).

Thus H(g(X)) ≤ H(X).
Solution.

(a) This uses the chain rule of entropy,
i.e. H(X,Y) = H(X) + H(Y | X) with Y = g(X).

(b) Given X, we can determine g(X), since it is fixed by being a function of X. This means no uncertainty remains
about g(X) when X is given. Thus H(g(X) | X) = 0, since

H(g(X) | X) = Σ_x p(x) H(g(X) | X = x) and every term H(g(X) | X = x) = 0.

(c) This also uses the chain rule of entropy,
i.e. H(X,Y) = H(Y) + H(X | Y) with Y = g(X).

(d) In this case, H(X | g(X)) ≥ 0, since the conditional entropy of a discrete random variable is non-negative.
If g is a one-to-one mapping on X, then H(X | g(X)) = 0 and H(X, g(X)) = H(g(X)).

Combining parts (b) and (d), we obtain H(X) ≥ H(g(X)).
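As a numerical illustration (a Python sketch, not part of the original solution), take X uniform on {0, 1, 2, 3} and the hypothetical non-injective function g(x) = x mod 2; then H(X) = 2 bits and H(g(X)) = 1 bit, consistent with H(g(X)) ≤ H(X):

```python
import math
from collections import Counter

def entropy(dist):
    """Entropy in bits of a dict {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# X uniform on {0,1,2,3}; g(x) = x % 2 is a (non-injective) function of X
p_x = {x: 0.25 for x in range(4)}
g = lambda x: x % 2

# Push the distribution of X through g to get the distribution of g(X)
p_gx = Counter()
for x, p in p_x.items():
    p_gx[g(x)] += p

h_x = entropy(p_x)    # 2 bits
h_gx = entropy(p_gx)  # 1 bit
assert h_gx <= h_x
```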

3. Random variables X, Y, Z are said to form a Markov chain in that order (denoted by X → Y → Z) if their
joint probability distribution can be written as:

p(X, Y, Z) = p(X) p(Y|X) p(Z|Y)

(a) Suppose (X, Y, Z) forms a Markov chain. Is it possible for I(X;Y) = I(X;Z)? If yes, give an example of
X, Y, Z where this happens. If no, explain why not.

(b) Suppose (X, Y, Z) does not form a Markov chain. Is it possible for I(X;Y) ≥ I(X;Z)? If yes, give an
example of X, Y, Z where this happens. If no, explain why not.

Solution.

(a) Yes; pick Z = Y.
Reason: The data processing inequality guarantees I(X;Y) ≥ I(X;Z). Here we want to verify that
equality is possible. If we look at the proof of the data processing inequality, we just need to find a Z where
I(X;Y|Z) = 0.
For Z = Y, intuitively, conditioning on Z, the reduction in uncertainty in X when we know Y is zero,
because Z already tells us everything that Y can. Formally, I(X;Y) = I(X;Z) because the random
variables Y and Z have the same joint distribution with X. Note: to formally check that Z is conditionally independent
of X given Y, we can check p(Z = z, X = x | Y = y) = p(Z = z | Y = y) p(X = x | Y = y) for all
possible x, y, z. The reason is that the left and right hand sides are both zero when y ≠ z; and when y = z, they
both equal p(X = x | Y = y), as p(Z = z | X = x, Y = y) = 1 in this case.
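The equality I(X;Y) = I(X;Z) for Z = Y can be checked numerically. The sketch below (Python, with an arbitrary example joint distribution) computes mutual information directly from the definition:

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits from a dict {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# An arbitrary joint distribution of (X, Y); with Z = Y, the pair (X, Z)
# has exactly the same joint distribution as (X, Y).
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
p_xz = dict(p_xy)

assert abs(mutual_information(p_xy) - mutual_information(p_xz)) < 1e-12
```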

(b) Yes; pick X, Z independent, and let Y = X + Z (assuming the outcomes are numeric).
Reason: Z is not conditionally independent of X given Y; intuitively, knowing X + Z and X tells us what
Z is. So (X, Y, Z) does not form a Markov chain. However, since X, Z are independent, I(X;Z) = 0.
Since mutual information is non-negative, I(X;Y) ≥ 0 = I(X;Z).
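The same example can be checked numerically. The Python sketch below takes X and Z to be independent fair bits (one concrete choice, not the only one) and sets Y = X + Z:

```python
import math
from collections import Counter

def mi(joint):
    """I(A;B) in bits from a Counter/dict {(a, b): probability}."""
    pa, pb = Counter(), Counter()
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

# X, Z independent fair bits; Y = X + Z
p_xy, p_xz = Counter(), Counter()
for x in (0, 1):
    for z in (0, 1):
        p = 0.25
        p_xy[(x, x + z)] += p
        p_xz[(x, z)] += p

assert abs(mi(p_xz)) < 1e-12  # I(X;Z) = 0, by independence
assert mi(p_xy) > 0           # I(X;Y) > 0, so I(X;Y) >= I(X;Z)
```

Here I(X;Y) works out to exactly 0.5 bits: observing Y = 0 or Y = 2 fully determines X, while Y = 1 reveals nothing.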


4. If X → Y → Z, then show that

(a) I(X;Z) ≤ I(X;Y)
(b) I(X;Y|Z) ≤ I(X;Y)

Solution. See the proof given in Lecture 9.

5. A coin is known to land heads with probability 1/5. The coin is flipped N times for some even integer N.

(a) Using Markov's inequality, provide a bound on the probability of observing N/2 or more heads.

(b) Using Chebyshev's inequality, provide a bound on the probability of observing N/2 or more heads. Express
your answer in terms of N.

(c) For N ∈ {2, 4, . . . , 20}, in a single plot, show the bounds from parts (a) and (b), as well as the exact
probability of observing N/2 or more heads.

Solution.
Let X_1, . . . , X_N represent the N flips as independent Bernoulli random variables, where X_i = 1 represents
observing a head on flip i and X_i = 0 represents observing a tail. Let X̄_N = (1/N) Σ_{i=1}^N X_i. Then the
probability of observing N/2 or more heads can be expressed as p(X̄_N ≥ 1/2), and p(X_i = 1) = 1/5 for each i.

(a) Using Markov's inequality,

p(X̄_N ≥ 1/2) ≤ E[X̄_N] / (1/2) = ((1/N) Σ_{i=1}^N E[X_i]) / (1/2) = (1/5) / (1/2) = 2/5,

so

p(X̄_N ≥ 1/2) = p(Σ_{i=1}^N X_i ≥ N/2) ≤ 2/5.

(b) We need the variance of a Bernoulli random variable, Var(X) = p(1 − p):

Var[X_i] = (1/5)(1 − 1/5) = 4/25.

Using the definition of variance and its properties,

Var(X̄_N) = Var[(1/N) Σ_{i=1}^N X_i] = (Σ_{i=1}^N Var[X_i]) / N² = N(4/25) / N² = 4/(25N).

Chebyshev's inequality states

p(|X̄_N − E[X̄_N]| ≥ ε) ≤ Var(X̄_N) / ε².

Since E[X̄_N] = 1/5, the event X̄_N ≥ 1/2 implies |X̄_N − 1/5| ≥ 3/10, so taking ε = 3/10:

p(|X̄_N − 1/5| ≥ 3/10) ≤ (4/(25N)) / (3/10)² = 16/(9N)

p(X̄_N ≥ 1/2) ≤ 16/(9N).
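Both bounds can be checked against the exact binomial probability for every even N from 2 to 20 (a Python sketch using only the standard library; the exact probability comes from the binomial formula derived in part (c)):

```python
import math

p = 1 / 5  # probability of heads
for N in range(2, 21, 2):
    # Exact probability of N/2 or more heads, Binomial(N, 1/5)
    exact = sum(math.comb(N, k) * p**k * (1 - p)**(N - k)
                for k in range(N // 2, N + 1))
    markov = 2 / 5          # part (a): constant in N
    chebyshev = 16 / (9 * N)  # part (b): decays like 1/N
    assert exact <= markov
    assert exact <= chebyshev
```

For small N the Markov bound is tighter; from roughly N ≥ 10 the Chebyshev bound dominates, since it shrinks with N while Markov's stays at 2/5.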

(c) The exact probability of k heads is given by the binomial distribution:

P(X = k) = (N choose k) (1/5)^k (4/5)^(N−k)


So, the probability of seeing N/2 or more heads is

P(X ≥ N/2) = Σ_{k=N/2}^{N} P(X = k) = Σ_{k=N/2}^{N} (N choose k) (1/5)^k (4/5)^(N−k).

Another way to calculate the exact probability is

p(X̄_N ≥ 1/2) = 1 − p(X̄_N < 1/2).

This can be done in Matlab using 1 - binocdf(floor(0.5.*n - 0.5), n, 0.2). Here floor(0.5.*n - 0.5)
brings n down to an integer strictly less than n/2 for each value of n; for example, n = 10 gives a
floor(0.5.*n - 0.5) value of 4, which is what we want.

The code for the plot in part (c) is included below:

n = 2:2:20;

% Markov inequality
y_m = 2/5;

% Chebyshev inequality
y_c = 16 ./ (9 .* n);

% Exact probabilities
y_e = 1 - binocdf(floor(0.5.*n - 0.5), n, 0.2);

plot(n, y_c, 'k')
hold on;
plot([2 20], [y_m y_m], 'm')
plot(n, y_e, 'b')
set(gca, 'fontsize', 14)

title('Markov''s bound, Chebyshev''s bound and exact probability of observing more than N/2 heads')
ylabel('Probability')
xlabel('Number of coin flips (N)')
legend('Chebyshev', 'Markov', 'Exact');
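For readers without Matlab, the same numbers can be computed in Python with the standard library (plotting omitted; binom_cdf below is a hand-rolled stand-in for binocdf):

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); mirrors Matlab's binocdf."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

ns = range(2, 21, 2)
y_m = 2 / 5                                # Markov bound (constant)
y_c = [16 / (9 * n) for n in ns]           # Chebyshev bound
y_e = [1 - binom_cdf(n // 2 - 1, n, 0.2)   # exact P(N/2 or more heads)
       for n in ns]

for n, c, e in zip(ns, y_c, y_e):
    print(f"N={n:2d}  Markov={y_m:.3f}  Chebyshev={c:.3f}  exact={e:.4f}")
```

Note that for even n, floor(0.5*n - 0.5) equals n/2 - 1, which is why n // 2 - 1 appears as the CDF argument.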
