(EE270) Large Scale Matrix Computation, Optimization and Learning - HW #1


1. Linear Algebra

Are the following statements true or false? If true, prove it; if false, show a counterexample.

(a) The inverse of a symmetric matrix is itself symmetric.

(b) All $2 \times 2$ orthogonal matrices have the following form:

$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$.

(c) Let A be a symmetric positive semi-definite matrix. Then A can be written as $A = CC^T$ for some matrix C.
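As an informal numerical aid (my own addition, not part of the assignment), the snippet below spot-checks statement (a) on a random symmetric matrix; a single random check proves nothing, but it can suggest whether a statement is plausibly true before attempting a proof or hunting for a counterexample.

import numpy as np

rng = np.random.default_rng(0)
n = 5
S = rng.standard_normal((n, n))
A = S + S.T                          # a random symmetric (almost surely invertible) matrix

# Statement (a): is the inverse of a symmetric matrix itself symmetric?
A_inv = np.linalg.inv(A)
print(np.allclose(A_inv, A_inv.T))   # expected: True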

2. Divide and Conquer Matrix Multiplication

If $A$ is a matrix, then $A^2 = AA$ is the square of $A$.

  • Show that five multiplications are sufficient to compute the square of a $2 \times 2$ matrix

$A = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix}$, where $a_1, a_2, a_3, a_4$ are scalars.

  • Generalize the formula in part (a) to a $2 \times 2$ block matrix

$A = \begin{bmatrix} A_1 & A_2 \\ A_3 & A_4 \end{bmatrix}$, where $A_1, A_2, A_3, A_4$ are arbitrary matrices.

  • Instead of using the classical matrix multiplication (three-loop) algorithm for computing $A^2$, we may apply the block formula you derived in (b) to reduce a $2n \times 2n$ problem to several $n \times n$ computations, which can be tackled with classical matrix multiplication. Compare the total number of arithmetic operations. Generate random $2n \times 2n$ matrices $A$ and plot the wall-clock time of the classical matrix multiplication algorithm and of the algorithm using the formula in (b) to compute $A^2$, for $n = 4, \ldots, 10000$ (or as large as your system memory allows). You can use standard packages for matrix multiplication, e.g., numpy.matmul.
  • Show that if you have an algorithm for squaring an $n \times n$ matrix in $O(n^c)$ time, then you can use it to multiply any two arbitrary $n \times n$ matrices in $O(n^c)$ time. [Hint: Consider multiplying two matrices $A$ and $B$. Can you define a matrix whose square contains $AB$?] A small numerical illustration of one such construction is sketched below.
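The following minimal numpy sketch (my own illustration, not part of the assignment) squares the block matrix $M = \begin{bmatrix} 0 & A \\ B & 0 \end{bmatrix}$, whose square is $\begin{bmatrix} AB & 0 \\ 0 & BA \end{bmatrix}$, so its top-left block recovers the product $AB$; this is one standard construction for the hint above, not necessarily the one intended in the lecture.

import numpy as np

# Hedged sketch: recover a general product A @ B from a single matrix squaring.
n = 5
A = np.random.randn(n, n)
B = np.random.randn(n, n)

# Form the 2n x 2n block matrix M = [[0, A], [B, 0]].
M = np.block([[np.zeros((n, n)), A],
              [B, np.zeros((n, n))]])

# M @ M equals [[A @ B, 0], [0, B @ A]], so the top-left block is A @ B.
M2 = M @ M
print(np.allclose(M2[:n, :n], A @ B))  # expected: True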
3. Probability (30 pts)

  • Random variables X and Y have a joint distribution p(x, y). Prove the following results. You can assume continuous distributions for simplicity.
    i. $E[X] = E_Y[E_X[X \mid Y]]$
    ii. $E[I[X \in C]] = P(X \in C)$, where $I[X \in C]$ is the indicator function[1] of an arbitrary set C.
    iii. $\mathrm{var}[X] = E_Y[\mathrm{var}_X[X \mid Y]] + \mathrm{var}_Y[E_X[X \mid Y]]$
    iv. If X and Y are independent, then $E[XY] = E[X]E[Y]$.
    v. If X and Y take values in $\{0, 1\}$ and $E[XY] = E[X]E[Y]$, then X and Y are independent.
  • Show that the approximate randomized counting algorithm described in Lemma 1 of the Lecture 2 slides (page 14) is unbiased:

$E[\hat{n}] = n$, (1)

where $\hat{n}$ denotes the estimate returned by the algorithm.

  • Prove the variance formula in Lemma 2 of the Lecture 2 slides (page 38) for approximate matrix multiplication $AB \approx CR$:

$E\big[\|AB - CR\|_F^2\big] = \sum_{k=1}^{d} \frac{\|A^{(k)}\|_2^2\,\|B_{(k)}\|_2^2}{c\,p_k} - \frac{1}{c}\|AB\|_F^2$, (2)

where $\{p_k\}_{k=1}^{d}$ are the sampling probabilities, $c$ is the number of samples, $A^{(k)}$ denotes the $k$-th column of $A$, and $B_{(k)}$ denotes the $k$-th row of $B$.
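To build intuition for the column/row sampling estimator $CR$, here is a minimal numpy sketch; it assumes the standard scheme in which index $k$ is sampled with probability $p_k$ and the sampled column/row pair is rescaled by $1/\sqrt{c\,p_k}$ (the lecture's exact notation may differ), and it checks empirically that $E[CR] = AB$.

import numpy as np

rng = np.random.default_rng(0)
n, d, m, c = 40, 30, 20, 10           # A is n x d, B is d x m, c samples
A = rng.standard_normal((n, d))
B = rng.standard_normal((d, m))

# Norm-based sampling probabilities over the d column/row indices (assumed scheme).
p = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
p = p / p.sum()

def cr_estimate():
    # Sample c indices i.i.d. with probabilities p and rescale by 1/sqrt(c * p_k),
    # so that E[C @ R] = A @ B.
    idx = rng.choice(d, size=c, p=p)
    C = A[:, idx] / np.sqrt(c * p[idx])
    R = B[idx, :] / np.sqrt(c * p[idx])[:, None]
    return C @ R

avg = np.mean([cr_estimate() for _ in range(5000)], axis=0)
print(np.max(np.abs(avg - A @ B)))    # small, up to Monte Carlo error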

4. Positive (Semi-)Definite Matrices

Let A be a real, symmetric $d \times d$ matrix. We say A is positive semi-definite (PSD) if, for all $x \in \mathbb{R}^d$, $x^T A x \ge 0$. We say A is positive definite (PD) if, for all $x \ne 0$, $x^T A x > 0$. We write $A \succeq 0$ when A is PSD, and $A \succ 0$ when A is PD.

The spectral theorem says that every real symmetric matrix A can be expressed as $A = U \Lambda U^T$, where U is a $d \times d$ matrix such that $U U^T = U^T U = I$ (called an orthogonal matrix), and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$. Multiplying on the right by U we see that $AU = U\Lambda$. If we let $u_i$ denote the $i$-th column of U, we have $A u_i = \lambda_i u_i$ for each $i$. This expression reveals that the $\lambda_i$ are eigenvalues of A, and the corresponding columns $u_i$ are eigenvectors associated with $\lambda_i$.

  • A is PSD iff $\lambda_i \ge 0$ for each $i$.
  • A is PD iff $\lambda_i > 0$ for each $i$. Hint: Use the following representation

$U \Lambda U^T = \sum_{i=1}^{d} \lambda_i u_i u_i^T.$
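As a numerical companion to this problem (my own sketch, not part of the assignment), the snippet below uses numpy.linalg.eigh to compute the spectral decomposition of a random symmetric PSD matrix and checks that its eigenvalues are nonnegative and that the quadratic form is nonnegative for a random vector.

import numpy as np

rng = np.random.default_rng(1)
d = 6
C = rng.standard_normal((d, d))
A = C @ C.T                                      # symmetric PSD by construction

# Spectral decomposition A = U diag(lam) U^T for the symmetric matrix A.
lam, U = np.linalg.eigh(A)
print(np.all(lam >= -1e-10))                     # eigenvalues are nonnegative
print(np.allclose(U @ np.diag(lam) @ U.T, A))    # A = U Lambda U^T

# Quadratic form check: x^T A x >= 0 for a random x.
x = rng.standard_normal(d)
print(x @ A @ x >= -1e-10)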

5. Norms

  • For $p = 1, 2, \infty$, verify that the functions $\|\cdot\|_p$ are norms. Then, for a vector $x \in \mathbb{R}^n$, show that

$\|x\|_\infty \le \|x\|_2 \le \|x\|_1 \le \sqrt{n}\,\|x\|_2 \le n\,\|x\|_\infty$

and for each inequality, provide an example demonstrating that the inequality can be tight. (A numerical spot-check of this chain appears in the sketch after this problem's items.)

  • For vectors $x, y \in \mathbb{R}^n$, show that $|x^T y| \le \|x\|_2 \|y\|_2$, with equality if and only if x and y are linearly dependent. More generally, show that $|x^T y| \le \|x\|_1 \|y\|_\infty$. Note that these are special cases of Hölder's inequality.
  • For $A \in \mathbb{R}^{m \times n}$, show that $\mathrm{Trace}(A^T A) = \sum_{ij} A_{ij}^2$, and show that $\sqrt{\sum_{ij} A_{ij}^2}$ is a norm on $\mathbb{R}^{m \times n}$. This is the Frobenius norm, denoted $\|\cdot\|_F$. Show that, in addition to satisfying the defining properties of a norm, the Frobenius norm is submultiplicative, in that

$\|AB\|_F \le \|A\|_F \|B\|_F$

whenever the dimensions are such that the product AB is defined.

  • Recall the definition of the spectral norm of an $m \times n$ matrix A: $\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)} = \sigma_{\max}(A)$, where $\lambda_{\max}(A^T A)$ is the largest eigenvalue of $A^T A$ and $\sigma_{\max}(A)$ is the largest singular value of A. Show that the Frobenius norm and the spectral norm are unitarily invariant: if U and V are unitary (orthogonal in the real case) matrices, then $\|U^T A V\|_\xi = \|A\|_\xi$ for $\xi = 2, F$.
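The following small numpy sketch (my own addition, not part of the assignment) numerically spot-checks the vector norm chain from the first item, the Frobenius submultiplicativity bound, and the unitary invariance of the spectral and Frobenius norms on a random example.

import numpy as np

rng = np.random.default_rng(2)
n = 8
x = rng.standard_normal(n)

# Norm chain: ||x||_inf <= ||x||_2 <= ||x||_1 <= sqrt(n)||x||_2 <= n||x||_inf
vals = [np.linalg.norm(x, np.inf), np.linalg.norm(x, 2), np.linalg.norm(x, 1),
        np.sqrt(n) * np.linalg.norm(x, 2), n * np.linalg.norm(x, np.inf)]
print(all(vals[i] <= vals[i + 1] + 1e-12 for i in range(4)))

# Frobenius submultiplicativity: ||A @ B||_F <= ||A||_F * ||B||_F
A, B = rng.standard_normal((5, 6)), rng.standard_normal((6, 7))
print(np.linalg.norm(A @ B, 'fro') <= np.linalg.norm(A, 'fro') * np.linalg.norm(B, 'fro'))

# Unitary invariance: ||U^T A V|| = ||A|| for the spectral (ord=2) and Frobenius norms.
U, _ = np.linalg.qr(rng.standard_normal((5, 5)))
V, _ = np.linalg.qr(rng.standard_normal((6, 6)))
for ord_ in (2, 'fro'):
    print(np.isclose(np.linalg.norm(U.T @ A @ V, ord_), np.linalg.norm(A, ord_)))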

6. Approximate Matrix Multiplication

Here, we will consider the empirical performance of random sampling and random projection algorithms for approximating the product of two matrices. You may use Matlab, C, R, or any other software package you prefer for your implementations. Please be sure to describe what you used in sufficient detail that someone else could reproduce your results. Let A be an $n \times d$ matrix with $n \gg d$, and consider approximating the product $A^T A$. First, generate the matrices A from one of three different classes of distributions introduced below.

  • Generate a matrix A from the multivariate normal $N(1_d, \Sigma)$, where the $(i, j)$-th element of $\Sigma$ is $\Sigma_{ij} = 2 \times 0.5^{|i-j|}$. (Refer to this as GA data.)
  • Generate a matrix A from the multivariate t-distribution with 3 degrees of freedom and the covariance matrix $\Sigma$ as before. (Refer to this as T3 data.)
  • Generate a matrix A from the multivariate t-distribution with 1 degree of freedom and the covariance matrix $\Sigma$ as before. (Refer to this as T1 data.)
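One possible data-generation sketch in Python (numpy assumed; the multivariate t draws use the standard normal-over-chi-squared construction, which may differ in details from what the course intends):

import numpy as np

rng = np.random.default_rng(3)
n, d = 500, 50

# Covariance with (i, j)-th entry Sigma_ij = 2 * 0.5^|i - j|.
idx = np.arange(d)
Sigma = 2.0 * 0.5 ** np.abs(idx[:, None] - idx[None, :])
mean = np.ones(d)

# GA data: multivariate normal N(1_d, Sigma).
A_GA = rng.multivariate_normal(mean, Sigma, size=n)

def multivariate_t(mean, Sigma, df, size, rng):
    # Standard construction: mean + z / sqrt(chi2_df / df) with z ~ N(0, Sigma).
    z = rng.multivariate_normal(np.zeros(len(mean)), Sigma, size=size)
    g = rng.chisquare(df, size=size) / df
    return mean + z / np.sqrt(g)[:, None]

A_T3 = multivariate_t(mean, Sigma, df=3, size=n, rng=rng)   # T3 data
A_T1 = multivariate_t(mean, Sigma, df=1, size=n, rng=rng)   # T1 data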

To start, consider matrices of size $n \times d$ equal to $500 \times 50$. (So, you should have three matrices, one matrix A generated in each of the above ways.)

  • For each matrix, approximate the product $A^T A$ with the random sampling algorithm we discussed in class, i.e., by sampling with respect to a probability distribution that depends on the norm squared of the rows of the input matrix (a sketch of this sampling scheme is given after this list). Plot the probability distribution. Does it look uniform or nonuniform? Plot the performance of the spectral and Frobenius norm error as a function of the number of samples.
  • For each matrix, approximate the product $A^T A$ with the random sampling algorithm we discussed in class, except that the uniform distribution, rather than the norm-squared distribution, should be used to construct the random sample. Plot the performance of the spectral and Frobenius norm error as a function of the number of samples. For which matrices are the results similar, and for which are they different from when the norm-squared distribution is used?
  • Now you will implement the matrix approximation technique on the MNIST dataset for handwritten digit classification. Details about the MNIST dataset can be found at http://yann.lecun.com/exdb/mnist/. We provide the dataset as a .mat file so that you can easily import it into Matlab by using load('mnist matrix.mat'). To import the dataset in Python you can use:

import scipy.io
data = scipy.io.loadmat('mnist matrix.mat')

In the .mat file you will find one matrix, $A \in \mathbb{R}^{60000 \times 784}$. For this matrix, approximate the product $A^T A$ with the random sampling algorithm we discussed in class, i.e., by sampling with respect to a probability distribution that depends on the norm squared of the rows of the input matrix. Plot the probability distribution. Does it look uniform or nonuniform? Plot the performance of the spectral and Frobenius norm error as a function of the number of samples.
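Here is a hedged outline in Python of the norm-squared row-sampling approximation of $A^T A$ and the error-versus-samples plot requested above; variable names, sample sizes, and plotting choices are my own assumptions.

import numpy as np
import matplotlib.pyplot as plt

def approx_ata(A, c, probs, rng):
    # Sample c row indices i.i.d. from probs and rescale so the estimator is unbiased:
    # A^T A = sum_i a_i a_i^T is estimated by (1/c) sum_t a_{i_t} a_{i_t}^T / p_{i_t}.
    idx = rng.choice(A.shape[0], size=c, p=probs)
    S = A[idx, :] / np.sqrt(c * probs[idx])[:, None]
    return S.T @ S

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 50))        # substitute the GA / T3 / T1 / MNIST matrix here
AtA = A.T @ A

row_norms2 = np.sum(A ** 2, axis=1)
p_norm = row_norms2 / row_norms2.sum()    # norm-squared sampling distribution (plot this)

cs = [10, 25, 50, 100, 200, 400]
spec_err = [np.linalg.norm(AtA - approx_ata(A, c, p_norm, rng), 2) for c in cs]
fro_err = [np.linalg.norm(AtA - approx_ata(A, c, p_norm, rng), 'fro') for c in cs]

plt.loglog(cs, spec_err, 'o-', label='spectral norm error')
plt.loglog(cs, fro_err, 's-', label='Frobenius norm error')
plt.xlabel('number of sampled rows c')
plt.ylabel('error')
plt.legend()
plt.show()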

[1] $I[X \in C] := 1$ if $X \in C$ and 0 otherwise.
