[Solved] MATH5473 Homework 2-Random Matrix Theory and PCA

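  1. Phase transition in PCA spike model: Consider a finite sample of $n$ i.i.d. vectors $x_1, x_2, \ldots, x_n$ drawn from the $p$-dimensional Gaussian distribution $\mathcal{N}(0, \sigma^2 I_{p \times p} + \lambda_0 u u^T)$, where $\lambda_0/\sigma^2$ is the signal-to-noise ratio (SNR) and $u \in \mathbb{R}^p$. In class we showed that the largest eigenvalue $\lambda$ of the sample covariance matrix $S_n$,

$$S_n = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^T,$$

pops outside the support of the Marcenko-Pastur distribution if

$$\frac{\lambda_0}{\sigma^2} > \sqrt{\gamma},$$

or equivalently (with $\gamma = p/n$), if

$$\mathrm{SNR} > \sqrt{\frac{p}{n}}.$$

(Notice that $\sqrt{\gamma} < (1 + \sqrt{\gamma})^2$, that is, $\lambda_0$ can be buried well inside the support of the Marcenko-Pastur distribution and still the largest eigenvalue pops outside its support.) All the following questions refer to the limit $n \to \infty$ and to almost sure values: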

  • Find $\lambda$ given SNR $> \sqrt{\gamma}$.
  • Use your previous answer to explain how the SNR can be estimated from the eigenvalues of the sample covariance matrix.
  • Find the squared correlation between the eigenvector $v$ of the sample covariance matrix (corresponding to the largest eigenvalue $\lambda$) and the true signal component $u$, as a function of the SNR, $p$ and $n$. That is, find $|\langle u, v \rangle|^2$.
  • Confirm your result using MATLAB, Python, or R simulations (e.g. set $u = e_1$, choose $\sigma = 1$ and $\lambda_0$ at different levels, then compute the largest eigenvalue and its associated eigenvector and compare them with the true ones).
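The simulation requested in the last item could be sketched in Python roughly as follows. This is a minimal sketch, not a reference solution: the function names `simulate_spike` and `theory` are hypothetical, $u = e_1$ and $\sigma = 1$ follow the suggestion in the exercise, and the limits coded in `theory` are the standard spiked-covariance results, stated here as an assumption to be checked against your own derivation.

```python
import numpy as np

def simulate_spike(n, p, lam0, sigma=1.0, seed=0):
    """Draw n samples from N(0, sigma^2 I + lam0 u u^T) with u = e_1 and
    return (largest eigenvalue of S_n, squared correlation <u, v>^2)."""
    rng = np.random.default_rng(seed)
    u = np.zeros(p)
    u[0] = 1.0
    # x_i = sigma * z_i + sqrt(lam0) * g_i * u has covariance sigma^2 I + lam0 u u^T
    Z = rng.standard_normal((n, p))
    g = rng.standard_normal(n)
    X = sigma * Z + np.sqrt(lam0) * np.outer(g, u)
    S = X.T @ X / n                      # sample covariance (the mean is known to be zero)
    evals, evecs = np.linalg.eigh(S)     # eigh returns eigenvalues in ascending order
    v = evecs[:, -1]
    return evals[-1], float(np.dot(u, v) ** 2)

def theory(gamma, lam0, sigma=1.0):
    """Assumed almost-sure limits (standard spiked-model results, not taken
    from the assignment text)."""
    snr = lam0 / sigma**2
    if snr > np.sqrt(gamma):
        lam = sigma**2 * (1 + snr) * (1 + gamma / snr)
        corr2 = (1 - gamma / snr**2) / (1 + gamma / snr)
    else:
        # Below the threshold the top eigenvalue sticks to the bulk edge
        lam = sigma**2 * (1 + np.sqrt(gamma)) ** 2
        corr2 = 0.0
    return lam, corr2

if __name__ == "__main__":
    n, p = 2000, 500                     # gamma = 0.25, threshold sqrt(gamma) = 0.5
    for lam0 in (0.25, 1.0, 2.0, 4.0):
        lam_hat, c_hat = simulate_spike(n, p, lam0)
        lam_th, c_th = theory(p / n, lam0)
        print(f"lam0={lam0:4.2f}  eig: {lam_hat:.3f} vs {lam_th:.3f}  "
              f"corr^2: {c_hat:.3f} vs {c_th:.3f}")
```

Agreement is only approximate at finite $n$, so it helps to repeat the run with several seeds or larger $n$ and $p$ at the same ratio $p/n$.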


Homework 2. Random Matrix Theory and PCA

  2. Exploring S&P 500 Stock Prices: Take the Standard & Poor's 500 data:

https://github.com/yao-lab/yao-lab.github.io/blob/master/data/snp452-data.mat which contains the data matrix $X \in \mathbb{R}^{p \times n}$ of $n = 1258$ consecutive observation days and $p = 452$ daily closing stock prices, and the cell variable stock collects the names, codes, and the affiliated industrial sectors of the 452 stocks. Use MATLAB, Python, or R for the following exploration.

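  • Take the logarithmic prices $Y = \log X$;
  • For each observation time $t \in \{1, \ldots, 1257\}$, calculate logarithmic price jumps

$$\Delta Y_{i,t} = Y_{i,t} - Y_{i,t-1}, \quad i \in \{1, \ldots, 452\};$$

  • Construct the realized covariance matrix $\hat{\Sigma} \in \mathbb{R}^{452 \times 452}$ by

$$\hat{\Sigma}_{i,j} = \frac{1}{1257} \sum_{\tau=1}^{1257} \Delta Y_{i,\tau} \Delta Y_{j,\tau};$$

  • Compute the eigenvalues (and eigenvectors) of $\hat{\Sigma}$ and store them in descending order as $\{\hat{\lambda}_k, k = 1, \ldots, p\}$.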
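  • Horn's Parallel Analysis: the following procedure describes a so-called Parallel Analysis of PCA using random permutations on data. Given the matrix $[\Delta Y_{i,t}]$, apply random permutations $\pi_i : \{1, \ldots, t\} \to \{1, \ldots, t\}$ on each of its rows, $\Delta Y_{i,\pi_i(j)}$, such that

$$[\Delta Y_{\pi(i),t}] = \begin{bmatrix}
\Delta Y_{1,1} & \Delta Y_{1,2} & \Delta Y_{1,3} & \cdots & \Delta Y_{1,t} \\
\Delta Y_{2,\pi_2(1)} & \Delta Y_{2,\pi_2(2)} & \Delta Y_{2,\pi_2(3)} & \cdots & \Delta Y_{2,\pi_2(t)} \\
\Delta Y_{3,\pi_3(1)} & \Delta Y_{3,\pi_3(2)} & \Delta Y_{3,\pi_3(3)} & \cdots & \Delta Y_{3,\pi_3(t)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\Delta Y_{n,\pi_n(1)} & \Delta Y_{n,\pi_n(2)} & \Delta Y_{n,\pi_n(3)} & \cdots & \Delta Y_{n,\pi_n(t)}
\end{bmatrix}.$$

Define $\tilde{\Sigma} = \frac{1}{t} [\Delta Y_{\pi(i),t}] [\Delta Y_{\pi(i),t}]^T$ as the null covariance matrix. Repeat this $R$ times and compute the eigenvalues of $\tilde{\Sigma}_r$ for each $1 \le r \le R$. Evaluate the p-value for each estimated eigenvalue $\hat{\lambda}_k$ by $(N_k + 1)/(R + 1)$, where $N_k$ counts how many times $\hat{\lambda}_k$ is less than the $k$-th largest eigenvalue of $\tilde{\Sigma}_r$ over $1 \le r \le R$. Eigenvalues with small p-values are less likely to arise from the spectrum of a randomly permuted matrix and are thus considered to be signal. Draw your own conclusion with your observations and analysis on this data. A reference is: Buja and Eyuboglu, Remarks on Parallel Analysis, Multivariate Behavioral Research, 27(4): 509-540, 1992.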
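The steps above, from log prices to permutation p-values, might be sketched in Python as follows. This is a sketch under stated assumptions: the helper names are hypothetical, a small synthetic price matrix with one common factor stands in for the S&P 500 data, and loading the `.mat` file (for instance with `scipy.io.loadmat`) is left out because the stored variable names and array orientation should be checked first. Every row is permuted, including the first. Since a common column permutation leaves the covariance unchanged, this gives the same null distribution of eigenvalues as keeping the first row fixed.

```python
import numpy as np

def log_jumps(X):
    """Log-price jumps from a (p, n) array of positive prices; returns (p, n-1)."""
    return np.diff(np.log(X), axis=1)

def realized_covariance(dY):
    """(1/T) * dY @ dY.T with no mean subtraction, as in the formula above."""
    return dY @ dY.T / dY.shape[1]

def eig_descending(S):
    """Eigenvalues and eigenvectors of a symmetric matrix, largest first."""
    evals, evecs = np.linalg.eigh(S)     # eigh returns ascending order
    return evals[::-1], evecs[:, ::-1]

def parallel_analysis(dY, R=100, seed=0):
    """Return (eigenvalues in descending order, permutation p-values)."""
    rng = np.random.default_rng(seed)
    lam_hat, _ = eig_descending(realized_covariance(dY))
    counts = np.zeros(lam_hat.size, dtype=int)
    for _ in range(R):
        # Shuffle the entries of each row independently of the other rows
        shuffled = rng.permuted(dY, axis=1)
        lam_null = np.linalg.eigvalsh(realized_covariance(shuffled))[::-1]
        counts += lam_hat < lam_null     # N_k: times the k-th null eigenvalue is larger
    return lam_hat, (counts + 1) / (R + 1)

if __name__ == "__main__":
    # Synthetic stand-in: p stocks driven by one common factor plus noise
    rng = np.random.default_rng(1)
    p, n = 15, 201
    factor = rng.standard_normal(n)
    steps = 0.01 * (factor + rng.standard_normal((p, n)))
    X = 100.0 * np.exp(np.cumsum(steps, axis=1))
    lam_hat, pvals = parallel_analysis(log_jumps(X), R=50)
    for k in range(5):
        print(f"k={k + 1}  eigenvalue={lam_hat[k]:.3e}  p-value={pvals[k]:.3f}")
```

On the synthetic data only the first eigenvalue should receive a small p-value, which mirrors the single planted factor. The smallest attainable p-value is $1/(R+1)$, so $R$ bounds the resolution of the test.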

  3. *Finite rank perturbations of random symmetric matrices: Wigner's semi-circle law (proved by Eugene Wigner in 1951) concerns the limiting distribution of the eigenvalues of random symmetric matrices. It states, for example, that the limiting eigenvalue distribution of $n \times n$ symmetric matrices whose entries $w_{ij}$ on and above the diagonal ($i \le j$) are i.i.d. Gaussians $\mathcal{N}(0, \frac{1}{4n})$ (and the entries below the diagonal are determined by symmetrization, i.e., $w_{ji} = w_{ij}$) is the semi-circle:

$$p(t) = \frac{2}{\pi} \sqrt{1 - t^2}, \quad -1 \le t \le 1,$$

where the distribution is supported in the interval $[-1, 1]$.
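A numerical check of the semi-circle law could look roughly like the sketch below. The names `wigner` and `semicircle_density` are hypothetical, and the entry variance $1/(4n)$ is the normalization assumed here because it is the one that produces the support $[-1, 1]$.

```python
import numpy as np

def wigner(n, rng):
    """Symmetric n x n matrix whose entries on and above the diagonal are
    i.i.d. N(0, 1/(4n)); the lower triangle mirrors the upper one."""
    A = rng.normal(0.0, np.sqrt(1.0 / (4 * n)), size=(n, n))
    return np.triu(A) + np.triu(A, 1).T

def semicircle_density(t):
    """Semi-circle density (2/pi) * sqrt(1 - t^2) on [-1, 1], zero outside."""
    t = np.asarray(t, dtype=float)
    return (2 / np.pi) * np.sqrt(np.clip(1 - t**2, 0, None))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 400
    evals = np.linalg.eigvalsh(wigner(n, rng))
    # Compare a normalized histogram of the eigenvalues with the density
    hist, edges = np.histogram(evals, bins=20, range=(-1.1, 1.1), density=True)
    centers = (edges[:-1] + edges[1:]) / 2
    for c, h in zip(centers, hist):
        print(f"t={c:+.3f}  empirical={h:.3f}  semicircle={semicircle_density(c):.3f}")
```

Plotting the histogram against the density (for example with matplotlib) gives a more convincing picture than the printed table. Averaging over several independent matrices smooths the histogram further.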


  • Confirm Wigner's semi-circle law using MATLAB, Python, or R simulations (take, e.g., $n = 400$).
  • Find the largest eigenvalue of a rank-1 perturbation of a Wigner matrix. That is, find the largest eigenvalue of the matrix

$$W + \lambda_0 u u^T,$$

where $W$ is an $n \times n$ random symmetric matrix as above, and $u$ is some deterministic unit-norm vector. Determine the value of $\lambda_0$ for which a phase transition occurs. What is the correlation between the top eigenvector of $W + \lambda_0 u u^T$ and the vector $u$ as a function of $\lambda_0$? Use techniques similar to the ones we used in class for analyzing finite rank perturbations of sample covariance matrices.
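To explore the rank-1 perturbation numerically, one possible sketch is shown below. The function names are hypothetical and $u = e_1$ is an arbitrary choice of unit vector. The limits in `theory` are the standard result for a semi-circle supported on $[-1, 1]$ (threshold $\lambda_0 = 1/2$), included as an assumption to compare against rather than as the derivation the exercise asks for.

```python
import numpy as np

def top_eig_perturbed(n, lam0, seed=0):
    """Largest eigenvalue of W + lam0 * u u^T and squared correlation <u, v>^2,
    where W has i.i.d. N(0, 1/(4n)) entries on and above the diagonal and u = e_1."""
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, np.sqrt(1.0 / (4 * n)), size=(n, n))
    W = np.triu(A) + np.triu(A, 1).T     # symmetrize by mirroring the upper triangle
    u = np.zeros(n)
    u[0] = 1.0
    M = W + lam0 * np.outer(u, u)
    evals, evecs = np.linalg.eigh(M)     # ascending order
    return evals[-1], float(np.dot(u, evecs[:, -1]) ** 2)

def theory(lam0):
    """Assumed limits for this normalization (standard result, not from the
    assignment text): above lam0 = 1/2 the top eigenvalue separates from the bulk."""
    if lam0 > 0.5:
        return lam0 + 1 / (4 * lam0), 1 - 1 / (4 * lam0**2)
    return 1.0, 0.0                      # stuck at the bulk edge, no correlation

if __name__ == "__main__":
    n = 1000
    for lam0 in (0.25, 0.5, 0.75, 1.0, 1.5, 2.0):
        lam_hat, c_hat = top_eig_perturbed(n, lam0)
        lam_th, c_th = theory(lam0)
        print(f"lam0={lam0:4.2f}  eig: {lam_hat:.3f} vs {lam_th:.3f}  "
              f"corr^2: {c_hat:.3f} vs {c_th:.3f}")
```

Near the threshold the finite-$n$ values converge slowly, so the transition looks smoothed out unless $n$ is large.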

[Some hints about the homework] For the Wigner matrix, the answer is: the eigenvalue is …, and the eigenvector satisfies …
