Information Retrieval in High Dimensional Data
Please hand in your solutions via Moodle. Use the attached Jupyter notebook.
Solutions must be handed in as a group. Please state the names of your group members in a prominent place in your submission (for example, at the beginning of your notebook or in a separate text file).
The Kernel Trick
Task 1: [25 points] On Moodle you will find a Jupyter notebook that contains a function for dimensionality reduction via PCA. The function linear_pca expects a data matrix $X \in \mathbb{R}^{p \times N}$ and a number of principal components $k$, and returns the first $k$ PCA scores for the matrix $X$.
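
For orientation, a function with this interface might look like the following sketch. It is not the notebook's linear_pca: the name `linear_pca_sketch`, the choice to return the scores as a k-by-N array, and the random stand-in data are all assumptions. Only the convention that each column of the p-by-N data matrix is one sample is taken from the task.

```python
import numpy as np


def linear_pca_sketch(X, k):
    """Return the first k PCA scores (a k x N array) of a p x N data matrix X.

    Hypothetical stand-in for the notebook's linear_pca; samples are columns.
    """
    # Centre the data by subtracting the mean sample from every column.
    X_centered = X - X.mean(axis=1, keepdims=True)
    # Thin SVD: the columns of U are the principal directions,
    # ordered by decreasing singular value.
    U, _, _ = np.linalg.svd(X_centered, full_matrices=False)
    # Project the centred samples onto the first k directions.
    return U[:, :k].T @ X_centered


# Random stand-in data with the shape of 1000 flattened 28 x 28 images
# (an assumption; replace with real MNIST columns in the notebook).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(784, 1000))
scores = linear_pca_sketch(X_demo, 2)
```

The two rows of `scores` can then be plotted against each other, for example with matplotlib's `plt.scatter(scores[0], scores[1])`, coloured by digit label if labels are available.
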
- Provide code that tests the function with selected images from the provided MNIST training dataset by visualizing the first 2 scores in a scatter plot.
- Complete the function gram_pca such that it has the same functionality as linear_pca but expects a Gram matrix $K = X^\top X$ instead of the data matrix $X$ as its input. Do not assume that $K$ was produced from centered data. Note: It is important to be consistent in notation here. For example, for a data matrix of 1000 MNIST images, we have $X \in \mathbb{R}^{784 \times 1000}$ and $K \in \mathbb{R}^{1000 \times 1000}$.
- Test your implementation and show that gram_pca(dot(X.T,X), k) yields results equivalent to those of linear_pca(X, k).
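- There is an unknown vector space $H$, equipped with an inner product $\langle \cdot, \cdot \rangle_H$, and a function
$$\phi : \mathbb{R}^p \to H,$$
such that
$$\langle \phi(x), \phi(y) \rangle_H = \exp\left(-\frac{\|x - y\|_2^2}{2\sigma^2}\right)$$
holds for every $x, y \in \mathbb{R}^p$. The expression on the right-hand side of the equation is called the Gaussian kernel, and $\sigma$ is a parameter to choose by hand.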
The function gaussian_kernel_pca expects a data matrix $X$, a reduced dimension $k$ and a parameter $\sigma$. It returns the first $k$ Kernel PCA scores of the data. In other words, the function returns the first $k$ PCA scores of
$$\phi(x_1), \phi(x_2), \dots, \phi(x_N),$$
where $x_i$ denotes the $i$-th data sample, i.e. the $i$-th column of the data matrix. The function gaussian_kernel_pca is already written, but for it to work, the function compute_gaussian_gram_matrix must return correct results. Complete compute_gaussian_gram_matrix accordingly.
- Test gaussian_kernel_pca with some MNIST training images and $\sigma = 1000$.
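
The Gram-matrix steps above can be sketched as follows. This is one possible approach, not the notebook's reference solution: the helper names `center_gram`, `gram_pca_sketch` and `gaussian_gram_sketch` are hypothetical, the data is random stand-in data, and the kernel is assumed to use the common parametrisation exp(-||x - y||^2 / (2 sigma^2)). If the notebook defines the kernel differently, the exponent needs adjusting. Centering is done on the Gram matrix itself with the centring matrix H = I - (1/N) 1 1^T, so no access to the original (or feature-space) samples is needed.

```python
import numpy as np


def center_gram(K):
    """Gram matrix of the mean-centred samples, computed from K alone."""
    N = K.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N  # centring matrix
    return H @ K @ H


def gram_pca_sketch(K, k):
    """First k PCA scores (k x N) from an N x N Gram matrix K (hypothetical helper)."""
    Kc = center_gram(K)
    # eigh returns eigenvalues in ascending order, so reverse and keep the top k.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    eigvals = eigvals[::-1][:k]
    eigvecs = eigvecs[:, ::-1][:, :k]
    # Row i of the scores is sqrt(lambda_i) * v_i^T;
    # clip tiny negative eigenvalues caused by round-off.
    return np.sqrt(np.clip(eigvals, 0.0, None))[:, None] * eigvecs.T


def gaussian_gram_sketch(X, sigma):
    """Gram matrix with entries exp(-||x_i - x_j||^2 / (2 sigma^2)) (assumed form)."""
    sq_norms = np.sum(X**2, axis=0)
    # ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 <x_i, x_j>
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X.T @ X)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against negative round-off
    return np.exp(-sq_dists / (2.0 * sigma**2))


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 50))  # random stand-in: 50 samples as columns

# Reference scores computed directly from the centred data, for comparison.
Xc = X_demo - X_demo.mean(axis=1, keepdims=True)
U, _, _ = np.linalg.svd(Xc, full_matrices=False)
reference = U[:, :2].T @ Xc

gram_scores = gram_pca_sketch(X_demo.T @ X_demo, 2)

# sigma = 5.0 is an arbitrary value chosen to suit the stand-in data.
K_gauss = gaussian_gram_sketch(X_demo, sigma=5.0)
kernel_scores = gram_pca_sketch(K_gauss, 2)
```

Because eigenvectors are only determined up to sign, `gram_scores` and `reference` should agree up to a sign flip per component, so a comparison of absolute values (or a sign alignment) is a reasonable equivalence check.
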
