Expectation Maximization – Assignment 5 – CS6601
Setup
Clone this repository:
Create a new conda environment for this assignment:
conda env create -f ai_a5_env.yml
Activate the environment with:
conda activate ai_a5_env
Jupyter Notebook:
You will use Jupyter Notebook to complete this assignment.
To open the notebook, navigate to your assignment folder, activate the conda environment with conda activate ai_a5_env, and run jupyter notebook.
The project description and all of the functions that you will implement are in the solution.ipynb file.
ATTENTION: You are free to add cells for debugging your implementation. However, do not write any inline code in the cells that contain function declarations. In those cells, edit only the section inside the function that is marked with a comment such as: # TODO: finish this function.
Grading
The grade you receive for the assignment will be distributed as follows:
1. k-Means Clustering (19 points)
2. Gaussian Mixture Model (48 points)
3. Model Performance Improvements (20 points)
4. Bayesian Information Criterion (12 points)
5. Return your name (1 point)
Note: For this assignment, we do not have any bonuses.
Submission
The tests for the assignment are provided in mixture_tests.py. All the tests are already embedded into the respective ipython notebook cells, so they will run automatically whenever you run the cells with your code. Local tests are sufficient for verifying the correctness of your implementation. The tests on Gradescope will be similar to the ones provided here. You’ll need to ensure that your submissions are sufficiently vectorized so that algorithms won’t time out.
To get the submission file, make sure to save your notebook and run:
python notebook2script.py submit
Once the execution is complete, open the autogenerated submit/submission.py and verify that it contains all of the imports, functions, and classes you are required to implement. Only then proceed to Gradescope for submission.
In your Gradescope submission history, you can mark certain submissions as Active. Please ensure this is your best submission.
Do NOT erase the #export at the top of any cells as it is used by notebook2script.py to extract cells for submission.
You will be allowed 3 submissions every 3 hours on Gradescope, so test everything before submitting. Your code may run for no more than 40 minutes per submission. To keep it fast, make sure to VECTORIZE your code (more on this in the notebook itself).
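As an illustration of what vectorizing means here, the sketch below compares a looped and a vectorized way of assigning points to their nearest cluster mean, the core step of k-means. The function names and array shapes are assumptions made for this example only; they are not the signatures used in solution.ipynb.

```python
import numpy as np

def assign_clusters_loop(points, means):
    """Reference version with explicit Python loops (slow on large inputs)."""
    labels = np.empty(len(points), dtype=int)
    for i, p in enumerate(points):
        best_k, best_dist = 0, np.inf
        for k, m in enumerate(means):
            dist = np.sum((p - m) ** 2)  # squared Euclidean distance
            if dist < best_dist:
                best_k, best_dist = k, dist
        labels[i] = best_k
    return labels

def assign_clusters_vectorized(points, means):
    """Same result using broadcasting instead of Python loops."""
    # points: (n, d), means: (k, d) -> diffs: (n, k, d)
    diffs = points[:, None, :] - means[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)  # (n, k)
    return np.argmin(sq_dists, axis=1)     # index of the nearest mean per point

# Hypothetical data: 500 points in 3 dimensions, 4 cluster means.
rng = np.random.default_rng(0)
points = rng.random((500, 3))
means = rng.random((4, 3))
labels = assign_clusters_vectorized(points, means)
```

Both functions return the same labels; the vectorized one replaces the two nested Python loops with a single broadcasted subtraction, which is typically far faster on image-sized inputs.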
Resources
1. Canvas lectures on Unsupervised Learning (Lesson 7)
2. The gaussians.pdf in the read/ folder will introduce you to multivariate normal distributions.
3. A YouTube video by Alexander Ihler on the details of the multivariate EM algorithm: https://www.youtube.com/watch?v=qMTuMa86NzU
4. The em.pdf chapter in the read/ folder. This will be especially useful for Part 2 of the assignment.
5. NumPy and vectorization resources:
Stackexchange discussion
Hackernoon article
NumPy einsum (highly recommended)
Slicing and indexing
Copies and views
Fancy indexing
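To show why einsum is worth learning for this assignment, here is a minimal sketch that evaluates the log-density of a multivariate normal distribution for many points at once. The function name gaussian_log_pdf and its argument layout are hypothetical and chosen for this example; the notebook may expect a different interface.

```python
import numpy as np

def gaussian_log_pdf(X, mean, cov):
    """Log-density of N(mean, cov) at each row of X, without Python loops."""
    n, d = X.shape
    diff = X - mean                       # (n, d)
    cov_inv = np.linalg.inv(cov)          # (d, d)
    # Quadratic form diff[n]^T @ cov_inv @ diff[n] for every row n at once.
    quad = np.einsum("ni,ij,nj->n", diff, cov_inv, diff)
    _, logdet = np.linalg.slogdet(cov)    # log|cov|, numerically stable
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)

# Sanity check against a standard normal in 2 dimensions:
# at the mean, the log-density equals -log(2 * pi).
X = np.zeros((1, 2))
log_p = gaussian_log_pdf(X, mean=np.zeros(2), cov=np.eye(2))
```

The subscript string "ni,ij,nj->n" sums over the feature indices i and j while keeping the point index n, so one call replaces a loop over all points.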
