[SOLVED] Cs7641 assignment 4 p0


Markov Decision Processes
1 Assignment Weight
Read everything below carefully as this assignment has changed term-over-term.
2 Objective
In some sense, we have spent the semester thinking about machine learning techniques for various forms of function approximation. It’s now time to think about using what we’ve learned in order to allow an agent of some kind to act in the world more directly. This assignment asks you to consider the application of some of the techniques we’ve learned from reinforcement learning to make decisions.
requirements will likely have changed.
3 Procedure
3.1 The Problems Given to You
You are being asked to explore Markov Decision Processes (MDPs):
1. Come up with two interesting MDPs. Explain why they are interesting. They don’t need to be overly complicated or directly grounded in a real situation, but it will be worthwhile if your MDPs are inspired by some process you are interested in or are familiar with. It’s ok to keep it somewhat simple. For the purposes of this assignment, though, make sure one MDP has a “small” number of states, and the other MDP has a “large” number of states. The judgement and rationalization of what is “small” and “large” will be up to you. For initial intuition, 200 states is not considered “large”. Additionally, neither of the MDPs you choose should be a grid world problem.
2. Solve each MDP using value iteration as well as policy iteration. How many iterations does it take to converge? Which one converges faster? Why? How did you choose to define convergence? Do they converge to the same answer? How did the number of states affect things, if at all?
3. Now pick your favorite reinforcement learning algorithm and use it to solve the two MDPs. How does it perform, especially in comparison to the cases above where you knew the model, rewards, and so on? What exploration strategies did you choose? Did some work better than others?
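As an illustration of item 2, the following is a minimal sketch of value iteration and policy iteration in plain Python. The three-state machine-maintenance MDP, the discount factor of 0.9, and the convergence test (largest value change in a sweep below `theta`) are all assumptions made for this example; none of them come from the assignment, and a real submission would need its own MDPs and its own justified definition of convergence.

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical MDP: states 0 = good, 1 = worn, 2 = broken;
# actions 0 = run, 1 = repair.
# P[s][a] is a list of (probability, next_state, reward) tuples.
P = {
    0: {0: [(0.8, 0, 10.0), (0.2, 1, 10.0)], 1: [(1.0, 0, -2.0)]},
    1: {0: [(0.6, 1, 6.0), (0.4, 2, 6.0)], 1: [(1.0, 0, -4.0)]},
    2: {0: [(1.0, 2, 0.0)], 1: [(1.0, 0, -15.0)]},
}


def q_value(V, s, a, gamma=GAMMA):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def greedy_policy(V, gamma=GAMMA):
    """Pick the action with the highest one-step lookahead value."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a, gamma)) for s in P}


def value_iteration(theta=1e-8, gamma=GAMMA):
    """Repeat Bellman optimality backups until the largest change < theta."""
    V = {s: 0.0 for s in P}
    sweeps = 0
    while True:
        sweeps += 1
        new_V = {s: max(q_value(V, s, a, gamma) for a in P[s]) for s in P}
        delta = max(abs(new_V[s] - V[s]) for s in P)
        V = new_V
        if delta < theta:
            return V, greedy_policy(V, gamma), sweeps


def policy_iteration(theta=1e-8, gamma=GAMMA):
    """Alternate policy evaluation and greedy improvement until stable."""
    policy = {s: 0 for s in P}  # start from "always run"
    V = {s: 0.0 for s in P}
    rounds = 0
    while True:
        rounds += 1
        while True:  # iterative evaluation of the current policy
            new_V = {s: q_value(V, s, policy[s], gamma) for s in P}
            delta = max(abs(new_V[s] - V[s]) for s in P)
            V = new_V
            if delta < theta:
                break
        improved = greedy_policy(V, gamma)
        if improved == policy:  # convergence: the policy stopped changing
            return V, policy, rounds
        policy = improved


vi_values, vi_policy, vi_sweeps = value_iteration()
pi_values, pi_policy, pi_rounds = policy_iteration()
print("value iteration: ", vi_policy, "after", vi_sweeps, "sweeps")
print("policy iteration:", pi_policy, "after", pi_rounds, "improvement rounds")
```

Note that the two counters measure different things: a value-iteration sweep is one backup over all states, while each policy-iteration round contains a full inner evaluation loop. Comparing wall-clock time or total backups alongside iteration counts gives a fairer answer to "which one converges faster".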
Extra Credit Opportunity:
Analysis writeup is limited to 8 pages. The page limit does include your citations. Anything past 8 pages will not be read. Please keep your analysis as concise as possible while still covering the requirements of the assignment. As a final check during your submission process, download the submission to double check that everything looks correct on Canvas. Try not to wait until the last minute to submit, as you will only be tempting Murphy’s Law.
3.2 Acceptable Libraries
The algorithms used in this assignment are relatively easy to implement. Existing implementations are easy to find too. Below are Java and Python examples.
• bettermdptools (Python) https://github.com/jlm429/bettermdptools
• BURLAP (Java) http://burlap.cs.brown.edu/
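To give a sense of how little code a model-free learner needs, here is a minimal sketch of tabular Q-learning with epsilon-greedy exploration in plain Python. It does not use bettermdptools or BURLAP. The three-state machine-maintenance MDP, the function names `step` and `q_learning`, and every hyperparameter value (`alpha`, `epsilon`, `steps`, `seed`, the discount of 0.9) are assumptions chosen for illustration only.

```python
import random

GAMMA = 0.9  # assumed discount factor

# Hypothetical MDP: states 0 = good, 1 = worn, 2 = broken;
# actions 0 = run, 1 = repair.
# P[s][a] is a list of (probability, next_state, reward) tuples.
# The learner never reads P directly; it is only used as a simulator.
P = {
    0: {0: [(0.8, 0, 10.0), (0.2, 1, 10.0)], 1: [(1.0, 0, -2.0)]},
    1: {0: [(0.6, 1, 6.0), (0.4, 2, 6.0)], 1: [(1.0, 0, -4.0)]},
    2: {0: [(1.0, 2, 0.0)], 1: [(1.0, 0, -15.0)]},
}


def step(state, action, rng):
    """Sample one (next_state, reward) transition from the simulator."""
    draw = rng.random()
    cumulative = 0.0
    for prob, next_state, reward in P[state][action]:
        cumulative += prob
        if draw < cumulative:
            return next_state, reward
    return next_state, reward  # guard against floating-point round-off


def q_learning(steps=200_000, alpha=0.02, epsilon=0.2, gamma=GAMMA, seed=0):
    """Tabular Q-learning over one long trajectory (the task never ends)."""
    rng = random.Random(seed)
    Q = {s: {a: 0.0 for a in P[s]} for s in P}
    state = 0
    for _ in range(steps):
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if rng.random() < epsilon:
            action = rng.choice(list(Q[state]))
        else:
            action = max(Q[state], key=Q[state].get)
        next_state, reward = step(state, action, rng)
        # Off-policy target: bootstrap from the best next-state action.
        target = reward + gamma * max(Q[next_state].values())
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state
    policy = {s: max(Q[s], key=Q[s].get) for s in Q}
    return Q, policy


Q, learned_policy = q_learning()
print("greedy policy from learned Q-values:", learned_policy)
```

A constant `epsilon` is only one exploration strategy; decaying it over time or using optimistic initial Q-values are common alternatives worth comparing in the analysis.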
4 Submission Details
You must submit:
• A file named README.txt containing instructions for running your code. We need to be able to get to your code and your data. Providing entire libraries isn’t necessary when a URL would suffice; however, you should at least provide any files you found necessary to change and enough support and explanation so we can reproduce your results on a standard Linux machine.
• A file named yourgtaccount-analysis.pdf containing your writeup (GT account is what you log in with, not your all-digits ID). This file should not exceed 8 pages.
The file yourgtaccount-analysis.pdf should contain:
• A description of your MDPs and why they are interesting.
• A discussion of your experiments.
It might be difficult to generate the same kinds of graphs for this part of the assignment as you did in previous assignments; however, you should come up with some clever way to describe the kinds of results you produce. If you can achieve this visually, all the better. A note of caution, however: figures should remain legible at 100% zoom. Do not squish figures together so that axis labels become 8pt font or less. We are looking for a clear and concise demonstration of knowledge and synthesis of results. Any paper that solely has figures without formal writing will not be graded. Be methodical with your space.
5 Feedback Requests
When your assignment is scored, you will receive feedback explaining your errors and successes in some level of detail. This feedback is for your benefit, both on this assignment and for future assignments. It is considered a part of your learning goal to internalize this feedback. We strive to give meaningful feedback with human interaction at scale. We have a multitude of mechanisms behind the scenes to ensure grading consistency with meaningful feedback. This can be difficult, however, and sometimes feedback isn’t as clear as you need. If you are confused by a piece of feedback, please start a private thread on Ed and we will jump in to help clarify.
The phrase “as long as you participate in this journey of exploring, tuning, and analyzing” is key. We take this very seriously and you should too.
assignments of other students (even across sections and previous courses), or website repositories.
What does it mean to be original?
It is well known that for this course we do not care about code. We are not interested in your working out the edge cases in k-NN, or proving your skills with Python. While there is some value in implementing algorithms yourselves in general, here we are interested in your grokking the practice of ML itself. That practice is about the interaction of algorithms with data. As such, the vast majority of what you’re going to learn in order to master the empirical practice of ML flows from doing your own analysis of the data, hyperparameters, and so on; hence, you are allowed to steal ML code from libraries but are not allowed to steal code written explicitly for this course, particularly those parts of code that automate exploration. You will be tempted to just run said code, which has already been overfit to the specific datasets it uses, and you will therefore learn very little.
How to cite:
Your README file will include pointers to any code and libraries you used.
If we catch you…
7 Version Control