OMS CS7637 Project Overview
Project Overview (Spring 2022)
Our semester-long class project involves constructing an AI agent to address a human intelligence test. The project is due at the end of the semester, but there are a number of required milestones to pass along the way. These are (a) to ensure that you are getting an early enough start to have a chance for success and (b) to give you opportunities to see your classmates' approaches and possibly incorporate their ideas into your own project.
This page covers the project as a whole, emphasizing what your end-goal is for the end of the semester.
In a (Large) Nutshell
The CS7637 class project is to create an AI agent that can pass a human intelligence test. You'll download a code package that contains the boilerplate necessary to run an agent you design against a set of problems inspired by the Raven's Progressive Matrices test of intelligence. Within it, you'll implement the Agent.py file to take in a problem and return an answer.
There are four sets of problems for your agent to answer: B, C, D, and E. Each set contains four types of problems: Basic, Test, Challenge, and Raven's. You'll be able to see the Basic and Challenge problems while designing your agent, and your grade will be based on your agent's answers to the Basic and Test problems. The milestones throughout the semester will carry you through tackling more and more advanced problems: for Milestone 1, you'll just familiarize yourself with the submission process and data structures. For Milestone 2, you'll target the first set of problems, the relatively easy 2×2 problems from Set B. For Milestone 3, you'll move on to the second set of problems, the more challenging 3×3 problems from Set C. For Milestone 4, you'll look at the more difficult Set D and Set E problems, building toward the final deliverable a bit later.
For all problems, your agent will be given images that represent the problem in .png format. An example of a full problem is shown below; your agent would be given separate files representing the contents of squares A, B, C, 1, 2, 3, 4, 5, and 6.
Don't worry if the above doesn't make sense quite yet; the projects are a bit complex when you're getting started. The goal of this section is just to provide you with a high-level view so that the rest of this document makes a bit more sense.
Background and Goals
This section covers the learning goals and background information necessary to understand the projects.
Learning Goals
One goal of Knowledge-Based Artificial Intelligence is to create human-like, human-level intelligence, and to use that to reflect on how humans actually think. If this is the goal of the field, then what better way to evaluate intelligence of an agent than by having it take the same intelligence tests that humans take?
There are numerous tests of human intelligence, but one of the most reliable and commonly used is Raven's Progressive Matrices. Raven's Progressive Matrices, or RPM, are visual analogy problems where the test-taker is given a matrix of figures and asked to select the figure that completes the matrix. An example of a 2×2 problem was shown above; an example of a 3×3 problem is shown below.
In these projects, you will design agents that will address RPM-inspired problems such as the ones above. The goal of this project is to authentically experience the overall goals of knowledge-based AI: to design an agent with human-like, human-level intelligence; to test that agent against a set of authentic problems; and to use that agent's performance to reflect on what we believe about human cognition. As such, you might not use every topic covered in KBAI on the projects; the topics covered give a bottom-up view of the topics and principles of KBAI, while the project gives a top-down view of the goals and concepts of KBAI.
About the Test
The full Raven's Progressive Matrices test consists of 60 visual analogy problems divided into five sets: A, B, C, D, and E. Set A is comprised of 12 simple pattern-matching problems, which we won't cover in these projects. Set B is comprised of 12 2×2 matrix problems, such as the first image shown above. Sets C, D, and E are each comprised of 12 3×3 matrix problems, such as the second image shown above. Problems are named with their set followed by their number, such as problem B-05 or C-11. The sets are of roughly ascending difficulty.
For copyright reasons, we cannot provide the real Raven's Progressive Matrices test to everyone. Instead, we'll be giving you sets of problems, which we call Basic problems, inspired by the real RPM to use to develop your agent. Your agent will be evaluated based on how well it performs on these Basic problems, as well as a parallel set of Test problems that you will not see while designing your agent. These Test problems are directly analogous to the Basic problems; running against the two sets provides a check for generality and overfitting. Your agent will also run against the real RPM as well as a set of Challenge problems, but neither of these will be factored into your grade.
Overall, by the end of the semester, your agent will answer 192 problems. More on the specific problems that your agent will complete is in the sections that follow.
Each problem set (that is, Set B, Set C, Set D, and Set E) consists of 48 problems: 12 Basic, 12 Test, 12 Raven's, and 12 Challenge. Only Basic and Test problems will be used in determining your grade. The Raven's problems are run for authenticity and analysis, but are not used in calculating your grade.
In designing your agent, you will have access to the Basic and Challenge problems; you may run your agent locally to check its performance on these problems. You will not have access to the Test or Raven's problems while designing and testing your agent: when you upload your agent to Gradescope, you will see how well it performs on those problems, but you will not see the details of the problems themselves. Challenge and Raven's problems are not part of your grade.
Note that the Challenge problems will often be used to expose your agent to extra properties and shapes seen on the real Raven's problems that are not covered in the Basic and Test problems. The problems themselves generally ascend in difficulty from set to set (although many people reflect that Set E is a bit easier than Set D).
Finally, note that when submitting your code, Gradescope randomizes the order of possible answers. For example, for the first problem listed earlier on this page, the answer is 5, the plain square. When you submit to Gradescope, the plain square will be randomly assigned a different number: it could be 1, 2, 3, 4, 5, or 6. The content of the image is the same; only the number differs. This is to prevent agents that overfit to the answers: you cannot, for example, write an agent that knows, "The answer to problem B-11 is 5," because when you submit, the number is randomly changed.
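To see why memorizing answer numbers cannot work, here is a minimal sketch of the idea. The `shuffle_answers` helper and the text labels standing in for images are hypothetical, chosen only for illustration; this is not Gradescope's actual code.

```python
import random


def shuffle_answers(answers, correct_key, rng):
    """Reassign answer contents to new numbers.

    `answers` maps answer number -> image content (a text label here).
    Returns (new_answers, new_correct_key).
    Hypothetical illustration only, not the autograder's implementation.
    """
    keys = sorted(answers)
    contents = [answers[k] for k in keys]
    rng.shuffle(contents)
    new_answers = dict(zip(keys, contents))
    # The correct answer is identified by its content, not its old number.
    correct_content = answers[correct_key]
    new_correct = next(k for k, v in new_answers.items() if v == correct_content)
    return new_answers, new_correct


# Labels are made up; locally the correct answer is number 5.
local = {1: "circle", 2: "triangle", 3: "star",
         4: "cross", 5: "plain square", 6: "dot"}
shuffled, correct = shuffle_answers(local, 5, random.Random(0))
print(shuffled[correct])  # "plain square", whatever number it now carries
```

An agent that reasons about image content picks the plain square under any numbering, while an agent that always returns 5 is only right when the shuffle happens to leave the square in place.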
Details & Deliverables
This section covers the more specific details of the four project milestones, as well as the final project you will submit.
Project Milestones
Your ultimate goal is to submit a final project that attempts all 192 problems. However, to help ensure that you start early and to give you an opportunity to see and learn from your classmates' approaches, there are four intermediate milestones. On each of these milestones, your agent will only run against a subset of the full set of problems to allow you to test more efficiently. You will also write a brief report on your current approach for each milestone; the primary purpose of these reports will be to help you get feedback from classmates and see their approaches. For each milestone, you will be graded on a combination of your agent's performance and the report that you write; the bars for performance on the milestones are relatively low, however, as the goal is to ensure that you are getting started early.
Each milestone has its own page. In brief, however:
Milestone 1: Set B, Basic Problems only. The goal of this milestone is simply to ensure you've set up your local project infrastructure and familiarized yourself with Gradescope. You will receive 100% of your performance credit as long as your agent answers any problem correctly. Your report will focus on early ideas you have for approaching the project.
Milestone 2: Set B, all problems. The goal of this milestone is to ensure you have started on the early, easier problems early in the semester. As long as your agent can answer 5 (out of 12) Basic B and 5 (out of 12) Test B problems correctly, you will receive full performance credit.
Milestone 3: Set C. The goal of this milestone is to ensure you have generalized your approach out to the more difficult 3×3 problems by an appropriate time of the semester. As long as your agent can answer 5 (out of 12) Basic C and 5 (out of 12) Test C problems correctly, you will receive full performance credit.
Milestone 4: Sets D and E. The goal of this milestone is to ensure you have looked at all four sets before the final project deadline, so that you may spend the last portion of the semester refining, improving, and writing your final report. As long as your agent can answer 10 (out of 24) Basic D & E and 10 (out of 24) Test D & E problems, you will receive full performance credit.
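The performance bars above can be summarized in a few lines of Python. This is only a sketch of the thresholds as stated on this page; the function name and the table are illustrative, and it does not model how partial credit (if any) is awarded below a threshold, since that is not described here.

```python
# (minimum correct Basic, minimum correct Test) for full performance credit,
# transcribed from the milestone descriptions above.
FULL_CREDIT = {
    2: (5, 5),    # Set B: 5 of 12 Basic and 5 of 12 Test
    3: (5, 5),    # Set C: 5 of 12 Basic and 5 of 12 Test
    4: (10, 10),  # Sets D and E combined: 10 of 24 Basic and 10 of 24 Test
}


def earns_full_performance_credit(milestone, basic_correct, test_correct=0):
    """Return True if the counts meet the stated bar for that milestone."""
    if milestone == 1:
        # Milestone 1 runs Basic B only; any one correct answer is enough.
        return basic_correct >= 1
    need_basic, need_test = FULL_CREDIT[milestone]
    return basic_correct >= need_basic and test_correct >= need_test


print(earns_full_performance_credit(2, basic_correct=7, test_correct=4))  # False
```

Note that both bars must be met: a strong Basic score does not compensate for falling short on the Test problems.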
For each milestone, your code must be submitted to the autograder by the deadline. However, it is okay if your project is still running after the deadline. Note that Gradescope by default counts your last submission for a grade; if you want to count an earlier submission, you must activate that earlier submission.
On each milestone, your grade will be 50% meeting the performance expectations and 50% the report you write up. You will submit your agent to Gradescope and your report to Canvas as a PDF. The four milestones together are 15% of your course grade; each is thus 3.75% of your course grade.
Final Project
The final project will run against all 192 problems. You can submit to the final project throughout the semester to see how your agent is doing so far, but you should make sure to submit to the Milestone submissions as well.
More information about the final project is available on the final project page. For the final project, you will write a longer, more formal, and more complete report on your project. Your score will be based on raw performance on the Basic and Test problems. Like the milestones, performance will be 50% of your grade and your report will be 50% of your grade. Your final project is 15% of your course grade.
Getting Started
To make it easier to start the project and focus on the concepts involved (rather than the nuts and bolts of reading in problems and writing out answers), you'll be working from an agent framework in Python. You can get the framework in one of two ways:
Clone it from the master repository with git clone --recurse-submodules https://github.gatech.edu/omscs7637/RPM-Project-Code.git
Download RPM-Project-Code as a zip file. This method allows you to obtain the code if you are having trouble accessing the Georgia Tech GitHub site.
You will place your code into the Solve method of the Agent class supplied. You can also create any additional methods, classes, and files needed to organize your code; Solve is simply the entry point into your agent.
The Problem Sets
As mentioned previously, by the final project, your agent will run against 192 problems: 4 Sets of 48 problems. Each of the 4 Sets is broken down into four subsets of 12: 12 Basic, 12 Test, 12 Raven's, and 12 Challenge.
You can see the Basic and Challenge problems and test your agent's performance on them locally. You cannot see the Test and Raven's problems, and your agent's performance on them will only be tested when you submit to Gradescope. Your grade will be based solely on the Basic and Test problems.
The Raven's problems are used so that you can see how your agent is performing on the real Raven's test. The Challenge problems are primarily there to expose your agent to certain details that are present in the Raven's problems but not in the Basic problems (such as shapes shaded with diagonal lines).
Within each set, the Basic, Test, and Raven's problems are constructed to be roughly analogous to one another. The Basic problem is constructed to mimic the relationships and transformations in the corresponding Raven's problem, and the Test problem is constructed to mimic the Basic problem very, very closely. So, if you see that your agent gets Basic problem B-05 correct but Test and Raven's problems B-05 wrong, you know that might be a place where your agent is either overfitting or getting lucky. This also means you can anticipate your agent's performance on the Test problems relatively well: each Test problem uses a near-identical principle to the corresponding Basic problem. In the past, agents have averaged getting 85% as many Test problems right as Basic problems, so there's a pretty good correlation there if you're using a robust, general method.
The Problems
You are provided with the Basic and Challenge problems to use in designing your agent. The Test and Raven's problems are hidden and will only be used when grading your project. This is to test your agents for generality: it isn't hard to design an agent that can answer questions it has already seen, just as it would not be hard to score well on a test you have already taken before. However, performing well on problems you and your agent haven't seen before is a more reliable test of intelligence. Your grade is based solely on your agent's performance on the Basic and Test problems.
All problems are contained within the Problems folder of the downloadable. Problems are divided into sets, and then into individual problems. Each problem's folder has three things:
The problem itself, for your benefit.
A ProblemData.txt file, containing information about the problem, including its correct answer and its type.
Visual representations of each figure, named A.png, B.png, etc.
You should not attempt to access ProblemData.txt directly; its filename will be changed when we grade projects. Generally, you need not worry about this directory structure; all problem data will be loaded into the RavensProblem object passed to your agent's Solve method, and the filenames for the different visual representations will be included in their corresponding RavensFigures.
Working with the Code
The framework code is available under Getting Started above. You may modify ProblemSetList.txt to alter what problem sets your code runs against locally; this will be useful early in the term when you probably do not need to bother thinking about later problem sets yet. This will not affect what it runs against on Gradescope.
The downloadable package has a number of Python files: RavensProject, ProblemSet, RavensProblem, RavensFigure, and Agent. Of these, you should only modify the Agent class. You may make changes to the other classes to test your agent, write debug statements, etc. However, when we test your code, we will use the original versions of these files as downloaded here. Do not rely on changes to any class except for Agent to run your code. In addition to Agent, you may also write your own additional files and classes for inclusion in your project.
In Agent, you will find two methods: a constructor and a Solve method. The constructor will be called at the beginning of the program, so you may use this method to initialize any information necessary before your agent begins solving problems. After that, Solve will be called on each problem. You should write the Solve method to return its answer to the given question:
2×2 questions have six answer options, so to answer the question, your agent should return an integer from 1 to 6.
3×3 questions have eight answer options, so your agent should return an integer from 1 to 8.
If your agent wants to skip a question, it should return a negative number. Any negative number will be treated as your agent skipping the problem.
You may do all the processing within Solve, or you may write other methods and classes to help your agent solve the problems.
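The return conventions above can be sketched as a skeleton agent. This is a minimal illustration, not the supplied Agent.py: the `problemType` attribute name and its `"2x2"` / `"3x3"` values are assumptions (check the RavensProblem class in the framework), `_pick_answer` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in so the sketch runs without the framework.

```python
from types import SimpleNamespace


class Agent:
    def __init__(self):
        # Called once at program start; set up any shared state here.
        self.problems_seen = 0

    def Solve(self, problem):
        self.problems_seen += 1
        # ASSUMPTION: the problem exposes its type as `problemType`,
        # with the values "2x2" and "3x3".
        num_options = 6 if problem.problemType == "2x2" else 8
        guess = self._pick_answer(problem, num_options)
        if guess is None or not 1 <= guess <= num_options:
            return -1  # any negative number is treated as a skip
        return guess

    def _pick_answer(self, problem, num_options):
        # Hypothetical placeholder for the real reasoning.
        # Returning None makes Solve skip the problem.
        return None


# Stand-in problem object (made-up name), so the sketch is self-contained.
fake_problem = SimpleNamespace(name="Basic Problem B-01",
                               problemType="2x2", figures={})
agent = Agent()
print(agent.Solve(fake_problem))  # -1, since the placeholder never answers
```

Validating the range inside Solve means a bug in the reasoning code leads to a skip rather than an out-of-range answer.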
When running, the program will load questions from the Problems folder. It will then ask your agent to solve each problem one by one and write the results to ProblemResults.csv. You may check ProblemResults.csv to see how well your agent performed. You may also check SetResults.csv to view a summary of your agent's performance at the set level.
The Documentation
RavensProject: The main driver of the project. This file will load the list of problem sets, initialize your agent, then pass the problems to your agent one by one.
RavensGrader: The grading file for the project. After your agent generates its answers, this file will check the answers and assign a score.
Agent: The class in which you will define your agent. When you run the project, your Agent will be constructed, and then its Solve method will be called on each RavensProblem. At the end of Solve, your agent should return an integer as the answer for that problem (or a negative number to skip that problem).
ProblemSet: A list of RavensProblems within a particular set.
RavensProblem: A single problem, such as the one shown earlier in this document. A RavensProblem includes:
A Dictionary of the individual Figures (that is, the squares labeled A, B, C, 1, 2, etc.) from the problem. In a 2×2 problem, the RavensFigures associated with keys A, B, and C are the problem itself, and those associated with the keys 1, 2, 3, 4, 5, and 6 are the potential answer choices.
A String representing the name of the problem and a String representing the type of problem (2×2 or 3×3).
RavensFigure: A single square from the problem, labeled either A, B, C, 1, 2, etc., containing a filename referring to the visual representation (in PNG form) of the figure's contents.
The documentation is ultimately somewhat straightforward, but it can be complicated when you're initially getting used to it. The most important things to remember are:
Every time Solve is called, your agent is given a single problem. By the end of Solve, it should return an answer as an integer. You don't need to worry about how the problems are loaded from the files, how the problem sets are organized, or how the results are printed. You need only worry about writing the Solve method, which solves one question at a time.
RavensProblems have a dictionary of RavensFigures, with each Figure representing one of the image squares in the problem and each key representing its letter (squares in the problem matrix) or number (answer choices). All RavensFigures have filenames so your agent can load the PNG with the visual representation.
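As a rough picture of these data structures, the sketch below uses stand-in classes rather than the framework's own. The attribute names (`figures`, `visualFilename`, `problemType`), the use of string keys, and the problem name are assumptions for illustration; check the supplied RavensProblem and RavensFigure classes for the real names.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FigureStub:
    """Stand-in for RavensFigure (attribute names are assumed)."""
    name: str
    visualFilename: str


@dataclass
class ProblemStub:
    """Stand-in for RavensProblem (attribute names are assumed)."""
    name: str
    problemType: str
    figures: Dict[str, FigureStub] = field(default_factory=dict)


def split_figures(problem):
    """Separate matrix squares (letter keys) from answer choices (number keys)."""
    matrix = {k: f for k, f in problem.figures.items() if k.isalpha()}
    answers = {int(k): f for k, f in problem.figures.items() if k.isdigit()}
    return matrix, answers


keys = ["A", "B", "C", "1", "2", "3", "4", "5", "6"]
problem = ProblemStub(
    name="Basic Problem B-01",  # made-up name for the example
    problemType="2x2",
    figures={k: FigureStub(k, f"{k}.png") for k in keys},
)
matrix, answers = split_figures(problem)
print(sorted(matrix))   # ['A', 'B', 'C']
print(sorted(answers))  # [1, 2, 3, 4, 5, 6]
```

With Pillow installed, each figure's filename could then be passed to `PIL.Image.open` to load the image for comparison.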
The permitted libraries for this term's project are:
The Python image processing library Pillow (version 8.1.0). For installation instructions on Pillow, see this page.
The latest version of the NumPy library (1.19.1 at time of writing). For installation instructions on NumPy, see this page.
A recent version of OpenCV (4.2.0, opencv-contrib-python-headless). For installation instructions, see this page.
Additionally, we use Python 3.8.0 for our autograder.
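Because the autograder pins these versions, it can help to compare your local environment against them before submitting. The following is a small sketch using only the standard library; the `EXPECTED` table and the function names are illustrative, and the distribution names are assumed to match what you installed with pip.

```python
import sys
from importlib import metadata

# Versions quoted in the list above, keyed by pip distribution name.
EXPECTED = {
    "Pillow": "8.1.0",
    "numpy": "1.19.1",
    "opencv-contrib-python-headless": "4.2.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report():
    """Build one line per component comparing local and listed versions."""
    v = sys.version_info
    lines = [f"Python {v.major}.{v.minor}.{v.micro} (autograder uses 3.8.0)"]
    for name, expected in EXPECTED.items():
        found = installed_version(name)
        status = "not installed" if found is None else found
        lines.append(f"{name}: {status} (this page lists {expected})")
    return lines


print("\n".join(report()))
```

A mismatch is not necessarily a problem, but code that relies on features newer than the listed versions may behave differently on Gradescope than it does locally.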
Submitting Your Code
This class uses Gradescope, a server-side autograder, to evaluate your submission. This means you can see how your code is performing against the Test and Raven's problems even without seeing the problems themselves. You will have access to separate areas to submit against the Milestone checks and to submit for the final project.
Submitting
To get started submitting your code, go to Canvas and click Gradescope on the left sidebar. Then, click CS7637 in the page that loads.
You will see five project options: Milestone 1, Milestone 2, Milestone 3, Milestone 4, and Final Project.
Milestone 1 will run your code only against the Basic B problems.