- There will be a lot of reading in this assignment. Be patient. Its worth your time!
- Start early. Ask questions. Have fun.
What courses should you take in a given quarter? Answering this question requires balancing your interests, satisfying prerequisite chains, graduation requirements, availability of courses; this can be a complex tedious process. In this assignment, you will write a program that does automatic course scheduling for you based on your preferences and constraints. The program will cast the course scheduling problem (CSP) as a constraint satisfaction problem (CSP) and then use backtracking search to solve that CSP to give you your optimal course schedule.
You will first get yourself familiar with CSP by doing warmup exercises in Problem 0. In Problem 1, you will implement two of the three heuristics you learned from the lectures that will make CSP solving much faster. In problem 2, you will add a helper function to reduce n-ary factors to unary and binary factors. Lastly, in Problem 3, you will create the course scheduling CSP and solve it using the code from previous parts.
- [4 points] Lets create a CSP. Suppose you have n light bulbs, where each light bulb i=1,,n is initially off. You also have m buttons which control the lights. For each button j=1,,m, we know the subset Tj{1,,n} of light bulbs that it controls. When button j is pressed, it toggles the state of each light bulb in Tj (For example, if 3Tj and light bulb 3 is off, then after the button is pressed, light bulb 3 will be on, and vice versa).Your goal is to turn on all the light bulbs by pressing a subset of the buttons. Construct a CSP to solve this problem. Your CSP should have m variables and n constraints. For this problem only, you can use n-ary constraints. Describe your CSP precisely and concisely. You need to specify the variables with their domain, and the constraints with their scope and expression. Make sure to include Tj in your answer.
- [4 points] Lets consider a simple CSP with 3 variables and 2 binary factors:where X1,X2,X3{0,1} and t1,t2 are XOR functions (that is t1(X)=x1x2 and t2(X)=x2x3).
- How many consistent assignments are there for this CSP?
- To see why variable ordering is important, lets use backtracking search to solve the CSP without using any heuristics (MCV, LCV, AC-3) or lookahead. How many times will
backtrack()
be called to get all consistent assignments if we use the fixed ordering X1,X3,X2? Draw the call stack forbacktrack()
. (You should use the Backtrack algorithm from the slides. The initial arguments are x=, w=1, and the original Domain.)In the code, this will beBacktrackingSearch.numOperations
. - To see why lookahead can be useful, lets do it again with the ordering X1,X3,X2 and AC-3. How many times will Backtrack be called to get all consistent assignments? Draw the call stack for
backtrack()
.
- Now lets consider a general case: given a factor graph with n variables X1,...,Xn and n1 binary factors t1,...,tn1 where Xi{0,1} and ti(X)=xixi+1. Note that the CSP has a chain structure. Implement
create_chain_csp()
by creating a generic chain CSP with XOR as factors.Note: Weve provided you with a CSP implementation inutil.py
which supports unary and binary factors. For now, you dont need to understand the implementation, but please read the comments and get yourself familiar with the CSP interface. For this problem, youll need to useCSP.add_variable()
andCSP.add_binary_factor()
.
So far, weve only worked with unweighted CSPs, where fj(x){0,1}. In this problem, we will work with weighted CSPs, which associates a weight for each assignment x based on the product of m factor functions f1,,fm:
where each factor fj(x)0. Our goal is to find the assignment(s) x with the highest weight. As in problem 0, we will assume that each factor is either a unary factor (depends on exactly one variable) or a binary factor (depends on exactly two variables).
For weighted CSP construction, you can refer to the CSP examples we provided in util.py
for guidance (create_map_coloring_csp()
and create_weighted_csp()
). You can try these examples out by running
python run_p1.py
Notice we are already able to solve the CSPs, because in submission.py
, a basic backtracking search is already implemented. Recall that backtracking search operates over partial assignments and associates each partial assignment with a weight, which is the product of all the factors that depend only on the assigned variables. When we assign a value to a new variable Xi, we multiply in all the factors that depend only on Xi and the previously assigned variables. The function get_delta_weight()
returns the contribution of these new factors based on the unaryFactors
and binaryFactors
. An important case is when get_delta_weight()
returns 0. In this case, any full assignment that extends the new partial assignment will also be zero, so there is no need to search further with that new partial assignment.
Take a look at BacktrackingSearch.reset_results()
to see the other fields which are set as a result of solving the weighted CSP. You should read submission.BacktrackingSearch
carefully to make sure that you understand how the backtracking search is working on the CSP.
- [4 points] Lets create a CSP to solve the n-queens problem: Given an nn board, wed like to place n queens on this board such that no two queens are on the same row, column, or diagonal. Implement
create_nqueens_csp()
by adding n variables and some number of binary factors. Note that the solver collects some basic statistics on the performance of the algorithm. You should take advantage of these statistics for debugging and analysis. You should get 92 (optimal) assignments for n=8 with exactly 2057 operations (number of calls tobacktrack()
).Hint: If you get a larger number of operations, make sure your CSP is minimal. Try to define the variables such that the size of domain is O(n).
Note: Please implement the domain of variables as list type in Python (again, you may refer to
create_map_coloring_csp()
andcreate_weighted_csp()
inutil.py
as examples of CSP problem implementations), so you can compare the number of operations with our suggestions as a way of debugging. - [4 points] You might notice that our search algorithm explores quite a large number of states even for the 88 board. Lets see if we can do better. One heuristic we discussed in class is using most constrained variable (MCV): To choose an unassigned variable, pick the Xj that has the fewest number of values a which are consistent with the current partial assignment (a for which
get_delta_weight()
on Xj=a returns a non-zero value). Implement this heuristic inget_unassigned_variable()
under the conditionself.mcv = True
. It should take you exactly 1361 operations to find all optimal assignments for 8 queens CSP thats 30% fewer!Some useful fields:csp.unaryFactors[var][val]
gives the unary factor value.csp.binaryFactors[var1][var2][val1][val2]
gives the binary factor value. Here,var1
andvar2
are variables andval1
andval2
are their corresponding values.- In
BacktrackingSearch
, ifvar
has been assigned a value, you can retrieve it usingassignment[var]
. Otherwisevar
is not inassignment
.
- [8 points] The previous heuristics looked only at the local effects of a variable or value. Lets now implement arc consistency (AC-3) that we discussed in lecture. After we set variable Xj to value a, we remove the values b of all neighboring variables Xk that could cause arc-inconsistencies. If Xks domain has changed, we use Xks domain to remove values from the domains of its neighboring variables. This is repeated until no domain can be updated. Note that this may significantly reduce your branching factor, although at some cost. In
backtrack()
weve implemented code which copies and restores domains for you. Your job is to fill inarc_consistency_check()
.You should make sure that your existing MCV implementation is compatible with your AC-3 algorithm as we will be using all three heuristics together during grading.With AC-3 enabled, it should take you 769 operations only to find all optimal assignments to 8 queens CSP That is almost 45% fewer even compared with MCV!Take a deep breath! This part requires time and effort to implement be patient.Hint 1: documentation for
CSP.add_unary_factor()
andCSP.add_binary_factor()
can be helpful.Hint 2: although AC-3 works recursively, you may implement it iteratively. Using a queue might be a good idea.li.pop(0)
removes and returns the first element for a python listli
.
So far, our CSP solver only handles unary and binary factors, but for course scheduling (and really any non-trivial application), we would like to define factors that involve more than two variables. It would be nice if we could have a general way of reducing n-ary constraint to unary and binary constraints. In this problem, we will do exactly that for two types of n-ary constraints.
Suppose we have boolean variables X1,X2,X3, where Xi represents whether the i-th course is taken. Suppose we want to enforce the constraint that Y=X1X2X3, that is, Y is a boolean representing whether at least one course has been taken. For reference, in util.py
, the function get_or_variable()
does such a reduction. It takes in a list of variables and a target value, and returns a boolean variable with domain [True, False]
whose value is constrained to the condition of having at least one of the variables assigned to the target value. For example, we would call get_or_variable()
with arguments (X1,X2,X3,True), which would return a new (auxiliary) variable X4, and then add another constraint [X4=True].
The second type of n-ary factors is constraints on the sum over n variables. You are going to implement reduction of this type but lets first look at a simpler problem to get started:
- [4 points] Suppose we have a CSP with three variables X1,X2,X3 with the same domain {0,1,2} and a ternary constraint [X1+X2+X3K]. How can we reduce this CSP to one with only unary and/or binary constraints? Explain what auxiliary variables we need to introduce, what their domains are, what unary/binary factors youll add, and why your scheme works. Add a graph if you think thatll better explain your scheme.
Hint: draw inspiration from the example of enforcing [Xi=1 for exactly one i] which is in the first CSP lecture.
- [5 points] Now lets do the general case in code: implement
get_sum_variable()
, which takes in a sequence of non-negative integer-valued variables and returns a variable whose value is constrained to equal the sum of the variables. You will need to access the domains of the variables passed in, which you can assume contain only non-negative integers. The parametermaxSum
is the maximum sum possible of all the variables. You can use this information to decide the proper domains for your auxiliary variables.How can this function be useful? Suppose we wanted to enforce the constraint [X1+X2+X3K]. We would callget_sum_variable()
on (X1,X2,X3) to get some auxiliary variable Y, and then add the constraint [YK]. Note: You dont have to implement the constraint for this part.
In this problem, we will apply your weighted CSP solver to the problem of course scheduling. We have scraped a subset of courses that are offered from Stanfords Bulletin. For each course in this dataset, we have information on which quarters it is offered, the prerequisites (which may not be fully accurate due to ambiguity in the listing), and the range of units allowed. You can take a look at all the courses in courses.json
. Please refer to util.Course
and util.CourseBulletin
for more information.
To specify a desired course plan, you would need to provide a profile which specifies your constraints and preferences for courses. A profile is specified in a text file (see profile*.txt
for examples). The profile file has four sections:
- The first section specifies a fixed minimum and maximum (inclusive) number of units you need to take for each quarter. For example:
minUnits 0maxUnits 3
- In the second section, you
register
for the quarters that you want to take your courses in. For example,register Aut2017register Win2018register Spr2018
would sign you up for this academic year. The quarters need not to be contiguous, but they must follow the exact format
XxxYYYY
whereXxx
is one ofAut, Win, Spr, Sum
andYYYY
is the year. - The third section specifies the list of courses that youve taken in the past and elsewhere using the
taken
keyword. For example, if youre in CS221, this is probably what you would put:taken CS103taken CS106Btaken CS107taken CS109
- The last section is a list of courses that you would like to take during the registered quarters, specified using
request
. For example, two basic requests would look like this:request CS224Nrequest CS229
Not every request must be fulfilled, and indeed, due to the additional constraints described below, it is possible that not all of them can actually be fulfilled.
Constrained requests. To allow for more flexibility in your preferences, we allow some freedom to customize the requests:
- You can request to take exclusively one of several courses but not sure which one, then specify:
request CS229 or CS229A or CS229T
Note that these courses do not necessarily have to be offered in the same quarter. The final schedule can have at most one of these three courses. Each course can only be requested at most once.
- If you want to take a course in one of a specified set of quarters, use the
in
modifier. For example, if you want to take one of CS221 or CS229 in either Aut2017 or Sum2018, do:request CS221 or CS229 in Aut2017,Sum2018
If you do not specify any quarters, then the course can be taken in any quarter.
- Another operator you can apply is
after
, which specifies that a course must be taken after another one. For example, if you want to choose one of CS221 or CS229 and take it after both CS109 and CS161, add:request CS221 or CS229 after CS109,CS161
Note that this implies that if you take CS221 or CS229, then you must take both CS109 and CS161. In this case, we say that CS109 and CS161 are
prereqs
of this request. (Note that theres no space after the comma.)If you request course A and B (separately), and A is an official prerequisite of B based on theCourseBulletin
, we will automatically add A as a prerequisite for B; that is, typingrequest B
is equivalent torequest B after A
. Note that if B is a prerequisite of A, to request A, you must either request B or declare youve taken B before. - Finally, the last operator you can add is
weight
, which adds non-negative weight to each request. All requests have a default weight value of 1. Requests with higher weight should be preferred by your CSP solver. Note that you can combine all of the aforementioned operators into one as follows (again, no space after comma):request CS221 or CS229 in Win2017,Win2018 after CS131 weight 5
Each request
line in your profile is represented in code as an instance of the Request
class (see util.py
). For example, the request above would have the following fields:
cids
(course IDs that youre choosing one of) with value['CS221', 'CS229']
quarters
(that youre allowed to take the courses) with value['Win2017', 'Win2018']
prereqs
(course IDs that you must take before) with value['CS131']
weight
(preference) with value5.0
Its important to note that a request does not have to be fulfilled, but if it is, the constraints specified by the various operators after,in
must also be satisfied.
You shall not worry about parsing the profiles because we have done all the parsing of the bulletin and profile for you, so all you need to work with is the collection of Request
objects in Profile
and CourseBulletin
to know when courses are offered and the number of units of courses.
Well, thats a lot of information! Lets open a python shell and see them in action:
import util# load bulletinbulletin = util.CourseBulletin('courses.json')# retrieve information of CS221cs221 = bulletin.courses['CS221']print cs221# look at various properties of the courseprint cs221.cidprint cs221.minUnitsprint cs221.maxUnitsprint cs221.prereqs # the prerequisitesprint cs221.is_offered_in('Aut2017')print cs221.is_offered_in('Win2018')# load profile from profile_example.txtprofile = util.Profile(bulletin, 'profile_example.txt')# see what it's aboutprofile.print_info()# iterate over the requests and print out the propertiesfor request in profile.requests: print request.cids, request.quarters, request.prereqs, request.weight
Solving the CSP. Your task is to take a profile and bulletin and construct a CSP. We have started you off with code in SchedulingCSPConstructor
that constructs the core variables of the CSP as well as some basic constraints. The variables are all pairs of requests and registered quarters (request, quarter)
, and the value of such a variable is one of the course IDs in that Request or None
, which indicates none of the courses should be taken in that quarter. We will add auxiliary variables later. We have also implemented some basic constraints: add_bulletin_constraints()
, which enforces that a course can only be taken if its offered in that quarter (according to the bulletin), and add_norepeating_contstraints()
, which constrains that no course can be taken more than once.
You should take a look at add_bulletin_constraints()
and add_norepeating_contstraints()
to get a basic understanding how the CSP for scheduling is represented. Nevertheless, well highlight some important details to make it easier for you to implement:
- The existing variables are tuples of
(request, quarter)
whererequest
is aRequest
object (like the one shown above) andquarter
is astr
representing a quarter (e.g.'Aut2017'
). For detail please look atSchedulingCSPConstructor.add_variables()
. - The domain of each variable
(request, quarter)
is the course IDs of the request plusNone
(e.g.['CS221', 'CS229', None]
). When the valuecid
isNone
, this means no course is scheduled for this request. Always remember to check ifcid
isNone
. - The domain for
quarter
is all possible quarters (self.profile.quarters
, e.g.['Win2016', 'Win2017']
). - Given a course ID
cid
, you can get the correspondingCourse
object byself.bulletin.courses[cid]
.
- [5 points] Implement the
add_quarter_constraints()
. This is when your profile specifies which quarter(s) you want your requested courses to be taken in. This is not saying that one of the courses must be taken, but if it is, then it must be taken in any one of the specified quarters. Also note that this constraint will apply to all courses in that request.We have written averify_schedule()
function ingrader.py
that determines if your schedule satisfies all of the given constraints. Note that since we are not dealing with units yet, it will printNone
for the number of units of each course. For profile3a.txt, you should find 3 optimal assignments with weight 1.0. - [9 points] Lets now add the unit constraints in
add_unit_constraints()
. (1) You must ensure that the sum of units per quarter for your schedule are within the min and max threshold inclusive. You should useget_sum_variable()
. (2) In order for our solution extractor to obtain the number of units, for every course, you must add a variable(courseId, quarter)
to the CSP taking on a value equal to the number of units being taken for that course during that quarter. When the course is not taken during that quarter, the unit should be 0. NOTE: Each grader test only tests the function you are asked to implement. To test your CSP with multiple constraints you can userun_p3.py
and changing the constraints that you want to add. For profile3b.txt, you should find 15 optimal assignments with weight 1.0.Hint: If your code times out, your
maxSum
passed toget_sum_variable()
might be too large. - [2 points] Now try to use the course scheduler for the winter and spring (and next year if applicable). Create your own
profile.txt
and then run the course scheduler:python run_p3.py profile.txt
You might want to turn on the appropriate heuristic flags to speed up the computation. Does it produce a reasonable course schedule? Please include your
profile.txt
and the best schedule in your writeup; were curious how it worked out for you!
Want more challenges about CSP? Here we go.
Suppose we have a weighted CSP with variables X1,,Xn with domains Domaini={1,,K}. We have a set of basic factors which depend only on adjacent pairs of variables in the same way: there is some function g such that fi(x)=g(xi,xi+1) for i=1,,n1. In addition, we have a small set of notable patterns P, where each pP is a sequence of elements from the domain.
Let np be the number of times that p occurs in an assignment x=(x1,,xn) as a consecutive sequence. Define the weight of an assignment x to be i=1n1fi(x)pPnp. Intuitively, we multiply the weight by every time a notable pattern appears.
For example, suppose n=4, =7, g(a,b)=5[a=b]+1[ab] and P={[1,3,3],[1,2,3]}. Then the assignment x=[1,3,3,2] has weight (151)(7170)=35.
- [2 points] If we were to include the notable patterns as factors into the CSP, what would be the worst case treewidth? (You can assume each p has a maximum length of n.)
- [6 points] The treewidth doesnt really tell us the true complexity of the problem. Devise an efficient algorithm to compute the maximum weight assignment. You need to describe your algorithm in enough detail but dont need to implement it. Analyze your algorithms time and space complexities. Youll get points only if your algorithm is much better than the naive solution.
5/5 – (1 vote)
This (and every) assignment has a written part and a programming part.
The full assignment with our supporting code and scripts can be downloaded as scheduling.zip.
- This icon means a written answer is expected in
scheduling.pdf
. - This icon means you should write code in
submission.py
.
You should modify the code in submission.py
between
# BEGIN_YOUR_CODE
and
# END_YOUR_CODE
but you can add other helper functions outside this block if you want. Do not make changes to files other than submission.py
.Your code will be evaluated on two types of test cases, basic and hidden, which you can see in grader.py
. Basic tests, which are fully provided to you, do not stress your code with large inputs or tricky corner cases. Hidden tests are more complex and do stress your code. The inputs of hidden tests are provided in grader.py
, but the correct outputs are not. To run the tests, you will need to have graderUtil.py
in the same directory as your code and grader.py
. Then, you can run all the tests by typing
python grader.py
This will tell you only whether you passed the basic tests. On the hidden tests, the script will alert you if your code takes too long or crashes, but does not say whether you got the correct output. You can also run a single test (e.g., 3a-0-basic
) by typing
python grader.py 3a-0-basic
We strongly encourage you to read and understand the test cases, create your own test cases, and not just blindly run grader.py
.
General Instructions
Problem 0: Warmup
Problem 1: CSP solving
Weight(x)=j=1mfj(x)
Problem 2: Handling n-ary factors
Problem 3: Course Scheduling
Reviews
There are no reviews yet.