1
Another kind of racetrack problem
- Robot vehicle, starting point, finish line, walls are the same as in Project 1
Differences:
- Vehicles control system is unreliable
- May move to a slightly different location than you intended
I up to 1 unit in any direction
- You can make bigger changes in velocity I Up to 2 units in any direction
- Dont need to stop exactly on the finish line
- OK to stop at distance 1
Moving the vehicle
- Current state s = (p,z)
- location p = (x,y), nonnegative integers
I velocity z = (u,v), integers
- You choose new velocity z0 = (u0,v0), where
u0 {u, u 1, u 2},
v0 {v, v 1, v 2}.
- If z0 6= (0,0), then the control system may make an error in your position I e = (q,r), where q,r {1,0,1}
- Vehicle moves to location p0 = p + z0 + e = (x + u0 + q, y + v0 + r)
- New state s0 = (p0,z0)
2 = ((5,8),(0,2)) p2, z2
- You choose z3 = z2 + (1,0) = (1,2)
- Control error e3 = (0,1)
- New location p3 = p2 + z3 + e3
= (5,8) + (1,2) + (0,1)
= (6,11)
- New state s3 = (p3,z3) = ((6,11),(1,2))
- The control error doesnt change velocity, just your position I Unrealistic, but it makes the problems easier to solve
3 = ((6,11),(1,2)) p3, z3
- You choose z4 = z3 + (1,1) = (2,1)
- Control error e4 = (1,1)
- New location p4 = p3 + z4 + e4
= (6,11) + (2,1) + (1,1)
= (7,13)
- New state s4 = (p4,z4) = ((7,13),(2,1)) 4 = ((7,13),(2,1)) p4, z4
- You choose z5 = z4 + (1,1) = (3,0)
- Control error e5 = (1,1)
- New location p5 = p4 + z5 + e5
= (7,13) + (3,0) + (1,1)
= (11,14)
- New state s5 = (p5,z5) = ((11,14),(3,0))
- Trajectory is unsafe
- Would have crashed if e5 were (0,1) or (1,1)
- Ideally, you want a strategy that will always keep you from crashing regardless of what control errors occur
Objective
Get to the finish line and stop
I Might not be able to land exactly on the line
I Control errors can prevent that
- OK to get to distance 1
- Need to stop
- Last move needs to have velocity 0, as in Project 1
- Want to get there as quickly as possible without crashing, despite control errors
Strategy
Pretend the control system is an opponent thats trying to make you crash
- Choose moves that will keep you from crashing, regardless of what it does
- Write a game-playing algorithm to do it move by move
- as in chess, checkers, or go
How to do it
- One possibility: alpha-beta game-tree search
- Limited-depth search, static evaluation function
- Another possibility: Monte Carlo rollouts
- Problem: randomly generated paths are very unlikely to go to the goal I I dont think it will work very well
- Another idea: biased Monte Carlo rollouts
- Generate paths randomly, but bias the moves toward good evaluation-function values
I How well this will work, I have no idea
- No way to guarantee you wont crash
Opponent
- Well give you a simple opponent program
- It will try to make you crash, but wont be very intelligent about it
- Warning: dont write a program that just tries to take advantage of the dumb opponent!
- When we grade your program, well use a more intelligent opponent
- Need to choose moves that wont crash, no matter what the opponent does
Other comments
- You may use any of the code I gave you for Project 1, and any of the code you developed for Project 1 I You can modify it if you wish
- Caveat: most of it wont be very useful
- Youll need to write a game-tree-search algorithm and/or a Monte Carlo rollout algorithm
- Youll need a heuristic function
- You can use the one you developed for Project 1
I You can use any of the ones I gave you for Project 1
- g., h walldist
- Caveat: Will a heuristic function for Project 1 work well as a game-tree-search heuristic?
- You might need to make modifications
What you need to submit
- You need to submit a file called py containing a program called main
- Well give you a game environment for running it
I It will simulate turn-by-turn interactions with the opponent
I At each turn, it will run proj2.main(s,f,w)
- s = state, f = finish line, w = list of walls
- Your main program should print (to standard output) a sequence of choices for what velocity to use. Each choice should be a pair of integers (u,v) followed by a linebreak.
(2, 2)
(1, 3)
(1, 2)
(1, 2)
- Keep searching for better and better recommendations
- g., iterative deepening, or additional Monte Carlo rollouts
More about the game environment
- Game environment runs your main program as a separate process I Lets it run for 5 seconds, kills it, reads the last velocity it chose
- After getting your chosen velocity (u,v), it lets the opponent choose what error to use I e = (q,r), where q,r {1,0,1}
- It computes the new state, and checks whether the game has ended I you crash you lose
- you reach the finish line and your velocity is (0,0) you win
I otherwise, game hasnt ended game environment will call your program again, with the new current state
- If the game hasnt ended, it goes to the next turn
- runs your main program again
Files Ill provide
- File on Piazza: project2b code.zip
- Not project2 code.zip that version had an error in it
- sample probs modified version of the test problems from Project 1.
- I removed or modified the ones that were obviously unsolvable.
I Each problem is a list of the form [name, p0, finish, walls]
I If a problems dimensions are so small that the problem is unsolvable, you can call double(p) where p is the problem, to return a problem in which the x and y dimensions are both doubled.
- py two simple opponent programs.
- opponent1 tries to head for the wall
I opponent0 makes moves at random
I By default, env.main uses opponent1
Files Ill provide (continued)
- env.py environment for running your proj2.py file.
Heres what env.main(problem) does:
- If you have a initialize, it launches proj2.initialize(s,f,w), waits 5 seconds, and kills the process if it hasnt exited.
- This is so you can compute some data to use in your main program
I Your proj2.initialize should write the data to a file called data.txt; otherwise the data will be lost when the process exits
- It repeats the following steps until you win or lose:
- Launch main, wait 5 seconds, and kill the process
I Read the last velocity (u,v) in choices.txt. If it isnt legal, return lose I If (u,v) = (0,0) and distance from finish line 1, return win.
I Call the opponent to add an error to the velocity
I Draw the move using turtle graphics
I If the move crashes into a wall, return lose
Files Ill provide (continued)
- proj2 example.py a deliberately stupid version of py I Rename it to proj2.py if you want to use it with env.py.
- I provided it so you can see how py works before you start writing your own proj2.py.
I It can win if its lucky, but usually it eventually crashes
I Youll need to write something that works much better
- Some files used by proj2 example.py
- racetrack example.py modified version of racetrack from Project 1
I fsearch.py and tdraw.py same as in Project 1
Reviews
There are no reviews yet.