Linear Optimal Control (LQR)
Robert Platt, Northeastern University
The linear control problem

Given:
  System: x_{t+1} = A x_t + B u_t
  Cost function: J(X,U) = sum_{t=1}^{T} x_t^T Q x_t + sum_{t=1}^{T-1} u_t^T R u_t
  where: Q = Q^T >= 0 penalizes state error and R = R^T > 0 penalizes control effort
  Initial state: x_1
Calculate:
  U = (u_1, ..., u_{T-1}) that minimizes J(X,U)

Important problem! How do we solve it?
One solution: least squares

Because the dynamics are linear, the whole state trajectory is a linear function of the controls and the initial state:
  X = G x_1 + H U
where G stacks the powers of A and H is block lower-triangular with blocks of the form A^i B.

Substitute X into J:
  J(U) = (G x_1 + H U)^T Qbar (G x_1 + H U) + U^T Rbar U
where Qbar and Rbar are block-diagonal matrices built from Q and R.

Minimize by setting dJ/dU = 0:
  2 H^T Qbar (G x_1 + H U) + 2 Rbar U = 0

Solve for U:
  U = -(H^T Qbar H + Rbar)^{-1} H^T Qbar G x_1
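A minimal numpy sketch of this batch solution, assuming the stacking convention above with X = (x_1, ..., x_T) and U = (u_1, ..., u_{T-1}); the function name batch_lqr and the looping details are my own, not from the slides:

```python
import numpy as np

def batch_lqr(A, B, Q, R, x1, T):
    """Open-loop controls u_1..u_{T-1} minimizing J, via one least-squares solve."""
    n, m = B.shape
    # X = G x1 + H U, with X = (x_1, ..., x_T) stacked into one long vector.
    G = np.vstack([np.linalg.matrix_power(A, t) for t in range(T)])
    H = np.zeros((n * T, m * (T - 1)))
    for r in range(1, T):                # block row r holds x_{r+1}
        for c in range(r):               # block column c holds u_{c+1}
            H[r * n:(r + 1) * n, c * m:(c + 1) * m] = \
                np.linalg.matrix_power(A, r - 1 - c) @ B
    Qbar = np.kron(np.eye(T), Q)         # block-diagonal state cost
    Rbar = np.kron(np.eye(T - 1), R)     # block-diagonal control cost
    # dJ/dU = 0  =>  (H^T Qbar H + Rbar) U = -H^T Qbar G x1
    U = np.linalg.solve(H.T @ Qbar @ H + Rbar, -H.T @ Qbar @ G @ x1)
    return U.reshape(T - 1, m)
```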
What can this do?

Solve for an optimal trajectory: start at a given state and end at the goal at time t = T.

[Figure: an optimal trajectory from the start state to the end state at time T. Image: van den Berg, 2015]
This is cool, but:
  - it only works for finite horizon problems
  - it doesn't account for noise
  - it requires you to invert a big matrix
Bellman solution

Cost-to-go function V(x): the cost that we have yet to experience if we travel along the minimum cost path from x. Given the cost-to-go function, you can calculate the optimal path/policy.

Example: on a grid, the number in each cell is the number of steps to go before reaching the goal state; the optimal policy simply steps to the neighboring cell with the smallest number.
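A minimal sketch of that grid example in Python: compute steps-to-go with a breadth-first search outward from the goal. The grid size, goal cell, and query cell are made up for illustration:

```python
from collections import deque

def steps_to_go(rows, cols, goal, obstacles=frozenset()):
    """Breadth-first search outward from the goal: cell -> number of steps to go."""
    V = {goal: 0}
    frontier = deque([goal])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and (nr, nc) not in obstacles and (nr, nc) not in V:
                V[(nr, nc)] = V[(r, c)] + 1   # one more step than the parent cell
                frontier.append((nr, nc))
    return V

V = steps_to_go(4, 4, goal=(0, 3))
# The optimal policy: from any cell, step to the neighbor with the smallest cost-to-go.
print(V[(3, 0)])   # 6 steps to go from the opposite corner of a 4x4 grid
```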
Bellman optimality principle:

  V_t(x) = min_u [ x^T Q x + u^T R u + V_{t+1}(A x + B u) ]

where V_t(x) is the cost-to-go from state x at time t, the term x^T Q x + u^T R u is the cost incurred on this time step, and V_{t+1}(A x + B u) is the cost-to-go from state (Ax + Bu) at time t+1, i.e. the cost incurred after this time step.
For the sake of argument, suppose that the cost-to-go is always a quadratic function like this:

  V_{t+1}(x) = x^T P_{t+1} x

where: P_{t+1} is a symmetric, positive semidefinite matrix.

Then:

  V_t(x) = min_u [ x^T Q x + u^T R u + (A x + B u)^T P_{t+1} (A x + B u) ]

How do we minimize the term inside the brackets? Take its derivative with respect to u and set it to zero.
Setting the derivative with respect to u to zero,

  2 R u + 2 B^T P_{t+1} (A x + B u) = 0

gives

  u = -(R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A x

This is the optimal control as a function of state, but it depends on P_{t+1}. How do we solve for P_{t+1}?
Substitute u back into V_t(x). The result is again quadratic, V_t(x) = x^T P_t x, with

  P_t = Q + A^T P_{t+1} A - A^T P_{t+1} B (R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A

This is the Dynamic Riccati Equation. Starting from the terminal condition P_T = Q, it lets us compute P_{T-1}, P_{T-2}, ..., P_1 by sweeping backward in time.
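A minimal numpy sketch of that backward sweep; the function names (riccati_step, lqr_backward) and the use of P_T = Q as the terminal condition are my choices, consistent with the cost function above:

```python
import numpy as np

def riccati_step(A, B, Q, R, P_next):
    """One application of the dynamic Riccati equation: P_t and gain K_t from P_{t+1}."""
    K = np.linalg.solve(R + B.T @ P_next @ B, B.T @ P_next @ A)   # u_t = -K x_t
    P = Q + A.T @ P_next @ A - A.T @ P_next @ B @ K
    return P, K

def lqr_backward(A, B, Q, R, T):
    """Sweep backward from P_T = Q; return P_1..P_T and the gains K_1..K_{T-1}."""
    P = Q.copy()                 # all inputs are numpy arrays
    Ps, Ks = [P], []
    for _ in range(T - 1):
        P, K = riccati_step(A, B, Q, R, P)
        Ps.append(P)
        Ks.append(K)
    return Ps[::-1], Ks[::-1]
```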
Example: planar double integrator

A puck sliding on an air hockey table: the state is the puck's position and velocity in the plane, u is the applied force, the mass is m = 1, and the damping coefficient is b = 0.1.

[Figure: air hockey table showing the puck's initial position, its initial velocity, and the goal position at the origin.]

Build the LQR controller for: a given initial state, time horizon T = 100, and quadratic cost function (Q, R).
Step 1:
Calculate P backward from T using the dynamic Riccati equation: P_100, P_99, P_98, ..., P_1.

Step 2:
Calculate u starting at t=1 and going forward to t=T-1, using u_t = -(R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A x_t (see the sketch below).
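A sketch of both steps on the planar double integrator. The Euler discretization, time step dt, cost weights Q and R, and the initial state are my own choices for illustration; the slides fix only m = 1, b = 0.1, and T = 100:

```python
import numpy as np

m, b, dt, T = 1.0, 0.1, 0.1, 100
# State is (x, y, vx, vy); control u = (u_x, u_y) is the applied force.
A = np.block([[np.eye(2),         dt * np.eye(2)],
              [np.zeros((2, 2)),  (1 - b * dt / m) * np.eye(2)]])
B = np.block([[np.zeros((2, 2))],
              [(dt / m) * np.eye(2)]])
Q = np.diag([1.0, 1.0, 0.1, 0.1])   # penalize distance from the goal (the origin)
R = 0.01 * np.eye(2)                # penalize control effort

# Step 1: calculate P backward from P_100 = Q, storing the feedback gains.
P, Ks = Q.copy(), []
for _ in range(T - 1):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ A - A.T @ P @ B @ K
    Ks.append(K)
Ks = Ks[::-1]                        # now ordered K_1, ..., K_{T-1}

# Step 2: calculate u going forward from t = 1 to t = T-1.
x = np.array([1.0, 0.5, 0.0, -0.2])  # made-up initial position and velocity
for K in Ks:
    u = -K @ x                        # u_t = -K_t x_t
    x = A @ x + B @ u
print(np.round(x, 3))                 # the puck should end up near the origin
```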
[Figures: the resulting puck trajectory on the table, converging to the origin, and the control inputs u_x, u_y plotted against time t.]
The infinite horizon case

So far we have optimized cost over a fixed horizon, T. This is optimal if you only have T time steps to do the job. But what if time doesn't end in T steps?

One idea: at each time step, assume that you always have T more time steps to go. This is called a receding horizon controller.

[Figure: elements of the P matrix plotted against time step. Notice that the elements of P stop changing (much) more than 20 or 30 time steps prior to the horizon; P is converging toward a fixed matrix as we sweep backward. What does this imply about the infinite horizon case?]
The infinite horizon case

We can solve for the infinite horizon P exactly, as the fixed point of the dynamic Riccati equation:

  P = Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A

This is the Discrete Time Algebraic Riccati Equation. The corresponding constant gain gives a stationary policy, u = -(R + B^T P B)^{-1} B^T P A x.
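A minimal sketch of computing the infinite-horizon P by iterating the Riccati update until its elements stop changing, matching the convergence seen in the plot above (scipy.linalg.solve_discrete_are computes the same fixed point directly). The function name and tolerance are my own:

```python
import numpy as np

def solve_dare_by_iteration(A, B, Q, R, tol=1e-9, max_iter=10000):
    """Iterate P <- Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A to a fixed point."""
    P = Q.copy()
    for _ in range(max_iter):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_new = Q + A.T @ P @ A - A.T @ P @ B @ K
        if np.max(np.abs(P_new - P)) < tol:    # elements of P have stopped changing
            return P_new, K                    # steady-state P and constant gain K
        P = P_new
    raise RuntimeError("Riccati iteration did not converge")
```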
So, what are we optimizing for now?

Given:
  System: x_{t+1} = A x_t + B u_t
  Cost function: J(X,U) = sum_{t=1}^{infinity} x_t^T Q x_t + u_t^T R u_t
  where: Q = Q^T >= 0 and R = R^T > 0
  Initial state: x_1
Calculate:
  U that minimizes J(X,U)
Controllability

A system is controllable if it is possible to reach any goal state from any other start state in a finite period of time.

When is a linear system controllable? It's a property of the system dynamics (A, B), not of the cost function.

Remember the least-squares solution, X = G x_1 + H U? The blocks of H have the form A^i B, so the states we can reach are determined by the controllability matrix

  C = [ B  AB  A^2 B  ...  A^{n-1} B ]

What property must this matrix have? It must be full rank, i.e. its rank must equal the dimension n of the state space.
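A minimal numpy sketch of this rank test; the function name is mine, and the example system reuses the double-integrator model assumed earlier:

```python
import numpy as np

def is_controllable(A, B):
    """Check rank([B, AB, A^2 B, ..., A^{n-1} B]) == n."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])          # next block is A times the previous one
    C = np.hstack(blocks)                      # the n x (n*m) controllability matrix
    return np.linalg.matrix_rank(C) == n

# Example: the planar double integrator from earlier is controllable.
dt, b, m = 0.1, 0.1, 1.0
A = np.block([[np.eye(2), dt * np.eye(2)],
              [np.zeros((2, 2)), (1 - b * dt / m) * np.eye(2)]])
B = np.block([[np.zeros((2, 2))], [(dt / m) * np.eye(2)]])
print(is_controllable(A, B))   # True
```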