Introduction
Last week we studied continuous time models.
We defined the Wiener process.
We described how to understand stochastic differential equations as limits of difference equations.
I claimed that the benefit of continuous time models is the same as the benefit of using ODEs instead of difference equations: you can do calculus.
The central theme this week is Ito's Lemma, which is the most important result in stochastic calculus. Ito's Lemma is a stochastic version of the chain rule from classical calculus.
Using Ito's Lemma we will be able to solve some stochastic differential equations and write their solutions in closed form.
In practice solving ODEs is challenging and solving SDEs is even harder, so there are only a handful of SDEs we will be able to solve in closed form.
You will have noticed that differentiation is easy, but integration is hard. You can probably differentiate any function I give you (that is differentiable), but you can't always compute an integral explicitly.
This isn't because you are bad at integration; it is because there are only a certain number of tricks available for integration (substitution, integration by parts, partial fractions, contour integrals) and they cannot always be applied. There is a theory called differential Galois theory which proves that you can't solve certain integrals using standard functions, just as classical Galois theory shows you can't solve the general quintic equation using only n-th roots.
There is really only one trick for completely solving SDEs and that is Ito's Lemma.
Because Ito's Lemma is so important I want you to be able to visualise why it is true. So we will plot lots of pictures.
In fact, this week's course is based on a paper I wrote, "Coordinate Free Stochastic Differential Equations as Jets", which showed how you can draw good pictures of SDEs. I developed these pictures so that I could understand Ito's Lemma intuitively.
ODEs are easy to visualise as vector fields. But vector fields on $\mathbb{R}^2$ are much more fun to visualise than vector fields on $\mathbb{R}$. The same applies to SDEs. For this reason we will also introduce higher dimensional SDEs this week.
The problem with higher dimensional problems is that calculations quickly get long, boring and fiddly. You will have experienced this when doing vector calculus: calculating the curl of a vector field takes ages by hand!
To save ourselves some boring computations, we will use sympy, a Python package for symbolic mathematics which will do all the dull calculations for us.
Learning Outcomes
In summary, this week you will be able to:
Solve SDEs using Ito's Lemma
Simulate higher dimensional SDEs
Visualise an SDE as a field of curves
Ito (1915-2008)
Developed the theory of SDEs using the Ito integral
Key paper: Stochastic Integral, Proceedings of the Imperial Academy (1944)
Ito wrote his name with a hat accent on the o to indicate a longer vowel. This is tedious to do in a Jupyter notebook, so in this course I have not done so. When preparing LaTeX documents write It\^o.
Higher Dimensional SDEs
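We considered the SDE
$$dX_t = a(X_t,t)\,dt + b(X_t,t)\,dW_t$$
where $W_t$ is a Wiener process and $a : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ and $b : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ are functions.
We interpreted the SDE as meaning that $X_t$ is the limit of the Euler-Maruyama scheme
$$X_{t+\delta t} = X_t + a(X_t,t)\,\delta t + b(X_t,t)\,\delta W_t.$$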
The standard definition is different and uses the Ito-integral.
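The Euler-Maruyama interpretation of a 1-dimensional SDE can be sketched in a few lines of numpy. This is a minimal sketch rather than the course's own code: the helper name `euler_maruyama_1d`, its arguments and the seeded generator are assumptions made for illustration.

```python
import numpy as np

def euler_maruyama_1d(a, b, x0, T, n_steps, rng):
    """Approximate dX = a(X,t) dt + b(X,t) dW on [0, T] (hypothetical helper)."""
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    X = np.zeros(n_steps + 1)
    X[0] = x0
    # Wiener increments: independent N(0, dt) random variables
    dW = np.sqrt(dt) * rng.standard_normal(n_steps)
    for i in range(n_steps):
        X[i + 1] = X[i] + a(X[i], t[i]) * dt + b(X[i], t[i]) * dW[i]
    return t, X, dW

rng = np.random.default_rng(0)
# With a = 0 and b = 1 the scheme just accumulates the Wiener increments
t, X, dW = euler_maruyama_1d(lambda x, s: 0.0, lambda x, s: 1.0, 0.0, 1.0, 1000, rng)
```

With zero drift and unit diffusion the simulated path is simply the running sum of the increments, which is a convenient sanity check on the loop.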
Definition: A d-dimensional process $W_t \in \mathbb{R}^d$ is called a d-dimensional Brownian motion with drift $\mu$ and covariance matrix $\Sigma$ if
$W$ has independent increments.
The increments $W_{t+u} - W_t$ are normally distributed with mean $\mu u$ and covariance matrix $\Sigma u$.
$W_t$ is almost surely continuous in $t$.
Here $\Sigma$ is a positive definite $d \times d$ symmetric matrix.
Definition: A d-dimensional Wiener process is a d-dimensional Brownian motion with mean 0 and covariance matrix given by the identity.
Note that the definition of Brownian motion isn't as standardized as that of a Wiener process. This is the definition used in this course.
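To make the definition concrete, the sketch below draws many independent increments over a time interval of length $u$ and compares their sample mean and covariance with $\mu u$ and $\Sigma u$. The particular drift vector, covariance matrix, interval length and sample size are illustrative assumptions, not values from the course.

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([0.5, -1.0])                  # illustrative drift vector
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])  # illustrative covariance matrix
u = 0.25                                    # length of the time increment

# Draw many independent increments, each distributed as N(mu*u, Sigma*u)
increments = rng.multivariate_normal(mu * u, Sigma * u, size=200_000)

sample_mean = increments.mean(axis=0)
sample_cov = np.cov(increments, rowvar=False)
```

Both the mean and the covariance of an increment scale linearly with the length of the interval, which is the property the simulation schemes later in these notes rely on.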
Lemma: Let $W^1_t, W^2_t, \ldots, W^d_t$ be independent 1-d Wiener processes, then $W_t := (W^1_t, \ldots, W^d_t)$ is a d-dimensional Wiener process with covariance matrix $1_d$.
Lemma: Let $L$ be the Cholesky decomposition of a covariance matrix $\Sigma$ and let $W_t$ be a d-dimensional Wiener process, then $V_t := L W_t$ is a d-dimensional Brownian motion with covariance matrix $\Sigma$ and drift 0.
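The Cholesky lemma can be checked numerically. The sketch below samples $W_t$ at a single time $t$, applies $L$, and compares the sample covariance of $V_t = L W_t$ with $t\Sigma$. The matrix, the time and the sample size are assumptions chosen for illustration, and this is a numerical sanity check rather than a proof.

```python
import numpy as np

rng = np.random.default_rng(2)
Sigma = np.array([[1.0, 0.6], [0.6, 2.0]])  # illustrative covariance matrix
L = np.linalg.cholesky(Sigma)               # lower triangular factor with L @ L.T equal to Sigma

t = 1.5
# Samples of W_t for a 2-dimensional Wiener process: independent N(0, t) components
W_t = np.sqrt(t) * rng.standard_normal((2, 200_000))
V_t = L @ W_t

cov_V = np.cov(V_t)   # rows are components, columns are samples
```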
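Lemma: To simulate a d-dimensional Brownian motion $V$ with drift 0 and covariance matrix $\Sigma$ on a discrete grid $\{0, \delta t, 2\delta t, \ldots, N\delta t = T\}$ we may use the difference equation
$$V_{t+\delta t} = V_t + \sqrt{\delta t}\, L\, \epsilon_t$$
where $L$ is the Cholesky decomposition of $\Sigma$ and $\epsilon_t$ is a d-dimensional vector of independent standard normal random variables.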
import numpy as np

def simulate_brownian(T, n_steps, cov=np.identity(1)):
    # Simulate a Brownian motion with drift 0 and covariance matrix cov on [0, T]
    L = np.linalg.cholesky(cov)
    dt = T/n_steps
    d = cov.shape[0]
    V = np.zeros([d, n_steps+1])
    eps = np.random.randn(d, n_steps)
    for i in range(0, n_steps):
        V[:,i+1] = V[:,i] + np.sqrt(dt)*L @ eps[:,i]
    return V
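import matplotlib.pyplot as plt

rho = 0.8  # example correlation; the value used in the original notebook is not shown
P = np.array([[1, rho], [rho, 1]])
W = simulate_brownian(1, 5000, P)
ax = plt.gca()
ax.plot(W[0,:], W[1,:])
ax.set_aspect('equal')
ax.set_xlabel('$x_1$')
ax.set_ylabel('$x_2$');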
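Let $W_t \in \mathbb{R}^d$ be a Brownian motion. Let $a : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^n$ and $B : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^{n \times d}$ be functions. Let $X_0 \in \mathbb{R}^n$ be given. Then we may interpret the solution of the SDE
$$dX_t = a(X_t,t)\,dt + B(X_t,t)\,dW_t$$
as referring to the limit of the discrete time Euler-Maruyama scheme
$$X_{t+\delta t} = X_t + a(X_t,t)\,\delta t + B(X_t,t)(W_{t+\delta t} - W_t).$$
Note that $a$ is $\mathbb{R}^n$-vector valued so $a\,\delta t$ is an n-vector.
$B$ is $(n \times d)$-matrix valued and $(W_{t+\delta t} - W_t)$ is a d-vector so that $B(X_t,t)(W_{t+\delta t} - W_t)$ is an n-vector.
Very often we write higher dimensional SDEs by writing an SDE for each component. Let us write
$$X_t = (U_t, V_t)$$
where $X_t \in \mathbb{R}^2$ and $U_t, V_t \in \mathbb{R}$.
We can write an SDE in vector notation:
$$d\begin{pmatrix} U_t \\ V_t \end{pmatrix} = \begin{pmatrix} V_t & 0 \\ 0 & U_t \end{pmatrix} d\begin{pmatrix} W^1_t \\ W^2_t \end{pmatrix}.$$
It is often more readable to write an equation for each component
$$dU_t = V_t\,dW^1_t, \qquad dV_t = U_t\,dW^2_t.$$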
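The vector Euler-Maruyama scheme for an SDE with drift $a$ and matrix-valued diffusion $B$ can be sketched as follows, using the two-component example with one noise term per component. The helper name `euler_maruyama`, its signature and the way the noise path is generated are assumptions for illustration, not the notebook's implementation.

```python
import numpy as np

def euler_maruyama(a, B, X0, W, times):
    """Euler-Maruyama for dX = a(X,t) dt + B(X,t) dW along a given path W (hypothetical helper).

    W has shape (d, N+1); the returned array has shape (n, N+1).
    """
    n, N = X0.shape[0], len(times) - 1
    X = np.zeros((n, N + 1))
    X[:, 0] = X0
    for i in range(N):
        dt = times[i + 1] - times[i]
        dW = W[:, i + 1] - W[:, i]          # d-vector increment of the noise
        X[:, i + 1] = X[:, i] + a(X[:, i], times[i]) * dt + B(X[:, i], times[i]) @ dW
    return X

# Example: dU = V dW^1, dV = U dW^2, so a = 0 and B = [[V, 0], [0, U]]
a = lambda x, t: np.zeros(2)
B = lambda x, t: np.array([[x[1], 0.0], [0.0, x[0]]])

rng = np.random.default_rng(3)
N, T = 1000, 1.0
times = np.linspace(0.0, T, N + 1)
dW = np.sqrt(T / N) * rng.standard_normal((2, N))
W = np.concatenate([np.zeros((2, 1)), np.cumsum(dW, axis=1)], axis=1)
X = euler_maruyama(a, B, np.array([1.0, 0.0]), W, times)
```

Passing the noise path in as an argument, rather than generating it inside the function, makes it possible to reuse the same path on finer and finer grids.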
You may choose to have fewer noise terms than dimensions for your process, so $d$ may be less than $n$. For example
$$d\begin{pmatrix} U_t \\ V_t \end{pmatrix} = \begin{pmatrix} V_t \\ U_t \end{pmatrix} dW^1_t$$
is a valid SDE. It can be written equivalently as
$$dU_t = V_t\,dW^1_t, \qquad dV_t = U_t\,dW^1_t.$$
Let $\mu \in \mathbb{R}^n$ be a vector and $\Sigma$ be a positive definite symmetric matrix. Given a vector $v$ write $\mathrm{diag}(v)$ for the matrix with the components of $v$ on the diagonal.
Continuous time geometric Brownian motion is the solution of the SDE
$$dS_t = (\mathrm{diag}(S_t)\,\mu)\,dt + \mathrm{diag}(S_t)\,dV_t$$
where $V_t$ is an n-dimensional Brownian motion with covariance matrix $\Sigma$.
You can use this to model stock prices, when it is called the n-dimensional Black-Scholes-Merton model.
We can choose a pseudo-square root, $\sigma$, of $\Sigma$ (so that $\sigma\sigma^\top = \Sigma$) and write the equation in terms of a Wiener process as
$$dS_t = (\mathrm{diag}(S_t)\,\mu)\,dt + (\mathrm{diag}(S_t)\,\sigma)\,dW_t.$$
Since $\mu$ is an n-vector, $\mathrm{diag}(S_t)\,\mu$ is also an n-vector. Since $\sigma$ is an $n \times n$ matrix, $\mathrm{diag}(S_t)\,\sigma$ is also an $n \times n$ matrix.
So the equation
$$dS_t = (\mathrm{diag}(S_t)\,\mu)\,dt + (\mathrm{diag}(S_t)\,\sigma\,dW_t)$$
makes sense in vector notation.
Another way to write this SDE is to use $\odot$ to denote elementwise multiplication.
$$dS_t = (S_t \odot \mu)\,dt + S_t \odot (\sigma\,dW_t)$$
1.1 Example: 1-dimensional geometric Brownian motion
In the 1-dimensional case we get 1-dimensional continuous time geometric Brownian motion
$$dS_t = S_t(\mu_1\,dt + \sigma\,dW^1_t).$$
We will see in a later video this week how to relate this to the discrete time version of geometric Brownian motion we have already seen.
This is the classical continuous time model for stock prices used by Black and Scholes for their famous paper on derivative pricing and by Merton for his famous paper on continuous time investment strategies.
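For the n-dimensional model, the form written with diag and the form written with elementwise multiplication give the same drift and noise terms. The short numpy check below illustrates this; the random values standing in for $S_t$, $\mu$, $\sigma$ and the Wiener increment are purely illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
S = rng.uniform(50.0, 150.0, size=n)       # illustrative stock prices
mu = rng.normal(size=n)                    # illustrative drift vector
sigma = rng.normal(size=(n, n))            # illustrative pseudo-square root
dW = rng.normal(size=n)                    # stands in for a Wiener increment

drift_diag = np.diag(S) @ mu               # diag(S) mu
drift_elementwise = S * mu                 # elementwise product of S and mu
noise_diag = (np.diag(S) @ sigma) @ dW     # (diag(S) sigma) dW
noise_elementwise = S * (sigma @ dW)       # elementwise product of S and (sigma dW)
```

In numerical code the elementwise form is usually preferable, since it avoids building the $n \times n$ diagonal matrix.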
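1.2 Example: independent geometric Brownian motions
In the Black-Scholes-Merton model, write $S_t = (S^1_t, S^2_t)$. Take
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \qquad \sigma = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix}$$
in which case we get two independent 1-dimensional geometric Brownian motions.
$$dS^1_t = S^1_t(\mu_1\,dt + \sigma_1\,dW^1_t)$$
$$dS^2_t = S^2_t(\mu_2\,dt + \sigma_2\,dW^2_t)$$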
If you are given the values of $W_t$ then you can simulate an SDE using the Euler-Maruyama scheme. If you are not given the values, you can simulate the increments yourself:
$$\delta W_t := \sqrt{\delta t} \begin{pmatrix} \epsilon^1_t \\ \vdots \\ \epsilon^d_t \end{pmatrix}$$
where the $\epsilon^i_t$ are independent standard normals.
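Simulating the increments and summing them to obtain a path of the Wiener process can be sketched as below. The helper name `simulate_wiener_increments` and the chosen dimension, horizon and step count are assumptions for illustration.

```python
import numpy as np

def simulate_wiener_increments(d, T, n_steps, rng):
    """Return increments of shape (d, n_steps): sqrt(dt) times standard normals."""
    dt = T / n_steps
    return np.sqrt(dt) * rng.standard_normal((d, n_steps))

rng = np.random.default_rng(5)
d, T, n_steps = 2, 1.0, 10_000
dW = simulate_wiener_increments(d, T, n_steps, rng)
# Cumulative sums of the increments give the path, starting from W_0 = 0
W = np.concatenate([np.zeros((d, 1)), np.cumsum(dW, axis=1)], axis=1)
```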
1.3 Example: simulating the Black-Scholes-Merton model
To simulate the Black-Scholes-Merton model we may write
$$S_{t+\delta t} = S_t + S_t \odot \left( \mu\,\delta t + \sqrt{\delta t}\,\sigma \begin{pmatrix} \epsilon^1_t \\ \vdots \\ \epsilon^d_t \end{pmatrix} \right).$$
Recall that the multiplication between $\sigma$ and the vector of $\epsilon$ values is matrix multiplication. This is written @ in numpy. The elementwise multiplication $\odot$ is written as * in numpy.
We can approximately simulate stocks in the Black-Scholes-Merton model by using the Euler-Maruyama scheme with a large number of steps. One of this week's group exercises asks you to find a better way of simulating stocks in this model.
import numpy as np
import matplotlib.pyplot as plt

def simulate_bsm_euler_maruyama(T, S0, mu, sigma, n_steps):
    # Euler-Maruyama approximation of the n-dimensional Black-Scholes-Merton model
    dt = T/n_steps
    n = S0.shape[0]
    S = np.zeros([n, n_steps+1])
    S[:,0] = S0
    epsilon = np.random.randn(n, n_steps)
    for i in range(0, n_steps):
        S[:,i+1] = S[:,i] + \
            S[:,i] * (mu * dt + np.sqrt(dt) * sigma @ epsilon[:,i])
    return S

T = 1  # time horizon; assumed here, as its value is not shown in this section
S0 = np.array([100, 150])
mu = np.array([0.03, 0.05])
sigma = np.array([[0.2, 0], [0.05, 0.05]])
n_steps = 1000
S = simulate_bsm_euler_maruyama(T, S0, mu, sigma, n_steps)
t = np.linspace(0, T, n_steps+1)
ax = plt.gca()
ax.plot(t, S[0,:]);
ax.plot(t, S[1,:]);
ax.set_title('Two correlated stock prices in the Black-Scholes-Merton model');
2 Exercises
2.1 Exercise
Prove the lemma below.
Lemma: Let $W^1_t, W^2_t, \ldots, W^d_t$ be independent 1-d Wiener processes, then $W_t := (W^1_t, \ldots, W^d_t)$ is a d-dimensional Wiener process with covariance matrix $1_d$.
2.2 Exercise
Prove the lemma below.
Lemma: Let $L$ be the Cholesky decomposition of a covariance matrix $\Sigma$ and let $W_t$ be a d-dimensional Wiener process, then $V_t := L W_t$ is a d-dimensional Brownian motion with covariance matrix $\Sigma$ and drift 0.
2.3 Exercise
Prove the lemma below.
Lemma: To simulate a d-dimensional Brownian motion $V$ with drift 0 and covariance matrix $\Sigma$ on a discrete grid $\{0, \delta t, 2\delta t, \ldots, N\delta t = T\}$ we may use the difference equation
$$V_{t+\delta t} = V_t + \sqrt{\delta t}\, L\, \epsilon_t$$
where $\epsilon_t$ is a d-dimensional vector of independent standard normal random variables.
2.4 Exercise
Simulate 1000 samples of the Euler-Maruyama approximation of a geometric Brownian motion with the parameter values
$$\mu = \begin{pmatrix} 0.03 \\ 0.05 \end{pmatrix}, \qquad \sigma = \begin{pmatrix} 0.1 & 0 \\ 0.05 & 0.2 \end{pmatrix}, \qquad S_0 = \begin{pmatrix} 100 \\ 150 \end{pmatrix}$$
over a time period of 10 years using 100 steps in the approximation. Draw a scatter plot of $S^1_{10}$ against $S^2_{10}$. Repeat your simulation with 10000 samples and estimate the covariance of $S^1$ and $S^2$, storing your result in a variable cov_est.
2.5 Exercise
Simulate the deterministic process with initial condition $(U_0, V_0) = (1, 0)$
$$dU_t = V_t\,dt$$
$$dV_t = U_t\,dt$$
for a time period of 2. Perform the simulation with 1000 steps and store the resulting points in two vectors U and V. Plot the values of U against the values of V. What shape do you get as the step size tends to 0?
2.6 Exercise
Approximate the stochastic process
$$dU_t = V_t\,dW^1_t, \qquad dV_t = U_t\,dW^1_t$$
with initial condition $(U_0, V_0) = (1, 0)$, where $W^1_t$ is a 1-d Wiener process, using the Euler-Maruyama scheme. You should do this by writing a function simulate_process which takes a vector containing the values of W and a vector of the associated times. Plot the values of U against V for a W simulated using Wiener's construction. See what happens as the grid size is refined.
3 Visualising ODEs
You can visualise an Ordinary Differential Equation (ODE) as a vector field.
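As an example of an ODE in derivative notation consider
$$\frac{d}{dt}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y_t \\ x_t \end{pmatrix}.$$
We can also write it in differential notation
$$d\begin{pmatrix} x \\ y \end{pmatrix}_t = \begin{pmatrix} y_t \\ x_t \end{pmatrix} dt.$$
Or we can write it as two equations rather than a vector equation,
$$dx_t = y_t\,dt, \qquad dy_t = x_t\,dt.$$
The subscript $t$ just means evaluated at time $t$. This ODE associates a vector with every point $(x, y) \in \mathbb{R}^2$.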
You can see the code to draw the vector field in the notebook; it uses the arrow function of matplotlib.
We can approximate the ODE with the Euler scheme, which means following the current vector for a small time $\delta t$.
$$\delta x_t = y_t\,\delta t, \qquad \delta y_t = x_t\,\delta t$$
In the notebook you can see the code used to find approximate solutions and then plot the resulting trajectory. However, in this talk I want to focus on the pictures, not how they were generated.
In the limit as $\delta t \to 0$ this converges to the solution of the ODE.
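The Euler scheme for a planar ODE can be sketched as follows, using the example equations as written above, $dx_t = y_t\,dt$ and $dy_t = x_t\,dt$. The helper name `euler_ode`, the initial point $(1, 0)$ and the step counts are assumptions for illustration. For this system and initial point the exact solution is $(\cosh t, \sinh t)$, which gives something to compare the approximation against.

```python
import numpy as np

def euler_ode(field, x0, T, n_steps):
    """Follow the vector field for n_steps small time steps (hypothetical helper)."""
    dt = T / n_steps
    path = np.zeros((n_steps + 1, len(x0)))
    path[0] = x0
    for i in range(n_steps):
        path[i + 1] = path[i] + field(path[i]) * dt
    return path

# Vector field of the example ODE: dx = y dt, dy = x dt
field = lambda p: np.array([p[1], p[0]])

coarse = euler_ode(field, np.array([1.0, 0.0]), 1.0, 10)
fine = euler_ode(field, np.array([1.0, 0.0]), 1.0, 10_000)
exact = np.array([np.cosh(1.0), np.sinh(1.0)])   # exact solution at t = 1

err_coarse = np.max(np.abs(coarse[-1] - exact))
err_fine = np.max(np.abs(fine[-1] - exact))
```

Comparing `err_coarse` with `err_fine` shows the approximation improving as the step size shrinks.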
In Python, you can read an image as an array of numbers using cv2.imread and then perform all kinds of transformations on it, then display it using matplotlib.pyplot.imshow. For example, I deformed these pictures of Ito by shifting some rows of the array to the right following a cosine pattern.
An important reason why vector fields are a good way of visualising ODEs is that they transform correctly under deformations. If we deform the picture of an ODE, the vector field transforms to a new vector field, and this gives us a new ODE. Will the transformed solutions be solutions of this new ODE? Yes they will.
4 Visualising SDEs
So far we've considered SDEs given by the limit of the discrete time scheme
$$\delta X_t = a(X_t,t)\,\delta t + b(X_t,t)\,\delta W_t$$
But what happens if we consider more general curved schemes? We could consider
$$\delta X_t = a_1(X_t,t)\,\delta t + a_2(X_t,t)(\delta t)^2 + \cdots + b_1(X_t,t)\,\delta W_t + b_2(X_t,t)(\delta W_t)^2 + \cdots$$
These schemes are curved because they are not linear in $\delta t$ and $\delta W_t$.
It turns out that thinking about such schemes is the key to visualising SDEs. We will focus on the limit of stochastic difference equations of the form
$$X_{t+\delta t} = \gamma(X_t, \delta W_t)$$
where $\gamma(x, s)$ is a smooth function satisfying $\gamma(x, 0) = x$. The last condition ensures that if $\delta W_t = 0$ then $\delta X_t = 0$ too. So $X_t$ only moves when the Brownian motion moves.
Notice that although this is very general as a scheme involving $\delta W_t$, there is no $\delta t$ term at all.
The advantage of curved schemes is that they are easy to visualise. Associated to the SDE for $X_t \in \mathbb{R}^n$
$$X_{t+\delta t} = \gamma(X_t, \delta W_t)$$
we may define a curve $\gamma_x : \mathbb{R} \to \mathbb{R}^n$ at each point $x \in \mathbb{R}^n$ by
$$\gamma_x(s) = \gamma(x, s)$$
We will write points in $\mathbb{R}^2$ as row vectors $(x, y)$. We can then define $\gamma : \mathbb{R}^2 \times \mathbb{R} \to \mathbb{R}^2$ by
$$\gamma((x, y), s) = (x, y) + (y, x)s + 3(x, y)s^2$$
We can now generate a noise process $W$ and solve the difference equation. We refine $W$ using Wiener's construction to ensure convergence as $\delta t \to 0$.
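Iterating a curved scheme is a matter of repeatedly following the curve through the current point for a "time" equal to the latest noise increment. The sketch below does this for the example curve as written above. It is a simplified illustration under stated assumptions: the noise increments are drawn directly on a fixed grid rather than refined with Wiener's construction, and the helper names `gamma` and `simulate_curved_scheme` are hypothetical.

```python
import numpy as np

def gamma(p, s):
    """Curve through p = (x, y): (x, y) + (y, x) s + 3 (x, y) s^2."""
    x, y = p
    return np.array([x, y]) + np.array([y, x]) * s + 3.0 * np.array([x, y]) * s**2

def simulate_curved_scheme(gamma, p0, dW):
    """Iterate X_{t+dt} = gamma(X_t, dW_t) along the given increments."""
    path = np.zeros((len(dW) + 1, len(p0)))
    path[0] = p0
    for i, s in enumerate(dW):
        path[i + 1] = gamma(path[i], s)
    return path

rng = np.random.default_rng(6)
n_steps, T = 1000, 1.0
dW = np.sqrt(T / n_steps) * rng.standard_normal(n_steps)
path = simulate_curved_scheme(gamma, np.array([1.0, 0.0]), dW)
```

Note that `gamma(p, 0.0)` returns `p` itself, which is the condition that the process only moves when the noise moves.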
If we deform the plot just as we deformed the vector field, solutions map to solutions.
A vector field is a good way of drawing an ODE because if you deform the picture to obtain a new vector field, and hence a new ODE, the solutions are just a deformed version of the solutions to the original ODE.
The same holds if we transform a picture of an SDE drawn as a field of curves. When we deform the picture, we get a new field of curves, but the solutions will be just a deformed version of the solutions to the original SDE.
This result is trivial because we define the solution to the SDE in terms of following the curves, so of course if we apply a transform to the curves, we must apply the same transform to the solutions.
In fact, you can think of a vector field as a field of curves where we've just drawn the curves to first order (i.e. as their tangent vectors).
The difference between an ODE and an SDE is that for an ODE you only need to know the curves up to first order to solve the equation; for an SDE you need to know the curves up to second order. We will sketch the reason for this shortly.
For an SDE, the curvature of each $\gamma_x$ matters. For an ODE curvature is unimportant so a vector field is all you need.
We say that two curves have the same n-jet if their polynomial expansions agree to order n. SDEs are given by 2-jets of curves. ODEs are given by 1-jets of curves.
4.1 Curved schemes and linear schemes
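The relationship between curved schemes without a $\delta t$ term and linear schemes with a $\delta t$ term is very interesting and important. We will see that $(\delta W_t)^2$ terms and $\delta t$ terms are essentially equivalent.
We can take Taylor series to write this in terms of powers of $\delta W_t$.
$$X_{t+\delta t} = \gamma(X_t, \delta W_t) = X_t + \gamma'(X_t)\,\delta W_t + \tfrac{1}{2}\gamma''(X_t)(\delta W_t)^2 + \cdots$$
Here we write $\gamma'(x)$ and $\gamma''(x)$ to denote partial derivatives of $\gamma(x, s)$ with respect to $s$ taken when $s = 0$:
$$\gamma'(X) := \frac{\partial \gamma}{\partial s}(X, 0)$$
$$\gamma''(X) := \frac{\partial^2 \gamma}{\partial s^2}(X, 0)$$
Lemma: The limit of the curved scheme
$$X_{t+\delta t} = X_t + (\delta W_t)^\alpha$$
as $\delta t \to 0$ with $X_0 = 0$ is
$$X_t = \begin{cases} W_t & \alpha = 1 \\ t & \alpha = 2 \\ 0 & \alpha \geq 3 \end{cases}$$
Proof: The case $\alpha = 1$ is obvious. We may write
$$\delta W_t = \sqrt{\delta t}\,\epsilon_t.$$
Taking $n$ steps up to time $t$, $\delta t = \frac{t}{n}$. So we can simulate $X_t$ as
$$X_t = \sum_{i=1}^n \left( \sqrt{\tfrac{t}{n}}\,\epsilon_i \right)^\alpha$$
where the $\epsilon_i$ are independent standard normals. We did this as an exercise in week 1!
We can compute the mean of $X_t$
$$E(X_t) = \left( \frac{t}{n} \right)^{\alpha/2} n\,E(\epsilon_i^\alpha) \to \begin{cases} t & \alpha = 2 \\ 0 & \alpha \geq 3 \end{cases}$$
And the variance
$$\mathrm{Var}(X_t) = \left( \frac{t}{n} \right)^{\alpha} n\,\mathrm{Var}(\epsilon_i^\alpha) \to 0$$
Since the variance is 0 in the limit, $X_t$ is almost surely equal to its mean.
We have shown the equivalence of the following schemes as $\delta t \to 0$
$X_{t+\delta t} = X_t + (\delta W_t)^2$ is equivalent to $\delta X_t = \delta t$
$X_{t+\delta t} = X_t + (\delta W_t)^3$ is equivalent to $\delta X_t = 0$
This relates these curved schemes to the classical, linear, Euler-Maruyama scheme. We are saying that two schemes are equivalent if they have the same solutions as $\delta t \to 0$.
As our proof of the Lemma indicates, the scaling of Brownian motion ensures that higher powers of $\delta W_t$ don't have any effect on the scheme and the effect of the $(\delta W_t)^2$ term is deterministic and equivalent to a $\delta t$ term.
One can prove in general that these schemes are equivalent:
$$X_{t+\delta t} = \gamma(X_t, \delta W_t)$$
and
$$\delta X_t = \gamma'(X_t)\,\delta W_t + \tfrac{1}{2}\gamma''(X_t)(\delta W_t)^2.$$
This is equivalent to the linear scheme
$$\delta X_t = \tfrac{1}{2}\gamma''(X_t)\,\delta t + \gamma'(X_t)\,\delta W_t$$
Writing this the other way round, the linear scheme
$$\delta X_t = a(X_t)\,\delta t + b(X_t)\,\delta W_t$$
is equivalent to the curved scheme
$$\delta X_t = b(X_t)\,\delta W_t + a(X_t)(\delta W_t)^2$$
using the curves
$$\gamma(x, s) = x + b(x)s + a(x)s^2.$$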
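The lemma on powers of the increments can be illustrated numerically by summing powers of simulated Wiener increments over a fine grid. This is a sanity check under assumed values of $t$ and $n$, not a substitute for the proof.

```python
import numpy as np

rng = np.random.default_rng(7)
t, n = 2.0, 1_000_000
dW = np.sqrt(t / n) * rng.standard_normal(n)   # increments of W on [0, t]

X_alpha1 = np.sum(dW)        # alpha = 1: equals W_t, which stays random
X_alpha2 = np.sum(dW**2)     # alpha = 2: should be close to t
X_alpha3 = np.sum(dW**3)     # alpha = 3: should be close to 0
```

Rerunning with a different seed changes `X_alpha1` substantially but leaves `X_alpha2` and `X_alpha3` essentially unchanged, which is the sense in which the squared term behaves deterministically.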
5 Ito's Lemma
Ito's Lemma: Let $X_t \in \mathbb{R}$ solve the SDE
$$dX_t = a(X_t,t)\,dt + b(X_t,t)\,dW_t$$
with initial condition $X_0$. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is smooth, then $(f(X_t), X_t) \in \mathbb{R}^2$ solves
$$d(f(X_t)) = \left( f'(X_t)\,a(X_t,t) + \tfrac{1}{2} f''(X_t)\,b(X_t,t)^2 \right) dt + f'(X_t)\,b(X_t,t)\,dW_t \qquad (1)$$
$$dX_t = a(X_t,t)\,dt + b(X_t,t)\,dW_t$$
with initial condition $(f(X_0), X_0)$.
The last equation isn't really interesting as it's the one we started with. Since we are understanding SDEs as numerical schemes, I'm listing both equations as you need both to actually simulate $f(X_t)$.
The solution to the SDE
$$dX_t = dW_t$$
with initial condition $X_0 = W_0$ is $X_t = W_t$. Taking $f(x) = x^2$ we compute $f'(x) = 2x$, $f''(x) = 2$. Hence by Ito's Lemma we may write
$$d(f(X))_t = dt + 2X_t\,dW_t,$$
that is,
$$d(W^2)_t = dt + 2W_t\,dW_t.$$
Equivalently, the solution to the SDE
$$dY_t = dt + 2W_t\,dW_t$$
is $Y_t = W_t^2$. We've solved an SDE!
5.1 Example
Take $f(x) = \sin(x)$. So
$$d(\sin(W))_t = -\tfrac{1}{2}\sin(W_t)\,dt + \cos(W_t)\,dW_t.$$
Hence $X_t = W_t$, $Y_t = \sin(W_t)$ must be the solution to the 2-dimensional SDEs
$$dX_t = dW_t$$
$$dY_t = -\tfrac{1}{2}\sin(X_t)\,dt + \cos(X_t)\,dW_t$$
with initial condition $(X_0 = 0, Y_0 = 0)$. We've solved another SDE!
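The coefficients appearing in Ito's Lemma can be computed symbolically with sympy, the package mentioned in the introduction. The sketch below is a minimal illustration: the helper name `ito_coefficients` is an assumption, and it handles only the 1-dimensional case with coefficients that do not depend explicitly on time.

```python
import sympy as sp

def ito_coefficients(f, a, b, x):
    """Drift and diffusion of f(X_t) when dX = a dt + b dW (hypothetical helper)."""
    drift = sp.diff(f, x) * a + sp.Rational(1, 2) * sp.diff(f, x, 2) * b**2
    diffusion = sp.diff(f, x) * b
    return sp.simplify(drift), sp.simplify(diffusion)

x = sp.symbols('x')
# X_t = W_t corresponds to a = 0, b = 1
drift_sq, diffusion_sq = ito_coefficients(x**2, 0, 1, x)
drift_sin, diffusion_sin = ito_coefficients(sp.sin(x), 0, 1, x)
```

For $f(x) = x^2$ this gives drift 1 and diffusion $2x$; for $f(x) = \sin(x)$ it gives drift $-\tfrac{1}{2}\sin(x)$ and diffusion $\cos(x)$.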
Ito's Lemma is often described as a stochastic version of the chain rule from classical calculus. To justify this we make the following deduction about ODEs using the chain rule.
Let $X_t \in \mathbb{R}$ solve the ODE
$$dX_t = a(X_t,t)\,dt$$
then $(f(X_t), X_t) \in \mathbb{R}^2$ solves the ODEs
$$d(f(X))_t = f'(X_t)\,a(X_t,t)\,dt$$
$$dX_t = a(X_t,t)\,dt.$$
The function $f : \mathbb{R} \to \mathbb{R}$ is a deformation of $\mathbb{R}$.
When transforming ODEs we only have to think about the first derivative of $f$ as ODEs only depend on the 1-jet. This is why you only get $f'$ terms in the ODE chain rule.
When transforming SDEs we must consider second order terms, so we get an $f''$ term.
If you accept the theory I have described about the correspondence between curved schemes and linear schemes we can use this to prove this version of Ito's Lemma.
Proof: Our plan is as follows
We are given an SDE written as the limit of a linear scheme using $\delta t$ and $\delta W_t$.
We will write this as a curved scheme using $\delta W_t$ and $(\delta W_t)^2$.
We will then be able to write down a curved scheme for the pair $(X_t, f(X_t))$.
We will compute derivatives of this to write it as a linear scheme, once again using $\delta t$ and $\delta W_t$.
According to our correspondence, the solutions to the SDE
$$dX_t = a(X_t)\,dt + b(X_t)\,dW_t$$
are given by the limits of the solutions to the difference equation
Xt+t = (Xt,