Chapman–Kolmogorov equation

In mathematics, specifically in the theory of Markovian stochastic processes, the Chapman–Kolmogorov equation (CKE) is an identity relating the joint probability distributions of different sets of coordinates on a stochastic process. The equation was derived independently by the British mathematician Sydney Chapman and the Russian mathematician Andrey Kolmogorov. The CKE is prominently used in recent variational Bayesian methods.

Mathematical description

Suppose that { f_i } is an indexed collection of random variables, that is, a stochastic process. Let

p_{i_{1},\ldots ,i_{n}}(f_{1},\ldots ,f_{n})

be the joint probability density function of the values of the random variables f_1 to f_n. Then, the Chapman–Kolmogorov equation is

p_{i_{1},\ldots ,i_{n-1}}(f_{1},\ldots ,f_{n-1})=\int _{-\infty }^{\infty }p_{i_{1},\ldots ,i_{n}}(f_{1},\ldots ,f_{n})\,df_{n}

that is, a straightforward marginalization over the nuisance variable f_n.

(Note that nothing yet has been assumed about the temporal (or any other) ordering of the random variables—the above equation applies equally to the marginalization of any of them.)
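For discrete random variables the integral becomes a sum, and the identity can be checked directly. The following is a minimal sketch under stated assumptions: the joint probability table `joint` and the helper `marginalize` are hypothetical names introduced only for illustration, and the numbers are made up.

```python
# Hypothetical joint pmf over three binary variables (f1, f2, f3).
# The probabilities are illustrative only; they sum to 1.
joint = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.20, (0, 1, 1): 0.15,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10,
    (1, 1, 0): 0.25, (1, 1, 1): 0.10,
}


def marginalize(pmf, axis):
    """Sum a joint pmf (dict keyed by outcome tuples) over the variable at position `axis`."""
    out = {}
    for outcome, prob in pmf.items():
        reduced = outcome[:axis] + outcome[axis + 1:]
        out[reduced] = out.get(reduced, 0.0) + prob
    return out


p12 = marginalize(joint, axis=2)  # marginalize out the last variable, f3
p13 = marginalize(joint, axis=1)  # any other variable can be summed out the same way
```

Here `p12` plays the role of the left-hand side of the equation above, and `p13` illustrates that the same operation applies to any coordinate, not only the last one.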

In terms of Markov kernels

If we consider the Markov kernels induced by the transitions of a Markov process, the Chapman–Kolmogorov equation can be seen as giving a way of composing kernels, generalizing the way stochastic matrices compose. Given a measurable space $(X,{\mathcal {A}})$ and a Markov kernel $k:(X,{\mathcal {A}})\to (X,{\mathcal {A}})$, the two-step transition kernel $k^{2}:(X,{\mathcal {A}})\to (X,{\mathcal {A}})$ is given by

k^{2}(A|x)=\int _{X}k(A|x')\,k(dx'|x)

for all $x\in X$ and $A\in {\mathcal {A}}$ .^[1] One can interpret this as a sum, over all intermediate states, of pairs of independent probabilistic transitions.

More generally, given measurable spaces $(X,{\mathcal {A}})$ , $(Y,{\mathcal {B}})$ and $(Z,{\mathcal {C}})$ , and Markov kernels $k:(X,{\mathcal {A}})\to (Y,{\mathcal {B}})$ and $h:(Y,{\mathcal {B}})\to (Z,{\mathcal {C}})$ , we get a composite kernel $h\circ k:(X,{\mathcal {A}})\to (Z,{\mathcal {C}})$ by

(h\circ k)(C|x)=\int _{Y}h(C|y)\,k(dy|x)

for all $x\in X$ and $C\in {\mathcal {C}}$ .

Because of this, Markov kernels, like stochastic matrices, form a category.
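On finite spaces a Markov kernel is a row-stochastic matrix and the composition integral reduces to a matrix product. The sketch below assumes kernels are stored as nested lists with `k[x][y]` standing for $k(\{y\}|x)$; the function name `compose` and all numerical values are hypothetical.

```python
def compose(h, k):
    """Composite kernel "h after k" for finite kernels stored as row-stochastic nested lists.

    k[x][y] = k({y} | x) and h[y][z] = h({z} | y), so that
    result[x][z] = sum over y of h[y][z] * k[x][y].
    """
    n_y = len(h)
    n_z = len(h[0])
    return [
        [sum(k_row[y] * h[y][z] for y in range(n_y)) for z in range(n_z)]
        for k_row in k
    ]


# Illustrative kernels: X has 2 points, Y has 3 points, Z has 2 points.
k = [[0.5, 0.5, 0.0],
     [0.1, 0.2, 0.7]]
h = [[1.0, 0.0],
     [0.25, 0.75],
     [0.4, 0.6]]
g = [[0.9, 0.1],
     [0.3, 0.7]]  # a kernel from Z to Z, used to check associativity

hk = compose(h, k)                     # kernel from X to Z
left = compose(g, compose(h, k))       # g after (h after k)
right = compose(compose(g, h), k)      # (g after h) after k
```

Each row of `hk` is again a probability distribution, and `left` agrees with `right` up to rounding, which is the associativity required of composition in a category.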

Application to time-dilated Markov chains

When the stochastic process under consideration is Markovian, the Chapman–Kolmogorov equation is equivalent to an identity on transition densities. In the Markov chain setting, one assumes that i_1 < ... < i_n. Then, because of the Markov property,

p_{i_{1},\ldots ,i_{n}}(f_{1},\ldots ,f_{n})=p_{i_{1}}(f_{1})p_{i_{2};i_{1}}(f_{2}\mid f_{1})\cdots p_{i_{n};i_{n-1}}(f_{n}\mid f_{n-1}),

where the conditional probability $p_{i;j}(f_{i}\mid f_{j})$ is the transition probability between the times $i>j$ . So, the Chapman–Kolmogorov equation takes the form

p_{i_{3};i_{1}}(f_{3}\mid f_{1})=\int _{-\infty }^{\infty }p_{i_{3};i_{2}}(f_{3}\mid f_{2})p_{i_{2};i_{1}}(f_{2}\mid f_{1})\,df_{2}.

Informally, this says that the probability of going from state 1 to state 3 can be found from the probabilities of going from 1 to an intermediate state 2 and then from 2 to 3, by adding up over all the possible intermediate states 2.
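The identity can be verified numerically for a process whose transition density is known in closed form. The sketch below assumes a standard Brownian motion, whose transition density over a time gap `dt` is Gaussian with mean equal to the starting point and variance `dt`; this choice of process, the function names, and the integration grid are assumptions made for illustration.

```python
import math


def gauss_transition(y, x, dt):
    """Transition density of a standard Brownian motion: normal with mean x and variance dt."""
    return math.exp(-(y - x) ** 2 / (2.0 * dt)) / math.sqrt(2.0 * math.pi * dt)


def ck_rhs(f3, f1, t12, t23, half_width=12.0, n=4001):
    """Trapezoidal approximation of the integral over the intermediate state f2."""
    a = f1 - half_width
    step = 2.0 * half_width / (n - 1)
    total = 0.0
    for m in range(n):
        f2 = a + m * step
        weight = 0.5 if m in (0, n - 1) else 1.0
        total += weight * gauss_transition(f3, f2, t23) * gauss_transition(f2, f1, t12)
    return total * step


# Direct transition over the full gap versus integration over the intermediate state.
lhs = gauss_transition(1.3, 0.2, 0.7 + 0.5)
rhs = ck_rhs(1.3, 0.2, t12=0.7, t23=0.5)
```

The two values agree to within the quadrature error, reflecting the fact that the convolution of two Gaussian kernels with variances 0.7 and 0.5 is a Gaussian kernel with variance 1.2.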

When the probability distribution on the state space of a Markov chain is discrete and the Markov chain is homogeneous, the Chapman–Kolmogorov equations can be expressed in terms of (possibly infinite-dimensional) matrix multiplication, thus:

P(t+s)=P(t)P(s)\,

where P(t) is the transition matrix of jump t, i.e., P(t) is the matrix such that entry (i,j) contains the probability of the chain moving from state i to state j in t steps.

As a corollary, it follows that to calculate the transition matrix of jump t, it is sufficient to raise the transition matrix of jump one to the power of t, that is

P(t)=P^{t}.\,
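A short sketch of both statements for a finite chain follows. The two-state matrix `P` is hypothetical, and `matmul` and `matpow` are small helpers written here for illustration rather than library functions.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def matpow(p, t):
    """t-step transition matrix P^t for an integer t >= 0, by repeated squaring."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in p]
    while t > 0:
        if t & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        t >>= 1
    return result


# Hypothetical one-step transition matrix of a homogeneous two-state chain.
P = [[0.9, 0.1],
     [0.5, 0.5]]

P5 = matpow(P, 5)                            # P(5) computed as a matrix power
P2_P3 = matmul(matpow(P, 2), matpow(P, 3))   # P(2) P(3), the Chapman-Kolmogorov form
```

Up to floating-point rounding, `P5` and `P2_P3` coincide, and every row of either matrix sums to one.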

Chapman–Kolmogorov in differential form

The differential form of the Chapman–Kolmogorov equation is a representation of the master equation associated with a time-continuous Markov process on a continuous state space. It is obtained under the assumption that the transition dynamics can be decomposed into:

continuous transitions, corresponding to infinitesimal state increments $|x-x'|\ll 1$ ;

discontinuous transitions, corresponding to finite jumps $|x-x'|=O(1)$ .^[2]

Starting from the general master equation, the contribution of infinitesimal transitions can be expanded using the Kramers–Moyal expansion. If this expansion is truncated at second order, while finite jumps are retained explicitly, one obtains the following differential equation:

${\begin{aligned}{\frac {\partial }{\partial t}}P(x,t|x_{0},t_{0})=&\underbrace {-\sum _{i}{\frac {\partial }{\partial x_{i}}}[A_{i}(x,t)P(x,t|x_{0},t_{0})]} _{\text{Drift term (continuous)}}\\[4pt]&\underbrace {+{\frac {1}{2}}\sum _{i,j}{\frac {\partial ^{2}}{\partial x_{i}\partial x_{j}}}[B_{ij}(x,t)P(x,t|x_{0},t_{0})]} _{\text{Diffusion term (continuous)}}\\[4pt]&\underbrace {+\int \mathrm {d} x'\,[W(x|x',t)P(x',t|x_{0},t_{0})-W(x'|x,t)P(x,t|x_{0},t_{0})]} _{\text{Jump term (discontinuous)}}\end{aligned}}$

The first two terms describe the continuous component of the dynamics and correspond to a generalized Fokker–Planck equation. The integral term accounts for discontinuous transitions and has the standard gain–loss structure of a master equation.

Here:

$A_{i}(x,t)$ are the drift coefficients,
$B_{ij}(x,t)$ is the diffusion matrix (symmetric and positive semi-definite),
$W(x|x',t)$ is the transition rate density for a jump from state $x'$ to $x$ .

Special cases

Several well-known evolution equations arise as special cases of the Chapman–Kolmogorov differential form, depending on which continuous contributions—drift or diffusion—are present.

Wiener process

The Wiener process is a continuous Markov process characterized by pure diffusion, with zero drift and no jumps. Its transition probability density satisfies the diffusion equation

${\frac {\partial }{\partial t}}P(x,t)={\frac {D}{2}}\,{\frac {\partial ^{2}}{\partial x^{2}}}P(x,t),$

which is obtained from the Chapman–Kolmogorov differential form in one dimension by setting $A(x,t)=0$, taking a constant diffusion coefficient $B(x,t)=D$, and suppressing the jump term.
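As a numerical sanity check, one can confirm that a Gaussian density with variance $Dt$ satisfies this diffusion equation. The sketch below assumes the process starts at $x=0$ at time $0$, so that the density is the centered Gaussian used in `wiener_density`; the function names, step sizes, and evaluation point are assumptions chosen for illustration.

```python
import math


def wiener_density(x, t, D):
    """Centered Gaussian with variance D * t (assumed initial condition: a point mass at x = 0)."""
    return math.exp(-x * x / (2.0 * D * t)) / math.sqrt(2.0 * math.pi * D * t)


def diffusion_residual(x, t, D, h=1e-3, dt=1e-4):
    """Central-difference estimate of dP/dt - (D / 2) * d2P/dx2 at the point (x, t)."""
    dP_dt = (wiener_density(x, t + dt, D) - wiener_density(x, t - dt, D)) / (2.0 * dt)
    d2P_dx2 = (
        wiener_density(x + h, t, D)
        - 2.0 * wiener_density(x, t, D)
        + wiener_density(x - h, t, D)
    ) / (h * h)
    return dP_dt - 0.5 * D * d2P_dx2


residual = diffusion_residual(x=0.4, t=1.0, D=2.0)
```

The residual is small, limited only by the finite-difference step sizes.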

Fokker–Planck equation

The Fokker–Planck equation describes a Markov process with drift and diffusion, but without jumps. It corresponds to the Chapman–Kolmogorov differential form with nonzero drift coefficient $A(x,t)=\mu (x,t)$ and diffusion coefficient $B(x,t)=2D(x,t)$ , and with the jump term suppressed:

${\frac {\partial }{\partial t}}P(x,t)=-{\frac {\partial }{\partial x}}\!\left[\mu (x,t)\,P(x,t)\right]+{\frac {\partial ^{2}}{\partial x^{2}}}\!\left[D(x,t)\,P(x,t)\right].$
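One way to exercise this equation is to check a known stationary solution. The sketch below assumes a linear drift $\mu(x)=-\theta x$ with constant $D$ (an Ornstein–Uhlenbeck-type process chosen here purely as an example, not taken from the text above), for which the stationary density is Gaussian with variance $D/\theta$. The parameter values and function names are hypothetical.

```python
import math

THETA = 1.5  # hypothetical drift strength
D = 0.4      # hypothetical constant diffusion coefficient


def mu(x):
    """Linear restoring drift, assumed for this example."""
    return -THETA * x


def p_stationary(x):
    """Gaussian with variance D / THETA, the stationary density for this drift."""
    return math.sqrt(THETA / (2.0 * math.pi * D)) * math.exp(-THETA * x * x / (2.0 * D))


def fokker_planck_rhs(x, h=1e-3):
    """Central-difference estimate of -d/dx[mu P] + d2/dx2[D P] for the stationary density."""
    def flux(y):
        return mu(y) * p_stationary(y)

    drift_term = -(flux(x + h) - flux(x - h)) / (2.0 * h)
    diffusion_term = D * (
        p_stationary(x + h) - 2.0 * p_stationary(x) + p_stationary(x - h)
    ) / (h * h)
    return drift_term + diffusion_term


rhs_value = fokker_planck_rhs(0.3)
```

Because the density is stationary, the right-hand side should vanish; the computed value is zero up to discretization error, with the drift and diffusion contributions cancelling each other.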

Continuity (deterministic) equation

The continuity equation describes a deterministic Markov process in which the probability density is transported by a drift field and no stochastic fluctuations are present. It is obtained from the Chapman–Kolmogorov differential form by retaining only the drift term:

${\frac {\partial }{\partial t}}P(x,t)=-{\frac {\partial }{\partial x}}\!\left[\mu (x,t)\,P(x,t)\right].$

This equation expresses probability conservation along deterministic trajectories.

Citations

^ Perrone (2024), pp. 10–11
^ van Kampen, N. G. (2007). "V–VI". Stochastic Processes in Physics and Chemistry (3rd ed.). Elsevier / North-Holland. ISBN 978-0-444-52965-7.