CDS 110b: Stochastic Systems


This set of lectures presents an overview of random processes and stochastic systems. We begin with a short review of continuous random variables and then consider random processes and linear stochastic systems. Basic concepts include probability density functions (pdfs), joint probability, covariance, correlation and stochastic response.

References and Further Reading

  • R. M. Murray, Optimization-Based Control. Preprint, 2008: Chapter 4 - Stochastic Systems
  • Hoel, Port and Stone, Introduction to Probability Theory - this is a good reference for basic definitions of random variables
  • Apostol II, Chapter 14 - another reference for basic definitions in probability and random variables

Frequently Asked Questions

Q (2008): Why does E{X Y} = 0 if two random variables are independent?

By definition, we have that


E\{X Y\} = \int_{-\infty}^\infty \int_{-\infty}^\infty x y p(x, y)\, dx\, dy


where \(p(x,y)\) is the joint probability density function. If \(X\) and \(Y\) are independent then \(p(x, y) = p(x) p(y)\) and we have


\begin{aligned} E\{X Y\} &= \int_{-\infty}^\infty \int_{-\infty}^\infty x y p(x) p(y)\, dx\, dy \\ &= \int_{-\infty}^\infty \left( \int_{-\infty}^\infty x p(x)\, dx \right) y p(y)\, dy = \int_{-\infty}^\infty \mu_X y p(y)\, dy = \mu_X \mu_Y. \end{aligned}


If we assume that \(\mu_X = \mu_Y = 0\) then the result follows. (Alternatively, compute \(E\{(X - \mu_X) (Y - \mu_Y)\}\), which is zero for independent random variables regardless of their means.)
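The identity \(E\{X Y\} = \mu_X \mu_Y\) can be checked numerically. The following is a minimal Monte Carlo sketch, assuming NumPy is available; the two distributions, the seed and the sample size are arbitrary choices for illustration and are not part of the course material.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 1_000_000

# Independent draws: X ~ N(2, 1) and Y ~ Uniform(0, 6), so mu_X = 2 and mu_Y = 3.
x = rng.normal(loc=2.0, scale=1.0, size=n)
y = rng.uniform(low=0.0, high=6.0, size=n)

e_xy = np.mean(x * y)                              # estimate of E{X Y}
prod_means = x.mean() * y.mean()                   # estimate of mu_X * mu_Y
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))  # estimate of E{(X - mu_X)(Y - mu_Y)}

print(e_xy, prod_means, cov_xy)
```

With nonzero means, the first two numbers should both be close to \(\mu_X \mu_Y = 6\) rather than zero, while the centered product should be close to zero.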

Q (2007): How do you determine the covariance and how does it relate to random processes?

The covariance of two random variables \(x\) and \(y\) with means \(\mu_x\) and \(\mu_y\) is given by

\( P(x, y) = E\{(x - \mu_x) (y - \mu_y)\} = \int_{-\infty}^\infty \int_{-\infty}^\infty (x - \mu_x) (y - \mu_y) p(x, y)\, dx\, dy \)

For the case when \(x = y\), the covariance \(P(x, x)\) is called the variance, \(\sigma^2\).

For a random process, \(x(t)\), with zero mean, we define the covariance as

\( P(t) = E\{x(t) x^T(t)\}. \)

If \(x\) is a vector of length \(n\), then the covariance matrix is an \(n \times n\) matrix with entries

\( E\{x_i(t) x_j(t)\} = \int_{-\infty}^\infty \int_{-\infty}^\infty x_i x_j p(x_i, x_j; t, t) dx_i dx_j \)

where \(p(x_i, x_j; t, t)\) is the joint probability density function of \(x_i\) and \(x_j\).

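Intuitively, the covariance of a vector random process \(x(t)\) describes how elements of the process vary together. If the covariance between two elements is zero, then the two elements are uncorrelated. Independent elements always have zero covariance, but the converse holds only in special cases, such as jointly Gaussian elements.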
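As a rough illustration of the covariance matrix \(E\{x x^T\}\), the sketch below estimates it from samples of a zero-mean vector of length 2 at a single time instant. The way the two components are constructed (the second shares a term with the first) is a hypothetical example chosen so that the exact answer is known; NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Two independent zero-mean, unit-variance sources (illustrative choice).
w = rng.standard_normal((2, n))

# x_1 = w_1 and x_2 = 0.5 * w_1 + w_2, so the components vary together.
x = np.vstack([w[0], 0.5 * w[0] + w[1]])  # shape (2, n), one column per sample

# Sample estimate of the 2 x 2 matrix E{x x^T}.
P = (x @ x.T) / n
print(P)
```

For this construction the exact matrix has diagonal entries 1 and 1.25 and off-diagonal entries 0.5, so the estimate should be close to those values. The off-diagonal entries capture how the two components vary together.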

Q (2006): Can you explain the jump from pdfs to correlations in more detail?

The probability density function (pdf), \(p(x; t)\), tells us how the value of a random process is distributed at a particular time:


P(a \leq X(t) \leq b) = \int_a^b p(x; t) dx.


You can interpret this by thinking of \(X(t)\) as a separate random variable for each time \(t\).

The correlation for a random process tells us how the value of a random process at one time, \(t_1\), is related to the value at a different time, \(t_2\). This relationship is probabilistic, so it is also described in terms of a distribution. In particular, we use the joint probability density function, \(p(x_1, x_2; t_1, t_2)\), to characterize this:


P(a_1 \leq X(t_1) \leq b_1, a_2 \leq X(t_2) \leq b_2) = \int_{a_1}^{b_1} \int_{a_2}^{b_2} p(x_1, x_2; t_1, t_2)\, dx_2\, dx_1


Given any random process, \(p(x_1, x_2; t_1, t_2)\) describes (as a density) how the value of the process at time \(t_1\) is related (or "correlated") with the value at time \(t_2\). We can thus describe a random process according to its joint probability density function.

In practice, we don't usually describe random processes in terms of their pdfs and joint pdfs. It is usually easier to describe them in terms of their statistics (mean, variance, etc). In particular, we almost never describe the correlation in terms of joint pdfs, but instead use the correlation function:


\rho(t, \tau) = E\{X(t) X(\tau)\} = \int_{-\infty}^\infty \int_{-\infty}^\infty x_1 x_2 p(x_1, x_2; t, \tau) dx_1 dx_2


The utility of this particular function is seen primarily through its application: if we know the correlation for one random process and we "filter" that random process through a linear system, we can compute the correlation for the corresponding output process.
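To make the filtering statement concrete, here is a minimal discrete-time sketch, assuming NumPy. It passes unit-variance white noise through the first-order filter \(x_{k+1} = a x_k + w_k\) with \(x_0 = 0\) and compares an ensemble estimate of \(E\{x_k x_l\}\) with the closed form for this particular filter. The filter, the value of \(a\) and the helper names `rho_est` and `rho_exact` are assumptions made for illustration; the lectures treat the continuous-time case.

```python
import numpy as np

rng = np.random.default_rng(2)
a = 0.8            # filter pole (assumed value for illustration)
n_steps = 30
n_runs = 100_000   # number of sample paths in the ensemble

w = rng.standard_normal((n_runs, n_steps))  # white input, unit variance
x = np.zeros((n_runs, n_steps + 1))         # x[:, 0] = 0 for every path
for k in range(n_steps):
    x[:, k + 1] = a * x[:, k] + w[:, k]

def rho_est(k, l):
    """Ensemble estimate of the output correlation E{x_k x_l}."""
    return np.mean(x[:, k] * x[:, l])

def rho_exact(k, l):
    """Closed form for this filter with x_0 = 0 and unit-variance white input."""
    k, l = min(k, l), max(k, l)
    return a ** (l - k) * (1 - a ** (2 * k)) / (1 - a ** 2)

print(rho_est(20, 20), rho_exact(20, 20))
print(rho_est(20, 25), rho_exact(20, 25))
```

The closed form follows from \(x_l = a^{l-k} x_k\) plus noise that is independent of \(x_k\), so only the input correlation and the filter are needed to predict the output correlation.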

Q (2006): What is the meaning of a white noise process?

A white noise process is defined as a Gaussian process with constant power spectral density. The intuition behind this definition is that the spectral content of the process is constant at all frequencies. The term "white" noise comes from the fact that the color "white" comes from having light present at all frequencies.

Another interpretation of white noise is through the power spectrum of a signal. In this case, we simply compute the Fourier transform of a signal \(F(t)\). The signal is said to be white if it has a constant spectrum across all frequencies.
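A sampled analogue of this idea can be sketched as follows, assuming NumPy. The code draws independent Gaussian samples, computes the periodogram \(|\mathrm{FFT}|^2 / N\) of many segments and averages them. The segment length, the number of segments and the use of a periodogram as the spectrum estimate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n_seg, seg_len = 2000, 256

# Each row is one segment of unit-variance Gaussian white samples.
w = rng.standard_normal((n_seg, seg_len))

# Periodogram of each segment, |FFT|^2 / N, averaged over all segments.
W = np.fft.rfft(w, axis=1)
psd = np.mean(np.abs(W) ** 2, axis=0) / seg_len

# For white samples the averaged spectrum should be roughly flat,
# with every bin near the sample variance (1.0 here).
print(psd.min(), psd.max())
```

A single periodogram is very noisy, which is why the sketch averages over segments before looking for flatness.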


Q (2006): What is a random process (in relation to a transfer function)?

Formally, a random process is a continuous collection of random variables \(x(t)\). It is perhaps easiest to think first of a discrete time random process \(x_k\). At each time instant \(k\), \(x_k\) is a random variable distributed according to some distribution. If the process is white, then there is no correlation between \(x_k\) and \(x_l\) when \(k \neq l\). If, on the other hand, the value of \(x_k\) gives us information about what \(x_l\) will be, then the process is correlated in time and \(\rho(k, l)\) is the correlation function.

These concepts can also be written in continuous time, in which case each \(x(t)\) is a random variable and \(\rho(t, s)\) is the correlation function. This takes some time to get used to since \(x\) is not a signal, but rather a description of a class of signals (satisfying some probability measures).

A transfer function describes how we map signals in the frequency domain. We can use transfer functions to describe how random processes are mapped through a linear system (this is called spectral response; see lecture notes or text).


Q (2006): What is the transfer function for a parallel combination of \(H_1(s)\) and \(H_2(s)\)?

If two transfer functions are in parallel (meaning: they receive the same input and the output is the sum of the outputs from the individual transfer functions), the net transfer function is \(H_1(s) + H_2(s)\). Note that this is different from the formula that you get when you have parallel interconnections of resistors in electrical engineering. This is because when two outputs come together in a circuit diagram this restricts the voltage to be the same at the corresponding terminals, whereas in a block diagram we sum the output signals.
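The sum \(H_1(s) + H_2(s)\) can be formed directly from numerator and denominator polynomials. The sketch below assumes NumPy and uses coefficient lists in descending powers of \(s\); the helper `parallel` and the two example systems are hypothetical and chosen only for illustration.

```python
import numpy as np

def parallel(num1, den1, num2, den2):
    """Return (num, den) of H1(s) + H2(s), coefficients in descending powers of s."""
    # n1/d1 + n2/d2 = (n1*d2 + n2*d1) / (d1*d2)
    num = np.polyadd(np.polymul(num1, den2), np.polymul(num2, den1))
    den = np.polymul(den1, den2)
    return num, den

# Hypothetical example systems: H1(s) = 1/(s + 1) and H2(s) = 2/(s + 3).
num, den = parallel([1.0], [1.0, 1.0], [2.0], [1.0, 3.0])
print(num, den)  # coefficients of (3s + 5) / (s^2 + 4s + 3)

# Spot check at s = 2j: the combined system equals the sum of the two responses.
s = 2.0j
h_sum = 1.0 / (s + 1.0) + 2.0 / (s + 3.0)
h_par = np.polyval(num, s) / np.polyval(den, s)
print(abs(h_sum - h_par))
```

The combined numerator is a sum of cross products, not the product-over-sum form used for parallel resistors, which matches the distinction drawn above.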