This blog entry is mostly a precursor to my next one which is going to be a review of various Monte Carlo (MC) and Quasi Monte Carlo (qMC) methods used to compute integrals of functions of various smoothness. To understand the technical details behind MC techniques, one first needs to understand two basic results in probability theory, the law of large numbers (LLN) and the central limit theorem (CLT). There are excellent resources on the net for LLN and CLT. For example, this and this are highly recommended readings. This blog will play a complementary with figures and animations to gain an intuitive understanding.
— 1. The Law of Large Numbers —
Let be a real-valued random variable and let
be an infinite sequence of independent and identically distributed copies of
. Let
be the empirical averages of this sequence at stage . Then, we have
Theorem 1 (Weak law of large numbers (WLLN): ) Let
have finite first moment. Then, the random variable
, converges in probability to
. In other words,
Theorem 2 (Strong law of large numbers (SLLN): ) Let
have finite first moment. Then, the random variable
, converges in almost surely to
. In other words,
Click here for the proofs of the above theorems. The SLLN implies the WLLN but not the other way around.
The figure above illustrates a sequence of random variables which converge in probability to zero and which does not converge almost everywhere to zero.
The figure above is an example of an almost everywhere converging sequence of random variables which converges to zero.
Under a stronger moment assumption, a much simpler proof can be obtained.
Exercise 1 Using the Chebyshev’s inequality the Borel-Cantelli lemma, and assuming
, prove the strong law of large number.
As in my previous blog, let’s try to understand the WLLN and SLLN from an experimental point of view. The way to interpret LLN is as follows. The underlying sample space is the space of all sequences
of infinite length. In other words,
is the common extension consisting of all possible outcomes of all possible lengths. Given a value of
, let
. Then, the WLLN says that measure of
goes to zero for all
as
. And the SLLN says that the set of all
such that
has measure zero in
.
Example 1 Let each
with equal probability (representing heads and tails of a coin flip). Then the set of all possibilities can be identified with the unit interval
(or
) using binary representation (for example
). The interpretaion of SLLN is that almost surely, any real number you pick will have 0 and 1 distributed equally in its binary representation.
— 2. The Central Limit Theorem —
We need the following definition before stating the CLT.
Definition 3 (Convergence in distribution:) The sequence of random variables
converges in distribution to a random variable
if the cumulative distribution functions
converges pointwise to the cumulative distribution function
of
at all
for which
is continuous.
for all at which
is continuous.
As the figure above illustrates, convergence of random variables in distribution does not imply that the distribution of the random variables converge in any sense. Convergence in distribution is a very weak notion of convergence. In the above figure, the green curves correspond to the pdf and the cdf of the uniform distribution on interval respectively. The blue curves are the analogous curves for the sequence of random variables on
interval with pdf given by
Thus we see an example where the cdf converges but the pdfs do not. We are now in a position to state the Central Limit Theorem.
Theorem 4 (Central Limit Theorem) Let
be a real valued random variable with mean
and variance
. Let
be an infinite sequence of independent and identically distributed copies of
. Let
. Then, if
we have that
converges in distribution to the normal distribution with mean zero and variance one.
Note that using the i.i.d assumption, we see that the mean and variance of is 0 and 1 respectively. What the CLT guarantees is that
in fact converges in distribution to
, the standard Normal distribution. Before proceeding, consider the following figure consisting of histograms for
for various values of
for mean zero and variance one
.
On the top left, is the histogram for , on the top right is the histogram for
, the bottom left is for
and the bottom right is the histogram for
. This figure illustrates the fact that when
,
converges to a deterministic number. When
, the variance of
keeps growing and when
, we have a balanced situation where
converges to a normal distribution. The CLT says tells us that it converges in distribution to the normal distribution. It is also true that the CLT cannot be made stronger, i.e.,
cannot converge in probability or almost sure sense. (See Exercise 1 of Tao’s blog).
Here is an illustration of the CLT working in practice. The random variables are chosen to be uniformly distributed (shifted and scaled to have mean zero and variance one).
Before concluding, we note that there is also a more quantitative version of CLT, called the Berry-Esseen Theorem which also tells us the rate at which converges to
. To be precise, the Berry-Esseen Theorem states
Theorem 5 (Berry-Esseen Theorem) Let
be mean zero and variance one random variable and let
be iid copies of
. Let
be
and
be the distribution of
. Then
Therefore, we see that the rate of convergence of to
is governed by the skewness of
. The Berry-Esseen effect can be seen in practice as shown below.
In the above figure, the random variables are chosen to have a high skew values. The
s are generated using the MATLAB command pearsrnd with parameters chosen to be
mu = 0, sigma = 1, skew = 3, kurt = 12
Compared to the CLT illustration for the uniform distribution, we see that for the skewed distribution, the number of iterations taken before one starts seeing the CLT working is larger. We again remind the reader that convergence of pdf shown in the lower plot in the above two figures are just a coincidence not guaranteed by the CLT.
Tags: berry-esseen, central limit theorem, law of large number







No comments
Comments feed for this article
Trackback link: http://nairanalytics.com/Research_Blog/2011/08/28/the-law-of-large-numbers-and-the-central-limit-theorem/trackback/