This blog entry is mostly a precursor to my next one, which will be a review of various Monte Carlo (MC) and quasi-Monte Carlo (qMC) methods used to compute integrals of functions of varying smoothness. To understand the technical details behind MC techniques, one first needs to understand two basic results in probability theory: the law of large numbers (LLN) and the central limit theorem (CLT). There are excellent resources on the net for the LLN and CLT. For example, this and this are highly recommended readings. This blog post plays a complementary role, using figures and animations to build an intuitive understanding.

** — 1. The Law of Large Numbers — **

Let $X$ be a real-valued random variable and let $X_1, X_2, X_3, \ldots$ be an infinite sequence of independent and identically distributed copies of $X$. Let

$$\overline{X}_n := \frac{1}{n}(X_1 + \cdots + X_n)$$

be the empirical average of this sequence at stage $n$. Then, we have

Theorem 1 (Weak law of large numbers (WLLN)) Let $X$ have finite first moment, $\mathbf{E}|X| < \infty$. Then the random variable $\overline{X}_n$ converges in probability to $\mathbf{E}X$. In other words, for every $\varepsilon > 0$,

$$\lim_{n \to \infty} \mathbf{P}\left( \left| \overline{X}_n - \mathbf{E}X \right| \geq \varepsilon \right) = 0.$$

Theorem 2 (Strong law of large numbers (SLLN)) Let $X$ have finite first moment, $\mathbf{E}|X| < \infty$. Then the random variable $\overline{X}_n$ converges almost surely to $\mathbf{E}X$. In other words,

$$\mathbf{P}\left( \lim_{n \to \infty} \overline{X}_n = \mathbf{E}X \right) = 1.$$

Click here for the proofs of the above theorems. The SLLN implies the WLLN, but not the other way around.
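As a quick numerical check of the two theorems, the following sketch tracks the running empirical average of iid samples. It is written in Python with NumPy (rather than the MATLAB used later in this post), and the choice of an exponential distribution with mean 1 and the sample sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw iid samples from an exponential distribution with mean 1 and
# watch the empirical average Xbar_n converge to E[X] = 1.
n = 100_000
samples = rng.exponential(scale=1.0, size=n)
running_avg = np.cumsum(samples) / np.arange(1, n + 1)

for stage in (10, 1_000, 100_000):
    print(stage, running_avg[stage - 1])
```

The fluctuations of the running average around 1 shrink as the stage grows, which is exactly what the LLN predicts.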

The figure above illustrates a sequence of random variables that converges in probability to zero but does not converge almost surely to zero.

The figure above is an example of a sequence of random variables that converges almost surely to zero.

Under a stronger moment assumption, a much simpler proof can be obtained.

Exercise 1 Using Chebyshev's inequality and the Borel-Cantelli lemma, and assuming the fourth moment $\mathbf{E} X^4 < \infty$ is finite, prove the strong law of large numbers.

As in my previous blog, let's try to understand the WLLN and SLLN from an experimental point of view. The way to interpret the LLN is as follows. The underlying sample space $\Omega$ is the space of all sequences of infinite length. In other words, $\Omega$ is the common probability space on which all the $X_n$ are simultaneously defined, consisting of all possible infinite sequences of outcomes. Given a value of $\varepsilon > 0$, let $A_{n, \varepsilon} := \{ \omega \in \Omega : |\overline{X}_n(\omega) - \mathbf{E}X| \geq \varepsilon \}$. Then the WLLN says that the measure of $A_{n, \varepsilon}$ goes to zero for every $\varepsilon > 0$ as $n \to \infty$. And the SLLN says that the set of all $\omega$ such that $\overline{X}_n(\omega)$ does not converge to $\mathbf{E}X$ has measure zero in $\Omega$.

Example 1 Let each $X_i \in \{0, 1\}$ with equal probability (representing heads and tails of a coin flip). Then the set of all possible outcome sequences can be identified with the unit interval $[0, 1]$ using binary representation (for example, the sequence $(0, 1, 1, 0, \ldots)$ corresponds to the real number $0.0110\ldots$ in base 2). The interpretation of the SLLN is that almost surely, any real number you pick will have 0s and 1s distributed equally in its binary representation.

** — 2. The Central Limit Theorem — **

We need the following definition before stating the CLT.

Definition 3 (Convergence in distribution) The sequence of random variables $X_n$ converges in distribution to a random variable $X$ if the cumulative distribution functions $F_{X_n}$ converge pointwise to the cumulative distribution function $F_X$ of $X$ at every point where $F_X$ is continuous. That is,

$$\lim_{n \to \infty} F_{X_n}(t) = F_X(t)$$

for all $t$ at which $F_X$ is continuous.

As the figure above illustrates, convergence of random variables in distribution **does not imply** that the probability density functions of the random variables converge in any sense. Convergence in distribution is a very weak notion of convergence. In the above figure, the green curves correspond to the pdf and the cdf of the uniform distribution on the interval $[0, 1]$, respectively. The blue curves are the analogous curves for a sequence of random variables $X_n$ on the interval $[0, 1]$ with a rapidly oscillating pdf (a standard example is $f_n(x) = 1 - \cos(2\pi n x)$, whose cdf $F_n(x) = x - \sin(2\pi n x)/(2\pi n)$ converges uniformly to $x$ while $f_n$ keeps oscillating).

Thus we see an example where the cdfs converge but the pdfs do not. We are now in a position to state the Central Limit Theorem.
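This cdf-versus-pdf behaviour can be checked numerically. The sketch below (Python with NumPy, an assumption on my part) uses the oscillating density $f_n(x) = 1 - \cos(2\pi n x)$ on $[0, 1]$ and measures how far its pdf and cdf are from those of the uniform distribution.

```python
import numpy as np

# f_n(x) = 1 - cos(2*pi*n*x) on [0, 1] is a valid pdf whose cdf
# F_n(x) = x - sin(2*pi*n*x) / (2*pi*n) converges uniformly to the
# uniform cdf F(x) = x, even though f_n never converges to f(x) = 1.
x = np.linspace(0.0, 1.0, 10_001)

for n in (1, 10, 100):
    pdf_gap = np.max(np.abs((1 - np.cos(2 * np.pi * n * x)) - 1.0))
    cdf_gap = np.max(np.abs((x - np.sin(2 * np.pi * n * x) / (2 * np.pi * n)) - x))
    print(n, pdf_gap, cdf_gap)
```

The pdf gap stays at 1 for every $n$, while the cdf gap shrinks like $1/(2\pi n)$: convergence in distribution with no convergence of densities.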

Theorem 4 (Central Limit Theorem) Let $X$ be a real-valued random variable with mean $\mu$ and variance $\sigma^2$. Let $X_1, X_2, X_3, \ldots$ be an infinite sequence of independent and identically distributed copies of $X$. Let $S_n := X_1 + \cdots + X_n$. Then, if

$$Z_n := \frac{S_n - n\mu}{\sigma \sqrt{n}},$$

we have that $Z_n$ converges in distribution to the normal distribution with mean zero and variance one.

Note that using the i.i.d. assumption, we see that the mean and variance of $Z_n$ are 0 and 1, respectively. What the CLT guarantees is that $Z_n$ in fact converges in distribution to $N(0, 1)$, the standard normal distribution. Before proceeding, consider the following figure consisting of histograms of $S_n / n^{\alpha}$ for various values of $\alpha$, for an $X$ with mean zero and variance one.

The four panels show histograms of $S_n / n^{\alpha}$ for different values of $\alpha$. The figure illustrates the fact that when $\alpha > 1/2$, $S_n / n^{\alpha}$ converges to a **deterministic number** (zero). When $\alpha < 1/2$, the variance of $S_n / n^{\alpha}$ keeps growing, and when $\alpha = 1/2$, we have a balanced situation where $S_n / \sqrt{n}$ converges to a normal distribution. The CLT tells us that it converges in distribution to the normal distribution. It is also true that the CLT cannot be made stronger, i.e., $S_n / \sqrt{n}$ *cannot* converge in probability or in the almost sure sense. (See Exercise 1 of Tao's blog.)
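The three scaling regimes can be seen from the standard deviation of $S_n / n^{\alpha}$ alone, which behaves like $n^{1/2 - \alpha}$. The sketch below (Python with NumPy; the uniform distribution and the particular values of $n$ and $\alpha$ are illustrative assumptions) estimates this spread by simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# X_i uniform on [-sqrt(3), sqrt(3)]: mean 0, variance 1.
# The spread of S_n / n^alpha shrinks, stays put, or grows with n
# depending on whether alpha > 1/2, alpha = 1/2, or alpha < 1/2.
num_trials = 5_000
for n in (100, 1_600):
    s_n = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(num_trials, n)).sum(axis=1)
    for alpha in (0.25, 0.5, 1.0):
        print(n, alpha, np.std(s_n / n**alpha))
```

For $\alpha = 0.5$ the printed spread stays near 1 for both values of $n$; for $\alpha = 0.25$ it grows with $n$, and for $\alpha = 1$ it decays toward zero.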

Here is an illustration of the CLT working in practice. The random variables $X_i$ are chosen to be uniformly distributed (shifted and scaled to have mean zero and variance one).
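A minimal version of this experiment can be sketched as follows (Python with NumPy standing in for the MATLAB used in the post; the sample sizes are illustrative assumptions). It standardizes sums of uniform random variables and checks one signature of normality: the fraction of mass within one standard deviation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Uniform on [-sqrt(3), sqrt(3)] has mean 0 and variance 1, so
# Z_n = S_n / sqrt(n) should be close to standard normal for large n.
n, num_trials = 500, 20_000
z = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(num_trials, n)).sum(axis=1) / np.sqrt(n)

# For N(0, 1), about 68.27% of the mass lies within one standard deviation.
frac_within_1sigma = np.mean(np.abs(z) < 1.0)
print(frac_within_1sigma)
```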

Before concluding, we note that there is also a more quantitative version of the CLT, called the Berry-Esseen theorem, which also tells us the rate at which the distribution of $Z_n$ converges to $N(0, 1)$. To be precise, the Berry-Esseen theorem states

Theorem 5 (Berry-Esseen Theorem) Let $X$ be a mean zero, variance one random variable with finite third absolute moment $\rho := \mathbf{E}|X|^3 < \infty$, and let $X_1, \ldots, X_n$ be iid copies of $X$. Let $F_n$ be the cumulative distribution function of $Z_n = (X_1 + \cdots + X_n)/\sqrt{n}$ and let $\Phi$ be the cumulative distribution function of $N(0, 1)$. Then

$$\sup_{x \in \mathbf{R}} \left| F_n(x) - \Phi(x) \right| \leq \frac{C \rho}{\sqrt{n}},$$

where $C$ is an absolute constant.

Therefore, we see that the rate of convergence of $F_n$ to $\Phi$ is governed by the third absolute moment $\rho$ of $X$, which is closely related to its skewness. The Berry-Esseen effect can be seen in practice as shown below.

In the above figure, the random variables $X_i$ are chosen to have high skewness. The $X_i$s are generated using the MATLAB command `pearsrnd` with parameters chosen to be

mu = 0, sigma = 1, skew = 3, kurt = 12

Compared to the CLT illustration for the uniform distribution, we see that for the skewed distribution, the number of samples needed before one starts seeing the CLT at work is larger. We again remind the reader that the convergence of the pdfs shown in the lower plots of the above two figures is incidental and not guaranteed by the CLT.
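The Berry-Esseen effect can be reproduced without `pearsrnd`. In the sketch below (Python with NumPy as a substitute for the MATLAB above, and a shifted exponential standing in for the Pearson-system sample, both assumptions on my part), we compare the sup-distance between the empirical cdf of $Z_n$ and $\Phi$ for a symmetric versus a skewed distribution at the same small $n$.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def sup_cdf_gap(z):
    """Largest gap between the empirical cdf of z and the N(0, 1) cdf."""
    z = np.sort(z)
    phi = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2))) for v in z])
    m = len(z)
    upper = np.arange(1, m + 1) / m   # empirical cdf just after each sample point
    lower = np.arange(0, m) / m       # empirical cdf just before each sample point
    return max(np.max(np.abs(phi - upper)), np.max(np.abs(phi - lower)))

n, num_trials = 5, 50_000
# Symmetric case: uniform on [-sqrt(3), sqrt(3)] (mean 0, variance 1, zero skew).
z_sym = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(num_trials, n)).sum(axis=1) / np.sqrt(n)
# Skewed case: Exp(1) - 1 (mean 0, variance 1, skewness 2).
z_skew = (rng.exponential(1.0, size=(num_trials, n)) - 1.0).sum(axis=1) / np.sqrt(n)

gap_sym, gap_skew = sup_cdf_gap(z_sym), sup_cdf_gap(z_skew)
print(gap_sym, gap_skew)
```

At the same $n$, the skewed sums sit much farther from the normal cdf than the symmetric ones, which is the Berry-Esseen dependence on $\rho$ in action.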

**Tags:** berry-esseen, central limit theorem, law of large number


Trackback link: http://nairanalytics.com/Research_Blog/2011/08/28/the-law-of-large-numbers-and-the-central-limit-theorem/trackback/