Thursday, April 01, 2010

Gaussian FAQ

I found this on my computer and can't seem to find it on the web, so I'm putting it here in case other's would benefit. Note: I didn't write this.




====================John D'Errico, 24 Mar 1995=======ssm
From: John D'Errico
Subject: Gaussian FAQ
Message-ID: <3kv7m3$61f@thetimes.pixel.kodak.com>

This is a FAQ which attempts to cover questions about the normal
(Gaussian) distribution. I will attempt to add or enhance topics
as I have the time and people make suggestions.


Questions to be answered in this FAQ are:

1. What is the normal (Gaussian) density function?

2. What is the cumulative normal distribution function?

3. How is ERF related to the cumulative normal distribution function?

4. What is a good approximation to the cumulative normal function?

5. What is a good approximation to the inverse of the cumulative
distribution normal function?

6. How do you generate a normally distributed random variable?

7. If X is a normally distributed random variable with mean = 0
standard deviation = 1, how do you convert it into one with mean = mu
and standard deviation = sigma?

8. How do you generate bivariate or multivariate random normal
variates with a specified correlation matrix? Covariance matrix?

9. If X is a normally distributed random variable, what is the
distribution of f(X)?

10. What are some good references on these topics?



======================================================================

Answers to questions 1-10:

1. What is the normal (Gaussian) density function?

The standard (mean=0 and variance=1) normal density function is

1 -(1/2)*x^2
Z(x) = ---------- e
sqrt(2*pi)

For mean=mu and variance=sigma^2
[ correction, inserted (1/2) below; 17-aug-00/rfu ]

1 -(1/2)((x-mu)/sigma)^2
Z(x) = ---------------- e
sqrt(2*pi*sigma)

The normal distribution with mean=mu and variance=sigma^2 is
customarily referred to as N(mu,sigma^2). The standard normal
is N(0,1).


----------------------------------------------------------------------

2. What is the cumulative normal distribution function?

The cumulative normal, referred to here by P(X), is simply the
integral of Z(x) dx from minus infinity to X.

We will use Q(X) to refer to the integral of Z(x) dx from X to
plus infinity.

P(X) + Q(X) = 1


----------------------------------------------------------------------

3. How is ERF related to the cumulative normal distribution function?

Since some systems do not provide a cumulative normal function but do
provide ERF, this can be useful if you know the transformation between
them.

ERF(X) = integral [2/sqrt(pi)]*exp(-x^2) dx from 0 to X.

P(X) = [ERF(X/sqrt(2)) + 1]/2


----------------------------------------------------------------------

4. What is a good approximation to the cumulative normal function?

While it is not possible to write a solution to this in a simple,
closed form expression not involving infinite sums, there are many
good approximations available. It is not necessary to resort to
numerical integration techniques such as trapezoidal or Simpson's
rule. In fact, far more accuracy can be achieved through one of the
approximations which can be found in the references listed below.

One simple approximation to P(X), the cumulative normal, is:
(from Abramowitz and Stegun)

Let t = 1/(1+pX)

where p=.33267

then

P(x) = 1-Z(x)*[a_1*t + a_2*t^2 + a_3*t^3]

where

a_1 = 0.4361836
a_2 = -0.1201676
a_3 = 0.9372980

The error in this approximation is less than 0.00001 for all X.

Other approximations can be found in Abramowitz & Stegun; Press, et al;
and in Hart, et al. In particular, Hart has a very extensive set of
approximations, some with accuracy over 20 significant digits within
a given region.


----------------------------------------------------------------------

5. What is a good approximation to the inverse of the cumulative
normal distribution function?

Mathematically we can write this as: Find X such that Q(X) = p for
any 0 < p < 1. (Note: this computes an upper tail probability.)

Again, it is not possible to write this as a closed form expression,
so we resort to approximations. Because of the symmetry of the normal
distribution, we need only consider 0 < p < 0.5. If you have p > 0.5,
then apply the algorithm below to q = 1-p, and then negate the value
for X obtained. (This approximation is also from Abramowitz and
Stegun.)

t = sqrt[ ln(1/p^2) ]

c_0 + c_1*t + c_2*t^2
X = t - ------------------------------
1 + d_1*t + d_2*t^2 + d_3*t^3

c_0 = 2.515517
c_1 = 0.802853
c_2 = 0.010328

d_1 = 1.432788
d_2 = 0.189269
d_3 = 0.001308

See Abramowitz and Stegun; Press, et al.


----------------------------------------------------------------------

6. How do you generate a normally distributed random variable?

There are several good ways to do this, some better than others.
It is assumed that the user has available a good uniform pseudo-
random deviate generator available. This is the starting point
for all such approaches.

The first algorithm is to make use of the central limit theorem.
This says that the sum of a set of independent, identically
distributed (commonly abbreviated as iid) random numbers will
approach normality as enough of them are added together.

So, simply generate n uniform [0,1] random numbers, and sum them
up. Then shift and scale this sum to have the appropriate mean
and variance to give us a roughly N(0,1) random variable. Thus,

sum(X_i) - n/2
Z = --------------
sqrt(n/12)

Note that if we choose n = 12, we need not even do a divide,

Z = sum(X_i) - 6

While the method described above is quite simple to program, it
does a poor job of generating random deviates in the tails of the
distribution. There are far better ways to do accomplish this
task. A logical way to generate a normal random deviate is to
look back to the answer to question (5).

Simply generate a single uniform [0,1] random deviate X, and then
invert the cumulative normal at that point. If the inverse is an
accurate one, then that number will be a pseudo-random normal deviate.
This method will be as fast and as good as is the approximation
you choose for the inverse of the cumulative normal.

The other common method, which generates pairs of normal deviates
at a time (or if you want, bivariate normal deviates) is the Box-
Muller transformation. See Numerical Recipes (Press, et. al.) for
an explanation of this method.


----------------------------------------------------------------------

7. If X is a N(0,1) distributed random variable, how do you convert
it into N(mu,sigma^2)?

Referring back to (1), we see that a simple transformation will
accomplish this.

Z = mu + sigma*X


----------------------------------------------------------------------

8. How do you generate multivariate random normal variates with
a specified correlation matrix and set of variances?

Lets start with a simple case, just a bivariate normal. We will
assume s1 and s2 are the standard deviations, c is the desired
correlation between them (-1 <= c <= 1) and mu1 and mu2 are the
desired means.

First generate two unit normal random deviates, N1 and N2.

X = mu1 + s1*N1

Y = mu2 + c*s2*N1 + sqrt(1-c^2)*s2*N2

X and Y will be bivariate random normal deviates with the desired
distribution parameters.


Now for the more general case of a multivariate normal where "multi"
means larger than two.

Suppose we have a correlation matrix C. We need to convert this to a
covariance matrix, S, first.

Let D be a diagonal matrix with the desired variable standard
deviations on its diagonal. Then

S = D C D

Form the Cholesky factorization of S (actually any matrix "square
root" of s will do.) This is a factorization of S in terms of a lower
triangular matrix L such that

S = L' L

This factorization always exists whenever S is positive definite
(non-singular in this case, since we started from a supposedly
valid correlation matrix.)

Generate an n by p array, A, of N(0,1) random deviates, where p is
the number of variables and n is the number of multivariate sample
you wish to end up with. Then

B = A L'

is an array of normally distributed random deviates where each row
is an event with roughly the correlation/covariance matrix you wanted.
(It won't be exactly that because of pseudo-random variability.)

Once you have generated a set of random deviates with the desired
correlation matrix and mean vector zero, simply add a vector offset
to obtain any desired mean.

If the correlation/covariance matrix supplied is singular, a similar
approach can still be used, but now we must define L using a different
type of matrix square root. This can be achieved using an eigenvalue
decomposition of the matrix S.


----------------------------------------------------------------------

9. If X is a normally distributed random variable, what is the
distribution of f(X)?

There are several simple cases we can cover. If X and Y are N(0,1)
independent R.V.s, then

aX + bY is a normal random variable for constants a and b.

X/Y is a Cauchy random variable (and its moments are undefined.)

e^X is a lognormal random variable.

X^2 + Y^2 is a Chi-squared random variable with two degrees of
freedom.

sqrt(x^2 + y^2) is a Rayleigh random variable.

These distributions along with many others can all be found in
Johnson, et al.

The general problem of determining the distribution of a function of
one or more random variables is known as statistical tolerancing.
This can be done by several techniques, Monte Carlo methods, first
or second order Taylor series approximations, and Taguchi style
methods (actually related to Gaussian quadrature) to name a few.
A common approach it to estimate the first few (generally either
2 or 4) moments of the transformation in question and then find a
distribution which has those same moments. A family of distributions
which can be used to match the first 4 moments is the Pearson system
(see Johnson, et al.)

See D'Errico & Zaino for a discussion of statistical tolerancing.


----------------------------------------------------------------------

10. What are some good references on these topics?

The references listed here are not necessarily the best possible ones
for this subject.

1. Abramowitz and Stegun, "Handbook of Mathematical Functions",
Dover Publications, 1965 (originally published by the National
Bureau of Standards, 1964)

2. Press, Flannery, Teukolsky, Vetterling, "Numerical Recipes",
Cambridge University Press, 1986

3. Hart, et al, "Computer Approximations", Wiley, 1968

4. Johnson, Kotz, & Balakrishnan, "Continuous Univariate
Distributions", Volume 1, Wiley, 1994

5. Kalos & Whitlock, "Monte Carlo Methods", Wiley, 1986

6. Knuth, "The Art of Computer Programming, Volume 2, Seminumerical
Algorithms", Addison-Wesley, 1969

7. D'Errico & Zaino, "Statistical Tolerancing Using a Modification
of Taguchi's Method", Technometrics, Vol 30, no. 4

======================================================================

John D'Errico, derrico@pixel.kodak.com

Any errors or omissions are my fault and mine alone.

======================================================================