|
Binomial Approximation to the Hypergeometric Distribution
A random variable X that has a hypergeometric distribution
with parameters N, n and k has the
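following probability mass function:
$$P(X = x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}, \qquad \max(0,\, n - (N - k)) \le x \le \min(n,\, k).$$
The values of E(X) and Var(X) are
$$E(X) = \frac{nk}{N}, \qquad Var(X) = n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right)\frac{N-n}{N-1}.$$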
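As a numerical check, the following Python sketch evaluates the hypergeometric probability mass function with `math.comb` and compares the mean and variance obtained by direct summation with the closed forms nk/N and n(k/N)(1 - k/N)(N - n)/(N - 1). The values N = 50, n = 10, k = 20 and the function name `hypergeom_pmf` are illustrative assumptions, not taken from the text.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items containing k successes."""
    # Outside the support the probability is zero.
    if x < max(0, n - (N - k)) or x > min(n, k):
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Illustrative parameter values (assumed for this example).
N, n, k = 50, 10, 20

support = range(n + 1)
pmf = [hypergeom_pmf(x, N, n, k) for x in support]

# Moments by direct summation over the support.
total = sum(pmf)
mean = sum(x * p for x, p in zip(support, pmf))
var = sum((x - mean) ** 2 * p for x, p in zip(support, pmf))

# Closed-form values for comparison.
mean_formula = n * k / N
var_formula = n * (k / N) * (1 - k / N) * (N - n) / (N - 1)

print(f"sum of pmf = {total:.6f}")
print(f"mean: summed = {mean:.6f}, formula = {mean_formula:.6f}")
print(f"variance: summed = {var:.6f}, formula = {var_formula:.6f}")
```

The summed and closed-form values should agree up to floating-point rounding.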
The hypergeometric distribution with parameters N, n
and k is the probability distribution of the random
variable X, whose value is the number of successes in a sample of
n items from a population of size N that has k 'success' items
and N - k 'failure' items. Like the binomial
distribution, the hypergeometric distribution with parameters N,
n and k is also the sum of n Bernoulli variables,
with the ith Bernoulli variable having the value 1 if the ith
object is a success, 0 otherwise. However, the n
Bernoulli variables are no longer independent of each other; in
fact, their parameters p_i may differ from one another, since p_i,
the probability of getting a success on the ith draw, depends on the number of
successes already drawn in the previous (i - 1) objects.
If the sample size n is small relative to N,
then the probability of the ith object being a success will vary just slightly for
different values of i. In this case, the hypergeometric
distribution with parameters N, n and k will be
the sum of n (almost) independent Bernoulli variables with
parameter p = k / N. Thus, it can be approximated by the
binomial distribution with parameters n and p = k /
N.
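The quality of the approximation can be illustrated numerically. The Python sketch below compares the hypergeometric and binomial probability mass functions for the same n and the same p = k / N, once with a small population and once with a large one. The parameter values and the helper names (`hypergeom_pmf`, `binom_pmf`, `max_abs_diff`) are assumptions chosen for illustration.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """Hypergeometric P(X = x) for population N, sample n, k successes."""
    if x < max(0, n - (N - k)) or x > min(n, k):
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """Binomial P(X = x) for n trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def max_abs_diff(N, n, k):
    """Largest pointwise gap between the two pmfs over x = 0..n."""
    p = k / N
    return max(
        abs(hypergeom_pmf(x, N, n, k) - binom_pmf(x, n, p))
        for x in range(n + 1)
    )

# Same sample size n = 10 and same p = k / N = 0.4 in both cases.
small = max_abs_diff(N=50, n=10, k=20)       # n is 20% of N
large = max_abs_diff(N=5000, n=10, k=2000)   # n is 0.2% of N

print(f"N = 50:   max |hypergeometric - binomial| = {small:.5f}")
print(f"N = 5000: max |hypergeometric - binomial| = {large:.5f}")
```

The gap for N = 5000 should come out much smaller than the gap for N = 50, in line with the condition that n be small relative to N.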
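The mean and variance of a random variable X having the
binomial distribution above are
$$E(X) = np = \frac{nk}{N}, \qquad Var(X) = np(1 - p) = n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right).$$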
These values are the same as the mean and variance of the
hypergeometric distribution above, except that the values for the
variances differ by the factor (N - n) / (N - 1). This factor has a value close to 1 for n small relative to N.
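To see how quickly the ratio of the two variances, (N - n) / (N - 1), approaches 1, the short Python sketch below evaluates it for a fixed sample size and growing population sizes. The function name `variance_ratio` and the chosen values of n and N are assumptions for illustration.

```python
def variance_ratio(N, n):
    """Hypergeometric variance divided by binomial variance: (N - n) / (N - 1)."""
    return (N - n) / (N - 1)

n = 10  # fixed sample size (assumed for this example)
for N in (20, 100, 1000, 100000):
    print(f"N = {N:>6}: (N - n) / (N - 1) = {variance_ratio(N, n):.4f}")
```

For n = 1 the factor is exactly 1, since a single draw is the same with or without replacement.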
|