
Author Topic: Unit 4 Common statistical distributions  (Read 1118 times)


RuiAce

Unit 4 Common statistical distributions
« on: October 18, 2019, 06:12:27 pm »

There are two main statistical distributions you need to familiarise yourself with in this course. Note that your graphics calculator will definitely have a method of computing any required probabilities with these distributions for you!

Binomial distribution
The binomial distribution models the situation where a success/failure experiment is attempted \(n\) times, with each attempt independent of the others. The probability of success on each attempt is \(p\).

It is a discrete distribution, with parameters \(n\) and \(p\) as above. The probability function (probability mass function) of a random variable \(X\) with a binomial distribution is
\[ P(X=x) = \binom{n}{x}p^x (1-p)^{n-x}, \]
where \( \binom{n}{x} =\frac{n!}{x!(n-x)!}\). Here, \(x\) is the number of successes we require, which can be any of \(0, 1, \dots, n\).
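
If you'd like to see the probability function above outside of your calculator, here is a minimal Python sketch of it. The function name `binomial_pmf` and the test values \(n=10\), \(p=0.3\) are just made up for illustration; they aren't part of the course.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial random variable X with parameters n and p."""
    if not 0 <= x <= n:
        return 0.0  # impossible numbers of successes have probability 0
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Sanity check: the probabilities over x = 0, 1, ..., n should add up to 1
# (up to floating-point rounding).
total = sum(binomial_pmf(x, 10, 0.3) for x in range(11))
print(total)
```

Here `comb(n, x)` is Python's built-in version of \(\binom{n}{x}\).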

The corresponding mean and variance are
\[ E(X) = np,\quad \operatorname{Var}(X) = np(1-p). \]
A classic use of a binomial random variable is in rolling a regular six-sided die. If we wish to count the number of 6's we roll in 100 rolls, we can model this with a binomial random variable \(X\) with \(n=100\) and \(p=\frac16\). Then we would have expectation and variance
\[ E(X) = 100\times\frac16 = \frac{50}3\text{ and }\operatorname{Var}(X) = 100\times\frac16\times\frac56 = \frac{125}{9} \]
on the number of 6's obtained. For example, if we require the probability of rolling exactly two 6's, we'd in theory have to compute
\[ P(X=2) = \binom{100}{2} \left( \frac16\right)^2 \left( \frac56\right)^{98}. \]
(Let's leave that as something for the calculator.)
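
For the curious, the die example can also be checked in a few lines of Python. This is only a sketch of the calculation above (the variable names are my own); it computes \(P(X=2)\) directly, and also recovers the mean and variance by summing over every possible number of 6's rather than using the shortcut formulas.

```python
from math import comb

n, p = 100, 1 / 6  # 100 rolls, probability 1/6 of a 6 on each roll

def pmf(x):
    # P(X = x) from the binomial probability function
    return comb(n, x) * p**x * (1 - p)**(n - x)

prob_two_sixes = pmf(2)

# Mean and variance from their definitions, for comparison with np and np(1-p).
mean = sum(x * pmf(x) for x in range(n + 1))
variance = sum((x - mean)**2 * pmf(x) for x in range(n + 1))

print(prob_two_sixes)
print(mean, n * p)
print(variance, n * p * (1 - p))
```

The last two lines should each print a pair of (nearly) equal numbers, matching \(\frac{50}{3}\) and \(\frac{125}{9}\).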

If you're interested in a (somewhat huge) challenge, check out the first question here and the reply two posts below! This is more of a probabilistic question with the binomial distribution, but it uses conditional probability.

Normal distribution
The parameters of the normal distribution are literally its mean \(\mu\) and variance \(\sigma^2\). So if \(X\) follows a normal distribution with parameters \(\mu\) and \(\sigma^2\), we immediately have
\[ E(X) = \mu,\quad \operatorname{Var}(X) = \sigma^2. \]
The density of a normal random variable \(X\) with mean \(\mu\) and variance \(\sigma^2\) is
\[ f(x) = \frac{1}{\sqrt{2\pi \sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} .\]
(Some common alternate ways of writing the density function are \( f(x) = \frac{1}{\sqrt{2\pi}\sigma }e^{-\frac{(x-\mu)^2}{2\sigma^2}}\) and \(f(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac12 \left( \frac{x-\mu}{\sigma} \right)^2}\).)
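
As a quick check that these ways of writing the density really are the same thing, here is a small Python sketch. The function names and the sample values of \(x\), \(\mu\) and \(\sigma\) are made up purely for illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    # Density written in terms of the variance sigma^2
    variance = sigma**2
    return exp(-(x - mu)**2 / (2 * variance)) / sqrt(2 * pi * variance)

def normal_pdf_alt(x, mu, sigma):
    # Alternate form, written in terms of z = (x - mu) / sigma
    z = (x - mu) / sigma
    return exp(-0.5 * z**2) / (sqrt(2 * pi) * sigma)

# Both forms should agree up to floating-point rounding.
print(normal_pdf(1.2, 0.5, 2.0), normal_pdf_alt(1.2, 0.5, 2.0))
```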

This function has no antiderivative in terms of elementary functions, and hence its cumulative distribution function \(F(x)\) cannot be written as a closed-form formula. Hence, any probabilities involving the normal density must either be given to you in advance or be computed on the calculator. (Your calculator's answers will always be accurate enough for use.)
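
To give a rough idea of what "computed on the calculator" means, here is a Python sketch that finds a normal probability two ways: with the standard library's `statistics.NormalDist`, and by numerically integrating the density. The values \(\mu = 170\) and \(\sigma = 10\) are made up for illustration, and I'm not claiming this is how your particular calculator does it internally.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

mu, sigma = 170.0, 10.0  # made-up values for illustration only
X = NormalDist(mu, sigma)

# P(160 < X < 180) = F(180) - F(160), using the library's numerical CDF.
prob_cdf = X.cdf(180) - X.cdf(160)

# The same probability by integrating the density numerically (midpoint rule).
def density(x):
    return exp(-(x - mu)**2 / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)

steps = 10_000
width = (180 - 160) / steps
prob_integral = sum(density(160 + (i + 0.5) * width) for i in range(steps)) * width

print(prob_cdf, prob_integral)
```

Both numbers should come out at roughly 0.68, since the interval is one standard deviation either side of the mean.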

Note that for the normal distribution, you may be required to use your calculator to compute quantiles as well. For values of \(\alpha\) between 0 and 1 (i.e. \(\alpha \in (0,1)\)), the upper \(\alpha\)-th quantile \(t_\alpha\) is defined to be the solution of
\[ P(X> t_\alpha) = \alpha, \]
or equivalently
\[ 1 - F(t_\alpha) = \alpha. \]
The lower \(\alpha\)-th quantile instead satisfies \( P(X< t) = \alpha\). Your syllabus seems to prefer the upper quantile in the glossary.
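
Here is a short Python sketch of both kinds of quantile, assuming a standard normal (\(\mu = 0\), \(\sigma = 1\)) and \(\alpha = 0.05\), both chosen just for illustration. The standard library's `NormalDist.inv_cdf` returns the lower quantile, so the upper \(\alpha\)-th quantile is obtained by feeding it \(1-\alpha\).

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # standard normal, chosen for illustration
alpha = 0.05

lower = Z.inv_cdf(alpha)      # solves P(Z < t) = alpha
upper = Z.inv_cdf(1 - alpha)  # solves P(Z > t) = alpha

print(upper, lower)
print(1 - Z.cdf(upper))  # should give back alpha (up to rounding)
```

Because the standard normal is symmetric about 0, the two quantiles here are just negatives of each other.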

In the real world, sometimes statisticians need to make a judgement on when the normal distribution is valid. But there are some common examples nonetheless:
- Heights
- Weights
- Errors in measurement
- Standardised testing
- Normal approximations (e.g. approximation to the binomial)
- Approximate distribution of the sample mean (explored in spesh)
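
To illustrate the normal approximation to the binomial from the list above, here is a hedged Python sketch that reuses the die example (\(n=100\), \(p=\frac16\)). It compares the exact value of \(P(X \le 20)\) with a normal distribution having the same mean \(np\) and variance \(np(1-p)\). Evaluating the normal at 20.5 instead of 20 is a "continuity correction", an extra detail I'm assuming here that isn't covered in the post above.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 1 / 6  # the die example: 100 rolls, counting 6's

# Exact binomial probability P(X <= 20).
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(21))

# Normal approximation with matching mean and standard deviation.
# The 20.5 is a continuity correction (an assumption, see above).
approx = NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(20.5)

print(exact, approx)
```

The two printed values should be close to each other, which is the sense in which the normal "approximates" the binomial here.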

It is likely that you'll be told in the exam when to use the normal distribution.