Think again about two groups: do they have a common mean?
H0: µ1 - µ2 = 0, the same as µ1 = µ2, "no difference".
In the pooled-sample t statistic, we had:
--on the bottom of the fraction, the estimate of the common standard deviation (scaled by a factor that depends only on n1 and n2). The estimate is obtained by using each data point once: take the sum of squared deviations from the individual group means, divide by (n1 + n2 - 2), and take the square root.
If we stop before taking the square root, we have what is called the MSE, the Mean Squared Error: it is the SSE (Sum of Squares of the Error) divided by the degrees of freedom (n1 + n2 - 2).
The name "Error" is often replaced (as in SPSS) by "Within Groups".
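The MSE recipe above can be sketched in a few lines of Python. This is only an illustration: the two small data sets and the helper names (`mean`, `sse`) are made up here and are not from the course materials.

```python
def mean(values):
    """Ordinary sample mean."""
    return sum(values) / len(values)

def sse(group):
    """Sum of squared deviations of one group's points from that group's own mean."""
    m = mean(group)
    return sum((x - m) ** 2 for x in group)

# Made-up data, for illustration only.
group1 = [4.0, 5.0, 6.0, 5.0]
group2 = [7.0, 9.0, 8.0]

n1, n2 = len(group1), len(group2)
sse_total = sse(group1) + sse(group2)  # SSE: each data point is used exactly once
df_error = n1 + n2 - 2                 # degrees of freedom for error
mse = sse_total / df_error             # stop here and you have the MSE
pooled_sd = mse ** 0.5                 # one more step gives the pooled standard deviation

print(sse_total, df_error, mse, pooled_sd)
```

SPSS would report `sse_total` and `mse` on the "Within Groups" row.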
--on the top of the fraction, we had xbar1 - xbar2. A little fussing led to a t distribution when the population means were equal.
If we make the null hypothesis assumption that the two groups have a common population mean µ, we can think about the differences (xbar1 - µ) and (xbar2 - µ).
Since we don't know µ, substitute for it the overall sample mean xbarbar, obtained by adding all the observations from both groups and dividing by (n1 + n2):
xbarbar = (sum of all observations) / (n1 + n2)
If µ is really the common mean, then (xbar1 - xbarbar)^2 and (xbar2 - xbarbar)^2, properly weighted, should give an estimate of the common variance.
The weighting is by group size:
n1(xbar1 - xbarbar)^2 + n2(xbar2 - xbarbar)^2
This is called the SSG (Sum of Squares Between Groups).
Then divide by the degrees of freedom, which is the number of groups minus 1 (here 2 - 1 = 1), and get the MSG (Mean Squares Between Groups).
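As a rough sketch of the between-groups calculation, here is the SSG and MSG for two small made-up samples. The data and variable names are assumptions for illustration. The divisor used is the between-groups degrees of freedom, the number of groups minus 1, which is 1 when there are two groups.

```python
# Made-up data, for illustration only.
group1 = [4.0, 5.0, 6.0, 5.0]
group2 = [7.0, 9.0, 8.0]

n1, n2 = len(group1), len(group2)
xbar1 = sum(group1) / n1
xbar2 = sum(group2) / n2

# Overall sample mean: all observations from both groups, divided by (n1 + n2).
xbarbar = (sum(group1) + sum(group2)) / (n1 + n2)

# SSG: squared distance of each group mean from the overall mean, weighted by group size.
ssg = n1 * (xbar1 - xbarbar) ** 2 + n2 * (xbar2 - xbarbar) ** 2

df_groups = 2 - 1       # number of groups minus 1
msg = ssg / df_groups   # MSG: Mean Squares Between Groups

print(xbarbar, ssg, msg)
```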
Take the fraction MSG/MSE. If there is a common mean, this should average around 1, since top and bottom both estimate the common variance. Its exact behavior follows the F distribution family, here with 1 and (n1 + n2 - 2) degrees of freedom.
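One way to check the link back to the pooled-sample t: with two groups, the ratio MSG/MSE works out to the square of the pooled t statistic. That is a standard fact rather than something stated in these notes, and the sketch below just confirms it numerically on made-up data (all names are illustrative).

```python
from math import sqrt

# Made-up data, for illustration only.
group1 = [4.0, 5.0, 6.0, 5.0]
group2 = [7.0, 9.0, 8.0]

n1, n2 = len(group1), len(group2)
xbar1 = sum(group1) / n1
xbar2 = sum(group2) / n2
xbarbar = (sum(group1) + sum(group2)) / (n1 + n2)

# Bottom of the F ratio: MSE (within groups).
sse = sum((x - xbar1) ** 2 for x in group1) + sum((x - xbar2) ** 2 for x in group2)
mse = sse / (n1 + n2 - 2)

# Top of the F ratio: MSG (between groups), divisor = 2 groups - 1.
ssg = n1 * (xbar1 - xbarbar) ** 2 + n2 * (xbar2 - xbarbar) ** 2
msg = ssg / (2 - 1)

f_stat = msg / mse

# Pooled-sample t statistic for the same data.
t_stat = (xbar1 - xbar2) / (sqrt(mse) * sqrt(1 / n1 + 1 / n2))

print(f_stat, t_stat ** 2)  # the two values agree up to rounding
```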
If there is not a common mean, then the numerator MSG, which looks at the distances of the group means from the overall mean, will be bigger than expected, and the MSG/MSE ratio will be bigger than expected. The P-value is the right tail of the F distribution.
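A simulation can stand in for the F table here. The sketch below draws many pairs of samples from a single normal population (so the null hypothesis is true), computes MSG/MSE each time, and then (a) averages the ratios and (b) estimates a right-tail P-value as the fraction of simulated ratios at least as large as an observed one. This is a Monte Carlo approximation, not the exact F calculation a package like SPSS would do. The sample sizes, the seed, and the "observed" value 6.0 are all made up.

```python
import random

def f_ratio(groups):
    """MSG / MSE for a list of groups (each group is a list of numbers)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ssg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssg / (k - 1)) / (sse / (n_total - k))

def simulate_null_f(n1, n2, reps, rng):
    """F ratios for pairs of samples that really do share a common mean."""
    ratios = []
    for _ in range(reps):
        g1 = [rng.gauss(0.0, 1.0) for _ in range(n1)]
        g2 = [rng.gauss(0.0, 1.0) for _ in range(n2)]
        ratios.append(f_ratio([g1, g2]))
    return ratios

rng = random.Random(251)  # fixed seed so the run is repeatable
null_fs = simulate_null_f(20, 20, 4000, rng)

# Under a common mean the ratio should average near 1.
average_f = sum(null_fs) / len(null_fs)

# Right-tail P-value estimate for a hypothetical observed ratio.
observed_f = 6.0
p_value = sum(f >= observed_f for f in null_fs) / len(null_fs)

print(average_f, p_value)
```

Only the right tail counts because a false null hypothesis pushes the ratio up, never down.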
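More groups than 2? Just add in more terms: one more term in the SSE and in the SSG for each extra group. With k groups and N observations in all, the divisors become (N - k) for the MSE and (k - 1) for the MSG.
The not-nice thing is that Ha is "not all groups have equal means" (notice that's not the same as "all groups have different means").
Sorting out how different the groups are is messy.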
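For more than two groups, the same bookkeeping can be sketched as one function. The function name `one_way_anova` and the three data sets are made up for illustration. The divisors are the usual degrees of freedom: (number of groups - 1) for the MSG and (total observations - number of groups) for the MSE, which reduces to (n1 + n2 - 2) when there are two groups.

```python
def one_way_anova(groups):
    """Return (SSG, SSE, MSG, MSE, F) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between groups: one weighted term per group.
    ssg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: every data point against its own group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    msg = ssg / (k - 1)
    mse = sse / (n_total - k)
    return ssg, sse, msg, mse, msg / mse

# Made-up data: three groups of three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
ssg, sse, msg, mse, f_stat = one_way_anova(groups)
print(ssg, sse, msg, mse, f_stat)
```

A large F here says only that the group means are not all equal. It does not say which groups differ.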
| Sievers home | Math251-Fall05/anova.htm | 12/5/05 |