Hand in | Read, to discuss | Optional (more practice)
Exam 4's not finished. Sorry!
Final exam: Wed. Dec. 17, 9 a.m. to 12 noon. If this is a problem for you, please email me soon.
Alternatives: Monday afternoon, or Tuesday morning or afternoon? (Email your possibilities; I'll pick one!)
Full exam schedule is at http://www.wells.edu/pdfs/finals.pdf
Registrar's page with link to this and other good stuff: http://www.wells.edu/academic/regist.htm
"Statistics means
never having to say you're
certain."
Confidence-interval estimation made our best guess at an unknown population mean.
Testing will investigate a claim that the unknown mean is actually a particular value.
~~~~~~~~~~~~~~~~
Ch. 15: "Significance tests use an elaborate vocabulary, but the basic idea is simple: an outcome that would 'rarely' happen if a claim were true--is good evidence that the claim is NOT true." (p. 363, top)
We need machinery to analyze less "obvious" results, machinery that builds in the effect of standard deviation and sample size.
Shoeboxes (white and yellow slips): Take a sample of size 4 from each, record the numbers, then return them to the box.
I claim the mean value for both shoeboxes is µ = 20.
Am I telling you the truth? I can't remember for sure. I do know that the distribution in each box is normal, with standard deviation 4.
I do remember that if µ is not 20, then it is greater than 20: µ > 20.
Take a sample of size 4 and find xbar, once for each shoebox! (You should have found xbar already.)
How far from 20 is it? Far enough that I believe the mean is not 20?
Measure your xbar's distance from 20 in standard deviations of xbar. (That is, find z for xbar, assuming µ = 20. Note that the s.d. of the sampling distribution of xbar is 2 (why?).) Example: If I got xbar = 24, then z = (24-20)/2 = 2 s.d.'s above the mean.
Is this a far-out value of z? Look in the normal table to see how much probability is in the tail to the right of it. That gives a measure of far-out-ness that doesn't depend on the particular distribution (the "P-value"). The probability above 2 is about half of 5%, or .025; more exactly, 1 - .9772 = .0228. IF the mean is really 20, I would see an xbar as high as 24 (or higher) about 2 to 3 times in a hundred. So xbar = 24 is pretty strong evidence that the real mean isn't 20.
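Here is a small sketch of the xbar = 24 arithmetic above, for checking table lookups. It is only an illustration: the helper name upper_tail is made up for this sketch, and it uses the complementary error function from Python's math module in place of the normal table.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal Z (hypothetical helper, stands in for the table)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

mu0 = 20      # mean claimed for the shoebox
sigma = 4     # known s.d. of the numbers in the box
n = 4         # sample size
xbar = 24     # the example sample mean

sd_xbar = sigma / math.sqrt(n)   # s.d. of the sampling distribution of xbar
z = (xbar - mu0) / sd_xbar       # distance from 20, in s.d.'s of xbar
p_value = upper_tail(z)          # tail area to the right of z

print(f"s.d. of xbar = {sd_xbar}, z = {z}, P-value = {p_value:.4f}")
```

Swap in your own xbar to check the z and P-value you found by hand.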
Your shoebox results: Do (one of) yours now, if you haven't. If you have, look around for someone who needs help.
Write your xbars (if you haven't), z's, P-values, and whether P < .10 on your papers (one for each box, yellow or white), and make a dot for each on the circulating dotplot.
See Day 35 for the rest of the notes. Brief overview:
The game:
Before taking data, define
H0: "Null hypothesis." A claim or statement about the population that we would like to show is NOT true.
Usually stated as: a parameter = a particular value.
H0: µ = 1000 hrs. ("Average lightbulb life.") H0: µ = 20 (shoebox mean = 20).
Ha: "Alternative hypothesis." A claim or statement about the population that we are trying to find evidence FOR.
Usually stated as: the parameter is > or < the particular value (one-tail tests), or NOT = the particular value (two-tail).
Ha: µ > 1000 hrs. (Or Ha: µ < 1000 hrs. Or Ha: µ not = 1000 hrs.)
Ha: µ > 20 (shoebox mean > 20).
Take data. Calculate a test statistic, usually based on a statistic that estimates the parameter in the hypotheses. For µ, the test statistic is the z-score of xbar, so a z-score far from 0 means that xbar is far from the µ claimed in H0.
Is it an unlikely result if H0 is true? Then that is evidence against H0.
Measuring the strength of the evidence against H0 (a common measuring stick for all distributions and parameters):
P-value of a test: The probability, computed assuming that H0 is true, that the outcome would take a value as extreme as or more extreme than what we actually observed (if we could take the data again). p. 368.
It is the size of the tail(s) farther out than the observed value.
The smaller the P-value, the stronger the data's evidence against H0 (and for Ha).
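The "if we could take the data again" idea can be acted out by simulation. The sketch below assumes the shoebox setup (normal box, µ = 20 under H0, s.d. 4, samples of size 4) and an observed xbar of 24. The seed and the number of trials are arbitrary choices for this illustration, not anything from the text.

```python
import random

random.seed(151)           # arbitrary seed so the run is repeatable

mu0, sigma, n = 20, 4, 4   # shoebox setup, assuming H0 is true
observed_xbar = 24
trials = 100_000           # number of pretend repetitions of the sampling

count = 0
for _ in range(trials):
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    if xbar >= observed_xbar:   # as extreme or more extreme, in the direction of Ha
        count += 1

simulated_p = count / trials
print(f"fraction of simulated xbars >= {observed_xbar}: {simulated_p:.4f}")
```

The fraction printed should land close to the tail area from the normal table, since both describe how often an xbar this large turns up when H0 is true.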
For a test of µ using xbar (sigma known), the P-value is the area of the tail beyond the observed xbar, in the direction of Ha (one tail), or twice that area (two-tail). We'll go over two-tail P-values in more detail after doing one-tail.
We usually calculate it by standardizing the observed xbar (assuming H0 is true) and looking in the normal table. (p. 369 on)
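The standardize-then-look-up recipe above can be sketched as one function. The name z_test_p_value and the labels 'greater', 'less', and 'two-sided' are made up for this illustration, and math.erfc stands in for the normal table.

```python
import math

def z_test_p_value(xbar, mu0, sigma, n, alternative):
    """P-value for H0: mu = mu0 when sigma is known.

    alternative is 'greater', 'less', or 'two-sided'
    (labels chosen for this sketch, matching the direction of Ha).
    """
    z = (xbar - mu0) / (sigma / math.sqrt(n))    # standardize xbar assuming H0
    upper = 0.5 * math.erfc(z / math.sqrt(2))    # area to the right of z
    if alternative == "greater":
        return upper
    if alternative == "less":
        return 1 - upper
    if alternative == "two-sided":
        return 2 * min(upper, 1 - upper)         # twice the smaller tail
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Shoebox example: xbar = 24, H0: mu = 20, sigma = 4, n = 4
print(round(z_test_p_value(24, 20, 4, 4, "greater"), 4))
print(round(z_test_p_value(24, 20, 4, 4, "two-sided"), 4))
```

The two-sided value can differ from a table-based answer in the last digit, because the table rounds the one-tail area before it is doubled.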
Applet: "P-value of a test of significance" automates this. (It uses the "raw" scale of xbars rather than z-scores.) How to. Use it as a check and a guide.
Continue with new material!
Start with understanding "null and alternative hypothesis" and "P-value." Those are the foundation. Then:
A "significance level" alpha is a probability level we decide on in advance as being the "rarely" amount that will push us over into believing (well, sort of) that the H0 claim is not true. (This is historically older language than P-value.)
We tend to use simple benchmark numbers for it, like .10 (1 in 10), .05 (1 in 20), .01 (1 in 100).
When the P-value is less than (or equal to) a particular significance level alpha (say .05), we say, "The results are significant at the alpha = .05 level," or "The results are significant (P < .05)." Giving the actual P is better, if you can.
Lightbulbs: One-sided: P-value = .0228. If H0 were true and we did it again, there is between a 2% and a 3% chance of getting a result this far out (in this direction).
"Significant at the alpha = .03 level. Also at the alpha = .05 level." (The P-value says the result is rarer than these levels.)
"Not significant at the alpha = .02 level. Also not significant at the alpha = .01 level." (The P-value says the result is more common than these levels.)
Two-sided: P-value = .0456. If H0 were true and we did it again, there is (barely) less than a 5% chance of getting a result this far out.
"Significant at the alpha = .05 level. (Also at alpha = .10.) Not significant at the alpha = .04 level, nor at the .01 level."
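The lightbulb verdicts can be checked mechanically with the rule "significant when the P-value is less than or equal to alpha." A minimal sketch: the function name significant_at is made up here, and the two P-values are simply the ones quoted above.

```python
def significant_at(p_value, alpha):
    """True when the P-value is less than or equal to the significance level alpha."""
    return p_value <= alpha

p_values = {"one-sided": 0.0228, "two-sided": 0.0456}   # lightbulb P-values from the notes
alphas = [0.10, 0.05, 0.04, 0.03, 0.02, 0.01]

for label, p in p_values.items():
    for alpha in alphas:
        verdict = "significant" if significant_at(p, alpha) else "not significant"
        print(f"{label} P = {p}: {verdict} at alpha = {alpha}")
```

The output is just a yes/no for each alpha, which is why reporting the actual P-value tells the reader more.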
Applet: Statistical Significance
You can pick the alpha you desire and see whether your xbar lies outside the "alpha" barrier(s). (This is the approach of pp. 376-79.) But the P-value is more informative.
A particular scientific discipline may have a commonly accepted set of benchmarks, and language to go with it. (I think I remember .05 = "significant", .01 = "highly significant" in psychology?)
We will be less doctrinaire and use the language "significant at the alpha = ___ level."
(However, "nobody" uses a significance level larger than .10, 1 in 10.)
We finished this; the Internet connection failed, so I couldn't demonstrate the applet. Next time: HW repair, using the applets, more on P-value/significance, and using Table C. Also "in practice" (Ch. 16) things to watch out for.
| Sievers home | Math151-Fall08/Dayf37.htm | 2:30pm | 11/24/08 |