| Hand in Monday:
Sketching xbars for H0, p-value
  p. 323: 6.25 SSHA; 6.26 Spending on housing
Stating null and alternative hypotheses
  p. 325: 6.27, 6.28, 6.29, 6.30
Calculating p-value (one-sided), relating to sig. level
  p. 328: 6.31 and 6.32 (extending 6.25 and 6.26); 6.33 restating jargon

READ these: Decide which are one-sided and do those. All will be part of Day 34, to hand in Wednesday.
Calculating p-value (one- or two-sided), using z test statistic, relating to sig. level
  p. 333: 6.34 price reduc. on coffee
          6.35 crankshafts true? Use your calculator to find the sample mean.
          6.36 cola? Use your calculator to find the sample mean.
More p-values
  p. 341: 6.44 CEO pay. Keep a copy of your z test statistic for use in 6.48 next time.
  p. 343: 6.54 knife edge .05
  p. 345: 6.55 and 6.56 effect of n

*These will MAYBE be part of Monday's assignment (& on the exam)
  p. 342: *6.52 1% vs 5%; *6.53 define stat. signif.
  p. 341: *6.46 general z statistic, significance (6.49 will be assigned too.)
  p. 342: *6.50 patent protection; another z. |
Read, to discuss |
Optional (more practice) Stating null and alternative hypotheses |
>>CI quiz: If you missed the quiz Wednesday, you may take it today after class or by arrangement.
>>HW questions?
Cautions on confidence intervals (pp. 312-13): Our formula depends on an SRS.
Nonresponse or other selection bias can destroy our conclusions.
Outliers and skewness can mess us up. Nonnormality can mess us up, esp. if the sample size is < 15.
These cautions will hold for significance testing also.
Significance testing: Introduction (Day 32)
Example: H0: µ = 1000 hrs. (Average lightbulb life.) Design a competing bulb: show it's better.
Ha: µ > 1000 hrs.
Sample of size n = 25. Population sigma = 150 hrs. Get xbar = 1060 hrs. Are these bulbs better?
z = (1060 - 1000) ÷ (150/√25) = 60 ÷ (150/5) = 2.
P(Z > 2) = .0228. If H0 is true, there is a more than 2% and less than 3% chance of getting a result this high.
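The lightbulb calculation above can be checked by machine. This is a minimal sketch, not part of the textbook's method: the function names z_statistic and upper_tail_p are made up for illustration, and the normal probabilities come from Python's standard statistics.NormalDist rather than Table A.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(xbar, mu0, sigma, n):
    """z test statistic: distance of xbar from mu0 in standard-error units."""
    return (xbar - mu0) / (sigma / sqrt(n))


def upper_tail_p(z):
    """One-sided P-value for Ha: mu > mu0, i.e. P(Z > z)."""
    return 1.0 - NormalDist().cdf(z)


# Numbers from the lightbulb example in the notes.
z = z_statistic(xbar=1060, mu0=1000, sigma=150, n=25)
p = upper_tail_p(z)
print(f"z = {z:.2f}, P(Z > z) = {p:.4f}")
```

With the example's numbers this gives z = 2.00 and a P-value that rounds to .0228, matching the table lookup.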
A "significance level" alpha is a probability level we decide on in advance as being the "rarely" amount that will push us over into believing (well, sort of) that the H0 claim is not true. (Historically older language than P-value.)
We tend to use simple benchmark numbers for it, like .10 (1 in 10),
.05 (1 in 20), .01 (1 in 100).
When the P-value is less than (or equal to) a particular significance level alpha (say .05), we say,
"The results are significant at the alpha = .05 level," or "The results are significant (P < .05)."
A particular scientific discipline may have a commonly accepted set of benchmarks, and language to go with it. (I think I remember .05 = "significant", .01 = "highly significant" in psychology?)
We will be less doctrinaire and use the language "significant at the alpha = ___ level."
(However, "nobody" uses a significance level bigger, i.e. less rare, than .10, 1 in 10.)
Back to the lightbulb. H0: µ = 1000 hrs. (Average lightbulb life.) Competing bulb: show it's better.
Ha: µ > 1000 hrs. (one-sided)
P(Z > 2) = .0228. More than 2% and less than 3% chance of getting a result this high if H0 is true.
"Significant at the alpha = .03 level. Also at the alpha = .05 level."
"Not significant at the alpha = .02 level. Also not significant at the alpha = .01 level."
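The "significant at the alpha = ___ level" statements above all come from one rule: compare the P-value to alpha. A small sketch of that rule follows; the helper name is_significant and the particular list of alphas are chosen here only for illustration.

```python
from statistics import NormalDist


def is_significant(p_value, alpha):
    """Rule from the notes: significant when the P-value is <= alpha."""
    return p_value <= alpha


# One-sided P-value for the lightbulb example, where z = 2.
p_value = 1.0 - NormalDist().cdf(2.0)

# Check the same P-value against several benchmark levels.
results = {alpha: is_significant(p_value, alpha)
           for alpha in (0.10, 0.05, 0.03, 0.02, 0.01)}

for alpha, significant in results.items():
    verdict = "significant" if significant else "not significant"
    print(f"alpha = {alpha:.2f}: {verdict}")
```

The same P-value is significant at .10, .05 and .03 but not at .02 or .01, which is why the level always has to be stated along with the word "significant."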
Start here Monday
Shoebox results: From last year's dotplot
White #s (green box): 2/17 = 11.8% of xbars found are significant at 10%.
Yellow #s (red top box): 13/16 are sig. at 10%. If µ is bigger than 20 by a goodly amount, the test successfully detects this.
(this year?)
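The shoebox experiment can be imitated with a simulation. This is only a rough sketch under loudly stated assumptions: the notes give neither the shoebox populations nor the sample size, so SIGMA, N, and the "bigger" mean of 24 below are hypothetical stand-ins; only the H0 value of 20, the one-sided direction, and the 10% level are taken from the notes.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

MU0 = 20.0    # H0 value, as suggested by the notes
SIGMA = 5.0   # hypothetical population sigma (not given in the notes)
N = 10        # hypothetical sample size (not given in the notes)
ALPHA = 0.10  # significance level used for the shoebox results


def fraction_significant(true_mu, trials=2000, seed=0):
    """Fraction of simulated xbars significant at ALPHA for Ha: mu > MU0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - ALPHA)  # cutoff, about 1.28
    hits = 0
    for _ in range(trials):
        xbar = mean(rng.gauss(true_mu, SIGMA) for _ in range(N))
        z = (xbar - MU0) / (SIGMA / sqrt(N))
        if z >= z_crit:
            hits += 1
    return hits / trials


rate_h0 = fraction_significant(true_mu=20.0)   # H0 true: expect roughly 10%
rate_alt = fraction_significant(true_mu=24.0)  # mu well above 20: expect most
print(f"H0 true: {rate_h0:.1%} significant; mu = 24: {rate_alt:.1%} significant")
```

When H0 is true, about 1 in 10 samples still comes out "significant at 10%" by luck, which is in line with the 2/17 seen in the green box. When µ really is well above 20, most samples are flagged, as in the red top box.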
2-sided (2-tailed) test:
H0: "Null hypothesis." A claim or statement about the population we would like to show is NOT true.
H0: µ = 1000 hrs. (Average lightbulb life.)
Ha: "Alternative hypothesis." A claim or statement about the population we are trying to find evidence FOR. A value either much bigger than or much smaller than the H0 value is evidence against H0 & for Ha.
Ha: µ ≠ 1000 hrs. (Quality control on assembly line--find if it is "off" either way.)
Sample of size n = 25. Population sigma = 150 hrs. Suppose xbar = 940 hrs.
z = (940 - 1000) ÷ (150/5) = -2
P(Z < -2) = .0228
P-value: We measure the probability of seeing something (again) as extreme as the observed value (or more so).
For a two-sided test, "as extreme" counts both directions, so you need to measure the P-value symmetrically in both tails--the P-value is double what it would be for a one-sided test. P-value is approximately 5%; more precisely, 2·.0228 = .0456.
Our test is just barely significant at the .05 level; it is also significant at the .06 level and the .10 level. It's not significant at the .02 level or at any "higher" (smaller alpha) level.
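The two-sided calculation can be sketched the same way as the one-sided one. The function name two_sided_p is made up for illustration; the numbers are the assembly-line example from the notes.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided P-value: both tails beyond |z|, i.e. 2 * P(Z > |z|)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Assembly-line example: xbar = 940, H0 value 1000, sigma = 150, n = 25.
z = (940 - 1000) / (150 / sqrt(25))
p = two_sided_p(z)
print(f"z = {z:.2f}, two-sided P-value = {p:.4f}")

for alpha in (0.10, 0.06, 0.05, 0.02):
    verdict = "significant" if p <= alpha else "not significant"
    print(f"alpha = {alpha:.2f}: {verdict}")
```

Software gives about .0455 here; the .0456 in the notes comes from doubling the table value .0228, which is already rounded. Either way the conclusion at each alpha is the same.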
Meaning of "significance"
(Note--"high" significance means small alpha or P-value.)
Question: How do we know that .05 is "significant?"
(.05 is a 1 in 20 chance of seeing the result by "dumb luck" if the null hypothesis is true.) Read sec. 6.3, pp. 343-345.
>>Significance levels vary by field of study; different fields have different "customarily acceptable" levels.
In reality, there is no sharp border between "significant" and "not significant."
>>How small a P is "convincing evidence" against H0? In practice...
How plausible is H0? Ha? Strong evidence is needed to reject "conventional wisdom."
How expensive (mentally, economically) will abandoning H0 be?
>>"Statistically significant" doesn't always mean "important." (e.g. medicine: "clinically significant.") Big enough sample sizes will allow you to detect even small differences.
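To see how sample size alone can produce "significance," here is a sketch that holds a small difference fixed and lets n grow. The numbers are hypothetical: a sample mean only 5 hrs above the H0 value of 1000, with sigma = 150 borrowed from the lightbulb example, and sample sizes picked just for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_p(xbar, mu0, sigma, n):
    """P-value for Ha: mu > mu0 using the z statistic."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 1.0 - NormalDist().cdf(z)


# Hypothetical: xbar is only 5 hrs above the H0 value, at several sample sizes.
sample_sizes = (25, 400, 2500, 10000)
p_values = [one_sided_p(xbar=1005, mu0=1000, sigma=150, n=n)
            for n in sample_sizes]

for n, p in zip(sample_sizes, p_values):
    print(f"n = {n:>5}: P-value = {p:.4f}")
```

The 5-hour difference is the same in every row; only n changes. The P-value shrinks as n grows, so a difference too small to matter in practice can still be statistically significant with a large enough sample.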
| Sievers home | Math151-Fall04/Dayf33.htm | 9pm | 11/12/04 |