| Quiz Makeup! Open book: redo
the problems you missed, finding the appropriate formulas from text or
webpages, hand in Friday by 2:30, with your original
quiz. If you didn't get your quiz back today, pick it up from outside
my door. If you're lost on what I was asking for, contact me for hints.
Hand in: 6.104 (p. 443) Plot n on the x-axis and z on the y-axis. Plot n on the x-axis and P-value on the y-axis.
6.67 (one-sided), 6.66 (two-sided)
Table D
6.47 CI <==> sig
|
Read,
discuss
6.63c sig. 6.58, 9 Applet exploration of xbars and alphas 6.49 "significant"
6.82 n and P: answers are .3821, .1711, .0013. 6.84 P, alpha
|
Optional
6.46 CI<==>sig
|
Questions on Testing HW? Day
31
Takehome exam will be available Friday.
Quiz returned. Many did pretty well, except too many people blanked on the binomial formula and especially the proportions. (Almost) Everyone remembered to sum the variances for X-Y!
Significance tests use an elaborate vocabulary, but the basic idea is simple: a result that would "rarely" happen if a claim were true is good evidence that the claim is NOT true.
Notes Day 30
Look at the results from the shoeboxes:
Notice that there is always a chance of getting a "somewhat unusual" result from a population where the null hypothesis is true.
And if the actual mean is not extremely different from the null value, a result may not be detectably different from the null-hypothesis results.
A "significance level" alpha is a probability level we decide on in advance as being the "rarely" amount that will push us over into believing (well, sort of) that the H0 claim is not true.
Simple benchmarks: .10 (1 in 10), .05 (1 in 20), .01 (1 in 100).
When the P-value is less than (or equal to) a particular significance level alpha (say .05), we say, "The results are significant at the alpha = .05 level," or "The results are significant (P < .05)," or "Reject the null hypothesis at level alpha = .05."
Details Day 31
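The P-value-versus-alpha comparison can be sketched in a few lines of Python. This is only an illustration, not part of the course materials: the function names `p_value` and `is_significant` are made up here, and the example z = 2.111 is the value used in the Table D example in these notes. It assumes the test statistic is a z from a Normal model.

```python
from statistics import NormalDist


def p_value(z, alternative="greater"):
    """P-value for a z statistic under the standard normal curve.

    alternative: "greater" (Ha: mu > mu0), "less" (Ha: mu < mu0),
    or "two-sided" (Ha: mu != mu0).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        return 1.0 - std_normal.cdf(z)          # right-tail area
    if alternative == "less":
        return std_normal.cdf(z)                # left-tail area
    if alternative == "two-sided":
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))  # both tails
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


def is_significant(p, alpha):
    """Results are significant at level alpha when P <= alpha."""
    return p <= alpha


z = 2.111
for alternative in ("greater", "two-sided"):
    p = p_value(z, alternative)
    for alpha in (0.10, 0.05, 0.01):
        verdict = "significant" if is_significant(p, alpha) else "not significant"
        print(f"{alternative:9s}  P = {p:.4f}  alpha = {alpha:.2f}: {verdict}")
```

Note that alpha is fixed before looking at the data; the code only automates the final comparison.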
Try Applet: Statistical Significance: same setup as the "P-value" applet, only instead of giving the P-value it shows whether the xbar is in the "significant" ("rejection") region.
What if you don't have the Z-table but only have the t-table (Table D)?
What if you have a demanded level of significance, alpha?
Table D: a limited list of probabilities p across the top row: p = right-tail values for the bell curve distribution.
The value in the bottom (z*) row under p is the corresponding standard normal value.
"z* is the upper p critical value of the standard normal distribution."
Do this: Find your z from the data. Make a sketch of the normal curve and mark z on it. Mark the direction(s) of Ha.
(If your z is in the direction of Ha, continue. Otherwise the results are hopelessly not significant: you can quit.)
Find the two z*'s in Table D that bracket your z (ignore the minus sign, using the symmetry of the Normal).
Find the corresponding p's.
e.g. z = 2.111

p     .02              .01
z*    2.054     \/     2.326
             z = 2.111

So the P-value for your z is:
between those 2 p's (one-sided test)
between double those 2 p's (two-sided test)
(Some versions of the table add another top line for two-sided tests: just double the one-sided values.)
Test is significant at the bigger bracketing probability; not significant at the smaller.
One-sided: P-value is less than .02 and greater than .01. Significant at the .02 level, not at the .01 level.
Two-sided: P-value is less than .04 and greater than .02. Significant at the .04 level, not at the .02 level.
If you have a specific demanded significance level, compare it with these levels.
If a test is significant at level b, then it is significant at every level bigger than b.
If a test is NOT significant at level d, then it is NOT significant at every level smaller than d.
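The Table D bracketing procedure, together with the two rules just stated, might be sketched as follows. The names `TABLE_D`, `bracket_p`, and `decide` are invented for this illustration; the (p, z*) pairs are the standard normal critical values as they are usually printed in the bottom row of such a table, so check them against your own copy. For a one-sided test the sketch assumes z is already known to be in the direction of Ha.

```python
# (upper-tail probability p, critical value z*), ordered by increasing z*.
TABLE_D = [
    (0.25, 0.674), (0.20, 0.841), (0.15, 1.036), (0.10, 1.282),
    (0.05, 1.645), (0.025, 1.960), (0.02, 2.054), (0.01, 2.326),
    (0.005, 2.576), (0.0025, 2.807), (0.001, 3.091), (0.0005, 3.291),
]


def bracket_p(z, two_sided=False):
    """Return (lower, upper) bounds on the P-value using only the table."""
    z = abs(z)                 # ignore the minus sign (symmetry of the Normal)
    lower, upper = 0.0, 0.5    # bounds when z falls off either end of the table
    for p, z_star in TABLE_D:
        if z_star <= z:
            upper = p          # z is at least this far out, so P <= p
        else:
            lower = p          # z is not this far out, so P > p
            break
    factor = 2 if two_sided else 1   # two-sided: double the one-sided values
    return factor * lower, factor * upper


def decide(z, alpha, two_sided=False):
    """Compare a demanded alpha with the bracketing levels."""
    lower, upper = bracket_p(z, two_sided)
    if alpha >= upper:
        return "significant"          # significant at upper, so at anything bigger
    if alpha <= lower:
        return "not significant"      # not significant at lower, so at anything smaller
    return "table cannot decide"      # alpha falls strictly between the brackets


print(bracket_p(2.111))                  # one-sided bounds
print(bracket_p(2.111, two_sided=True))  # two-sided bounds
print(decide(2.111, 0.05))
print(decide(2.111, 0.01))
```

When the demanded alpha lands between the two bracketing probabilities, the table alone cannot settle the question and a fuller Z-table (or software) is needed.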
"Significant at alpha": the probability of getting results at least as extreme as mine by chance (if H0 is true) is less than (or equal to) alpha.
Results:              Significant at   |   Not significant at
p          (bigger)   .10     .05      |   .01     .005    .001    (smaller)
                                       /\
                                    P-value
                              z-value (one-sided)
z*         (smaller)  1.282   1.645    |   2.326   2.576   3.091   (bigger)
You can compare z directly to z* for your desired alpha. The two-sided case is a bit tricky.
(Two-sided: split the alpha in 2, then find the z*. Don't halve or double z's; it doesn't work!)
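To see why the alpha gets split rather than the z, here is a small sketch that computes z* directly instead of reading it from Table D. The helper names `critical_z` and `significant` are hypothetical, and for the one-sided case the sketch assumes Ha points toward large z.

```python
from statistics import NormalDist


def critical_z(alpha, two_sided=False):
    """z* with right-tail area alpha (one-sided) or alpha/2 (two-sided)."""
    tail_area = alpha / 2 if two_sided else alpha   # split the alpha, not the z
    return NormalDist().inv_cdf(1 - tail_area)


def significant(z, alpha, two_sided=False):
    """Compare z directly with z* for the demanded alpha."""
    z_star = critical_z(alpha, two_sided)
    return abs(z) >= z_star if two_sided else z >= z_star


one_sided = critical_z(0.05)                  # about 1.645
two_sided = critical_z(0.05, two_sided=True)  # about 1.960
print(f"one-sided z* = {one_sided:.3f}, two-sided z* = {two_sided:.3f}")
print(f"doubling the one-sided z* would give {2 * one_sided:.3f}, which is wrong")
```

The two-sided z* for alpha = .05 is the one-sided z* for .025, which is nowhere near double (or half) the one-sided z* for .05.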
CI's and two-sided tests (pp. 413-14):
Your 95% CI doesn't include µ0 <==> Reject H0: µ = µ0 at the alpha = .05 level. (Seems like common sense.)
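The CI <==> two-sided test connection can be checked numerically. The sketch below assumes the known-sigma z procedures of Chapter 6; the function names and all the numbers (xbar = 21.5, µ0 = 20, sigma = 4, n = 25 or 36) are made up for illustration and do not come from the text.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf=0.95):
    """Confidence interval for mu when sigma is known."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def two_sided_reject(xbar, mu0, sigma, n, alpha=0.05):
    """Two-sided z test of H0: mu = mu0; True means reject H0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return p <= alpha


mu0, xbar, sigma = 20.0, 21.5, 4.0   # hypothetical values
for n in (25, 36):
    low, high = z_interval(xbar, sigma, n)
    outside = not (low <= mu0 <= high)
    reject = two_sided_reject(xbar, mu0, sigma, n)
    print(f"n = {n}: 95% CI = ({low:.3f}, {high:.3f}), "
          f"mu0 outside CI: {outside}, reject H0 at .05: {reject}")
```

In both cases the "mu0 outside CI" answer and the "reject H0" answer agree, as the equivalence says they must.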
Start here Friday, Yes Sec. 6.3
>> Don't do inference on data that doesn't look like probability-model data (all that bias and design-flaws stuff was for this!), and check the data for weirdness (Ch. 1).
>> (Not in text any more?) How small a P is "convincing evidence" against H0?
In practice... beyond the formal testing:
How plausible is H0? Ha? Strong evidence is needed to reject "conventional wisdom."
How expensive (mentally, economically) will abandoning H0 be?
(May need more than one set of data; replicate, recast, refine.)
>> In reality, there is no sharp border between "significant" and "not significant."
>> "Statistically significant" doesn't always mean "important" (e.g. medicine: "clinically significant"). Big enough sample sizes will allow you to detect even small differences.
>> Lack of significance doesn't prove H0 true. Best we can do: "data are consistent with (not inconsistent with) H0."
>> You cannot legitimately test a hypothesis on the same data that first suggested that hypothesis. Every data set will turn up some unusual pattern if you examine it hard enough.
(If you must explore and confirm with the same data set, one way is to (randomly) take half the data set, explore and generate hypotheses; then use the other half for confirmatory tests. You can use a P-value to describe unusualness, but be wary of making decisions with it if you didn't expect that particular unusualness.)
>> Multiple tests: beware!
If you do 100 tests and use the alpha = .05 significance level for each, then the structure of testing requires this:
When all 100 null hypotheses H0 are true, about 5 of the 100 (.05) will give "significant" results by chance alone (falsely indicating the alternative hypothesis is to be preferred).
Try Applet: Statistical Significance with some randomly generated "shoebox" data, true mean 20, alpha = .10, and see that you get "significant" results about 1 in 10 times.
Moral: if you use the testing mechanism as a screening instrument for many questions, a proportion will give falsely significant results. You can't accept the results from such multiple tests as good evidence, only as indicating questions requiring further, more specific study. The game gives you one shot, not a hundred shots. (Dr. Pericak-Vance, Sci. Colloq. Sept. 28: screening for disease-causing genes)
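If the applet isn't handy, the multiple-testing effect can be imitated with a short simulation. This is a rough sketch, not the applet's actual method: the function name, the population standard deviation (3), the sample size (25), and the seed are all assumptions chosen for illustration; only the true mean of 20 comes from the applet setup described above. Every simulated H0 is true, so every "significant" result is a false alarm.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def count_false_positives(num_tests=100, alpha=0.05, n=25,
                          mu0=20.0, sigma=3.0, seed=0):
    """Run num_tests two-sided z tests on data where H0: mu = mu0 is TRUE.

    Returns how many tests come out "significant" at level alpha.
    """
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    std_normal = NormalDist()
    count = 0
    for _ in range(num_tests):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]  # one "shoebox" sample
        z = (mean(sample) - mu0) / (sigma / sqrt(n))
        p = 2 * (1 - std_normal.cdf(abs(z)))
        if p <= alpha:
            count += 1             # falsely "significant"
    return count


hits = count_false_positives()
print(f"{hits} of 100 true null hypotheses were 'significant' at alpha = .05")
```

The count varies from run to run (change `seed` to see this), but over many tests the proportion of false alarms settles near alpha.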
- - - - - - - - - - -
Statistical inference in a nutshell:
Am I surprised (if H0 is true)? (Do I reject the null?)
How surprised? (Give the P-value.)
What would not surprise me? (Confidence interval: estimate the actual value.)
(IPS: Testing is over-used, confidence interval estimation under-used.)
Next: Brief look at issues of 6.4; then Ch. 7
| Sievers home | Math251-Fall07/Day2s32.htm | 10:30am | 11/07/07 |