| Hand in (All D&V): NOTHING.
But read and try, at least up to the + + + line.
Chapter 20, p. 386:
1, 2 Hypotheses
3 Negatives (meaning of P)
4 Dice (meaning of P). Problem is badly written! The null hypothesis they have in mind is that the die is fair; we want to collect evidence to assess the strength of the seller's claim that it is loaded. (Saying "we don't believe" his claim makes it look as if his claim that it's loaded should be the null, and we want to assess the evidence refuting that claim.)
5, 6 Relief, Cars (conclusions from P)
9 Dowsing (doing a test)
13 Pollution (doing it)
19 Women executives
22 Acid rain (inc. CI)
A. a) Use your green shoebox result to do a one-sided test against the null hypothesis p = .5, with alternative HA: p < .5.
+ + + + + + + + + +
b) Use your green shoebox result to do a two-sided test against the null hypothesis p = .5.
7, 8 Find the mistakes
From ActivStats: MRA-304-2: Kerrich Coin Toss. While he was a prisoner of the Germans during World War II, the British statistician John Kerrich tossed a coin 10,000 times. He got 5067 heads. Take Kerrich's tosses to be an SRS from the population of all possible tosses of his coin. If the coin is perfectly balanced, p = 0.5. Is there reason to think that Kerrich's coin was not balanced? |
Read, to discuss |
Optional |
Homework questions? Day 31
Sample size for desired ME and C. Day 30
Level C confidence interval estimate of population proportion p: "One-proportion z-interval"
Start here Wednesday:
Why this ME "works". Day 30
Lots of machinery and vocabulary:
NULL Hypothesis Ho: (Straw man we collect evidence against. Status quo.)
Assume Ho is true. Look at the evidence (data). Is it inconsistent with Ho? Then reject Ho.
(How inconsistent with Ho is the data? A little, somewhat, very? How do we measure it? Turn it into numbers.)
Ho: a specific model for the population, with a specific parameter value.
Example (suppose I hadn't told you...): The green shoebox is full of 0's and 1's. I tell you there are equal numbers of each.
Ho: p = .5 (proportion of 1's is 50%). We write po as a general label for the null value.
Is your sample (n = 30) far enough away from .5 to say that I'm lying? Suppose you believe I undersupplied 1's:
HA: p < .5 (one-sided alternative hypothesis: what you hope / fear / would like to prove)
How do we measure "far enough away"?
IF Ho is true: how far out (weird) is your p-hat?
IF p = .5, how far from the "real" p is your p-hat?
po = .5
Distribution of p-hats is approx. N(po, sqrt(po(1 - po)/n)) = N(.5, sqrt(.5 · .5/30)) = N(.5, .091). (Usual assumptions.)
Suppose you got 12 1's. p-hat = .4. IF p = .5, p-hat = .4 has a z-score of -.1/.0913 = -1.095 ~ -1.10. Sketch the Normal.
If you know your z-scores, this is meaningful. A more universal measure is the
P-value: The probability, assuming Ho is true, of observing the result we have (or one more extreme) if we could do the experiment again. It measures the strength of evidence against Ho (thus for HA).
In our example: The probability of getting a p-hat of .4 or below, IF p = .5.
Sketch on the curve. The "tail" below z = -1.10. From the Normal table, .1357 ~ 13.6%.
So P-value = .136. Not so unusual; it happens more than 1 time in 10 (13-14 in a hundred). Suggestive but not "significant" by most people's standards.
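The shoebox calculation can be checked by machine. This is only a sketch, not part of the text or ActivStats: the helper name normal_cdf and the variable names are made up here, and the Normal table is replaced by Python's math.erf. It carries z unrounded (about -1.095), so the tail comes out near .137 rather than the table's .1357 for z = -1.10.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z) (hypothetical helper)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n = 30       # sample size drawn from the shoebox
ones = 12    # number of 1's observed
p0 = 0.5     # value of p claimed by Ho

p_hat = ones / n                         # observed proportion, .4
sd_null = math.sqrt(p0 * (1 - p0) / n)   # SD of p-hat IF Ho is true
z = (p_hat - p0) / sd_null               # how many SDs p-hat sits from po
p_value = normal_cdf(z)                  # lower tail, because HA: p < .5

print(f"p-hat = {p_hat:.3f}, SD = {sd_null:.4f}, z = {z:.3f}, P-value = {p_value:.4f}")
```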
Start here Friday
Example: The U.S. Consumer Product Safety Commission says 90% of American homes have smoke detector(s). A fire department runs a big publicity campaign to raise awareness and use. Have they raised the level in this city? Data: Building inspectors visit 400 (random) homes and find 376 have detectors. p is the proportion of homes with detectors in the population (the whole city).
--Hypotheses: Ho: p = .9 (unchanged after campaign), HA: p > .9 (raised after campaign)
--Model: Want to do a One-proportion z-test. Independence? Yes, random sample. n = 400. Population (city) > 4000 homes (10% rule). .9 x 400 = 360, .1 x 400 = 40, so the success/failure rule is met. Modeling the sample proportion p-hat by a Normal is OK.
--Mechanics: n = 400, successes x = 376. p-hat = .940. SD(p-hat) = sqrt(.9 · .1/400) = .015 since we're assuming Ho true. z = (.940 - .9)/.015 = .04/.015 = 2.67.
One-sided alternative, evidence for it is to the right. So the P-value is the proportion in the tail above z = 2.67; P = P(z > 2.67) = 1 - .9962 = .0038 ~ 0.4%
--Conclusions: P-value is very low: Strong evidence that the city proportion has risen. (Reject Ho.)
Detail: IF Ho: p = .9 is true (the city proportion is really unchanged) THEN the chance of our seeing a proportion as high as .94 in another sample of 400 is only about 4 in a thousand. (We prefer to believe that the alternative hypothesis is true rather than believe that we got such a "lucky" sample.)
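The Model and Mechanics steps of the smoke detector example can be sketched in a few lines. The function names check_conditions and upper_tail_z_test are invented for this illustration, and the cutoff of 10 expected successes and failures is an assumption about how the success/failure rule is stated. The city's size is not given in the example, so the 10% rule is checked only if a population size is supplied.

```python
import math

def check_conditions(n, p0, population_size=None, minimum_count=10):
    """Success/failure rule, plus the 10% rule when a population size is supplied.

    minimum_count = 10 is an assumed threshold for the success/failure rule.
    """
    success_failure = n * p0 >= minimum_count and n * (1 - p0) >= minimum_count
    ten_percent = population_size is None or n <= 0.10 * population_size
    return success_failure and ten_percent

def upper_tail_z_test(successes, n, p0):
    """z and P-value for Ho: p = p0 against the one-sided HA: p > p0."""
    p_hat = successes / n
    sd_null = math.sqrt(p0 * (1 - p0) / n)          # uses p0: we assume Ho is true
    z = (p_hat - p0) / sd_null
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))   # upper tail P(Z > z)
    return z, p_value

conditions_ok = check_conditions(n=400, p0=0.9)
z, p_value = upper_tail_z_test(successes=376, n=400, p0=0.9)
print(conditions_ok, round(z, 2), round(p_value, 4))
```

The P-value should land close to the table answer of .0038; any small difference comes from rounding z to 2.67 before using the table.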
Suppose (as here) we do reject Ho in favor of HA. We are pretty sure the true p is different from po, but by how much?
(The "size of the effect" = ptrue - po.) We still need an estimate of the true p. Use a CI to estimate ptrue.
--Estimate: How much has the proportion changed? Confidence Interval. Now use SE(p-hat) = sqrt(.94 · .06/400) = .0119 since we're not assuming the null hypothesis is true. The 95% CI's ME = 1.96 · .0119 = 0.023324 ~ .023.
95% CI: .94 ± .023. I'm 95% confident the real proportion is between 91.7% and 96.3%.
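The interval above can be reproduced with a short sketch. The function name proportion_ci is hypothetical, and z* = 1.96 is hard-coded as the 95% critical value. Note the standard error is built from p-hat, not po, because the null is no longer being assumed.

```python
import math

def proportion_ci(successes, n, z_star=1.96):
    """One-proportion z-interval: p-hat plus or minus z* times SE(p-hat)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # SE uses p-hat, not po
    me = z_star * se                          # margin of error
    return p_hat - me, p_hat + me

low, high = proportion_ci(successes=376, n=400)
print(f"95% CI: {low:.3f} to {high:.3f}")
```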
Statistical inference in a nutshell:
Am I surprised (if Ho is true)? (Do I reject the null?)
How surprised? (Give the P-value.)
What would not surprise me? (Confidence interval.)
More about the Alternative hypothesis: The null hypothesis is often a particular parameter value. The alternative is something "different."
Why are you doing the test? Back to the shoebox:
HA: p < .5 You have reason to believe I skimped on the 1's. One-sided.
OR HA: p > .5 You have reason to believe I put in more 1's than 0's. One-sided.
OR HA: p not = .5 You believe the 0's and 1's are not equal, but don't know which way. Two-sided.
P-value concept needs refining:
For One-sided alternatives, the P-value is the single "tail" beyond our observed statistic, in the direction of the alternative hypothesis.
For a Two-sided alternative, the P-value is "double the tail" beyond our observed statistic, because we could be "as or more extreme" in either direction! (Measuring how weird our observation is, if Ho is the case.)
Example: Shoebox, 12/30 1's. We got z = -1.10.
If your alternative is HA: p not = .5, there is probability .136 below z = -1.10, and probability .136 above z = +1.10, so the P-value = 2 · .136 = .272. About 1 in 4? Not unusual at all. Can't reject Ho.
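The "single tail" versus "double the tail" rule can be written as one small function. This is a sketch only: the function name p_value and the labels "less", "greater" and "two-sided" are chosen here for illustration and do not come from the text. It is applied to the shoebox z = -1.10.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z) (hypothetical helper)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_value(z, alternative):
    """P-value for an observed z-score under the stated alternative."""
    if alternative == "less":        # HA: p < po, tail below z
        return normal_cdf(z)
    if alternative == "greater":     # HA: p > po, tail above z
        return 1.0 - normal_cdf(z)
    if alternative == "two-sided":   # HA: p not = po, double the smaller tail
        return 2.0 * normal_cdf(-abs(z))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

for alt in ("less", "greater", "two-sided"):
    print(alt, round(p_value(-1.10, alt), 3))
```

Doubling the unrounded tail gives about .271, which matches the .272 above up to rounding the tail to .136 first.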
Example. Look at Therapeutic touch. Why are you testing?
ActivStats 20-2, activity 2, HA: p not = .5 The detection is different from chance (better, or worse).
Step-by-step p. 390, Ch. 21, HA: p > .5 Can detect "energy field" better than chance.
n = 150, observed successes 70. p-hat = .467. (Notice: less able to detect than just chance.)
The z-score is -.825 if you round SD to .04 (text), -.809 if you round to .0408 (AS). (So there are small differences in the Normal table values.)
HA: p not = .5 There is .2090 to the left of -.81; approximately .21. The two-sided P-value is .21 + .21 = .42. No evidence for ability different from chance.
HA: p > .5 The statistic is actually on the wrong side of the middle. The P-value is still the "up" side: the probability of seeing a p-hat greater than that observed. So the P-value is 1 - left tail value. Using the above, P = 1 - .21 = .79. The chance of seeing a p-hat greater than what we observed (if they can't detect better than chance) is more than 3 out of 4. Unsurprising result, no evidence that they can detect the energy field.
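How much the rounding matters can be seen directly. A sketch, with variable names invented here: it computes z three ways (nothing rounded; p-hat = .467 with SD = .04 as in the text; p-hat = .467 with SD = .0408 as in ActivStats) and then both P-values from the unrounded z.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z) (hypothetical helper)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, successes, p0 = 150, 70, 0.5
p_hat = successes / n
sd_null = math.sqrt(p0 * (1 - p0) / n)   # SD of p-hat IF Ho is true

z_exact = (p_hat - p0) / sd_null         # nothing rounded
z_text = (0.467 - p0) / 0.04             # p-hat and SD rounded as in the text
z_as = (0.467 - p0) / 0.0408             # SD rounded as in ActivStats

left_tail = normal_cdf(z_exact)
p_two_sided = 2 * left_tail              # HA: p not = .5
p_upper = 1 - left_tail                  # HA: p > .5, statistic on the "wrong" side

print(f"z: exact {z_exact:.3f}, text {z_text:.3f}, ActivStats {z_as:.3f}")
print(f"two-sided P = {p_two_sided:.2f}, upper-tail P = {p_upper:.2f}")
```

All three z-scores lead to the same conclusion; the rounding only moves the second decimal place of the P-values.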
One or two sided? p. 390 Statisticians differ philosophically. Some much prefer two-sided all the time. ("How can we really know which way things are changed/different?") If it's clear you're expecting to see / looking to prove a particular direction, many (most? Me.) use one-sided. (D differs from V, I think.)
Say up front. DON'T decide on one-sided/which side after you've seen the data. That's cheating, statistically.
Skip for now "A Better Confidence Interval..." p. 383. We do a lot of approximating to get our CI's. This turns out to give trouble, especially for p's close to 0 or 1. This is a nice "fix," relatively new (1998). It adds nothing to conceptual understanding.
| Sievers home | Math151-Sp05/Days32.htm | 5pm | 4/20/05 |