Math 151, Spring 2005, Day 32, Wednesday, April 20.  Hit reload.  (After class, really.)

Exam 3, Friday April 29 (Day 36)  Covers work Days 24 ? thru Monday Day 34
Day 32 (Wednesday, April 20): Reading: Chapters 20 and 21 through p. 392 (ActivStats is good here too).  Then continue (Alpha levels) through p. 397.  Read lightly through Power.  Read What Can Go Wrong, p. 401, and the rest. (SPSS won't do proportion computations, but some other programs do; it's good to have an idea what you might see, p. 402.)
Please respond to my email about the textbook choice!
Hand in (All D&V)NOTHING: But read and try, at least to the + + + line.
Chapter 20, p. 386: 
1, 2, Hypotheses
Negatives (meaning of P)
Dice (Meaning of P) Problem is badly written!  The null hypothesis they have in mind is that the die is fair; we want to collect evidence to assess the strength of the seller's claim that it is loaded.  (Saying "we don't believe" his claim makes it look as if his claim that it's loaded should be the null, and we want to assess the evidence refuting that claim.)
5, 6 Relief, Cars (Conclusions from P), 
9 Dowsing (doing a test)
13 Pollution (doing it)
18  I like others better
19 Women executives
22 Acid rain (inc. CI)
A. a) Use your green shoebox result to do a one-sided test against the null hypothesis p = .5, with alternative HA: p < .5.
+ + + + + + + + + +
    b) Use your green shoebox result to do a two-sided test against the null hypothesis p = .5.
7, 8 Find the mistakes
From ActivStats:
 MRA-304-2:  Kerrich Coin Toss  While he was a prisoner of the Germans during World War II, the British statistician John Kerrich tossed a coin 10,000 times.  He got 5067 heads.  Take Kerrich's tosses to be an SRS from the population of all possible tosses of his coin.  If the coin is perfectly balanced, p = 0.5.  Is there reason to think that Kerrich's coin was not balanced?
Read, to discuss
Optional
Please respond to my email about the textbook choice!  I have (a little) over 50% response---
Do-overs for Exam 2 are due Monday!  See Comments link from the day I handed back exams.
You found the 68% and 95% CI's for your sample.  Please add your results to our list (with your initials):
# of 1's, p-hat, SE(p-hat), p-hat ± SE, ME for 95% = 1.96 SE, p-hat ± 1.96 SE
Also draw your 68% CI on the graph circulating    |------o------|

Homework questions? Day 31
 Sample size for desired ME and C Day 30
Level C confidence interval estimate of population proportion p:
 "One -proportion  z-interval"
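A minimal Python sketch of the one-proportion z-interval, assuming the usual form p-hat ± z* · SE(p-hat) with SE(p-hat) = sqrt(p-hat(1 - p-hat)/n). The function name one_prop_z_interval is made up for illustration, and the 12-out-of-30 sample is the shoebox example used further down this page.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(successes, n, level=0.95):
    """Level-C confidence interval for a population proportion p."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)              # SE uses p-hat, not a hypothesized p
    z_star = NormalDist().inv_cdf((1 + level) / 2)  # about 1.96 when level = 0.95
    me = z_star * se                                # margin of error
    return p_hat - me, p_hat + me

# Shoebox sample: 12 ones in n = 30 draws
low, high = one_prop_z_interval(12, 30)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

With n only 30 the interval is wide (roughly .22 to .58), which is why a bigger sample is needed for a small ME.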


Start here Wednesday:
  Why this ME "works". Day 30



Tests:  (Chapter 20, for proportions) You have a hypothesis about the world. And some data.
Does the data lend support to the hypothesis, or is the data inconsistent with the hypothesis?
      (Retain / fail to reject the hypothesis)                       (Reject the hypothesis)
Easier to reject a hypothesis than to show that it's true.

Lots of machinery and vocabulary:
NULL Hypothesis Ho : (Straw man we collect evidence against. Status quo.)
Assume Ho is true.  Look at evidence (data).  Is it inconsistent with Ho ? Then Reject Ho .
  (How inconsistent with Ho is the data?  a little, somewhat, very?  how do we measure it?  Turn into numbers---)

Ho : a specific model for the population, with a specific parameter value.
example (suppose I hadn't told you...):  Green shoebox is full of 0's and 1's.  I tell you: equal numbers.
Ho : p = .5 (proportion of 1's is 50%)  po for a general label.
    Is your  sample (n = 30) far enough away from .5 to say that I'm lying? Suppose you believe I undersupplied 1's:

HA : p < .5  (one-sided alternative hypothesis:  What you hope /fear /would like to prove)
How do we measure "far enough away?"
IF Ho  is true: how far out (weird) is your p-hat?
IF  p = .5, how far from  the "real" p is your p-hat?  po = .5
                 Distribution of p-hats is approx.  N(po, sqrt(po(1 - po)/n)) = N( .5, sqrt(.5 ·.5/30)) = N( .5, .091)  (Usual assumptions.)
    Suppose you got 12 1's.  p-hat = .4. IF  p = .5, p-hat = .4 has a z-score of -.1/.0913 = -1.095 ~ -1.10. Sketch the Normal.
        If you know your z-scores, this is meaningful.  A more universal measure is the
P-value:  The probability, assuming Ho  is true, of observing the result we have (or one more extreme)--if we could do the experiment again...  Strength of evidence against Ho (thus for HA)
   In our example: The probability of getting a p-hat of .4 or below, IF p = .5. Sketch on the curve.
                   The "tail" below z = -1.10.  From normal table, .1357 ~ 13.6%.  So
    P-value = .136.  Not so unusual; happens more than 1 in ten times (13-14 in a hundred).  Suggestive but not "significant" by most people's standards.
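The shoebox computation above can be sketched in a few lines of Python. This is only an illustration using the standard library's Normal model in place of the table; the variable names are mine.

```python
from math import sqrt
from statistics import NormalDist

n, ones = 30, 12          # shoebox sample: 12 ones out of 30
p0 = 0.5                  # value of p under Ho

p_hat = ones / n
sd = sqrt(p0 * (1 - p0) / n)     # SD of p-hat IF Ho is true (about .091)
z = (p_hat - p0) / sd            # about -1.10
p_value = NormalDist().cdf(z)    # lower tail, because HA: p < .5

print(f"z = {z:.3f}, P-value = {p_value:.4f}")
```

The P-value comes out a hair above the table's .1357 because z is not rounded to -1.10 first; either way it is about 13.6 or 13.7%.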
Start here Friday
Example:  U.S. Consumer Product Safety Comm. says 90% of American homes have smoke detector(s). Fire department runs a big publicity campaign to raise awareness and use.  Have they raised the level in this city?  Data:  Building inspectors visit 400 (random) homes, find 376 have detectors.   p is the proportion of homes with detectors in the population (the whole city).
--Hypotheses:  Ho : p = .9  (unchanged after campaign),    HA : p > .9  (raised after campaign)
--Model:  Want to do One-proportion z-test.  Independence? Yes, random sample. n = 400. Population (city) > 4000 homes (10% rule).  .9x400 = 360, .1x400 = 40 so success/failure rule met.  Sample proportion p-hat modeled by Normal OK.
--Mechanics:    n = 400, successes x = 376.  p-hat = .940.   SD(p-hat) = sqrt(.9 ·.1/400)= .015 since we're assuming Ho true.  z=(.940-.9)/.015 = .04/.015 = 2.67.   One-sided alternative, evidence for it is to the right.  So P-value is proportion in the tail above z = 2.67; P=P(z> 2.67) = 1 - .9962 = .0038 ~ 0.4%
--Conclusions:  P-value is very low:  Strong evidence that the city proportion has risen.  (Reject Ho)
Detail:  IF Ho : p = .9  is true (the city proportion is really unchanged) THEN the chance of our seeing a proportion as high as .94 in another sample of 400 is only about 4 in a thousand.  (We prefer to believe that the alternative hypothesis is true than to believe that we got such a "lucky" sample.)
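The Model and Mechanics steps of the smoke detector test can be sketched in Python as follows. The function name one_prop_z_test_upper is hypothetical, and the check for at least 10 expected successes and failures is the usual success/failure rule of thumb, coded here as an assumption about the cutoff.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test_upper(successes, n, p0):
    """Return z and the upper-tail P-value for Ho: p = p0 vs. HA: p > p0."""
    # Success/failure rule, checked with the null value p0
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("Normal model is doubtful for this n and p0")
    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)          # SD(p-hat), assuming Ho is true
    z = (p_hat - p0) / sd
    return z, 1 - NormalDist().cdf(z)     # tail above z

z, p_value = one_prop_z_test_upper(376, 400, 0.9)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The independence and 10% conditions are about how the sample was collected, so they are not something the code can check.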

Suppose (as here) we do Reject Ho in favor of HA.  We are pretty sure the true p is different from po but by how much?
    (The "size of the effect" = ptrue - po.)  Still need an estimate of the true p. Use a CI to estimate ptrue.
--Estimate:  How much has the proportion changed?  Confidence Interval.  Now use SE(p-hat) = sqrt(.94 ·.06/400) = .0119 since we're not assuming the null hypothesis is true.  95% CI's ME = 1.96 · .0119 = 0.023324 ~ .023.
95% CI:  .94 ± .023.  I'm 95% confident the real proportion is between 91.7% and 96.3%.
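A short Python sketch of the point made in the Estimate step: the test uses an SD built from the null value po = .9, while the confidence interval uses an SE built from the observed p-hat = .94. Variable names are illustrative only.

```python
from math import sqrt

n, p0 = 400, 0.9
p_hat = 376 / n                              # 0.94

sd_null = sqrt(p0 * (1 - p0) / n)            # for the test: assumes Ho is true
se_sample = sqrt(p_hat * (1 - p_hat) / n)    # for the CI: no assumption about p

me = 1.96 * se_sample                        # margin of error for 95%
ci = (p_hat - me, p_hat + me)

print(f"SD under Ho = {sd_null:.4f}, SE from sample = {se_sample:.4f}")
print(f"95% CI: {ci[0]:.3f} to {ci[1]:.3f}")
```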

Statistical inference in a nutshell:
Am I surprised (If Ho is true)? (Do I reject null)
    How surprised? (give P-value)
  What would not surprise me?  (confidence interval)

More about the
Alternative hypothesis:  Null hypothesis is often a particular parameter value.  Alternative is something "different."
   Why are you doing a test?   Back to shoebox:
       HA : p < .5  You have reason to believe I skimped on the 1's.         One-sided
OR  HA : p > .5   You have reason to believe I put in more 1's than 0's.  One-sided
OR  HA : p not = .5  You believe the 0's and 1's are not equal, but don't know which way. Two-sided.

P-value concept needs refining:
    For One-sided alternatives, P-value is the single "tail" beyond our observed statistic,  in the direction of the alternative hypothesis.
   For a Two-sided alternative, P-value is "double the tail" beyond our observed statistic, because we could be "as or more extreme" in either direction!   (Measuring how weird our observation is, if  Ho is the case.)

Example:  Shoebox, 12/30 1's.  We got z = - 1.10.  If your alternative is HA : p not = .5,
   there is probability .136 below z = - 1.10, and probability .136 above z = + 1.10,
   so the P-value = 2· .136 = .272.  1 in 4?   Not unusual at all.  Can't reject Ho .
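"Double the tail" can be sketched in Python like this, again with the shoebox numbers. The function name two_sided_p_value is made up for the example.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(successes, n, p0):
    """Return z and the two-sided P-value for Ho: p = p0 vs. HA: p not = p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    tail = NormalDist().cdf(-abs(z))   # one tail beyond |z|
    return z, 2 * tail                 # as or more extreme in either direction

z, p_two = two_sided_p_value(12, 30, 0.5)
print(f"z = {z:.2f}, two-sided P-value = {p_two:.3f}")
```

Using -abs(z) means the same function works whether the observed p-hat lands below or above po.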

 Example. Look at Therapeutic touch. Why are you testing?
      Activstats 20-2, activity 2, HA : p not = .5  The detection is different from chance (better, or worse)
      Step-by-step p. 390,  Ch. 21    HA : p > .5  Can detect "energy field" better than chance.
    n = 150, observed successes 70.  p-hat = .467.   (notice, less able to detect than just chance)
        z-score is -.825 if you round the SD to .04 (text), -.809 if you round to .0408 (AS), so there are small differences in the normal-table values.
   HA : p not = .5  There is .2090 to the left of -.81; approximately .21 .  Two-sided P value is .21 + .21 = .42.  No evidence for ability different from chance.
   HA : p > .5   The statistic is actually on the wrong side  of the middle.  So the P-value is still the "up" side; the probability of seeing a p-hat greater than that observed.  So the P-value is 1- left tail value.  Using the above, P = 1-.21 = .79.  The chance of seeing a p-hat greater than what we observed (if they can't detect better than chance) is more than 3 out of 4.   Unsurprising result, no evidence that they can detect energy field.
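Both versions of the Therapeutic touch test can be checked with a small Python sketch. It keeps the SD unrounded, so z lands between the two rounded values mentioned above; the variable names are mine.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0 = 150, 70, 0.5
p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # about -0.82 with the SD unrounded

lower_tail = NormalDist().cdf(z)             # area to the left of z

p_two_sided = 2 * min(lower_tail, 1 - lower_tail)   # HA: p not = .5
p_upper = 1 - lower_tail                            # HA: p > .5 (the "up" side)

print(f"z = {z:.3f}")
print(f"two-sided P = {p_two_sided:.2f}, one-sided (upper) P = {p_upper:.2f}")
```

Note that the one-sided P-value is larger than .5 here, which is what happens whenever the statistic falls on the wrong side of po for the chosen alternative.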

One or two sided? p. 390   Statisticians differ philosophically.  Some much prefer two-sided all the time ("How can we really know which way things are changed/different?").  If it's clear you're expecting to see / looking to prove a particular direction, many (most? Me.) use one-sided. (D differs from V, I think.)  Say up front.  DON'T decide on one-sided/which side after you've seen the data.  That's cheating, statistically.
Skip for now "A Better Confidence Interval..." p. 383.  We do a lot of approximating to get our CI's. This turns out to give trouble, especially for p's closeish to 0 or 1.  This is a nice "fix," relatively new (1998).  Adds nothing to conceptual understanding.


Sievers home  Math151-Sp05/Days32.htm  5pm 4/20/05
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.