Math 151 , Day 37, Monday, November 20, 2006 After class Hit reload .

HW Day37   Continue  Ch. 15, to p. 376. Then 377-79, lightly. Optional: Two-sided Tests from Confidence intervals pp. 379-80 Check: p. 381  Hypotheses: 15.26, 27.  Test statistic 15.28.   P-value (one-sided) 15.31, 32.  P-value (two-sided) 15.29, 30, 33.  Significance 15.34.  Test<--CI (Optional) 15.35  Which of the answers to 35 is self-contradictory?  Which one makes logical sense?
Hand in  MONday . 
A. Simulation of shoeboxes: Use Applet:  P-value of a test of significance
H0: µ =20.
Ha:  µ  > 20     n = 4,   sigma = 4   (do Update. Not Reset-- Reset returns to mean 0)
a)  First simulate the shoebox where the mean is actually 20.  At the bottom of the picture, enter 20 in "the truth about the mean is" box.  Do Generate Sample 25 times; each time record from the picture  the P-value for that sample.
Find the number, and the proportion,  of your 25 in which the P value is < .10.   (Example:  if 3 of your 25 had P's below .10, then 3/25 = .12 is the proportion with P value < .10.)
b) Now simulate the shoebox where the mean is actually 24.  Leaving the top numbers the same, at the bottom of the picture, enter 24 in "the truth about the mean is" box.  Do Generate Sample 25 times; each time record from the picture  the P-value for that sample.   Notice the picture is still for the null hypothesis, but the sample values are usually "high".
Find the number, and the proportion,  of your 25 in which the P value is < .10.
Hand in your lists of xbars, and be ready to add your numbers and proportions to the circulating sheet on Monday.
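
If you want to check your applet results away from the applet, the shoebox experiment above can be imitated in a few lines of Python. This is only a rough sketch, not the applet's own code: the function names (simulate_shoebox, p_value_upper, proportion_below) and the seed are made up for illustration, and it assumes the box is normal with sigma = 4.

```python
import math
import random

def p_value_upper(xbar, mu0, sigma, n):
    """One-sided P-value for Ha: mu > mu0, sigma known."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # Upper-tail area of the standard normal curve beyond z.
    return 0.5 * math.erfc(z / math.sqrt(2))

def simulate_shoebox(true_mean, reps=25, mu0=20, sigma=4, n=4, seed=None):
    """Draw `reps` samples of size n from a normal box; return their P-values."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        p_values.append(p_value_upper(xbar, mu0, sigma, n))
    return p_values

def proportion_below(p_values, cutoff=0.10):
    """Proportion of the P-values that are below the cutoff."""
    return sum(p < cutoff for p in p_values) / len(p_values)

# Part a) truth = 20 (H0 is true); part b) truth = 24 (H0 is false).
for truth in (20, 24):
    ps = simulate_shoebox(truth, reps=25, seed=37)
    print("truth =", truth, " proportion with P < .10:", proportion_below(ps))
```

Any one batch of 25 will wobble, but in the long run the proportion should settle near .10 when the truth is 20, and near three-quarters when the truth is 24.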

P-value and significance
p. 372, 15.15, 16 (more of the same problems.  #15, P = .1894; #16, P = .0359.)
p. 372 15.17 ultramarathoners

Setups and Calculations.  Use the Applet:  P-value of a test of significance to check your work.
Use Table A (normal table) to find P-value: then also see what P's your z is between according to table C.
p. 376, 15.18 Water quality
p. 376 15.19 SAT  Check the mean you calculate in the back of the book.
p. 382, 15.37 IQ test scores
p. 383, 15.38 & 41  hotel managers
p. 383 15.40 Sample size affects p-value  Use the Applet:  P-value of a test of significance to do both n = 18 and n = 75.  Hit Update between. Use the same xbar of 17 in both (below the picture).  Show P.  Notice the scale change and the xbar value move "out".  Repeat, comparing n = 18 , n = 28,  n = 38.  Here the written scale on the x-axis doesn't change (so xbar stays in the same place) but the normal curve visibly narrows as n increases.
p. 383  15.42 Supreme Court

p. 383, 15.43  wrong alternative
p. 383, 15.44 the wrong p
p. 383, 15.45 Placebo effect, make hypotheses.

&&Postpone yet again & & & Leftover problems from Day 30 & & & & & & & &
          These ideas are related to those in Ch. 15.
p. 290, 11.39 Pollutants in auto exhausts  For 11.39:  You might want to know L so that if you tested your 25 cars and found a high value of x-bar, you would be able to compare it with L; if it was greater than L, you would go back to the manufacturer and say "I  believe you sold me a batch of bad cars, because the chance of getting an average emission level this high if the exhaust system is working properly is only 1 in 100. It is more reasonable to believe the exhaust system is not working than that we "are" that 1 in 100 possibility."
  p. 290,  11.38 Glucose testing  If we use this cutoff level L to say that people (with a mean of 4 tests) over L "have diabetes", then the chance of declaring that someone "has diabetes" when they really are OK (with mean 125mg/dl) is .05.  .05 or 5% is the chance of a "false positive" using this protocol, when the real mean is 125.
& & & & & & & & & & & & & & & & & &

Read, 
to discuss
For 15.35, p. 382: Ignoring the actual question:  Which of the answers to 35 is self-contradictory?  Which one makes logical sense (whether or not it's true)?  Sketch a normal curve and mark out the areas for alpha = .10 and alpha = .05.
Optional 
(more practice) 
 
Your shoebox results:  Write your xbars , z's, P-values, and <.10 (Y/N) (one on each pad--yellow or white).  And, if you didn't last time, make a dot for each on the circulating dotplot.

Exams not finished.

Ch. 15: "Significance tests use an elaborate vocabulary, but the basic idea is simple: an outcome that would "rarely" happen if a claim were true--is good evidence that the claim is NOT true." (p.363 top)
Day 34 for other details.  Summary, comments:

The game:
Before taking data, define
H0: "Null hypothesis" A claim or statement about the population we would like to show is NOT true.
   Stated usually as:  A parameter = a particular value.  H0: µ =1000 hrs.  ("Average lightbulb life".)
Ha: "Alternative hypothesis" A claim or statement about the population we are trying to find evidence FOR.
      Stated usually as: The parameter  is >, or <, (one-tail tests) --
                       or NOT = the particular value. (two-tail)
    Ha:   µ  > 1000 hrs. (Suppose we have a New process that makes them burn longer. We hope.)
    Other possible alternatives: Ha:   µ  < 1000 hrs.  (Want evidence that Mfr.'s claim is inflated)
             (two-sided=two-tail) Ha:   µ  Not = 1000 hrs.  (Want evidence that Assembly line process is"off")

   Some authorities say you should always do two-sided tests.  Others say:  If you have a hope or suspicion, or are only interested in one direction, then do a one-sided test.  What's NOT OK is to look at your data and then decide your alternative hypothesis.
HW questions?  Day 35: 15.3, 4, 6, 7

Take data.  Calculate test statistic. For µ, test statistic is the z-score of xbar. (Start with xbar, standardize using mean of H0)
    Is it an unlikely result if  H0 is true?  Then that is evidence against H0.
HW questions?  Day 35: 15.8, 9, 10

Measuring the strength of the evidence against H0 (a common measuring stick for all distributions and parameters):
P-value of a test:  The probability, computed assuming that H0 is true, that the observed outcome would take a value as extreme or more extreme than that actually observed (if we could repeat taking-data again).  p. 368.
    The smaller the P-value, the stronger the data's evidence against H0 ( for Ha).

For a test of µ  , using xbar (sigma known), the P-value is
--the area of the tail beyond the observed xbar, in the direction of Ha (one-sided)
(--or twice that area (two-sided).)
Applet:  P-value of a test of significance automates this.  (Uses "raw" scale of xbars, rather than z-scores).
HW questions?  Day 35: 15.12, 13, 14 (one-sided).  11 (two-sided.)

Example (one sided): 
H0: µ =1000 hrs.  (Average lightbulb life.)  Suspect company's cheating:   Show it's worse.
                 
Ha:   µ  < 1000 hrs.
       
Sample of size n = 25.  Population sigma = 150 hrs. Get xbar = 940 hrs.  Are these bulbs worse than claimed?
                 z = (940-1000) ÷ (150/5) = -60/30 = -2.
        
P(Z < -2) =  .0228 = P-value.  More than  2% and less than 3% chance of getting an xbar this low (or lower) if H0 is true and we did it again.
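
The one-sided lightbulb calculation can be checked without Table A. A minimal sketch, assuming only the numbers in the example; the helper name normal_cdf is made up here, and it uses the standard relation between the normal curve and the error function.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

mu0, sigma, n, xbar = 1000, 150, 25, 940

# Standardize xbar using the mean claimed by H0.
z = (xbar - mu0) / (sigma / math.sqrt(n))   # (940 - 1000) / 30

# Ha: mu < 1000, so the P-value is the area in the lower tail.
p_one_sided = normal_cdf(z)

print(z, round(p_one_sided, 4))
```

This prints -2.0 and 0.0228, agreeing with the table lookup.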

Example (two sided):    
H0: µ =1000 hrs.  (Average lightbulb life.)

           Ha:   µ  Not = 1000 hrs. (Quality control on assembly line--find if it is "off" either way.)    
Ha: "Alternative hypothesis"
A claim or statement about the population we are trying to find evidence FOR
              A value either much bigger than or much smaller than the H0 value is evidence against H0 & for Ha.
 
   Sample of size n = 25.  Population sigma = 150 hrs.  Get xbar = 940 hrs. z = (940-1000) ÷ (150/5) =  - 2
   
        P(Z < -2) = .0228
 P-value:
We measure the probability of seeing something (again) as extreme as the observed value (or more so).

So you need to measure the P-value symmetrically both directions from the observed value--so the P value is double what it would be for a one-sided test.  P-value is approximately 5%; more precisely, 2·.0228 = .0456
So for a test of a mean, the P-value for one-sided is half that for two sided, IF the result is in the direction of evidence for the alternative.
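
The "half / double" rule, and the IF attached to it, can be seen by computing all three kinds of P-value for the same z. A sketch only: the function p_value and its labels "less", "greater", "two-sided" are illustrative names, not anything from the book or the applet.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def p_value(z, alternative):
    """P-value of a z test; alternative is 'less', 'greater' or 'two-sided'."""
    if alternative == "less":        # Ha: mu < mu0, lower tail
        return normal_cdf(z)
    if alternative == "greater":     # Ha: mu > mu0, upper tail
        return 1 - normal_cdf(z)
    if alternative == "two-sided":   # Ha: mu not = mu0, both tails
        return 2 * normal_cdf(-abs(z))
    raise ValueError("unknown alternative: " + alternative)

z = -2.0   # the lightbulb z
for alt in ("less", "two-sided", "greater"):
    print(alt, round(p_value(z, alt), 4))
```

The two-sided value comes out .0455 rather than .0456 only because Table A rounds .02275 up to .0228 before doubling. The "greater" line shows what happens when the result is in the wrong direction for Ha: the P-value is about .977, nowhere near half the two-sided P.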
            
A "Significance level" alpha is a probability level we decide on  in advance as being the "rarely" amount that will push us over into believing (well, sort of) that the H0 claim  is not true. (Historically older language than P-value.  Appropriate levels vary by discipline.)
We tend to use simple benchmark numbers for it, like .10 (1 in 10), .05 (1 in 20), .01 (1 in 100).
When the P-value is less  than (or equal to) a particular significance level alpha (say .05), we say,
    "The results are significant at the alpha = .05 level," or "The results are significant (P< .05)" .  Giving actual P is better, if you can.
 Lightbulbs:  One-sided:  .0228 = P-value.  More than  2% and less than 3% chance of getting a result this low if H0 is true and we did it again.
          "Significant at the alpha =.03 level.  Also at the alpha = .05 level"  (P-value says,  rarer than these levels)
          "Not significant at the alpha = .02 level.  Also not significant at the alpha = .01 level" 
(P-value says, more common than these levels)
   Two-sided:  .0456 = P-value.  (Barely) less than 5% chance of getting a result this far out if we did it again.
            "Significant at the alpha = .05 level. (Also at alpha = .10.)   Not significant at the alpha = .04 level.  Nor at the .01 level."
Applet: Statistical Significance
You can pick the alpha you desire, and see if your x-bar lies outside the "alpha" barrier(s). (approach of p. 376-79) But P-value is more informative.
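
The rule "significant when P is less than or equal to alpha" is simple enough to tabulate for the lightbulb P-values. A small sketch; is_significant is a made-up helper name, and the list of alphas is just the levels mentioned above.

```python
def is_significant(p_value, alpha):
    """Significant at level alpha when the P-value is at or below alpha."""
    return p_value <= alpha

for label, p in (("one-sided", 0.0228), ("two-sided", 0.0456)):
    for alpha in (0.10, 0.05, 0.04, 0.03, 0.02, 0.01):
        verdict = "significant" if is_significant(p, alpha) else "not significant"
        print(label, "P =", p, " alpha =", alpha, "->", verdict)
```

Once the verdict flips to "not significant" it stays that way for every smaller alpha, which is why reporting the actual P tells the reader more than any single verdict.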

HW questions?  Day 35
- - - - - - -
Today, finally? NOT YET Look back
at 11.38, p. 297.   "backward normal" problem.  From a proportion/probability, find a z*, from that a raw value (here an x-bar).  We can think of this as a significance testing question.  n = 4, sigma = 10 mg/dl.
     H0: µ =125mg/dl (Sheila is normal),   Ha: µ  > 125 (Sheila has gestational diabetes.) 
    Find the L so that only .05 of random samples of 4 tests would have mean above L, among people (Sheila) whose real mean is 125.
      L is the "cutoff" for doing an alpha = .05 test. 
5% of "healthy" people will be diagnosed diabetic (false positive).
           Doctors like a "decision making rule", want an alpha cutoff to apply,  rather than calculating a P-value for each individual's set of 4 tests.
Note that table C gives us another way to get z*'s for some probabilities!  Bottom row, "one sided P".  The table is set up to go from "tail" probability to z*, without having to calculate "probability to the left."
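
The "backward normal" step for the glucose cutoff can be sketched in Python, using the numbers stated above (mean 125, sigma 10, n = 4, alpha .05). The variable names are illustrative; NormalDist().inv_cdf from the standard library stands in for reading z* off the table.

```python
import math
from statistics import NormalDist

mu0, sigma, n, alpha = 125, 10, 4, 0.05

# z* with probability alpha in the upper tail (the "one sided P" row of Table C).
z_star = NormalDist().inv_cdf(1 - alpha)

# Un-standardize: go from z* back to the raw xbar scale.
L = mu0 + z_star * sigma / math.sqrt(n)

print(round(z_star, 3), round(L, 2))
```

This gives z* of about 1.645 and a cutoff L of about 133.2 mg/dl: a mean of 4 tests above that would be called significant at alpha = .05.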

- - - - - - - - - - - - - - - -
YES What if you don't have the Z-table but only have the t-table (Table C)?
What if you have a demanded level of significance, alpha?
    Table C: a limited list of probabilities  across the bottom rows:
            = Tail values for the bell curve distribution.   (one sided = one tail, two sided = two symmetrical tails)
        The value in the z* row above  P is the corresponding standard normal value ("critical value"). 
                 Check z* = 1.960:  one-sided P = .025 is the area above it (or below -1.960);  two-sided P = .05 is the area farther out than it in both directions.  Corresponds to Table A.
      
  Do this: Find your z from the data. Make a sketch of the normal curve and mark your z on it.  Mark the direction(s) of Ha.
    (If your z is in the direction(s) of Ha, continue.  Otherwise the results are hopelessly not significant: you can quit.)
Find the two z*'s in Table C that bracket your z (ignore minus sign).  Find the corresponding P's.
    e.g. z = 2.111
                            z = 2.111
      z*           ...  2.054  \/  2.326
      One-sided P  ...  .02        .01
      Two-sided P  ...  .04        .02


      So the P-value for your z is: between .02 and .01 (If it's a one sided test)
         &  between double those 2 p's--between .04 and .02 (If it's a two sided test)

    Test is significant at the bigger bracketing probability; not sig. at the smaller.
One sided: P-value is less than .02 and greater than .01
        Significant at the .02 level,not at the .01 level
Two sided: P-value is less than .04 and greater than .02
        Significant at the .04 level,not at the .02 level
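
The bracketing routine can be imitated in Python as a check on your table reading. This is a sketch under assumptions: the list TAIL_PROBS is meant to match the one-sided probabilities printed in Table C (check it against your copy), the z*'s are recomputed rather than copied from the table, and bracket_one_sided is a made-up name. It only makes sense when z is in the direction of Ha.

```python
from statistics import NormalDist

# One-sided tail probabilities assumed to be the ones listed in Table C.
TAIL_PROBS = [0.25, 0.20, 0.15, 0.10, 0.05, 0.025, 0.02, 0.01,
              0.005, 0.0025, 0.001, 0.0005]

def bracket_one_sided(z):
    """Return (bigger P, smaller P) bracketing the one-sided P-value of |z|.

    An end of the bracket is None when |z| falls off that end of the list.
    """
    z = abs(z)                       # ignore the minus sign
    bigger, smaller = None, None
    for p in TAIL_PROBS:             # P decreases as z* increases
        z_star = NormalDist().inv_cdf(1 - p)
        if z_star <= z:
            bigger = p               # z is at or beyond this z*
        else:
            smaller = p              # first z* that z does not reach
            break
    return bigger, smaller

hi, lo = bracket_one_sided(2.111)
print("one-sided P between", lo, "and", hi)
print("two-sided P between", 2 * lo, "and", 2 * hi)
```

For z = 2.111 this reports the one-sided P between .01 and .02 and the two-sided P between .02 and .04, the same brackets as above.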
If you have a specific demanded significance level, compare it with these levels.
            If  a test is significant at level b, then it is significant at every level bigger than b.
            If a test is Not significant at level d, then it is Not significant at every level smaller than d.
    "Significant at a":  probability of getting my results (again) by chance (if H0 is true) is less than (or =) a. My result is less common than a.
Results:       Significant at   |   Not significant at
p   (bigger)   .10      .05     |   .01      .005     .001   (smaller)
                                /\
                       P-value (one-sided)
                             z-value
z*  (smaller)  1.282    1.645   |   2.326    2.576    3.091  (bigger)
  You can compare z directly to z* for your desired alpha.  z >z*?  Significant at that alpha.  
    The 2-sided is a bit tricky.  (Don't halve or double z's--it doesn't work!  Halve alpha instead: compare |z| with the z* whose one-sided tail is alpha/2.)
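
Comparing z directly to z* can be sketched the same way, including the two-sided case, where the thing to split is alpha (half in each tail) and never z. The function critical_value is a hypothetical helper; the one-sided comparison assumes z is already in the direction of Ha.

```python
from statistics import NormalDist

def critical_value(alpha, two_sided=False):
    """z* for level alpha: tail area alpha (one-sided) or alpha/2 (two-sided)."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)

z = 2.111
for alpha in (0.05, 0.02, 0.01):
    one = abs(z) >= critical_value(alpha)
    two = abs(z) >= critical_value(alpha, two_sided=True)
    print("alpha =", alpha, " one-sided significant:", one,
          " two-sided significant:", two)
```

For z = 2.111 the one-sided test is significant at .05 and .02 but not at .01, while the two-sided test is significant only at .05, matching the bracketing done with Table C.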

Have a lovely Thanksgiving!


Sievers home  Math151-Fall06/Daym37.htm  11am 11/20/06
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.