Math 151, Day 33, Friday, April 20, 2001

>EXAM 3 a week from today, in class, closed book.
Through 6.3 at least; possibly part of 7.1 (we'll skip sec. 6.4)
Also Ch. 4, everything but probabilities in a finite space.

Quiz: better.  If you got a B+ or lower, you may try a third time (max grade A--), Monday before or after class, or at 12:30.  Returned quizzes are in HW folder outside my door.

Significance Testing, cont'd.
2-sided test: The P-value is the probability, computed assuming H0 is true, of seeing a result as extreme as the observed value (or more so) in either direction.
So you measure symmetrically on both sides of the hypothesized value, out as far as the observed value--so the P-value is double what it would be for a one-sided test.

#6.35, p. 333 Engine crankshafts:  We want to stop the process and fix it if the mean gets too far "off" from 224 mm--either direction would be bad.  So two-sided.
sigma = 0.060 mm, n = 16.  Std. dev. of xbar = sigma/sqrt(n) = 0.060/4 = 0.015 mm
H0 : mu = 224 mm
Ha : mu ≠ 224 mm
xbar = 224.0019375   (sample standard deviation = .0618)
Standardizing: z = (224.0019375 - 224)/ .015 = .0019375/.015 = 0.12917 ~ .13  (xbar is clearly close to mu)
(If you used .0618, not the .06 you were supposed to, you would get .1254--still rounds to .13)
Farther out than .13 to the right has probability (1- .5517) = .4483.
Farther out than -.13 (symmetrical) to the left also has probability .4483.
So P-value, 2-sided, = .4483 + .4483 = .8966
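
A minimal Python sketch of the crankshaft calculation above, for checking the arithmetic. The function name two_sided_p_value is made up for illustration; the numbers are the ones from #6.35, and the standard library's NormalDist stands in for Table A.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(xbar, mu0, sigma, n):
    """Return (z, P) for a two-sided z test with known sigma."""
    # Standardize xbar using the standard deviation of the sample mean.
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Area farther out than |z| in one tail, then doubled for two sides.
    one_tail = 1 - NormalDist().cdf(abs(z))
    return z, 2 * one_tail


z, p = two_sided_p_value(xbar=224.0019375, mu0=224, sigma=0.060, n=16)
print(f"z = {z:.4f}, two-sided P = {p:.4f}")

# Table A works from z rounded to two places, as in the hand calculation.
p_table = 2 * (1 - NormalDist().cdf(round(z, 2)))
print(f"using z = {round(z, 2)}: two-sided P = {p_table:.4f}")
```

With z left unrounded the P-value comes out near .897 rather than the table's .8966. Either way the result is nowhere near significant.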

Results of shoebox samples.
Questions on HW

Sec 6.3, cont'd:   cautions and limitations: pp. 345-348
>>Data must be from SRS or reasonable facsimile
      All the other warnings p. 312:  normality, watch out for outliers, skewness.  Sigma known or n large.
>>Multiple Tests: beware!
    If you do 100 tests at the alpha = .05 significance level each, the structure of testing guarantees this:
    Even when all 100 null hypotheses H0 are true, about 5 of the 100 (.05) will give "significant" results by chance alone (falsely indicating that the alternative hypothesis is to be preferred).
    Moral: if you use the testing mechanism as a screening instrument for many questions, a proportion will give falsely significant results.  You can't accept the results from such multiple tests as good evidence, only as indicating questions requiring further, more specific study. The game gives you one shot, not a hundred.
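
The "about 5 of the 100" figure can be checked with a short sketch. The function names are hypothetical, and the second calculation assumes the 100 tests are independent, which the notes do not state.

```python
def expected_false_positives(num_tests, alpha):
    """Expected number of 'significant' results when every H0 is true."""
    return num_tests * alpha


def prob_at_least_one_false_positive(num_tests, alpha):
    """Chance that at least one test is falsely significant.

    Assumes the tests are independent (an assumption made for this sketch).
    """
    return 1 - (1 - alpha) ** num_tests


expected = expected_false_positives(100, 0.05)
at_least_one = prob_at_least_one_false_positive(100, 0.05)
print(f"expected false positives: {expected:.1f}")
print(f"P(at least one false positive): {at_least_one:.4f}")
```

Under that independence assumption, at least one falsely significant result among 100 true null hypotheses is nearly certain, which is why a single "hit" from a screen is not good evidence.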

"Significance testing" vs. "Hypothesis testing"-- two different approaches that blur...
Both start with null and alternative hypotheses.  You want to show the alternative is true.
Significance testing:  Calculate P-value (or closest alpha), describe how unusual your result is if H0 is true.
Let the audience for your work decide if they believe in the alternative hypothesis or not.
   Language: "strong evidence for Ha, against H0" or "not strong..."

Hypothesis testing:  Make a decision  between H0 and Ha (often associated with predetermined fixed alpha level)
We need to do something.
    Language:  "Accept Ha, reject H0" if P-value smaller than alpha.
        What if we can't reject H0?  Do we accept H0? Safer:  "fail to reject H0"
      H0 "Innocent"                 "Guilty" Ha
                     \ "Not Proven" /         but defendant goes free...
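
The fixed-alpha decision rule above, written out as a small sketch. The function name decide and the default alpha = .05 are choices made here for illustration, not something fixed by the text.

```python
def decide(p_value, alpha=0.05):
    """Apply the fixed-level rule: reject H0 only if P-value is smaller than alpha."""
    if p_value < alpha:
        return "reject H0, accept Ha"
    # Not enough evidence: the safer wording, not "accept H0".
    return "fail to reject H0"


# Crankshaft example: two-sided P-value = .8966
print(decide(0.8966))
```

Note that the function never returns "accept H0": the "Not Proven" verdict is the only alternative to rejecting.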

If we make a decision we run the risk of error:
Type I error: Accepting alternative Ha when null H0 is true (probability = alpha).  The test is designed to focus on this one.
Type II error: Accepting null H0 when alternative Ha is true (probability = beta, which depends on exactly which parameter value in Ha is true).  We can't make this one if we refuse to commit (only "fail to reject H0"), but
a small Type II error probability means the power of the test (1 - beta) to detect the alternative hypothesis is high.
(Sec. 6.4, optional, takes this further)
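
To see how beta depends on the particular value in Ha, here is a hedged sketch using the crankshaft setup (mu0 = 224, sigma = 0.060, n = 16). The alternative mu = 224.03 and the level alpha = .05 are hypothetical values chosen for illustration; they are not from the exercise. The function name is also made up.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_power(mu0, mu_alt, sigma, n, alpha=0.05):
    """Power of a two-sided z test (known sigma) when the true mean is mu_alt."""
    std_normal = NormalDist()
    z_star = std_normal.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha = .05
    # How far the true mean sits from mu0, in standard deviations of xbar.
    shift = (mu_alt - mu0) / (sigma / sqrt(n))
    # We reject when z > z* or z < -z*; under mu_alt, z is Normal(shift, 1).
    upper = 1 - std_normal.cdf(z_star - shift)
    lower = std_normal.cdf(-z_star - shift)
    return upper + lower


# Hypothetical alternative: true mean is 224.03 mm (0.03 mm off target).
power = two_sided_z_power(mu0=224, mu_alt=224.03, sigma=0.060, n=16)
beta = 1 - power  # probability of a Type II error at this alternative
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

Trying other values of mu_alt shows beta shrinking as the true mean moves farther from 224. When mu_alt equals mu0 the "power" is just alpha, the Type I error probability.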

HW: Reread Ch. 6, bring questions.  If no questions, I'll start Chapter 7 Monday.
From Day 32--Hand in Monday: 
More p-values 
p.341, 6.44 CEO pay
= = = = = = = = = 
Table C: 
p.341, 6.48 CEO pay again
p. 341, 6.46, 6.49 general z statistic, significance. Turn the page--6.49 continues.
p. 342 6.50 patent protection; another z.
= = = = = = = = = = 
Fixed significance levels: if you only have table C, what can you say? 
p. 337, 6.37 testing number generator
6.38 nicotine content
= = = = = = = = = = 
p. 342, 6.52 1% vs 5%
   6.53 define stat. signif.
p. 343, 6.54  knife edge .05
p. 345, 6.55 and 6.56 effect of n
Read, to discuss Optional
Also for Monday
Bring questions on Ch. 6, (or Ch. 4).
Hand in:
Sec. 6.3, (pp. 344-48 is new)
p. 346 6.57 test ok?
p.348 6.61 strong vs. signif.
p. 347 6.58 500 tests for psychic powers
 6.59 what is significance good for?
 6.60  radar detectors
 6.61 77 potential schizophrenia markers


Review of ch. 6
p. 339 6.40 job satisfaction, 2 sided
p. 360 6.74 wine--stemplot, CI , test.  Notice "less sensitive" noses will have higher thresholds.
p. 362, 6.79 a,b effect of sample size
6.83 Train welfare mothers. This kind of study was the basis (plus conservative philosophy) for our present "welfare reform."
Read, to discuss Optional
Review: p. 360, 6.75
Optional
Sec. 6.2  Two-sided test is doable using confidence interval (pp. 337-9)
6.39 IQ tests Use your calculator to get the sample mean


Math151-Sp01/Day33.htm  10:30 am 4/20/01
This page belongs to Sally Sievers, who is solely responsible for its content.