MATH 251, Probability and Statistics I, Fall 2005, Wed. Nov. 9, Day 32 After class

Reading:   Continue 6.2. Reread to p. 406. Now read "statistical significance" (406 to top of 409).  Reread "Tests for a population mean" (409-12), and read the rest (412 on).   Read 6.3. Ahead: a brief glance at 6.4, then 7.1.
Hand in: 

With significance levels  (Sec. 6.2)
6.40, 41 P to alpha
6.48  bednets very significant (Bednets are now recommended and provided as a first line defense)
6.63, 64  .01 and .05
6.61 one-sided z to P to alpha
6.62 two-sided z to P to alpha

6.104 (p. 443) Plot n on the x-axis and z  on the y-axis. Plot n on the x-axis and P-value  on the y-axis.
6.49 "significant"
6.110, 6.111 (p. 445) "significant"

6.67 (one-sided), 6.66 (two-sided) Table D
6.69 P-value from table A, table D (2-sided)
6.68 from alpha to z

6.47 CI <==> sig
6.56 Blood calcium: test, CI, significant but inconsequential
- - - - - - - - - - - - - -
Postpone 6.3 problems: Sec. 6.3, pp. 428ff.
6.74  1000 tests
6.77 12 subjects, P = .052
6.81 managerial trainees
6.87 Bonferroni 
6.88 tornado damage

Read, discuss 
 6.63c sig.

6.58, 59 Applet exploration of xbars and alphas

6.49 "significant"
- - - - - 
6.72, 73 P,  sig.
6.78 P = .95

6.82 n and P: answers are .3821, .1711, .0013.

6.84 P, alpha
Answers are z=1.64 (P=.0505) and z=1.65 (P=.0495)
 
 
 
 

Optional 

6.46 CI<==>sig
 
 

If you didn't: Add your shoebox results to each of the 2 sheets circulating.  n = 4, sigma = 4, sigma_xbar = sigma/sqrt(n) = 2
    the 4 values || xbar|| z (assuming mean is 20)|| P-value= P(Z > z) || Is P-value < .10?
Add a dot for each of your xbars to the dotplot transparency circulating.
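
To check one row of the shoebox sheet by machine, here is a minimal sketch. It assumes the setup stated above (n = 4, sigma = 4, H0 mean 20, P-value = P(Z > z)); the function name shoebox_row and the four sample values are made up for illustration and are not class data.

```python
from statistics import NormalDist, mean

SIGMA = 4.0   # population standard deviation given in the notes
MU0 = 20.0    # mean assumed under H0 in the notes

def shoebox_row(values, mu0=MU0, sigma=SIGMA):
    """Return (xbar, z, p_value) for one sample, with P-value = P(Z > z)."""
    xbar = mean(values)
    sigma_xbar = sigma / len(values) ** 0.5   # 4 / sqrt(4) = 2 when n = 4
    z = (xbar - mu0) / sigma_xbar
    p_value = 1.0 - NormalDist().cdf(z)       # right-tail area
    return xbar, z, p_value

# Hypothetical draw of 4 values (not actual class results)
xbar, z, p = shoebox_row([22, 25, 19, 24])
print(f"xbar = {xbar}, z = {z:.2f}, P-value = {p:.4f}, P < .10? {p < 0.10}")
```

For these made-up values xbar = 22.5 and z = 1.25, so the P-value is a little above .10 and the last column would be "no."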

Questions on Testing HW? Day 31

Quiz returned.  Almost everyone did almost everything correctly, EXCEPT: the standard deviation of X - Y! Which was on the last quiz too!  Once more with feeling:  If X and Y are independent, sigma^2_(X-Y) = sigma^2_X + sigma^2_Y: the variances add, even for a difference. (Day 23, bottom; IPS p. 302.  Example: Ann and Betty, Day 28.)
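
A small sketch of the rule for independent X and Y: add the variances, then take the square root. The function name and the two standard deviations (3 and 4) are hypothetical, chosen only so the arithmetic is easy to check.

```python
from math import sqrt

def sd_of_difference(sigma_x, sigma_y):
    """SD of X - Y for INDEPENDENT X and Y: variances add, then take the root."""
    return sqrt(sigma_x ** 2 + sigma_y ** 2)

# Hypothetical standard deviations
print(sd_of_difference(3.0, 4.0))   # 5.0, which is neither 4 - 3 nor 4 + 3
```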

Significance tests use an elaborate vocabulary, but the basic idea is simple: a result that would "rarely" happen if a claim were true--is good evidence that the claim is NOT true.   Notes Day 30
Look at the results from the shoeboxes:   Notice  that there is always a chance of getting a "somewhat unusual" result from a population where the null hypothesis is true.  And if the actual mean is not extremely different from the null, a result may not be detectably different from the null-hypothesis results.

A "Significance level" alpha is a probability level we decide on  in advance as being the "rarely" amount that will push us over into believing (well, sort of) that the H0 claim  is not true.
Simple benchmarks: .10 (1 in 10), .05 (1 in 20), .01 (1 in 100).
When the P-value is less  than (or equal to) a particular significance level alpha (say .05), we say,
    "The results are significant at the alpha = .05 level," or "The results are significant (P < .05)" , or "Reject the null hypothesis at level alpha = .05"  Day 31
What if you don't have the Z-table but only have the t-table (Table D)?
What if you have a demanded level of significance, alpha?
    Table D: a limited list of probabilities  across the top row:
            = Right tail values for the bell curve distribution.
        The value in the bottom (z*) row under p is the corresponding standard normal value.
        "z* is the upper p critical value of the standard normal distribution."
  Do this: Find your z from the data. Make a sketch of the normal curve and mark z on it.  Mark the direction(s) of Ha.
    (If your z is in the direction of Ha , continue.  Otherwise the results are hopelessly not significant: you can quit.)
Find the two z*'s in Table D that bracket your z (ignore minus sign).  Find the corresponding p's.
    e.g. z = 2.111
        p      .02        .01
        z*   2.054   \/  2.326
                  z = 2.111
    So the P-value for your z is: between those 2 p's (one-sided test)
                                  between double those 2 p's (two-sided test)
(Some versions of the table add another top line, for two-sided tests: Double the one-sided values)
    Test is significant at the bigger bracketing probability; not sig. at the smaller.
One-sided: P-value is less than .02 and greater than .01
       Significant at the .02 level, not at the .01 level
Two-sided: P-value is less than .04 and greater than .02
       Significant at the .04 level, not at the .02 level
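
The bracketing steps above can be sketched in a few lines. This is only an illustration: TABLE_D holds just the z* entries quoted in these notes (not the full Table D), the name bracket_p is made up, and the sketch assumes your z is already in the direction of Ha.

```python
# (p, z*) pairs from the bottom row of Table D, only those quoted in the notes.
TABLE_D = [(0.10, 1.282), (0.05, 1.645), (0.02, 2.054),
           (0.01, 2.326), (0.005, 2.576), (0.001, 3.091)]

def bracket_p(z, two_sided=False):
    """Return (lower, upper) bounds on the P-value using only table entries.

    None marks a bound that falls off the edge of this short table.
    """
    z = abs(z)                        # ignore the minus sign
    factor = 2 if two_sided else 1    # two-sided: double the tail probabilities
    lower = upper = None
    for p, z_star in TABLE_D:         # z* values are in increasing order
        if z_star <= z:
            upper = p * factor        # last z* at or below z
        elif lower is None:
            lower = p * factor        # first z* above z
    return lower, upper

print(bracket_p(2.111))                  # (0.01, 0.02)
print(bracket_p(2.111, two_sided=True))  # (0.02, 0.04)
```

The test is significant at the upper bound and not significant at the lower bound, matching the worked example.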
If you have a specific demanded significance level, compare it with these levels.
            If  a test is significant at level b, then it is significant at every level bigger than b.
            If a test is Not significant at level d, then it is Not significant at every level smaller than d.
    "Significant at a":  the probability of getting results at least as extreme as mine by chance (if H0 is true) is less than (or =) a.
Results:     Significant at  |  Not significant at
p  (bigger)   .10     .05    |   .01     .005    .001   (smaller)
                             /\
                          P-value
                     z-value (one-sided)
z* (smaller) 1.282   1.645   |  2.326   2.576   3.091   (bigger)
  You can compare z directly to z* for your desired alpha. The 2-sided is a bit tricky.
          (2-sided: Split the alpha in 2, then find the z*.  Don't halve or double z's--it doesn't work!)
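
If software is handy, the z* for any alpha can be computed rather than looked up. A minimal sketch, assuming Python's standard-library NormalDist; the names z_star and reject are made up, and the one-sided case is assumed to be Ha: mu > mu0 (right tail).

```python
from statistics import NormalDist

def z_star(alpha, two_sided=False):
    """Critical value: for a two-sided test, split alpha in 2 FIRST."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail)

def reject(z, alpha, two_sided=False):
    """Compare z directly to z*. One-sided assumes Ha points to the right."""
    if two_sided:
        return abs(z) >= z_star(alpha, two_sided=True)
    return z >= z_star(alpha)

print(round(z_star(0.05), 3))                  # 1.645
print(round(z_star(0.05, two_sided=True), 3))  # 1.96
print(reject(2.111, 0.01), reject(2.111, 0.05, two_sided=True))
```

Note that the two-sided z* for alpha = .05 is 1.960, the one-sided z* for .025. It is not half or double the one-sided 1.645.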

CI's and Two-sided tests (pp. 413-14):
        Your 95% CI doesn't include µ0  <==>  Reject H0: µ = µ0 at the alpha = .05 level (two-sided test).  (Seems like common sense.)
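
A quick numerical check of that equivalence, as a sketch only. It assumes a z procedure with known sigma; the function name ci_and_test and the numbers (xbar = 22.5, sigma = 4, n = 4) are hypothetical.

```python
from statistics import NormalDist

def ci_and_test(xbar, sigma, n, mu0, alpha=0.05):
    """Return the (1 - alpha) z confidence interval and the two-sided P-value."""
    se = sigma / n ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (xbar - z_crit * se, xbar + z_crit * se)
    z = (xbar - mu0) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return ci, p_two_sided

# Hypothetical numbers: same data, two different null values
for mu0 in (20, 18):
    ci, p = ci_and_test(xbar=22.5, sigma=4, n=4, mu0=mu0)
    inside = ci[0] < mu0 < ci[1]
    print(f"mu0 = {mu0}: CI = ({ci[0]:.2f}, {ci[1]:.2f}), "
          f"mu0 inside CI? {inside}, P = {p:.4f}, reject at .05? {p <= 0.05}")
```

With these numbers, 20 is inside the interval and is not rejected, while 18 is outside and is rejected.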

Start here Friday Sec. 6.3
>>Don't do inference on data that doesn't look like probability-model data (All that bias, design flaws stuff was for this!) and check the data for weirdness (Ch. 1)
>>(Not in text any more?) How small a P is "convincing evidence" against H0?
     In practice...beyond the formal testing.
        How plausible is H0?  Ha?  Strong evidence needed to reject "conventional wisdom."
        How expensive (mentally, economically) will abandoning H0 be?
      (May need more than one set of data; replicate, recast, refine.)
>> In reality, no sharp border between "significance" and "not significant"

>>"Statistically Significant" doesn't always mean "Important." (e.g. medicine: "Clinically significant.") Big enough sample sizes will allow you to distinguish even small differences.
>> Lack of significance--doesn't prove H0 true.  Best: "data are consistent with (not inconsistent with) H0 "

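To see how a big enough sample makes even a small difference "significant," here is a sketch that holds the observed difference fixed and lets n grow. The difference (0.1), sigma (2), and the list of n's are hypothetical, and the name one_sided_p is made up.

```python
from statistics import NormalDist

def one_sided_p(diff, sigma, n):
    """z and right-tail P-value for an observed difference xbar - mu0 = diff."""
    z = diff / (sigma / n ** 0.5)
    return z, 1 - NormalDist().cdf(z)

# Same small (hypothetical) difference, larger and larger samples
for n in (25, 100, 400, 1600, 10000):
    z, p = one_sided_p(diff=0.1, sigma=2.0, n=n)
    print(f"n = {n:5d}   z = {z:5.2f}   P-value = {p:.4f}")
```

The difference never changes, but z grows like sqrt(n) and the P-value shrinks. Whether a difference of that size matters is a separate question.
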
>>You cannot legitimately test a hypothesis on the same data that first suggested that hypothesis. Every data set will turn up with some unusual pattern if you examine it hard enough.
       (If you must explore and confirm with the same data set, one way is to (randomly) take half the data set, explore and generate hypotheses; then use the other half for confirmatory tests.  You can use P-value to describe unusualness, but be wary of making decisions with it if you didn't expect that particular unusualness.)
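
The random half-split can be done in a few lines. A minimal sketch: the name split_half and the seed argument are assumptions for illustration, and the data here are just the numbers 0 to 9.

```python
import random

def split_half(data, seed=None):
    """Randomly split data into an exploration half and a confirmation half."""
    rng = random.Random(seed)   # seed only so the split can be reproduced
    shuffled = list(data)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

# Hypothetical data set
explore, confirm = split_half(range(10), seed=1)
print("explore:", explore)
print("confirm:", confirm)
```

Generate hypotheses from the first half only, then run the tests on the second half, which played no part in suggesting them.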

>>Multiple Tests: beware!
    If you do 100 tests and use the alpha = .05 significance level for each, then the structure of testing requires this:
    When all 100 null hypotheses H0 are true, about 5 of the 100 tests (a proportion of .05) will still give "significant" results by chance alone (falsely indicating that the alternative hypothesis is to be preferred).
    Moral: if you use the testing mechanism as a screening instrument for many questions, a proportion will give falsely significant results.  You can't accept the results from such multiple tests as good evidence, only as indicating questions requiring further, more specific study. The game gives you one shot, not a hundred shots.
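
The arithmetic behind the warning, as a sketch. The expected count needs no extra assumption; the chance of at least one false "significant" result assumes the 100 tests are independent, which is an added assumption not stated above. The per-test level alpha/m is the Bonferroni idea of problem 6.87. Function names are made up.

```python
def false_positive_summary(m, alpha):
    """Expected false positives, and P(at least one), when all m H0's are true."""
    expected = m * alpha
    p_at_least_one = 1 - (1 - alpha) ** m   # ASSUMES the m tests are independent
    return expected, p_at_least_one

def bonferroni_level(m, alpha):
    """Per-test level that keeps the overall chance of a false positive <= alpha."""
    return alpha / m

expected, p_any = false_positive_summary(100, 0.05)
print(f"expected false positives: {expected:.1f}")
print(f"P(at least one false positive): {p_any:.3f}")
print(f"Bonferroni per-test level: {bonferroni_level(100, 0.05):.4f}")
```

With 100 independent tests at alpha = .05, at least one false alarm is nearly certain (probability about .994).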
- - - - - - - - - - -
Statistical inference in a nutshell:
Am I surprised (if H0 is true)? (Do I reject the null?)
    How surprised? (give P-value)
  What would not surprise me?  (confidence interval--estimate the actual value)
(IPS:  Testing is over-used, Confidence interval estimation under-used)

Next: Brief look at issues of 6.4; then Ch. 7


Math251-Fall05/Dayps32.htm  2pm   11/09/05
This page belongs to Sally Sievers who is solely responsible for its content.