MATH 251, Probability and Statistics I, Fall 2005, Mon. Nov. 14, Day 34.  After class: correction, answers

Reading:  Finish 6.3. Ahead:  Skim 6.4 for the words and concepts.  Start 7.1.  Work on exam!
Hand in: 

6.74  1000 tests
6.87 Bonferroni 
Note on Bonferroni procedure, #6.86, 6.87:  If you run multiple tests as a fishing expedition, an individual item that comes up significant at (e.g.) .05 could have happened by chance, or could indicate a "real" difference from its null hypothesis.  Ordinarily you would then run a new study, with new data, to check on those items.  If that is not possible and you need conclusions from this set of data, you can use the Bonferroni procedure.  It is analogous to dividing alpha (here .05) between the two tails for a two-sided test: here we divide alpha evenly among all the separate tests.  Anything that still comes up significant when it is that far out can legitimately be deemed significant at the .05 level overall.  What do we lose?  Power!  By demanding that a result be farther out in the tail, we lower the chance of confirming any real difference from the null hypothesis.
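A minimal sketch of the Bonferroni bookkeeping described in the note, in Python.  The five P-values are hypothetical, made up only to show the mechanics; the function name is mine, not the textbook's.

```python
# Bonferroni sketch: share an overall alpha evenly among k separate tests.
# The P-values below are hypothetical, invented purely for illustration.

def bonferroni_significant(p_values, overall_alpha=0.05):
    """Return (per_test_alpha, flags).

    A test is flagged only if its P-value is at or below overall_alpha / k,
    where k is the number of tests.
    """
    k = len(p_values)
    per_test_alpha = overall_alpha / k
    flags = [p <= per_test_alpha for p in p_values]
    return per_test_alpha, flags

p_values = [0.003, 0.020, 0.049, 0.300, 0.008]  # hypothetical results of 5 tests
per_test_alpha, flags = bonferroni_significant(p_values)

print(f"each test must reach alpha = {per_test_alpha:.3f}")
for p, flag in zip(p_values, flags):
    alone = "yes" if p <= 0.05 else "no"
    overall = "yes" if flag else "no"
    print(f"P = {p:.3f}: significant at .05 alone? {alone}; after Bonferroni? {overall}")
```

Items that pass at .05 on their own but fail after the correction are exactly the place where power is lost.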
- - - - - - - - - - - - - 
Sec. 6.4, p. 441
6.99 diagnostics and error types
A.  Use the applet "Power of a test" to get a feel for these issues; see below for details.
- - - - - - - - - - 
Sec. 7.1, practice with t-table.
B.  Assume T has the t(21) distribution. Use table D, df = 21. 
a) Find t* such that P(T > t*) = .01. 
b) Find t* such that P(|T| > t*) = .01 (T further out than t*, symmetrically on both sides of 0) 
c) Find t* such that P(T < t*) = .99. 
d) Find  t* such that P(-t* < T < t*) = .99 (Hint: use the Confidence Level row) 
e) t = 2.132.  Find bracketing probabilities, so that _____ < P(T > 2.132) < _____.
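If you want to check your table D work for problem B without statistical software, here is a rough sketch that integrates the t density numerically (Simpson's rule).  The function names are mine, and this is only a check on table lookups, not a replacement for them.

```python
import math

def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: Simpson's rule on [0, t], then use symmetry about 0."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t

df = 21
for t_value in (2.518, 2.831, 2.132):
    p = upper_tail(t_value, df)
    print(f"t = {t_value}: P(T > t) = {p:.4f}, P(|T| > t) = {2 * p:.4f}")
```

The one-tail column answers parts a), c) and e); the two-tail column answers parts b) and d).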

Sec. 7.1, pp.470ff
7.19 t* vs. n
7.22, 7.21 all but "software" parts  Critical value for test.
7.16, 7.17, 7.23  two-sided P to one-sided.
7.13 test for n = 5, 10 (same t)  This is a good point they make.  But if the data came from the same population, t would probably also get bigger with the larger n, because the SE depends on the sample size!
  answers to B: a) 2.518 b) 2.831 c) 2.518 d) 2.831 e) .02, .025
Postpone 7.8, 7.12.
7.8 ADD (CI by hand)
7.12 sales (test by hand)

Read, discuss:

Optional (more practice):
7.26 Purdue

A.  Use the applet "Power of a test" to get a feel for these issues:
Take the shoeboxes situation:  H0 : µ = 20; sigma = 4.  Ha : µ > 20
1) (Change µ) Start with n = 4 (our situation).  Set alpha = .10.  Watch the lower distribution move, and find the power of the test for µ = 21, 22, 23, 24, 25, 26, 27.  (Hit Update if needed between different values.)  (Hint: Power = .761 for µ = 24.)
Graph your results (by hand is fine), with µ on the horizontal axis and power on the vertical, and connect the dots smoothly.  You've constructed a "Power curve".

2) Assume µ = 24, really!  (Same setup as before, n = 4, sigma = 4.)  If alpha = .10, Power = .761.  Change alpha up and down (smaller and larger than .10).  Notice that smaller alpha means smaller power to detect the alternative, and vice versa: we're just moving the cutoff line in both pictures.  (Nothing to write down.)

3) Assume µ = 24, really!  (Same setup as before.)  Set alpha = .10.  If n = 4, Power = .761, about a 3/4 chance of detecting that the mean is really > 20.  Suitable for my class example, where I wanted some people to NOT detect it.  Not so good for real work.
Find the power for n = 8, 12, 16, 20.  Graph n (4 to 20) on the horizontal axis and power on the vertical axis; connect the dots smoothly.  From your graph, what sample size should I use if I want 95% power (approximately)?
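For checking the applet readings in parts 1) to 3), here is a sketch of the same power calculation in Python, assuming the one-sided z test with sigma known (H0: µ = 20, Ha: µ > 20, sigma = 4).  The function name and defaults are mine.  Results may differ from the applet in the third decimal because of rounding.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power(mu_true, n, alpha=0.10, mu0=20.0, sigma=4.0):
    """Power of the one-sided z test of H0: mu = mu0 vs Ha: mu > mu0."""
    se = sigma / sqrt(n)                       # standard deviation of xbar
    cutoff = mu0 + Z.inv_cdf(1 - alpha) * se   # reject H0 when xbar > cutoff
    return 1 - Z.cdf((cutoff - mu_true) / se)  # P(xbar > cutoff) when mu = mu_true

# Part 1: power curve in mu, with n = 4 and alpha = .10
for mu in range(21, 28):
    print(f"mu = {mu}: power = {power(mu, n=4):.3f}")

# Part 2: smaller alpha means smaller power (mu = 24, n = 4)
for alpha in (0.01, 0.05, 0.10, 0.20):
    print(f"alpha = {alpha:.2f}: power = {power(24, n=4, alpha=alpha):.3f}")

# Part 3: power as n grows, with mu = 24 and alpha = .10
for n in (4, 8, 12, 16, 20):
    print(f"n = {n:2d}: power = {power(24, n):.3f}")
```

Plot the printed values by hand as the assignment asks; the code only supplies the dots.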
- - - - -  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Exam 2: Open book takehome.  Due by 1 pm Monday  Nov. 21 (Day 37)  Covers Chapter 3 through 6.3.

Quiz Wednesday!  You will be asked to write down these definitions (memorize them).  I want the general definitions, not just formulas for the tests of µ.
>>"A Level C confidence interval for a parameter is...._ __ __ __ __ __ __"
>>P-value for a test.
>>"A test is significant at level alpha = .05 if the P-value__________________"

Homework questions, 6.3? Day 33
Sec. 6.3, multiple tests:  Notes Day 32
Sec. 6.4, decisions, type 1 and 2 error, power.  7.1, intro to t.   Notes Day 33
7.1, continued
Inference for means, using xbar from an SRS to make inference about µ:

Population is normal:
   Large n, sigma known:     Xbar is normal; find z using sigma.
   Large n, sigma unknown:   Xbar is normal; find z (or t) using s.
   Small n, sigma known:     Xbar is normal; find z using sigma.
   Small n, sigma unknown:   Xbar is normal; find t using s.

Population is not normal:
   Large n, sigma known:     Xbar is normal-ish (CLTh); find z using sigma.
   Large n, sigma unknown:   Xbar is normal-ish (CLTh); find z using s.
   Small n, sigma known:     Unrealistic. sigma's only "good" for normal pop's.
   Small n, sigma unknown:   (See p. 463, 65ff)  If you can't use t, find a statistician?

t-distribution family:  like standard normal only slightly fatter in the tails.  Mean = 0. Symmetrical around 0.
    "Degrees of freedom" tell which member of the t family.
      t(k) is the t distribution with k degrees of freedom.   Table D
 Comparison with normal (Excel file)     As n gets large, t approaches normal

Start working on green box:
Assume Normal population .  Mean µ, s.d. sigma, both unknown.
Take SRS, size n, find xbar, find s (sample standard dev.)

Standard error of the (sample) mean =    Standard deviation of xbar, estimated from the data.
  "Standard error of the mean":  s/sqrt(n) SEM, SEXbar, etc.
        Just like sigma/sqrt(n), only s from data replaces sigma.
  When you estimate the standard deviation of a statistic,
                the resulting estimate is called the "standard error" of the statistic.

Standardizing xbar with s instead of sigma results in
  the one-sample t statistic
        t = (xbar - µ) / (s/sqrt(n))
which has the t-distribution with n-1 degrees of freedom.

We'll now repeat all the stuff from Chapter 6, only wherever there was a z, we'll substitute a t.
Here we go....
"One-sample" t- procedures: SRS of size n.  Use Xbar to estimate µ.
Substitute s for sigma in the standardizing formula. We get t instead of z, with n-1 degrees of freedom.
        It's a good idea to check for at least approximate normality in the data set.

Confidence intervals: 
   Choose t* from table D, using the n-1 row, and confidence level C.
    Special case of common pattern:    estimate ± t* x SE(estimate)

Significance tests:  State hypotheses as in Ch. 6, then find t from the data by
 calculating the one-sample t-statistic, using the null hypothesis value of µ (call it µ0).
Then proceed as if it were a "z", only using the (n-1) d.f. row in table D:
find the two t*'s that t falls between, read off their tail probabilities, and write "P-value is between ___ and ___".
(Or use software, which will find the P-value exactly.)
Start here Wed.
Example: bacteria per milliliter in 10 specimens of  raw milk from one producer.
  Parameter: actual mean bacteria/ml.
       5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870
4|5 
4|77
4|889 
5|11 
5|23 
 n = 10,   xbar = 4950,
s = 268.45   SEM = 268.45/sqrt(10) = 268.45/3.162 = 84.89.  deg. of freedom = 9
90% CI:  from t(9) in table, t* = 1.833   CI is 4950 ± 1.833 x 268.45/sqrt(10)
                                                       = 4950 ± 1.833 x 84.89, or  4950 ± 155.6 bacteria/ml.
If we had KNOWN Population sigma = 268.45, 
  we'd have used z* = 1.645, gotten a narrower CI.   (but we don't know sigma!)
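
The arithmetic for this confidence interval can be reproduced with a few lines of Python (standard library only).  The critical values t* = 1.833 and z* = 1.645 are typed in from the tables, as in the notes; the variable names are mine.

```python
from math import sqrt
from statistics import mean, stdev

# Bacteria per milliliter in the 10 raw-milk specimens from the notes
counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]

n = len(counts)
xbar = mean(counts)
s = stdev(counts)      # sample standard deviation (divides by n - 1)
sem = s / sqrt(n)      # standard error of the mean

t_star = 1.833         # table D, df = n - 1 = 9, 90% confidence
z_star = 1.645         # what we would use if sigma were known

margin_t = t_star * sem
margin_z = z_star * sem

print(f"n = {n}, xbar = {xbar:.1f}, s = {s:.2f}, SEM = {sem:.2f}")
print(f"90% t interval: {xbar:.1f} +/- {margin_t:.1f}")
print(f"90% z interval (if sigma were known): {xbar:.1f} +/- {margin_z:.1f}")
```

The z margin comes out smaller than the t margin, which is the "narrower CI" remark above.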

Test:  H0 : µ = 4800                          t = (4950 - 4800)/SEM = 150/84.89 = 1.767
          Ha : µ > 4800                           t is between 1.383 and 1.833   (d.f. = 9)
             (too contaminated)                P is between .10 and .05.  Some evidence for Ha
(If the test had been 2-sided, P would be between .20 and .10)
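
Here is a sketch of the "P-value is between ___ and ___" step for this test, in Python.  It computes t from the raw data and brackets it using part of the df = 9 row of a t table.  The row is typed in by hand and the helper name is mine, so treat it as a check on the table lookup, not as official software output.

```python
from math import sqrt
from statistics import mean, stdev

counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]
mu0 = 4800  # null hypothesis value of the mean

n = len(counts)
t = (mean(counts) - mu0) / (stdev(counts) / sqrt(n))

# Part of the df = 9 row of a t table: (upper-tail probability, t*),
# sorted by increasing t*.
T9_ROW = [(0.25, 0.703), (0.10, 1.383), (0.05, 1.833),
          (0.025, 2.262), (0.01, 2.821), (0.005, 3.250)]

def bracket_p(t_value, row):
    """Return (low, high) with low < P(T > t_value) < high.

    None on either side means t_value is off that end of the row.
    """
    low, high = None, None
    for p, t_star in row:
        if t_star <= t_value:
            high = p      # t_value is at least this far out
        else:
            low = p       # first t* beyond t_value
            break
    return low, high

low, high = bracket_p(t, T9_ROW)
print(f"t = {t:.3f} with {n - 1} d.f.")
print(f"one-sided P-value is between {low} and {high}")
print(f"two-sided P-value would be between {2 * low} and {2 * high}")
```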

Raw data? SPSS--next time (handout)


Sievers home  Math251-Fall05/Dayps34.htm  11am   11/14/05
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.