MATH 251, Probability and Statistics I, Fall 2001, Fri. Nov. 30, Day 37 final version

Science Colloquium Today:  Mathematics! Logical paradoxes, speaker from Cornell.
Join us in the Sommer Center for (free) lunch afterward to meet the speaker, ask questions about grad school, math, whatever... You missed a good talk, if you missed it!

Further questions on SPSS? (Paired sample OK?)

Sign test, two-sample test
Will start here Monday
"Equal Variances" assumption, "pooled sample" (p. 550ff.)
Pooled estimator of the common variance
  Rationale:  Give each individual data point equal weight in estimating sigma.  The sigmas are the same but the means are not!
  If the values from sample 1 are x1, x2, ..., xn1, and those from sample 2 are y1, y2, ..., yn2,
our standard deviation-making table would look like this:
value  | value - mean  | (value - mean)^2
x1       x1 - xbar       (x1 - xbar)^2
x2       x2 - xbar       (x2 - xbar)^2
. . .
xn1      xn1 - xbar      (xn1 - xbar)^2
y1       y1 - ybar       (y1 - ybar)^2
y2       y2 - ybar       (y2 - ybar)^2
. . .
yn2      yn2 - ybar      (yn2 - ybar)^2
__________________________________________________________________
Sum the right-hand column to get the numerator.

The degrees of freedom is the total number of points (n1 + n2) minus one for each estimated mean, xbar and ybar, so (n1 + n2 - 2) is the denominator.
    If you already have the separate sample variances, s1^2 and s2^2, you can get the same numerator this way:  Multiply each one by its separate denominator (degrees of freedom) and add:  (n1 - 1)s1^2 + (n2 - 1)s2^2.  (This is the book's formula, p. 550.)
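A small check that the two routes to the numerator agree. This is only a sketch in Python: the data values are made up for illustration (not from the book), and the function names are my own.

```python
from statistics import mean, variance

def pooled_variance_from_data(xs, ys):
    """Sum squared deviations, each sample about its OWN mean, then divide by n1 + n2 - 2."""
    xbar, ybar = mean(xs), mean(ys)
    numerator = sum((x - xbar) ** 2 for x in xs) + sum((y - ybar) ** 2 for y in ys)
    return numerator / (len(xs) + len(ys) - 2)

def pooled_variance_from_summaries(n1, s1_sq, n2, s2_sq):
    """Same estimate starting from the separate sample variances."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Made-up illustrative data (hypothetical, not a textbook data set).
xs = [4.0, 6.0, 8.0]
ys = [1.0, 2.0, 3.0, 6.0]

direct = pooled_variance_from_data(xs, ys)
via_summaries = pooled_variance_from_summaries(
    len(xs), variance(xs), len(ys), variance(ys)
)
print(direct, via_summaries)  # the two values should agree up to rounding
```

Note that statistics.variance uses the n - 1 denominator, which is why multiplying by (n - 1) recovers each sample's sum of squared deviations.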
    This only estimates the common variance sigma^2.  To get the standard error of the difference, you need to do the analogue of dividing the estimate of sigma by sqrt(n).  Here we multiply the estimate of sigma by sqrt(1/n1 + 1/n2).  (Hypotenuse rule again.)
    The nice thing about this approach is that the resulting "pooled two-sample t-statistic" really does have a t distribution (with (n1 + n2 - 2) degrees of freedom).  The not-nice thing is that it's quite hard to know if two variances are equal if you only have small n's.  Until modern computing methods tested out the "unequal variance" methods, it was the only t procedure.
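Putting the pieces together, here is a minimal Python sketch of the pooled two-sample t-statistic. The data are made up for illustration and the function name pooled_t is my own; it returns the statistic, the degrees of freedom, and the standard error, and leaves the p-value to a t table (or SPSS).

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(xs, ys):
    """Pooled two-sample t statistic, assuming equal population variances."""
    n1, n2 = len(xs), len(ys)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance sigma^2.
    sp_sq = ((n1 - 1) * variance(xs) + (n2 - 1) * variance(ys)) / df
    # Standard error of (xbar - ybar): pooled sigma estimate times sqrt(1/n1 + 1/n2).
    se = sqrt(sp_sq) * sqrt(1 / n1 + 1 / n2)
    t = (mean(xs) - mean(ys)) / se
    return t, df, se

# Made-up illustrative data (hypothetical, not a textbook data set).
xs = [4.0, 6.0, 8.0]
ys = [1.0, 2.0, 3.0, 6.0]

t, df, se = pooled_t(xs, ys)
print(f"t = {t:.3f} on {df} d.f., standard error = {se:.3f}")
```

Compare t against the t distribution with df degrees of freedom to get a p-value or a critical value for a confidence interval.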
 
Hand in: 
Sign tests can be done easily by hand.  (Do at least one by hand.)
Try SPSS...
7.43 a, b (turn page!) sign test, rt. threads
7.44 sign test, summer institute.
7.45 sign test??
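To check a by-hand sign test, the following Python sketch may help. It assumes the usual convention for paired data: drop the zero differences, count the plus signs, and compare that count to a Binomial(n, 1/2) distribution. The differences below are made up, and the function name is my own.

```python
from math import comb

def sign_test_p_value(diffs, alternative="two-sided"):
    """Exact sign test on paired differences; zero differences are dropped."""
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    plus = sum(1 for d in nonzero if d > 0)

    def upper_tail(k):
        # P(X >= k) for X ~ Binomial(n, 1/2), the distribution of the plus count under H0.
        return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

    if alternative == "greater":
        return upper_tail(plus)
    if alternative == "less":
        # By symmetry, P(X <= plus) = P(X >= n - plus).
        return upper_tail(n - plus)
    # Two-sided: double the smaller tail, capped at 1.
    return min(1.0, 2 * min(upper_tail(plus), upper_tail(n - plus)))

# Made-up paired differences (hypothetical): 7 plus, 2 minus, 1 zero.
diffs = [1.2, 0.5, -0.3, 2.0, 0.0, 0.8, 1.1, 0.4, -0.1, 0.9]

p_greater = sign_test_p_value(diffs, "greater")
p_two_sided = sign_test_p_value(diffs)
print(p_greater, p_two_sided)
```

By hand, the one-sided calculation here is (C(9,7) + C(9,8) + C(9,9)) / 2^9 = 46/512.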
Read, discuss:

Optional (more practice):
Sec. 7.2 Those that need to be done on the computer are labeled SPSS (two-sample is on the handout you have)

p. 556, 7.48, 49, 50 (SPSS) bread vitamins
7.51, 52 (SPSS)  piano lessons again.  A lurking variable in the previous problems was the passage of six months' time, during which preschool children learn a lot of stuff.
7.53  compare all the piano lesson analyses.  I don't necessarily believe all of the book answer.
7.57 cocaine and birthweight by hand--the unequal sigma method.
7.64 rowing a, b, c (turn page).  Use the unequal sigma method (note the std dev's are quite different).
Read the rest of 7.2



Hand in with Monday Day 38: 
Pooled-sample (equal sigma's).  The pooled-sample computation gets a bigger d.f. and therefore, when the two standard errors are about the same, a shorter CI and a smaller p-value than the unequal variances method on the same data.
7.65 and 7.77 rowing--weight.  (unequal and equal methods compared.)
7.75 social insight.  By hand.  This is Example 7.16, p. 546, not 526.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Some algebra:  General advice on designing experiments is to put equal numbers into each sample if you can.  Here are some hints why.
A)  If n1 = n2, then the expression for the standard error of (xbar1 - xbar2), i.e. the denominator of the t-statistic, is the same for the unequal variances version (p. 541) and for the pooled t (p. 551).  Use algebra to show they are the same (set n = n1 = n2).  [Thus the only difference in computing with the different versions in this case will be the d.f. you use.]

B)  If n1 = n2 = n and s1 = s2 = s, the complicated df formula on p. 549 collapses into
df = 2n - 2 (= n1 + n2 - 2).  Use algebra to show it.  [So at least for equal n's and similar s's, the complicated df formula will not lose you much sharpness compared to the pooled version.]
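A numerical sanity check for A and B (not a substitute for the algebra). This Python sketch assumes that the "complicated df formula" on p. 549 is the usual Welch-Satterthwaite approximation; the sample sizes and standard deviations are made up, and the function names are my own.

```python
from math import sqrt, isclose

def se_unpooled(n1, s1, n2, s2):
    """Standard error of (xbar1 - xbar2), unequal variances version."""
    return sqrt(s1**2 / n1 + s2**2 / n2)

def se_pooled(n1, s1, n2, s2):
    """Standard error of (xbar1 - xbar2), pooled version."""
    sp_sq = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return sqrt(sp_sq) * sqrt(1 / n1 + 1 / n2)

def welch_df(n1, s1, n2, s2):
    """Welch-Satterthwaite degrees of freedom (assumed to match the book's p. 549 formula)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

n = 12  # made-up common sample size

# A) Equal n's, different s's: the two standard errors coincide.
a_ok = isclose(se_unpooled(n, 3.0, n, 5.0), se_pooled(n, 3.0, n, 5.0))

# B) Equal n's and equal s's: the df formula gives 2n - 2.
b_ok = isclose(welch_df(n, 4.0, n, 4.0), 2 * n - 2)

# Equal n's but different s's: the df drops below 2n - 2.
unequal_df = welch_df(n, 3.0, n, 5.0)
print(a_ok, b_ok, round(unequal_df, 2))
```

With these made-up numbers the unequal-variance d.f. comes out around 18, compared with 22 for the pooled version, so the loss of sharpness is modest.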

Read, discuss:

Optional (more practice):

Math251-Fall01/DayP37.htm  3pm  11/30/01
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.