Math 151 , Sp2006, Day 40 Monday May 8 Hit reload After class, corrected

Day 40  Reading: (re)read Ch. 23; read Ch. 25 (for Matched pairs); read D&V Ch. 24 (two independent samples): first 2 pages, then through p. 462, then pp. 465-9.  You will not need to compute a two-sample t procedure by hand, but you will need to know how to identify the situation, (possibly) use SPSS, and understand the results. This is the end of the course material.
Hand in Wednesday
p. 449 #11 a,b,c (Not SPSS) Normal temperature II
On a separate page:
Review exercise, part 1:
 
For the situations in  problems  17 and 19, pp. 491-2,  we have in each an experiment with two treatments, wet and dry pavement.
 a) Look at the situations.   Which is Matched Pair design?  The Other design is most like what we called Completely Randomized design, although it doesn't say if any randomization took place. Make a Diagram of the Other design.
b)  For the Other data,  do back to back stemplots of the Dry and Wet stopping distances.  Find the 5-number summaries for each, and  make side-by-side boxplots of the  two.  We don't know how to do the statistics for this, but based on your boxplots, do you believe that (if we knew how) we would find a significant difference between mean stopping time  on wet pavement and on dry?  Explain.
c)  For the Matched Pair data, calculate 10 new data values, for each car the Wet distance minus the Dry distance.  These are the "differences".  Make a stemplot of these and find the 5-number summary.  On the basis of this, do you believe that there is a significant difference between mean stopping time  on wet pavement and on dry?  Explain.
d) The text seems to indicate that all the data in the design of #17 may have come from the same set of tires (it's all on the same car).  If so, it would be important that the sequence of trials not be confounded with whether the pavement is wet or dry (maybe these sudden stops wear down the tires very fast!)  Explain how you could organize the trials (perhaps by randomizing, perhaps just by planning) to keep possible wear on the tires from being confounded with the wet/dry treatments.
e) In #19, there are 10 cars, so probably 10 sets of tires.   Compare the Dry stopping times for the data of #19 with the Dry stopping times for the data of #17, using back to back stemplots.  Do they seem different, in either middle or spread?
f) Comment on the shapes of all the distributions you stemplotted:  symmetry, near-normality, unimodality, outliers?
More on these later.

Postpone all the rest:
Using SPSS: (Handout, one sample t)

A. Use SPSS. Redo the example on the handout (Cola sweetness loss). Type in the 10 data values.
B. Use SPSS. Redo the example on Day 38 page (Milk bacteria). Data is on the lab computers, in  Math151 D&V\SPSS for Class 05\MilkBacteria_t.sav.  Or you can copy and paste the data from Datasets page.
C.  Use SPSS. Redo the computations from p. 448, #9 (Normal body temperature). The data is not where it's supposed to be.  You can copy and paste the data from Datasets page.
p. 451 #27 Chips Ahoy Use SPSS. You can copy and paste the data from Datasets page.  For c,  Do the test with SPSS, and get the P-value.

Paired samples:
p. 491 #13 a-d (e optional) Sleep (by hand)  Use Table T to get a benchmark significance level, instead of a P-value. (Optional: find the P-value using Activstats, with either the table tool (23-1, activity 3) or the Density tool (Ch. 23; the button on the menu bar that looks like a normal distribution does the t distribution).)
D. Redo the Mileage example on the SPSS handout (back side).  The data is at   SPSS file, or in columns in Datasets page. 
p. 489 # 7 City Temperatures Use SPSS.  Data is on the lab computers, in  Math151 D&V\spss data files  D&V\dv01_25_07.sav
p. 493 # 22 Uninsured Use SPSS.  Data is on the lab computers, in  Math151 D&V\spss data files  D&V\uninsure.sav
p. 491 #12 Summer school Use SPSS. Type in the data.  Choose the columns so the Paired Data procedure subtracts the way you want it to.

The rest, we will get to lightly if at all.

Two independent samples 
E.  Repeat the analysis on the SPSS handout for the Polyester in landfill data.
p. 471, #1, 3 CPMP
6 a,b only Pulse rates
7 Cereal (SPSS: Data is in labs at Math151 D&V\spss data files D&V\dv01_24_07.sav )
8 Egyptians (SPSS: Data is in labs at Math151 D&V\spss data files D&V\dv01_24_08.sav, BUT it's in the wrong form!  It's in 2 columns as if it were Paired but it's not paired data.  You can highlight the 30 items in one column, copy and paste to the bottom of the other.  Then make a grouping variable to distinguish the two groups.)
17 Job satisfaction (What should you do? (Don't do it...))
12 a,b Memory
11 Hurricanes. Do a back-to-back tally of the two sets. Don't do the test, just think about appropriateness.  The answer in the back was a little misled by the inappropriate boxplot into thinking there are outliers, tho there aren't really, there's just "granularity" (small whole numbers here). 

Read, to discuss
Optional
Sample size: (by hand) pp. 441-2 
p. 449 #11d  Normal temperatures II
D.  What would be a good sample size if you want a 95% CI with  a ME no more than 1, and you think the standard deviation in the population is about 1?   Assume that sampling is very expensive, so you really want  the smallest n that will do the job.
Exam 3 returned.  Comments
Final exam:
   Friday May 19, 9 am to 12 noon.  Wells Exam schedule. Contact me ASAP if you have a problem with this time.

The "in-class" Final will be closed book, but bring one sheet with your notes, anything you like!  And a calculator!
Fay has promised lots of help times during study & exam week.  I'm on jury duty but will try to be available some time.
   Draft: Length 1 1/2 to 2 times the length of the midterm exams; comprehensive but with special attention to the material covered since Exam 3. Reading but not creating SPSS output.
What is the significance to Statistics of the Guinness Stout Bottle ?
Homework questions? Day 39
Add your 80% CI for the shoebox to the circulating  yellow pad.
One-sample t procedures,
  Conditions of "near-normality", random-sample-like, Day 39
  Start here Wed.
SPSS
Handout One sample and Matched Pairs,

  And Matched Pairs designs Day 39

Last thing: lightly if at all
Chapter 24, Comparing two means: "Two-sample tests".  Two random samples, independent of each other, from distinct populations. (Populations are normally distributed.)  p. 454-5
Often--comparing means from an experiment with two treatments (usually control and "treatment").
                /--- Group 1, n1---- Treatment 1---\
              /                                    \
 Random asst.                                       Compare results --"means"
              \                                    /
               \--- Group 2, n2---- Treatment 2---/
To examine  the difference of the  two means, µ1 - µ2:
We need fairly normal populations; no extreme outliers.  Back to back stemplots are good; boxplots will do.
(Above 40, Central Limit Th. helps:  15 to 40, a little skewness ok.  p. 455)
We use the difference of the two y-bars,  diff = ybar1 - ybar2 .
We need the Standard Error of the difference  ybar1 - ybar2 , and then we can proceed as before, more or less.
The Standard Error is calculated like the hypotenuse of a right triangle (Pythagorean Theorem),  from the individual standard errors.
 SE(diff) = SE( ybar1 - ybar2 ) = sqrt[ SE(ybar1)^2 + SE(ybar2)^2 ]
P. 453 has another way of writing the same thing, using SE(ybar) = s / sqrt(n) for each sample:
 SE( ybar1 - ybar2 ) = sqrt[ s1^2/n1 + s2^2/n2 ]
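A minimal Python sketch (not part of the SPSS handout) that checks the two ways of writing the standard error against each other. The data values and the function names are made up for illustration only; they are not from the text.

```python
import math
import statistics

def se_mean(sample):
    """Standard error of one sample mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def se_diff(sample1, sample2):
    """SE(ybar1 - ybar2), combined like a hypotenuse from the two individual SEs."""
    return math.sqrt(se_mean(sample1) ** 2 + se_mean(sample2) ** 2)

def se_diff_from_sds(s1, n1, s2, n2):
    """The same quantity written as sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Made-up illustration data (hypothetical, not from the textbook).
group1 = [12.0, 15.0, 11.0, 14.0, 13.0]
group2 = [9.0, 10.0, 12.0, 8.0, 11.0, 10.0]

a = se_diff(group1, group2)
b = se_diff_from_sds(statistics.stdev(group1), len(group1),
                     statistics.stdev(group2), len(group2))
print(round(a, 4), round(b, 4))  # the two forms should print the same value
```

Note that the standard errors are squared before adding; adding the two SEs directly would overstate the SE of the difference.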

This almost fits the  t-model. Degrees of freedom are weird. (p. 454)

(For doing by hand, if you must: df = smaller of (n1- 1) and (n2- 1).)
Will give a "conservative" result--slightly wider C.I., slightly less significance, than a "sharper" value.  If your results hinge on the difference between this result and the computer result, they're too close for comfort anyway.

From a computer:  df = complicated formula on p. 494 bottom.  Produces non-integer degrees of freedom.  Very good approximation to the exact distribution, if both sample sizes are at least 5.   Always between "smaller of (n1- 1) and (n2- 1)" and [(n1- 1) + (n2- 1)].   Unsuitable for doing by hand.
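For the curious, here is a hedged Python sketch of the two df choices. It assumes the "complicated formula" is the usual Welch-Satterthwaite approximation (the one behind the "Equal variances not assumed" line in software); check the text for the exact form it prints. The data and function names are made up for illustration.

```python
import statistics

def welch_df(sample1, sample2):
    """Approximate (Welch-Satterthwaite) degrees of freedom; usually not an integer."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1  # SE(ybar1)^2
    v2 = statistics.variance(sample2) / n2  # SE(ybar2)^2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

def conservative_df(sample1, sample2):
    """By-hand rule: the smaller of (n1 - 1) and (n2 - 1)."""
    return min(len(sample1), len(sample2)) - 1

# Made-up illustration data (hypothetical, not from the textbook).
group1 = [12.0, 15.0, 11.0, 14.0, 13.0]
group2 = [9.0, 10.0, 12.0, 8.0, 11.0, 10.0]

lo = conservative_df(group1, group2)           # smaller of (n1 - 1), (n2 - 1)
hi = (len(group1) - 1) + (len(group2) - 1)     # (n1 - 1) + (n2 - 1)
df = welch_df(group1, group2)
print(lo, round(df, 2), hi)  # the computer's df falls between the two bounds
```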

Once we have (ybar1 - ybar2) , SE(diff) ,  and the df, our formulas pattern on the earlier ones. Optional Example by hand
CI :  estimate ± t* · SE(estimate)
    CI for µ1 - µ2, difference of means,  is  (ybar1 - ybar2) ± t* · SE(ybar1 - ybar2)
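A small Python sketch of the interval for µ1 - µ2, following the pattern estimate ± t* × SE(estimate). It assumes you look up t* yourself (for example in Table T) and pass it in; the data, the function name two_sample_ci, and the choice of the conservative df are illustrative assumptions only.

```python
import math
import statistics

def two_sample_ci(sample1, sample2, t_star):
    """CI for mu1 - mu2: (ybar1 - ybar2) +/- t* x SE(ybar1 - ybar2)."""
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    se = math.sqrt(statistics.variance(sample1) / len(sample1)
                   + statistics.variance(sample2) / len(sample2))
    margin = t_star * se
    return diff - margin, diff + margin

# Made-up illustration data (hypothetical, not from the textbook).
group1 = [12.0, 15.0, 11.0, 14.0, 13.0]
group2 = [9.0, 10.0, 12.0, 8.0, 11.0, 10.0]

# Conservative df = smaller of (5 - 1) and (6 - 1) = 4.
# The 95% critical value for 4 df from a t table is 2.776.
low, high = two_sample_ci(group1, group2, t_star=2.776)
print(round(low, 2), round(high, 2))
```

If the interval does not contain 0, the data give evidence of a difference between the two means at the matching significance level.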
Test:  H0: µ1 - µ2 = 0 same as µ1 = µ2 , "no difference" "always"
        Ha: µ1 - µ2 > 0 same as µ1 > µ2 Be careful with these, that you know which direction you want.
    or Ha: µ1 - µ2 < 0 same as µ1 < µ2 Often we label our variables "1" and "2" so that we expect µ1 > µ2
    or Ha: µ1 - µ2 ≠ 0 same as µ1 ≠ µ2  (not equal)
        Calculate t = (ybar1 - ybar2) / SE(ybar1 - ybar2), then find the P-value
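A hedged Python sketch of the test, using only the standard library. The t statistic is (ybar1 - ybar2) / SE(ybar1 - ybar2), since H0 says the true difference is 0. The P-value here comes from numerically integrating the t density (Simpson's rule), which is a stand-in for Table T or the computer, not what SPSS does internally. Data and function names are made up for illustration.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom (df need not be an integer)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t), by Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3  # area beyond |t| in one tail
    return tail if t >= 0 else 1 - tail

def two_sample_t(sample1, sample2):
    """t statistic for H0: mu1 - mu2 = 0."""
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    se = math.sqrt(statistics.variance(sample1) / len(sample1)
                   + statistics.variance(sample2) / len(sample2))
    return diff / se

# Made-up illustration data (hypothetical, not from the textbook).
group1 = [12.0, 15.0, 11.0, 14.0, 13.0]
group2 = [9.0, 10.0, 12.0, 8.0, 11.0, 10.0]

t = two_sample_t(group1, group2)
df = min(len(group1), len(group2)) - 1        # conservative by-hand df
p_greater = t_upper_tail(t, df)               # for Ha: mu1 > mu2
p_two_sided = 2 * t_upper_tail(abs(t), df)    # for Ha: mu1 not equal to mu2
print(round(t, 3), round(p_greater, 4), round(p_two_sided, 4))
```

The one-sided P-value is half the two-sided one, which is why it matters that you decide the direction of Ha before looking at the data.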

SPSS will do our computations when we are given raw data.  See handout.  Datasets
Analyze>Compare means> Independent-samples t. We use the Equal-variances-not-assumed line of the results.
  (Does same example as Optional Example by hand: twosampexample.htm)


Sievers home  Math151-Sp06/Daysp40.htm  10pm 5/8/06
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.