Math 151, Day 5, Wed., Feb. 1, Spring '12. After class. Hit reload to get the most current version.

HW Day 5: (Re)read pp. 43-49 (5-number summary/boxplot). Do "Check" p. 59: 17, 19, 20. Read pp. 55-6, "Organizing...". Then, ahead: finish Ch. 2, pp. 49-55 (standard deviation, & using technology). ("Check" problems: skip 2.21, do 22-24.)
Do the 5-number summaries required here by hand (with a calculator if needed for means, finding middles between 2 numbers). 
Read ahead in Ch. 3: pp. 67-72, density curves, & further ahead, Normal Distributions, pp. 73-86. There's a lot there, and it will need repetition.

Hand in ..
5# summary, boxplots (Finish these, you may have started last time)
Text material for Questions on 5# summary, boxplots
p. 60, 2.28 Wood again. Go ahead and use the stemplot figures to find the quartiles.  Also make a boxplot.
p. 60, 2.30 fruit eating
p. 61, 2.35 guinea pigs survival:  For a) use the One Variable Statistical Calculator Applet at  http://bcs.whfreeman.com/bps5e   or on your text's CD .  Observe the skewness, sketch on your paper.  For b), find the 5-number summary (easy since they're in order in the book), check your answers with the Applet results.  Draw the boxplot and compare with the histogram on your screen.  (marking or not marking outliers, I don't care.)

p. 60, 2.27 U. endowments.  They mean: how far do you have to count in to the ordered list to locate the median and quartiles?

p.60, 2.29 Flower length: Find the 5-number summary for Yellows.  You may use the stemplot data p. 57. If you want more practice, do the other 2 by hand also, but you may just use the numbers from the answers in the back of the book.  Use them to make 3 side by side boxplots, and finish the problem as written.


+ + + +A few more + + + +

p. 57, 2.13 Rainforest logging.  (Big picture--how fast does a forest recover from logging?) Use the 4-step process; see Day 4, bottom, pp. 55-7 &/or inside front cover.  Note that "state", the first step, is usually already "done" = the textbook statement of the problem.  The data are probably suitable for mean & standard deviation, but we don't have the SPSS power to do them easily yet, so use your hand methods--stemplots, quartiles, boxplots...  This is one where working together with others can have real benefits, since it's pretty open-ended.

p. 61, 2.36  days of births, Canada (Toronto, actually) The book's question is very open-ended.  Answer instead the questions just below the HW box,*
= = = = =Postpone the rest= = = = = = = = = =
Standard deviation
(You have to find a textbook from now on.)  We didn't get to S.D. today.  You probably CAN do some/all of the standard deviation problems (from high school), but if you do, KEEP them to be part of Monday's (Day 7) HW. (The above problems are a lot; this should balance the load a bit.)
A.  Find the mean and standard deviation of these 4 numbers: 2,3,5,6 by hand (with simple calculator)  Repeat for these 5 numbers:  2,3,5,6,14.   For each set, make a Dotplot, and mark the mean with a wedge, and indicate the standard deviation s with <----> lines from the mean to both sides, s long. (like the sketch below)
p. 52, 2.10 CFU's  Do a and b by hand. (Hint: Mean is 2138.5) Use SPSS or some other tool** to  do c.  Write your answers from screen to paper.  Also (re)make a dotplot of the data, mark the mean with a wedge, and indicate the standard deviation s with <----> lines from the mean to both sides, s long. (like the sketch below)

p. 53, 2.11   xbar=7.50, s = 2.03 the same for both dist's. Don't do the calculations--just make back to back or side by side  stemplots & compare their shapes!
ALSO with 2.11, type the data for Dataset B into SPSS or other**, excluding the outlier of 12.50.  Find and write down the mean and s.d. now.  Compare to xbar=7.50, s = 2.03.
**Where it says to use  SPSS, you may use SPSS (preferred) (Didn't get handout? Link ). Or a statistical calculator if you have one, or the Applet, One Variable Statistical Calculator, on the web http://bcs.whfreeman.com/bps5e or on the CD in your book.

Read, to discuss 


Optional

p. 62, 2.42  Play with  summary numbers. Use the Applet, One variable statistical calculator; type data in at the Data tab

* Questions for 2.36, p. 61 (Days of births, Canada ):
A.  a) Which day had the lowest Median (and about what was that number)?
     b) Which day had the highest Median (and about what was that number)?
      c) Which day had the highest variability (spread), measured by:
                     --IQR (about what are the quartiles for this day)?
                     --Range (about what are min and max for this day)?
       d) Tuesday appears to be somewhat skewed.  Left, or Right skewed?
B. To compare the Canadian with the American data, p. 11 #1.5:
    a) Is the general pattern the same in the Canadian and American data?  Discuss briefly the common findings.
    b) Going deeper: (Following the 4-step method, p. 55-7:)
State the issue: Is the weekend/weekday difference greater in Canada or in the US (or are they similar)?
Plan how to find an appropriate answer: In both countries, Tuesday is highest, Sunday is lowest. Relate the number of Tuesday's births to the number of Sunday's births for each country.  Proportions/percents will show the relationship best, since different types of summary numbers are given for the two countries.  We'll find Sunday births as a percent of Tuesday's.
Solve: For Canada, you have (part A) estimated the median number of births for Tuesday and also for Sunday, from the graph.  Take the number for Sunday, divide by Tuesday's number, restate as a percent. For U.S., use the numbers on p. 11, dividing Sunday by Tuesday.
Conclude, something like this:  "In Canada, on Sunday(s), the number of Sunday births was ___% of the number of births on Tuesday. In US [make the parallel statement.]  Therefore the difference is greater(?) in (Canada?US?).  This may indicate that proportionately more "planned births" occur in (Canada?US?)." (Remember we decided the most likely reason for the weekday/weekend difference was planned births--induced and Caesarians.)

    c)  The picture for 2.36 makes the difference between weekdays and weekend days look more extreme than it actually is.  Why/how?
    d)  To make the numbers more comparable,  (U.S. Means (per day) of all births in a year of Sundays/Tuesdays, Canada median number per Sunday/Tuesday) it would be better if we had the Canadian Means also.  Is it likely to make much difference? To address this, look at the boxplots and tell (using skewness) whether the Canadian mean for Tuesday would be less than the median, about the same, or more than the median.  Do the same for Sunday.




Go to Mac 101 Computer Lab Friday (Feb. 3) for SPSS. Bring a flash drive.  Sign up today for the 10:30 session if you can.  Back here Monday.  At that point we'll be using SPSS heavily for about 3 1/2 weeks, then not again till the very end of term.  SPSS for you??

First hourly exam, a week from Friday: Feb. 10, Day 9 .
Sample exam  handed out  Friday or Monday.  Solutions will be linked from Day pages.  Closed book, but bring one sheet of notes (anything you like) and a calculator.
Exam will cover thru what is assigned on this coming Monday, Plus reading SPSS output. (We'll be going to the computer lab to learn SPSS this  Friday.)   You may be asked to read SPSS output (as we'll see it on the sample exam), but not how to produce it. 


Handout today:  SPSS--Mean and SD

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Homework questions? Day 4

I didn't say Wed:
Median:
  Middle one if n is odd, or average the 2 middle  if n is even.
         Formula:  Count in how far?  (n+1)/2 places.  ( 14 items--> 7 1/2 places? go halfway =average the 7th and 8th observations.  HW: 2.1, wood n = 20; (20+1)/2 = 10.5:  halfway between 10th &11th: average the 10th and 11th.)   
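
The counting rule above can be sketched in a few lines of Python. This is only an illustration, not part of the assigned hand work: the function name median_by_counting is made up for this sketch, and the two small data sets are the ones used in the standard deviation demo further down this page.

```python
def median_by_counting(values):
    """Median found by counting in (n+1)/2 places in the ordered list."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2            # 1-based place to count in to
    whole = int(position)             # whole-number part of that place
    lower = ordered[whole - 1]        # observation at (or just below) the place
    if position == whole:             # n odd: the place lands on one observation
        return lower
    upper = ordered[whole]            # n even: place is halfway to the next one
    return (lower + upper) / 2        # "go halfway" = average the two middles

print(median_by_counting([1, 1, 2, 4, 12]))  # n = 5, count in 3 places
print(median_by_counting([1, 1, 2, 4]))      # n = 4, count in 2 1/2 places
```

For n = 14 the place is 7 1/2, so the function averages the 7th and 8th ordered observations, exactly as in the note above.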

New today:
SPREAD: 
Quartiles, five number summary, boxplot, IQR.  Notes  Day 4    
   4-step process (Day 4, bottom)

SPSS Friday; start here Monday.
Cartoon  

Summaries of Middle & Spread continued--"Systems:"
-- (Midrange, Range  Very sensitive to outliers--they use only the max and min!)
-- Median, IQR  (+ Quartiles Q1, Q3, 5-number summary), based on percentiles (the j'th percentile is the value with about j% of the data at or below it)
-- Mean, StandardDeviation "x-bar" (or "y-bar"), "s"  (good for symmetric unimodal, no outliers)
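
For the median/IQR "system", here is a rough Python sketch of a hand-style 5-number summary. It assumes the quartile rule of taking the median of each half of the ordered list, leaving out the overall median when n is odd; software (including SPSS and the Applet) may use slightly different rules and give slightly different quartiles. The function names are invented for this sketch, and it needs at least 2 observations.

```python
def median(ordered):
    """Median of a list that is already in order."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Min, Q1, Median, Q3, Max using the median-of-each-half rule (assumed)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # observations before the median's place
    upper = ordered[half + (n % 2):]    # observations after the median's place
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

minimum, q1, med, q3, maximum = five_number_summary([1, 1, 2, 4, 12])
print(minimum, q1, med, q3, maximum)
print("IQR =", q3 - q1, " Range =", maximum - minimum)
```

With the demo data 1, 1, 2, 4, 12 the lower half is 1, 1 and the upper half is 4, 12, so Q1 = 1 and Q3 = 8.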

Standard deviation (measure of Spread that goes with mean)
    Variance s^2:  (almost) average of squared deviations from the mean.
                 (Divide by (n-1) "degrees of freedom")
    s : Standard deviation  is the square root of the variance.
            Computation:  I will require you to know how to do it by hand for 4 or 5 observations
   (see BPS5e p. 49-51 for formula & computation example. )
Demo:  1,1,2,4, mean = 2, sum of squared deviations = 6, variance = 2, s = 1.41 (Using Table to calculate sum of squared deviations below)
..

1,1,2,4,12, mean = 4, sum of squared deviations = 86, variance = 21.5, s = 4.64.
(Midcomputation check:  Sum of deviations from the mean (before squaring each) always = 0 )
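
The two demos above can be checked with a short Python sketch that follows the same by-hand steps: deviations from the mean, squares, divide by n-1, square root. The function name sample_sd is made up for this sketch; it is a check on hand work, not a replacement for it.

```python
import math

def sample_sd(values):
    """Return mean, sum of deviations, sum of squared deviations, variance, s."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]      # x - xbar for each observation
    squared = [d ** 2 for d in deviations]       # (x - xbar)^2
    variance = sum(squared) / (n - 1)            # divide by n-1 degrees of freedom
    return mean, sum(deviations), sum(squared), variance, math.sqrt(variance)

for data in ([1, 1, 2, 4], [1, 1, 2, 4, 12]):
    mean, dev_sum, ss, variance, s = sample_sd(data)
    print(data, "mean =", mean, " sum of deviations =", dev_sum,
          " sum of squares =", ss, " variance =", variance, " s =", round(s, 2))
```

The sum of deviations should come out 0 for both data sets, which is the midcomputation check.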

...
--s is always >= 0  (0 only if all observations are equal)
--s units the same as those of the observations (squared and squarerooted).
        Physics: angular momentum (spinning ice skater)

         Not so weird: High school geometry?
        Remember Pythagorean theorem:
         c^2 = a^2 + b^2:

                Hypotenuse of right triangle is also square root of a sum of squares.

Very sensitive to outliers (the outliers contribute much more than their share to the Sum of Squared Deviations from the Mean)   Note contribution of 12  is disproportionate.
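
To see how disproportionate the 12 is, one can compute each observation's share of the Sum of Squared Deviations for the demo data 1, 1, 2, 4, 12. This is a small Python sketch; the function name squared_deviation_shares is invented for illustration.

```python
def squared_deviation_shares(values):
    """Each observation's fraction of the total sum of squared deviations."""
    mean = sum(values) / len(values)
    squared = [(x - mean) ** 2 for x in values]
    total = sum(squared)
    return [sq / total for sq in squared]

data = [1, 1, 2, 4, 12]
for x, share in zip(data, squared_deviation_shares(data)):
    print(x, "contributes", round(100 * share, 1), "% of the sum of squares")
```

The 12 is one observation out of five (20% of the data), but its squared deviation is 64 out of the total 86, roughly three quarters of the sum.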

Mean and Standard Deviation are for Symmetric Unimodal  distributions without big outliers.
   (ideally "Bell-shaped" = Normal)

SPSS, for simple computation: Handout


 Math151-Sp12/Days5.htm  1pm 2/1/12
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.in the box.
- - - - - - - - - - - - - - - - -
Table for calculating sum of squared deviations, for n = 4 observations.

x                          x-xbar = x-2           (x-xbar)^2
1                          -1                     +1
1                          -1                     +1
2                           0                      0
4                           2                      4
8 = Sum. xbar = 8/4 = 2     0 = sum (always!)      6 = sum of squared deviations