Math 151 , Spring '07, Day 4, Mon, Feb. 5 After class Hit reload to get most current version

HW assignment  Day4  (From Moore unless otherwise noted.)
(Re)Read Ch.2 thru p. 47.  Read 53-55, "Organizing...". Do "check" p. 56,  2.15,17,18 (5#summary/boxplot) Ahead: Finish Ch. 2. ("check" 2.19 (don't calculate. It's not #a), 20, 21, 22)
Hand in Wed:  (Note: the first problems are copied from Day 3 (postponed), & there are more.)
p. 35, 1.43 Orange prices timeplot

Ch. 2
p.39, 2.1 Wood, mean Punch the 20 actual values into your calculator, adding and dividing by 20. 
A.  Find the (approximate) median for the data on Wood breakage, using the numbers in the stemplot on p. 21 (Fig.1.10.)  It's approximate because the stemplot data is rounded--Quick and Dirty is often sufficient!

p. 41, 2.4  Bonds Home runs Make a stemplot to put the numbers in order to find the medians.  For the means, just punch them in. (You can shorten the work by finding the sum of the 18 years excluding the 73, writing that down, and then adding the 73 to get the total for the 19 years.  Then divide the appropriate sums by 19 and 18.)
p. 41, 2.3,  p. 57, 2.23, 2.24  mean or median?

p. 45, 2.5 Wood again.  Also make a boxplot.
p. 58, 2.28 U. endowments.  They mean: how far into the ordered list do you have to count to locate the median and quartiles?
p. 58, 2.29 fruit eating
p. 58, 2.30 newborns. 
(I said I wouldn't make you make a histogram, but the data's already pre-binned, so do it here.) Also describe the distribution--symmetric or skewed?

p. 59, 2.34 guinea pigs survival:  For a), use the One Variable Statistical Calculator Applet at  http://bcs.whfreeman.com/bps4e   or on your text's CD. (If you have an older, used book, it may be in the datasets as if for BPS3e: ex02-23.dat.)  Just observe the skewness.  For b), find the 5-number summary (easy since they're in order in the book), then check your answers with the Applet results.  Draw the boxplot and compare with the histogram on your screen.  (With or without outliers, I don't care.)
p. 60, 2.35  days of births, CA The book's question is very open-ended.  Answer instead the questions just below*
"Read," to discuss (be able to answer in class)
Ch. 2
p. 57-8, 2.25 Dr's salaries.  Look at the answers in the back for the mean and median.
 
 

p. 58, 2.26 Resistance, with Applet
 http://bcs.whfreeman.com/bps4e  Or use CD from book.  Choose "Statistical Applets", Mean & Median. Also, add more points (up to 50 total). Check out symmetric distributions, skewed distributions, and distributions with outliers.




Optional 
P. 59, 2.32 (mean/median play, with Applet)
p. 63, 2.42, 43 (more play, with pencil)


* Questions for 2.35, p. 60:
A.  a) Which day had the lowest Median (and about what was that number)?
     b) Which day had the highest Median (and about what was that number)?
      c) Which day had the highest variability (spread), measured by:
                     --IQR (about what are the quartiles for this day)?
                     --Range (about what are min and max for this day)?
       d) Tuesday appears to be somewhat skewed.  Left, or Right skewed?
B.  Compare the Canadian with the American data (p. 10):
    a) Is the general pattern the same in the Canadian and American data?  Discuss briefly the common findings.
    b) (Following the 4-step method, p. 53:)
       State the issue: Is the weekend/weekday difference greater in Canada or the US (or are they similar)?
       Formulate an appropriate answer: In both countries, Tuesday is highest, Sunday is lowest. Relate the number of Tuesday's births to the number of Sunday's births for each country.  Proportions/percents will show the relationship best, since there are different types of numbers for the two countries.
       Solve:  For Canada, you have (part A) estimated the median number of births for Tuesday, and for Sunday, from the graph.  Take the number for Sunday, divide by Tuesday's number, restate as a percent. For the U.S., use the numbers on p. 10, dividing Sunday by Tuesday.
       Conclude, something like this:  "In Canada, the number of Sunday births was ___% of the number of births on Tuesday. In the US, (the parallel statement).  Therefore the difference is greater(?) in (Canada? US?).  This may indicate that proportionately more planned births occur in (Canada? US?)."
    c)  The picture for 2.35 makes the difference between weekdays and weekend days look more extreme than it actually is.  Why/how?
    d)  To make the numbers more comparable,  (U.S. total of all births in a year of Sundays/Tuesdays, Canada median number per Sunday/Tuesday) it would be better if we had the Canadian Means.  (because mean times n = total).  Look at the boxplots and tell whether the Canadian mean for Tuesday would be less than the median, about the same, or more than the median.  Do the same for Sunday.
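
The percent comparison in B(b) is just one division per country. Here is a minimal Python sketch of that arithmetic. Python is not part of the assignment, the function name is made up for illustration, and the two numbers are HYPOTHETICAL placeholders, not values from the book's graph or from p. 10.

```python
def sunday_as_percent_of_tuesday(sunday, tuesday):
    """Sunday births expressed as a percent of Tuesday births."""
    return 100 * sunday / tuesday

# HYPOTHETICAL placeholder numbers, not read from the book's graph or table.
example_sunday, example_tuesday = 300, 400
pct = sunday_as_percent_of_tuesday(example_sunday, example_tuesday)
print(f"Sunday births were {pct:.0f}% of Tuesday births")
```

Because the result is a ratio, the same calculation works whether the inputs are medians per day (Canada) or yearly totals (U.S.), which is why percents make the two countries comparable.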

Friday we'll start SPSS--meet at class time in Mac 101 Computer Lab. (Alternate time: 10:30; sign up Wed.)
Sign in--sign up to see me in my office if you haven't.  Find someone you don't know and introduce yourself.  Introduce yourself to at least one person you know, in case they've forgotten who you are.
Math clinic times are posted outside the clinic (Mac120) but not yet on the web.
Time plots (see Day 3 for details)
Measures of middle 
  Mean, median:  The mean is sensitive to skewness and outliers; the median is resistant to them.  Symmetric distribution? Mean = median!
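
A quick check of the resistance claim, as a minimal Python sketch. Python is only an illustration here (the course itself uses a calculator, the applets, and SPSS), and dropping the value 20 is an illustrative choice. The data are the set of 9 from the boxplot demonstration in these notes.

```python
from statistics import mean, median

# The set of 9 values used in these notes' boxplot demonstration.
data = [1, 3, 5, 6, 6, 8, 8, 11, 20]
# Same set with the high value 20 removed (an illustrative choice).
trimmed = [x for x in data if x != 20]

mean_all, median_all = mean(data), median(data)
mean_trim, median_trim = mean(trimmed), median(trimmed)

print(f"with 20:    mean = {mean_all:.2f}, median = {median_all}")
print(f"without 20: mean = {mean_trim:.2f}, median = {median_trim}")
```

Removing the 20 pulls the mean from about 7.56 down to 6, while the median stays at 6.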
Measures of spread (dispersion)  (See Day 3 for details)
   Quartiles:
            1st quartile Q1: 1/4 below, 3/4 above.  = 25th percentile.
            (2nd quartile = median = 50th percentile)
            3rd quartile Q3: 3/4 below, 1/4 above.  = 75th percentile.
        Hand computation (Tukey):  Q's are the medians of the 2 halves.  (Median is a data value? Discard it.)
    
Five-number summary:  min, Q1, Median, Q3, max. 
       
INTERQUARTILE RANGE = IQR= Q3 - Q1. (9.5 - 4 = 5.5 for both sets from day 3)
                =The range of the middle half of the observations.  Resistant to outliers!
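
The hand rule for quartiles can be written out step by step. This is a minimal Python sketch, assuming the "medians of the two halves, leave out the median if it is a data value" rule stated above. The function name is made up for illustration, and the data are the set of 9 from the boxplot demonstration in these notes (1 3 5 6 6 8 8 11 20).

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the hand rule from these notes:
    the quartiles are the medians of the two halves, and when the median is
    itself a data value (odd count) it is left out of both halves."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # values below the median position
    upper = xs[half + (n % 2):]  # values above it; skips the median if n is odd
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

data = [1, 3, 5, 6, 6, 8, 8, 11, 20]
lo, q1, med, q3, hi = five_number_summary(data)
iqr = q3 - q1
print("five-number summary:", lo, q1, med, q3, hi)
print("IQR:", iqr)
```

For this set the sketch gives 1, 4, 6, 9.5, 20 and IQR = 5.5, matching the hand computation. Software packages often use slightly different quartile rules, so their answers can differ a little from the hand rule.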

Box (and whisker) plot: 
Graphical form of five number summary.
    Especially good for comparing sets of data, conditioned on a categorical variable.
"Plain vanilla--Moore" Draw and label the numerical scale first.  Then mark the five numbers. Finish the picture.
The box spreads over the middle half (Q1 to Q3), the whiskers over the lowest and highest quarters (Min to Q1, Q3 to Max).  Each section shows the spread of 1/4 of the data: the longer the section the thinner the data must be spread in there.   Can "read" skewness.
Demonstration with set of 9. 1 3 | 5 6 6 8 8 | 11 20    5#summ: 1, 4, 6, 9.5, 20
Direction of boxplot?  Vertical or horizontal is a matter of taste. I do horizontal, usually.

  |-----[   |      ]--------------------|
0·········5·········10········15········20
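
The "longer section, thinner data" reading of the boxplot above can be checked with a little arithmetic. This is a minimal Python sketch (Python is only an illustration, and the variable names are made up). It assumes each of the four sections holds about 1/4 of the data, as described above.

```python
# Five-number summary of the set of 9 from the demonstration above.
summary = [1, 4, 6, 9.5, 20]
labels = ["min to Q1", "Q1 to median", "median to Q3", "Q3 to max"]

# Each section holds about 1/4 of the data, so the share of data per unit
# of the scale is 0.25 divided by the section's length.
lengths = [b - a for a, b in zip(summary, summary[1:])]
densities = [0.25 / length for length in lengths]

for label, length, density in zip(labels, lengths, densities):
    print(f"{label:13s} length {length:5.1f}  about {density:.3f} of the data per unit")
```

The upper whisker (Q3 to max) is by far the longest section, so the data are thinnest there: the picture of a right-skewed set.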

"Showing outliers" p.45ff. Outliers can make a boxplot whisker extend deceptively beyond the bulk of the data.
      Make the whiskers to the last item in the "main mass" of the data.
       Put a dot or a star for each outlier,  beyond the whisker end.
   How do we decide what's an outlier?  By hand; use your judgement.
     (Rule of thumb: Knowing the rule is optional--used by computers) Define "outlier" as a value farther out than 1.5 IQR  from the Quartiles.
          (Q1 - 1.5 IQR is lower "fence", Q3 + 1.5 IQR is upper "fence".)
                For the set of 9, 1.5 IQR = 1.5×5.5 = 8.25. Fences are 4 - 8.25 = -4.25, and 9.5 + 8.25 = 17.75.
                   So 20 lies outside the fence, and the whiskers & box  should go from 1 to 11 (largest inside the fence)
        (Dot or *?  Tukey:  Dot · if between 1.5 and 3 IQR's out, * if more than 3 IQR's out. By hand, I don't care. Here a * because it shows better.)

  |-----[   |      ]--|                 *
0·········5·········10········15········20 
   This is the same as we would have done without the rule, probably.
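
For anyone who wants to see the 1.5 IQR rule the way a computer would apply it, here is a minimal Python sketch. Python is only an illustration, the function and variable names are made up, and the quartiles Q1 = 4 and Q3 = 9.5 are taken from the five-number summary of the set of 9.

```python
def fences(q1, q3, k=1.5):
    """Lower and upper fences: k IQRs beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [1, 3, 5, 6, 6, 8, 8, 11, 20]
q1, q3 = 4, 9.5                      # quartiles from the five-number summary
low_fence, high_fence = fences(q1, q3)

inside = [x for x in data if low_fence <= x <= high_fence]
outliers = [x for x in data if x < low_fence or x > high_fence]
whisker_ends = (min(inside), max(inside))

print("fences:", low_fence, high_fence)
print("whiskers run from", whisker_ends[0], "to", whisker_ends[1])
print("outliers:", outliers)
```

The whiskers stop at the most extreme values that are still inside the fences (1 and 11 here), not at the fences themselves, and 20 is the only value marked as an outlier.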

Example:  p.60, 2.34 Guinea pig survival: (redo for hw)
Use the One Variable Statistical Calculator Applet at 
http://bcs.whfreeman.com/bps4e
   Compare boxplot with histogram:  longer boxplot sections mean lower histogram height and vice versa.

Next: Standard deviation.

Sievers home  Math151-Sp07/Daysp4.htm  1:30pm 2/5/07
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.