Math 151, Spring '12, Day 4, Mon. Jan. 30. After class. Hit reload to get the most current version.

HW assignment, Day 4 (from Moore unless otherwise noted)
Timeplot, pp. 23-5.  You will need to be able to recognize cycles and trends, not make a timeplot by hand. (We'll make them in SPSS later.)
Now: Ch. 2 through p. 43 (mean/median). Do "Check" problems p. 58: 2.15, 2.16, 2.18.  PLEASE read ahead: pp. 43-49 (5# summary/boxplot). Do "Check" problems p. 59: 2.17, 2.19, 2.20.  Read pp. 55-6, "Organizing...".  Then finish Ch. 2, pp. 49-55 (standard deviation, using technology). ("Check" problems: skip 2.21, do 2.22-2.24.)
Do the 5-number summaries required here by hand (with a calculator if needed for means, finding middles between 2 numbers). 

Hand in next class: (Note: the first set of problems is copied from Day 3 (postponed).  Label everything Day 4.)
p. 36, 1.44 Housing starts timeplot
Chapter 2 
Questions--mean/median
p. 41, 2.1 Wood, mean. Punch the 20 actual values into your calculator, add them up, and divide by 20.
A.  Find the (approximate) median for the data on Wood breakage, using the numbers in the stemplot on p. 21 (Fig.1.11.)  It's approximate because the stemplot data is rounded--Quick and Dirty is often sufficient!  Keep a copy for #2.28, which will be assigned soon.  Note: how far to count in?  I didn't say in class.  Count (n+1)/2 places: (20+1)/2 = 10.5; halfway between 10th and 11th.  You probably figured that out without the formula.

p. 43, 2.4 (house prices nationally--mean or median)
p. 59, 2.25, 2.26  mean or median?

A.  You are driving on the thruway from Syracuse to Rochester and keep track of how many vehicles you pass and how many pass you.  You find that these 2 numbers are the same.  Your speed on the thruway is: (a) the Mean speed of the cars, (b) the Median speed of the cars, (c) the Modal speed of the cars.  Choose one, and justify your choice.

Text material for Questions for the rest of this page
pp. 64-5, 2.46 Activity/obesity: Use the 4-step process; see below, pp. 55-7, and/or inside front cover.  Note that "State", the first step, is usually already done: it is the textbook statement of the problem. Here's your Plan: Use back-to-back stemplots to plot the data.  That, medians, and ranges (highest - lowest) should be enough here, plus discussion and description.  (With only ten observations each, in front of you in stemplots, I think more computation is overkill.)
 

YES: DON'T hand in any of the ones below yet, but:
5# summary, boxplots (YOU CAN work on these, but keep them to hand in as part of Day 5).  You can make ordered stemplots if there's raw data; you'll be using the ordered numbers to find medians, then quartiles, then boxplots. Text material for Questions on 5# summary, boxplots
p. 60, 2.28 Wood again. Go ahead and use the stemplot figures to find the quartiles.  Also make a boxplot.
p. 60, 2.30 fruit eating
p. 61, 2.35 guinea pigs survival:  For a), use the One Variable Statistical Calculator Applet at http://bcs.whfreeman.com/bps5e or on your text's CD.  Observe the skewness, and sketch it on your paper.  For b), find the 5-number summary (easy since the data are in order in the book), and check your answers with the Applet results.  Draw the boxplot and compare with the histogram on your screen.  (Marking or not marking outliers: I don't care.)

p. 60, 2.27 U. endowments.  They mean: how far do you have to count in, in the list, to locate the median and quartiles?

p. 60, 2.29 Flower length: Find the 5-number summary for Yellows.  You may use the stemplot data on p. 57. If you want more practice, do the other 2 by hand also, but you may just use the numbers from the answers in the back of the book.  Use them to make 3 side-by-side boxplots, and finish the problem as written.

+ + + +Another + + + +

p. 57, 2.13 Rainforest logging.  (Big picture--how fast does a forest recover from logging?) Use the 4-step process; see below, pp. 55-7, and/or inside front cover.  Note that "State", the first step, is usually already done: it is the textbook statement of the problem.  The data are probably suitable for mean & standard deviation, but we don't have the SPSS power to do them easily yet, so use your hand methods--stemplots, quartiles, boxplots...  This is one where working together with others can have real benefits, since it's pretty open-ended.

"Read," to discuss (be able to answer in class)

Ch. 2  Questions--mean/median

p. 62, 2.37 Thinking about means. Look at the answer in the back for the mean.

p. 62, 2.38 Thinking about medians


 
  p. 61, 2.33 Resistance, with Applet
 http://bcs.whfreeman.com/bps5e   Or use the CD from the book.  Choose "Statistical Applets", Mean & Median. Also, add more points (up to 50 total). Check out symmetric distributions, skewed distributions, and distributions with outliers.

 

Optional 

- - - - - -
 p. 61, 2.34 (mean/median play, with Applet)

p. 63, 2.41, 2.42 (more play, with pencil--and/or open the Applet, One Variable Statistical Calculator, type data in at the Data tab, and see Statistics, stemplot)


Sign in on the clipboard.
--Compare HW with others, tell me unanswered questions, write #s on the board.
--Friday we'll meet in computer lab Mac 101 for intro to SPSS. 
At that point we'll be using it heavily for about 3 1/2 weeks, then not again till the very end of term.  SPSS for you??
    Lab session offered 10:30 & 11:30.  Sign up Wed. 
(Not binding)


  Class members   Math151@wells.edu (??)
HW questions? Day 3
  http://bcs.whfreeman.com/bps5e  for 1.35, Doctors (they round, so 798 goes on as 8|0.  Yours (truncating)  should look a little different, but match a histogram.) 
   http://bcs.whfreeman.com/ips7e/  CO2 per cap. by country.  The data here is a little different from yours, but should be same rough shape.  Different years?  (I used the references  and tried to track them down, but the data didn't match up perfectly to any source.)

Overview--"Variability happens, but things settle down in long run." Notes, Day 2
Timeplots. Trend, cycles   Beer1

Measures of Middle
Mean/median. Notes, Day 3

...
Measures of Spread (dispersion, variability)  distributions with different spreads
    Range:  largest - smallest.   Resistant?  NO!  Two observations carry all the info; the rest could be anywhere.

Dot plots of 3 distributions, all with same range:
.        .
.        .
.        .
.        .
__________
                                   We need measures of spread that will better take into account  all the observations:
..........
__________
           Quartiles, five-number summaries, boxplot, InterQuartile Range.
    ..
    ..
.   ..   .
__________
                                      (Variance), Standard deviation.
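
A minimal Python sketch of the point the three dot plots make. The data values are an assumption: each dot's position is read off the plots as a whole number on a 0-9 scale. The names set_a, set_b, set_c and data_range are made up for illustration. All three sets have the same range, but the standard deviation, which uses every observation, tells them apart.

```python
from statistics import stdev

# Assumed data: dot positions read off the three dot plots, on a 0-9 scale.
set_a = [0, 0, 0, 0, 9, 9, 9, 9]        # all dots piled at the two ends
set_b = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # dots spread evenly
set_c = [0, 4, 4, 4, 5, 5, 5, 9]        # most dots in the middle


def data_range(values):
    """Range = largest - smallest; only two observations matter."""
    return max(values) - min(values)


for name, data in [("A", set_a), ("B", set_b), ("C", set_c)]:
    print(name, "range =", data_range(data), " s.d. =", round(stdev(data), 2))
```

The range comes out the same for all three sets, while the standard deviation shrinks from A to B to C.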
Start here Wednesday
Quartiles: Divide data into quarters.  1st quartile Q1: 1/4 below, 3/4 above.  = 25th percentile.
             (2nd quartile= median = 50th percentile)
            3rd quartile Q3: 3/4 below, 1/4 above.  = 75th percentile.

Computation of quartiles:  Different texts, packages use different methods.
By hand: We'll use Tukey's quick and dirty method (he called the results "hinges"):
Take the two halves of the data you got from finding the median.  Find the median of each half, using the same rule as before.  (Detail.  IF you had an even number of observations to start with, the data divides evenly into an upper and a lower half. No problem.  IF you had an odd number to start with, you have one in the middle, the median. In this case only, you throw the median away, and use the remaining halves.)
1 3 5 6 8 8 11 20 are n=8 observations.
    Median at (8+1)/2 = 9/2 = 4 1/2th place: 1 3 5 6 | 8 8 11 20, M = (6+8)/2 = 7
 8/2 = 4 in each half: Halves are 1 3 5 6, and 8 8 11 20.  The quartiles are the medians of each half; count in to (4+1)/2 = 2 1/2th place.
1 3 | 5 6       Q1 = (3+5)/2 = 4.
8 8 | 11 20     Q3 = (8+11)/2 = 9.5
1 3 | 5 6 | 8 8 | 11 20
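
The "count in (n+1)/2 places" rule for the median can be sketched in Python as follows. This is only an illustration of the hand method; the function name median_by_counting is made up here, and it assumes the list is already in increasing order.

```python
def median_by_counting(ordered):
    """Return (position, median) for a list already in increasing order.

    position is the 1-based place (n+1)/2; a position ending in .5 means
    "halfway between the two neighbouring observations".
    """
    n = len(ordered)
    position = (n + 1) / 2
    below = ordered[(n - 1) // 2]  # observation at or just below the position
    above = ordered[n // 2]        # observation at or just above the position
    return position, (below + above) / 2


print(median_by_counting([1, 3, 5, 6, 8, 8, 11, 20]))  # (4.5, 7.0)
```

When n is odd, below and above are the same observation, so the "average" is just the middle value.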

1 3 5 6 6 8 8 11 20 are n=9 observations.
     Median at (9+1)/2 = 10/2 = 5th place: 1 3 5 6 6 8 8 11 20, M = 6
 Throw away the median.  Now we have an even number again, 8 numbers
8/2 = 4 in each half: Halves are 1 3 5 6, and 8 8 11 20.  Continue as before. (This is a  dirty method because it gives the same quartiles for both these data sets.  Quick because computation is minimal and simple.)
1 3 | 5 6 6 8 8 | 11 20
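
A short Python sketch of the by-hand quartile rule described above (split at the median, throw the median away when n is odd, then take the median of each half). The function name quartiles_by_halves is made up for illustration; statistical packages may use other rules and give slightly different quartiles.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, M, Q3) using the by-hand rule from these notes."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]       # observations below the median's position
    upper = data[n - half:]   # observations above it (middle one dropped if n is odd)
    return median(lower), median(data), median(upper)


print(quartiles_by_halves([1, 3, 5, 6, 8, 8, 11, 20]))     # n = 8
print(quartiles_by_halves([1, 3, 5, 6, 6, 8, 8, 11, 20]))  # n = 9
```

Both calls give Q1 = 4 and Q3 = 9.5, which is exactly the "dirty" part: the two data sets get the same quartiles.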

Five-number summary:  min, Q1, Median, Q3, max.  (1, 4, 6, 9.5, 20  for the set of 9 above)
...
    InterQuartile Range = IQR = Q3 - Q1.
(9.5 - 4 = 5.5 for both sets above)
      =The range of the middle half of the observations.  Resistant to outliers!
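
To see "resistant" in action, here is a hedged Python sketch. It uses the set of 9 from above and a second, hypothetical set in which the largest value 20 is replaced by 200 (that change is made up for this illustration). The function names are also made up; the quartiles follow the by-hand rule from these notes.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, M, Q3, max) using the by-hand quartile rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


def iqr(values):
    """InterQuartile Range = Q3 - Q1."""
    _, q1, _, q3, _ = five_number_summary(values)
    return q3 - q1


set_of_9 = [1, 3, 5, 6, 6, 8, 8, 11, 20]
stretched = [1, 3, 5, 6, 6, 8, 8, 11, 200]  # hypothetical: 20 replaced by 200

for data in (set_of_9, stretched):
    print(five_number_summary(data), "range =", max(data) - min(data), "IQR =", iqr(data))
```

The range jumps from 19 to 199, while the IQR stays at 5.5.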
How to put numbers in order?  Stemplot is good! 
  StudatSp08plussome.xls (scroll down)
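
A minimal Python sketch of how a stemplot puts numbers in order, using the set of 9 from above typed in scrambled order. It assumes non-negative whole numbers with tens as stems and ones as leaves; the function name stemplot is made up for illustration.

```python
from collections import defaultdict


def stemplot(values):
    """Return the lines of an ordered stemplot (stem = tens, leaf = ones)."""
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)
    lines = []
    # Include every stem from lowest to highest, even if it has no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem} | " + " ".join(str(leaf) for leaf in leaves[stem]))
    return lines


for line in stemplot([8, 20, 1, 6, 11, 3, 8, 5, 6]):
    print(line)
```

Reading the leaves left to right, stem by stem, gives the data back in increasing order, ready for counting in to the median and quartiles.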

Box (and whisker) plot: 
Graphical form of five number summary.
    Especially good for comparing sets of data, conditioned on a categorical variable.
"Plain vanilla--Moore" Draw and label the numerical scale first.  Then mark the five numbers. Finish the picture.
The box spreads over the middle half (Q1 to Q3), the whiskers over the lowest and highest quarters (Min to Q1, Q3 to Max).  Each section shows the spread of 1/4 of the data: the longer the section, the more thinly the data must be spread in there.   Can "read" skewness.
Demonstration with set of 9. 1 3 | 5 6 6 8 8 | 11 20    5#summ: 1, 4, 6, 9.5, 20
Direction of boxplot?  Vertical or horizontal is a matter of taste. I do horizontal, usually.

  |-----[   |      ]--------------------|
0·········5·········10········15········20
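
The picture above can also be produced by a small Python sketch. This is only an illustration: the function name text_boxplot and the choice of 2 characters per unit on the scale are assumptions, and it does not handle cases where two of the five numbers land in the same column.

```python
def text_boxplot(minimum, q1, median, q3, maximum, chars_per_unit=2):
    """Draw a horizontal "plain vanilla" boxplot as one line of text."""

    def col(x):
        # Column on the page for the value x.
        return round(x * chars_per_unit)

    line = [" "] * (col(maximum) + 1)
    for c in range(col(minimum), col(q1)):   # lower whisker
        line[c] = "-"
    for c in range(col(q3), col(maximum)):   # upper whisker
        line[c] = "-"
    line[col(minimum)] = "|"
    line[col(maximum)] = "|"
    line[col(q1)] = "["                      # box runs from Q1 ...
    line[col(q3)] = "]"                      # ... to Q3
    line[col(median)] = "|"                  # median mark inside the box
    return "".join(line)


print(text_boxplot(1, 4, 6, 9.5, 20))
```

The long right-hand whisker shows the same right skew you can read from the hand-drawn version.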

"Showing outliers" pp.47-9. Outliers can make a boxplot whisker extend deceptively beyond the bulk of the data.
      Make the whiskers to the last item in the "main mass" of the data.
       Put a dot or a star for each outlier,  beyond the whisker end.
   How do we decide what's an outlier?  By hand; use your judgement!
     (Rule of thumb: knowing the rule is optional--it's used by computers.) Define "outlier" as a value farther out than 1.5 IQR from the quartiles.
          (Q1 - 1.5 IQR is the lower "fence", Q3 + 1.5 IQR is the upper "fence".)
                For the set of 9, 1.5 IQR = 1.5 × 5.5 = 8.25. Fences are 4 - 8.25 = -4.25, and 9.5 + 8.25 = 17.75.
                   So 20 lies outside the upper fence, and the whiskers & box should go from 1 to 11 (the largest value inside the fence).
        (Dot or *?  Tukey:  Dot · if between 1.5 and 3 IQR's out, * if more than 3 IQR's out. By hand, I don't care. Here a * because it shows better.)

  |-----[   |      ]--|                 *
0·········5·········10········15········20 
   This is the same as we would have done without the rule, probably.
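
The 1.5 IQR rule of thumb can be sketched in Python like this, for the set of 9. The quartiles Q1 = 4 and Q3 = 9.5 are taken from the five-number summary in these notes; the function name fences and the variable names are made up for illustration.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) = (Q1 - 1.5 IQR, Q3 + 1.5 IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


data = [1, 3, 5, 6, 6, 8, 8, 11, 20]
q1, q3 = 4, 9.5  # by-hand quartiles for the set of 9

lower, upper = fences(q1, q3)
outliers = [x for x in data if x < lower or x > upper]
inside = [x for x in data if lower <= x <= upper]
whisker_ends = (min(inside), max(inside))  # whiskers stop at the last values inside the fences

print("fences:", lower, upper)
print("outliers:", outliers)
print("whiskers run from", whisker_ends[0], "to", whisker_ends[1])
```

Here only 20 falls outside the fences, so the whiskers run from 1 to 11 and 20 gets its own mark.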

Example:  p. 61, 2.35 Guinea pig survival (redo for HW).
Use the One Variable Statistical Calculator Applet at http://bcs.whfreeman.com/bps5e
   Compare boxplot with histogram:  longer boxplot sections mean lower histogram height, and vice versa.
Example boxplot picture: http://cnx.org/content/m17103/latest/Ch2_boxplot_4.png
   from Connexions: Collaborative Statistics, Barbara Illowsky, Ph.D., Susan Dean.
 

Organizing a statistical problem: Four-step process (pp. 55-7, & inside front cover) 
State: the issue to be explored, question to be addressed (real-world)  (In HW problems, often already stated.)
Plan:  What statistical tools, measures, analyses should we use to answer the question?
Solve:  Carry out the process.  (You may need to back up & try again.  Say you decide on mean and s.d., but the stemplot shows the data are badly skewed?  Go back and decide on the 5# summary instead.)
Conclude:  Give the conclusion as it addresses the real-world question/issue.
Any time left??  Probably won't be.  Begin p. 57, 2.13 in class in pairs (or 3's).  Decide what analyses to do; start doing them.  (Make a copy for each person, if you won't be working together outside of class.)


Sievers home  Math151-Sp12/Days4.htm  1pm 1/30/12
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.