Math 151, Spring 2004, Day 3, Friday Feb. 6.  Hit reload to get the most current version.

Class email list: Math151@wells.edu. (Wells email only)  Tell me if you didn't get the notice.
Class list is up:  Class members,  check your entry (email me if not ok.)
HW assignment  Day3  (From Moore, The Basic Practice of Statistics, 2nd ed, unless otherwise noted.)
Reading:   rest of 1.1, 1.2: to p. 32 for this hw.
Ahead: For next assignment (Day 4):  5-number summary and boxplots, to p. 37,
        +  annotated 5-number summary page handout (handed out today),
        +   ( standard deviation & summary), p. 37- 42.
Do the means and medians required here by hand (with a calculator).  Make the timeplot(s) by hand.
Hand in 
(review: p. 14, 1.8)

p. 19 1.10 (time: trend&cycles)
Make a timeplot of McGwire's HRs (data p. 28, or p. 23, 1.19).  Any trend?

p. 32 1.28 (C-sec. mean and med.)
   1.29 (rich: mean or med?)
p. 45, 1.48 (mean or median?)

Read, to discuss (be able to answer in class)

p. 69 1.74 (hospital discharges)
 
 

p. 45, 1.46 (net worth) & 1.47 (athletes)
 http://www.whfreeman.com/scc Choose "Statistical Applets", Mean & Median. Check out symmetric, skewed, distributions with outliers.

Optional 
(review: p. 14, 1.7  describe lightning, Shakesp.)
p. 22ff, 1.21 (time: flu-lag)
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Sec 1.1, cont.
Data:  Numbers (usually) in context:  What, Who (how many), Why?  When and Where? How?
     Hair color:  influenced by how asked?
        Spelling:  "Code" to numbers for computer.  Open-ended vs. list of choices.
        When?  Class has changed since this compilation; not all data was there last time.
Distribution of one variable:  what values, how many (or what proportion) of each.
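
To make "what values, how many (or what proportion) of each" concrete, here is a minimal Python sketch. The hair-color responses are invented for illustration (they are not the class data), and the function name distribution is just a label chosen here.

```python
from collections import Counter

def distribution(values):
    """Return {value: (count, proportion)} for one categorical variable."""
    counts = Counter(values)
    n = len(values)
    return {v: (c, c / n) for v, c in counts.items()}

# Hypothetical responses, invented for illustration (not the class data).
hair = ["brown", "blond", "brown", "black", "brown", "red", "blond", "brown"]
dist = distribution(hair)

# One line per value: the value, how many, and what proportion.
for value, (count, prop) in sorted(dist.items()):
    print(f"{value:6s} {count:2d}  {prop:.3f}")
```

The proportions always add up to 1, which is why a graph can let area stand for proportion.
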
Graphical summaries of data: Area represents proportion.
      Pretest:  Restate #5 as histogram of 100 "5-volt" batteries tested for actual voltage.
              The proportion with voltage < 1 is 20% (20 of the 100).  The proportion with voltage < 3 is 60%.
               a) What proportion have voltage between 1 and 3?  b) What proportion have voltage > 3?
      Quantitative: Shape (symmetric, skewed (think smeared, or sliding) right or left),
             (Humps:  uni- or bi- modal (multi-)   Two humps = two "causes"?)
              Center, spread--rough eyeballing--specific measures next
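
The battery pretest question above is just subtraction of areas. A short sketch, assuming the 20 and 60 are read as counts out of the 100 batteries (so 20% and 60%), and ignoring any battery at exactly 1 or 3 volts:

```python
n = 100        # batteries tested
below_1 = 20   # batteries with voltage < 1
below_3 = 60   # batteries with voltage < 3

# a) Between 1 and 3: everything below 3, minus what is below 1.
between_1_and_3 = (below_3 - below_1) / n

# b) Above 3: everything that is not below 3.
above_3 = (n - below_3) / n

print(between_1_and_3, above_3)
```

On the histogram, these are the areas of the bars between 1 and 3, and of the bars to the right of 3.
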

What do we see?  What can we infer? (Introduction)
    Data source? Lurking variables?  (pulse: stair climb)
    Variability happens (Skewed heights this term).  Things settle down on average, BUT inferences are never certain.
    Statistics gives us a language for talking about uncertainty.
HW questions?

Time plot. (pp. 17-19) Time on horiz. axis, values on vertical.  Trend? (general slope up or down).  Cyclic?
  --Beware of extrapolation --predicting a time trend into the future.
  -- Research data: time, or order of taking measurements, is often a lurking variable.  Always do a time plot.
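
The homework time plots are to be made by hand, but a rough text version shows the layout: time runs left to right, values run bottom to top. This is only a sketch; the function name text_timeplot and the yearly counts are made up for illustration.

```python
def text_timeplot(values, height=5):
    """Return text rows: time runs left to right, value runs bottom to top."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    # Row for each value, from 0 (bottom) to height - 1 (top).
    levels = [round((v - lo) / span * (height - 1)) for v in values]
    rows = []
    for level in range(height - 1, -1, -1):  # build the top row first
        rows.append("".join("*" if lv == level else " " for lv in levels))
    rows.append("-" * len(values))  # horizontal (time) axis
    return rows

# Hypothetical yearly counts, invented for illustration only.
counts = [12, 15, 14, 19, 22]
rows = text_timeplot(counts)
for row in rows:
    print(row)
```

The stars climb from lower left to upper right, which is what an upward trend looks like on a time plot.
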

Section 1.2:  Summarizing distribution info with numbers
Measures of middle (central tendency)
        --Colloquially "average" can refer to any measure of middle, so watch out; be more specific.
    Mean (most common "average"):  Take sum (aggregate) of all observations and divide by how many (n)
        Metaphors.  1) Center of gravity, balance point of histogram.
                2) Slice off bits from the big and add to the little till everyone has the same.
                    (Or "aggregate"--total-- it all and portion it out evenly.)
        Outlier or long tail will pull mean in that direction (think seesaw balancing)  "Sensitive" to outliers, skewness.
        Especially useful: 1) For symmetric, tidy distributions
            2) When metaphor 2 makes sense--looking for "fair share" of a total.
    Median: half are bigger, half are smaller
        Point on histogram with half the area to the left, half to the right.
        Calculating:  Put observations in numerical order (stemplot!).
                           Middle one if n is odd, or average the 2 middle  if n is even.
                Formula:  Count in how far?  (n+1)/2 places.  (7 1/2 places? go halfway =average the 7th and 8th observations)
        "Resistant to skewness and outliers"--trimming off ends will make little difference in median value.
        More "typical" than mean, if there is skewness or outliers.
     (Badly bimodal distribution--"middle" doesn't mean much.)
    Symmetric distribution: mean = median
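
The recipes above for mean and median, including the (n+1)/2 counting rule, can be sketched in Python as follows. The data values are invented for illustration; the second list adds one large outlier to show which measure is pulled and which is resistant.

```python
def mean(xs):
    """Sum of all observations divided by how many there are (n)."""
    return sum(xs) / len(xs)

def median(xs):
    """Middle observation in sorted order; average the two middle ones if n is even."""
    ordered = sorted(xs)
    n = len(ordered)
    position = (n + 1) / 2  # count in this many places (1-based); ends in .5 if n is even
    if n % 2 == 1:
        return ordered[int(position) - 1]
    lower = ordered[n // 2 - 1]  # the observation just below the halfway position
    upper = ordered[n // 2]      # the observation just above it
    return (lower + upper) / 2

# Hypothetical observations, invented for illustration.
data = [2, 3, 3, 4, 5, 6, 7]
with_outlier = data + [50]

print(mean(data), median(data))
print(mean(with_outlier), median(with_outlier))
```

Adding the single value 50 moves the mean from about 4.3 to 10, while the median only moves from 4 to 4.5.
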
Author's website http://www.whfreeman.com/scc Select a Category, choose "Statistical Applets", Mean & Median. Check out symmetric, skewed, distributions with outliers.

Measures of Spread (dispersion, variability) next:
    Range:  largest - smallest.   Resistant?  NO!  Two observations carry all the info; the rest could be anywhere.
Dot plots of 3 distributions, all with same range:
.        .
.        .
.        .
.        .
__________

..........
__________

    ..
    ..
.   ..   .
__________
We need measures of spread that will better take into account all the observations:
    Quartiles, five-number summaries, boxplot, InterQuartile Range. (HANDOUT)
    Variance, Standard deviation.
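
To check the point about the range, the three dot plots above can be written as lists of numbers. Reading each dot's column as a position from 0 to 9 is an assumption made here for illustration; the plots themselves carry no scale.

```python
def data_range(xs):
    """Range: largest observation minus smallest."""
    return max(xs) - min(xs)

# Positions 0..9 are an assumed scale read off the dot plots.
ends_only = [0] * 4 + [9] * 4           # all observations at the two extremes
evenly_spread = list(range(10))         # one observation at each position
mostly_middle = [0, 4, 4, 4, 5, 5, 5, 9]  # one at each end, the rest in the middle

for name, xs in [("ends only", ends_only),
                 ("evenly spread", evenly_spread),
                 ("mostly middle", mostly_middle)]:
    print(name, data_range(xs))
```

All three give a range of 9, even though the observations sit in very different places. Only the two extreme observations enter the calculation.
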



Sievers home  Math151-Sp04/Days3.htm  10pm 2/5/04
This page belongs to Sally Sievers who is solely responsible for its content.