Math 151, Fall 2002, Day 3, Wednesday Sept. 4. Hit reload to get the most current version.

Today at 12:30, Macmillan 101--introduction to Activstats.  All welcome--better if you bring Walkman-type headphones.
Amanda's hours on the Helpers page.  Classlist posted, check yours!
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
HW assignment  Day3, Wed. Sept 4
From David S. Moore, The Basic Practice of Statistics, unless otherwise noted.
Reading:   rest of 1.1, 1.2: to p. 32 for this hw.
Ahead: For next assignment, day 4:  5-number summary and boxplots, to p. 37,
        +  annotated 5-number summary page handout (handed out today),
        +   ( standard deviation & summary), p. 37- 42.
Do the means and medians required here by hand (with a calculator).  Make the timeplot(s) by hand.
Hand in 
(review: p. 14, 1.8)
p. 19 1.10 (time: trend&cycles)
Make a timeplot of McGwire's HRs (data p. 28, or p. 23, 1.19).  Any trend?

p. 32 1.28 (C-sec. mean and med.)
   1.29 (rich: mean or med?)
p. 45, 1.48 (mean or median?)

Read, to discuss

p. 69 1.74 (hospital discharges)
 
 

p. 45, 1.46 (net worth) & 1.47 (athletes)
 http://www.whfreeman.com/scc Choose "Statistical Applets", Mean & Median. Check out symmetric and skewed distributions, and distributions with outliers.

Optional 
(review: p. 14, 1.7  describe lightning, Shakesp.)
p. 22ff, 1.21 (time: flu-lag)
(Activstats:  Finish Ch.3-4, Do 4-1 Center (Midrange isn't in Moore). 4-2 Spread lightly (Get concepts: you can postpone calculating standard deviation, SPSS.  We'll learn to do quartiles by hand first, then s.d.). )
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Sec 1.1, cont.
Data:  Numbers (usually) in context:  What, Who (how many), Why?  When and Where? How?
     Hair color:  influenced by how asked?  (differences between attendance sheet and info sheet).
        Spelling:  "Code" to numbers for computer.  Open-ended vs. list of choices.
        When?  class has changed since this compilation.
Distribution of one variable:  what values, how many (or what proportion) of each.
Graphical summaries of data: Area represents proportion.
     Categorical:(Bar chart ordered by size = "Pareto chart"--not in text)
      Quantitative: Shape (symmetric, skewed (think smeared, or sliding) right or left),
             (Humps:  unimodal, bimodal, or multimodal.   Two humps = two "causes"?)
              Center, spread--rough eyeballing--specific measures next
What do we see?  What can we infer? (Introduction)
    Data source? Lurking variables?
    Variability happens.  Things settle down on average, BUT inferences are never certain.
    Statistics gives us a language for talking about uncertainty.
HW questions?

Time plot (pp. 17-19): Time on horiz. axis, values on vertical.  Trend? (general slope up or down). Cycles?
  --Beware of extrapolation --predicting a time trend into the future.
  -- Research data: time, or order of taking measurements, is often a lurking variable.  Always do a time plot.

Section 1.2:  Summarizing distribution info with numbers
Measures of middle (central tendency)
        --Colloquially "average" can refer to any measure of middle, so watch out; be more specific.
    Mean (most common "average"):  Take sum (aggregate) of all observations and divide by how many (n)
        Metaphors.  1) Center of gravity, balance point of histogram.
                2) Slice off bits from the big and add to the little till everyone has the same.
                    (Or "aggregate"--total-- it all and portion it out.)
        Outlier or long tail will pull mean in that direction (think seesaw balancing)  "Sensitive" to outliers, skewness.
        Especially useful: 1) For symmetric, tidy distributions
            2) When metaphor 2 makes sense--looking for "fair share" of a total.
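
A minimal Python sketch of the mean and of how one outlier pulls it. The numbers are made up for illustration (they are not from Moore), and the function name `mean` is our own choice.

```python
def mean(observations):
    """Sum (aggregate) of all observations divided by how many there are (n)."""
    return sum(observations) / len(observations)


# Hypothetical data, invented for illustration only.
tidy = [2, 3, 4, 5, 6]
with_outlier = [2, 3, 4, 5, 26]  # same data, largest value replaced by an outlier

print(mean(tidy))          # 4.0
print(mean(with_outlier))  # 8.0
```

With the outlier, the "fair share" is 8 apiece, which is more than four of the five observations actually have. That is the seesaw effect: one far-out value drags the balance point toward itself.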
    Median: half are bigger, half are smaller
        Point on histogram with half the area to the left, half to the right.
        Calculating:  Put observations in numerical order (stemplot!).
                            Middle one if n is odd, or average the 2 middle  if n is even.
                Formula:  Count in how far?  (n+1)/2 places in the ordered list. This gives the median's position, not its value.  (7 1/2 places? Go halfway = average the 7th and 8th observations.)
        "Resistant to skewness and outliers"--trimming off ends will make little difference in median value.
        More "typical" than mean, if there is skewness or outliers.
     (Badly bimodal distribution--"middle" doesn't mean much.)
    Symmetric distribution: mean = median (roughly symmetric: mean and median close together).
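
The (n+1)/2 counting rule above can be sketched in Python as follows. This is only an illustration: the function name `median` and all the data are our own, made up to show the odd case, the even case, and resistance to an outlier.

```python
def median(observations):
    """Median by the counting rule: position (n + 1) / 2 in the ordered list."""
    ordered = sorted(observations)       # put observations in numerical order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of no observations is undefined")
    middle = n // 2                      # 0-based index of the upper middle value
    if n % 2 == 1:
        return ordered[middle]           # n odd: the single middle observation
    # n even: average the two middle observations
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical data, invented for illustration only.
tidy = [5, 2, 6, 3, 4]
with_outlier = [5, 2, 26, 3, 4]          # 6 replaced by an outlier
fourteen = list(range(1, 15))            # n = 14, so (n + 1) / 2 = 7.5

print(median(tidy))          # 4
print(median(with_outlier))  # 4
print(median(fourteen))      # 7.5
```

Replacing the largest value with an outlier leaves the median at 4, which is what "resistant" means. With 14 observations the rule says to count in 7 1/2 places, so the sketch averages the 7th and 8th ordered values.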
Author's website http://www.whfreeman.com/scc Select a Category, choose "Statistical Applets", Mean & Median. Check out symmetric and skewed distributions, and distributions with outliers.

Measures of Spread (dispersion, variability)
    Range:  largest - smallest.   Resistant?  NO!  Two observations carry all the info; the rest could be anywhere.
Dot plots of 3 distributions, all with the same range:

.       .
.       .
.       .
.       .
__________

.........
__________

    ..
    ..
.   ..  .
__________

We need measures of spread that will better take into account all the observations:
    Quartiles, five-number summaries, boxplot, InterQuartile Range. (HANDOUT)
    Variance, Standard deviation.
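
A rough Python sketch of why the range is not enough. The values are read off the three dot plots above, assuming each character column is one unit (0 through 8); that reading is our assumption, not data from the text. The sketch uses `statistics.stdev` from Python's standard library, which computes the sample standard deviation (divisor n - 1). The helper name `value_range` is hypothetical.

```python
from statistics import stdev


def value_range(observations):
    """Range: largest observation minus smallest."""
    return max(observations) - min(observations)


# Values assumed from the dot plots, one unit per character column.
distributions = {
    "piled at the ends": [0, 0, 0, 0, 8, 8, 8, 8],
    "spread evenly": [0, 1, 2, 3, 4, 5, 6, 7, 8],
    "piled in the middle": [0, 4, 4, 4, 5, 5, 5, 8],
}

for name, data in distributions.items():
    print(f"{name}: range = {value_range(data)}, s = {stdev(data):.2f}")
# piled at the ends: range = 8, s = 4.28
# spread evenly: range = 8, s = 2.74
# piled in the middle: range = 8, s = 2.20
```

All three have range 8, but the standard deviation shrinks as more observations sit near the center, because it uses every observation rather than just the two extremes.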


Sievers home  Math151-Fall02/Day-3.htm  10p  9/3/02
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.