Math 151, Fall 2005, Day 8, Mon. Sept. 12.  Hit reload ... After class

Day 8 (Mon. Sept. 12): Reading: (Re)read D&V Ch. 5 (the Re-expressing paragraph, p. 68, is optional, but don't miss any of What Can Go Wrong, pp. 67-8).  (Re)do AS Ch. 5.
Changing units and Normal dist.:  Start D&V Ch. 6, pp. 82-98.  (Normal Prob. Plots, pp. 94-95, is optional, but don't miss What Can Go Wrong, bottom of p. 95.)  Also AS Ch. 6, in order.  Changing units: D&V 84-5, AS 6-1 ¶activities 1&2
Hand in (all D&V Ch. 5, p. 72ff, except as noted)
Mean/Median.  (from Day 4)
7a, b, c Payroll.  Also, with c: What measure would be most useful if you wanted to use it to figure the total weekly payroll cost?
6 Sick days
+ + + + + + + + + + 
(Problems Continued from Day 7):
19, 20 (no computations needed.  19 d may not be decidable from pictures.  Don't worry about it.)
5 Mistake 
9 Standard deviation First, make  dot plots of each pair on axes with the same unit size, find the mean of each set and mark it with a little ^ (like fig. 5.6 p. 64).  Notice this looks like a good balance point. Leave space to calculate  some standard deviations next time.  Also, make a dot plot of  #10b set 2 (10, 50, 60, 70, 110).  Which of the data sets in problem 9 does it most resemble? 

9 Standard deviation, finishing.  You made dot plots of each pair on axes with the same unit size, found the mean of each set and marked it with a little ^ (like fig. 5.6 p. 64).  Note this is the balance point.  Which of each pair has the bigger "spread"?  Calculate standard deviations by hand for part a; check b and c in the back of the book.  Also, dotplot #10b set 2 (10, 50, 60, 70, 110) and calculate its mean and standard deviation.  Verify that each number w in #10b set 2 is the number x in #9b set 1 less 9, multiplied by 10.  (w = 10(x-9))
& & & & & & & & more:
Review probs p110#28 Pay
p. 77 #33 (use SPSS)
Postpone this only: Use the Dotplot Tool (AS5-3--see Day 7 for details.) 
= = = = = = = = = = = = = = = = = 
Do these, using shift & rescale (D&V 84-85)
A) The U.S. is almost the only country left that uses Fahrenheit to measure temperatures. To change F to C (Celsius), you subtract 32 and divide by 1.8.  HANDOUT with both scales ("Alias").  Keep the handout.
a)  The low temperature a few days ago was 50° F.  Calculate the temperature in C.  (Check your calculation on the handout scale.)
b)  If the mean low temperature in Ithaca during Sept. is 40° F, and the standard deviation is 10° F, and you want those in Celsius instead, what do you do?  Calculate the results.  (Check your results on the handout scale.)

B) See Ch. 5 p. 72 #9 and 10, above.  Pair #9c is a shift.  Check that the mean shifts correctly and that the s.d. stays the same (use the back of the book and your picture).  Pair #9b set 1 and #10b set 2 is a shift followed by a rescale (w = 10(x-9)).  Check that the mean undergoes the shift and the rescale, but the s.d. undergoes just the rescale.
Ch6 p. 99: 1 Payroll (hint for d: each employee gets 110% of previous) 
3 (SATtoACT)
Do everything above here:  Postpone the following:
C) Complete the Handout: Tables for simple models (densities)

Read,  to discuss 
http://www.whfreeman.com/scc or http://bcs.whfreeman.com/ips5e.  Under Student Categories or Student tools, choose "Statistical Applets", then Mean & Median (50 points max.).  Check out symmetric distributions, skewed distributions, and distributions with outliers.  How far apart can you get the mean and median?

13 Marriage age.  Ithaca Journal Jan 22, '05 had quiz answers: "How old is the average bride? 24.5 years.... How old is the average groom? 26.5 years." Give some reasons that could account for the big difference between these numbers and the graphed numbers 
38 Holes What is the problem here? The slow method looks better, mostly, but the summary values are worse! (Examine the data.) 

p.99 2e(effect of outlier) 
Do everything above here
 = = = = = = = = = = 

Optional
HW, SPSS questions?
  Handouts: F-to-C scale
     (  Optional: SPSS handout to create new computed variables.)
    Tables for simple models (densities)
Summaries of Middle & Spread continued--"Systems:"
-- (Midrange, Range: very sensitive to outliers--they use only the max and min!)
-- Median, IQR (+ quartiles Q1, Q3; 5-number summary), based on percentiles (the j'th percentile is greater than j% of the data)
-- Mean, Standard Deviation: "y-bar" (or "x-bar"), "s"

Standard deviation (measure of Spread that goes with mean)  See Day 7
     Very sensitive to outliers--they contribute much more than their share to the Sum of Squared Deviations from the Mean.
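
A small sketch of that sensitivity, using made-up numbers (the two data sets and the function names are assumptions for illustration, not from D&V).  It compares median, mean, and s with and without one outlier, and shows what share of the Sum of Squared Deviations each value contributes.

```python
import statistics

# Hypothetical data (not from the textbook): five values, then the same
# values with the largest one replaced by an outlier.
data = [4, 5, 6, 7, 8]
with_outlier = [4, 5, 6, 7, 38]

def summarize(values):
    """Return (median, mean, sample standard deviation)."""
    return (statistics.median(values),
            statistics.mean(values),
            statistics.stdev(values))   # stdev divides by n - 1

def squared_deviation_shares(values):
    """Fraction of the sum of squared deviations contributed by each value."""
    mean = statistics.mean(values)
    squares = [(v - mean) ** 2 for v in values]
    total = sum(squares)
    return [sq / total for sq in squares]

shares = squared_deviation_shares(with_outlier)

print("without outlier:", summarize(data))
print("with outlier:   ", summarize(with_outlier))
print("shares of SS:   ", [round(f, 3) for f in shares])
```

In this made-up case the median does not move at all, while the mean doubles and the single outlier accounts for most of the Sum of Squared Deviations.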

Mean and Standard Deviation are for Symmetric Unimodal  distributions without big outliers.
   (ideally "Bell-shaped" = Normal)
- - - - - - - - - - - - - - - - - - - - - - - - -
D&V Ch. 6, AS 6
Standardizing an observation or value
New ruler:
Make the mean the baseline (0) and measure in units one standard deviation wide.
Standardized value = "z-score" = # of standard deviations above the mean
 "raw" y becomes z = (y - ybar)/s   (p. 83)
 Find z:  Subtract the mean from y.  Now you know how far "above" the mean y is, in "raw" units.  (If it's below the mean, the number will be negative.)  We "shifted" it.  Find how far this is in "standard deviations" by dividing by the standard deviation.  (We "rescaled" it.)  That's the z-score.
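
The two steps can be sketched in a few lines of Python.  The heights below are made up, and the name z_score is just a label chosen for this sketch; s is the sample standard deviation (dividing by n - 1).

```python
import statistics

# Hypothetical heights in inches (made up for illustration).
heights = [62, 64, 66, 68, 70]

ybar = statistics.mean(heights)    # the mean, "y-bar"
s = statistics.stdev(heights)      # sample standard deviation, "s"

def z_score(y, mean, sd):
    """Shift by the mean, then rescale by the standard deviation."""
    return (y - mean) / sd

z_values = [z_score(y, ybar, s) for y in heights]

print("mean:", ybar, " s:", round(s, 3))
print("z-scores:", [round(z, 3) for z in z_values])
```

Values below the mean get negative z-scores, the value equal to the mean gets 0, and the z-scores themselves have mean 0 and standard deviation 1.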

Changing units: (D&V 84-85, AS 6-1 ¶paragraphs 1&2)
Variable: your Heights.  Units = inches.  Change this:
  1)  Shift: Take 5 feet = 60" as the new baseline: 60" =0 inches above 5 feet.  How?  Subtract 60 from each value. y-60.
  2)  Rescale: Change to cm.  How?  1" = 2.54 cm.  Multiply each value by 2.54: y*2.54.  (Multiply or divide?  You need more centimeters for the same length, so multiply.  Or a non-American might know 1 cm = .394 inches, and divide by .394, the length of a cm measured in inches.)
    (+ or -: shift) Measures of middle shift along with the raw data.  Measures of spread are unaffected by adding or subtracting.
    (x or /: rescale) Measures of middle and of spread stretch or shrink along with the raw data.  (We assume we only multiply or divide by positive numbers.)
To recalculate:  Do the same thing to measures of middle as you do to raw data.
                        To spreads, just do the multiplying or dividing part.
 Shapes  (skewness, humps, clumps, outliers) are not affected by shifts and rescaling.
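
A quick check of these rules, as a sketch on made-up heights (the data and variable names are assumptions for illustration).  It applies the shift (subtract 60) and the rescale (multiply by 2.54) from the height example and recomputes the mean and s each time.

```python
import statistics

heights_in = [62, 64, 66, 68, 70]   # hypothetical heights, in inches

shifted = [y - 60 for y in heights_in]         # inches above 5 feet
rescaled_cm = [y * 2.54 for y in heights_in]   # the same heights in cm

def mean_and_sd(values):
    """Return (mean, sample standard deviation)."""
    return statistics.mean(values), statistics.stdev(values)

mean_in, sd_in = mean_and_sd(heights_in)
mean_shift, sd_shift = mean_and_sd(shifted)
mean_cm, sd_cm = mean_and_sd(rescaled_cm)

print("raw:     ", mean_in, round(sd_in, 3))
print("shifted: ", mean_shift, round(sd_shift, 3))   # mean moves, s does not
print("rescaled:", round(mean_cm, 2), round(sd_cm, 3))  # both multiplied by 2.54
```

The shifted mean is the raw mean minus 60 with the same s; after rescaling, both the mean and s are 2.54 times the raw values.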
Start here Wednesday
&&Alias/alibi:  When you change units of measurement for all your data values, you can think of the result 2 different ways:
    "Alias (other name)":  The data distribution sits still. You have just changed the ruler stick you measure by.
             (in/cm ruler.  Thermometer)
    "Alibi (other place)" :  The ruler stick keeps the 0 the same and 1 the same width, and the data distribution with "new" values moves to the new location.  D&V pp. 84-5.
  Optional SPSS handout to create new computed variables.
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
GET  handout HW sheet: "Tables for simple models (densities)"
Models for quantitative variables    (AS6-2 ¶1)
(When values can take on any of a continuous interval of numbers)
Example:  Spinner:  Label the edge with continuous values from 0 to 1.  Spinning should put 1/10 of all spins in each of the 10 equal colored sectors.  Simulations of 500 and 3000 spins show this is roughly true.  More spins would get closer to the Uniform shape.
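
A rough sketch of such a simulation, assuming 10 equal sectors and using Python's built-in random number generator in place of a real spinner.  The function name, the fixed seed, and the extra run of 100000 spins are choices made for this sketch, not part of the class activity.

```python
import random

def spin_proportions(n_spins, n_sectors=10, seed=0):
    """Simulate n_spins of a spinner labeled 0 to 1 and return the
    proportion of spins landing in each of n_sectors equal sectors."""
    rng = random.Random(seed)       # fixed seed so a run can be repeated
    counts = [0] * n_sectors
    for _ in range(n_spins):
        value = rng.random()        # uniform on [0, 1)
        # Sector index 0..n_sectors-1; min() guards against rounding up.
        sector = min(int(value * n_sectors), n_sectors - 1)
        counts[sector] += 1
    return [c / n_spins for c in counts]

for n in (500, 3000, 100000):
    props = spin_proportions(n)
    print(n, [round(p, 3) for p in props])
```

Each printed row is a relative-frequency histogram with 10 bars; the bars should all be near 0.1, and the rows with more spins should look flatter.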

Abstraction, idealized histogram ("Probability Model") =
Density curve. Describes a theoretical distribution of data.
Any such model is a curve
   --always on or above the horizontal axis
   --has area exactly 1 underneath it.

Many, many models are possible, modeling many phenomena.  (Histograms of data for some models)
Median, mean, percentiles, standard deviation are defined for a density model in analogy to those for a histogram.
-- median has half of area below and half above.
-- mean is balance point.  On the long-tail side of median if distribution is skewed. Same as median if symmetric.
--First quartile has 1/4 of area below, 3/4 above. Etc. for others.
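
To make these definitions concrete, here is a sketch for one simple skewed model.  The density f(x) = 2x on 0 to 1 (a triangle) is an assumption chosen for illustration and is not necessarily one of the handout models; the function names are likewise made up.

```python
import math

# Hypothetical model: density f(x) = 2x for 0 <= x <= 1, a triangle with
# base 1 and height 2, so the total area is 1.  Long tail is on the left.

def density(x):
    return 2 * x if 0 <= x <= 1 else 0.0

def area_below(x):
    """Area under the density to the left of x (triangle: x * 2x / 2)."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return x * x

def percentile(p):
    """Value with proportion p of the area below it: solve x * x = p."""
    return math.sqrt(p)

def mean_by_slices(n_slices=100000):
    """Balance point: add up (midpoint * area) over many thin slices."""
    width = 1 / n_slices
    total = 0.0
    for i in range(n_slices):
        mid = (i + 0.5) * width
        total += mid * density(mid) * width
    return total

total_area = area_below(1)
q1, median, q3 = percentile(0.25), percentile(0.50), percentile(0.75)
mean = mean_by_slices()

print("total area:", total_area)
print("Q1, median, Q3:", round(q1, 3), round(median, 3), round(q3, 3))
print("mean:", round(mean, 3))
```

For this model the mean (about 0.667) is smaller than the median (about 0.707), that is, on the long-tail side, as the note on the mean says.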

Numerical summary: (D&Vp.86)
Statistic  (from data):    xbar    s        Q1   Median   Q3
Parameter  (for model):    µ       sigma    Q1   Median   Q3

Many models have tables to describe them, especially percentile tables showing the area to the left of (below) a given value
= theoretical proportion of observations below the value.  (If 30% of the area is below x, then x is the 30th percentile.)
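
As an illustration of what such a table looks like, here is a sketch for the Uniform spinner model on 0 to 1, where the area to the left of x is just x.  The function name and the choice of steps of 0.1 are assumptions for this sketch; the handout tables may be laid out differently.

```python
def uniform_area_below(x, low=0.0, high=1.0):
    """Area under the flat (Uniform) density on [low, high] to the left of x."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

# Small table: each value and the area below it.
table = [(k / 10, uniform_area_below(k / 10)) for k in range(11)]

print("value   area below")
for value, area in table:
    print(f"{value:5.1f}   {100 * area:5.1f}%")
```

Reading the table backwards gives percentiles: the row with 30% of the area below it shows that 0.3 is the 30th percentile of this model.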

  • You will make and use tables for the simple models on the handout.  These are similar to the table we will use to describe the normal model.  (Table Z, appendix E, p. A-50)


  • Sievers home  Math151-Fall05/Days8.htm  10pm 9/11/05
    This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.