Math 151, Spring 2005, Day 16, Mon. March 7. Corrected after class.

Day 16 (Mon. March 7): Reading: Read D&V Ch8 & Ch9 thru 165 top, Do AS8 Regression.
      Ahead, rest of Ch9,  (AS9, lightly)
Hand in Wed.
A. Open the Excel file RegressionSlope (also in the folder RegressionDemosExcel for D&V in ClassMaterial\Math151 D&V).  Change x-y values in the yellow boxes and watch the line change.  Change x-values in col. F and watch the "run" (red line) change, in the rightmost 2 graphs.  Notice the slope = the coefficient of x = the rise/run = increase in y per unit increase in x.  Fix it so the increase in x (the "run") is exactly 1.  Also, look at the leftmost graph, where the lengths of the standard deviations are shown, and note that in standard-deviation units, the rise is r s.d.s in y for each s.d. run in x.  Print the page to hand in.

B. Practice fitting lines:  Use the Moore website ("Do this," bottom of Day 14) and try to fit at least 4 different data sets.  Write down on your paper what you discovered (were your judgment errors consistent in any way? Did you have any surprises?)
(All D&V p. 153ff unless otherwise noted)
9 Real estate (I think it should be "-6000" in part d)
24 Veggie burger 
36 a thru d Gators
21(SPSS) a,b,c &23 Used cars
------
1 a, b  line equation 
17 SAT scores
26 Chicken (y = calories, x = fat)  a thru f only.
- - - - - - - - - - - - -Postpone rest
C.  Use Residuals.xls from here or the lab (in ClassMaterial\Math151 D&V\RegressionDemosExcel for D&V) to graph these data sets, along with a graph of the residuals.  Print the results, and describe the shape of the residuals (it may help to connect the dots with pencil, to see the pattern).
   a)  x 1 2 8 4 6 9 
       y 1 3 6 6 7 5 
   b) x 1 2 7 4 6 9
      y 7 6 2 4 2 1
3 Residuals
32 Birthrates (type the data into SPSS. Make a plot of residuals also, to help with 32) 
SPSS Handout p. 3:  You can do all but #10 at this point. Keep till we finish that.

HW, questions?  What did you see in your circle data?
Regression line (D&V Ch 8&9, AS8&9):  A model that predicts or estimates a y (vertical) value for a given x, using a straight line ("line of best fit," "least squares line").  "Regressing y ON x"
See Day 14
SPSS will fit a regression line to data (back page of handout).  While  Editing graph, Insert>Fit line>Regression.
Get line, equation of line and R^2 (the square of the correlation coefficient).  Govsal on avgpay

Residual:  Look at an individual observed (x,y) data pair.  The residual is the "leftover" amount of y after predicting a y using the line.  Visually, it is the length of the vertical line drawn from the data point to the regression line (+ if the point is above the line, - if the point is below the line).
   Residual = observed - predicted = Data - Model:   e = y - yhat.
       Govsal vs. avgpay:   Govsal = 28,569.69 + 2.71*avgpay    (Predicted governor's salary increases $2.71 for every dollar increase in a state's average pay.)
  Visually, SPSS (handout, p. 3, bottom:  In Edit mode, Insert>Spikes: Spike to: Regression)  Govsal-deviations.spo
     Calculating:  Montana (17895, 55502)
           Predicted Govsal = 28,569.69 + 2.71*17895 = 28,569.69 + 48495.45 = 77065.14
           Residual = 55,502 - 77065 =  -21563,  $21563 below expected value.
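
The Montana calculation can be checked with a few lines of Python. This is only a sketch: the coefficients and the Montana values are the ones printed above, while the function names predict_govsal and residual are made up for this illustration.

```python
# Coefficients as printed in the notes (Govsal regressed on avgpay).
B0 = 28569.69   # intercept, in dollars
B1 = 2.71       # slope: dollars of governor's salary per dollar of average pay

def predict_govsal(avgpay):
    """Predicted governor's salary from the fitted line (hypothetical helper)."""
    return B0 + B1 * avgpay

def residual(observed, predicted):
    """Residual = observed - predicted; negative when the point is below the line."""
    return observed - predicted

# Montana: (avgpay, Govsal) = (17895, 55502)
montana_avgpay, montana_govsal = 17895, 55502
yhat = predict_govsal(montana_avgpay)
e = residual(montana_govsal, yhat)
print(f"predicted = {yhat:.2f}, residual = {e:.2f}")
```

The result agrees with the hand calculation: a predicted salary of about 77065 and a residual of about -21563.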

Extrapolation:  (p. 148 & 163-5) Using the line to predict for x's outside the range of the data:  The association may change away from what you have data for.  Be cautious, especially in predicting far into the future.

The Regression line equation:
    If we standardize both the x-values and the y-values, the slope will just = r !   zyhat = r zx
     And the line will pass through (0,0), so the intercept is 0.  (This is the point given by the two means, (xbar, ybar), in the original graph.)
         govsalstd.sav govsalstd.spo .   (also in SPSS for Class 05 folder)
     Also Excel,     RegressionSlope.xls in  ClassMaterial\Math151 D&V\RegressionDemosExcel for D&V
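
For anyone who would rather check the standardized-slope fact numerically than in SPSS or Excel, here is a minimal Python sketch. The data values are invented for illustration (they are not from the Govsal files), and the helper names zscores, least_squares and correlation are assumptions of this sketch.

```python
from statistics import mean, stdev

# Hypothetical illustration data (not from the course files).
x = [2, 4, 5, 7, 10]
y = [3, 4, 7, 8, 12]

def zscores(values):
    """Standardize: subtract the mean, divide by the (n-1) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

def correlation(xs, ys):
    """r = sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    return sum(a * b for a, b in zip(zscores(xs), zscores(ys))) / (n - 1)

# Fit the line to the standardized values and compare its slope with r.
b0_z, b1_z = least_squares(zscores(x), zscores(y))
r = correlation(x, y)
print(f"slope of standardized line = {b1_z:.4f}, r = {r:.4f}, intercept = {b0_z:.4f}")
```

Up to rounding, the slope of the line fitted to the z-scores matches r and the intercept is 0.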

To find the equation yhat =  b0 + b1 x  in "real" units:  calculate the "coefficients" b1, b0
   b1 : A change of one standard deviation in x corresponds to a change of r standard deviations in y, along the regression line.
 The slope b1 expresses change in y-units per x-unit. (Suppose x is inches, y is pounds. Then b1 is in pounds per inch.) You can find b1 by multiplying r by the standard deviation of the y's (that's in pounds) and dividing by the standard deviation of the x's (that's in inches).  In algebra (p. 140)
            b1  = r times (s.d. of y)/(s.d. of x)
   b0 :  The line goes through (xbar, ybar).  If you know this, you know ybar = b0+ b1(xbar).  You can solve this for b0 ,
           b0 = ybar - b1(xbar).
  So, if you have the means and standard deviations and r, you can find the regression equation.
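
The two formulas translate directly into code. The sketch below assumes made-up summary statistics for the inches-and-pounds example (they are not from the text), and line_from_summary is a hypothetical helper name.

```python
def line_from_summary(xbar, ybar, sx, sy, r):
    """Return (b0, b1) for yhat = b0 + b1*x from the means, s.d.s and r."""
    b1 = r * sy / sx          # slope: r times (s.d. of y)/(s.d. of x)
    b0 = ybar - b1 * xbar     # intercept: the line goes through (xbar, ybar)
    return b0, b1

# Hypothetical summary statistics: x = height in inches, y = weight in pounds.
xbar, sx = 68.0, 3.0
ybar, sy = 160.0, 25.0
r = 0.6

b0, b1 = line_from_summary(xbar, ybar, sx, sy, r)
print(f"yhat = {b0:.2f} + {b1:.2f} x")
```

With these invented numbers the slope works out to 5 pounds per inch and the intercept to -180 pounds. Plugging x = xbar back into the equation returns ybar, as it should.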
See p. 141, text. That example is incomplete:  they found the b's but didn't write the equation:
    Av.cost-per-person = 2,266.61 -36.21 Peak-fwy-speed.   Check units.
P.142  slope = -36.21 $/mph:  For every mph increase in peak freeway speed, there is a decrease in cost of $36.21 per person.  Or: "Traffic delays cost each urban area resident about $36 for every mph the freeways are slowed at peak period."
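
One way to see the slope interpretation is to predict at two speeds that are 1 mph apart and subtract. In this sketch the coefficients are the ones quoted above; the speeds 40 and 41 mph and the function name predicted_cost are chosen only for illustration.

```python
# Coefficients from the text's example, as written in the notes.
B0 = 2266.61   # dollars per person
B1 = -36.21    # dollars per person, per mph of peak freeway speed

def predicted_cost(peak_speed_mph):
    """Predicted average cost per person at a given peak freeway speed."""
    return B0 + B1 * peak_speed_mph

# Two hypothetical speeds exactly 1 mph apart.
slow, fast = 40, 41
change = predicted_cost(fast) - predicted_cost(slow)
print(f"change in predicted cost for +1 mph: {change:.2f} dollars per person")
```

The difference is the slope itself, whatever pair of speeds 1 mph apart is used.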
Start here Wednesday
Pattern in graph of residuals:  (p.162) If you graph residual values against x (or against predicted y's), you eliminate visually the linear portion of the association. (The regression line "becomes" the new x-axis; a "shear" transformation)
   Excel Residuals.xls  in  ClassMaterial\Math151 D&V\RegressionDemosExcel for D&V
Curving or other structure may stand out more visibly.  "Good" fit = no structure in residuals.
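
A small numerical illustration of structure showing up in residuals, outside Excel or SPSS: the sketch below fits a least squares line to deliberately curved data. The data (y = x squared for x = 0 to 4) are invented for this example and are not one of the homework data sets.

```python
from statistics import mean

# Hypothetical curved data: y = x squared.
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

# Least squares slope and intercept.
xbar, ybar = mean(x), mean(y)
b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sum((a - xbar) ** 2 for a in x)
b0 = ybar - b1 * xbar

# Residual = observed - predicted, one per data point.
residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
for a, e in zip(x, residuals):
    print(f"x = {a}: residual = {e:+.1f}")
```

The residuals come out as +2, -1, -2, -1, +2: positive at both ends and negative in the middle. That U shape is the curve the straight line missed, and it is easier to spot here than in the original scatterplot.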
SPSS:  (old wing) (Handout bottom p.4&3)  Analyze>Regression>Linear.   Plots button, *ZRESID on *ZPRED. Save button,  Residuals: Unstandardized calculates all the residuals and saves them as a new variable.


Sievers home  Math151-Sp05/Days16.htm 3:15pm 3/7/05
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.