| Hand in Wed.
A. Open the Excel file RegressionSlope (or find it in the folder RegressionDemosExcel for D&V in ClassMaterial\Math151 D&V). Change the x-y values in the yellow boxes and watch the line change. Change the x-values in col. F and watch the "run" (red line) change in the rightmost 2 graphs. Notice that the slope = the coefficient of x = the rise/run = the increase in y per unit increase in x. Fix it so the increase in x (the "run") is exactly 1. Also look at the leftmost graph, where the lengths of the standard deviations are shown, and note that in standard-deviation units the rise is r s.d.s in y for each s.d. of run in x. Print the page to hand in.
B. Practice fitting lines: Use the Moore website ("Do this," bottom of Day 14) and try to fit at least 4 different data sets. Write down on your paper what you discovered (were your judgment errors consistent in any way? Did you have any surprises?)
|
Read, to discuss |
Optional |
Residual: Look at an individual observed (x, y) data pair. The residual is the "leftover" amount of y after predicting y using the line. Visually, it is the length of the vertical line drawn from the point to the regression line (+ if the point is above the line, - if the point is below the line).
Residual = observed - predicted = Data - Model: e = y - yhat.
Govsal vs. avpay:
Govsal = 28,569.69 + 2.71*avgpay (Predicted Governor's salary increases $2.71 for every dollar increase in a state's average pay.)
Visually, SPSS (handout, p. 3, bottom: In Edit mode, Insert>Spikes: Spike to: Regression) Govsal-deviations.spo
Calculating: Montana (17895, 55502)
Predicted Govsal = 28,569.69 + 2.71*17895 = 28,569.69 + 48,495.45 = 77,065.14
Residual = 55,502 - 77,065.14 = -21,563.14, about $21,563 below the expected value.
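The Montana calculation can be checked with a few lines of Python. This is only a sketch: the coefficients and the Montana data pair come from the notes, while the function and variable names (predict_govsal, residual, montana_pred, montana_resid) are made up for illustration.

```python
# Fitted line from the notes: Govsal = 28,569.69 + 2.71 * avgpay
B0 = 28569.69   # intercept, in dollars
B1 = 2.71       # slope: dollars of governor's salary per dollar of average pay

def predict_govsal(avgpay):
    """Predicted governor's salary for a state with the given average pay."""
    return B0 + B1 * avgpay

def residual(observed, predicted):
    """Residual = observed - predicted (negative when the point is below the line)."""
    return observed - predicted

# Montana, from the notes: average pay 17,895 and governor's salary 55,502
montana_pred = predict_govsal(17895)
montana_resid = residual(55502, montana_pred)

print("Predicted Govsal:", round(montana_pred, 2))
print("Residual:", round(montana_resid, 2))
```

A negative residual means the observed salary sits below the regression line.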
Extrapolation: (p. 148 & 163-5) Using the line to predict for x's outside the range of the data. The association may change outside the range you have data for. Be cautious, especially when predicting far into the future.
The Regression line equation:
If we standardize both the x-values and the y-values, the slope will just = r: zyhat = r zx
And the intercept will be 0, so the line goes through (0,0). (Which was the point given by the two means, (xbar, ybar), in the original graph.)
govsalstd.sav, govsalstd.spo (also in SPSS for Class 05 folder)
Also Excel, RegressionSlope.xls in ClassMaterial\Math151 D&V\RegressionDemosExcel for D&V
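To see the standardized-slope fact numerically without SPSS or Excel, here is a small Python sketch. The data values are hypothetical (not the governor's salary data), and the helper names zscores, least_squares, and correlation are made up for this illustration.

```python
from statistics import mean, stdev

def zscores(values):
    """Standardize: subtract the mean, divide by the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

def correlation(xs, ys):
    """r = sum of products of z-scores, divided by n - 1."""
    zx, zy = zscores(xs), zscores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical data, for illustration only
xs = [60, 62, 65, 68, 70, 72]
ys = [110, 120, 135, 150, 148, 170]

r = correlation(xs, ys)
intercept_z, slope_z = least_squares(zscores(xs), zscores(ys))

print("r =", r)
print("slope of standardized line =", slope_z)
print("intercept of standardized line =", intercept_z)
```

Up to rounding error, the slope of the standardized line should match r and the intercept should be 0.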
To find the equation yhat = b0 + b1 x in "real" units: calculate the "coefficients" b1, b0
b1: A change of one standard deviation in x corresponds to a change of r standard deviations in y, along the regression line. The slope b1 expresses change in y-units per x-unit. (Suppose x is inches, y is pounds. Then b1 is in pounds per inch.) You can find b1 by multiplying r by the standard deviation of the y's (that's in pounds) and dividing by the standard deviation of the x's (that's in inches). In algebra (p. 140):
b1 = r times (s.d. of y)/(s.d. of x)
b0: The line goes through (xbar, ybar). If you know this, you know ybar = b0 + b1(xbar). You can solve this for b0:
b0 = ybar - b1(xbar).
So, if you have the means and standard deviations and
r, you can find the regression equation.
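As a sketch of that recipe in Python: the inches/pounds data below are hypothetical, and the cross-check against the direct least-squares slope is added here only to confirm the two routes agree.

```python
from statistics import mean, stdev

# Hypothetical data: x in inches, y in pounds (for illustration only)
xs = [60, 62, 65, 68, 70, 72]
ys = [110, 120, 135, 150, 148, 170]
n = len(xs)

# Summary statistics: the means, the standard deviations, and r
xbar, ybar = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)
r = sum(((x - xbar) / sx) * ((y - ybar) / sy) for x, y in zip(xs, ys)) / (n - 1)

# Coefficients from the summary statistics
b1 = r * sy / sx          # pounds per inch
b0 = ybar - b1 * xbar     # pounds

# Cross-check: slope computed directly from the data
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
sxx = sum((x - xbar) ** 2 for x in xs)
b1_direct = sxy / sxx

print("yhat =", round(b0, 2), "+", round(b1, 2), "x")
print("direct slope:", round(b1_direct, 2))
```

The line built this way passes through (xbar, ybar) by construction, since b0 was solved from ybar = b0 + b1(xbar).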
See p. 141, text. That example is incomplete: they found the b's but didn't write the equation:
Av.cost-per-person = 2,266.61 - 36.21*Peak-fwy-speed.
Check units.
P. 142: slope = -36.21 $/mph. For every mph increase in peak freeway speed, there is a decrease in cost of $36.21 per person. Or: "Traffic delays cost each urban area resident about $36 for every mph the freeways are slowed at peak period."
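The slope interpretation can be checked by plugging two speeds 1 mph apart into the equation. In this Python sketch the coefficients are from the text example, but the speeds 30 and 31 mph are hypothetical inputs (they are not claimed to lie in the range of the textbook data), and the function name predicted_cost is made up.

```python
# Equation from the notes: Av.cost-per-person = 2,266.61 - 36.21 * Peak-fwy-speed
B0 = 2266.61    # dollars per person
B1 = -36.21     # dollars per person, per mph

def predicted_cost(speed_mph):
    """Predicted average cost per person ($) at a given peak freeway speed (mph)."""
    return B0 + B1 * speed_mph

# Hypothetical speeds, 1 mph apart
slow, fast = 30, 31
change = predicted_cost(fast) - predicted_cost(slow)

print("Predicted cost at", slow, "mph:", round(predicted_cost(slow), 2))
print("Predicted cost at", fast, "mph:", round(predicted_cost(fast), 2))
print("Change for +1 mph:", round(change, 2))
```

The change for a 1 mph increase equals the slope, whichever starting speed is used.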
Start here Wednesday
Pattern in graph of residuals: (p. 162) If you graph residual values against x (or against predicted y's), you visually eliminate the linear portion of the association. (The regression line "becomes" the new x-axis; a "shear" transformation.)
Excel Residuals.xls in ClassMaterial\Math151 D&V\RegressionDemosExcel for D&V
Curving or other structure may stand out more visibly. "Good" fit = no structure in residuals.
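Here is a small Python sketch of structure showing up in residuals. The data are hypothetical and deliberately curved (y = x squared), so a straight line is the wrong model; the names fit_line and residuals are made up for this illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line yhat = b0 + b1 x."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

# Hypothetical curved data: y = x squared
xs = [1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# List residuals against x, with the sign of each one
for x, e in zip(xs, residuals):
    print(x, round(e, 2), "+" if e > 0 else "-")
```

The residuals are positive at both ends and negative in the middle. That pattern is the curve that remains after the linear part has been removed. With a good linear fit the signs would show no such run.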
SPSS: (old wing) (Handout bottom p. 4 & 3) Analyze>Regression>Linear. Plots button, *ZRESID on *ZPRED. Save button, Residuals: Unstandardized calculates all the residuals and saves them as a new variable.
| Sievers home | Math151-Sp05/Days16.htm | 3:15pm | 3/7/05 |