| Hand
in Monday.
From Day 14: Correlation (more). Problems from D&V, p. 130: 13 Lunchtime (SPSS); 16 Drug abuse (SPSS); 26 Oil consumption (SPSS) (this is another timeplot); 23 Correlation errors.
A. If women always married men who were two years older than themselves, what would be the correlation between the ages of husband and wife? (Hint: make a data table and the corresponding scatterplot for 4-5 couples with different x's.)
Your click-in-the-Circle Data: created and saved Day 1, ACT 2-2 or 3. ActivStats (Ch. 8) HW ACT-2 Circle Correlations (SPSS) (copied here):
What is the association between the time it took you to click in a circle and the size of the circle? Does it typically take longer to click in a smaller circle?
What is the association between the time it took you to click in a circle and the distance you had to move to reach the circle?
What is the association between the distance of your click from the center of the circle and the size of the circle? Can you account for the pattern you see?
Write a paragraph summarizing these relationships. (If you forgot to make scatterplots before computing correlations, you might want to go back and make them now, before anyone notices. Be sure to discuss any unusual patterns or points you see in the scatterplot and note how they might have affected the correlations you computed.) Don't forget to do scatterplots as well as computing correlations. Cf. Circle questions
= = = = = = = = = = = = = = = =
Regression Prep (Review graphing straight lines if needed--Math clinic)
A. Open the Excel file RegressionSlope (or in the folder RegressionDemosExcel for D&V in ClassMaterial\Math151 D&V). Change x-y values in the yellow boxes and watch the line change. Change x-values in col. F and watch the "run" (red line) change, in the rightmost 2 graphs. Notice the slope = the coefficient of x = the rise/run = increase in y per unit increase in x. Fix it so the increase in x (the "run") is exactly 1.
Also, look at the leftmost graph, where the lengths of the standard deviations are shown, and note that in standard-deviation units, the rise is r s.d.s in y for each s.d. run in x. Print the page to hand in.
B. Practice fitting lines: Use the Moore website www.whfreeman.com/scc ("Do this" below) and try to fit at least 4 different data sets. Write down on your paper what you discovered (were your judgment errors consistent in any ways? Did you have any surprises?)
- - - - - - - - - - - - -
Postpone the rest:
SPSS Handout: Do problems 7, 8, 9, 11 p. 3. Keep this with the previous work.
(Rest: All D&V p. 153ff unless otherwise noted)
21 (SPSS) a,b,c & 23 a,b,c,d Used cars. Keep a copy of your equation. The SPSS data file is missing a value! Age 4, price 6995 has been omitted. This gives price = 12519.62 - 940.04*age, R-square = .91. When the missing value is restored, we get price = 12319.59 - 924.0*age, R-square = .89. The graphs don't look much different.
36 a thru d Gators (See p. 149 for how to read results) |
Read,
to discuss |
Optional
If you feel at all shaky about graphing or using straight lines (slopes, intercepts), be sure to do the Linear Equations exercise and Line Equations, ActivStats 8-1, activities 3 & 4 (in preparation for Ch. 8) |
Homework questions? Day 14
Regression line (D&V Ch. 8 & 9, AS 8 & 9): A model that predicts or estimates a y (vertical) value for a given x (horizontal) value: a straight line!
"Regressing y ON x"
Formula: yhat = b0 + b1*x, also written yhat = a + b*x.
Example: weight = -70 + 3*height (height in inches, weight in pounds).
To predict a y-value for a given x-value, plug the x-value into the formula and calculate:
60 inches --> -70 + 3*60 = 110 lb
To do it graphically, use the "Up-and-Over" method: find the x on the horizontal axis, go straight up to the line, then go over to the y-axis; that y-value is the predicted y.
b0 (or a; here -70) is the y-intercept: the predicted y when x = 0.
b1 (or b; here 3) is the slope (b1 multiplies x, the horizontal value): if x increases by one unit, yhat increases by b1 units.
For every additional inch of height, the model predicts a 3-pound increase in weight.
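As a quick check on the weight example, here is a minimal Python sketch. The function name `predict_weight` is made up for illustration; only the numbers -70 and 3 come from the notes.

```python
def predict_weight(height_in):
    """Predicted weight (pounds) for a height (inches): weight = -70 + 3*height."""
    b0 = -70  # y-intercept from the example in the notes
    b1 = 3    # slope: predicted pounds per additional inch
    return b0 + b1 * height_in

# Plug in x = 60 inches, as in the notes.
print(predict_weight(60))  # 110

# Slope check: one more inch of height changes the prediction by b1 = 3 pounds.
print(predict_weight(61) - predict_weight(60))  # 3
```

Any two heights one inch apart give the same difference of 3, which is what "slope" means here.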
RegressionSlope.xls (or in the folder RegressionDemosExcel for D&V in ClassMaterial\Math151 D&V)
We all get the same line from a batch of data because we use the "least-squares best fit" criterion: choose the line that makes the sum of the squared vertical distances from the points to the line as small as possible. (How we get the line by hand, later.)
We are trying to find an "average" (mean) y value for each x value, with the constraint that they all lie on a straight line.
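To see the least-squares criterion in action, here is a small Python sketch using the standard textbook formulas for the slope and intercept (this may or may not match the by-hand method shown later in the course). The height/weight numbers are made up for illustration, and the function names are hypothetical.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (b0, b1) for the least-squares line yhat = b0 + b1*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept: forces the line through (x_bar, y_bar)
    return b0, b1

def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data, for illustration only (not from the notes).
heights = [60, 62, 65, 68, 70]
weights = [112, 115, 128, 131, 142]

b0, b1 = least_squares_line(heights, weights)
best = sum_squared_residuals(heights, weights, b0, b1)

# Any other line does worse; here we nudge the intercept and the slope.
shifted = sum_squared_residuals(heights, weights, b0 + 1, b1)
tilted = sum_squared_residuals(heights, weights, b0, b1 + 0.1)

print("b0, b1:", b0, b1)
print("least-squares total:", best)
print("shifted line total:", shifted)
print("tilted line total:", tilted)
```

The two nudged lines play the role of a hand-drawn guess in the applet: their totals come out larger than the least-squares total.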
Do this: Practice fitting "least-squares best fit" lines on Moore's website, http://www.whfreeman.com/scc.
Choose "Statistical Applets", then Correlation/Regression Demo.
Check the "Show least-squares line" box and put in some data points.
Check the "Show Mean X &Mean Y lines" box; note that the line always goes through their crossing. Repeat for a few data sets.
--Try fitting the line yourself: (Uncheck the "Show ..." boxes.) Put in some data points. Now click Draw Line. Click and drag in the picture and you'll get a line with 3 blobs. Drag the center and it will go up and down; drag an end and the slope will change. Put the line in the best place for predicting y's from x's. If you do well by the "least squares" criterion, the green bar up top will shrink close to 0 (but you have to be really good. Dumb.)
Check the "Show Mean X &Mean Y lines" box; adjust your line.
Check the "Show least-squares line" box and see how you did.
Start here Monday:
SPSS will fit a regression line to data (back page of handout). While editing the graph: Insert>Fit line>Regression.
You get the line, the equation of the line, and R-square (the square of the correlation coefficient).
Example: Govsal on avgpay.
For Govsal vs. avgpay: Govsal = 28,569.69 + 2.71*avgpay
(Predicted Governor's salary increases $2.71 for every dollar increase in a state's average pay.)
Residual: Look at an individual observed (x, y) data pair. The residual is the "leftover" amount of y after predicting a y using the line. Visually, it is the length of the vertical line drawn from the point to the regression line (+ if the point is above the line, - if the point is below the line).
Residual = observed - predicted = Data - Model: e = y - yhat. (e for "error")
Govsal = 28,569.69 + 2.71*avgpay
Visually, in SPSS (handout, p. 3, bottom): In Edit mode, Insert>Spikes: Spike to: Regression.
Calculating: Montana (avgpay, Govsal) = (17895, 55502), using Govsal = 28,569.69 + 2.71*avgpay
Predicted Govsal = 28,569.69 + 2.71*17895 = 28,569.69 + 48,495.45 = 77,065.14
Residual = 55,502 - 77,065 = -21,563: the governor's salary is $21,563 below the expected (predicted) value.
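The Montana calculation can be double-checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, while the coefficients and the Montana data pair are the ones given in the notes.

```python
def predicted_govsal(avgpay):
    """Predicted governor's salary from the fitted line in the notes."""
    return 28569.69 + 2.71 * avgpay

def residual(observed, predicted):
    """Residual = observed - predicted (Data - Model)."""
    return observed - predicted

# Montana: average pay 17895, observed governor's salary 55502.
montana_avgpay = 17895
montana_govsal = 55502

montana_pred = predicted_govsal(montana_avgpay)
montana_resid = residual(montana_govsal, montana_pred)

print(round(montana_pred, 2))   # 77065.14
print(round(montana_resid, 2))  # -21563.14
```

The negative sign says the observed point lies below the regression line.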
| Sievers home | Math151-Sp06/Daysp15.htm | 1am | 3/2/06 |