| Hand in Mon.
Correlation (thinking): 4.26, date heights again. You graphed this by hand. r = .5653. Now answer the questions in the text.
A. If women always married men who were exactly two years older than themselves, what would be the correlation between the ages of husband and wife? (Hint: make a data table and the corresponding scatterplot for 4 or 5 couples with different x's, and look at it.)
Correlation (computing & thinking):
p. 104, 4.11 (SPSS) gas, speed: association but 0 correlation. Find the means and draw the mean lines on your graph (by hand) to help explain the 0 correlation.
p. 104, 4.10 (SPSS) bird colonies again. To add a data pair in SPSS, just type it in a new row at the bottom. To delete a case, click on the case number, which highlights the whole row, then hit Delete. |
Read to discuss |
Optional
Do now (for ch. 5) if you need the practice: Straight line graphing practice:
A. y = -10 + 3x, graph for 2<x<10.
B. y = 500 - 20x, graph for 0<x<10.
4.28, I said to draw the line by hand. |
Handout on SPSS Scatterplots etc. (link), showing subgroups, labeling individual points.
govsal_vs_pay.sav is the file used for most of the handout. (In SPSS for Class BPS folder)
Homework questions?
Day 14
Correlation:
The (Pearson) correlation coefficient r is a numerical measure of how strongly linear (and in what direction) the relationship is. It doesn't substitute for a scatterplot.
Use if the data are: 2 quantitative variables, & "nice":
One cluster/cloud/band.
Pretty straight.
Outlier(s)? Do with/without & be cautious.
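For anyone who wants to see what the software is doing, here is a minimal Python sketch of r computed as the sum of products of standardized x and y values, divided by n - 1. The function name pearson_r and the data are made up for illustration; in class we use SPSS.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    sx = sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    sy = sqrt(sum((y - ybar) ** 2 for y in ys) / (n - 1))
    if sx == 0 or sy == 0:
        raise ValueError("r is undefined when a variable does not vary")
    # Standardize each value, multiply the pairs, add up, divide by n - 1.
    total = sum(((x - xbar) / sx) * ((y - ybar) / sy) for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
print(round(pearson_r(x, y), 3))
```

Try it with and without a suspected outlier to see how much r changes.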
Correlation experiments:
Website, http://www.whfreeman.com/bps4e, "Statistical Applets", Correlation/Regression. Play with data points, observing the Correlation Coefficient.
Check the "Show Mean X & Mean Y lines" box. See how much is in each quadrant. Compare with the correlation coefficient.
Using SPSS (p.4, Scatterplot handout) Analyze>Correlate>Bivariate
Properties (p. 101) and cautions (p. 103):

--You won't have to calculate a correlation coefficient by hand. This
formula is a bad one for hand computation (roundoff error); if you must
do one by hand, find the computational formula in an old textbook.
--Eyeballing: sketch xbar and ybar lines, see how much data is in + quadrants, how much in - quadrants.
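The eyeballing idea can be sketched in Python as a simple count. A point is in a + quadrant when it is above both means or below both means, and in a - quadrant otherwise. The function name quadrant_counts and the data are hypothetical. Counting is only a rough guide: r also depends on how far each point is from the mean lines, not just which quadrant it is in.

```python
def quadrant_counts(xs, ys):
    """Count points in + quadrants, - quadrants, and on a mean line."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    plus = minus = on_line = 0
    for x, y in zip(xs, ys):
        # Same sign of deviations -> + quadrant; opposite signs -> - quadrant.
        product = (x - xbar) * (y - ybar)
        if product > 0:
            plus += 1
        elif product < 0:
            minus += 1
        else:
            on_line += 1
    return plus, minus, on_line

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
print(quadrant_counts(x, y))
```

Mostly + points suggests a positive r, mostly - points a negative r, and a roughly even split suggests r near 0.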
Strength of correlation says NOTHING about causality!
Strong correlation could be:
A causes B / B causes A / C causes both A and B (lurking C) / just Chance that they go together in this data set.
= = = = = = = = = = = = = = = = = = =
Regression line: Ch. 5. Predicts or estimates a y (vertical) value for a given x (horizontal) value: Straight line!
"Regressing y ON x".
(Graphing a straight line: pick an x-value at one end of the
useful range. Plug in to the formula and calculate the
corresponding y. Graph the (x,y) pair. Repeat with an x
value at the other end of the range. Connect the 2 dots with a
line (see pretest). Insurance: Pick a third x and calculate
the y. This point must also lie on the line, if you did it right.)
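The two-points-plus-insurance recipe above can be sketched in Python using practice line A (y = -10 + 3x, x from 2 to 10). The function names line_points and on_same_line are made up for illustration, and the midpoint is just one possible choice for the third x.

```python
def line_points(a, b, x_low, x_high):
    """Return (x, y) at the low end, the midpoint, and the high end of y = a + b*x."""
    x_mid = (x_low + x_high) / 2
    return [(x, a + b * x) for x in (x_low, x_mid, x_high)]

def on_same_line(p, q, r, tol=1e-9):
    """True if the three points are collinear (cross-product test)."""
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) <= tol

# Practice line A: y = -10 + 3x for x from 2 to 10.
points = line_points(-10, 3, 2, 10)
print(points)                  # the three (x, y) pairs to plot
print(on_same_line(*points))   # insurance check
```

If the insurance check fails on your hand calculation, one of the three y-values was computed wrong.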
Experimenting: http://www.whfreeman.com/bps4e, Correlation and Regression Applet.
SPSS--graph line, p. 2 top: Govsal on avgpay
Formula: yhat = a + b x. (yhat means we're finding a sort of average y for each particular x.)
Govsal = a + b avgpay
SPSS-- formula p. 4. Read off "coefficients" (intercept and slope) from table.
a is y-intercept.
b is slope: If x increases one unit, yhat increases b units. (b multiplies the x-variable.)
Govsal = 28,569.69 + 2.709*avgpay
yhat = 28,569.69 + 2.709* x
To predict or estimate a y-value for a given x-value, plug the x value into the formula and calculate.
To do it graphically, use the Up-and-Over method (Fig. 5.1, p.116): Find the x, go straight up to the line, then go over to the y-axis; that y-value is the predicted y.
Calculating:
Montana (17,895, 55,502): y = 28,569.69 + 2.709*x
Predicted y = 28,569.69 + 2.709*17,895 = 28,569.69 + 48,477.56 = 77,047.25
(higher than the actual 55,502)
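The Montana calculation can be checked with a short Python sketch. The intercept, slope, and Montana values are the ones given in these notes; the function name predict_govsal is made up for illustration. The last line computes actual minus predicted (often called a residual), which is negative here because the prediction is too high.

```python
A = 28569.69   # intercept a, from the notes
B = 2.709      # slope b, from the notes

def predict_govsal(avgpay):
    """Plug x (average pay) into yhat = a + b*x."""
    return A + B * avgpay

montana_avgpay, montana_govsal = 17895, 55502
predicted = predict_govsal(montana_avgpay)
difference = montana_govsal - predicted   # actual minus predicted

print(f"predicted  = {predicted:,.2f}")
print(f"difference = {difference:,.2f}")
```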
a is y-intercept.
b is slope: If x increases one unit, yhat increases b units. (b multiplies the x-variable.)
If you know that yhat increases 12 units for every one that x
increases, you know that the slope of the line b = 12.
Governors' salaries increase (on average across the states) by $2.71 for every $1 increase in average pay.
This is a summary of the linear relationship, in the same way that the mean of a distribution is one summary of the distribution. Particular states won't match this exactly.
(In a straight-line relationship, the amount that y increases for a one-unit increase in x is the same no matter what value of x you start with.)
RegressionSlope.xls or in ClassMaterial\Math151-BPS4e \RegressionDemos Excel BPS4e
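If you don't have the Excel demo handy, the same point can be sketched in Python with the Govsal line from these notes: increase x by one unit from several different starting values and the change in yhat is the slope every time. The starting values below are arbitrary choices for illustration.

```python
A, B = 28569.69, 2.709   # intercept and slope from the notes

def yhat(x):
    """Predicted y on the line yhat = a + b*x."""
    return A + B * x

# Arbitrary starting values of x, chosen for illustration only.
for x in (10000, 17895, 30000):
    change = yhat(x + 1) - yhat(x)
    print(x, round(change, 3))
```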
| Sievers home | Math151-Sp08/Days15.htm | 8pm | 2/28/08 |