| Hand in Monday
p. 122, 5.3b only: verify the formula. Use the means, s.d.'s, and r from the answers in the back of the book.
p. 141, 5.30, husbands and wives. (Note: you have to find the equation of the line to draw the graph, though the problem doesn't explicitly tell you to.)
p. 125, 5.5 (SPSS), corn again; a straight line is a "bad fit". Let SPSS find the regression line. Get the mean yield and mean planting rate too; you need them for part c.
p. 142, 5.32, going to class.
Postpone the rest:
p. 129, 5.7 (SPSS), does fast driving waste fuel? Residuals. There is a data file for problem 5.7, and its third column is the residuals. Do all the parts. Also with 5.7, in SPSS, make a variable containing the residuals (Handout, bottom of p. 4; also bottom of this page). The values should match the ones in the book/SPSS file.
SPSS Handout p. 3 (Governors' salaries): You can now finish #12, the last question. Hand it all in Monday(?).
p. 133, 5.9 (SPSS), farm population. Do a, b, c (read p. 132 for a good word to use in part c). Also, make a variable containing the residuals, and plot it against the x (year) values. Draw (in pencil) a horizontal line at height 0. What pattern do you see in the residuals?
B. Use Residuals.xls from the website or the lab to graph these data sets, along with a graph of the residuals. Print the results, and describe the shape of the residuals (it may help to connect the dots with pencil, to see the pattern).
a) x: 1 2 8 4 6 9
   y: 1 3 6 6 7 5
b) x: 1 2 7 4 6 9
   y: 7 6 2 4 2 1
p. 179, 7.28, 29, 30 (SPSS) Soap
in the shower. Also, look carefully at the graph and guess why there is no data after day 21. (Read p. 132 for the word to describe using the line for day 30, and a discussion of the issue.) |
Read, to discuss
Look at this, especially with reference to the r standard deviations in y for every 1 standard deviation in x:
A. Open the Excel file RegressionSlope (or in the folder RegressionDemosExcel for D&V in ClassMaterial\Math151 D&V). Change the x-y values in the yellow boxes and watch the line change. Change the x-values in col. F and watch the "run" (red line) change in the rightmost 2 graphs. Notice that the slope = the coefficient of x = the rise/run = the increase in y per unit increase in x. Fix it so the increase in x (the "run") is exactly 1. Also, look at the leftmost graph, where the lengths of the standard deviations are shown, and note that in standard-deviation units, the rise is r s.d.'s in y for each s.d. run in x.
Postpone the rest:
C. Use Applet http://www.whfreeman.com/BPS4e
Correlation/regression. Make a cloud of data (about 15 points) and put in the regression line. Play with an outlier: drag a point to the far left (or right) and drag it up and down.
p. 136, 5.12, lurking variables
|
Optional
p. 179, 7.27 (review Normal)
Postpone the rest:
p. 136, 5.11, lurking variables
|
NOTE: The standard deviation doesn't say anything about the distance of any individual point from the mean; it's only about a kind of "average" variability. Likewise, R² doesn't say anything about the line and any particular (x, y) pair, just about a kind of "average" goodness of the explanatory power of the line for the data.
Other questions for exam?
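To make the note concrete, here is a minimal Python sketch (not part of the SPSS or Excel materials; the function name fit_line is made up for illustration). It builds the regression line from the means, s.d.'s, and r, using data set a) from the Residuals.xls exercise, and then lists the individual residuals next to the single "average" summary R².

```python
from statistics import mean, stdev

def fit_line(xs, ys):
    """Least-squares line y = a + b*x, built from means, s.d.'s, and r.

    fit_line is a hypothetical helper for illustration only.
    """
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    # r: sum of products of standardized values, divided by n - 1
    r = sum((x - x_bar) / s_x * (y - y_bar) / s_y
            for x, y in zip(xs, ys)) / (n - 1)
    b = r * s_y / s_x        # slope: r s.d.'s of y for each s.d. of x
    a = y_bar - b * x_bar    # the line passes through (x_bar, y_bar)
    return a, b, r

# Data set a) from the Residuals.xls exercise
xs = [1, 2, 8, 4, 6, 9]
ys = [1, 3, 6, 6, 7, 5]

a, b, r = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]  # observed - predicted
r_squared = r ** 2

print(f"line: y = {a:.3f} + {b:.3f} x,  r = {r:.3f},  R^2 = {r_squared:.3f}")
for x, y, e in zip(xs, ys, residuals):
    print(f"x = {x}, y = {y}, residual = {e:+.3f}")
```

For this data the slope works out to 0.5. The residuals add up to 0 (up to rounding), but their individual sizes differ from point to point, which R² alone does not show.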
Extrapolation: extra (outside) + polation (putting a point). Using the line to predict outside the range of x's you have data for. Linear relationships don't go on forever; a straight line is often a first approximation to a more complicated relationship.
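A rough Python sketch of why extrapolation is risky. The square-root curve here is an invented stand-in for "a more complicated relationship," not data from the book, and least_squares is a hypothetical helper name. The line is fitted for x = 1 to 9 and then used at x = 5 (inside the range) and x = 100 (far outside it).

```python
import math

def least_squares(xs, ys):
    """Return intercept a and slope b of the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b

# Assumed "true" relationship for illustration: y = sqrt(x), observed for x = 1..9
xs = list(range(1, 10))
ys = [math.sqrt(x) for x in xs]
a, b = least_squares(xs, ys)

# Prediction error inside the observed range vs. far outside it
inside_error = abs((a + b * 5) - math.sqrt(5))
outside_error = abs((a + b * 100) - math.sqrt(100))

print(f"error at x = 5:   {inside_error:.2f}")
print(f"error at x = 100: {outside_error:.2f}")
```

Within the observed range the line tracks the curve closely; at x = 100 the prediction is far off, because the curve flattens out while the line keeps climbing.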
Government projections of national budget surplus/deficit (www.cbo.gov: publications > search):
Jan. 2001: http://www.cbo.gov/showdoc.cfm?index=2727&sequence=6
Projection used to justify Bush tax cuts.
Jan. 2002: http://www.cbo.gov/showdoc.cfm?index=3277&sequence=6
August 2006: http://www.cbo.gov/ftpdocs/74xx/doc7492/08-17-BudgetUpdate.pdf
Pdf p. 19, single-line projection, 10 years; p. 36, uncertainty, 6 years.
March 2007 (p. 2; pdf p. 8): http://www.cbo.gov/ftpdocs/78xx/doc7837/03-05-Uncertain.pdf
June 2000, conservative think tank analysis: http://www.hoover.org/publications/policyreview/3487697.html
Fig. 1, budget surplus/deficit, 1901 on. Notice that the only previous long-term surplus is in the 1920's.
Fig. 6, 1960 on, and projections.
"Lurking" variable:
has an important effect, but is not one of the variables studied.
Meatloaf shrinkage vs. placement in oven? (Whether or not a cooking thermometer was used had the greatest influence.)
Time sequence of observations is a common one. (Learning, tiring, aging)
The trouble with lurking variables is that by definition you don't know they're there. Look behind every tree.
Association does not imply causation
Strong association/correlation between A and B could be: A causes B / B causes A / C causes both A and B (lurking C) / just chance that they go together in this data set.
Direction? Rooster causes sun to rise by crowing?
Both variables "caused" by a lurking variable?
Lurking variable can be part of the cause.
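A small simulation sketch of the "C causes both A and B" case. All numbers are made up for illustration (simulated, not real data), and correlation is a hypothetical helper. A and B are each built from C plus their own random noise; neither one influences the other.

```python
import random

def correlation(xs, ys):
    """Correlation r between two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 1000

# Lurking variable C drives both A and B; A and B never affect each other
c = [rng.gauss(0, 1) for _ in range(n)]
a = [ci + rng.gauss(0, 0.5) for ci in c]
b = [ci + rng.gauss(0, 0.5) for ci in c]

r_ab = correlation(a, b)

# Take C's contribution back out: what is left of A and of B is unrelated noise
r_leftover = correlation([ai - ci for ai, ci in zip(a, c)],
                         [bi - ci for bi, ci in zip(b, c)])

print(f"r between A and B:          {r_ab:.2f}")
print(f"r after removing C's part:  {r_leftover:.2f}")
```

A and B come out strongly correlated even though neither causes the other, and the association essentially disappears once C's contribution is removed. In a real study, of course, C is not measured, which is the whole trouble.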
--Women with a history of heavy antibiotic use have higher rates of breast cancer.
--Baby rats whose mothers licked and groomed them more grew up to be more exploratory, more social, and less timid.
Cause? Effect? How to tell?
Establishing that x "causes" y is difficult.
Best: Do an experiment in which we change x and keep lurking variables under control. (Ch. 9 Rats.)
Otherwise: Strong association. Consistent over many studies. Higher x --> stronger y. X precedes y in time. A plausible mechanism exists (parallel studies?).
Generalize rat grooming to humans?
E.g., partially hydrogenated oils --> heart disease? Homocysteine --> heart disease?
| Sievers home | Math151-Sp07/Daysp17.htm | 9:30pm | 3/11/07 |