| Hand in Wednesday
(repeated from Day 16) A. Use ResidualsRSquared from the website or the lab to graph these data sets, along with a graph of the residuals. Print the results, and describe the shape of the residuals (it may help to connect the dots with pencil to see the pattern).
a)  x: 1 2 8 4 6 9
    y: 1 3 6 6 7 5
b)  x: 1 2 7 4 6 9
    y: 7 6 2 4 2 1
Moore p. 122, 2.36 speed & gas again: a, b, c, d. There is a data file for problem 2.36, and its third column is the residuals (check them against the book).
B. Use the author's website, http://www.whfreeman.com/scc, ...Correlation/regression. Make a cloud of data (about 15 points) and put in the regression line. Play with an outlier: drag a point to the far left (or right) and drag it up and down. Try it again with a point in the middle range of the x's. Write your answer: Where is it most influential? Now add a bunch more points (50 is the max). Play with an outlier again. Does the outlier have more or less influence with a larger data set?
Moore p. 123, 2.38 Gesell first word: point in middle of x range. Get the data into SPSS, delete child 19, graph, and get the regression line and r2. Use the formula on p. 117 and graph the line for the full data set by hand on your printout. r2 for the full data set is on p. 122.
Moore p. 122, 2.37 Calories (You saved these, I think; or, from Moore's files, in TA02-04). Graph and get lines in SPSS with and without the outliers. Graph the line for "without outliers" by hand on the printout for "with outliers" so you can compare them better. Print one more graph (with outliers) and keep it for problem C below.
Moore p. 133, 2.55 TV watching & grades
Income depends on height?! Read the article and answer this.
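If you want to check residuals away from the lab, here is a minimal sketch in plain Python. It is not ResidualsRSquared or SPSS, and the function names are made up for illustration. It fits the least-squares line to data sets a) and b) from problem A and prints each residual (observed y minus predicted y) in order of x, which is the order you would connect the dots in.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys):
    """Residual = observed y minus the y predicted by the fitted line."""
    slope, intercept = least_squares(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Data sets a) and b) from problem A.
data_a = ([1, 2, 8, 4, 6, 9], [1, 3, 6, 6, 7, 5])
data_b = ([1, 2, 7, 4, 6, 9], [7, 6, 2, 4, 2, 1])

for name, (xs, ys) in (("a", data_a), ("b", data_b)):
    slope, intercept = least_squares(xs, ys)
    print(f"Data set {name}: y-hat = {intercept:.3f} + {slope:.3f} x")
    # Sort by x so the pattern in the residuals is easier to see.
    for x, r in sorted(zip(xs, residuals(xs, ys))):
        print(f"  x = {x}: residual = {r:+.3f}")
```

Residuals from a least-squares line always add up to zero (apart from rounding), so the thing to look at is their pattern, not their total.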
|
Read, to discuss
Sec. 2.4
|
Optional
SPSS will make residuals: See Day 16 Sec. 2.4
|
Heard on NPR driving home Friday: The World Bank says that for every $5 increase in the price of a barrel of oil, the world economic growth rate drops 3/10 of 1%.
What kind of analysis did they do? What statistical quantity have they restated?
This comes from a regression line, and is a restatement of the slope. "Rise" = -0.3% (negative because it drops), "Run" = $5.
The slope of the regression line = Rise/Run = -0.3%/$5 = -0.06% per dollar.
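The same arithmetic as a minimal Python sketch. The variable and function names are made up for illustration, and the last line assumes the relationship stays a straight line over the price change you plug in, which the quote by itself does not tell us.

```python
# Numbers from the World Bank quote.
rise_pct = -0.3     # change in growth rate, in percentage points (a drop)
run_dollars = 5.0   # change in oil price, in dollars per barrel

slope = rise_pct / run_dollars  # percentage points of growth per dollar
print(f"slope = {slope:.2f}% per dollar")


def predicted_change(price_increase_dollars):
    """Predicted change in growth rate, assuming the line holds."""
    return slope * price_increase_dollars


# Hypothetical example: a $10 increase in the price of a barrel.
print(f"$10 increase: {predicted_change(10):.2f}% change in growth rate")
```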
Income depends on height?!
What is "$789", and what kind of analysis
did they do?
Cautions, Sec. 2.4
Plot the data: Summary formulas and numbers don't tell the whole story. (Anscombe's quartet, Moore p. 127, 2.46-7) (Overhead slide. Copy hanging on my door. You can reconstruct these pictures using SPSS and Moore's problems, if you like.)
Extrapolation: "extra" (outside) + "polation" (putting a point). Using the line to predict outside the range of x's you have data for. Risky, because nothing in the data says the pattern continues out there.
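One way to keep yourself honest is to have the prediction report whether x is inside the range of the data. A minimal sketch in Python, where the line y = 2 + 0.5x and the x range 1 to 9 are hypothetical numbers chosen only for illustration:

```python
def predict(x, slope, intercept, x_min, x_max):
    """Return (predicted y, True if x lies outside the observed x range)."""
    y_hat = intercept + slope * x
    is_extrapolation = x < x_min or x > x_max
    return y_hat, is_extrapolation


# Hypothetical line y = 2 + 0.5x, fitted to data with x between 1 and 9.
for x in (5, 9, 20):
    y_hat, outside = predict(x, slope=0.5, intercept=2.0, x_min=1, x_max=9)
    note = "extrapolation, be suspicious" if outside else "within the data"
    print(f"x = {x:>2}: predicted y = {y_hat:.1f} ({note})")
```

The line will happily give a number for any x; the flag is only a reminder that the number at x = 20 rests on no data at all.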
Averaged data will produce a stronger relationship (higher correlation, R2) than the merged raw data from individuals (the averaging hides much variability).
Heating-degree days graph (Similar data in TA 2.1, p. 86, 107): Each value represents a month's average temperature and average fuel. If we graphed the daily temperature and fuel use we would see a lot more scatter.
(Overhead slide. Copy hanging on my door.)
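You can see the averaging effect in a small simulation. This is a minimal sketch in Python using made-up numbers, NOT the heating data from Moore: twelve months of invented daily temperatures and fuel use, with the correlation computed once from the 360 daily points and once from the 12 monthly averages. The monthly means, the noise sizes, and the line "fuel = 10 - 0.1 * temperature" are all assumptions chosen for illustration.

```python
import random


def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(151)  # fixed seed so the run is repeatable

# Invented monthly mean temperatures (degrees F), for illustration only.
monthly_mean_temp = [25, 30, 40, 50, 60, 70, 75, 73, 65, 52, 40, 30]

daily_temp, daily_fuel = [], []
avg_temp, avg_fuel = [], []
for mean_temp in monthly_mean_temp:
    # 30 simulated days per month; fuel use is in arbitrary units.
    temps = [rng.gauss(mean_temp, 5) for _ in range(30)]
    fuels = [10 - 0.1 * t + rng.gauss(0, 2) for t in temps]
    daily_temp += temps
    daily_fuel += fuels
    avg_temp.append(sum(temps) / 30)
    avg_fuel.append(sum(fuels) / 30)

r_daily = correlation(daily_temp, daily_fuel)
r_monthly = correlation(avg_temp, avg_fuel)
print(f"daily data:       r = {r_daily:.2f}, r^2 = {r_daily ** 2:.2f}")
print(f"monthly averages: r = {r_monthly:.2f}, r^2 = {r_monthly ** 2:.2f}")
```

Both sets of points come from the same underlying line; the monthly averages just have most of the day-to-day scatter averaged away, so their r2 comes out higher.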
"Lurking" variable:
has an important effect, but is not one of the variables studied.
Meatloaf shrinkage vs. placement in oven? (Whether or not a cooking thermometer was used had the greatest influence.)
Time sequence of observations is a common one. (Learning, tiring, aging)
The trouble with lurking variables is that by definition you don't know they're there. Look behind every tree.
Association does not imply causation
Manatees: (table of year, boat registrations, kills)
If you didn't know boat registrations, would you believe that "year" was the cause of "kills"?
(Are all boats actually registered? Possible lurking variable = unregistered boats.)
Direction? Rooster causes sun to rise by crowing?
Both variables "caused" by a lurking variable?
--Women with a history of heavy antibiotic use have higher rates of breast cancer.
--Baby rats whose mothers licked and groomed them more grew up to be more exploratory, social, less timid.
Cause? Effect? How to tell?
Establishing that x "causes" y is difficult.
Best: Do an experiment in which we change x and keep lurking variables under control. (Sec. 3.2) Rats.
Otherwise: Strong association. Consistent over many studies. Higher x --> stronger y. X precedes y in time. A plausible mechanism exists (parallel studies?)
Generalize rat grooming to humans?
E.g. hydrogenated oils --> heart disease? Homocysteines --> heart disease?
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Pick a digit (from 0,1,2,3,4,5,6,7,8,9). Write it down. Keep it for next time.
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
| Sievers home | Math151-Fall04/Dayf17.htm | 11:10am | 10/4/04 |