Math 151, Fall 2004, Monday, Day 17, Oct. 4.  After class: hit reload...

HW:  Reading:   Finish Sec. 2.3.  Sec. 2.4.  Skip 2.5.
Please read ahead: Ch. 3 Intro.  Ch. 3.1, (skip stratified, multistage pp.174-5, on first reading).
Hand in  Wednesday
(repeated from Day 16)
A.  Use ResidualsRSquared from the website or the lab to graph these data sets, along with a graph of the residuals.  Print the results, and describe the shape of the residuals (it may help to connect the dots with pencil, to see the pattern.) 
a) x 1 2 8 4 6 9
   y 1 3 6 6 7 5
b) x 1 2 7 4 6 9
   y 7 6 2 4 2 1
Moore p. 122, 2.36 speed&gas again a, b, c, d.   There is a data file for problem 2.36, and its third column is the residuals (check them against the book). 
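If you want to check what ResidualsRSquared or SPSS gives you, here is a minimal sketch in Python that fits the least-squares line to data set (a) and lists the residuals (observed y minus predicted y). The function names are my own, not anything from the website or SPSS.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys):
    """Residual = observed y minus the y predicted by the line."""
    slope, intercept = least_squares(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Data set (a) from problem A.
xs = [1, 2, 8, 4, 6, 9]
ys = [1, 3, 6, 6, 7, 5]

res = residuals(xs, ys)
# Print in order of x so the pattern in the residuals is easier to see.
for x, r in sorted(zip(xs, res)):
    print(f"x = {x}: residual = {r:+.2f}")
```

The residuals from a least-squares line always add up to zero (apart from rounding), which is a quick check on any residual column.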

B. Use the author's website, http://www.whfreeman.com/scc, ...Correlation/regression.  Make a cloud of data (about 15 points) and put in the regression line.  Play with an outlier: drag a point to the far left (or right) and drag it up and down.  Then try it with a point in the middle range of x's.  Write your answer: Where is it most influential?  Now add a bunch more points (50 is the max).  Play with an outlier again.  Does the outlier have more or less influence with a larger data set?
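A rough way to double-check what you see in the applet: the sketch below adds one extra point to a made-up cloud and reports how much the slope moves. The cloud (15 points exactly on y = 2 + 0.5x) and the helper names are assumptions for illustration only, not data from Moore.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical cloud: 15 points lying exactly on y = 2 + 0.5x.
cloud_x = list(range(1, 16))
cloud_y = [2 + 0.5 * x for x in cloud_x]


def slope_with_extra_point(x0, lift=10.0):
    """Slope after adding one point at x0, placed `lift` units above the original line."""
    y0 = 2 + 0.5 * x0 + lift
    slope, _ = fit_line(cloud_x + [x0], cloud_y + [y0])
    return slope


base_slope, _ = fit_line(cloud_x, cloud_y)
for x0 in (8, 30):
    new_slope = slope_with_extra_point(x0)
    print(f"extra point at x = {x0}: slope goes from {base_slope:.3f} to {new_slope:.3f}")
```

Try other values of x0 and lift, or a bigger cloud, and compare with what the applet shows.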

Moore p. 123, 2.38 Gesell first word-point in middle of x range. Get the data into SPSS, delete child 19, graph and get the regression line and r^2.  Use the formula on p. 117 and graph the line for the full data set by hand on your printout.  r^2 for the full data set is on p. 122.

Moore p. 122, 2.37 Calories (You saved these, I think--or, from Moore's files, in  TA02-04) Graph and get lines in SPSS with and without the outliers.  Graph the line for "without outliers" by hand on the printout for "with outliers" so you can compare them better.  Print one more graph (with outliers) and keep it for problem C below.
Sec. 2.4
p. 132  2.54 Dow average/stocks
p. 138 2.63 math&verbal r, states/individuals
C.  Look again at p. 122, 2.37(calories).   These values are averaged values, over a bunch of people's guesses.  What would the graph look like if all the individuals' separate guesses had been graphed?  Add points to your graph to give the idea.

p. 133  2.55 tv watching & grades 
 2.56 economists&pay 
 2.64 herbal tea

Income depends on height?! Read the article and answer this.
If your browser doesn't get the link, it's at http://aurora.wells.edu/~srs/Math151-Fall04/tallpeoplewin.htm 
  a) What is "$789", and what kind of analysis did they do?
  b) What does my footnote at the end tell you about the data that the article did not?

Read, to discuss 

Sec. 2.4
p.136 2.57 firefighters, 
   2.58 self-esteem
p. 138 2.61 shoe size/reading
2.66 Education/income

Optional 
SPSS will make residuals:  See Day 16
 

Sec. 2.4
p. 137 2.59 size of hospital

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Least squares criterion, residuals: Day 16

Heard on NPR driving home Friday:  The World Bank says:  For every $5 increase in the price of a barrel of oil, the world economic growth rate drops  3/10 of 1%.  What kind of analysis did they do?  They have restated what statistical thing?
This comes from a regression line, and is a restatement of the slope.  "Rise" = -0.3% (negative because it drops), "Run" = $5.
The slope in the regression line = Rise/Run = -0.3%/$5 = -0.06% per dollar.
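The same rise-over-run arithmetic as a short Python sketch. The function name predicted_change is my own, and using the slope for other price increases assumes the straight-line relationship holds over that range.

```python
# World Bank statement: a $5 rise in the oil price goes with a 0.3% drop in growth rate.
rise_pct = -0.3      # change in growth rate, in percent (negative because it drops)
run_dollars = 5.0    # change in price per barrel, in dollars

slope = rise_pct / run_dollars
print(f"slope = {slope:.2f}% per dollar")


def predicted_change(price_increase_dollars):
    """Predicted change in growth rate (in percent) for a given price increase."""
    return slope * price_increase_dollars


print(f"a $10 increase predicts a change of {predicted_change(10):.1f}%")
```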

Income depends on height?!
    What is "$789", and what kind of analysis did they do?

Cautions  Sec. 2.4
Plot the data: Summary formulas and numbers don't tell the whole story.  (Anscombe's quartet, Moore p.127, 2.46-7) (Overhead slide.  Copy hanging on my door.  You can reconstruct these pictures using SPSS and Moore's problems, if you like)

Extrapolation-- extra (outside) polation (putting a point): Using the line to predict outside the range of x's you have data for.
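A minimal sketch of the idea in Python, using data set (b) from the homework: the prediction comes back with a flag saying whether the x you asked about is inside the range of the data. The function names are my own.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(xs, ys, x_new):
    """Return (predicted y, True if x_new lies inside the observed range of x)."""
    slope, intercept = fit_line(xs, ys)
    inside_range = min(xs) <= x_new <= max(xs)
    return intercept + slope * x_new, inside_range


# Data set (b) from problem A: the x's run from 1 to 9.
xs = [1, 2, 7, 4, 6, 9]
ys = [7, 6, 2, 4, 2, 1]

for x_new in (5, 20):
    y_hat, inside = predict(xs, ys, x_new)
    note = "within the data" if inside else "EXTRAPOLATION: outside the data"
    print(f"x = {x_new}: predicted y = {y_hat:.2f} ({note})")
```

The line will happily give a number for any x; the flag is a reminder that the data say nothing about whether the straight-line pattern continues out there.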

Averaged data will produce a stronger relationship (higher correlation, R^2) than the merged raw data from individuals, because the averaging hides much of the variability.  Heating-degree days graph (similar data in TA 2.1, p. 86, 107): each value represents a month's average temperature and average fuel use.  If we graphed the daily temperature and fuel use, we would see a lot more scatter.
(Overhead slide.  Copy hanging on my door.  )
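A small simulation of this effect, as a sketch only. The numbers are made up (12 hypothetical "months" of 30 "days" each, with invented noise levels); they are not the heating-degree data from the book.

```python
import random
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation r between two lists of equal length."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable

raw_x, raw_y = [], []   # one point per day
avg_x, avg_y = [], []   # one point per month (the monthly averages)

for month_mean in range(0, 60, 5):          # 12 hypothetical months
    days_x = [month_mean + rng.gauss(0, 5) for _ in range(30)]
    days_y = [0.5 * x + rng.gauss(0, 10) for x in days_x]
    raw_x += days_x
    raw_y += days_y
    avg_x.append(sum(days_x) / 30)
    avg_y.append(sum(days_y) / 30)

r_raw = correlation(raw_x, raw_y)
r_avg = correlation(avg_x, avg_y)
print(f"daily values:     r = {r_raw:.2f}, r^2 = {r_raw ** 2:.2f}")
print(f"monthly averages: r = {r_avg:.2f}, r^2 = {r_avg ** 2:.2f}")
```

Averaging 30 days shrinks the day-to-day noise a great deal but leaves the month-to-month trend alone, so the monthly points hug the line more closely.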


A "lurking" variable has an important effect but is not one of the variables studied.
    Meatloaf shrinkage vs. placement in oven?  (Whether or not a cooking thermometer was used had the greatest influence.)
    The time sequence of observations is a common one.  (Learning, tiring, aging)
    The trouble with lurking variables is that by definition you don't know they're there.  Look behind every tree.

Association does not imply causation
    Manatees: year, boat registrations, kills

            If you didn't know boat registrations, would you believe that "year" was the cause of "kills"?
                (Are all boats actually registered?  Possible lurking variable= unregistered boats.)
Direction?  Rooster causes sun to rise by crowing?
Both variables "caused" by a lurking variable?
--Women with a history of heavy antibiotic use have higher rates of breast cancer.
--Baby rats whose mothers licked and groomed them more   grew up to be more exploratory, social, less timid.
            Cause? Effect?  How to tell?
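The "both variables caused by a lurking variable" case can be simulated. This is a sketch with invented numbers: a hidden z drives both x and y, and x has no effect on y at all. The names and noise levels are assumptions for illustration.

```python
import random
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation r between two lists of equal length."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


rng = random.Random(1)  # fixed seed so the run is repeatable

zs, xs, ys = [], [], []
for _ in range(5000):
    z = rng.gauss(0, 1)            # the lurking variable
    zs.append(z)
    xs.append(z + rng.gauss(0, 0.5))   # x depends on z only
    ys.append(z + rng.gauss(0, 0.5))   # y depends on z only, never on x

r_all = correlation(xs, ys)

# Now look only at cases where z is nearly the same (z held roughly fixed).
held = [(x, y) for z, x, y in zip(zs, xs, ys) if abs(z) < 0.1]
r_held = correlation([x for x, _ in held], [y for _, y in held])

print(f"all cases:             r = {r_all:.2f}")
print(f"z held nearly fixed:   r = {r_held:.2f}  ({len(held)} cases)")
```

In real data you usually cannot hold the lurking variable fixed, because you do not know what it is. That is what an experiment is for.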

Establishing that x "causes" y is difficult:
    Best: Do an experiment in which we change x, keep lurking variables under control. (Sec. 3.2)  Rats.
    Otherwise: Strong association.  Consistent over many studies.  Higher x --> stronger y.  x precedes y in time.  A plausible mechanism exists (parallel studies?)
                Generalize rat grooming to humans?

         E.g. hydrogenated oils --> heart disease?  Homocysteines --> heart disease?
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Pick a digit (from 0,1,2,3,4,5,6,7,8,9).  Write it down.   Keep it for next time.
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =


Sievers home   Math151-Fall04/Dayf17.htm  11:10am 10/4/04
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.