Math 151, Fall '08. Mon., Day 17, Oct. 6. After class. Hit reload...

Reading: Reread and finish Ch. 5 (continuing regression, p. 126-137). Next, read Ch. 7, summary. (Skip Ch. 6.)

Hand in the rest of Regression. Bring questions for the exam.

Residuals
p. 129, 5.7 (SPSS) Does fast driving waste fuel? (Residuals.) There is a data file for problem 5.7, and its third column is the residuals. Do all the parts.
Also with 5.7: in SPSS, make a variable containing the residuals (Handout, bottom of p. 4; also middle-bottom of this page). The values should match the ones in the book/SPSS file.

SPSS Handout p. 3 (Governors' salaries): You can now finish #12, the last question. Hand it all in next time.

p.133, 5.9 Farm population Do a, b, c (read p. 132 for a good word to use in part c).  Also, make a variable containing the residuals, and plot it against the x (year) values.  Draw (in pencil) a horizontal line at height 0.  What pattern do you see in the residuals?

B. Use Residuals07.xls (Excel 07) or Residuals.xls (older Excel) from the website or the lab to graph these data sets, along with a graph of the residuals. Print the results, and describe the shape of the residuals. (It may help to connect the dots with pencil, to see the pattern.)
a) x 1 2 8 4 6 9
   y 1 3 6 6 7 5
b) x 1 2 7 4 6 9
   y 7 6 2 4 2 1
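
For checking the Excel graphs by hand, here is a minimal Python sketch that fits the least-squares line to data sets (a) and (b) above and lists the residuals in order of x. The function names are made up for this illustration and are not part of the Excel file or the SPSS handout; the formulas are the usual least-squares ones.

```python
# Sketch: least-squares line and residuals for data sets (a) and (b).
# Helper names are illustrative only; they do not come from the handout.

def least_squares(x, y):
    """Return (a, b) for the least-squares line yhat = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


def residuals(x, y):
    """Residual = observed y minus predicted y, one per data point."""
    a, b = least_squares(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


data_sets = {
    "a": ([1, 2, 8, 4, 6, 9], [1, 3, 6, 6, 7, 5]),
    "b": ([1, 2, 7, 4, 6, 9], [7, 6, 2, 4, 2, 1]),
}

for label, (x, y) in data_sets.items():
    a, b = least_squares(x, y)
    print(f"({label}) yhat = {a:.3f} + {b:.3f} x")
    # List residuals in order of increasing x, as in a residual plot.
    for xi, ri in sorted(zip(x, residuals(x, y))):
        print(f"    x = {xi}: residual = {ri:+.3f}")
```

Reading the residuals from smallest x to largest x gives the same pattern you should see when you connect the dots on the printed residual plot.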

p. 179, 7.28, 29, 30 (SPSS) Soap in the shower. Also, look carefully at the graph and guess why there is no data after day 21. (Read p. 132 for the word to describe using the line for day 30, and a discussion of the issue.)
p. 136, 5.13 Hospitals: big = bad?

Read, to discuss
A. Look at this, especially with reference to the r standard deviations in y for every 1 standard deviation in x. Open the Excel file: using Excel 2007 (in the labs)? RegressionSlope07. Using an older Excel? RegressionSlope (or look in the folder RegressionDemosExcel for D&V in ClassMaterial\Math151 D&V). Change x-y values in the yellow boxes and watch the line change. Change x-values in col. F and watch the "run" (red line) change in the rightmost 2 graphs. Notice that the slope = the coefficient of x = the rise/run = the increase in y per unit increase in x. Fix it so the increase in x (the "run") is exactly 1. Also, look at the leftmost graph, where the lengths of the standard deviations are shown, and note that in standard-deviation units, the rise is r s.d.'s in y for each s.d. of run in x.
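
The same idea can be checked numerically. The sketch below is only an illustration (it is not the Excel demo, and the helper names are invented here): it uses data set (a) from the residuals exercise, computes the slope from the summary numbers as r times s_y / s_x, and then refits the line after converting x and y to standard-deviation units.

```python
from statistics import mean, stdev

# Data set (a) from the residuals exercise; any x-y data would do.
x = [1, 2, 8, 4, 6, 9]
y = [1, 3, 6, 6, 7, 5]


def ls_slope(x, y):
    """Least-squares slope computed directly from the data."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def correlation(x, y):
    """r = sum of products of standardized values, divided by n - 1."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    sx, sy = stdev(x), stdev(y)
    products = (((xi - xbar) / sx) * ((yi - ybar) / sy) for xi, yi in zip(x, y))
    return sum(products) / (n - 1)


def standardize(values):
    """Convert values to standard-deviation units (z-scores)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


r = correlation(x, y)
slope_from_summary = r * stdev(y) / stdev(x)   # b = r * s_y / s_x
slope_in_sd_units = ls_slope(standardize(x), standardize(y))

print(f"r                         = {r:.4f}")
print(f"slope from r, s_x, s_y    = {slope_from_summary:.4f}")
print(f"slope fitted directly     = {ls_slope(x, y):.4f}")
print(f"slope in s.d. units       = {slope_in_sd_units:.4f}")
```

The second and third printed values should agree, and the last one should match r: one s.d. of run in x goes with r s.d.'s of rise in y.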


B. Use the applet at http://www.whfreeman.com/BPS4e (Correlation/regression). Make a cloud of data (about 15 points) and put in the regression line. Play with an outlier: drag a point to the far left (or right) and drag it up and down. Then try it with a point in the middle range of x's. (Drag it up and down.) Answer: Where is it most influential? Now add a bunch more points (50 is max.) and play with an outlier again. Does the outlier have more or less influence with a larger data set?
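
If the applet is not available, the experiment can be imitated in a few lines of Python. This is a rough sketch, not the applet: the "cloud" is made-up data (points spread evenly over x from 0 to 10, scattered 1 unit above and below the line y = x), and the function names, the outlier positions (x = 5 and x = 20), and the 10-unit lift are all arbitrary choices for illustration.

```python
# Sketch: how much does one added point change the least-squares slope?
# All data here are invented for illustration.

def slope(points):
    """Least-squares slope for a list of (x, y) points."""
    n = len(points)
    xbar = sum(px for px, _ in points) / n
    ybar = sum(py for _, py in points) / n
    sxy = sum((px - xbar) * (py - ybar) for px, py in points)
    sxx = sum((px - xbar) ** 2 for px, _ in points)
    return sxy / sxx


def cloud(n):
    """n made-up points over 0 <= x <= 10, alternately 1 above and 1 below y = x."""
    pts = []
    for i in range(n):
        px = 10 * i / (n - 1)
        py = px + (1 if i % 2 == 0 else -1)
        pts.append((px, py))
    return pts


def slope_change(n, x_out, lift=10):
    """Change in slope when one point is added at x_out, `lift` units above y = x."""
    base = cloud(n)
    with_outlier = base + [(x_out, x_out + lift)]
    return slope(with_outlier) - slope(base)


for n in (15, 50):
    for x_out in (5, 20):   # 5 = middle of the x's, 20 = far right
        change = slope_change(n, x_out)
        print(f"n = {n:2d}, extra point at x = {x_out:2d}: slope changes by {change:+.4f}")
```

Compare the four printed changes against what you saw when dragging points in the applet. Try a negative `lift` to imitate dragging the point down.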

p. 136,  5.12 lurking variables

Optional 
p. 179, 7.27 (review Normal)



p. 136, 5.11, lurking variables 

Exam 2 is this Friday: Day 19 (Oct. 10), the day before break. Let me know right away if you can't take the exam Friday. It starts with Ch. 3 (Normal distribution, tables) and goes through Ch. 4 and what we cover of Ch. 5 (& 7) through today. (All questions on the sample exam will be covered.) Sample exam (handout), solutions (link) (NOW works), Normal probability practice. As of the end of today, you can do all of them.
One sheet of notes. I will give you paper copies of the Normal table.

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
HW questions?  Day 16
 
  Formula of line from summary numbers Day 16

Continuing with regression:
Least Squares property, Residuals & Residual plots, Influential points  Day 16
Cautions  Day 16
We finished the material on Day 16: you can do all the HW above. On Wednesday the first thing will be answering any questions for the exam. If there is any more time, we'll continue:

Association does not imply causation
Strong association/correlation between A and B could be:
     A causes B / B causes A / C causes both A and B (lurking C) / just chance that they go together in this data set.
Direction?  Rooster causes sun to rise by crowing?
Both variables "caused" by a lurking variable?   Lurking variable can be part of the cause.
--Women with a history of heavy antibiotic use have higher rates of breast cancer.
--Baby rats whose mothers licked and groomed them more   grew up to be more exploratory, social, less timid.
            Cause? Effect?  How to tell?

Establishing that x "causes" y is difficult:
    Best: Do an experiment in which we change x, keep lurking variables under control. (Ch. 9  Rats. )
    Otherwise: Strong association. Consistent over many studies. Higher x-->stronger y.  X precedes y in time.  A plausible mechanism exists (parallel studies?)
                Generalize rat grooming to humans?

         E.g., partially hydrogenated oils ("trans fats") --> heart disease?  Homocysteine --> heart disease?

- - - - - - - - - - - - - - -
Regression Extras:

r^2 again: The line formula yhat = a + bx gives our best prediction or estimate of a response (y) value for a particular value of the explanatory (x) variable. It says NOTHING about how good that "best" is; that is, it says nothing about how tight or scattered the data are around the line. r^2 does that job.
    r^2 is the square of the correlation coefficient r! (The -/+ sign gets lost.)
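
A small numerical check, sketched in Python with data sets (a) and (b) from the residuals exercise: the function below (its name is invented here) computes r, and separately the fraction of the variation in y that the line accounts for, 1 - SSE/SST, where SSE is the sum of squared residuals and SST is the sum of squared deviations of y from its mean. For a least-squares line these two routes should give the same number.

```python
from math import sqrt


def fit_summary(x, y):
    """Return (r, fraction of y-variation explained by the least-squares line)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)          # SST
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    return r, 1 - sse / syy


data_sets = {
    "a": ([1, 2, 8, 4, 6, 9], [1, 3, 6, 6, 7, 5]),
    "b": ([1, 2, 7, 4, 6, 9], [7, 6, 2, 4, 2, 1]),
}

for label, (x, y) in data_sets.items():
    r, explained = fit_summary(x, y)
    print(f"({label}) r = {r:+.4f}   r^2 = {r * r:.4f}   1 - SSE/SST = {explained:.4f}")
```

Data set (b) slopes downward, so its r is negative, but its r^2 is positive like any square: this is the sign that "gets lost."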
Extrapolation again: extra (outside) + polation (putting a point). It means using the line to predict outside the range of x's you have data for. Linear relationships don't go on forever; a straight line is often a first approximation to a more complicated relationship.

Government projections of national budget surplus/deficit:  (www.cbo.gov publications>search)
    Budget extrapolations



Sievers home   Math151-Fall08/Dayf17.htm  8pm 10/7/08
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.