Math 151, Spring 2004, Wednesday Day 14, March 3

HW assignment Day 14
Reading:  (Re)read Moore 2.3 through p. 114.  (Read ahead: Moore p. 108, pp. 116-119, pp. 119-122. Then sec. 2.4. We'll skip Moore 2.5 and start Ch. 3 next.)
Regression prep.  (copied from Day 13) Hand in Friday
Review of straight lines: 
p. 124, 2.39, 2.40. Most people did fine on lines on the pretest. If these are a problem, ask someone NOW! Any MathClinic assistant can help with these.  Also, Just the Basics (on reserve) covers it.

A. Open the Excel file RegressionSlope (or in the folder RegressionDemos in ClassMaterial\Math151).  Change x-y values in the yellow boxes and watch the line change.  Change x-values in col. F and watch the "run" (red line) change. Notice the slope = the coefficient of x = the rise/run = increase in y per unit increase in x.  Fix it so the increase in x (the "run") is exactly 1.  Print the page to hand in.

B. Practice fitting lines:  Use the text website ("Do this", bottom of Day 13) and try to fit at least 4 different data sets. Write down on your paper what you discovered. (Were your judgment errors consistent in any way? Did you have any surprises?)

Moore p. 111, 2.31 (acid rain). No data, therefore no SPSS (draw the line by hand).


Hand in Friday (except as noted): regression with SPSS  (copied from Day 13)
 C. Use the SPSS Scatterplot handout and graph the regression line for govsal on avgpay (as shown, back page), and also the lines for the 4 separate groups (either on one graph or in panels). Print them out and keep them.  Start answering questions 6-11 on p. 3 of the handout.  Keep them till you can answer all the questions.

 Moore p. 111, 2.32 (Manatees), all parts. Import the dataset into SPSS (Class Materials\Math151). In
 SPSS, print the plain graph and one with the regression line. Draw the regression line BY HAND as
 best you can on the plain graph, then check it against the other one. For part b, pencil in the new points on the
 graph with the printed line. Find the mean by hand (calculator)...
  p. 126, 2.44 p. 129, 2.48 Sarah grows.... Use SPSS for parts a and b, calculator for the rest. 

 D. For the data of Moore, p. 103, 2.22 (metabolism): in SPSS, print out a graph with the regression line
 for all the people, and another with 2 separate lines (M and F). Use the equations to calculate the
 predicted metabolic rate for
      a) a person of mass 45 kg. 
      b) a female of mass 45 kg. 
      c) a male of mass 45 kg. 
  Now use the "up and over" method of Fig. 2.10 p. 107, with a pencil and straightedge to mark the 
  predicted values on the y-scale. Write down your computed answers next to them.  Make sure the
 two  methods give consistent answers. 


HW on the 4 "facts":   Work on these; keep them till we finish the 4 "facts".
p. 114, 2.33 (prof. swims): two lines, x->y and y->x. Also, make both graphs in SPSS, each with its regression line.  Use SPSS to find the means for time and pulse, and draw (by hand is OK) the xbar and ybar lines on each graph.  Note that the regression lines won't coincide if you flip one graph.

p. 111, 2.30 heating degree days,  checking formulas on p. 109. Import the dataset
 into SPSS. Use SPSS to get the formula in part a (again), and the mean, s.d., and correl. coeff. in part b.  Then use your calculator to calculate the slope and intercept.  Compare with SPSS's. 

p. 116, 2.35  beavers (prop. explained). Do parts a and b in SPSS; part c is just to answer. Note: the Text and Excel files are put in order, so they look different. Also, the Text file is MISSING the 23rd point, (5, 56).  You can just type it in.

p. 128, 2.47  Julie's grade (Not SPSS, just calculator) 
p. 129, 2.51 "regression"  (Not SPSS, just calculator)   Hint below*

E.  Use the Excel RSquared page (R-Squared, or the R-squared tab in ResidualsRSquared.xls: ClassMaterial\Math151\RegressionDemos). Shift points around and get an r^2 close to .8 (80%). (Between .75 and .85 is good enough.)  Note that if r = +.9, then r^2 = .81.   Now shift the points so that r is negative and r^2 is close to .8.  Print the resulting page to hand in (data and graph).


*Hint:  ybar = 46.6 + .41xbar [why?].  Let c be the amount Octavio's final is predicted to exceed the mean.
Then (ybar +c) = 46.6 + .41(xbar + 10) [why?].  Use the two equations and solve for c.  If your algebra skills are not strong enough, don't get upset; this calculation is not central to the course.  Read the answer!
= = = = = = = = = = = = = = = = = = = = = =
HW questions?   educ-v-mortality.sav  
- - - - - - - - - - -
Regression line (Section 2.3): predicts or estimates a y (vertical) value for a given x (horizontal) value. Straight line!
     "Regressing y ON x".
SPSS--back of handout.  Govsal on avgpay

Formula yhat = a + b x.    Govsal = a + b avgpay
         To predict or estimate a y-value for a given x-value, plug the x value into the formula and calculate.
                To do it graphically, use the Up-and-Over method (Fig. 2.10, p.107):
                    Find the x, go straight up to the line, then go over to the y-axis; that y-value is the predicted y.

a is the y-intercept. b is the slope: if x increases one unit, yhat increases b units.
(In a straight-line relationship, the amount that y increases for a one-unit increase in x is the same no matter what value of x you start with.)  RegressionSlope.xls or in ClassMaterial\Math151\RegressionDemos
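
To see the "plug in and calculate" step and the meaning of the slope in numbers, here is a small Python sketch. The values of a and b are made up for illustration (they are NOT the govsal/avgpay coefficients), and the function name predict is just a label chosen here.

```python
# Hypothetical coefficients, chosen only for illustration.
a = 10.0   # y-intercept: the predicted y when x = 0
b = 0.5    # slope: change in predicted y per one-unit increase in x

def predict(x):
    """Return the predicted value yhat = a + b*x."""
    return a + b * x

# Plug in a few x-values and calculate yhat.
for x in (20, 21, 50, 51):
    print("x =", x, " yhat =", predict(x))

# A one-unit step in x changes yhat by b, wherever we start.
step_low = predict(21) - predict(20)
step_high = predict(51) - predict(50)
print("step near x = 20:", step_low, " step near x = 50:", step_high)
```

Both steps come out equal to b, which is the same thing the RegressionSlope demo shows when the "run" is set to exactly 1.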

We all get the same line from a batch of data because we use the "least-squares best fit" criterion (pp. 107-8): we'll investigate this more closely later.
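
As a preview of the least-squares idea, the sketch below fits a line to a tiny made-up data set (not from Moore) and then checks that nudging the intercept or the slope makes the sum of squared vertical errors larger. The function names least_squares and sum_sq_errors are labels chosen here, and this is only one way to set up the calculation.

```python
from statistics import mean

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def least_squares(xs, ys):
    """Return (a, b) for the least-squares line yhat = a + b*x."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

def sum_sq_errors(a, b, xs, ys):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

a, b = least_squares(xs, ys)
best = sum_sq_errors(a, b, xs, ys)
print("least-squares line: yhat =", a, "+", b, "x   sum of squares:", best)

# Try four nearby lines: each one has a larger sum of squared errors.
nudges = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.2), (0.0, -0.2)]
others = [sum_sq_errors(a + da, b + db, xs, ys) for da, db in nudges]
for (da, db), total in zip(nudges, others):
    print("intercept", a + da, "slope", b + db, "sum of squares:", total)
```

Because the criterion picks out one smallest value, everyone who applies it to the same data ends up with the same a and b.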

Facts:  1, 3 first.  Then 2. Thru again.   Then 4.

Facts (Moore pp. 112-14)

  1. The Regression line is trying to predict the "average y" for a given x (with the added requirement that it is a straight line).  See lines for govsal on avgpay.

  2. Unless the data lies perfectly on a straight line, the line for predicting weight from height -- "regressing weight on height" --(for example) will NOT be the same line as that for predicting height from weight--"regressing height on weight".  (In-class demonstration) (The picture on p.113 is about this. )
     
  3. A change of one standard deviation in x corresponds to a change of r standard deviations in y, along the regression line.

  4.  The slope b expresses change in y-units per x-unit. (Suppose x is in inches and y is in pounds. Then b is in pounds per inch.) You can find b by multiplying r by the standard deviation of the y's (that's in pounds) and dividing by the standard deviation of the x's (that's in inches).
    In "algebra", b = r times (s.d. of y)/(s.d. of x)  (Equation p. 109)
           If we standardize both the x-values and the y-values, the slope will just equal r!  (govsalstd.sav, govsalstd.spo)

  5. The regression line goes through the point given by the two means, (xbar, ybar). http://www.whfreeman.com/scc

  6. --If you know this, you know ybar = a + b (xbar).  You can solve this for a: a = ybar - b (xbar). (Other equation, p. 109)
    --So knowing 2 and 3 gives you the equation of the line from the means, s.d.'s, and r.
    --And if you draw the two lines, y on x and x on y, they will intersect at (xbar, ybar).
     
  7. r^2 ("Coefficient of Determination") = proportion of variability in the y-values explained/predicted by knowing x and using the least-squares regression line.  (Exactly what that means mathematically is hard.  Just get used to it as a measurement.) More: R-Squared (or the R-squared tab in ResidualsRSquared.xls: ClassMaterial\Math151\RegressionDemos)

  8. r^2 is the square of the correlation coefficient r!  (The sign, + or -, gets lost.)
    If r = .7, about half (.49) of the variability in the y's is explained by using the regression-line relationship to predict y from x. (If weight and height have a correlation of .7, then about half of the variability in weight can be explained by knowing height.)
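
The facts above can be checked numerically. The Python sketch below uses made-up heights and weights (NOT data from Moore), and the function names correlation and regression_line are labels chosen here. For r^2 it uses one common way of making "proportion of variability explained" precise: 1 minus (leftover squared error around the line) divided by (total squared variation of the y's around ybar).

```python
from statistics import mean, stdev

# Made-up heights (inches) and weights (pounds), for illustration only.
heights = [60, 62, 65, 68, 70, 72]
weights = [115, 120, 140, 150, 175, 170]

def correlation(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    xbar, ybar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    products = [((x - xbar) / sx) * ((y - ybar) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

def regression_line(xs, ys):
    """Line for predicting ys from xs: b = r*(s.d. of y)/(s.d. of x), a = ybar - b*xbar."""
    b = correlation(xs, ys) * stdev(ys) / stdev(xs)
    a = mean(ys) - b * mean(xs)
    return a, b

r = correlation(heights, weights)
xbar, ybar = mean(heights), mean(weights)

a, b = regression_line(heights, weights)     # weight on height
a2, b2 = regression_line(weights, heights)   # height on weight

# Both lines go through the point of means (xbar, ybar).
print("weight-on-height line at xbar:", a + b * xbar, " ybar:", ybar)
print("height-on-weight line at ybar:", a2 + b2 * ybar, " xbar:", xbar)

# Drawn on the same axes (height across, weight up), the second line has
# slope 1/b2, which equals b only if the points lie exactly on a line.
print("slopes on the same axes:", b, "vs", 1 / b2)

# r^2 as proportion of variability explained: 1 - leftover/total.
leftover = sum((y - (a + b * x)) ** 2 for x, y in zip(heights, weights))
total = sum((y - ybar) ** 2 for y in weights)
explained = 1 - leftover / total
print("r^2:", r ** 2, " explained:", explained)

# Standardize both variables: the slope of the regression line becomes r.
zx = [(x - xbar) / stdev(heights) for x in heights]
zy = [(y - ybar) / stdev(weights) for y in weights]
az, bz = regression_line(zx, zy)
print("r:", r, " slope for standardized data:", bz)
```

The two slopes printed on the same axes differ here because these points do not lie exactly on a line; their product b times b2 equals r^2, so they agree only when r is +1 or -1.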
Sievers home  Math151-Sp04/Days14.htm  2pm 3/2/04
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.