Math 151, Spring 2002, Monday Day 16, March 4.  Hit reload to get the most current version.

Does anyone have the "Living Scatterplots" (height vs. weight, height vs. gpa) handed around several classes ago?  I'd like them back.
Friday we worked on using Normal Tables.
  If you weren't there and want more practice, get handout, see me or Math Clinic.

HW questions?
SPSS--regression lines for multiple groups: (9RMW-2--Arby's sandwiches)
  When making the Interactive scatterplot, put in the Legend (grouping) variable, choose the Fit tab, and select Regression.
    At the bottom is Fit lines for Total, Subgroups. (Total is the default.)  Choose Subgroups also.
    The labels will come out all on top of one another.  Click and drag them to better locations.
      To alter a line, double-click it to get the Regression Parameters menu.  (Options allows changing line styles.)
(Noninteractive graphs:  In Chart Editor: Chart>Options: Fit Line, checkmark Total & Subgroups.  Read equations for lines from (dotless) interactive graph.)
- - - - - - - - - - -
Regression line: Moore Section 2.3, ACT Ch. 9.  Predicts or estimates a y (vertical) value for a given x (horizontal) value.  "Regressing y on x".
    Formula: yhat = a + b x.     a is the y-intercept. b is the slope:  If x increases one unit, yhat increases b units.
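To make the formula concrete, here is a minimal sketch of using a regression line to predict. The intercept and slope values are made up for illustration (they do not come from any class dataset), and `predict` is just a name chosen here.

```python
def predict(a, b, x):
    """Return yhat = a + b*x for intercept a and slope b."""
    return a + b * x

# Hypothetical line for predicting weight (pounds) from height (inches).
a = -200.0   # y-intercept (made-up value)
b = 5.0      # slope, in pounds per inch (made-up value)

yhat_68 = predict(a, b, 68)
yhat_69 = predict(a, b, 69)

print(yhat_68)             # predicted weight at 68 inches
print(yhat_69 - yhat_68)   # one more inch changes the prediction by b
```

The second printed value is the slope itself: moving one unit in x moves yhat by b units, whatever x you start from.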
Facts (Moore pp. 112-14)

  1. The Regression line is trying to predict the "average y" for a given x (with the added requirement that it is a straight line).

  2. Unless the data lies perfectly on a straight line, the line for predicting weight from height -- "regressing weight on height" --(for example) will NOT be the same line as that for predicting height from weight--"regressing height on weight".  (In-class demonstration) (The picture on p.113 is about this. )
     
  3. A change of one standard deviation in x corresponds to a change of r standard deviations in y, along the regression line.

  4.  The slope b expresses change in y-units per x-unit. (Suppose x is inches, y is pounds. Then b is in pounds per inch.) You can find b by multiplying r by the standard deviation of the y's (that's in pounds) and dividing by the standard deviation of the x's (that's in inches).
    In "algebra", b = r times (s.d. of y)/(s.d. of x)  (Equation p. 104)
           If we standardize both the x-values and the y-values, the slope will just equal r!
     
  5. The regression line goes through the point given by the two means, (xbar, ybar).  (http://www.whfreeman.com/bps)

  6. --If you know this, you know ybar = a + b (xbar).  You can solve this for a: a = ybar - b (xbar). (Other equation, p. 104)
    --So the slope formula (fact 4) and the point of means (fact 5) give you the equation of the line from the means, s.d.'s, and r.
    --And if you draw the two lines, y on x and x on y, they will intersect at (xbar, ybar).
     
  7. r^2 ("Coefficient of Determination") = proportion of variability in the y-values explained/predicted by knowing x and using the least squares regression line.  (Exactly what that means mathematically is hard.  Just get used to it as a measurement.)  More: R-Squared  (or the R-squared tab in ResidualsRSquared.xls: ClassMaterial\Math151\RegressionDemos)

  8. r^2 is the square of the correlation coefficient r!  (The -, + sign gets lost.)
    If r = .7, about half (.49) of the variability in the y's is explained by using the regression line relationship to predict y from x. (If weight and height have a correlation of .7, then about half of the variability in weight can be explained by knowing height.)
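The facts above can be checked numerically. The sketch below is only an illustration: the heights and weights are made up (not the class data), and the helper names `corr` and `line_from_summary` are chosen here. It builds the line from the means, s.d.'s, and r, compares that slope with the least-squares slope computed directly, confirms both lines pass through (xbar, ybar), and computes r^2 as 1 - (leftover variability)/(total variability).

```python
import math
from statistics import mean, stdev

# Made-up heights (inches) and weights (pounds), for illustration only.
x = [62, 64, 66, 68, 70, 72, 74]
y = [120, 136, 142, 155, 160, 178, 185]

def corr(u, v):
    """Correlation coefficient r of two equal-length lists."""
    ubar, vbar = mean(u), mean(v)
    num = sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))
    den = math.sqrt(sum((ui - ubar) ** 2 for ui in u)
                    * sum((vi - vbar) ** 2 for vi in v))
    return num / den

def line_from_summary(u, v):
    """Intercept and slope for regressing v on u, from means, s.d.'s, and r."""
    slope = corr(u, v) * stdev(v) / stdev(u)      # b = r * (s.d. of v)/(s.d. of u)
    intercept = mean(v) - slope * mean(u)         # a = vbar - b * ubar
    return intercept, slope

xbar, ybar = mean(x), mean(y)
r = corr(x, y)

a, b = line_from_summary(x, y)     # weight on height: yhat = a + b*x
a2, b2 = line_from_summary(y, x)   # height on weight: xhat = a2 + b2*y

# Least-squares slope computed directly from the data, for comparison with b.
b_ls = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
        / sum((xi - xbar) ** 2 for xi in x))

# Both lines pass through the point of means.
check_1 = a + b * xbar     # should match ybar
check_2 = a2 + b2 * ybar   # should match xbar

# On the same axes (x horizontal), the second line has slope 1/b2, not b.
slope_of_second_line = 1 / b2

# r^2 as the proportion of variability in y explained by the line.
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))   # leftover
sst = sum((yi - ybar) ** 2 for yi in y)                       # total
r_squared = 1 - sse / sst

print("r =", r, " r^2 =", r_squared)
print("slope from r, s.d.'s:", b, " least-squares slope:", b_ls)
print("slopes of the two lines on the same axes:", b, slope_of_second_line)
```

Because these made-up points do not lie exactly on a straight line, the two printed slopes on the last line differ, which is fact 2; the product b * b2 comes out equal to r^2.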


PreClass assignment Day 16 for Day 17
Next time I'll discuss the least squares criterion (Moore, p. 108, ACT p. 9-3, 1st 2 activities),
Residuals (Moore, p. 116-119, ACT p. 9-3, 1st activity), and
Influential Observations and Outliers (Moore, p. 119-122).
I hope to start Moore sec. 2.4: Extrapolation, Averaged data, Lurking variables, Association is not Causation.

HW assignment Day 16, Monday March 4,
ACT: from Activstats Homework; Moore: from The Basic Practice of Statistics
Reading:  Reread Moore 2.3 thru p. 114.  (Read ahead, see above.  We'll skip Moore 2.5, start Ch. 3 =ACT10 next.)
Hand in 
Multiple groups
ACT Ch9 RMW-2--Arby's sandwiches
Repeat class work: Get total and subgroup lines and formulas, and arrange so they're legible.  (If you do the noninteractive graph, you won't get a line for the 2 turkeys--there is only one possible line and that is thru both points.  Pencil it in.  Write the other formulas on the graph.)

For the data of Moore, p. 103, 2.22 (metabolism), print out a graph with the regression line for all the people, and another with 2 separate lines (M and F: Fit line: Subgroups). Use the equations to predict the metabolic rate for
    a) a person of mass 45 kg.
    b) a female of mass 45 kg.
    c) a male of mass 45 kg.  
Now use the "up and over" method of Fig. 2.10, p. 107, with a pencil and straightedge to mark the predicted values on the y-scale.  Write down your computed answers next to them.  Make sure the two methods give consistent answers.
- - - - - - - - - - - - - - - 
With the 4 "facts":  From Moore 
p. 114, 2.33  prof. swims--two lines x->y, y->x.  Also, make both graphs in SPSS, each with its regression line.  Use SPSS to find the means for time and pulse, and draw the xbar, ybar lines on each graph. Note the lines won't coincide if you flip one graph.

p. 111, 2.30 heating degree days,  checking formulas on p. 104. Import the dataset into SPSS. Use SPSS to get the formula in part a (again), and the mean, s.d., and correl. coeff. in part b.  Then use your calculator to calculate the slope and intercept.  Compare with SPSS's.

p. 116, 2.35  beavers (prop. explained.) Do parts a and b on SPSS, c is just to answer.

p. 128, 2.47  Julie's grade (Not SPSS, just calculator) 
p. 129, 2.51 "regression"  (Not SPSS, just calculator)   Hint below*

A.  Look at your lines from the Arby's sandwiches above.  Note the R-Squared for the line thru the two turkey sandwiches = 1.  Why does this make sense? (What proportion of variability in calories is explained or predicted by the line,  for turkey sandwiches?)
B.  Use the Excel RSquared page. (See r^2: More R-Squared above for link or file.) Shift points around and get an r^2 close to .8 (80%). (Between .75 and .85 is good enough.)  Note that if r = +.9, then r^2 = .81.  Now shift the points so that r is negative and r^2 is close to .8.  Print the resulting page to hand in. (Data and graph)
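If you want to see the "sign gets lost" idea in numbers before moving points around in Excel, here is a rough sketch. The points are made up for illustration (they are not the Excel page's data), and `corr` is a helper written here. Mirroring the points top to bottom reverses the sign of r but leaves r^2 unchanged.

```python
import math

def corr(u, v):
    """Correlation coefficient r of two equal-length lists."""
    ubar = sum(u) / len(u)
    vbar = sum(v) / len(v)
    num = sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))
    den = math.sqrt(sum((ui - ubar) ** 2 for ui in u)
                    * sum((vi - vbar) ** 2 for vi in v))
    return num / den

# Made-up points with an upward trend.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y_up = [2.0, 3.5, 3.0, 5.5, 5.0, 7.5, 6.5, 9.0]

# Mirror the same points top to bottom, giving a downward trend.
y_down = [-yi for yi in y_up]

r_up = corr(x, y_up)
r_down = corr(x, y_down)

print("upward:  r =", round(r_up, 3), " r^2 =", round(r_up ** 2, 3))
print("downward: r =", round(r_down, 3), " r^2 =", round(r_down ** 2, 3))
```

The two printed r values differ only in sign, and the two r^2 values agree.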

Read Optional 
 
 
 

 

*Hint:  ybar = 46.6 + .41xbar [why?].  Let c be the amount Octavio's final is predicted to exceed the mean.
Then (ybar +c) = 46.6 + .41(xbar + 10) [why?].  Use the two equations and solve for c.  If your algebra skills are not strong enough, don't get upset; this calculation is not central to the course.  Read the answer!


Sievers home  Math151-Sp02/Day16.htm  8pm 3/3/02
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.