Math 151, Fall 2004, Wednesday, Sept. 22, Day 12
Hit reload after class.

-Exam 1:  Next class, Friday, Sept. 24, in class, closed book.  Bring a simple calculator. I'll give you copies of the Normal table.
Covers all of CHAPTER 1, plus 2.1: making (by hand) and examining scatterplots.  You will need to read SPSS output, but not tell how to produce any. You will need to calculate a standard deviation "by hand" for four numbers (as well as medians, quartiles, etc.).  Problems like the HW, plus some true-false or multiple-choice types.
Need extra time or a special time?  Tell me today! (If you need extra time and can stay past 10:30, you don't need to tell me.)
    You may take it in Macmillan out of the classroom but must tell me where you're going.

LaReina will be in the Math Clinic tonight at 7, special, to do exam review.
-------------------------------------------------------------------------------
HW Day 12:  (Re)read 2.2 (correlation). You do not have to be able to calculate r by hand.  You should be able to guess roughly at an r for a swarm of data, as on p. 101, Fig. 2.9, and know and be able to use facts 1 thru 7, p. 100.  You should also be able to find r using SPSS.  Start Moore 2.3, pp. 106-112, then continue onward in 2.3.  Review straight lines: graphing, "slope".
Hand in Wednesday:
**A. Go to the text website, http://www.whfreeman.com/scc (or http://bcs.whfreeman.com/bps3e/), and play with the Correlation/Regression Applet.  Create a data set of around 10-15 points with r close to -.65.  Add the Mean X & Mean Y lines, and make a sketch of your result on your paper to hand in. (Or you can print it out like this: press Print Screen while holding down the Alt key.  This puts the image of the active window on the Clipboard.  Open Word and do Edit>Paste.  Then you can print the Word document.)

 **Using SPSS to find correl. coeff.  (Back page of Scatterplot handout: Analyze>Correlate>Bivariate.  This isn't hard.)
Hand in the scatterplots, write the correlation values, other info on your printout.
**B. Use the file educ-v-mortality.sav.  This is median education level and mortality rate for 60 American cities.  Make a scatterplot showing mortality on the y (vertical) axis vs. education on the x axis, with the two outliers (lower left) labeled with their cities.  Find r for the data with the outliers, then delete the two outliers (see the note on deleting a case, below) and find r again.  Write the two r's on your printed graph.
**p. 106, 2.28 (SPSS) speed, gas (real)
**p. 103, 2.23 (SPSS) calories
To delete a case: click on the gray case number.  The whole row should show black (selected), except for the first column.  Then the Delete key deletes it. (Edit has an undo.) Save both data files, the original and the one with deletions, to your disk.

Sec. 2.2 Correlation (no SPSS). **Read these over, as prep.
p. 102 2.18  thinking about correlation.
2.19 men two years older
2.20 r =0, strong assoc. (By hand is fine) graph the data (speed on the x-axis). Draw a horizontal line at the mean of the y's (26.8 MPG) and a vertical line at the mean of the x's (40 mph).  For each data point, draw a dotted line from the point horizontally to the 40 mph line, and another line vertically to the 26.8MPG line.  Use this picture to explain as best you can why the correlation is 0. (Think about each point's contribution to r, as in the lecture.)
p. 105 2.26  newspaper
p. 157 2.90 education/age

Read, to discuss 
Moore p. 99: Use the data of 2.17. You graphed this by hand for Sec. 2.1.  Guess what r is; look in the back of the book to see how close you got.
p. 106 2.29 blunders

C.  Many communities find a strong positive correlation between the amount of ice cream sold in a given month and the number of drownings that occur in that month.  Does this mean that ice cream causes drowning?  If not, can you think of an alternative explanation for the strong association?

D. Explain why one would expect to find a positive correlation between the number of fire engines that respond to a fire and the amount of damage done in the fire.  Does this mean that the damage would be less extensive if only fewer fire engines were dispatched?  Explain. 

Regression prep.   Hand in Wednesday
Review of straight lines if needed:
**p. 124, 2.39, 2.40. Most people did fine on lines on the pretest. If these are a problem, ask someone NOW! Any Math Clinic assistant can help with these.  Also, Just the Basics (on reserve) covers it.

**??A. Open the Excel file RegressionSlope (or in the folder RegressionDemos in ClassMaterial\Math151).  Change x-y values in the yellow boxes and watch the line change.  Change x-values in col. F and watch the "run" (red line) change. Notice the slope = the coefficient of x = the rise/run = increase in y per unit increase in x.  Fix it so the increase in x (the "run") is exactly 1.  Print the page to hand in.

**??B. Practice fitting lines:  Use the text website ("Do this" below) and try to fit at least 4 different data sets. Write down on your paper what you discovered (were your judgment errors consistent in any ways--did you have any surprises?) 

Moore p. 111, 2.31 acid rain No data, therefore no SPSS (draw the line by hand)
There may be more HW...

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -  - - -
Exam Questions??  Took whole class!
Monday we'll finish correlation 2.2 and start Regression.  Nothing to hand in, but I suggest working ahead, to even out the workload.  Problems with ** should be accessible now, or with a little reading.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -  - - -
START HERE Monday
Homework questions? 2.13 corn plant density: "Curve" can predict/estimate a yield for a given planting density.

Correlation: Day 11
--You won't have to calculate a correlation coefficient by hand. This formula is a bad one for hand computation (roundoff error); if you must do one by hand, find the computational formula in an old textbook.
--Eyeballing:  sketch xbar and ybar lines, see how much data is in + quadrants, how much in - quadrants.
--Strength of correlation says NOTHING about causality!  Strong correlation could be:
         A causes B/  B causes A/ C causes both A and B/ just chance that they go together in this data set.
Using SPSS to find correl. coeff.  (Back page of Scatterplot handout: Analyze>Correlate>Bivariate)
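
If you want to see the eyeballing idea in numbers, here is a small Python sketch. It assumes the textbook formula is the "standardized products" one (r = sum of z_x times z_y, divided by n - 1). The heights and weights are made up for illustration, and the function names are mine, not SPSS's. Points in the + quadrants (both above or both below their means) give positive products; points in the - quadrants give negative ones.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    # Standard deviation with the n - 1 divisor.
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    # r = (sum of z_x * z_y over all points) / (n - 1)
    xbar, ybar = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    products = [((x - xbar) / sx) * ((y - ybar) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

def quadrant_counts(xs, ys):
    # Count points in the "+" quadrants and in the "-" quadrants,
    # relative to the xbar and ybar lines. Points on a mean line count in neither.
    xbar, ybar = mean(xs), mean(ys)
    plus = sum(1 for x, y in zip(xs, ys) if (x - xbar) * (y - ybar) > 0)
    minus = sum(1 for x, y in zip(xs, ys) if (x - xbar) * (y - ybar) < 0)
    return plus, minus

# Hypothetical data (inches, pounds), invented for this sketch.
heights = [60, 62, 64, 66, 68, 70, 72]
weights = [115, 120, 135, 130, 150, 160, 158]

print("(+ count, - count):", quadrant_counts(heights, weights))
print("r =", round(correlation(heights, weights), 3))
```

Try changing a few of the weights so that more points land in the - quadrants and watch r drop.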

Graphing Straight lines? p. 124, 2.39, 2.40

Regression line (Section 2.3): Predicts or estimates a y (vertical) value for a given x (horizontal) value.  Straight line!
    Formula yhat = a + b x.
         To predict a y-value for a given x-value, plug the x value into the formula and calculate.
                To do it graphically, use the Up-and-Over method (Fig. 2.10, p.107):
                    Find the x, go straight up to the line, then go over to the y-axis; that y-value is the predicted y.

        a is y-intercept. b  is slope (b multiplies x, the horizontal value):  If x increases one unit, yhat increases b units.
    Demo file: RegressionSlope.xls (or find it in ClassMaterial\Math151\RegressionDemos)
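
The "plug in x" step can be sketched in a few lines of Python. The intercept and slope below are invented numbers (pounds and pounds per inch), not taken from the book or from any class data set.

```python
def predict(a, b, x):
    # Regression prediction: yhat = a + b * x
    return a + b * x

# Hypothetical line: intercept a = -100 pounds, slope b = 3.5 pounds per inch.
a, b = -100.0, 3.5

for height in (64, 65):
    print(height, "inches ->", predict(a, b, height), "pounds (predicted)")

# Increasing x by one unit changes yhat by exactly b units.
difference = predict(a, b, 65) - predict(a, b, 64)
print("change in yhat for one more inch:", difference)
```

This is the arithmetic version of the Up-and-Over method: the function does the "up to the line, then over" step for you.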

We all get the same line from a batch of data because we use the "least-squares best fit" criterion (pp. 107-8): we'll investigate this more closely later.
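
A rough Python sketch of what the criterion measures, using made-up data: for any candidate line, add up the squared vertical misses. The least-squares line is the one with the smallest total. The slope formula in the code is one standard way to get that line; the function names are just for this sketch.

```python
def sum_squared_errors(a, b, xs, ys):
    # Total of squared vertical distances from each point to the line yhat = a + b*x.
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def least_squares_line(xs, ys):
    # Standard least-squares formulas for the intercept a and slope b.
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

# Hypothetical data, invented for this sketch.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
best = sum_squared_errors(a, b, xs, ys)
print("least-squares line: a =", a, " b =", b, " total squared error =", best)

# Nudge the line a little: the total squared error only gets worse.
for da, db in [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.2), (0.0, -0.2)]:
    print("shifted by", (da, db), "->", sum_squared_errors(a + da, b + db, xs, ys))
```

The total squared error plays the same role as the green bar in the applet: smaller is better, and only one line gets it as small as possible.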

Do this: Practice fitting "least squares best fit" lines at the author's website, http://www.whfreeman.com/scc.  (Click Netscape toolbars to minimize them, if needed.  If line drawing doesn't work, try the newer version at http://bcs.whfreeman.com/bps3e/ .)
  Choose "Statistical Applets", then Correlation/Regression.  Check the "Show least-squares line" box and put in some data points.   Check the "Show Mean X & Mean Y lines" box; see if #5 below holds.  Repeat for a few data sets.
--Try fitting the line yourself:  (Uncheck the "Show ..." boxes.) Put in some data points.  Now click Draw Line.  Click and drag in the picture and you'll get a line with 3 blobs. Drag the center and the line will go up and down; drag an end and the slope will change. Put the line in the best place for predicting y's from x's.  If you do well by the "least squares" criterion, the green bar up top will shrink close to 0 (but you have to be really good.  Dumb.)   Check the "Show Mean X & Mean Y lines" box; adjust your line.  Check the "Show least-squares line" box and see how you did.


Facts (Moore pp. 112-14)

  1. The Regression line is trying to predict the "average y" for a given x (with the added requirement that it is a straight line).

  2. Unless the data lies perfectly on a straight line, the line for predicting weight from height -- "regressing weight on height" --(for example) will NOT be the same line as that for predicting height from weight--"regressing height on weight".  (In-class demonstration)(The picture on p.113 is about this. )
     
  3. A change of one standard deviation in x corresponds to a change of r standard deviations in y, along the regression line.

  4.  The slope b expresses change in y-units per x-unit. (Suppose x is inches, y is pounds. Then b is in pounds per inch.) You can find b by multiplying r by the standard deviation of the y's (that's in pounds)  and dividing by the standard deviation of the x's (that's in inches)
    In "algebra", b = r times (s.d. of y)/(s.d. of x)  (Equation p. 109)
           If we standardize both the x-values and the y-values, the slope will just = r !
     
  5. The regression line goes through the point given by the two means, (xbar, ybar).

  6. --If you know this, you know ybar = a + b (xbar).  You can solve this for a: a = ybar - b (xbar). (Other equation, p. 109)
    --So knowing 4 and 5 gives you the equation of the line from the means, s.d.'s, and r.
    --And if you draw the two lines, y on x and x on y, they will intersect at (xbar, ybar)
     
  7. r^2 ("Coefficient of Determination") = Proportion of variability in y-values explained/predicted by knowing x and using the least squares regression line.  (Exactly what that means mathematically is hard.  Just get used to it as a measurement.)

  8. If r = .7, about half (.49) of the variability in the y's is explained by using the regression line relationship to predict y from x. (If weight and height have a correlation of .7, then about half of the variability in weight can be explained by knowing height.)
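
Several of these facts can be checked numerically. The Python sketch below uses invented heights and weights and helper names of my own choosing. It builds the line from the means, s.d.'s, and r, then checks that it goes through (xbar, ybar), that one s.d. in x moves yhat by r s.d.'s of y, that the two regression lines differ, and that r^2 matches the proportion of variability explained (taken here to mean 1 minus leftover squared error over total squared variation in y).

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    xbar, ybar, sx, sy = mean(xs), mean(ys), sample_sd(xs), sample_sd(ys)
    return sum(((x - xbar) / sx) * ((y - ybar) / sy)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

def regression_line(xs, ys):
    # Slope b = r * s_y / s_x, intercept a = ybar - b * xbar.
    b = correlation(xs, ys) * sample_sd(ys) / sample_sd(xs)
    a = mean(ys) - b * mean(xs)
    return a, b

def proportion_explained(xs, ys):
    # 1 - (squared error left over after using the line) / (total squared variation in y)
    a, b = regression_line(xs, ys)
    ybar = mean(ys)
    leftover = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - ybar) ** 2 for y in ys)
    return 1 - leftover / total

# Hypothetical data (inches, pounds), invented for this sketch.
heights = [60, 62, 64, 66, 68, 70, 72]
weights = [115, 120, 135, 130, 150, 160, 158]

r = correlation(heights, weights)
a, b = regression_line(heights, weights)   # weight on height
c, d = regression_line(weights, heights)   # height on weight

# The line passes through (xbar, ybar).
print("yhat at xbar:", a + b * mean(heights), " ybar:", mean(weights))

# One s.d. step in x moves yhat by r s.d.'s of y.
sd_step = b * sample_sd(heights) / sample_sd(weights)
print("s.d.'s of y per s.d. of x:", sd_step, " r:", r)

# Drawn on the same axes, the height-on-weight line has slope 1/d, not b.
print("slope of weight-on-height line:", b, " other line on same axes:", 1 / d)

# r squared equals the proportion of variability explained by the line.
print("r^2:", r ** 2, " proportion explained:", proportion_explained(heights, weights))
```

The two slopes in the third check agree only when the data lie perfectly on a line (r = 1 or r = -1).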

Sievers home  Math151-Fall04/Dayf12.htm  11:30pm 9/21/04 
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.