Math 151, Day 12, Wed., Sept. 24, 2008. Hit Reload... After class.

HW Day 12: Read Ch. 4 (Scatterplots and correlation) to p. 99. Check p. 105: 4.12, 13, 14. Read ahead pp. 99-105 (correlation). Check 4.14 thru 4.20. You do not have to be able to calculate r by hand. You should be able to guess roughly at an r for a swarm of data, as on p. 102, e.g. 4.6, and know and be able to use facts 1-4, p. 101, and cautions 1-4, p. 103.
Please also read Ch. 5, Regression, thru p. 125 (check p. 137: 5.14 through 20, basic line and regression line facts and tools; 21, r and slope; 22 is harder--changing units--don't worry about it; 23, if you sketch the graph and draw a line thru the points, you should be able to guesstimate the slope well enough to choose among the 3 answers). Ahead: continuing regression, pp. 126-137.

Hand In Fri. ..
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Scatterplots using SPSS. Get Scatterplot handout, outside my door, or link. Please email me with any  SPSS difficulties or discoveries!
---From now on, make all scatterplots on SPSS! Don't forget to check Measure, and to add Labels. (Trouble printing? Try copy/paste into Word, then print from Word. If you print from Word on a computer without SPSS, symbols may look funny. That's OK.)
Governors' Salaries HW, accompanying the Scatterplot Handout and govsal_vs_pay.sav data file. Use SPSS and answer questions 1-5. Do these questions on a separate page, and keep it till we have finished all 12 questions!

p. 96, 4.4 and 4.5 (SPSS) bird colonies (Save your file; you'll use it again for 4.10)
p.96, 4.6 (SPSS) gas mileage
p. 98, 4.7 (SPSS) icicle growth. Data is in table 4.2. Be sure to write on your graph which group is slow water and which is fast.
p. 109, 4.25 a, c (not b) (SPSS) running records, M/F. These are record-breaking times, so a year without a number is one in which the best time was slower than the last record. (Keep a copy of the graph, to use in the next section's hw.)
A. Added at end of class:
Use   educ-v-mortality.sav  (in SPSS for class BPS folder). Identify the two outlier cities at left, and speculate as to why they are different from the pack of data, having very low mortality rates compared with the "typical" for their education level. Ask others, till you get a satisfying answer.

- - - - - - - - - - - - -
Postpone Correlation, but you can get the SPSS output now if you want.
Correlation (thinking):
p. 112, 4.36 and 4.37 Applet explorations
p. 112, 4.34 and 4.35 correlation meaning

4.26 date heights again  You graphed this by hand.  r = .5653. Now answer the questions in the text.

p. 109 4.25 b  running records again.  It's a little complicated in SPSS to get the r's for the separate groups, so get them by looking at the answers in the back of the book.  Answer the question.

A.  If women always married men who were exactly  two years older than themselves, what would be the correlation between the ages of husband and wife? (Hint: make  a data table and the corresponding scatterplot for 4 or 5 couples with different x's, and look at it.)

Correlation (computing & thinking)
Governors' Salaries HW:
Do problem 6.  Keep this with the previous work.

p. 104, 4.11 (SPSS) gas, speed: association but 0 correlation.  Find the means and draw the mean lines on your graph (by hand) to help explain the 0 correlation.
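
A minimal sketch of why the mean lines help, using made-up numbers (not the textbook's data): when a curved pattern is symmetric about the mean of x, the products of deviations on either side of the mean lines cancel, so r comes out 0 even though x and y are clearly related. The names `xs`, `ys`, and `products` are illustrative only.

```python
from statistics import mean

# Made-up symmetric "up then down" data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 6, 5, 2]

x_bar, y_bar = mean(xs), mean(ys)   # these are the two mean lines

# Product of deviations for each point: positive in the upper-right and
# lower-left quadrants, negative in the other two.
products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]
cross_sum = sum(products)           # numerator of r (before scaling)

print("mean lines at x =", x_bar, "and y =", y_bar)
print("products of deviations:", products)
print("sum of products:", cross_sum)
```

The positive and negative products balance exactly here, so the sum (and therefore r) is 0.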

p. 104, 4.10 (SPSS) bird colonies again.  To add a data pair in SPSS just type them in a new row at the bottom.  To delete, click on the case number, which highlights the whole row, hit delete.

(This problem looks forward to Ch. 5, sort of.)
 p. 110, 4.28 corn plant density. (SPSS) Notice how the data is entered for SPSS--not as displayed here, but with the first column giving Plants per acre and the second giving Yield. Make a scatterplot. Use your calculator to find the mean yield for each planting rate, and write these on your paper. (Or you can find means for the separate groups in SPSS: in Explore, move Plants to the Factor list.) Graph the means by hand with a pencil on your printed plot, and connect the means dots.
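
If you want to check your calculator work, here is a rough sketch of the same "mean for each group" idea. The rows below are hypothetical numbers in the two-column layout described above (plants per acre, yield); they are not the textbook's data, and the variable names are made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (plants per acre, yield). NOT the textbook's numbers.
rows = [
    (12000, 150.0), (12000, 140.0),
    (16000, 160.0), (16000, 170.0),
    (20000, 155.0), (20000, 165.0),
]

# Collect the yields that share the same planting rate.
yields_by_density = defaultdict(list)
for plants, yld in rows:
    yields_by_density[plants].append(yld)

# One mean yield per planting rate: these are the dots you would connect.
group_means = {plants: mean(ys) for plants, ys in sorted(yields_by_density.items())}
for plants, m in group_means.items():
    print(plants, m)
```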
Read, to discuss 
 
Correlation:
p. 112, 4.33  Do a rough sketch for yourself.

Look at all the graphs you make, and guesstimate the correlation coefficient (before you read or calculate it.)

 

 

Optional 
Do now (for Ch. 5) if you need the practice:
Straight line graphing practice:
A.  y = -10 + 3x, graph for 2<x<10.
B.  y = 500 - 20x, graph for 0<x<10.
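
A small sketch for checking your graphs: two points determine a straight line, so it is enough to compute y at the two ends of the x range and connect them. The helper name `line_points` is made up for this example.

```python
def line_points(intercept, slope, x_lo, x_hi):
    """Return the (x, y) points at the two ends of the x range for y = intercept + slope * x."""
    return [(x, intercept + slope * x) for x in (x_lo, x_hi)]

points_a = line_points(-10, 3, 2, 10)    # y = -10 + 3x
points_b = line_points(500, -20, 0, 10)  # y = 500 - 20x

print(points_a)
print(points_b)
```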

Correlation: Use http://www.whfreeman.com/bps4e (see below for details) to make different scatterplot patterns, and observe their r's.

4.28, I said to draw the line by hand.
SPSS can plot the line connecting means on your graph: In the Chart Editor, do Elements>Interpolation Line. If it doesn't look right, in the Properties window, Interpolation Line tab, choose Line Type: Straight.

Exams still not finished!  still sorry.
Leftover:
One of the locomotive problems had z = 4.5, off the end of the table! What happens further out in the normal tails? The area is almost (but not quite) 0. (Handout last time: pp. 80-81, 3.11 and 3.12, locomotive adhesion, 2 dist's. Handout error in graph label: female "tail" is .0119, not .0019.)
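
To see how small "almost 0" is, here is a quick sketch that computes the area to the right of z under the standard normal curve, using the complementary error function from Python's math module. The function name `upper_tail` is made up for this example.

```python
import math

def upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

p = upper_tail(4.5)
print(f"area to the right of z = 4.5: {p:.2e}")
```

The result is a few parts in a million: far too small to show in a four-decimal table, but not exactly 0.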
HW Questions?  backward problems?  Going from area to x: Day 11,   Recap Day 8,   Normal probability practice 
= = = = = = = = = = = = = = = = = = = =

Relationships: (BPS4e Ch.4, at first to p. 98)  
Two Related quantitative variables  (We used side by side stemplots, boxplots, histograms to relate a quantitative variable to a categorical variable)
    "Just Related" or "explanatory & response?"
(Scatterplots)
explanatory = independent = "x" = horizontal axis ( = "cause", sometimes but not always) = predictOR
  response =    dependent = "y" = vertical axis   ( = "effect")                          = predictED

(Living histograms:  Height vs. weight, Height vs. gpa)

Discussing Scatterplot
General Pattern                                      Deviations
Clusters?                                                      Outliers? (label if possible)
Form (linear, curved, ...?)
    Strength of relationship (how unfuzzy)  "Weak, moderate, strong"
Direction
    Positively associated:  y increases as x increases (generally).
    Negatively associated:  y decreases as x increases.

Mark subgroups differently to do comparisons. (Subgroups defined by categorical variable, like Sex, Region of country)

Get the SPSS Scatterplot handout and the Governors' Salaries HW sheet (link, or outside my door, if you missed class). (BPS Ch. 4&5)
SPSS:   Graphs>Legacy Dialogs>Scatter/Dot > Simple Scatterplot.  Move variables from the lefthand  list to the X-axis (horizontal)  and Y-axis (vertical) boxes. See Handout for more.  Files from text? Don't forget to check Measure, and to add Labels.

  Some scatterplot data:  educ-v-mortality.sav  ,   Studat-in-SPSS.sav .   govsal_vs_pay.sav  is the file used for the handout.
(BPS Ch. 4&5) 


Start here Friday
Correlation:
(pp. 98-105)  The (Pearson) correlation coefficient r is a numerical measure for how strongly linear (and in what direction) the relationship is.  Doesn't substitute  for a scatterplot.
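
You won't need to do this by hand, but a rough sketch of what the software is computing may help: standardize each x and each y (subtract the mean, divide by the standard deviation), multiply the pairs, add them up, and divide by n - 1. The data below are made up for illustration, and `pearson_r` is just a name chosen for this example.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)   # sample standard deviations (n - 1 divisor)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)

# Made-up heights (inches) and weights (pounds), for illustration only.
heights = [60, 62, 65, 68, 70]
weights = [115, 120, 140, 155, 160]

r = pearson_r(heights, weights)
print(round(r, 3))
```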
Use if data is:  2 quantitative variables, & "nice":
    One cluster/cloud/band.
   Pretty straight.
   Outlier(s)? Do with/without & be cautious.
Correlation experiments:
  Website, http://www.whfreeman.com/bps4e, "Statistical Applets", Correlation/Regression. Play with data points, observing the Correlation Coefficient. Check the "Show Mean X & Mean Y lines" box. See how much data is in each quadrant. Compare with the correlation coefficient.

Using SPSS (p.4 top, Scatterplot handout) Analyze>Correlate>Bivariate

Properties (p. 101) and cautions (p. 103):

Properties:
  1. Measures relationship--same whichever variable is on the x-axis
  2. "Unitless"--original measurement units (cm., inches) are "standardized out"
  3. Sign of correlation coefficient matches direction of relationship. + positive, - negative.
  4. Between -1 and +1.   0: no linear relationship,   +1 or -1: perfect straight line.
Cautions:
  1. Between two quantitative variables only!
  2. Does NOT give info about curved relationships (only measures linear part of relationship).
  3. NOT resistant to outliers--quite sensitive.
  4. Not a complete summary, even for nice linear data.  Need means, s.d.'s too.
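
A few of these facts can be checked numerically. The sketch below uses made-up data and a hand-rolled helper (`pearson_r`, a name chosen for this example) to show that r is the same when x and y are swapped, the same when inches are converted to cm, and can change a lot when a single outlier is added.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Made-up heights (inches) and weights (pounds), for illustration only.
x_in = [60, 62, 65, 68, 70]
y = [115, 120, 140, 155, 160]

r = pearson_r(x_in, y)
r_swapped = pearson_r(y, x_in)             # swap which variable is "x"
x_cm = [2.54 * v for v in x_in]            # change units: inches -> cm
r_cm = pearson_r(x_cm, y)
r_outlier = pearson_r(x_in + [72], y + [100])   # add one made-up outlier

print("original:", round(r, 3))
print("swapped: ", round(r_swapped, 3))
print("in cm:   ", round(r_cm, 3))
print("with one outlier:", round(r_outlier, 3))
```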
[Figure: correlation graph]


--You won't have to calculate a correlation coefficient by hand. This formula is a bad one for hand computation (roundoff error); if you must do one by hand, find the computational formula in an old textbook.
--Eyeballing:  sketch xbar and ybar lines, see how much data is in + quadrants, how much in - quadrants.
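
The eyeballing idea can be sketched as a count: a point is in a "+" quadrant when its deviations from xbar and ybar have the same sign, and in a "-" quadrant when they have opposite signs. The data are made up, and `quadrant_counts` is a name chosen for this example.

```python
from statistics import mean

def quadrant_counts(xs, ys):
    """Count points in + quadrants, - quadrants, and exactly on a mean line."""
    x_bar, y_bar = mean(xs), mean(ys)
    plus = minus = on_line = 0
    for x, y in zip(xs, ys):
        product = (x - x_bar) * (y - y_bar)
        if product > 0:
            plus += 1       # upper right or lower left
        elif product < 0:
            minus += 1      # upper left or lower right
        else:
            on_line += 1    # sits on the xbar or ybar line
    return plus, minus, on_line

# Made-up data, for illustration only.
xs = [60, 62, 65, 68, 70]
ys = [115, 120, 140, 155, 160]
counts = quadrant_counts(xs, ys)
print(counts)
```

Mostly "+" points suggests a positive r, mostly "-" points a negative r, and a rough balance suggests r near 0. The counts ignore how far each point is from the mean lines, so this is only a guess at the sign and rough size.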

Strength of correlation says NOTHING about causality!  Strong correlation could be:
     A causes B/   B causes A/  C causes both A and B (lurking C)/  just Chance that they go together in this data set.


Sievers home  Math151-F08/Dayf12.htm  1pm  9/24/08
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.