HW Day 12: Read Ch. 4 (Scatterplots and correlation) to p. 104. Check p. 112: 4.14, 4.15.
Then pp. 104-112 (correlation). Check 4.16 thru 4.22.
You do not have to be able to calculate r by hand. You
should be able to guess roughly at an r for a swarm of data, as on
pp. 108-9, and know and be able to use Facts 1-4, p. 107, and
Cautions 1-4, pp. 108, 110.
Please also read ahead: Ch. 5, Regression, thru p. 135. Check, p. 137: 5.17 through 5.23, basic line and regression line facts and tools. (5.18: those are not very satisfactory answers, but you should be able to eliminate at least one.) 5.24: r and slope. 5.25: r^2 is the square of r. 5.26: Don't calculate! If you sketch the graph by hand and draw a line thru the points, you should be able to guesstimate the slope well enough to choose among the 3 answers. Then continuing regression, p. 126-147.
Hand In Next class; please
Read Chapter 4 and as far as you can stand in 5.
= ="Approximately" Normal = =
Postpone Ch. 4, but please read!
- - - -.. - - - - -
- - -
p. 115, 4.26 date heights again. You
graphed this by hand. r = .5653. Now answer the
questions in the text.
Correlation (computing & thinking)
p. 111, 4.13 (SPSS) gas, speed (made-up data): association but 0 correlation. Find the means and draw the mean lines on your graph (by hand) to help explain the 0 correlation.
p. 116, 4.28 (SPSS) Sparrowhawk colonies.
p. 110, 4.12 (SPSS or Applet) Lean vs metab. rate again (Women). To add a data pair in SPSS, just type the values in a new row at the bottom. To delete a case, click on the case number, which highlights the whole row, and hit Delete.
(This problem looks forward to Ch. 5, sort of)
p. 118, 4.32 corn plant density. (SPSS) Notice how the data is entered for SPSS: not as displayed here, but with the first column giving Plants per acre and the second giving Yield. Make a scatterplot. Use your calculator to find the mean yield for each planting density, and write these on your paper. (Or you can find means for the separate groups in SPSS: in Explore, move Plants to the Factor list.) Graph the means by hand with a pencil on your printed plot, and connect the means dots.
p. 119-20, 4.35 changing units. Do a rough sketch for yourself.
p. 126, 4.37 investing
Look at all the graphs you make, and guesstimate the correlation coefficient (before you read or calculate it.)
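Two of the ideas in the problems above (association but 0 correlation, as in 4.13, and changing units, as in 4.35) can be checked with a rough sketch like the one below. The numbers are made up for illustration and are NOT the textbook's data; `pearson_r` is a helper name invented here, computing r as the sum of products of z-scores divided by n - 1.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    xbar, ybar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - xbar) / sx) * ((y - ybar) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

# Made-up data: mileage rises, peaks, then falls as speed increases.
# Clear (curved) association, but no straight-line trend.
speed = [20, 30, 40, 50, 60]
mileage = [24, 28, 30, 28, 24]
r_curve = pearson_r(speed, mileage)

# Made-up heights in inches, then the same heights in centimeters.
x_in = [64, 65, 66, 68, 70]
y_in = [68, 66, 70, 71, 72]
x_cm = [2.54 * v for v in x_in]
y_cm = [2.54 * v for v in y_in]
r_inches = pearson_r(x_in, y_in)
r_cm = pearson_r(x_cm, y_cm)

print("curved data:", round(r_curve, 4))
print("inches:", round(r_inches, 4), " centimeters:", round(r_cm, 4))
```

For the curved data the products on the two sides of the mean speed cancel, so r comes out 0 (up to rounding). Changing units rescales x and y but leaves every z-score alone, so the two height correlations agree.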
Do now (for Ch. 5) if you need the practice:
Straight line graphing practice:
A. y = -10 + 3x, graph for 2<x<10.
B. y = 500 - 20x, graph for 0<x<10.
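If you want to check your hand graphs, a minimal sketch like this one computes the two endpoints of each line; plot them and connect with a ruler. The function name `line_points` is just made up for this illustration.

```python
def line_points(intercept, slope, xs):
    """Return (x, y) pairs on the line y = intercept + slope * x."""
    return [(x, intercept + slope * x) for x in xs]

# A. y = -10 + 3x, endpoints of the range x = 2 to x = 10
points_a = line_points(-10, 3, [2, 10])

# B. y = 500 - 20x, endpoints of the range x = 0 to x = 10
points_b = line_points(500, -20, [0, 10])

print("A:", points_a)  # [(2, -4), (10, 20)]
print("B:", points_b)  # [(0, 500), (10, 300)]
```

Two points are enough to draw a straight line. The slope tells you which way it tilts: A rises 3 for each step of 1 in x, B falls 20.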
More practice reading graph:
p. 114, 4.24 Masters scores
Use the Correlation/Regression applet (see below for details)
to make different scatterplot
patterns, and observe their r's.
p. 118, 4.32 corn yield: I said to draw the
line by hand, but SPSS can plot the line connecting
the means on your graph. In the Chart Editor, do
Elements>Interpolation Line. If it doesn't
look right, in the Properties window, Interpolation
Line tab, choose Line Type: Straight.
Exam 1 returned Comments Solutions
Sample exam solutions
I haven't been mentioning Science Colloquium,
every Friday 12:40-1:20, but the talks are often fascinating, and
often have Statistics in Action (unpredictably,
unfortunately). This spring, mostly student theses. Today:
"The Independence Option; Business Knowledge for Science Majors,"
Prof. Ellis. Please come!
Going from x to area (proportion), & backward--area to
x: Day 11,
Normal probability practice
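For checking table work, here is a rough sketch of both directions using Python's standard library. The mean of 100 and standard deviation of 15 are made-up values for illustration only, not from any problem in the text.

```python
from statistics import NormalDist

# ASSUMPTION: a hypothetical Normal distribution, mean 100, SD 15.
dist = NormalDist(mu=100, sigma=15)

# Forward: from x to area (proportion of values below x).
# x = 115 is one standard deviation above the mean.
area_below_115 = dist.cdf(115)

# Backward: from area to x (the value with 90% of the area below it).
x_at_90th = dist.inv_cdf(0.90)

print("area below 115:", round(area_below_115, 4))
print("x with area .90 below it:", round(x_at_90th, 2))
```

The forward direction is what you do when you standardize x and look up z in Table A; the backward direction is finding the area in the body of the table and reading off z, then unstandardizing.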
What happens further out
in the Normal tails? The area is almost (but not
quite) 0. It rounds to .0000.
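A small sketch of the tail areas, using the standard Normal distribution from Python's standard library. The cutoffs 4, 5, and 6 standard deviations are chosen just for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, SD 1

# Area below -4, -5, -6 (by symmetry, the same as the area above +4, +5, +6).
tails = {cut: z.cdf(-cut) for cut in (4, 5, 6)}

for cut, area in tails.items():
    # Show the area rounded to 4 places (as in a table) and in scientific notation.
    print(f"beyond {cut} SDs: {area:.4f}  ({area:.2e})")
```

Each area prints as 0.0000 to four decimal places, but the scientific-notation column shows that it never actually reaches 0.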
p. 90, 3.43: Difference in tails, M/F math. Other evidence relevant to the question: Across countries, the difference in math scores M/F is related to the level of gender equality in the country. The more equal the sexes are in general, the smaller the differential in math scores, and vice versa. (This would be good on a scatterplot, but I don't have the data in that form.) Evidence for nurture, not nature.
= = = = = = = = = = = = = = = = = = = =
Start here Monday
Relationships: (BPS5e Ch.4, at first to p. 104)
Two Related quantitative variables (We used side by side stemplots, boxplots, histograms to relate a quantitative variable to a categorical variable)
"Just Related" or "explanatory & response?"
explanatory = independent = "x" = horizontal axis (= "cause", sometimes but not always) = predictOR
response = dependent = "y" = vertical axis (= "effect") = predictED
(Living histograms: Height vs. weight, Height vs. gpa)
General Pattern Deviations
Clusters? Outliers? (label if possible)
Form (linear, curved, ...?)
Strength of relationship (how unfuzzy) "Weak, moderate, strong"
Positively associated: y increases as x increases (generally).
Negatively associated: y decreases as x increases.
Mark subgroups differently to do comparisons. (Subgroups defined by categorical variable, like Sex, Region of country)
Get SPSS Scatterplot handout (link) + Governors' Salaries HW sheet, or pick them up outside my door, if you missed class. (BPS Ch. 4&5)
SPSS: Graphs>Legacy Dialogs>Scatter/Dot > Simple Scatterplot. Move variables from the lefthand list to the X-axis (horizontal) and Y-axis (vertical) boxes. See Handout for more. Files from text? Don't forget to check Measure, and to add Labels.
Some scatterplot data: educ-v-mortality.sav. The file used for the handout is govsal_vs_pay.sav.
Correlation: (pp. 104-112) The (Pearson) correlation coefficient r is a numerical measure for how strongly linear (and in what direction) the relationship is. Doesn't substitute for a scatterplot.
Use if data is: 2 quantitative variables, & "nice":
Outlier(s)? Do with/without & be cautious.
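To see why "do with/without" matters, here is a minimal sketch that computes r twice, once with and once without a single outlying point. The data are made up for illustration, and `pearson_r` is a helper name invented here (sum of products of z-scores, divided by n - 1).

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    xbar, ybar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - xbar) / sx) * ((y - ybar) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

# Made-up data with a fairly strong positive linear pattern.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 5, 4, 6, 7]
r_without = pearson_r(xs, ys)

# Add one made-up outlier far to the right and low: (15, 1).
r_with = pearson_r(xs + [15], ys + [1])

print("without outlier:", round(r_without, 3))
print("with outlier:   ", round(r_with, 3))
```

One point is enough to pull r from strongly positive to negative here, which is why r doesn't substitute for looking at the scatterplot.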
Correlation experiments: Website, http://www.whfreeman.com/bps5e, "Statistical Applets", Correlation/Regression. Play with data points, observing the Correlation Coefficient. Check the "Show Mean X & Mean Y lines" box. See how much of the data is in each quadrant. Compare with the correlation coefficient.
Using SPSS (p. 4 top, Scatterplot Handout): Analyze>Correlate>Bivariate, move both variables across.
Properties (p. 107) and Cautions (p. 108,110):
Strength of correlation says NOTHING about causality!
Strong correlation could be:
--You won't have to calculate a correlation coefficient by hand. This formula is a bad one for hand computation (roundoff error); if you must do one by hand, find the computational formula in an old textbook.
--Eyeballing: sketch xbar and ybar lines, see how much data is in + quadrants, how much in - quadrants.
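The eyeballing idea can be sketched as a count: points above and to the right of both mean lines, or below and to the left of both, are in "+" quadrants (their products of deviations are positive); the others are in "-" quadrants. The data and the name `quadrant_counts` are made up for illustration.

```python
from statistics import mean

def quadrant_counts(xs, ys):
    """Count points in + quadrants and - quadrants around the mean lines."""
    xbar, ybar = mean(xs), mean(ys)
    plus = minus = 0
    for x, y in zip(xs, ys):
        product = (x - xbar) * (y - ybar)
        if product > 0:
            plus += 1       # both deviations have the same sign
        elif product < 0:
            minus += 1      # deviations have opposite signs
        # a point exactly on a mean line counts toward neither
    return plus, minus

# Made-up data with a positive association.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 5, 4, 6, 7]
counts = quadrant_counts(xs, ys)
print("+ quadrants:", counts[0], " - quadrants:", counts[1])  # 4 and 2
```

More data in the + quadrants goes with a positive r, more in the - quadrants with a negative r. The count is only a rough guide, since r also weighs how far each point is from the mean lines.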
**[In 1973] the following item appeared in Dear Abby's column:
Dear Abby: You wrote in your column that a woman is pregnant for 266 days. Who said so? I carried my baby for ten months and five days, and there is no doubt about it because I know the exact date my baby was conceived. My husband is in the Navy and it couldn't have possibly been conceived any other time because I saw him only once for an hour, and I didn't see him again until the day before the baby was born. I don't drink or run around, and there is no way this baby isn't his, so please print a retraction about that 266-day carrying time because otherwise I am in a lot of trouble.
San Diego Reader
Abby's answer was consoling and gracious but not very statistical:
Dear Reader: The average gestation period is 266 days. Some babies come early. Others come late. Yours was late.
The question here is not whether the baby was late. That fact is already known. At issue is the credibility of the length of the delay. Ten months and five days is approximately 310 days, which means that the pregnancy exceeded the norm by 44 days. [How unusual is that?]
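One rough way to answer "how unusual?" is a Normal calculation. The sketch below ASSUMES that pregnancy lengths are approximately Normal with a standard deviation of 16 days; the notes above give only the mean of 266 days, so treat the 16 as an outside assumption and the result as illustrative.

```python
from statistics import NormalDist

MEAN_DAYS = 266   # from Abby's column
SD_DAYS = 16      # ASSUMPTION: not stated in these notes

gestation = NormalDist(mu=MEAN_DAYS, sigma=SD_DAYS)

days = 310                                 # ten months and five days, approximately
z_score = (days - MEAN_DAYS) / SD_DAYS     # how many SDs above the mean
prob_this_long = 1 - gestation.cdf(days)   # area in the upper tail

print("z =", z_score)
print("proportion of pregnancies lasting 310 days or more:", round(prob_this_long, 4))
```

Under these assumptions the pregnancy is 2.75 standard deviations above the mean, and only about 3 pregnancies in 1000 would last that long or longer. Unusual, but not impossible.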