We'll start using SPSS next time, Monday: class will be in the computer lab.
If you're computer-phobic, coming into Mac 101 and trying some ActivStats SPSS
exercises ahead of time might help. No downside except for time lost.
ActivStats only shows how to handle raw data, and the Ch. 3 text homework
problems all involved pre-piled data, which is harder in SPSS.
Hand in Monday (copied from Day 3):
Ch3, p. 31, 2 cat. variables
17 Canadian languages
14 Cars (for f, do a segmented bar graph of the cond. dist's of part e, as part of your discussion.)
24 (a only) Obesity
26 Pet ownership (Also: what's most startling about these percents?)
20 Prisons (include one or more graphs)
Read, be able to discuss in class:
Ch3
25 Family planning
21 Working Parents (What's "wrong" with the graph in the back of the book?)
Optional: ActivStats lessons on SPSS, in Mac 102: AS pp. 1-2, 3-1, 3-2 are a gentle introduction (using raw data). 4-2, 4-3 do continuous data.
Ch.3 31 Simpson's paradox, UC
The rest of Day 4's original page is postponed!
Hand in:
Ch3, p. 31, 2 cat. variables
17 Canadian languages
14 Cars (for f, do a segmented bar graph of the cond. dist's of part e, as part of your discussion.)
24 (a only) Obesity
26 Pet ownership (Also: what's most startling about these percents?)
20 Prisons (include one or more graphs)
= = = = = = = = = = = =
Ch5, p. 72
Start these, keep for Day 6 assignmt.
#3, also make a boxplot. ("No calculator" means no statistical calculator)
15 Wines
16 Ozone
28 Population growth
p. 107 (review) 18 & 19 Old Faithful
- - - - - - - - - -
Mean/Median. Will be assigned when?
p. 72
7 a, b, c Payroll. Also, with c: What measure would be most useful if you wanted to use it to figure the total weekly payroll cost?
6 Sick days
+ + + + + + + + + +
Start now on a separate page; keep for the next assignments:
p. 72
19, 20 (no computations needed. 19 d may not be decidable from pictures. Don't worry about it.)
5 Mistake (You can do parts now)
9 Standard deviation. Tonight, make dot plots of each pair on axes with the same unit size, find the mean of each set and mark it with a little ^ (like fig. 5.6, p. 64). Notice this looks like a good balance point. Leave space to calculate some standard deviations next time.
Read, be able to discuss:
Ch3
25 Family planning
21 Working Parents (What's "wrong" with the graph in the back of the book?)
= = = = = = = = = = = =
Start these, keep for Day 6 assignmt.
25 Caffeine
41 Eye & Hair color
31 Reading scores (f is harder; optional)
- - - - - - - - - - -
http://www.whfreeman.com/scc or http://bcs.whfreeman.com/ips
Under Student Categories or Student tools, choose "Statistical Applets", Mean & Median. (50 points max.) Check out symmetric, skewed, distributions with outliers. How far apart can you get the mean and median?
13 Marriage age. Ithaca Journal Jan 22, '05 had quiz answers: "How old is the average bride? 24.5 years.... How old is the average groom? 26.5 years." Give some reasons that could account for the big difference between these numbers and the graphed numbers.
Optional: ActivStats lessons on SPSS, in Mac 102: AS pp. 1-2, 3-1, 3-2 are a gentle introduction (using raw data). 4-2, 4-3 do continuous data.
Categorical vs. Categorical (Color vs. Hand) Ch2, pp. 18-22
Day 3
From the New Yorker magazine, traditionally the most literary and error-free of all, Feb. 14/21, '05:
CORRECTION: The Mail of January 3rd contained the incorrect statistic that four-fifths of Bush voters identified moral values as the most important factor in their decision. In fact, four-fifths of those identifying moral values as the most important factor of their decision were Bush voters.
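The two statements in the correction are two different conditional percentages from the same two-way table. Here is a small Python sketch of that idea. The counts are made up for illustration (they are NOT real poll data); they were chosen only so that one of the two percentages comes out to four-fifths.

```python
# Hypothetical counts per 100 voters, invented for illustration only.
bush_moral = 16    # voted Bush, named moral values
bush_other = 35    # voted Bush, named something else
other_moral = 4    # did not vote Bush, named moral values
other_other = 45   # did not vote Bush, named something else

# Condition on the column: of those naming moral values, what percent voted Bush?
pct_bush_given_moral = 100 * bush_moral / (bush_moral + other_moral)

# Condition on the row: of Bush voters, what percent named moral values?
pct_moral_given_bush = 100 * bush_moral / (bush_moral + bush_other)

print(round(pct_bush_given_moral, 1))  # 80.0
print(round(pct_moral_given_bush, 1))  # 31.4
```

Same cell on top both times; only the total you divide by changes, and that changes the answer a lot.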
Start here Wednesday
Ch.5 Summarizing distribution info with numbers
Measures of middle (center)
--Colloquially "average" can refer to any measure of middle, so watch out; be more specific.
Mean (most common "average"): Take the sum of all observations & divide by how many (n). p. 63
(Midrange: Average the maximum & minimum values. Very sensitive to outliers.)
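A quick Python sketch of these two measures, using the eight observations 1 3 5 6 8 8 11 20 from the quartile example in these notes. The function names `mean` and `midrange` are just illustrative choices. Dropping the 20 shows how much a single large value moves each measure.

```python
def mean(values):
    """Sum of all observations divided by how many (n)."""
    return sum(values) / len(values)

def midrange(values):
    """Average of the maximum and minimum values."""
    return (min(values) + max(values)) / 2

data = [1, 3, 5, 6, 8, 8, 11, 20]
without_20 = data[:-1]  # same set with the largest value removed

print(mean(data), midrange(data))              # 7.75 10.5
print(mean(without_20), midrange(without_20))  # 6.0 6.0
```

One value out of eight moves the mean from 6 to 7.75 and the midrange from 6 to 10.5.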
Median: half are bigger, half are smaller.
Point on histogram with half the area to the left, half to the right.
Calculating: Put observations in numerical order (stemplot!).
Middle one if n is odd, or average the 2 middle if n is even.
Formula: Count in how far? (n+1)/2 places. (7 1/2 places? Go halfway = average the 7th and 8th observations. Book's method, p. 58, is more complicated, same result.)
Spread (dispersion)
Computation of quartiles: Different texts, packages use different methods.
By hand: quick and somewhat dirty:
Take the two halves of the data you got from finding the median. Find the median of each half, using the same rule as before. (Detail: IF you had an even number of observations to start with, the data divides evenly into an upper and a lower half. IF you had an odd number to start with, you have one in the middle, the median. In this case only, you use the median as part of both halves.)
1 3 5 6 8 8 11 20 are n=8 observations.
Median at (8+1)/2 = 9/2 = 4 1/2th; 1 3 5 6 8 8 11 20, M = 7.
8/2 = 4 in each half: Halves are 1 3 5 6, and 8 8 11 20. The quartiles are the medians of each half; count in (4+1)/2 = 2 1/2.
1 3 5 6, Q1 = (3+5)/2 = 4.
8 8 11 20, Q3 = (8+11)/2 = 9.5.
1 3 | 5 6 | 8 8 | 11 20
1 3 5 6 6 8 8 11 20 are n=9 observations.
Median at (9+1)/2 = 10/2 = 5th; 1 3 5 6 6 8 8 11 20, M = 6.
The median joins both halves. Each half has (n+1)/2 values.
(9+1)/2 = 5 in each half: Halves are 1 3 5 6 6, and 6 8 8 11 20. Quartiles are middle values of each half.
Q1 = 5, Q3 = 8.
1 3 5 6 6 8 8 11 20
(This is a dirty method because it doesn't "exactly" divide the data into quarters. Quick? Yes. Tukey did a variation on this, throwing away the median instead of giving it to each half. He called them "Hinges" to avoid fights over the "quartile" name. People who took the course out of Moore, Basic Practice, before this term, learned that method.)
Five-number summary: min, Q1, Median, Q3, max.
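The by-hand rule can be sketched in Python as below. This is only a sketch of the quick method described in these notes (when n is odd, the median is counted in both halves). Other texts and packages use other rules, so software may give slightly different quartiles. The names `median_of_sorted` and `five_number_summary` are illustrative, not from any package.

```python
def median_of_sorted(ordered):
    """Middle one if n is odd, or the average of the 2 middle if n is even."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """min, Q1, Median, Q3, max by the quick by-hand rule."""
    ordered = sorted(values)
    n = len(ordered)
    # n even: each half has n/2 values.
    # n odd: each half has (n+1)/2 values, so the median sits in both halves.
    half = (n + 1) // 2
    lower = ordered[:half]
    upper = ordered[n - half:]
    return (ordered[0], median_of_sorted(lower), median_of_sorted(ordered),
            median_of_sorted(upper), ordered[-1])

print(five_number_summary([1, 3, 5, 6, 8, 8, 11, 20]))     # (1, 4.0, 7.0, 9.5, 20)
print(five_number_summary([1, 3, 5, 6, 6, 8, 8, 11, 20]))  # (1, 5, 6, 8, 20)
```

Both results match the hand calculations for the set of 8 and the set of 9.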
INTERQUARTILE RANGE = IQR = Q3 - Q1.
= The range of the middle half of the observations. Resistant to outliers!
9.5 - 4 = 5.5 for the set of 8. 8 - 5 = 3 for the set of 9.
Box (and whisker) plot: Graphical form of five-number summary.
  |-------[ |   ]-----------------------|
0·········5········10········15·········20
"Showing outliers": Outliers can make a boxplot whisker extend deceptively beyond the bulk of the data.
Make the whiskers to the last item in the "main mass" of the data.
Put a dot or a star for each outlier, beyond the whisker end.
How do we decide what's an outlier? (Rule of thumb; esp. for computers.)
Fence: Define "outlier" as a value farther out than 1.5 IQR from the quartiles. (Q1 - 1.5 IQR is the lower fence, Q3 + 1.5 IQR is the upper fence.)
For the set of 9, 1.5 IQR = 4.5. Fences are 5 - 4.5 = 0.5, and 8 + 4.5 = 12.5.
So 20 lies outside the fence, and the whiskers & box should go from 1 to 11 (largest inside the fence).
(Dot or *? Tukey: Dot · between 1.5 and 3 IQR's out, * if more than 3 IQR's out. By hand, I don't care.)
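The fence rule for the set of 9 can be checked with a short Python sketch. The quartiles Q1 = 5 and Q3 = 8 are taken from the hand calculation in these notes rather than recomputed, and the names `fences` and `marker` are illustrative only.

```python
def fences(q1, q3):
    """Lower and upper fences: 1.5 IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def marker(x, q1, q3):
    """Tukey's marking: '*' if more than 3 IQRs beyond a quartile, else a dot."""
    iqr = q3 - q1
    distance = (q1 - x) if x < q1 else (x - q3)
    return "*" if distance > 3 * iqr else "dot"

data = [1, 3, 5, 6, 6, 8, 8, 11, 20]
q1, q3 = 5, 8  # from the hand calculation for the set of 9

low, high = fences(q1, q3)
inside = [x for x in data if low <= x <= high]
outliers = [x for x in data if x < low or x > high]
whiskers = (min(inside), max(inside))  # whiskers stop at the last values inside the fences

print(low, high)                              # 0.5 12.5
print(whiskers)                               # (1, 11)
print([(x, marker(x, q1, q3)) for x in outliers])  # [(20, '*')]
```

The 20 is 12 units past Q3, which is 4 IQRs, so by Tukey's convention it gets a star.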
  |-------[ |   ]-----|                 *
0·········5········10········15·········20
Boxplots shine at comparing distributions across several categories.
| Sievers home | Math151-Fall05/Dayf4.htm | 2:15pm | 9/2/05 |