| Hand in Monday (bring any remaining questions on these; I'll review boxplots). I don't care if you don't know the "fences" rule; just draw a "whisker" line from each quartile all the way to the min or max.
Ch. 5, p. 72: #3, also make a boxplot. ("No calculator" means no statistical calculator.)
15 Wines
16 Ozone (note: this is a sort of "time plot" using boxplots instead of dots)
28 Population growth
p. 107 (review) 18 & 19 Old Faithful
- - - - - - - - - -
Postpone Mean/Median. p. 72: 7a, b, c Payroll. Also, with c: What measure would be most useful if you wanted to use it to figure the total weekly payroll cost?
6 Sick days
+ + + + + + + + + +
Start now on a separate page, do the parts that you can; keep for the next assignment: p. 72: 19, 20 (no computations needed. 19d may not be decidable from pictures. Don't worry about it.)
5 Mistake
9 Standard deviation. Tonight, make (by hand) dot plots of each pair on axes with the same unit size, find the mean of each set and mark it with a little ^ (like fig. 5.6, p. 64). Notice this looks like a good balance point. Leave space to calculate some standard deviations next time. Also, make a dot plot of #10b set 2 (10, 50, 60, 70, 110). Which of the data sets in problem 9 does it most resemble? |
Read, be able to discuss
Read Circle questions: email me any more: sievers@wells.edu
Ch. 5: 25 Caffeine
41 Eye & Hair color
31 Reading scores (f is harder; optional)
- - - - - - - - - - -
Postpone: http://www.whfreeman.com/scc or http://bcs.whfreeman.com/ips Under Student Categories or Student Tools, choose "Statistical Applets", Mean & Median. (50 points max.) Check out symmetric distributions, skewed distributions, and distributions with outliers. How far apart can you get the mean and median?
13 Marriage age. Ithaca Journal, Jan 22, '05 had quiz answers: "How old is the average bride? 24.5 years.... How old is the average groom? 26.5 years." Give some reasons that could account for the big difference between these numbers and the graphed numbers in D&V. |
Optional: ActivStats lessons on SPSS, in Mac 102. AS pp. 1-2, 3-1, 3-2 are a gentle introduction (using raw data); 4-2, 4-3 do continuous data. |
Two way table questions?
Error in description
Ch.5 Summarizing distribution info with numbers
Measures of middle (center)
--Colloquially, "average" can refer to any measure of middle, so watch out; be more specific.
Mean (most common "average"): Take the sum of all observations & divide by how many (n). p. 63
(Midrange: Average the maximum & minimum values. Very sensitive to outliers.)
Median: half are bigger, half are smaller. Point on histogram with half the area to the left, half to the right.
Calculating: Put observations in numerical order (stemplot!). Middle one if n is odd, or average the 2 middle if n is even.
Formula: Count in how far? (n+1)/2 places. (7 1/2 places? Go halfway: average the 7th and 8th observations. Book's method, p. 58, is more complicated, same result.)
Spread (dispersion)
Computation of quartiles: Different texts and packages use different methods. Read the following; I'll review Monday.
By hand: quick and somewhat dirty:
Take the two halves of the data you got from finding the median. Find the median of each half, using the same rule as before. (Detail: If you had an even number of observations to start with, the data divides evenly into an upper and a lower half. If you had an odd number to start with, you have one in the middle, the median. In this case only, you use the median as part of both halves.)
1 3 5 6 8 8 11 20 are n=8 observations.
Median at (8+1)/2 = 9/2 = 4 1/2th place; 1 3 5 6 8 8 11 20, M = (6+8)/2 = 7
8/2 = 4 in each half: Halves are 1 3 5 6, and 8 8 11 20. The quartiles are the medians of each half; count in (4+1)/2 = 2 1/2.
1 3 5 6, Q1 = (3+5)/2 = 4. 8 8 11 20, Q3 = (8+11)/2 = 9.5
1 3 | 5 6 | 8 8 | 11 20
1 3 5 6 6 8 8 11 20 are n=9 observations.
Median at (9+1)/2 = 10/2 = 5th place; 1 3 5 6 6 8 8 11 20, M = 6
The median joins both halves. Each half has (n+1)/2 values.
(9+1)/2 = 5 in each half: Halves are 1 3 5 6 6, and 6 8 8 11 20. Quartiles are middle values of each half.
Q1 = 5, Q3 = 8; 1 3 5 6 6 8 8 11 20
(This is a dirty method because it doesn't "exactly" divide the data into quarters. Quick? Yes. Tukey did a variation on this, throwing away the median instead of giving it to each half. He called them "Hinges" to avoid fights over the "quartile" name. People who took the course out of Moore, Basic Practice, a year and a half ago, learned that method.)
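The hand method above can be sketched in Python. This is only an illustration: the names `median`, `quartiles`, and `include_median` are made up for this sketch and do not come from the book or any package. With `include_median=True` it follows the rule in these notes (for odd n the median joins both halves); with `include_median=False` it follows the variation that throws the median away.

```python
def median(values):
    """Median by the counting rule: count in (n+1)/2 places."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average the 2 middle values


def quartiles(values, include_median=True):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    if n % 2 == 0:
        lower, upper = ordered[:half], ordered[half:]       # even n splits cleanly
    elif include_median:
        lower, upper = ordered[:half + 1], ordered[half:]   # median joins both halves
    else:
        lower, upper = ordered[:half], ordered[half + 1:]   # median thrown away
    return median(lower), median(upper)


set_of_8 = [1, 3, 5, 6, 8, 8, 11, 20]
set_of_9 = [1, 3, 5, 6, 6, 8, 8, 11, 20]

print(median(set_of_8), quartiles(set_of_8))        # 7.0 (4.0, 9.5)
print(median(set_of_9), quartiles(set_of_9))        # 6 (5, 8)
print(quartiles(set_of_9, include_median=False))    # (4.0, 9.5)
```

Statistical packages generally use other interpolation rules, so their quartiles may differ slightly from these hand values.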
INTERQUARTILE RANGE = IQR = Q3 - Q1
= The range of the middle half of the observations. Resistant to outliers!
9.5 - 4 = 5.5 for the set of 8. 8 - 5 = 3 for the set of 9.
Box (and whisker) plot: Graphical form of five number summary.
  |-------[ |   ]-----------------------|
0·········5········10········15·········20
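For anyone curious how a text boxplot like this one can be laid out, here is a small Python sketch. It assumes the five number summary of the set of 9 (min 1, Q1 5, M 6, Q3 8, max 20) and two characters per data unit; `ascii_boxplot` and `scale` are names invented for this illustration.

```python
def ascii_boxplot(low, q1, med, q3, high, scale=2):
    """Return a one-line text boxplot; `scale` is characters per data unit."""
    def col(x):
        return round(x * scale)              # column where the value x is drawn

    row = [" "] * (col(high) + 1)
    for c in range(col(low), col(q1)):       # lower whisker
        row[c] = "-"
    for c in range(col(q3) + 1, col(high)):  # upper whisker
        row[c] = "-"
    row[col(low)] = "|"                      # whisker ends
    row[col(high)] = "|"
    row[col(q1)] = "["                       # box edges at the quartiles
    row[col(q3)] = "]"
    row[col(med)] = "|"                      # median mark inside the box
    return "".join(row)


print(ascii_boxplot(1, 5, 6, 8, 20))
```

The sketch draws whiskers all the way to the min and max, so it does not handle outliers.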
"Showing outliers" Outliers can
make a boxplot whisker extend deceptively beyond the bulk of the data.
Draw the whiskers only to the last item in the "main mass" of the data.
Put a dot or a star for each outlier, beyond the whisker end.
How do we decide what's an outlier? (Rule of thumb; esp. for computers.)
Fence: (Knowing the rule is optional.) Define "outlier" as a value farther out than 1.5 IQR from the quartiles. (Q1 - 1.5 IQR is the lower fence, Q3 + 1.5 IQR is the upper fence.)
For the set of 9, 1.5 IQR = 4.5. Fences are 5 - 4.5 = 0.5, and 8 + 4.5 = 12.5.
So 20 lies outside the fence, and the whiskers & box should go from 1 to 11 (largest inside the fence).
(Dot or *? Tukey: Dot · between 1.5 and 3 IQRs out, * if more than 3 IQRs out. By hand, I don't care.)
  |-------[ |   ]-----|                 *
0·········5········10········15·········20
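The fence rule can be checked with a short Python sketch. It takes Q1 and Q3 as given (5 and 8 for the set of 9) rather than computing them; `fences`, `split_outliers`, and `tukey_symbol` are names invented for this illustration, not functions from any package.

```python
def fences(q1, q3, k=1.5):
    """Lower and upper fences, k IQRs beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(values, q1, q3):
    """Separate the values inside the fences from the outliers beyond them."""
    lower, upper = fences(q1, q3)
    inside = [x for x in values if lower <= x <= upper]
    outside = [x for x in values if x < lower or x > upper]
    return inside, outside


def tukey_symbol(x, q1, q3):
    """For a value already known to be an outlier: '*' if more than 3 IQRs out, else '·'."""
    far_low, far_high = fences(q1, q3, k=3)
    return "*" if x < far_low or x > far_high else "·"


set_of_9 = [1, 3, 5, 6, 6, 8, 8, 11, 20]
inside, outside = split_outliers(set_of_9, q1=5, q3=8)

print(fences(5, 8))                  # (0.5, 12.5)
print(min(inside), max(inside))      # 1 11  (the whisker ends)
print(outside)                       # [20]
print(tukey_symbol(20, 5, 8))        # *
```

A value exactly on a fence counts as inside here, since the rule defines an outlier as farther out than 1.5 IQR.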
Boxplots shine at comparing distributions conditioned on several categories.
| Sievers home | Math151-Sp06/Daysp6.htm | 2:30pm | 2/10/06 |