| Day 4 Hand in: Sec. 2.1 p. 56ff. 1.75: quintiles by hand (method: p. 45, middle). 1.50: do x-bar and s by hand, then enter the data in SPSS and compute them there. 1.43 abc (you did the stemplot on Day 2); use SPSS for c (Table 1.5). 1.77 (SPSS): trimmed mean. Do it like this: load the guinea pig file (Table 1.8) into SPSS and find the mean. Then delete the highest 10% and lowest 10% of the observations (click on the row, hit the Delete key). The mean of the remaining observations is the 10% trimmed mean. Similarly, find the 20% trimmed mean. (Median = 102.5, for the comparisons.) 1.70 (SPSS): computational accuracy. 1.72, 1.76: linear transformations. |
Read, discuss C. In problem B below, you need b > 0. |
Optional: Do 1.70 (computational accuracy) in Excel, if you're an Excel user. |
Monday Quiz: In class, closed book: stemplot, 5-number summary, and boxplot. Mean and s.d. by hand, showing all steps.
- - - - - - - - - - - - - - - - - - - - - - - - - - -
Handout (for Sec. 1.3): Density; Density (Solutions)
(Re)visit mean/median, 5-number summary, boxplot
Some other measures of middle:
--Mode (modal class): the peak, most popular
--Trimmed mean: throw away a % on each end, then average the rest
--Midrange: midway between min and max
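
The measures of middle listed above can be sketched in a few lines of Python. This is only an illustration: the data values are made up (they are not the guinea pig data from Table 1.8), the function names are hypothetical, and the trimming rule (drop percent * n / 100 observations from each end, rounded down) is an assumption about how to handle sample sizes that do not divide evenly.

```python
from collections import Counter


def midrange(data):
    """Midway between the smallest and largest observations."""
    return (min(data) + max(data)) / 2


def modes(data):
    """All values that occur most often (there can be more than one)."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


def trimmed_mean(data, percent):
    """Mean after dropping `percent`% of the observations from each end.

    Assumption: the number dropped per end is percent * n / 100, rounded down.
    """
    ordered = sorted(data)
    k = (percent * len(ordered)) // 100
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("trimming removed every observation")
    return sum(kept) / len(kept)


# Made-up illustrative data: ten values with one large outlier.
values = [1, 1, 2, 3, 3, 3, 4, 5, 6, 42]

print("mean        ", sum(values) / len(values))
print("10% trimmed ", trimmed_mean(values, 10))
print("20% trimmed ", trimmed_mean(values, 20))
print("midrange    ", midrange(values))
print("mode(s)     ", modes(values))
```

With these values the outlier pulls the mean and the midrange well above the bulk of the data, while the trimmed means stay close to it.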
Spread, cont.
Standard deviation (goes with mean). Square root of:
Variance: (almost) the average of the squared deviations from the mean. (The deviations themselves sum to 0.)
(Divide by n-1, the "degrees of freedom": the dimension of the vector space in which the deviations from the mean lie.)
Demo: 1, 1, 2, 4: mean = 2, sum of squared deviations = 6, variance = 6/3 = 2, s = 1.41.
1, 1, 2, 4, 12: mean = 4, sum of squared deviations = 86, variance = 86/4 = 21.5, s = 4.64.
(Midcomputation check: the sum of the deviations from the mean, before squaring each, always = 0.)
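
The by-hand steps of the demo can be retraced in Python as a check. This is a minimal sketch: `by_hand` is a hypothetical helper name, and the comparison against `statistics.stdev` (which also divides by n-1) is only a cross-check, not part of the hand method.

```python
import math
import statistics


def by_hand(data):
    """Return (mean, sum of squared deviations, variance, s), step by step."""
    n = len(data)
    mean = sum(data) / n
    deviations = [x - mean for x in data]
    # Midcomputation check: the deviations must sum to 0 (up to rounding).
    assert math.isclose(sum(deviations), 0.0, abs_tol=1e-9)
    sum_sq = sum(d * d for d in deviations)
    variance = sum_sq / (n - 1)  # divide by n-1, not n
    s = math.sqrt(variance)
    return mean, sum_sq, variance, s


for data in ([1, 1, 2, 4], [1, 1, 2, 4, 12]):
    mean, sum_sq, variance, s = by_hand(data)
    print(data, "mean =", mean, "SS =", sum_sq,
          "variance =", variance, "s =", round(s, 2))
    # Cross-check against the library's sample standard deviation.
    assert math.isclose(s, statistics.stdev(data))
```

In the second data set the single observation 12 has deviation 8, so it contributes 64 of the 86 in the sum of squared deviations.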
--s is always >= 0 (= 0 only if all observations are equal)
--s has the same units as the observations (squared, then square-rooted)
--s is very sensitive to outliers (the outliers contribute much more than their share to the sum of squared deviations from the mean)
SPSS to find mean and s.d. Handout
We've been looking at the SHAPE of distributions, and the ways irregularities can point us to knowledge about the data. (Living histograms.)
Note p. 49, middle: Statistical [summary] measures and methods based on them are generally meaningful only for distributions of sufficiently regular shape. ... [Q]uickly resorting to fancy calculations is the mark of a statistical amateur. Look, think, and choose your calculations selectively.
Summaries of Middle & Spread "Systems":
--Midrange, Range (very sensitive to outliers: they use only the max and min!)
--Median, IQR (+ quartiles Q1, Q3, 5-number summary), based on percentiles (the j'th percentile is the value with j% of the data at or below it)
--Mean, Standard Deviation: "y-bar" (or "x-bar"), "s" (good for symmetric, unimodal distributions with no outliers)
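
To compare the three systems on the same numbers, here is a small sketch that computes each pair for the demo data with and without the outlier 12. The quartile rule used (Q1 and Q3 are the medians of the observations below and above the position of the overall median) is an assumption; it is one common textbook convention, and software packages may use slightly different rules. The function names are hypothetical.

```python
import statistics


def five_number_summary(data):
    """Min, Q1, median, Q3, max, with quartiles as medians of the two halves.

    Assumes at least two observations.
    """
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n the middle observation belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], statistics.median(lower), statistics.median(ordered),
            statistics.median(upper), ordered[-1])


def three_systems(data):
    """The three middle/spread pairs from the list above."""
    lo, q1, med, q3, hi = five_number_summary(data)
    return {
        "midrange": (lo + hi) / 2, "range": hi - lo,
        "median": med, "IQR": q3 - q1,
        "mean": statistics.mean(data), "s": statistics.stdev(data),
    }


for data in ([1, 1, 2, 4], [1, 1, 2, 4, 12]):
    print(data)
    for name, value in three_systems(data).items():
        print("   ", name, "=", round(value, 2))
```

Adding the one outlier moves the midrange, range, mean, and s far more than it moves the median.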
-------------------------------------------------------------------------------------
Linear transformations do not change the shape of a distribution: A "good" measure of center or spread should "act naturally" if you change units of measurement by shifting (translating) (everyone eats one more cookie) or by stretching or shrinking (changing scale) (all cookies are broken in half; count half-cookies). Fahrenheit <--> Celsius.
New x* = a + bx, for each observation.
Measures of spread are unaffected by the shifting! Only affected by the scale change.
Page 55 gives the rules explicitly. Problem B has you prove them for mean and standard deviation.
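
A numerical check (not a proof) of how the mean and standard deviation respond to x* = a + bx. The sketch assumes the standard rules, mean* = a + b(mean) and s* = b(s) for b > 0; the page 55 statement itself is not reproduced here. The data are the demo values reused as made-up Celsius temperatures, and `transform` is a hypothetical helper.

```python
import math
import statistics


def transform(data, a, b):
    """Apply x* = a + b*x to every observation."""
    return [a + b * x for x in data]


celsius = [1, 1, 2, 4, 12]  # made-up temperatures (the demo values reused)
mean_c = statistics.mean(celsius)
s_c = statistics.stdev(celsius)

# Shift only (a = 5, b = 1): the center moves by 5, the spread does not change.
shifted = transform(celsius, 5, 1)
print("shift:", statistics.mean(shifted), round(statistics.stdev(shifted), 2))

# Celsius to Fahrenheit (a = 32, b = 1.8): shift and scale change together.
fahrenheit = transform(celsius, 32, 1.8)
mean_f = statistics.mean(fahrenheit)
s_f = statistics.stdev(fahrenheit)
print("C: mean =", mean_c, " s =", round(s_c, 2))
print("F: mean =", round(mean_f, 2), " s =", round(s_f, 2))
print("32 + 1.8 * mean_C =", round(32 + 1.8 * mean_c, 2))
print("1.8 * s_C         =", round(1.8 * s_c, 2))
```

The last two printed lines should agree with the Fahrenheit mean and s, up to rounding.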
-------------------------------------------------------------------------------------
1.3
Density function or curve: idealized histogram. Area = relative frequency.
Any curve that is on or above the x-axis and has area exactly 1 under it can be thought of as the idealization of some set of observations, and can be called a density curve. We carry over our terms for shape, and our summary measures.
Densities
(When values can take on any of a continuous interval of numbers)

Example: Spinner: Label the edge with continuous values from 0 to 1. Spinning should produce 1/10 of all spins in each colored sector. Simulations of 500 and 3000 spins show this is roughly true. More spins would get closer to the Uniform shape.
Abstraction, idealized histogram ("Probability Model") = Density curve. Describes a theoretical distribution of data.
Any such model is a curve
--always on or above the horizontal axis
--has area exactly 1 underneath it.
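
The spinner example can be imitated with a short simulation. This is a sketch under stated assumptions: the spinner is modeled by Python's pseudo-random `random.random()`, the ten sectors are taken to be equal intervals of width 0.1, the seed is fixed only so that runs are repeatable, and `spin_fractions` is a hypothetical name. It is not the simulation shown in class.

```python
import random


def spin_fractions(n_spins, n_sectors=10, seed=0):
    """Fraction of simulated spins landing in each equal-width sector of [0, 1)."""
    rng = random.Random(seed)
    counts = [0] * n_sectors
    for _ in range(n_spins):
        spin = rng.random()  # a value in [0, 1)
        sector = min(int(spin * n_sectors), n_sectors - 1)
        counts[sector] += 1
    return [count / n_spins for count in counts]


width = 0.1  # each sector covers one tenth of the edge
for n in (500, 3000, 100000):
    fractions = spin_fractions(n)
    # Histogram height on the density scale: relative frequency / bar width.
    heights = [f / width for f in fractions]
    worst = max(abs(f - 0.1) for f in fractions)
    print(n, "spins: largest gap from 1/10 =", round(worst, 4),
          " bar heights from", round(min(heights), 2),
          "to", round(max(heights), 2))
    # Total area of the bars = sum of relative frequencies = 1.
    print("   total area =", round(sum(h * width for h in heights), 6))
```

On the density scale the bars have total area 1 for any number of spins; with more spins their heights settle toward the flat Uniform curve of height 1 on the interval from 0 to 1.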
| Sievers home | Math251-Fall07/Day2s4.htm | 10pm | 8/30/07 |