Hand in:
Sec. 3.3 p. 225ff.
  3.54 ring-no answer
  3.47, 3.48 systematic
  3.49 a. For b, don't find the sample, but tell what the type of sampling is. (random digit dialing)
  3.52 stratified over/under 21. (Don't find the sample)
  3.44 census tracts (use Table B)
Sec. 3.4 p. 240ff.
  Postpone the rest.

Read, discuss:
  3.45 different starts
  3.57, 3.58 questions
  3.39 movies
  3.50, 3.53 strata
  3.46 census
  Postpone 3.69

Optional:

Probability Samples:
SRS--Simple Random Sample (& my initials)
Systematic Random Sample
Stratified Random Sample
Multistage Sample
(All our later theory will be for SRS; modifications need to be made for other probability samples.)
Stratified Random Sample: The population is cut into natural segments ("strata"). A specific number of individuals is chosen from each stratum (within each stratum we take a simple random sample). Advantage: Every stratum makes up a known proportion of the sample; a simple random sample might under- or over-represent a stratum by chance. "Strata" in sampling are like "blocks" in experiments: the same idea in the jargon of a different subculture.
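A minimal Python sketch of the idea above, not the textbook's procedure. The population of 400 students, the "under 21" / "21 and over" strata (borrowed from problem 3.52), and the function name stratified_sample are all made up for illustration.

```python
import random

def stratified_sample(population, stratum_of, sizes, seed=None):
    """Take a simple random sample of sizes[s] individuals from each stratum s."""
    rng = random.Random(seed)
    # Cut the population into strata.
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    # Within each stratum, take an SRS of the requested size.
    sample = []
    for name, size in sizes.items():
        sample.extend(rng.sample(strata[name], size))
    return sample

# Hypothetical population: 300 students under 21 and 100 who are 21 or over.
population = [("student%03d" % i, "under 21" if i < 300 else "21 and over")
              for i in range(400)]
sample = stratified_sample(population, lambda p: p[1],
                           {"under 21": 30, "21 and over": 10}, seed=1)

# Count how many sampled individuals came from each stratum.
counts = {}
for _, stratum in sample:
    counts[stratum] = counts.get(stratum, 0) + 1
print(counts)  # {'under 21': 30, '21 and over': 10}
```

Whatever the seed, the counts per stratum are fixed in advance; only which individuals are chosen within each stratum is left to chance.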
Multistage Sample: Useful when individuals are at the bottom of a sequence of categories. E.g., to choose a sample of college women, first select 10 colleges at random, then from each of those colleges select 2 dorms at random, then from each dorm select 10 students to interview. Total sample = 10 x 2 x 10 = 200. Advantage: money and time. You only have to visit 10 colleges, and 2 dorms in each. An SRS from the whole country, even if you could do it, might mean visiting 200 colleges. (You can also mix this with stratification, for instance selecting the 10 colleges in a stratified way from large coed, small coed, women's, ...)
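The three stages can be sketched in Python as follows. This is only an illustration: the frame of 50 colleges with 8 dorms of 60 students each is invented, and the seed is arbitrary.

```python
import random

rng = random.Random(2005)  # fixed seed so the sketch is repeatable

# Hypothetical frame: 50 colleges, each with 8 dorms, each dorm with 60 students.
colleges = {
    "college%02d" % c: {
        "dorm%d" % d: ["c%02d-d%d-s%02d" % (c, d, s) for s in range(60)]
        for d in range(8)
    }
    for c in range(50)
}

sample = []
# Stage 1: pick 10 colleges at random.
for college in rng.sample(sorted(colleges), 10):
    dorms = colleges[college]
    # Stage 2: pick 2 dorms at random within each chosen college.
    for dorm in rng.sample(sorted(dorms), 2):
        # Stage 3: pick 10 students at random within each chosen dorm.
        sample.extend(rng.sample(dorms[dorm], 10))

# The college code is the part of each student label before the first hyphen.
colleges_visited = {s.split("-")[0] for s in sample}
print(len(sample), len(colleges_visited))  # 200 10
```

The 200 students come from only 10 colleges, which is the whole point: far fewer places to visit than an SRS of 200 would probably require.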
Systematic Random Sample (p. 228, problem 3.47): Using a list, to pick a sample of 1/20 of the list: First pick a number at random from 1, 2, ..., 20. Suppose you get 8. The 8th individual in the list is the first one in the sample. Then take every 20th individual after that: numbers 28, 48, 68, .... Advantage: Easy to implement; avoids "clumps" that might occur with an SRS. Another description: a "one-in-twenty" sample from a list.
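A short Python sketch of a one-in-k systematic sample, assuming the list is simply numbered 1 to 200. The roster and the function name systematic_sample are hypothetical.

```python
import random

def systematic_sample(items, k, seed=None):
    """One-in-k systematic sample: random start among the first k, then every k-th."""
    rng = random.Random(seed)
    start = rng.randint(1, k)  # position 1..k, chosen at random
    # Convert 1-based positions start, start+k, start+2k, ... to list indexes.
    return start, items[start - 1::k]

roster = list(range(1, 201))  # hypothetical list of 200 individuals, numbered 1..200
start, sample = systematic_sample(roster, 20, seed=3)
print(start, sample)

# With a start of 8, as in the example above, the sample is 8, 28, 48, ..., 188.
print(roster[8 - 1::20])
```

Only the starting position is random; once it is chosen, the rest of the sample is determined.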
How to take an SRS using SPSS--Handout after vacation (No SPSS HW over vacation)
3.4, Toward Statistical Inference.
Chance behavior (a random phenomenon): Unpredictable in the short run, but with a predictable, regular pattern in the long run.
(Random numbers: all digits are equally likely in the long run. "Random" in this chapter is more general: the outcomes are not necessarily equally likely.)
25 digits from the random number table: Individual sets of 25 show much variability. The pooled data show more "flatness," but still much variability. You would have been right to be skeptical when I told you, on the basis of just this class's data, that your "pick-a-number" choices were not random.
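A rough Python sketch of the same experiment, with a pseudo-random generator standing in for the random number table. The choice of 20 sets (as if 20 students each read 25 digits) and the seed are assumptions made for illustration.

```python
import random
from collections import Counter

rng = random.Random(151)  # arbitrary seed, standing in for a starting spot in the table

def digit_counts(digits):
    """Count how often each digit 0-9 appears."""
    tally = Counter(digits)
    return [tally[d] for d in range(10)]

# 20 sets of 25 random digits.
sets = [[rng.randrange(10) for _ in range(25)] for _ in range(20)]

# Individual sets: each digit is expected 2.5 times, but counts vary a lot.
for digits in sets[:3]:
    print(digit_counts(digits))

# Pooled: each digit is expected 50 times; flatter, but still not perfectly even.
pooled = digit_counts([d for digits in sets for d in digits])
print(pooled)
```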
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
~ ~
We know that a sample from a population will not exactly represent the population. If we take a random sample, the behavior of individual samples will not be predictable, but there will be a predictable pattern in many random samples from the same population. Knowing the pattern will be as good as we can do.
Sec. 3.4
A Sample (varies) is chosen from a Population (fixed, but usually unknown).
Calculate a numerical summary: a Statistic (Latin letter) from the sample; a Parameter (Greek letter) for the population.
Examples:
  Statistic (sample)               Parameter (population)
  Sample mean xbar                 Population mean mu (µ)
  Sample st. dev. s                Pop. standard dev. sigma
  Sample median                    Pop. median
  Sample proportion p-hat          Pop. proportion p
  Sample line height y-hat         Pop. regression line height y
The actual value of the statistic will vary, depending on the particular sample: "sampling variability."
The statistic "estimates" the parameter. We hope it is close to the parameter. If we choose simple random samples, we can understand the pattern of values the statistic can take.
Some examples of statistics:
Height: U.S. young women: pop. mean = 64.5", pop. s.d. = 2.5" (text. Caveat: rounded?)
  Math 151, Spring '01, xbar = 64.2, s = 3.75
  Fall '01, xbar = 65.01, s = 3.22
  Spring '02, xbar = 64.53, s = 2.91
  Fall '02, xbar = 63.89, s = 2.48
  Spring '03, xbar = 64.98, s = 3.29
  Spring '04, xbar = 65.33, s = 2.25
  Spring '05, xbar = 64.31, s = 2.93
Coin flip: Proportion of heads p = 1/2 (?)
  p-hat = 256/520 = .492 (combined data from many past classes)
Thumbtack: Proportion of point-up p = (??)
  p-hat = 441/691 = .6382 (one past class, Math 251)
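To get a feel for how much p-hat moves from one sample to the next, here is a small Python sketch. It assumes the coin really is fair (p = 1/2, which the "(?)" above leaves open) and simulates five runs of 520 flips; the seed and the helper name p_hat are arbitrary.

```python
import random

rng = random.Random(520)  # arbitrary seed so the run is repeatable

def p_hat(n, p, rng):
    """Proportion of heads in n simulated flips with P(heads) = p."""
    heads = sum(1 for _ in range(n) if rng.random() < p)
    return heads / n

observed = 256 / 520  # class data from the notes
simulated = [p_hat(520, 0.5, rng) for _ in range(5)]

print(round(observed, 3))                 # 0.492
print([round(x, 3) for x in simulated])   # five simulated p-hat values
```

If the simulated values scatter around .5 by about as much as .492 does, the class data give no reason to doubt p = 1/2.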
Start here Wednesday
Sampling distribution of a statistic:
The distribution of values that the statistic would take if we could repeat the sampling process and calculate it from "all possible" samples (of the given size).
Assumes probability sampling or a randomized experimental design.
Describe it by shape, center, spread.
Shape: mound-shaped, often normal; we wouldn't want it bimodal or with outliers.
Center: the center of the sampling distribution should be close to the parameter value. A statistic that systematically under- or over-estimates the parameter is a "biased estimator."
Spread: Want "tight" (around the parameter value!)
--SRS produces unbiased estimators for most common statistics.
--A larger (random) sample produces less variability (spread).
  The size of the sample matters, not the proportion of the population sampled (as long as the population is at least 10 times the sample size).
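A simulation sketch in Python of center and spread, under assumptions that are mine and not the text's: the population heights are drawn from a normal distribution with mean 64.5 and s.d. 2.5, the population sizes are 2,000 and 20,000, and 1,000 repeated samples stand in for "all possible" samples.

```python
import random
import statistics

rng = random.Random(64)  # arbitrary seed

def make_population(size):
    """Hypothetical population of heights (inches): normal, mean 64.5, s.d. 2.5."""
    return [rng.gauss(64.5, 2.5) for _ in range(size)]

def sampling_distribution(population, n, repeats=1000):
    """Sample means (xbar) from many simple random samples of size n."""
    return [sum(rng.sample(population, n)) / n for _ in range(repeats)]

results = {}
for pop_size in (2000, 20000):
    population = make_population(pop_size)
    for n in (10, 100):
        means = sampling_distribution(population, n)
        center = statistics.mean(means)   # should sit near the population mean
        spread = statistics.stdev(means)  # should shrink as n grows
        results[(pop_size, n)] = (center, spread)
        print(pop_size, n, round(center, 2), round(spread, 2))
```

In a run like this the centers should all land near 64.5, the spread should drop sharply going from n = 10 to n = 100, and for a given n the spread should be about the same for both population sizes.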
Random sampling will allow us to do inferential statistics:
  How far off are we likely to be from the parameter value? (Margin of error)
  How plausible is a claim, based on the data? (Significance level)
Next: Looking at some sampling distributions; then Ch. 4, Ch. 5...
| Sievers home | Math251-Fall05/Dayps19.htm | 11pm | 10/6/05 |