Math 151, Spring 2002, Monday Day 19, March 11 Hit reload to get most current version

Mortality vs. education--outliers?
HW questions?
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Chapters 1 and 2 have covered analyzing data that was given to us--what it said about itself.
    Informally, develop guesses, suspicions, hypotheses about the world the data came from.
Ch. 3:  Producing Data:  Aim:  create data sets that will allow us to make inferences to a larger world than just the data we have.
       Observational Study:  Observes individuals, measures variables, does not influence the responses. (3.1)
                    Take Sample from a population, examine it,
                           hope it's representative so we can infer population is like sample.
                            (Not very useful for cause-and-effect--see above)
        Experiment: Imposes treatment  on individuals, to see how the treatment influences  the response. (3.2)
                            Best for cause-and-effect.

Confounding:  Two variables (explanatory or lurking) are confounded when you can't sort out their effects on a response variable.
--Used to be: coffee drinking and smoking--most people did both, or neither...
______________________
Ch. 3.1 Designing Samples

>>Population: Entire group  that we want information about
>>Sample: The part of the population we actually examine.
      Hope:  Sample will be representative of the population.

(SAMPLING) BIAS:  The design of a study is biased if it systematically favors certain outcomes.
Pick a digit (from 0,1,2,3,4,5,6,7,8,9).  Write it down.

Some refinements:
*Sampling frame (Moore p. 179, problem 3.13): the group from which the sample is actually chosen--as distinct from the "population", the group you want information about. The sampling frame is often, unfortunately, smaller than the population.  The sample is (usually much) smaller than the sampling frame.
* "Chosen" sample may not turn out to be actual sample, if some individuals don't respond--"Nonresponse", p. 178.

Non-probability samples:

Probability samples--each member of population has a known chance of being chosen (pp. 174-6)
We can't guarantee the sample is representative, but with a probability sample we can calculate how often (or seldom) it isn't. (Part 3 of the course).
The METHOD is what matters--it will guarantee that most of the time we'll get a representative sample.  (Sometimes we'll do everything right and still have bad luck.  Better than doing it wrong, systematically getting an unrepresentative sample.)
      (ACT "Potato" sample scheme, bottom of 10-1)

Simple Random Sample (SRS) of size n:  n individuals chosen in such a way that every possible set of n individuals has an equal chance of being chosen.
HOW?  A chance mechanism: Cards, dice, computer program, or
Table of random digits (Simulates rolling a die with 0,1,....9, over and over...) (Table B, back flyleaf)
    Every digit, every sequence of digits, is equally likely to be "next" in any direction.
To use:  label everyone in the population with a number.
    Important:  Every labeling number needs the same number of digits.
    To label 9 people, use the labels 1,2,3,....9 (1-digit chunks)
    To label 15 people, use the labels 01, 02, ...10, 11, ...15 (2-digit chunks)
    To label 125 people, use the labels 001, 002, ... 124, 125 (3-digit chunks)
Pick a place (at random) in the table, start reading across in chunks of that size.  Keep reading until you have n eligible numbers (skip chunks that are nobody's label, and discard repeats).
                    Read Row 150:   07511   88915   41267   16853   84569   79367 ..
From 9 people, a sample n = 5:  0, 7, 5, 1, 1, 8, 8, 9, 1, 5, 4,     (sample is individuals 7, 5, 1, 8, 9)
From 15 people, a sample   07, 51, 18, 89, 15, 41, 26, 71, 68, 53, 84, 56, 97, 93, 67.... keep reading,
    go to next line (or back to top line) if you need more.  Individuals 7, 15,...are chosen using this line.
From 125 people, a sample 075, 118, 891, 541, 267, 168, 538, 456, 979, 367...keep reading.  Individuals 75, 118, ...

    Why the same number of digits in each label?  Each individual 3-digit chunk is as likely as any other 3-digit chunk.  But a 1- or 2-digit chunk is more likely than any 3-digit chunk. So 2 will come up more often than 12, but 02 will come up just as often as 12.

    Why across?  For consistency on HW, go the way they say (so you get the answer in the book).  In practice, you can read up, down, backwards, as long as you decide beforehand, and don't change in the middle of choosing the sample.
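
The table procedure above can be sketched in a few lines of Python. This is only an illustration, not how the book or Table B does it: the function name sample_from_digits is made up for this example, and the only digits used are the Row 150 digits quoted above. It reads equal-width chunks left to right, skips chunks that are nobody's label, and discards repeats.

```python
def sample_from_digits(digits, population_size, n):
    """Read equal-width chunks from a line of random digits.

    Keeps chunks that are a valid label (1..population_size),
    discards repeats, and stops once n labels are chosen.
    Returns fewer than n labels if the digits run out first.
    """
    digits = "".join(ch for ch in digits if ch.isdigit())  # drop the spaces
    width = len(str(population_size))  # 9 -> 1 digit, 15 -> 2, 125 -> 3
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                break
    return chosen


ROW_150 = "07511 88915 41267 16853 84569 79367"

print(sample_from_digits(ROW_150, 9, 5))    # [7, 5, 1, 8, 9]
print(sample_from_digits(ROW_150, 15, 2))   # [7, 15]
print(sample_from_digits(ROW_150, 125, 2))  # [75, 118]
```

The chunk width is taken from the largest label, so every label has the same number of digits, which is the point of the "Important" note above.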
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~  ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
We will focus on the mathematics of the SRS, the most basic.  In practice, more sophisticated sampling methods may be preferred.  The math needed to analyze their effects is beyond our course.
With Wednesday's class!:    Here are some other probability samples:
Stratified Random Sample: population is cut into natural segments ('strata').  A specific number of individuals is chosen from each stratum (within each stratum we take a simple random sample).  Advantage: Every stratum is represented with a known proportion of the sample; a simple random sample might under- or over-represent a stratum, by chance.
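
A minimal sketch of a stratified sample, assuming a made-up population of 40 first-years and 10 seniors (the strata, labels, and the function name stratified_sample are all invented for illustration). Within each stratum it takes a simple random sample of a fixed size, here 20% of each.

```python
import random


def stratified_sample(strata, sizes, seed=None):
    """Take an SRS of the stated size within each stratum."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so each pick is an SRS
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}


# Made-up population: 40 first-years and 10 seniors
strata = {
    "first-year": [f"F{i:02d}" for i in range(1, 41)],
    "senior": [f"S{i:02d}" for i in range(1, 11)],
}

# 20% of each stratum, so both groups appear in proportion
picked = stratified_sample(strata, {"first-year": 8, "senior": 2}, seed=151)
for name, members in picked.items():
    print(name, members)
```

An SRS of 10 from all 50 students could, by chance, contain no seniors at all; the stratified scheme always contains exactly 2.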

Multistage Sample: Useful when individuals are at the bottom of a sequence of categories: E.g. to choose a sample of college women, first select 10 colleges at random, then from each of those colleges select 2 dorms at random, then from each dorm select 10 students to interview.  Total sample = 10 x 2 x 10 = 200.  Advantage: you only have to visit 10 colleges, 2 dorms in each.  An SRS from the whole country, even if you could do it, might mean 200 colleges.  (You can also mix this with stratification, for instance selecting the 10 colleges in a stratified way from large coed, small coed, women's,...)
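
The college example can be sketched as follows. The data are invented (30 colleges, 5 dorms each, 40 students per dorm), and multistage_sample is a hypothetical helper name; the stage sizes 10, 2, and 10 are the ones from the example.

```python
import random


def multistage_sample(colleges, n_colleges=10, n_dorms=2, n_students=10, seed=None):
    """Pick colleges at random, then dorms within each, then students within each dorm."""
    rng = random.Random(seed)
    sample = []
    for college in rng.sample(sorted(colleges), n_colleges):
        dorms = colleges[college]
        for dorm in rng.sample(sorted(dorms), n_dorms):
            for student in rng.sample(dorms[dorm], n_students):
                sample.append((college, dorm, student))
    return sample


# Made-up data: 30 colleges, 5 dorms each, 40 students per dorm
colleges = {
    f"College {c}": {
        f"Dorm {d}": [f"C{c}-D{d}-S{s}" for s in range(1, 41)]
        for d in range(1, 6)
    }
    for c in range(1, 31)
}

sample = multistage_sample(colleges, seed=151)
print(len(sample))                       # 200
print(len({c for c, _, _ in sample}))    # 10 colleges to visit
```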

Systematic Random Sample (p.184, problem 3.27)  Using a list, to pick a sample of 1/20 of the list: First pick a number at random from 1,2,....20.  Suppose you get 8.  The 8th individual in the list is the first one in the sample.  Then take every 20th individual after that, numbers 28, 48, 68,....   Advantage: Easy to implement, avoids "clumps" that might occur with SRS.
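
A short sketch of the systematic scheme, assuming a made-up list of 200 individuals numbered 1 to 200 (the function name systematic_sample is invented for illustration). The random start is drawn from 1, 2, ..., step, and every step-th individual after it is taken.

```python
import random


def systematic_sample(items, step, seed=None):
    """Random start in 1..step, then every step-th individual after it."""
    rng = random.Random(seed)
    start = rng.randint(1, step)          # randint includes both endpoints
    return start, items[start - 1::step]  # position `start` is index start - 1


people = list(range(1, 201))   # made-up list: individuals numbered 1..200

start, chosen = systematic_sample(people, 20, seed=151)
print(start, chosen)

# With a start of 8, as in the example: individuals 8, 28, 48, 68, ...
print(people[8 - 1::20][:4])   # [8, 28, 48, 68]
```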
- - - - - - -
Sources of bias, even in probability samples:

Inference to the population: Sample results will vary.
   Different samples will represent the population with differing accuracy.
   Well-designed Random (probability) sampling will avoid systematic bias.
   In general, a larger random sample will give more accurate information about the population than a smaller random sample.
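
A small simulation can make the sample-size point concrete. Everything here is an assumption made for illustration: a made-up population of 10,000 in which 60% would answer "yes", sample sizes of 25 and 400, and the helper name spread_of_sample_proportions. It draws many simple random samples of each size and measures how much the sample proportion varies from sample to sample.

```python
import random
import statistics


def spread_of_sample_proportions(population, n, repeats=1000, seed=0):
    """Standard deviation of the sample proportion over many SRSs of size n."""
    rng = random.Random(seed)
    props = [sum(rng.sample(population, n)) / n for _ in range(repeats)]
    return statistics.pstdev(props)


# Made-up population: 10,000 people, 60% of whom would say "yes" (coded 1)
population = [1] * 6000 + [0] * 4000

small = spread_of_sample_proportions(population, 25)
large = spread_of_sample_proportions(population, 400)
print("spread with n = 25: ", round(small, 3))
print("spread with n = 400:", round(large, 3))
```

Both sample sizes are unbiased (the proportions center on 0.60), but the proportions from the larger samples cluster much more tightly around it.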


PreClass assignment Day 19  for Day20
Sampling:  If you didn't do the asterisks in ACT ch. 7, do them: good examples!
Look up in Moore and read about Stratified, Systematic Random Samples, Multistage Sample.
Designing Experiments: Know from ACT ch. 11  (These are also all in Moore ch. 3.2)
ACT p.11-1 rules of Exp. Design  Activity 2, 
Randomized Comparative experiment; Placebo (Activity 3), Blinding & Double Blinding (Activity 4)
p. 11-2 Treatment-Response, Experimental Units (Subjects)  Factor/Level
The pencil-reviews are good.

HW assignment Day 19, Monday March 11,
Moore, from The Basic Practice of Statistics
Reading:  Ch. 3 thru 3.1.  Ahead in  3.2
Hand in, all from Moore 

Ch.3 Intro: 
p. 167, 3.1, 3.2, 3.3 exp, obs
= = = = = = = = = = = =
Sampling
 p. 170, 3.4 employed women. Also: What is the sampling frame? (Def. p. 179, #3.13)
 3.6 letters to Congress
- - - - - - - - - - - - - - - - 
p. 173 3.7 SRS
p. 207, 3.65 SRS
p. 184, 3.26 Random digits
- - - - - - - - - - - - - - - -
p. 185 3.30 survey questions
- - - - - - - - - - - - - - - - -
p. 181  3.16 bigger sample size
p.185 3.31 sampling error for men
= = = = = = = = = = = = = = 
Probability Samples (other):
Do with Day 20's HW
 p. 176 3.11 stratified sample, accounts
3.12 multistage design, schoolkids
p. 184, 3.27 Systematic.
3.28 same chance for each.  SRS?

Read, to discuss (all Moore)

Ch.3 Intro: 
p. 170, 3.5 pop, samp...
p.182, 3.17 obsn/exp
    3.18 novel--pop, samp.
= = = = = = = = = =
Sampling
p. 183, 3.22 president
3.23 black police
- - - - - - - - - - -

- - - - - - - - - - - 

p.180 3.14 ring-no-answer
3.15 2 campaign questions

Optional 

- - - - - - - - - - -

- - - - - - - - - - -

p. 3.24 SRS

Sievers home  Math151-Sp02/Day19.htm  4pm 3/11/02
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.