Math 151, Day 17, Wednesday, March 7, 2001

Questions on HW
Ch. 3.1 Designing Samples

(SAMPLING) BIAS:  The design of a study is biased if it systematically favors certain outcomes.

Population: Entire group  that we want information about
Sample: The part of the population we actually examine.
Hope:  Sample will be representative of the population.

Some refinements:
* Sampling frame (p. 179, problem 3.13): the group from which the sample is actually chosen, as distinct from the "population," the group you want information about. The sampling frame is often, unfortunately, smaller than the population.  The sample is smaller than the sampling frame.
* "Chosen" sample may not turn out to be actual sample, if some individuals don't respond--"Nonresponse", p. 178.

Non-probability samples:

Probability samples--each member of population has a known chance of being chosen
We can't guarantee the sample is representative, but with a probability sample we can calculate how often (or seldom) it isn't. (Section 3).
The METHOD is what matters--it will guarantee that most of the time we'll get a representative sample.  (Sometimes we'll do everything right and still have bad luck.  Better than doing it wrong, systematically getting an unrepresentative sample.)

Simple Random Sample (SRS) of size n:  n individuals chosen in such a way that every possible set of n individuals has an equal chance of being chosen.
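
To make the definition concrete, here is a minimal Python sketch. The five-person population and the seed are made up for illustration. It lists every possible sample of size n = 2, so each one has the same chance under an SRS, and then lets the standard library's random.sample draw one of them.

```python
import itertools
import math
import random

# Hypothetical population of 5 individuals (names are illustrative only).
population = ["Ann", "Bob", "Cal", "Dee", "Eve"]
n = 2

# Every possible set of n individuals; an SRS gives each the same chance.
all_samples = list(itertools.combinations(population, n))
chance_each = 1 / len(all_samples)

print(len(all_samples), math.comb(len(population), n))  # both count the possible samples
print(chance_each)

# One chance mechanism: a seeded random number generator (seed chosen arbitrarily).
rng = random.Random(151)
sample = rng.sample(population, n)  # draws n distinct individuals
print(sample)
```

With 5 people and n = 2 there are 10 possible samples, so each has chance 1/10 of being the one chosen.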
HOW?  A chance mechanism: Cards, dice, computer program, or
Table of random digits (simulates rolling a die with faces 0, 1, ..., 9, over and over...)
    Every digit, every sequence of digits, is equally likely to be "next" in any direction.
To use:  label everyone in the population with a number.
    Important:  Every labeling number needs the same number of digits.
    To label 9 people, use the labels 1, 2, 3, ..., 9 (1-digit chunks)
    To label 15 people, use the labels 01, 02, ..., 10, 11, ..., 15 (2-digit chunks)
    To label 125 people, use the labels 001, 002, ... 124, 125 (3-digit chunks)
Pick a place (at random) in the table and start reading across in chunks of that size.  Keep going until you have n eligible numbers (discard repeats and any chunk that isn't someone's label).
                    Read Row 150:   07511   88915   41267   16853   84569   79367 ..
For 9 people, a sample of n = 5: read 0, 7, 5, 1, 1, 8, 8, 9.  Discard 0 (no one has that label) and the repeated 1 and 8; the sample is individuals 7, 5, 1, 8, 9.
For 15 people, read 07, 51, 18, 89, 15, 41, 26, 71, 68, 53, 84, 56, 97, 93, 67, ...  Only 07 and 15 are eligible so far, so keep reading;
    go to the next line (or back to the first) if you need more.
For 125 people, read 075, 118, 891, 541, 267, 168, 538, 456, 979, 367, ...  Only 075 and 118 are eligible so far, so keep reading.
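
The table-reading procedure above can be sketched in Python. This is only an illustration, not how the book or SPSS does it: the function names read_chunks and srs_from_digits are invented here, and the only digits used are the part of Row 150 quoted above.

```python
def read_chunks(digits, width):
    """Split a string of digits into consecutive chunks of `width` digits."""
    return [digits[i:i + width] for i in range(0, len(digits) - width + 1, width)]


def srs_from_digits(digits, population_size, n):
    """Read labels 1..population_size from the digits until n are chosen.

    Returns fewer than n labels if the digits run out first
    (in practice you would continue on the next row of the table).
    """
    width = len(str(population_size))  # every label gets the same number of digits
    chosen = []
    for chunk in read_chunks(digits, width):
        label = int(chunk)
        eligible = 1 <= label <= population_size
        if eligible and label not in chosen:  # skip non-labels and repeats
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen


# The part of Row 150 quoted in these notes, with the spaces removed.
ROW_150 = "07511 88915 41267 16853 84569 79367".replace(" ", "")

print(srs_from_digits(ROW_150, 9, 5))    # [7, 5, 1, 8, 9]
print(srs_from_digits(ROW_150, 15, 5))   # [7, 15]  (row ran out; keep reading)
print(srs_from_digits(ROW_150, 125, 5))  # [75, 118]  (row ran out; keep reading)
```

For 15 and 125 people this stretch of the row yields only two eligible labels, which is why the notes say to keep reading.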

    Why the same number of digits in each label?  Each individual 3-digit chunk is as likely as any other 3-digit chunk.  But a 1- or 2-digit chunk is more likely than any 3-digit chunk. So 2 will come up more often than 12, but 02 will come up just as often as 12.
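
A quick check of that claim, as a sketch: assuming every digit 0 through 9 is equally likely, all chunks of a given width are equally likely, so the chance of one particular chunk is 1 divided by the number of possible chunks of that width. The helper name chance_of_chunk is made up for this example.

```python
from fractions import Fraction
from itertools import product

DIGITS = "0123456789"


def chance_of_chunk(target):
    """Chance that a chunk of len(target) equally likely digits equals `target`."""
    width = len(target)
    # Enumerate every possible chunk of this width; all are equally likely.
    chunks = ["".join(p) for p in product(DIGITS, repeat=width)]
    return Fraction(chunks.count(target), len(chunks))


print(chance_of_chunk("2"))    # 1/10
print(chance_of_chunk("12"))   # 1/100
print(chance_of_chunk("02"))   # 1/100
```

Read as a 1-digit chunk, "2" is ten times as likely as the 2-digit chunk "12"; padded to "02", it has the same chance as "12".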

    Why across?  For consistency on HW, go the way they say (so you get the answer in the book).  In practice, you can read up, down, backwards, as long as you decide beforehand, and don't change in the middle of choosing the sample.

Other probability sampling designs (pp. 174-6) next time.

Sources of bias, even in probability samples:

Inference to the population: Sample results will vary.
Different samples will represent the population with differing accuracy.
Well-designed random (probability) sampling will avoid systematic bias.
In general, a larger random sample will give more accurate information about the population than a smaller random sample.
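
A small simulation can illustrate both points: sample results vary, and they vary less with larger samples. Everything here is an assumption made for illustration: a population of 10,000 of whom 60% would answer "yes", sample sizes 25 and 400, 500 repeated samples, and an arbitrary seed.

```python
import random
import statistics


def spread_of_sample_proportions(population, n, repetitions, rng):
    """Standard deviation of the sample proportion over repeated SRSs of size n."""
    proportions = []
    for _ in range(repetitions):
        sample = rng.sample(population, n)   # an SRS: no individual chosen twice
        proportions.append(sum(sample) / n)  # proportion of "yes" (1) in the sample
    return statistics.pstdev(proportions)


rng = random.Random(2001)             # seeded so the run is repeatable
population = [1] * 6000 + [0] * 4000  # hypothetical population, 60% "yes"

small = spread_of_sample_proportions(population, 25, 500, rng)
large = spread_of_sample_proportions(population, 400, 500, rng)

print("spread with n = 25: ", round(small, 3))
print("spread with n = 400:", round(large, 3))
```

Neither sample size is biased; the sample proportions center on 0.6 in both cases. The larger samples simply land closer to it, which is what "more accurate" means here.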

HW: Read 3.1.  This HW covers all but pp. 174-6.
Hand in: 
Samples: p. 170, 3.4 employed women.  Also: What is the sampling frame?
 3.6 letters to Congress


p. 173 3.7 SRS
p. 207, 3.65 SRS
p. 184, 3.26 Random digits


p. 185 3.30 survey questions


p. 181, 3.16 bigger sample size
p. 185, 3.31 sampling error for men
Read, to discuss 
3.22 president
3.23 black police




p.180 3.14 ring-no-answer
3.15 2 campaign questions
Optional 

p. 3.24 SRS
Learn how to use SPSS to make an SRS. Ch.3 in the SPSS manual.  Unfortunately you first have to have a variable with a row for each individual in the population (or more properly, the sampling frame).  So it's most useful if you already have a big batch of data and want to look at a sub-sample from it.


Sievers home  Math151-Sp01/Day17.htm  3/6/01
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.