Math 151, Fall 2004, Wednesday, Day 18, October 6 (after class)

HW   Reading:  Ch. 3 thru 3.1.  Ahead in  3.2
Ch.3 Intro: 
Hand in Friday:  p. 167, 3.1, 3.2, 3.3 exp, obs
= = = = = = = = = = = =
Postpone: 3.1, Sampling
 p. 170, 3.4 employed women. Also: What is the sampling frame? (Def. p. 179, #3.13)
 3.6 letters to Congress
- - - - - - - - - - - - - - - - 
p. 173 3.7 SRS
p. 207, 3.65 SRS
p. 184, 3.26 Random digits
- - - - - - - - - - - - - - - -
p. 185 3.30 survey questions
- - - - - - - - - - - - - - - - -
p. 181  3.16 bigger sample size
p.185 3.31 sampling error for men
Read, to discuss 
Ch.3 Intro: 
p. 170, 3.5 pop, samp...
p.182, 3.17 obsn/exp
    3.18 novel--pop, samp.
= = = = = = 
Postpone: Sampling 
p. 183, 3.22 president
3.23 black police
- - - - - - - - - - - 

p.180 3.14 ring-no-answer
3.15 2 campaign questions

Optional 
 

= = = =
Postpone:
 

- - - - - - - - 

p. 3.24 SRS

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Pick a digit (from 0,1,2,3,4,5,6,7,8,9) (if you didn't last time)  Write it down.
 Homework questions? (took most of class)
A. Income depends on height?!
    What is "$789", and what kind of analysis did they do? Footnote adds what?
Cars.sav, cars.spo: relevant to the econ graduates problem (2.56).  (X = weight, Y = time to accelerate to 60.  Heavier car should be slower? Oops. Panel by # of cylinders, or color by horsepower.)
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Association --> Causation? (Day 17)

Chapters 1 and 2 have covered analyzing data that was given to us--what it said about itself.
    Informally, develop guesses, suspicions, hypotheses about the world the data came from.
Ch. 3:  Producing Data:  Aim:  create data sets that will allow us to make inferences to a larger world than just the data we have.
       Observational Study:  Observes individuals, measures variables, does not influence the responses. (3.1)
                    Take Sample from a population, examine it,
                           hope it's representative so we can infer population is like sample.
                            (Not very useful for cause-and-effect--see  above)
        Experiment: Imposes treatment  on individuals, to see how the treatment influences  the response. (3.2)
                            Best for cause-and-effect.

Confounding:  Two variables (explanatory or lurking) are confounded when you can't sort out their effects on a response variable.
--Used to be: coffee drinking and smoking--most people did both, or neither...
Last year: women who ate at least one serving/day of whole grain (cereal, bread) much less likely to have heart attack.
   (Who eats whole grains?  Were those variables taken into account?)
______________________
Start Here Friday
Ch. 3.1 Designing Samples
>>Population: Entire group  that we want information about.
>>Sample: The part of the population we actually examine.
      Hope:  Sample will be representative of the population.

(SAMPLING) BIAS:  The design of a study is biased if it systematically favors certain outcomes.
    Check our "sample" of digits

Some refinements:
*Sampling frame: Moore p. 179 problem 3.13: the group from which the sample is actually chosen--as distinct from the "population", the group you want information about. The sampling frame is often, unfortunately, smaller than the population.  The sample is (usually much) smaller than the sampling frame.
* "Chosen" sample may not turn out to be actual sample, if some individuals don't respond--"Nonresponse", p. 178.
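A small simulation can make the "sampling frame smaller than the population" problem concrete. Everything in this sketch is hypothetical (the 1000 people, the 600-person frame, the 30% and 70% figures); it only illustrates how drawing from a frame that leaves people out can systematically favor one outcome.

```python
import random
import statistics

# Hypothetical population of 1000 yes/no answers (1 = yes, 0 = no).
# 600 people are in the sampling frame (30% say yes); 400 are not (70% say yes).
in_frame = [1] * 180 + [0] * 420
not_in_frame = [1] * 280 + [0] * 120
population = in_frame + not_in_frame
true_proportion = sum(population) / len(population)  # 460 / 1000

def average_sample_proportion(group, n, repeats, rng):
    """Average, over many SRSs of size n from `group`, of the sample proportion of yes."""
    proportions = [sum(rng.sample(group, n)) / n for _ in range(repeats)]
    return statistics.mean(proportions)

rng = random.Random(1)  # fixed seed so the run is repeatable
from_population = average_sample_proportion(population, 50, 400, rng)
from_frame = average_sample_proportion(in_frame, 50, 400, rng)
print(true_proportion, round(from_population, 2), round(from_frame, 2))
```

Samples drawn from the whole population center near the true proportion; samples drawn only from the frame center near the frame's own proportion, no matter how many times the sampling is repeated. That is what "systematically favors certain outcomes" looks like.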

Non-probability samples:

Simple Random Sample (SRS) of size n: n individuals chosen in such a way that every possible set of n individuals has an equal chance of being chosen.
HOW?  A chance mechanism: Cards, dice, computer program, or
Table of random digits (Simulates rolling a die with 0,1,....9, over and over...) (Table B, back flyleaf)
    Every digit, every sequence of digits, is equally likely to be "next" in any direction.
To use:  label everyone in the population with a number.
    Important:  Every labeling number needs the same number of digits.
    To label 9 people, use the labels 1,2,3,....9 (1-digit chunks)
    To label 15 people, use the labels 01, 02, ...10, 11, ...15 (2-digit chunks)
    To label 125 people, use the labels 001, 002, ... 124, 125 (3-digit chunks)
Pick a place (at random) in the table, start reading across in chunks of that size.  Keep going until you have n eligible numbers (skip any chunk that isn't a label; discard repeats).
                    Read Row 150:   07511   88915   41267   16853   84569   79367 ..
From 9 people, a sample of n = 5:   0, 7, 5, 1, 1, 8, 8, 9, ...   (0 is not a label, the second 1 and second 8 are repeats; sample is individuals 7, 5, 1, 8, 9)
From 15 people, a sample   07, 51, 18, 89, 15, 41, 26, 71, 68, 53, 84, 56, 97, 93, 67.... keep reading,
    go to next line (or back to top line) if you need more.  Only individuals 7 and 15 are chosen using this line.
From 125 people, a sample 075, 118, 891, 541, 267, 168, 538, 456, 979, 367...keep reading.  Only individuals 75 and 118 are chosen using this line.
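The table-reading procedure above can be sketched in a few lines of Python. The function name `read_labels` and its arguments are made up for this illustration; the digits are the part of Row 150 quoted above. It assumes labels run from 1 up to the population size, all padded to the same number of digits.

```python
def read_labels(digits, population_size, n):
    """Read equal-size chunks from a string of random digits.

    Keeps a chunk only if it is a valid label (1..population_size)
    and has not been chosen already; stops once n labels are found.
    """
    digits = "".join(ch for ch in digits if ch.isdigit())  # drop the spaces
    width = len(str(population_size))  # 9 -> 1 digit, 15 -> 2, 125 -> 3
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

ROW_150 = "07511 88915 41267 16853 84569 79367"

print(read_labels(ROW_150, 9, 5))    # [7, 5, 1, 8, 9]
print(read_labels(ROW_150, 15, 5))   # [7, 15]   (line runs out; keep reading the next row)
print(read_labels(ROW_150, 125, 5))  # [75, 118] (line runs out; keep reading the next row)
```

When the function returns fewer than n labels, the digits supplied ran out. By hand you would continue on the next line of the table.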

    Why the same number of digits in each label?  Each individual 3-digit chunk is as likely as any other 3-digit chunk.  But a 1- or 2-digit chunk is more likely than any 3-digit chunk. So 2 will come up more often than 12, but 02 will come up just as often as 12.
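    The claim about 2, 12, and 02 can be checked by listing every possible chunk of a given size, assuming (as the table does) that all chunks of that size are equally likely. The helper name `chance_of_chunk` is made up for this sketch.

```python
from fractions import Fraction
from itertools import product

def chance_of_chunk(chunk):
    """Chance that the next chunk read from the table equals `chunk`,
    when every chunk with that many digits is equally likely."""
    width = len(chunk)
    all_chunks = ["".join(p) for p in product("0123456789", repeat=width)]
    return Fraction(all_chunks.count(chunk), len(all_chunks))

print(chance_of_chunk("2"))   # 1/10
print(chance_of_chunk("12"))  # 1/100
print(chance_of_chunk("02"))  # 1/100
```

    Read as a 1-digit chunk, the label 2 has a 1-in-10 chance each time; padded to 02, it has the same 1-in-100 chance as 12.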

    Why across?  For consistency on HW, go the way they say (so you get the answer in the book).  In practice, you can read up, down, backwards, as long as you decide beforehand, and don't change in the middle of choosing the sample.
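    The "computer program" chance mechanism mentioned earlier can be sketched with Python's standard library. This is one possible approach, not the one used in class or in the book: `random.sample` draws n distinct items without replacement, and the function name `simple_random_sample` and the seed value are assumptions made for this example.

```python
import random
from math import comb

def simple_random_sample(population_size, n, seed=None):
    """Draw an SRS of n labels from 1..population_size (no repeats)."""
    rng = random.Random(seed)  # pass a seed only if you want a repeatable draw
    return sorted(rng.sample(range(1, population_size + 1), n))

sample = simple_random_sample(15, 5, seed=2004)
print(sample)

# Number of different sets of 5 that could be chosen from 15 people;
# in an SRS each of these sets has the same chance.
print(comb(15, 5))  # 3003
```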

Sources of bias, even in probability samples:

Inference to the population: Sample results will vary.
   Different samples will represent the population with differing accuracy.
   Well-designed Random (probability) sampling will avoid systematic bias.
   In general, a larger random sample will give more accurate information about the population than a smaller random sample.
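The last point can be illustrated with a simulation: draw many random samples of each size and see how much the sample results vary. The population here (10,000 people, 40% answering yes) and the function name are hypothetical, chosen only for the sketch.

```python
import random
import statistics

def spread_of_sample_proportions(population, n, repeats=500, seed=0):
    """Standard deviation of the sample proportion over many SRSs of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = [sum(rng.sample(population, n)) / n for _ in range(repeats)]
    return statistics.pstdev(proportions)

# Hypothetical population: 1 = yes, 0 = no, with 40% yes.
population = [1] * 4000 + [0] * 6000

small = spread_of_sample_proportions(population, 25)
large = spread_of_sample_proportions(population, 400)
print(round(small, 3), round(large, 3))
```

Both sample sizes are unbiased here (each is an SRS), but the results from samples of 400 cluster much more tightly around the population value than the results from samples of 25.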

Sievers home  Math151-Fall04/Dayf18.htm  2pm 10/06/04
This page belongs to Sally Sievers, who is solely responsible for its content.