Math 151, Fall 2008 Wednesday Day 18, Oct. 8 Hit reload....After class.

HW:  (Re) read pp. 133-136. Read Ch. 7 (Summary review). Read p. 186.  Next:  Chapter 8.  (Read p. 200 (Other designs) last (it's optional)).  Check p. 206, 8.17-22, 26 at first, then 8.23-25 with Table B.  Ahead, Chapter 9.

Hand in After Break:
NOTHING! Postpone all! Have a good one!

 Note: We are finished with SPSS for a while. None required for the next few chapters.
Ch. 8 Producing Data
p. 192, 8.1, 8.2, 8.3 expt, obsn
p. 207, 8.27 Alcohol & heart attacks (expt/obsn)

Postpone the rest: Here in case you want to work ahead.
p. 194, 8.4, 5, 6 population/sample
. . . . . .

p. 195, 8.7 Sampling badly on campus
- - - - - - -
p. 199 8.10 Minority Managers Use the Simple Random Sample Applet, and choose a sample of size 6. Give your answer by listing their names. (I believe that everyone will get different samples.) Five of the 28 managers have East Asian surnames:  Huang, Kim, Liao, Shen, Wang.  How many of these are in your sample?  

p. 199 8.9 Apartment living, SRS. Use Table B.
p. 209, 8.36 Area code sample, SRS  Use Table B.
p. 211, 8.45 random digit dialing
p. 210, 8.41 random digit characteristics
p. 209-10, 8.38 b only Traffic lights
p. 208, 8.30 movie viewing
+ + + + + + + +
p. 205, 8.16, Ask more people
p. 212, 8.50, Polling Hispanics

Read, to discuss 

postpone.
p. 208, 8.29 safety of anesthetics
p. 192 8.3 TV & aggression (lurking)

postpone. . . p. 195, 8.8 more Sampling badly on campus
- - - -
p. 211, 8.47 guns

p. 204, 8.14, 8.15 biases.
p. 208, 8.31 world affairs
p. 211, 8.46 wording survey questions

p. 212, 8.49 Canada healthcare


Optional 

postpone. .

p. 209, 8.35 Use table B (more practice)

p. 209, 8.34 seat belt use

Pick a digit (from 0, 1, 2, 3, 4, 5, 6, 7, 8, 9).  Write it down.  Write it by your name on the clipboard.
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Exam 2 this Friday: Day 19 (Oct. 10).  Next class.  Let me know Right Away if you can't take the exam Friday.  Starts with Ch. 3,  Normal distribution, tables.  Thru Ch. 4, and what we cover of Ch.5 (&7)  through  today.  (All questions on the sample exam will be covered.)  Sample exam (handout), solutions (link) (NOW works) Normal probability practice  
One sheet of notes: I will give you paper copies of the Normal table.
Don't forget:  "None of the above"
Reading (but not creating) SPSS output (As worksheet, #2, 7, 8. Solutions).

Questions?  Last HW?  Day 17
Questions for exam?

- - -More time?  Topics chosen from below.- - -
Revisit r^2:   Day 17 done
(sum of squared residuals / sum of squared dev's from y-bar) = proportion of variability in y's NOT explained by regression line on x.
r^2 = 1 - (sum of squared residuals / sum of squared dev's from y-bar) = proportion of variability which IS explained by regression line on x.
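
A quick check of the identity above, as a minimal sketch. The six (x, y) points are made up for illustration (they are not from the text or from class), and the line is the usual least-squares line.

```python
# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Building blocks: sums of squared deviations and cross-products.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)   # squared dev's from y-bar

# Least-squares regression line of y on x.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Sum of squared residuals (vertical distances from the line).
ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

unexplained = ss_resid / syy      # proportion NOT explained by the line
r_squared = 1 - unexplained       # proportion that IS explained
r = sxy / (sxx * syy) ** 0.5      # correlation, computed directly

print(round(r_squared, 4), round(r ** 2, 4))
```

The two printed numbers should agree: squaring the correlation gives the same value as 1 minus the unexplained proportion.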

Cautions: Day 16
     Plot the data:   summary numbers (r, line) not resistant; only measure linear.
      Extrapolation--beware. Relationship may not persist outside range of data. 
             Government projections of national budget surplus/deficit: 
(www.cbo.gov publications>search)
                Wed, after break: Budget extrapolations

     "Lurking" variable has an important effect, but not one of the variables studied.

Association does not imply causation Day 17 for detail done
Establishing that x "causes" y:  difficult:

    Best: Do an experiment in which we change x, keep lurking variables under control. (E.g.   Rats.  Ch.9)
    Otherwise: Strong association. Consistent over many studies. Higher x-->stronger y.  X precedes y in time.  A plausible mechanism exists (parallel studies?)                 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Start here Wed. after break

Chapters 1 through 5 have covered analyzing data that was given to us--what it said about itself.
    Informally, develop guesses, suspicions, hypotheses about the world the data came from.
From Exploration to Inference p. 186

Ch. 8&9:  Producing Data:  Aim:  create data sets that will allow us to make inferences to a larger world than just the data we have.

  Observational Study:  Observes individuals, measures variables, does not influence the responses. (ch.8) 
                 Sometimes observe individuals who are (more or less) conveniently at hand, or, better,
                  Take Sample from a population, examine it.... (ch.8)
  Experiment: Imposes treatment  on individuals, to see how the treatment influences  the response. (ch.9)  

Confounding:  Two variables (explanatory or lurking) are confounded when you can't sort out their effects on a response variable.  (Rats:  Mothers' grooming causes sociability, or inherited sociability from mothers who like to groom?).

If you want to work ahead, this is where we're going:
Ch. 8 p. 192ff.  Sampling
>>Population: Entire group  that we want information about.
>>Sample: The part of the population we actually examine.
        Hope:  Sample will be representative of the population.
>> Sampling design:  Describes exactly how sample is to be chosen from population.

(SAMPLING) BIAS:  The design of a study is biased if it systematically favors certain outcomes.
.Your digits! .

Sample survey:  (attempt to) choose a representative sample from a large, varied population. Not Easy!
    Some issues:  What population do we want to understand?  What exactly do we want to measure?

Non-probability samples (sampling badly):


Simple Random Sample (SRS) of size n:  n individuals chosen in such a way that every possible set of n individuals has an equal chance of being chosen.   A probability sample (p.200).
HOW?  A chance mechanism: Label everyone in the population.  Use Cards, dice, lotto balls, computer program,
       Simple Random Sample Applet, Enter population size, sample size, hit Reset, then Sample.
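
For the "computer program" option, here is a minimal sketch in Python (not the applet, and not required for the course). It labels a population 1 through 28 and draws an SRS of size 6, the sizes used in problem 8.10. The function name and the seed are just illustrative choices.

```python
import random

def simple_random_sample(population_size, n, seed=None):
    """Draw an SRS of n labels from 1..population_size, without repeats."""
    rng = random.Random(seed)
    labels = range(1, population_size + 1)
    # random.sample gives every possible set of n labels the same chance.
    return sorted(rng.sample(labels, n))

# 28 labeled individuals, sample of size 6 (sizes borrowed from problem 8.10).
sample = simple_random_sample(28, 6, seed=151)
print(sample)
```

Change or drop the seed and you will (almost always) get a different sample, just as with the applet.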
OR

Table of random digits (Simulates rolling a die with 0,1,....9, over and over...) (Table B, p.686)
    Every digit, every sequence of digits, is equally likely to be "next" in any direction.
To use:  label everyone in the population with a number.
    Important:  Every labeling number needs the same number of digits.
    To label 9 people, use the labels 1,2,3,....9 (1-digit chunks)
    To label 15 people, use the labels 01, 02, ...10, 11, ...15 (2-digit chunks)
    To label 125 people, use the labels 001, 002, ... 124, 125 (3-digit chunks)
Pick a place (at random) in the table, start reading across in that size chunk.  Get n eligible numbers (discard repeats)
                    Read Row 150:   07511   88915   41267   16853   84569   79367 ..
From 9 people, a sample n = 5:   0,7, 5, 1, 1, 8, 8, 9, 1, 5, 4,     (sample is individuals 7, 5, 1, 8, 9)
From 15 people, a sample   07, 51, 18, 89, 15, 41, 26, 71, 68, 53, 84, 56, 97, 93, 67.... keep reading,
    go to next line (or back to top line) if you need more.  Individuals 7, 15,...are chosen using this line.
From 125 people, a sample 075, 118, 891, 541, 267, 168, 538, 456, 979, 367...keep reading.  Individuals 75, 118, ...
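
The Table B procedure above can be sketched as a short program. This is only an illustration of the bookkeeping (chunk size, discarding ineligible labels and repeats); the function name is made up, and it uses only the part of row 150 shown above.

```python
def sample_from_digits(digits, population_size, n):
    """Read equal-width chunks from a string of random digits and keep
    the first n eligible, non-repeated labels (labels run 1..population_size)."""
    digits = "".join(ch for ch in digits if ch.isdigit())  # drop the spaces
    width = len(str(population_size))   # 9 -> 1 digit, 15 -> 2, 125 -> 3
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                break
    return chosen   # may be shorter than n if the digits run out

row_150 = "07511 88915 41267 16853 84569 79367"
print(sample_from_digits(row_150, 9, 5))     # 1-digit chunks
print(sample_from_digits(row_150, 15, 5))    # 2-digit chunks
print(sample_from_digits(row_150, 125, 5))   # 3-digit chunks
```

With this one line of digits, the first call returns individuals 7, 5, 1, 8, 9. The other two calls find only 7, 15 and 75, 118, so you would keep reading on the next line of the table to finish those samples.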

    Why the same number of digits in each label?  Each individual 3-digit chunk is as likely as any other 3-digit chunk.  But a 1- or 2-digit chunk is more likely than any 3-digit chunk. So 2 will come up more often than 12, but 02 will come up just as often as 12.
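
A small simulation of this point, as a sketch only. Python's random number generator stands in for Table B here (an assumption for illustration), and the seed and number of digits are arbitrary.

```python
import random

rng = random.Random(2008)
# A long string of simulated random digits, standing in for Table B.
digits = "".join(str(rng.randrange(10)) for _ in range(200_000))

def chunk_frequency(digits, width, target):
    """Fraction of non-overlapping chunks of the given width equal to target."""
    chunks = [digits[i:i + width]
              for i in range(0, len(digits) - width + 1, width)]
    return chunks.count(target) / len(chunks)

freq_2 = chunk_frequency(digits, 1, "2")     # the 1-digit label 2
freq_12 = chunk_frequency(digits, 2, "12")   # the 2-digit label 12
freq_02 = chunk_frequency(digits, 2, "02")   # the 2-digit label 02

print(freq_2, freq_12, freq_02)
```

The label 2 should turn up in about 1 chunk out of 10, while 12 and 02 should each turn up in about 1 chunk out of 100.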

    Why across?  For consistency on HW, go the way they say (so you get the answer in the book).  In practice, you can read up, down, backwards, as long as you decide beforehand, and don't change in the middle of choosing the sample.
+ + + + + + + + + + + + + + + + + +

Some more sources of bias, even in probability samples (p. 201-3):
**Undercoverage:  Some groups in the population are left out, or slighted,  in the process of choosing the sample.
  
One possible source of undercoverage: the sampling frame (Moore p. 211, problem 8.45). This is the group from which the sample is actually chosen, as distinct from the "population," the group you want information about. The sampling frame is often, unfortunately, smaller than the population.  (Often a "list" that already exists.) The sample is (usually much) smaller than the sampling frame.
** "Chosen" sample may not turn out to be actual sample, if some individuals don't respond--"Nonresponse".
**Response bias:  Lies, bad memory, pleasing the interviewer (nutrition surveys), interview technique.
**Wording of questions:  Confusing? Leading? Limiting choices?
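
To see why undercoverage is a bias and not just bad luck, here is a sketch with an entirely made-up population (the counts and percentages are assumptions, not data from the text). The sampling frame leaves out a group that answers differently, so an SRS from the frame misses the true proportion no matter how large the sample is.

```python
import random

# Hypothetical population of 10,000 people; 1 = "yes", 0 = "no".
in_frame = [1] * 2800 + [0] * 4200       # 7,000 people on the list, 40% yes
not_in_frame = [1] * 2400 + [0] * 600    # 3,000 people left off, 80% yes
population = in_frame + not_in_frame

true_proportion = sum(population) / len(population)

rng = random.Random(8)
estimates = {}
for n in (50, 2000):
    # A perfectly good SRS, but drawn from the frame, not the population.
    sample = rng.sample(in_frame, n)
    estimates[n] = sum(sample) / n

print(true_proportion, estimates)
```

The larger sample settles down near 40%, the proportion in the frame, not near the population's 52%. A bigger sample cannot repair a bad frame.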

Suppose we've done it right....
A probability sample (p.200) is from a design where impersonal chance is used to pick the individuals.  SRS is the most straightforward.  More sophisticated methods are often used, but they're optional this term. (More info)
+ + + + + + + +
We want to use the sample to make an inference about the population.  A sample will never exactly represent the population, but larger (RANDOM) samples give more accurate results than smaller random samples.
  (almost always. Quantify "more accurate" and "almost always" in chapter 14.)
(Not in text:  Surprisingly (?), this isn't usually because you have sampled a larger share of the population; what matters is the size of the sample itself.  A tablespoon of soup gives a pretty good sample, whether it's from a quart of soup or a 10-gallon vat (as long as it's well-stirred).  A toothpickful does not.)
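
The soup idea can be tried out in a simulation. This is a sketch under made-up assumptions: two populations (10,000 and 1,000,000 individuals), each 60% "yes," with SRSs of size 10 and size 1000 drawn repeatedly. The spread of the sample proportions measures how much the estimates bounce around.

```python
import random
import statistics

def spread_of_estimates(population_size, n, reps, rng):
    """Standard deviation of the sample proportion over many SRSs of size n."""
    yes_count = population_size * 6 // 10   # individuals 0..yes_count-1 say "yes"
    estimates = []
    for _ in range(reps):
        sample = rng.sample(range(population_size), n)
        estimates.append(sum(1 for i in sample if i < yes_count) / n)
    return statistics.stdev(estimates)

rng = random.Random(14)
results = {}
for population_size in (10_000, 1_000_000):     # the quart and the vat
    for n in (10, 1000):                        # the toothpick and the tablespoon
        results[(population_size, n)] = spread_of_estimates(
            population_size, n, 300, rng)

for key, value in results.items():
    print(key, round(value, 3))
```

The spread for n = 1000 should come out about the same for both populations, and much smaller than the spread for n = 10. Sample size drives the accuracy, not the fraction of the population sampled.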


Sievers home   Math151-Fall08/Dayf18.htm  11pm 10/14/08
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.