In lieu of class, write a few paragraphs (choose one). Hand in separately from HW.
B) A paragraph describing one of the workshops/talks you attended, plus a paragraph or so on a situation where organized data could be useful to an activist working for a cause (either data which was cited in a workshop you attended, or a place where you could see that information could help make or strengthen the "case" for a cause, or be useful in improving the activist's skill in some way).
C) Find one or more graphs, charts, or tables of numbers in the popular press or on the web. Hand in a copy of it/them. Explain what it's about and what it says, and critique how well it conveys the information. If you can do it better, redo it.
D) Research Florence Nightingale, primordial activist and statistician. Report why/how she fits into this year's theme of "Got Passion", and why I call her a statistician. (The Biographies link from the link here is excellent. A Yahoo search for "Florence Nightingale" + statistics gives lots more.)
E) Nothing. Counts as a class absence.
- - - - - - - - - - - - - - - - - - - - -
Homework: Reading: Sec. 2.4. Skip 2.5.
Please read ahead: Ch. 3 Intro, Sec. 3.1 (skip stratified and multistage sampling, pp. 174-5, on first reading).
| Hand in Monday
Sec. 2.4
p. 132 2.54 Dow average/stocks
p. 138 2.63 math & verbal r, states/individuals
C. Look again at p. 122, 2.37 (calories). These values are averaged values, over a bunch of people's guesses. What would the graph look like if all the individuals' separate guesses had been graphed? Add points to your graph to give the idea.
p. 133 2.55 TV watching & grades
A.(New problem) Income depends on height?!
Read the article and answer this.
Postpone ch. 3
|
Read, to discuss (all Moore)
Sec. 2.4
The Read problems I never asked (C, D) from Day 13
p. 136 2.57 firefighters
Postpone ch. 3
|
Optional
Sec. 2.4 p. 137 2.59 size of hospital
|
Cautions, Sec. 2.4
Plot the data:
Summary formulas and numbers don't tell the whole story. (Anscombe's
quartet, Moore p.127, 2.46-7)
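A quick way to see the point is to compute the correlation for two of Anscombe's four data sets. The sketch below uses the standard published values for sets I and II (typed in from the well-known quartet, not copied from Moore's exercises, so check them against the book); `pearson_r` is a helper written for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Anscombe's sets I and II share the same x values.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

r1 = pearson_r(x, y1)
r2 = pearson_r(x, y2)
print("set I  r =", round(r1, 3))
print("set II r =", round(r2, 3))
```

The two r values come out nearly identical, yet a scatterplot of set I looks roughly linear while set II follows a smooth curve. Only the plot reveals the difference.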
Extrapolation: extra (outside) + polation (putting a point): using the line to predict outside the range of x values you have data for. Such predictions are often not trustworthy.
Averaged data will usually produce a stronger relationship (higher correlation, r^2) than the merged raw data from individuals (the averaging hides much of the variability).
Heating-degree days graph (TA 2.1, pp. 86, 107): Each value represents a month's average temperature and average fuel use. If we graphed the daily temperature and fuel use, we would see a lot more scatter.
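A small simulation can illustrate the effect of averaging. This is a sketch with made-up numbers, not the heating data from the book: the 12 "months", the 30 "days" per month, the slope 0.2, and the noise levels are all assumptions chosen for illustration.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable

daily_x, daily_y = [], []      # one point per simulated day
monthly_x, monthly_y = [], []  # one point per simulated month (averages)

for month in range(12):
    level = 5 * month  # hypothetical typical x value for this month
    xs = [level + rng.gauss(0, 5) for _ in range(30)]
    ys = [1 + 0.2 * x + rng.gauss(0, 3) for x in xs]  # line plus daily noise
    daily_x += xs
    daily_y += ys
    monthly_x.append(sum(xs) / 30)
    monthly_y.append(sum(ys) / 30)

r_daily = pearson_r(daily_x, daily_y)
r_monthly = pearson_r(monthly_x, monthly_y)
print("r for daily values:     ", round(r_daily, 3))
print("r for monthly averages: ", round(r_monthly, 3))
```

The underlying line is the same in both cases. Averaging 30 days into one point cancels most of the day-to-day noise, so the monthly points hug the line much more closely than the daily points do.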
"Lurking" variable:
has an important effect, but is not one of the variables studied.
Meatloaf shrinkage vs. placement in oven? (Whether or not a cooking thermometer was used had the greatest influence.)
The time sequence of observations is a common one (learning, tiring, aging).
The trouble with lurking
variables is that by definition you don't know they're there. Look
behind every tree.
Association does not imply
causation
Manatees: year, boat registrations, kills
If you didn't know boat registrations, would you believe that "year" was
the cause of "kills"?
(Are all boats actually registered? Possible lurking variable= unregistered
boats.)
Direction? Does the rooster cause the sun to rise by crowing?
Both variables "caused" by a lurking variable?
--Women with a history of heavy antibiotic use have higher rates of
breast cancer.
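Here is a sketch of how a common cause can produce an association all by itself. The variables are entirely simulated: z plays the lurking variable, and x and y each depend on z but not on each other. The sample size and the band |z| < 0.1 are arbitrary choices for illustration.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(17)  # fixed seed so the run is repeatable

rows = []
for _ in range(20000):
    z = rng.gauss(0, 1)      # lurking variable
    x = z + rng.gauss(0, 1)  # x depends on z only
    y = z + rng.gauss(0, 1)  # y depends on z only, never on x
    rows.append((z, x, y))

# Correlation of x and y over everyone
r_all = pearson_r([x for _, x, _ in rows], [y for _, _, y in rows])

# Correlation of x and y among individuals with nearly the same z
band = [(x, y) for z, x, y in rows if abs(z) < 0.1]
r_band = pearson_r([x for x, _ in band], [y for _, y in band])

print("r(x, y), everyone:          ", round(r_all, 3))
print("r(x, y), z held nearly fixed:", round(r_band, 3))
```

Over everyone, x and y show a clear positive association. Among individuals with nearly the same z, the association all but disappears, which is what we expect if z alone is driving both.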
START HERE Monday
--Baby rats whose mothers licked and groomed them more grew up to be more exploratory, more social, and less timid.
Cause? Effect? How to tell?
Establishing that x "causes" y is difficult:
Best: Do an experiment in which we change x and keep lurking variables under control (Sec. 3.2). Rats.
Otherwise: Strong association. Consistent over many studies. Higher x --> stronger y. X precedes y in time. A plausible mechanism exists (parallel studies?).
Generalize rat grooming to humans?
E.g. hydrogenated oils --> heart disease? Homocysteine --> heart disease?
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Pick a digit (from 0,1,2,3,4,5,6,7,8,9).
Write it down.
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Chapters 1 and 2 covered analyzing data that was given to us: what it said about itself.
Informally, we developed guesses, suspicions, and hypotheses about the world the data came from.
Ch. 3: Producing Data. Aim: create data sets that will allow us to make inferences to a larger world than just the data we have.
Observational Study: Observes individuals and measures variables, but does not influence the responses. (3.1)
Take a sample from a population, examine it, and hope it's representative so we can infer that the population is like the sample.
(Not very useful for cause-and-effect; see above.)
Experiment: Imposes a treatment on individuals, to see how the treatment influences the response. (3.2)
Best for cause-and-effect.
Confounding: Two variables (explanatory
or lurking) are confounded when you can't sort out their effects
on a response variable.
--Used to be: coffee drinking and smoking--most
people did both, or neither...
Last year: women who ate at least one serving/day of whole grain (cereal, bread) were much less likely to have a heart attack.
(Who eats whole grains? What else is different about them? Were those variables taken into account?)
Ch. 3.1 Designing Samples
>>Population: Entire group that we want information about.
>>Sample: The part of the population we actually examine.
Hope: Sample will be representative
of the population.
(SAMPLING) BIAS: The design of a study is biased if
it systematically favors certain outcomes.
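To see what "systematically favors" means, compare two sampling designs repeated many times. This is a sketch on an invented population (the numbers 1 to 1000); the biased design, which can only reach the upper half of the population, is a hypothetical stand-in for any design that leaves part of the population out.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

def mean(values):
    return sum(values) / len(values)

population = list(range(1, 1001))  # invented population: 1, 2, ..., 1000
true_mean = mean(population)

# Design 1: a simple random sample of 50 from the whole population, 200 times
srs_means = [mean(rng.sample(population, 50)) for _ in range(200)]

# Design 2 (hypothetical biased design): only the upper half can be chosen
reachable = population[500:]  # the values 501, ..., 1000
biased_means = [mean(rng.sample(reachable, 50)) for _ in range(200)]

avg_srs = mean(srs_means)
avg_biased = mean(biased_means)
print("true population mean:       ", true_mean)
print("average of 200 SRS means:   ", round(avg_srs, 1))
print("average of 200 biased means:", round(avg_biased, 1))
```

Individual random samples miss the true mean in both directions, so their errors roughly cancel over many repetitions. The biased design misses in the same direction every time, and taking more samples of that kind does not fix it.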
Check our "sample" of digits
Some refinements:
* Sampling frame (Moore p. 179, problem 3.13): the group from which the sample is actually chosen, as distinct from the "population", the group you want information about. The sampling frame is often, unfortunately, smaller than the population. The sample is (usually much) smaller than the sampling frame.
* The "chosen" sample may not turn out to be the actual sample if some individuals don't respond ("nonresponse", p. 178).
Non-probability samples:
Simple Random Sample (SRS) of size n: n individuals chosen in such a way that every possible set of n individuals has an equal chance of being chosen.
HOW? A chance mechanism: cards, dice, a computer program, or a table of random digits (Table B, back flyleaf), which simulates rolling a ten-sided die with faces 0, 1, ..., 9 over and over.
Every digit, and every sequence of digits, is equally likely to be "next" in any direction.
How? Next....
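As a preview, here is a sketch of one common textbook procedure for drawing an SRS from a stream of random digits: give every individual a numeric label of the same length, read the digits in groups of that length, and skip any group that is not a valid label or that repeats one already chosen. The function name `srs_from_digits`, the population size of 30, and the computer-generated digit stream standing in for Table B are all assumptions made for this illustration.

```python
import random

def srs_from_digits(digits, population_size, n):
    """Choose n distinct labels from 1..population_size using a digit stream."""
    width = len(str(population_size))  # digits per label, e.g. 2 for 01..30
    chosen = []
    group = []
    for d in digits:
        group.append(d)
        if len(group) < width:
            continue  # keep reading until we have a full label
        label = int("".join(str(g) for g in group))
        group = []
        # Skip labels outside the population and labels already chosen
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                return chosen
    raise ValueError("ran out of digits before the sample was complete")

# A short hand-made digit string: 03 is kept, 99 is out of range,
# the second 03 is a repeat, 12 is kept.
small = srs_from_digits([0, 3, 9, 9, 0, 3, 1, 2], population_size=30, n=2)
print(small)

# Computer-generated digits standing in for a printed table
rng = random.Random(2004)
digit_stream = (rng.randrange(10) for _ in range(10000))
sample = srs_from_digits(digit_stream, population_size=30, n=5)
print(sample)
```

Because every two-digit group is equally likely, every label from 01 to 30 has the same chance of coming up next, and skipping invalid or repeated groups does not favor any individual over another.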
| Sievers home | Math151-Sp04/Days17.htm | 3pm | 3/10/04 |