Sample exam 2 is available (linked here) and outside my door. (The actual exam will probably be 5 pages.) Solutions are outside my door and will be on reserve soon.
Exam 2 is this Friday (Day 24, Apr. 1). It covers material through today's HW (but no more than Part III). Let me know by Wed. if you need a special time to take the exam.
How much computational detail from Part II? You don't need to know the formula for the correlation coefficient, but you should be able to guess roughly the r from a scatterplot, and know and use the properties on pp. 121-2. You will need to know, among other things, how to find b0 and b1 from the means, standard deviations, and r of the x- and y-values, and to give the formula for the regression line (like 17, p. 154); and to graph the regression line on top of the scatterplot. Also find by hand the value that the line predicts for a particular x.
You should be able to identify and calculate the residual value for a particular x-y point as its vertical distance from the line (negative if the point is below the line), and identify and understand potential influential points. You should know that the regression line goes through the point given by the two means, and that the regression line "rises" r standard deviations in y for each standard deviation increase in x (pp. 137-8); also that the regression line of "weight" on "height" is not the same line as the regression line of "height" on "weight". You should be able to describe verbally the meaning of R² in the context of a data set.
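As a rough sketch of the hand calculations described above: the slope is b1 = r × (s.d. of y) / (s.d. of x), and the intercept b0 is chosen so that the line passes through the point of the two means. The summary values below are made up for illustration and do not come from the text or from any class data; the function names are hypothetical too.

```python
def regression_line(x_mean, y_mean, s_x, s_y, r):
    """Return (b0, b1) for the line y-hat = b0 + b1 * x."""
    b1 = r * s_y / s_x           # slope: r standard deviations of y per s.d. of x
    b0 = y_mean - b1 * x_mean    # intercept: forces the line through (x_mean, y_mean)
    return b0, b1


def predict(b0, b1, x):
    """Value the line predicts for a particular x."""
    return b0 + b1 * x


def residual(b0, b1, x, y):
    """Observed y minus predicted y (negative if the point is below the line)."""
    return y - predict(b0, b1, x)


# Hypothetical summary statistics (illustration only).
x_mean, y_mean = 10.0, 50.0
s_x, s_y = 2.0, 8.0
r = 0.5

b0, b1 = regression_line(x_mean, y_mean, s_x, s_y, r)
print("b0 =", b0, " b1 =", b1)
print("prediction at the mean of x:", predict(b0, b1, x_mean))
print("prediction one s.d. above the mean of x:", predict(b0, b1, x_mean + s_x))
print("residual for the point (12, 51):", residual(b0, b1, 12.0, 51.0))
```

With these made-up numbers, moving one standard deviation up in x (from 10 to 12) raises the prediction by r × s_y = 4, and the point (12, 51) lies below the line, so its residual is negative.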
Day 22 (Mon. March 28): Reading: D&V Ch. 12, 13. Review Part III, p. 262. AS13. Bring questions on Parts II and III.
Next, D&V Part IV: Ch. 14, Ch. 15 through p. 291 (then Ch. 18 and on). ActivStats is very good for Part IV: Ch. 11 shows the Law of Large Numbers as D&V express it. Ch. 14 and 15 correspond well with the text and present very good examples.
Hand in:
Chapter 13, p. 257ff: 1, 2, 4, 5, 6, 10, 11, 12. You did the "observational study" ones, and started the "experiment" ones. Finish these for those that are experiments, and add 17, 18, 32 Shingles, 35 Safety switch, 36 Washing clothes. From Review part III, p. 263ff.
A. Do the Chart experiment: ActivStats 13-3, first activity. Save your data with your name on the file, remembering where you saved it. Do the next two SPSS activities on that page. (One error in the tutorials: your files are NOT of type .txt; they are of type .dat. Safest: use "all files" to locate them.) Hand in the graphs you made, writing what results you see and whether you think they are "statistically significant".
B. REDO the assignment on the handout Using SPSS to find a Simple Random Sample, this time first doing Transform > Random number seed > make sure Random Seed is selected, and click OK. Also: find the mean duration for your sample and for the whole set. Bring to class to pool your results. More here on seeds.
= = = = = = = = = = = = = =
9 Spinner. Using independence:
Read, to discuss:
Review
Optional:
Homework questions? (Day 21)
Chapter 13: Experiment: Continue Day 21
Brief summary: All about avoiding BIAS
Principles of designing a comparative experiment (p. 243)
Block designs (not "completely randomized"):
(Randomized) Block design: Sort experimental units into "blocks" = groups homogeneous on potentially confounding variables. Within each block, randomize the treatments. Compare results within each block, then summarize all results.
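The randomization step of a block design can be sketched in a few lines. This is only an illustration under stated assumptions: the unit labels, block names, and the function `randomize_within_blocks` are hypothetical, and treatments are dealt out in rotation after shuffling so that each block gets a balanced assignment.

```python
import random


def randomize_within_blocks(blocks, treatments, seed=None):
    """Randomly assign treatments to units separately inside each block.

    blocks: dict mapping a block name to the list of units in that block.
    treatments: list of treatment labels.
    Returns a dict mapping each unit to its treatment.
    """
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)                 # random order within this block only
        for i, unit in enumerate(shuffled):   # deal treatments out in rotation
            assignment[unit] = treatments[i % len(treatments)]
    return assignment


# Hypothetical experimental units, already sorted into two homogeneous blocks.
blocks = {
    "block 1": ["u1", "u2", "u3", "u4"],
    "block 2": ["u5", "u6", "u7", "u8"],
}
assignment = randomize_within_blocks(blocks, ["control", "treatment"], seed=1)
for block_name, units in blocks.items():
    print(block_name, {unit: assignment[unit] for unit in units})
```

Because the shuffle happens inside each block, every block ends up with the same number of units on each treatment, and comparisons can then be made within blocks.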
Matched pairs is a special case of block design: each pair is a little "block".
Matched pairs: In an experiment, to compare control and experimental treatments (i.e. 2 levels):
Sort experimental units into "matching" pairs.
One member of each pair gets control, the other gets experimental. Randomize which.
Compare within each pair (find the difference), then summarize all comparisons.
Matching with self is common. It eliminates extraneous variability.
(Matching is also often used in observational studies.)
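A minimal sketch of the matched-pairs recipe, assuming made-up pair labels and made-up response values (none of these numbers come from the text). The helper names `assign_pairs` and `pair_differences` are hypothetical.

```python
import random
from statistics import mean


def assign_pairs(pairs, seed=None):
    """For each matched pair, randomize which member gets the control."""
    rng = random.Random(seed)
    plan = []
    for first, second in pairs:
        if rng.random() < 0.5:
            plan.append({"control": first, "experimental": second})
        else:
            plan.append({"control": second, "experimental": first})
    return plan


def pair_differences(control_results, experimental_results):
    """Compare within each pair: experimental response minus control response."""
    return [e - c for c, e in zip(control_results, experimental_results)]


# Hypothetical matched pairs of experimental units.
pairs = [("a1", "a2"), ("b1", "b2"), ("c1", "c2"), ("d1", "d2")]
plan = assign_pairs(pairs, seed=13)
print(plan)

# Hypothetical responses, listed pair by pair (illustration only).
control = [10, 12, 9, 11]
experimental = [12, 13, 9, 14]
diffs = pair_differences(control, experimental)
print("differences within pairs:", diffs)
print("mean difference:", mean(diffs))
```

The summary is computed on the within-pair differences, not on the two groups separately, which is what removes the pair-to-pair variability.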
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
= = =
Part IV: Randomness and Probability. (Why?)
We know that a sample from a population will not exactly represent the population. If we take random samples, the behavior of an individual sample is not predictable, but there will be a predictable pattern across many random samples from the same population. Knowing that pattern is as good as we can do.
We need probability.
Recall (Day 19), p. 227:
Sample, chosen from a Population
Numerical summary: Statistic (Latin letter) for the sample; Parameter (Greek letter) for the population.
The actual value of the Statistic will vary, depending on the particular sample: "sampling variability" = "sampling error".
The Statistic "estimates" the Parameter. We hope it is close to the parameter. If we choose simple random samples, we can understand the pattern of values the statistic can take.
Some examples of statistics:
Height: U.S. young women: pop. mean = 64.5", pop. s.d. = 2.5" (text p. 66. Caveat: rounded?)
Math 151, Spring '01, xbar = 64.2, s = 3.75.
Fall '01, xbar = 65.01, s = 3.22.
Spring '02, xbar = 64.53, s = 2.91.
Fall '02, xbar = 63.89, s = 2.48.
Spring '03, xbar = 64.98, s = 3.29.
Spring '04, xbar = 65.33, s = 2.25.
Coin flip: Proportion of heads p = 1/2 (?)
p-hat = 256/520 = .492 (combined data from many past classes)
Thumbtack: Proportion of point-up p = (??)
p-hat = 441/691 = .6382 (one past class, Math 251)
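To see sampling variability in action, here is a small simulation sketch. It assumes, purely for illustration, that heights follow a normal model with the population mean 64.5 and s.d. 2.5 quoted above, and that each "class" is a random sample of 30 (a made-up class size). Real classes are not random samples, so this only mimics the idea that xbar changes from sample to sample.

```python
import random
from statistics import mean


def sample_means(pop_mean, pop_sd, n, num_samples, seed=None):
    """Draw num_samples random samples of size n and return each sample's mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        means.append(mean(sample))
    return means


# Assumed normal model; n = 30 is a hypothetical class size.
xbars = sample_means(pop_mean=64.5, pop_sd=2.5, n=30, num_samples=6, seed=151)
for xbar in xbars:
    print(round(xbar, 2))

# The sample proportions quoted above are computed the same way: count / trials.
print(round(256 / 520, 3), round(441 / 691, 4))
```

Each run of six samples gives six different values of xbar scattered around the parameter 64.5, just as the class means listed above scatter from term to term.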
Chance behavior (a random phenomenon): Unpredictable in the short run, predictable regular pattern in the long run.
(Prof. Persi Diaconis (a table magician) can flip a coin so precisely it always comes up the way he wants. His coin flipping is not a random phenomenon. Mine is.)
"Probability" of a particular something happening: the proportion of times it would happen in a very long series of independent repetitions (trials) of the phenomenon: "long-run relative frequency".
(Independence: the outcome of one trial must not influence the outcome of any other.)
Law of Large Numbers (LLN): The relative frequency in repeated independent trials gets closer to the "true" relative frequency as the number of trials increases.
(But it may take a long time: Large Numbers of trials. Use http://www.whfreeman.com/scc -- "Probability", 1 toss at a time: it settles down slowly.)
(Another version of the LLN says the mean from a sample of size n gets closer and closer to the true = "population" mean as you take bigger samples (as n increases). ActivStats presents this, 14-1, and we'll return to this soon.)
Aberrations won't be compensated for; they will only be swamped out. (Misconception of the "law of averages.")
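The slow settling-down of the relative frequency can be imitated with a short simulation. This is a sketch, not the applet mentioned above: it assumes a fair coin (p = 0.5) and uses Python's pseudo-random generator with an arbitrary seed; the function name is hypothetical.

```python
import random


def running_proportion_of_heads(num_tosses, p_heads=0.5, seed=None):
    """Simulate independent tosses; return the proportion of heads after each toss."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for toss in range(1, num_tosses + 1):
        if rng.random() < p_heads:   # each toss is independent of the earlier ones
            heads += 1
        proportions.append(heads / toss)
    return proportions


props = running_proportion_of_heads(100_000, seed=151)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, "tosses: proportion of heads =", round(props[n - 1], 4))
```

An early run of heads is never "paid back" by extra tails later. Its effect on the proportion simply shrinks as the number of tosses in the denominator grows.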
Probability Model:
A random phenomenon.
Sample space S: set of all possible outcomes (no overlap of descriptions) (def. p. 284)
Event: any set of outcomes (including one outcome, and even the set containing no outcomes)
Probability model: S, and a way of assigning a probability to each event.
Sample space depends on what you want to know:
Phenomenon: Flip coin twice.
S1 = {HH, HT, TH, TT}
S2 = {0, 1, 2} number of heads
S3 = {Y, N} both are heads?
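The three sample spaces can be listed mechanically, which makes it clear that S2 and S3 are just coarser descriptions of the same two flips. A short sketch (variable names follow the notes; everything else is illustrative):

```python
from itertools import product

# S1: every ordered result of two flips.
S1 = ["".join(flips) for flips in product("HT", repeat=2)]

# S2: describe each result only by its number of heads.
S2 = sorted({outcome.count("H") for outcome in S1})

# S3: describe each result only by whether both flips are heads.
S3 = sorted({"Y" if outcome == "HH" else "N" for outcome in S1})

print("S1 =", S1)
print("S2 =", S2)
print("S3 =", S3)
```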
Probability rules: pp. 274-5, in words, then in notation.
A is an event in sample space S; P(A) is "the probability that A occurs".
These rules are all true for proportions in the long run (probabilities), proportions of counts, and proportions of areas.
1. 0 ≤ P(A) ≤ 1
2. P(S) = 1
3. For any event A, P(A does not occur) = 1 - P(A)
4. A and B are disjoint if they have no outcomes in common (can't happen simultaneously). If A and B are disjoint, their probabilities add: P(A or B) = P(A) + P(B)
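The four rules can be checked on a small example. The sketch below assumes a fair coin flipped twice, with the four outcomes HH, HT, TH, TT treated as equally likely (an assumption made for illustration); the events A and B are hypothetical choices.

```python
from fractions import Fraction

# Assumed model: two flips of a fair coin, four equally likely outcomes.
model = {outcome: Fraction(1, 4) for outcome in ("HH", "HT", "TH", "TT")}
S = set(model)


def prob(event):
    """Probability of an event = sum of the probabilities of its outcomes."""
    return sum((model[outcome] for outcome in event), Fraction(0))


A = {"HH"}   # both flips are heads
B = {"TT"}   # both flips are tails

# Rule 1: every probability is between 0 and 1 (the empty event has probability 0).
assert prob(set()) == 0
assert all(0 <= prob({outcome}) <= 1 for outcome in S)

# Rule 2: the whole sample space has probability 1.
assert prob(S) == 1

# Rule 3: P(A does not occur) = 1 - P(A).
assert prob(S - A) == 1 - prob(A)

# Rule 4: A and B share no outcomes, so their probabilities add.
assert A.isdisjoint(B)
assert prob(A | B) == prob(A) + prob(B)

print("P(A) =", prob(A), " P(not A) =", prob(S - A), " P(A or B) =", prob(A | B))
```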
Pick one person from U.S. Pop. (Age 25 +)
Finite sample spaces (you can list the outcomes):
Assign a probability to each outcome (>0) so they add to 1. (Sometimes equal values, "equally likely", make sense.)
The probability of an event is the sum of the probabilities of its outcomes.
Phenomenon: Flip coin twice.
S1 = {HH, HT, TH, TT}
S2 = {0, 1, 2} number of heads
S3 = {Y, N} both are heads?
Sample space | HH  | HT  | TH  | TT  |
Prob's       | .25 | .25 | .25 | .25 |
P(tail followed by head) = ?
Sample space | 2   | 1   | 0   |
Prob's       | .25 | .50 | .25 |
P(at least 1 tail) = ?   P(1 of each) = ?   P(at least 1 Head) = ?   P(2 Heads) = ?
Sample space | Y   | N   |
Prob's       | .25 | .75 |
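The second and third tables follow from the first by adding up the probabilities of the S1 outcomes that share the same description. A sketch of that bookkeeping, using exact fractions (the helper names `relabel` and `prob` are hypothetical):

```python
from collections import defaultdict
from fractions import Fraction

# First table: four outcomes, probability .25 each.
p_S1 = {"HH": Fraction(1, 4), "HT": Fraction(1, 4),
        "TH": Fraction(1, 4), "TT": Fraction(1, 4)}


def relabel(model, label):
    """Build a coarser model by summing probabilities of outcomes with the same label."""
    new_model = defaultdict(Fraction)   # missing labels start at Fraction(0)
    for outcome, p in model.items():
        new_model[label(outcome)] += p
    return dict(new_model)


def prob(model, event):
    """Sum the probabilities of the outcomes for which event(outcome) is true."""
    return sum(p for outcome, p in model.items() if event(outcome))


p_S2 = relabel(p_S1, lambda o: o.count("H"))               # number of heads
p_S3 = relabel(p_S1, lambda o: "Y" if o == "HH" else "N")  # both heads?
print("S2:", p_S2)
print("S3:", p_S3)

print("P(tail followed by head) =", prob(p_S1, lambda o: o == "TH"))
print("P(at least 1 tail)       =", prob(p_S2, lambda heads: heads <= 1))
print("P(1 of each)             =", prob(p_S2, lambda heads: heads == 1))
print("P(at least 1 head)       =", prob(p_S2, lambda heads: heads >= 1))
print("P(2 heads)               =", prob(p_S2, lambda heads: heads == 2))
```

Note that "tail followed by head" can only be answered from S1: once the outcomes are described by the number of heads alone, the order of the flips is no longer visible.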
Flipping a coin twice was built from a simpler phenomenon, flipping a coin once: P(H) = .5, P(T) = .5
Rule 5. If A and B are two independent events, the probability that both A and B occur is the product of the probabilities of the two events: P(A and B) = P(A)×P(B), if (and only if) A and B are independent.
Rule 5 can be used to build probabilities for complex phenomena from simpler ones (Ch. 14), and to check structure in an existing sample space (Ch. 15).
e.g. Pick 2 people at random from the U.S. pop. (The pop. is so big that it's hardly changed by removing the first person. Independence OK.)
P(first has 4+ yrs college, and 2nd didn't graduate HS) = .280×.183 = .051
P(first didn't graduate HS, and 2nd has 4+ yrs college) = .183×.280 = .051
P(one didn't graduate HS, and the other has 4+ yrs college) = .051 + .051 = .102
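The arithmetic in this example combines Rule 5 (multiply, for independent picks) with Rule 4 (add, for the two disjoint orders). A short sketch using the two probabilities quoted above; the function name `both_occur` is hypothetical.

```python
def both_occur(p_a, p_b):
    """Rule 5: P(A and B) = P(A) * P(B), valid only when A and B are independent."""
    return p_a * p_b


# Building the two-flip model from the one-flip model.
p_heads = 0.5
print("P(HH) =", both_occur(p_heads, p_heads))

# Probabilities used in the example above.
p_college = 0.280   # 4+ yrs college
p_no_hs = 0.183     # didn't graduate HS

first_college_second_no_hs = both_occur(p_college, p_no_hs)
first_no_hs_second_college = both_occur(p_no_hs, p_college)

# The two orders cannot both happen (disjoint), so Rule 4 lets us add them.
one_of_each = first_college_second_no_hs + first_no_hs_second_college

print(round(first_college_second_no_hs, 3))
print(round(first_no_hs_second_college, 3))
print(round(one_of_each, 3))
```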
| Sievers home | Math151-Sp05/Days22.htm | 4pm | 3/26/05 |