Hand in Friday.
p. 397, 16.10 searching for ESP
p. 408, 16.40 success of trainees
p. 408, 16.41 schizophrenia markers
p. 423, 17.35 brains
p. 424, 17.37 support groups
Review meaning of P, significance:
p. 384, 15.48 Cicadas
p. 385, 15.52 P?
p. 385, 15.53 sig. def.?

Postpone the rest: Feel free to try them, keep your paper.
p. 434, 18.1 and 2, s<-->standard error
p. 436, 18.3 Critical values: Use Table C and also the Excel t-procedures sheet; be sure your answers are consistent.
p. 436, 18.4 Critical values: Use Table C. For b, make a sketch. Note the decimal place is different in (a) and (b).
p. 437, 18.5 Critical values for CI.
p. 437, 18.7 Ancient air CI. Make a dotplot or stemplot to examine the data. It will look somewhat skewed, but with so little data this kind of scatteredness can happen easily from a normal distribution. We should report that the skewness may make our CI only approximately accurate. Xbar = 59.5889% and s = 6.2553% are what you would get if you calculated from the data; use these to make your CI. Optional: Check with Excel t-procedures.
p. 453, 18.29 absenteeism CI
p. 455, 18.36 a. Calcium and blood pressure CI. (To use Table C where the degrees of freedom aren't given, go to the row with the lower degrees of freedom, here 50. You're giving up a little bit of sharpness rather than overstating your case. Optional: To see how much difference the "correct" t* would give, use Excel t-procedures.)
p. 441, 18.8 and 9 is it significant? Also, use Excel t-procedures to find the P-values more exactly.
p. 432, 18.25 read carefully. (One or more t-values was incorrectly computed.)
p. 432, 18.10 Ancient air test. See note to 18.7, for mean and s.d. Optional: check with Excel t-procedures.

Leftover problems from Day 30
These ideas are related to those in Ch. 15. You can get the answers visually by using the Statistical Significance Applet.
p. 290, 11.39 Pollutants in auto exhausts. For 11.39: You might want to know L so that if you tested your 25 cars and found a high value of x-bar, you would be able to compare it with L; if it was greater than L, you would go back to the manufacturer and say "I believe you sold me a batch of bad cars, because the chance of getting an average emission level this high if the exhaust system is working properly is only 1 in 100. It is more reasonable to believe the exhaust system is not working, than that we "are" that 1 in 100 possibility."
p. 290, 11.38 Glucose testing. If we use this cutoff level L to say that people (with a mean of 4 tests) over L "have diabetes", then the chance of declaring that someone "has diabetes" when they really are OK (with mean 125 mg/dl) is .05. .05 or 5% is the chance of a "false positive" using this protocol, when the real mean is 125.

Read, to discuss

Optional (more practice)
Review meaning of P, significance:
p. 384, 15.47 Rich?
p. 385, 15.51 5% vs. 1%?
Exams returned, to those absent Monday.
Comments
Buffer against one low hour exam:
The final % exam grade minus 10 points will be substituted for the lowest hour exam grade, if it is higher.
| Examples: | | Ex1 | Ex2 | Ex3 | Ex4 | final % | final -10 |
| Student 1 | Original | 85 | 80 | 85 | 60 | 85 | 75, replaces lower 60 |
| | Treated | 85 | 80 | 85 | 75 | 85 | <-- These will be used. |
| Student 2 | Original | 85 | 80 | 80 | 70 | 75 | 65, lower than 70, don't replace. |
| | Treated | 85 | 80 | 80 | 70 | 75 | |
| Student 3 | Original | 85 | 50 | 75 | 55 | 85 | 75, replaces lower 50 |
| | Treated | 85 | 75 | 75 | 55 | 85 | <-- These will be used. |
This is to encourage all to try to put it together for the (cumulative!) final.
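The buffer rule can be written out as a short calculation. This is only a sketch of the rule as stated above; the function name `apply_buffer` is made up for illustration, and it assumes that when two hour exams tie for lowest, only the first is replaced.

```python
def apply_buffer(hour_exams, final_pct):
    """Return the hour-exam grades after the 'final minus 10' substitution.

    hour_exams: list of hour-exam grades (Ex1..Ex4)
    final_pct:  final exam grade in percent
    """
    treated = list(hour_exams)           # leave the original list untouched
    substitute = final_pct - 10          # the "final -10" column
    lowest_index = treated.index(min(treated))
    # Replace the lowest hour exam only if the substitute is higher.
    if substitute > treated[lowest_index]:
        treated[lowest_index] = substitute
    return treated


student_1 = apply_buffer([85, 80, 85, 60], 85)
student_2 = apply_buffer([85, 80, 80, 70], 75)
student_3 = apply_buffer([85, 50, 75, 55], 85)
print(student_1, student_2, student_3)
```

The three calls reproduce the "Treated" rows for Students 1, 2 and 3 in the table.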
Ch. 15: "Significance tests use an elaborate vocabulary, but the basic idea is simple: an outcome that would "rarely" happen if a claim were true--is good evidence that the claim is NOT true." (p.363 top)
HW questions? Day
38
Measuring the strength of the evidence against H0 (a common measuring stick for all distributions and parameters):
P-value of a test: The probability, computed assuming that H0 is true, that the outcome would take a value as extreme as or more extreme than the one actually observed (if we could repeat the data-taking). p. 368.
The smaller the P-value, the stronger the data's evidence against H0 (and for Ha).
For a test of µ, using xbar (sigma known), the P-value is
--the area of the tail beyond the observed xbar, in the direction of Ha (one-sided),
--or twice that area (two-sided).
Applet: P-value of a test of significance automates this.
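The tail-area description above can also be checked numerically. The following is a minimal sketch, not the applet's method: the function name `z_test_p_value` is invented here, and the observed xbar of 133.225 is a hypothetical value chosen to go with the glucose setup (µ0 = 125, sigma = 10, n = 4).

```python
from statistics import NormalDist


def z_test_p_value(xbar, mu0, sigma, n, alternative="greater"):
    """P-value for a test of mu using xbar, with sigma known.

    alternative: "greater" (Ha: mu > mu0), "less" (Ha: mu < mu0),
                 or "two-sided" (Ha: mu != mu0).
    """
    z = (xbar - mu0) / (sigma / n ** 0.5)   # standardize the observed xbar
    std_normal = NormalDist()               # mean 0, standard deviation 1
    upper_tail = 1 - std_normal.cdf(z)      # area above z
    lower_tail = std_normal.cdf(z)          # area below z
    if alternative == "greater":
        return upper_tail
    if alternative == "less":
        return lower_tail
    return 2 * min(upper_tail, lower_tail)  # two-sided: twice the tail area


# Hypothetical observed mean of 4 glucose tests.
p_one_sided = z_test_p_value(133.225, mu0=125, sigma=10, n=4)
p_two_sided = z_test_p_value(133.225, mu0=125, sigma=10, n=4,
                             alternative="two-sided")
print(round(p_one_sided, 3), round(p_two_sided, 3))
```

The one-sided value comes out very close to .05, and the two-sided value is twice that.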
A "significance level" alpha is a probability level we decide on in advance as the "rarely" amount that will push us over into believing (well, sort of) that the H0 claim is not true. Simple benchmark numbers are used for it, like .10 (1 in 10), .05 (1 in 20), .01 (1 in 100).
When the P-value is less than (or equal to) a particular significance level alpha (say .05), we say, "The results are significant at the alpha = .05 level," or "The results are significant (P < .05)." Giving the actual P is better, if you can.
IF you use a particular alpha as a "cutoff" between "reject H0" and "fail to reject H0"--we can talk about the probability of rejecting H0 when it's true--and alpha is that probability!
Look at shoeboxes, and simulations.
For the shoeboxes,
the white numbers (where the mean is really 20) rejected H0: µ = 20 (incorrectly) at the alpha = .10 level in 3 of 18 samples (16.7%);
the yellow numbers (where the mean is really bigger than 20--24 I think) rejected H0: µ = 20 (correctly!) at the alpha = .10 level in 16 of 19 samples (84.2%).
Simulation: so far,
with mean = 20, reject H0: µ = 20 (incorrectly) at alpha = .10 in 16/125 or .128 of samples (close to .10);
with mean = 24, reject H0: µ = 20 (correctly!) at alpha = .10 in 91/125 or .728 of samples.
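A simulation of this kind can be sketched in a few lines. The notes do not give the population standard deviation, the sample size, or the direction of the test used in class, so sigma = 6, n = 8 and a one-sided z test (Ha: µ > 20) are assumptions made purely for illustration; the rates will not match the classroom counts.

```python
import random
from statistics import NormalDist


def rejection_rate(true_mean, mu0=20.0, sigma=6.0, n=8,
                   alpha=0.10, trials=20000, seed=1):
    """Fraction of simulated samples that reject H0: mu = mu0 (Ha: mu > mu0).

    sigma and n are assumed values, not taken from the notes.
    """
    rng = random.Random(seed)                 # fixed seed for repeatability
    z_star = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    sd_xbar = sigma / n ** 0.5                # standard deviation of xbar
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        z = (xbar - mu0) / sd_xbar
        if z > z_star:                        # "significant at level alpha"
            rejections += 1
    return rejections / trials


rate_h0_true = rejection_rate(true_mean=20.0)   # wrong rejections
rate_h0_false = rejection_rate(true_mean=24.0)  # correct rejections
print(rate_h0_true, rate_h0_false)
```

When the true mean is 20, the rejection rate should settle near alpha = .10; when the true mean is 24, it should be much higher, with the exact figure depending on the assumed sigma and n.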
>>Multiple Tests: beware! pp. 395-6
If you do 100 tests and use the alpha = .05 significance level for each, then the structure of testing implies this:
When all 100 null hypotheses H0 are true, on average about 5 of the 100 (.05) will give "significant" results by chance alone (falsely indicating the alternative hypothesis is to be preferred).
Moral: if you use the testing machinery as a screening instrument for many questions, a proportion will give falsely significant results. You can't accept the results from such multiple tests as good evidence, only as indicating questions requiring further, more specific study. The game gives you one shot, not a hundred shots.
(This is becoming an important issue for developing new statistical techniques, for instance in biology, where microarrays can do a thousand tests at once.)
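The arithmetic behind the warning is short enough to write out. This sketch assumes the 100 tests are independent of each other, which the notes do not state; the variable names are illustrative.

```python
alpha = 0.05     # significance level used for each test
n_tests = 100    # number of tests, all with H0 actually true

# Average number of falsely "significant" results.
expected_false_positives = alpha * n_tests

# Chance that at least one of the tests comes out "significant",
# assuming the tests are independent (an assumption, not from the notes).
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(expected_false_positives, round(p_at_least_one, 3))
```

Under that assumption, at least one false "significant" result is nearly certain, which is why a screening run of many tests is only a source of questions for further study.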
(not in text) You cannot legitimately test a hypothesis on the same data that first suggested that hypothesis. Every data set will turn up some unusual pattern if you examine it hard enough.
(If you must explore and confirm with the same data set, one way is to (randomly) take half the data set, explore it and generate hypotheses; then use the other half for confirmatory tests. You can use a P-value to describe unusualness, but be wary of making decisions with it if you didn't expect that particular unusualness.)
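The random split into an exploring half and a confirming half might look like the sketch below. The function name `split_half` and the toy data are made up for illustration.

```python
import random


def split_half(data, seed=None):
    """Randomly split data into an exploratory half and a confirmatory half."""
    rng = random.Random(seed)
    shuffled = list(data)          # copy, so the original order is kept
    rng.shuffle(shuffled)          # random order decides who goes where
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]


# Toy data: 10 observation labels.
explore, confirm = split_half(range(10), seed=0)
print(explore, confirm)
```

Hypotheses are generated from `explore` only; `confirm` is set aside until a specific hypothesis is ready to be tested.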
>>All the warnings about designing experiments and surveys still apply.
Today
Look back at 11.38, p. 297: a "backward normal" problem. From a proportion/probability, find a z*; from that, a raw value (here an x-bar). We can think of this as a significance testing question. n = 4, sigma = 10 mg/dl.
H0: µ = 125 mg/dl (Sheila is normal), Ha: µ > 125 (Sheila has gestational diabetes).
Find the L so that only .05 of random samples of 4 tests would have mean above L, among people (Sheila) whose real mean is 125.
L is the "cutoff" for doing an alpha = .05 test. 5% of "healthy" people will be diagnosed diabetic (false positive).
Doctors like a "decision making rule": they want an alpha cutoff to apply, rather than calculating a P-value for each individual's set of 4 tests.
Note that Table C gives us another way to get z*'s for some probabilities! Bottom row, "one sided P". The table is set up to go from "tail" probability to z*, without having to calculate "probability to the left." z* = 1.645.
Unstandardize z*: Remember that the standard deviation for xbars from samples of 4 will be sigma/sqrt(4) = 10/2 = 5.
1.645 s.d.'s above the mean is (Mean + 1.645 × sd) = 125 + 1.645 × 5 = 125 + 8.225 = 133.225.
So L = 133.225, and if doctors use that as a cutoff ("Gestational diabetes if mean of 4 tests > 133.225"), they will call only 5% of healthy people sick.
(We haven't calculated what percent of sick people won't
be "caught" by this test--we haven't defined "sick" with a number.)
You can check this visually, approximately, using Statistical Significance Applet
L marks the "cutoff."
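The same "backward normal" calculation can be done without the table. This is a sketch using Python's standard library in place of Table C; the variable names are illustrative. It uses the unrounded z*, so the cutoff differs from 133.225 in the third decimal place.

```python
from statistics import NormalDist

mu0 = 125.0     # mean glucose level for a healthy person, mg/dl
sigma = 10.0    # standard deviation of a single test, mg/dl
n = 4           # number of tests averaged
alpha = 0.05    # upper-tail probability we allow for healthy people

z_star = NormalDist().inv_cdf(1 - alpha)   # about 1.645
sd_xbar = sigma / n ** 0.5                 # 10/2 = 5
cutoff = mu0 + z_star * sd_xbar            # unstandardize: mean + z* x sd

# Check: the share of healthy people whose 4-test mean exceeds the cutoff.
false_positive_rate = 1 - NormalDist(mu0, sd_xbar).cdf(cutoff)

print(round(z_star, 3), round(cutoff, 3), round(false_positive_rate, 3))
```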
Ch. 18: Inference for population mean (realistic)
The most unrealistic of our "simple conditions" for inference (p. 344) was that we knew the population standard deviation sigma. We remove that condition here.
If we substitute s, the sample standard deviation, for sigma, the population standard deviation, in our Normal distribution formulas:
If n is quite big, the value of the sample standard deviation will be close to the value for the population, and our work is approximately right.
But if n is smaller, estimating sigma by s will add in extra variability! Problem solved by modifying the Z-distribution!
Standard error of the (sample) mean = standard deviation of xbar, estimated from the data.
"Standard error of the mean": s/sqrt(n). Also written SEM, SEXbar, etc.
Just like sigma/sqrt(n), only s from the data replaces sigma.
When you estimate the standard deviation of a statistic, the resulting estimate is called the "standard error" of the statistic.
t-distribution family: like standard normal, only slightly fatter in the tails, slightly more spread.
Mean = 0. Symmetrical around 0.
"Degrees of freedom" tell which member of the t family.
t(k) is the t distribution with k degrees of freedom.
Comparison with normal (Excel file)
Lower d.f.--fatter tails. Higher d.f.--more like standard normal.
Table C: "critical" t-values in the body, probabilities at top and bottom. Set up for P-->t.
Example. For t(20), t* = 2.086 corresponds to
Confidence level 95% = "middle" probability between -2.086 and +2.086
One-sided P = .025, probability in the one tail above +2.086 = probability in the one tail below -2.086
Two-sided P = .05, probability in the two tails beyond -2.086 and +2.086.
(For the z distribution, the corresponding z* is 1.96; notice t* is further out.)
Excel t-procedures sheet will find P's from t's.
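The t(20) example can be checked by adding up area under the t density curve. The sketch below is not how Table C or the Excel sheet works; it simply integrates the standard t density formula numerically (Simpson's rule) using only the standard library. The function names are made up for illustration.

```python
import math


def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_area_between(lo, hi, df, steps=2000):
    """Area under the t(df) curve between lo and hi (composite Simpson's rule).

    steps must be even.
    """
    h = (hi - lo) / steps
    total = t_pdf(lo, df) + t_pdf(hi, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2          # Simpson weights 4, 2, 4, 2, ...
        total += weight * t_pdf(lo + i * h, df)
    return total * h / 3


middle = t_area_between(-2.086, 2.086, df=20)   # confidence level C
one_tail = (1 - middle) / 2                     # one-sided P
two_tails = 1 - middle                          # two-sided P
print(round(middle, 3), round(one_tail, 3), round(two_tails, 3))
```

The three printed values should agree with the 95%, .025 and .05 in the example.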
Standardizing xbar with s instead of sigma results in the one-sample t statistic
t = (xbar - µ)/(s/sqrt(n)),
which has the t-distribution with n-1 degrees of freedom.
We'll now repeat all the stuff from Chapters 14 & 15, only wherever there was a z, we'll substitute a t.
Here we go....
Conditions for inference about a mean: (p. 434)
++ SRS (or reasonable facsimile)
++ Population is Normal. (Can relax to symmetric, single-peaked unless n "very small")
"One-sample" t-procedures:
SRS of size n. Use Xbar to estimate µ.
Substitute s for sigma in the standardizing formula. We get t instead of z, with n-1 degrees of freedom.
Check for at least approximate normality in the data set.
Confidence intervals:
Choose t* from Table C, using the n-1 row and confidence level C.
Special case of the common pattern: estimate ± t* × SE(estimate)
Significance tests:
State hypotheses as in Ch. 15, then find t from the data by calculating the one-sample t-statistic, using the null hypothesis value of µ (call it µ0).
Then proceed as if it were a "z", only using the (n-1) d.f. row in Table C: find the two t*'s it falls between, read off their P-values, and write "P-value is between ___ and ___".
(Or use software, which will find the P-value exactly.)
Example: bacteria per milliliter in 10 specimens of raw milk from one producer.
Parameter: actual mean bacteria/ml.
5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870
Stemplot (stems are thousands, leaves are hundreds, truncated):
4|5
4|77
4|889
5|11
5|23
n = 10, xbar = 4950, s = 268.45
SEM = 268.45/sqrt(10) = 268.45/3.162 = 84.89. Degrees of freedom = 9.
90% CI: from t(9) in the table, t* = 1.833.
CI is 4950 ± 1.833 × 268.45/sqrt(10) = 4950 ± 1.833 × 84.89, or 4950 ± 155.6 bacteria/ml.
If we had KNOWN population sigma = 268.45, we'd have used z* = 1.645 and gotten a narrower CI (but we don't know sigma!).
Test: H0: µ = 4800
t = (4950 - 4800)/SEM = 150/84.89 = 1.767
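The numbers in the milk example can be reproduced from the raw data. This is a sketch using Python's standard library; t* = 1.833 is copied from the notes (Table C, 9 degrees of freedom, 90% confidence) rather than computed, and the variable names are illustrative.

```python
from statistics import mean, stdev

counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]

n = len(counts)
xbar = mean(counts)          # sample mean
s = stdev(counts)            # sample standard deviation (divides by n-1)
sem = s / n ** 0.5           # standard error of the mean, s/sqrt(n)

t_star = 1.833               # from Table C: 9 d.f., 90% confidence
margin = t_star * sem        # t* x SE
ci = (xbar - margin, xbar + margin)

mu0 = 4800                   # null hypothesis value of mu
t_stat = (xbar - mu0) / sem  # one-sample t statistic

print(n, xbar, round(s, 2), round(sem, 2))
print(round(margin, 1), round(t_stat, 3))
```

The t statistic is then compared with the 9 d.f. row of Table C, or handed to software, to get the P-value.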
| Sievers home | Math151-Fall06/Daym39.htm | 2pm | 11/29/06 |