HW Day40: (Re)read Ch. 16, especially
Multiple tests, pp 395-6
Read Ch. 17, p. 414 and p. 417 I
Reading Ch. 18: We'll repeat the CI and test work, only with
s instead of sigma, and t instead of z. First read to p. 441; next, the rest. Read
it all! Check p. 451, 18.15, 15, 17, 18, 19, 20, 21, 22 first; next,
23, 24.
Ahead: Review
Ch. 9, p. 219 and around (Completely randomized experiment, especially with 2
treatments only), and p. 224 (Matched pairs experimental design).
Read Ch. 19, pp. 460-61 only!
(Comparing 3 or more independent groups requires Analysis of Variance, Ch. 25)
This plus Reading SPSS output will be the last work of the term.
Hand in Wednesday.
More Cautions:
p. 397, 16.10 searching for ESP
p. 408, 16.40 success of trainees
p. 408, 16.41 schizophrenia markers
Review concepts:
p. 423, 17.35 brains
p. 424, 17.36 support groups
p. 424, 17.37 CA brush fires, r^2
Review meaning of P, significance:
p. 384, 15.48 Cicadas
p. 385, 15.52 P?
p. 385, 15.53 sig. def.?
Note: Yes. Some errors were corrected from before class.
t-procedures:
p. 434, 18.1 and 2, s <--> standard error
p. 436, 18.3 Critical values: Use Table C. Check by plugging your t into the Excel t-procedures sheet; be sure your answers are consistent.
p. 436, 18.4 Critical values: Use Table C. For b, make a careful sketch to see what to do. Note the decimal place is different in (a) and (b).
p. 437, 18.5 Critical values for CI (sample size to d.f.)
p. 437, 18.7 Ancient air CI. Make a dotplot or stemplot to examine the data. It will look somewhat skewed, but with so little data this kind of scatteredness can happen easily from a normal distribution. We should report that the skewness may make our CI only approximately accurate. Xbar = 59.5889% and s = 6.2553% are what you would get if you calculated from the data; use these to make your CI. Optional: check with Excel t-procedures.
p. 453, 18.29 absenteeism CI
p. 455, 18.36 a. Calcium and blood pressure CI. Sample size is 27. (Check with Excel t-procedures.) (b will be assigned Wed.)
p. 441, 18.8 and 9 is it significant? Also, use Excel t-procedures to find the P-values more exactly.
p. 452, 18.25 "read carefully". (One or more t-values was incorrectly computed. Fix.)
p. 441, 18.10 Ancient air test. See note to 18.7 for mean and s.d. Optional: check with Excel t-procedures.
Read, to discuss
Optional (review): p. 424, 17.6 support groups
Review of significance testing, in brief:
Before taking data, define
H0: "Null hypothesis." A claim about
the population that we would like to show is NOT
true.
A parameter = a particular value. Example: H0:
µ = 1000 hrs. ("Average
lightbulb life.")
Ha: "Alternative hypothesis." A claim or statement about
the population we are trying to find evidence FOR.
The parameter is < or > the particular value
(one-sided/tailed), or NOT = the particular value (two-sided/tailed).
Take data. Calculate test statistic. For µ,
test statistic is the z-score of xbar. (Start with xbar,
standardize using mean of H0)
Is it an unlikely result if H0
is true? Then that is evidence against H0.
Evidence: how strong? P-value of a test: The
probability, computed assuming that H0
is true, that the outcome would take a value as
extreme or more extreme than that actually observed (if we
could repeat the data collection). p. 368.
Small P = strong evidence.
One-tail alternative: P = the tail beyond the
observed value, in the direction of Ha
Two-tail alternative: P = the sum of both
tails, farther out than the observed value in either direction.
Results are significant at level alpha if P <
alpha, not otherwise. Significance levels are usually
"benchmarks." What's "statistically significant" can vary
by field. (.05 is usually good.)
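A minimal sketch of the steps above for the lightbulb hypothesis H0: µ = 1000 hrs. The values of sigma, n, and xbar are made up for illustration (they are not course data), and sigma is assumed known, as in the Ch. 15 setting.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers for illustration only (not from the course data).
mu0 = 1000      # H0: mu = 1000 hrs
sigma = 100     # assumed known population standard deviation
n = 25          # sample size
xbar = 960      # observed sample mean

# Test statistic: standardize xbar using the mean claimed by H0.
z = (xbar - mu0) / (sigma / sqrt(n))

std_normal = NormalDist()                      # mean 0, standard deviation 1
p_lower = std_normal.cdf(z)                    # one tail, Ha: mu < 1000
p_two = 2 * (1 - std_normal.cdf(abs(z)))       # both tails, Ha: mu NOT= 1000

alpha = 0.05
print(f"z = {z:.2f}")
print(f"one-sided P (Ha: mu < 1000) = {p_lower:.4f}")
print(f"two-sided P (Ha: mu NOT= 1000) = {p_two:.4f}")
print("significant at alpha = .05 (two-sided)?", p_two < alpha)
```

The two-sided P is the one-tail P doubled, because the normal curve is symmetric.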
Cautions: see Day 39 for
details.
16.9, p. 395: "Statistically
insignificant" means the differences could easily be due just to chance
variability. Why is it important to know that the differences were also
"small"? Because a large difference could be "real" but
"statistically insignificant" just because the sample size was too
small to confirm it.
Other Homework questions? Day 39
New: Multiple Tests: beware! pp. 395-6
If you do 100
tests and use the alpha = .05 significance level for each, then
the structure of testing guarantees this:
When all 100 null hypotheses H0 are true, we expect about 5 of
the 100 (.05) to give "significant" results by chance alone (falsely
indicating the alternative hypothesis is to be preferred).
Details Day 39
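A small simulation sketch of the multiple-tests warning. It assumes a Normal population with mean 20 (as for the white shoebox numbers) and a known sigma of 5; the sigma, the sample size of 25, and the random seed are arbitrary choices for illustration. H0 is true in every test, so every "significant" result is a false alarm.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def count_false_positives(num_tests=100, n=25, mu0=20.0, sigma=5.0,
                          alpha=0.05, rng=None):
    """Run num_tests one-sided z tests of H0: mu = mu0 when H0 is really true.

    Returns how many of the tests come out "significant" at level alpha.
    """
    if rng is None:
        rng = random.Random()
    std_normal = NormalDist()
    hits = 0
    for _ in range(num_tests):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]  # H0 is true here
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        p_value = 1 - std_normal.cdf(z)                     # Ha: mu > mu0
        if p_value < alpha:
            hits += 1
    return hits

rng = random.Random(2008)   # fixed seed so the run is repeatable
counts = [count_false_positives(rng=rng) for _ in range(200)]
average = fmean(counts)
print("false positives in the first few batches of 100 tests:", counts[:10])
print(f"average per 100 tests: {average:.2f}")
```

Individual batches of 100 bounce around, but the long-run average should sit near 5 per 100.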
Look at shoeboxes, and simulations so far: real shoeboxes last term, and
your shoeboxes (1/15 falsely significant at alpha = .10).
For the real shoeboxes, the white numbers (where the mean is
really 20) rejected H0 : µ = 20 (incorrectly)
in favor of Ha : µ > 20 at the alpha = .10 level in
3 of 18 samples (16.7%) (Fall '06)
1 of 16 samples (6.3%) (Sp. '07): 4 of 34 (11.8%) combined
1 of 13 samples (7.7%) (Fall '07, & 251): 5 of 47 (10.6%) combined
1 of 15 graphed samples (6.7%) (Sp. '08): 6 of 62 (9.7%) combined
The yellow numbers (where the mean is really
bigger than 20--24, I think) rejected H0 : µ = 20 (correctly!)
at the alpha = .10 level in
16 of 19 samples (84.2%) (Fall '06)
13 of 17 samples (76.5%) (Sp. '07): 29 of 36 (80.6%) combined
11 of 13 samples (84.6%) (Fall '07, & 251): 40 of 49 (81.6%) combined
10 of 17 graphed samples (58.8%) (Sp. '08): 50 of 66 (75.8%) combined
Ch. 18: Inference for population mean (realistic)
The most unrealistic of our "simple conditions" for inference (p. 344) was that
we knew the population standard deviation sigma. We remove that
condition here.
If we substitute s, the sample standard
deviation, for sigma, the population standard deviation, in our
Normal distribution formulas:
If n is quite big, the value of the sample standard
deviation will be close to the population value, and our work is approximately right.
But if n is smaller, estimating sigma by s will add
in extra variability! Problem solved by modifying the Z-distribution!
Standard error of the (sample) mean =
standard deviation of xbar, estimated from the data.
"Standard error of the mean": s/sqrt(n). Also written SEM, SEXbar, etc.
When you estimate the standard deviation of a statistic, the resulting
estimate is called the "standard error" of the statistic.
t-distribution family: like standard normal, only slightly fatter in the
tails, slightly more spread.
Mean = 0. Symmetrical around 0.
t(k) is the t distribution with k degrees of freedom.
Comparison with normal (Excel graph):
Lower d.f.--fatter tails. Higher d.f.--more like standard normal.
Table C: "critical" t-values in the body,
probabilities at top and bottom. Set up for P-->t.
Example: in t(20), t* = 2.086 corresponds to
Confidence level 95% = "middle" probability between -2.086 and +2.086
One-sided P = .025, probability in the one tail above +2.086 = probability in
the one tail below -2.086
Two-sided P = .05, probability in the two tails beyond -2.086 and +2.086.
(For the z distribution, the corresponding z* is 1.96; notice t* is farther out.)
Excel t-procedures sheet will find P's from t's.
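For anyone without the Excel sheet, here is a rough sketch that finds tail probabilities from a t value by numerically integrating the t density. The function names `t_pdf` and `t_tail` are made up for this illustration, and Simpson's rule is just one convenient way to do the integration; it is not how Excel does it.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_tail(t, df, steps=2000):
    """One-tail probability beyond |t|, via Simpson's rule on [0, |t|].

    steps must be even. The area from 0 to |t| is subtracted from .5,
    the total area on one side of 0.
    """
    t = abs(t)
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

one_tail = t_tail(2.086, 20)     # tail above +2.086 in t(20)
two_tail = 2 * one_tail          # both tails
middle = 1 - two_tail            # confidence level
z_two_tail = 2 * (1 - NormalDist().cdf(2.086))  # same cutoff, normal curve

print(f"t(20): one tail = {one_tail:.4f}, two tails = {two_tail:.4f}, middle = {middle:.4f}")
print(f"standard normal, two tails beyond 2.086 = {z_two_tail:.4f}")
```

The normal curve puts less probability beyond 2.086 than t(20) does, which is the "fatter tails" in numbers.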
Standardizing xbar with s instead of sigma results in
the one-sample t statistic, which has the t-distribution with n-1 degrees
of freedom.
Conditions for inference about a mean: (p. 434)
++ SRS (or reasonable facsimile)
++ Population is Normal. (Can relax
to symmetric, single-peaked unless n "very small")
"One-sample" t-procedures:
SRS of size n. Use Xbar to estimate µ.
Confidence intervals:
Choose t* from Table C, n-1 d.f., level C.
Significance tests:
State hypotheses as in Ch. 15, find t from data, by:
Calculating the one-sample t-statistic, using the null hypothesis value of µ (call it µ0).
Then proceed as if it were a "z", only using the (n-1) d.f. row in Table C:
find the two t*'s that your t falls between, read off their probabilities, and
write "P-value is between ___ and ___".
(or use Excel t-procedures)
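In symbols, the two one-sample t procedures above can be written as follows. This is a restatement using the notes' own quantities: t* is the Table C critical value for n-1 d.f. at level C, and µ0 is the null hypothesis value.

```latex
\[
\text{Level } C \text{ confidence interval:}\quad
\bar{x} \;\pm\; t^{*}\,\frac{s}{\sqrt{n}}
\]
\[
\text{One-sample } t \text{ statistic:}\quad
t \;=\; \frac{\bar{x}-\mu_{0}}{s/\sqrt{n}},
\qquad \text{d.f.} = n-1
\]
```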
Example: bacteria per milliliter in 10
specimens of raw milk from one producer.
Parameter: actual mean bacteria/ml.
5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870
Stemplot (stems are thousands, leaves are hundreds):
4|5
4|77
4|889
5|11
5|23
n = 10, xbar = 4950, s = 268.45
SEM = 268.45/sqrt(10) = 268.45/3.162 = 84.89
deg. of freedom = 9
90% CI: from t(9) in table, t* = 1.833
CI is 4950 ± 1.833 x 268.45/sqrt(10) = 4950 ± 1.833 x 84.89, or 4950 ± 155.6 bacteria/ml.
If we had KNOWN population sigma = 268.45, we'd have used z* = 1.645 and gotten a narrower CI. (But we don't know sigma!)
Test: H0 : µ = 4800
t = (4950 - 4800)/SEM = 150/84.89 = 1.767
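The arithmetic in the bacteria example can be checked with a few lines of Python. This is only a sketch of the hand calculation; the critical value t* = 1.833 is typed in from Table C rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]

n = len(counts)
xbar = mean(counts)
s = stdev(counts)          # sample standard deviation (divides by n - 1)
sem = s / sqrt(n)          # standard error of the mean

t_star = 1.833             # Table C, 9 d.f., 90% confidence
margin = t_star * sem
ci_low, ci_high = xbar - margin, xbar + margin

mu0 = 4800                 # null hypothesis value
t = (xbar - mu0) / sem     # one-sample t statistic

print(f"n = {n}, xbar = {xbar}, s = {s:.2f}, SEM = {sem:.2f}")
print(f"90% CI: {xbar} +/- {margin:.1f}, i.e. {ci_low:.1f} to {ci_high:.1f}")
print(f"t = {t:.3f} with {n - 1} d.f.")
```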
optional: SPSS for "raw data"
Next time:
MATCHED PAIRS t procedures--
"Paired samples" (SPSS), "Paired comparisons":
before--after, left hand--right hand,
Drug A vs. Drug B on the same person or on a matched pair.
For each pair, find the difference in
the observed values. Then treat these differences as if they are "the"
data set, from a normal population, and do one-sample t procedures.
Usually (always?) the null hypothesis will be "µ = 0": there is "no difference" between
the treatments.
Example: wax paper sandwich bags:
Is the wax layer the same inside and out?
25 bags: measure (wax outside - wax inside)
for each. (pounds per square foot).
Differences: xbar = .093, s = .723, n = 25
SEM = .723/sqrt(25) = .723/5 = .1446
H0 : µ = 0 (mean difference is 0)
Ha : µ NOT= 0 (there is a difference)
t = (.093 - 0)/SEM = .093/.1446 = .643
t is less than .685, which is the right-tail t* for probability .25 (d.f. = 24),
so the one-tail probability is greater than .25.
Because the test is 2-sided, double the tail: .50. P-value is greater
than .50.
No evidence for a difference.
Excel t-procedures: for t = .643, d.f. = 24, two-sided P = .526
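The "P-value is between ___ and ___" step can be sketched as a table lookup. The list below holds only a few columns of the 24 d.f. row of a standard t table, and the helper name `bracket_one_tail` is made up for this illustration.

```python
from math import sqrt

# Part of the 24 d.f. row of a standard t table, as
# (one-tail probability, critical value t*). Only a few columns are listed.
ROW_DF24 = [(0.25, 0.685), (0.10, 1.318), (0.05, 1.711),
            (0.025, 2.064), (0.01, 2.492), (0.005, 2.797)]

def bracket_one_tail(t, row):
    """Return (low, high) bounds on the one-tail probability beyond |t|."""
    t = abs(t)
    low, high = 0.0, 0.5          # the tail beyond 0 is .5, so P is at most .5
    for prob, t_star in row:      # row runs from small t* to large t*
        if t >= t_star:
            high = prob           # t is at least this far out
        else:
            low = prob            # t is not this far out; stop here
            break
    return low, high

# Wax paper bag differences, from the summary statistics in the notes.
xbar, s, n = 0.093, 0.723, 25
t = (xbar - 0) / (s / sqrt(n))

low, high = bracket_one_tail(t, ROW_DF24)
print(f"t = {t:.3f}, d.f. = {n - 1}")
print(f"one-tail P is between {low} and {high}")
print(f"two-sided P is between {2 * low} and {2 * high}")
```

With a full table row the brackets get tighter for larger t values; for this t the answer is the same, since it falls below the first column.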
- - - - - - - - - - - - - - - - - - - - -
ROBUST procedures: a confidence
interval or significance test is called robust if the confidence
level or P-value doesn't change very much when the assumptions of the
procedure are violated. pp. 447-450.
Assumption: Population is Normal.
t-procedures are quite robust against nonnormality,
but sensitive to outliers and bad skewness. Look at the data. Need SRS!!
Details:
n < 15: t ok if data roughly symmetric, single peak, no outliers. Don't
use if skewed or outliers. (How out is an outlier?)
n > 15: t ok unless there is strong skewness, or outliers.
n > 40 or so: t ok even if there is skewness.
(Outliers? I suggest trying with and without them, see what changes.)
Matched-pairs data (differences) are often more normal in
shape than the separate variables: "oddness" is often the same for both
items in a pair, and disappears in subtraction. Another reason
why this is a nice experimental design.
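The sample-size guidelines above can be written as a small decision helper. This is a sketch only: the function name and the "none"/"some"/"strong" skewness labels are made up here, the cutoffs 15 and 40 are treated as inclusive (an assumption, since the notes say "n > 15" and "n > 40 or so"), and judging skewness and outliers still means looking at a plot of the data. The SRS condition is needed in every case.

```python
def t_procedure_advice(n, skewness="none", outliers=False):
    """Rule-of-thumb advice on using t procedures for a sample of size n.

    skewness is "none", "some", or "strong", judged by eye from a
    stemplot or dotplot. The cutoffs at 15 and 40 are assumptions.
    """
    if skewness not in ("none", "some", "strong"):
        raise ValueError("skewness must be 'none', 'some', or 'strong'")
    if n < 15:
        # Small samples: need roughly symmetric data with no outliers.
        if outliers or skewness != "none":
            return "avoid t"
        return "t ok"
    if n < 40:
        # Medium samples: t is fine unless strong skewness or outliers.
        if outliers or skewness == "strong":
            return "avoid t"
        return "t ok"
    # Large samples: skewness is tolerated; outliers still need a look.
    if outliers:
        return "try with and without outliers"
    return "t ok"

print(t_procedure_advice(10))                       # small, clean sample
print(t_procedure_advice(10, skewness="some"))      # small, skewed sample
print(t_procedure_advice(25, skewness="some"))      # medium, mild skew
print(t_procedure_advice(50, skewness="strong"))    # large, skewed sample
```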
| Sievers home | Math151-Sp08/Days40.htm | 3:30pm | 5/5/08 |