Hand in:
p. 351ff
5.9b random numbers
5.5 proofreading, mean and s.d. Also, graph p(1-p) for 0 < p < 1 to reinforce your understanding of part c.
5.12 random walk stocks (For part b, calculate the values using the formula!)
5.22 trip marketing
5.24 binomial coefficient facts. (Not sure they're true? Check with your computations in #5.12 for plausibility.) Remember in any binomial situation you can reverse the roles of Success and Failure. Note b and c together give nC1 = n.
Postpone the rest:
Read, discuss | Optional
5.26, 27 Waiting time to first success = "geometric" distribution. Like the "chain" email problem. Postpone Q)
Sec. 5.1 X Binomial B(n, p), continued
Mean and s.d.: Binomial X = S1 + S2 + ... + Sn, where each of the Si's is a Bernoulli trial: 1 if Success, 0 if Failure.
s       0      1
prob    1-p    p
Mean = p (p. 340). Variance = p(1-p) (Homework).
Then µ_X = np, σ_X^2 = np(1-p) by algebra of means and variances.
Die: "Success" is getting a 3. p = 1/6. Roll 36 dice.
Mean number of 3's = 36·1/6 = 6. Variance = 36·1/6·5/6 = 5. S.d. = sqrt(5) = 2.24
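The dice numbers can be checked by doing the "algebra of means and variances" one step at a time. This is only a sketch in Python (not part of the course software); the variable names are made up for illustration, and exact fractions are used so no rounding creeps in.

```python
from fractions import Fraction
import math

p = Fraction(1, 6)   # chance a single die shows a 3
n = 36               # number of dice rolled

# One Bernoulli trial S takes the value 0 with prob 1-p and 1 with prob p.
values_and_probs = [(0, 1 - p), (1, p)]
trial_mean = sum(s * prob for s, prob in values_and_probs)
trial_var = sum((s - trial_mean) ** 2 * prob for s, prob in values_and_probs)

# Means always add; variances add because the rolls are independent.
mean_x = n * trial_mean
var_x = n * trial_var
sd_x = math.sqrt(var_x)

print(trial_mean, trial_var)           # 1/6 5/36
print(mean_x, var_x, round(sd_x, 2))   # 6 5 2.24
```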
Notice standard deviation grows like square root of n, not as fast as the n possibilities. Applet (normal approx to binom)--try with n = 10, n = 100
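If the applet isn't handy, a few lines of Python show the same square-root growth. This is a rough sketch; the function name `binomial_sd` is hypothetical, and p = 0.5 is just a convenient choice.

```python
import math

def binomial_sd(n, p):
    """Standard deviation of a B(n, p) count: sqrt(n p (1-p))."""
    return math.sqrt(n * p * (1 - p))

p = 0.5
for n in (10, 100, 1000, 10000):
    sd = binomial_sd(n, p)
    # The s.d. itself grows, but as a fraction of n it shrinks.
    print(f"n = {n:6d}   s.d. = {sd:8.3f}   s.d./n = {sd / n:.4f}")
```

Multiplying n by 100 multiplies the s.d. by only 10, so the distribution gets relatively narrower as n grows.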
Formula: P(X = k) p. 349
For any branch with k successes, there are exactly (n-k) failures.
So the probability of that single branch is p^k (1-p)^(n-k).
But (except for k = 0 and k = n) there are several branches
with exactly k successes. How many? Each is a
list of k S's and (n-k) F's. How many different lists are there?
Choose which k places in the list of n places get the S's; the others
get the F's.
Call the number of different lists "n choose k", written nCk or as n stacked over k in parentheses: the "binomial coefficient", the "number of combinations of n things taken k at a time".
We see from the tree that 3C0 = 3C3 = 1, 3C1 =3C2 = 3. The case
of flipping a coin 4 times (fig. p.280) shows that 4C0 = 4C4 = 1,
4C1 = 4C3 = 4, 4C2 = 6.
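The counts read off the trees can be reproduced by brute force: write down every list of S's and F's and count those with exactly k S's. A small Python sketch (the helper name `count_lists` is invented here):

```python
from itertools import product

def count_lists(n, k):
    """Count the length-n lists of S's and F's that contain exactly k S's."""
    return sum(1 for outcome in product("SF", repeat=n)
               if outcome.count("S") == k)

for n in (3, 4):
    print(n, [count_lists(n, k) for k in range(n + 1)])
# 3 [1, 3, 3, 1]
# 4 [1, 4, 6, 4, 1]
```

This only works for small n, since there are 2^n lists in all.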
Each of the branches with k successes has the same probability, so
we can multiply that probability by how many there are.
So P(X = k) = nCk · p^k (1-p)^(n-k).
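Here is the "number of branches times probability of one branch" idea as a short Python sketch. The function name `binomial_pmf` is made up; `math.comb` is the standard-library binomial coefficient (Python 3.8 or later). Three flips of a fair coin are used so the answers can be compared with the tree.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): (number of branches) x (prob. of one branch)."""
    branches = comb(n, k)                    # n choose k
    one_branch = p**k * (1 - p)**(n - k)     # k successes, n-k failures
    return branches * one_branch

n, p = 3, 0.5
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(probs)        # [0.125, 0.375, 0.375, 0.125]
print(sum(probs))   # 1.0
```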
How to calculate nCk without actually writing down all the possibilities? The formula is nCk = n!/(k!(n-k)!). Since n! = n(n-1)(n-2)...3·2·1, nCk "simplifies" (when you write it down and do a lot of canceling) to n(n-1)(n-2)...(n-k+1)/k! (k terms on top and k on the bottom). (The study of "counting" or combinatorics, of which these are a part, is very rich. More in Math 300.)
0! = 1, which makes nC0 and nCn work out to n!/(0!·n!) = 1, as they should.
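The canceled-down form, with k terms on top and k! on the bottom, is easy to compute directly. A sketch (the name `choose` is hypothetical), checked against Python's built-in `math.comb`:

```python
from math import comb, factorial

def choose(n, k):
    """n(n-1)...(n-k+1) / k!, with k factors on top and k on the bottom."""
    top = 1
    for i in range(k):          # multiplies n, n-1, ..., n-k+1
        top *= n - i
    return top // factorial(k)  # the division always comes out even

print([choose(4, k) for k in range(5)])   # [1, 4, 6, 4, 1]
print(choose(36, 6) == comb(36, 6))       # True
```

With k = 0 the loop never runs, so the top is an empty product equal to 1 and the bottom is 0! = 1.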
Start here Friday:
Often we're interested in the sample proportion -- p-hat = X/n, where X is B(n, p) p. 341-3
Mean of p-hat = mean of (X/n) = (np)/n = p. Sample proportion is an "unbiased estimator" of p.
S.d. = sqrt(p(1-p)/n) (Homework)
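This is a numerical check, not a substitute for the homework derivation. The sketch below computes the mean and s.d. of p-hat straight from the binomial probabilities and compares them with p and sqrt(p(1-p)/n), the formula used in the coin example later in these notes. The function name `phat_mean_sd` is invented for this note.

```python
from math import comb, sqrt

def phat_mean_sd(n, p):
    """Mean and s.d. of p-hat = X/n, computed directly from the B(n, p) probabilities."""
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum((k / n) * pmf[k] for k in range(n + 1))
    var = sum((k / n - mean) ** 2 * pmf[k] for k in range(n + 1))
    return mean, sqrt(var)

n, p = 100, 0.5
mean, sd = phat_mean_sd(n, p)
print(mean, sd)                  # should agree with the next line
print(p, sqrt(p * (1 - p) / n))
```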
Applies to results of SRS from population, approximately, if (rule of thumb) Pop. size N > 20·n. (In fact, if the sample "uses up" a good chunk of the population, the s.d. only gets smaller than this number, so some authors use 10·n. The mean still works, no matter how big the sample.) The exact s.d. is the above, times sqrt[(N-n)/(N-1)], the "Finite population correction." You can see that this is always < 1 (for n > 1), and if N is very large compared to n, it's very close to 1.
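To see how little the finite population correction matters once N passes the rule of thumb, here is a small Python sketch. The function name `fpc` and the sample size n = 100 are just illustrative choices.

```python
from math import sqrt

def fpc(N, n):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

n = 100
for N in (200, 1000, 2000, 100000):   # N = 2n, 10n, 20n, 1000n
    print(f"N = {N:6d}   correction = {fpc(N, n):.4f}")
```

At N = 20·n the factor is already within about 3% of 1, which is why the uncorrected s.d. is good enough there.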
Normal approximation: Both X and p-hat are approximately Normal, for large n, using their means and s.d. Important for theory. And for quick and dirty calculations.
Rule of thumb: Normal approx. is barely OK if both np and n(1-p) > 10; better the farther above 10 they are.
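The rule of thumb is simple enough to code as a one-line check. A sketch only: the function name `normal_approx_ok` is made up, and the cutoff of 10 is the one stated above (other books use other cutoffs).

```python
def normal_approx_ok(n, p, threshold=10):
    """Rule of thumb: both np and n(1-p) must exceed the threshold."""
    return n * p > threshold and n * (1 - p) > threshold

print(normal_approx_ok(100, 0.5))    # True:  np = n(1-p) = 50
print(normal_approx_ok(36, 1 / 6))   # False: np = 6
print(normal_approx_ok(100, 0.05))   # False: np = 5
```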
e.g. Flip a fair coin 100 times: B(100, .5). X = # of heads, p-hat = X/n = proportion of heads. np = n(1-p) = 50 >> 10.
µ_X = 50, σ_X = sqrt(100·.5·.5) = sqrt(25) = 5. X is approximately N(50, 5).
µ_p-hat = .5, σ_p-hat = sqrt(.5·.5/100) = sqrt(.0025) = .05. p-hat is approximately N(.5, .05).
Probability that # of heads X is between 40 and 60 = P(40 < X < 60) = P[(40-50)/5 < Z < (60-50)/5] = P(-2 < Z < 2) = 95%, approximately.
Probability that the proportion of heads p-hat is between 40% and 60% = P(.4 < p-hat < .6) = P[(.4-.5)/.05 < Z < (.6-.5)/.05] = P(-2 < Z < 2) = 95%, approximately.
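How good is "95%, approximately"? The Python sketch below compares the Normal answer with the exact binomial sums. The helper `normal_cdf` is built from the standard-library `math.erf`; all names are illustrative.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 100, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))   # 50 and 5

# Normal approximation: P(-2 < Z < 2)
normal_approx = normal_cdf((60 - mu) / sigma) - normal_cdf((40 - mu) / sigma)

# Exact binomial probabilities, with and without the endpoints
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
exact_strict = sum(pmf[41:60])      # P(40 < X < 60): k = 41, ..., 59
exact_inclusive = sum(pmf[40:61])   # P(40 <= X <= 60): k = 40, ..., 60

print(round(normal_approx, 4), round(exact_strict, 4), round(exact_inclusive, 4))
```

The Normal value lands between the two exact answers, because the continuous curve cannot tell whether the endpoints 40 and 60 are included.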
Approximating a discrete distribution by a continuous one--the ">" vs. "≥" issue gets lost. ("Continuity correction," pp. 347-8, helps with this; but we don't usually bother to do it with p-hats, so we'll skip it. If you must be more accurate, use SPSS.)
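For the curious, here is what the continuity correction does in the coin example. It is a sketch only, under the usual convention of widening each endpoint by half a unit, so that P(40 ≤ X ≤ 60) is approximated by the Normal area from 39.5 to 60.5. The names are made up for illustration.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 100, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))

# Exact P(40 <= X <= 60) from the binomial formula
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(40, 61))

# Normal approximation without and with the half-unit correction
plain = normal_cdf((60 - mu) / sigma) - normal_cdf((40 - mu) / sigma)
corrected = normal_cdf((60.5 - mu) / sigma) - normal_cdf((39.5 - mu) / sigma)

print(round(exact, 4), round(plain, 4), round(corrected, 4))
```

The corrected value should sit noticeably closer to the exact one than the plain value does.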
Next--5.2: Sampling distribution of the Mean of an SRS
| Sievers home | Math251-Fall05/Dayps26.htm | 10:50a.m. | 10/28/05 |