MATH 251, Probability and Statistics I, Fall 2007, Wed. Oct. 24 Day 26.after class.

Read Sec. 5.1.  Review pp. 334-41; now read the binomial formula, pp. 348-50.  Memorize the binomial formula and the mean and s.d. for a binomial distribution.  Then read pp. 341-347 (proportions and normal approx.).  Skip "continuity correction," pp. 347-48 (or read optionally).  Next: 5.2, Sampling distribution of the mean of an SRS
QUIZ FRIDAY(YES):  Computing mean and s.d. for a discrete R.V., using algebra of means and variances (pp.298, 300)
Hand in: 

p. 351ff 
5.9b random numbers
5.5 proofreading, mean and s.d.  Also,
   graph f(p) = p(1-p) for 0< p <1 to reinforce your understanding of part c. 
5.12 random walk stocks (For part b, calculate the values using the formula!)
5.22 trip marketing
5.24 binomial coefficient facts.  (Not sure they're true?  Check with your computations in #5.12 for plausibility.  Remember that in any binomial situation you can reverse the roles of Success and Failure.  Note that b and c together give nC1 = n.)

Postpone
Normal approximation

5.17 alcohol study (this reasoning is that of significance tests, coming up)
5.23 proportion of sample
5.25 variability on exam

Read, discuss

4.81, 4.82 insurance
Optional 
Postpone
Q) Calculate the "finite population correction,"
sqrt [(N-n)/(N-1)]
-- for n = 25, and N (population) = 250 (10n, some texts' rule of thumb). And for N = 500 (20n, IPS's rule of thumb). 
  Diaconis article.  How does he feel when he sits down to work?
QUIZ FRIDAY? YES! You need to be able to find the mean, variance, and standard deviation of a simple discrete distribution, given in a table.  (So yes, you have to memorize the formulas or the processes for doing that.)
Example from the Homework in the text: p. 306, 4.60 (variances also). (4.64 a has answers for the variances)

You need to memorize and be able to use the rules for the "Algebra of Means and Variances," except that I will not ask you to remember the rule for the variance of a sum of two NON-independent variables.  But you will have to know that the simpler rule for the variance of a sum depends on their being independent.  What you need to memorize/understand for the quiz is everything on the front page of the handout from the last class.  Examples from the text HW: 4.64 a, 4.69, others.
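For checking hand work on these rules, here is a minimal Python sketch. It assumes the variables are independent (the simpler variance rule), and the function name and the numbers in the example are made up for illustration, not taken from the text or the handout.

```python
import math

def linear_combo_mean_sd(const, terms):
    """Mean and s.d. of const + sum(coef * X_i) for INDEPENDENT X_i.

    terms is a list of (coef, mean, sd) triples, one per variable.
    """
    mean = const + sum(c * m for c, m, _ in terms)
    # Variances add only because the variables are independent;
    # each coefficient gets squared, and the constant adds no variance.
    variance = sum((c * s) ** 2 for c, _, s in terms)
    return mean, math.sqrt(variance)

# Illustrative numbers (not from the text): V = 10 + 3X - 2Y,
# with mean_X = 1, sd_X = 2, mean_Y = 5, sd_Y = 1.5.
mean_v, sd_v = linear_combo_mean_sd(10, [(3, 1, 2), (-2, 5, 1.5)])
print(mean_v, sd_v)
```

Notice that the minus sign on Y changes the mean but not the variance: (-2)^2 = 4 either way.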

Sec. 4.4  Algebra of Means and Variances of R.V.'s X, Y: questions?  Day 23
Pair up.    Work these problems.  If you understand, explain; if you don't, ask.
A) X, Y independent.   µX = 3,  µY = 4, sigmaX =1, sigmaY =2   W = 2 - 4X + Y
::Find the mean and standard deviation of W.
B)  X1, X2, X3 are independent; each has mean 5 and standard deviation 2.
::Find the mean and standard deviation of  their sum,  X1 + X2 + X3.  
::Find the mean and standard deviation of  their average, Xbar = (X1 + X2 + X3)/3
C)  Find the mean and variance and standard deviation for X:
     x      2    3    4    6
   p(x)   .1   .2   .5   .2

Solutions
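One way to check a hand computation for a table like the one in problem C is a short Python sketch such as the following. The function name is hypothetical, and it uses the definition of variance directly (the probability-weighted average of squared deviations from the mean); on the quiz you still need to do this by hand.

```python
import math

def discrete_summary(values, probs):
    """Mean, variance, and s.d. of a discrete random variable given as a table."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must add up to 1")
    mean = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, variance, math.sqrt(variance)

# The table from problem C
mean, variance, sd = discrete_summary([2, 3, 4, 6], [0.1, 0.2, 0.5, 0.2])
print(round(mean, 4), round(variance, 4), round(sd, 4))
```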

Handout from last time
  Homework questions? Day 24, Day 25
 5.1 and 5.2
 p. 351. Addition:  Investment portfolio (or other "mixings" like 4.76)  Applet  Two-asset portfolios aX+(1-a)Y.  What the applet doesn't do is tell you which proportion (which "a") gives a particular result, so you can't tell how to minimize your risk (s.d.).
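To get at what the applet leaves out, here is a rough Python sketch that searches for the proportion a giving the smallest s.d. It assumes the general variance rule for two variables with correlation rho (the non-independent rule, which will not be on the quiz). The function names and the asset numbers are hypothetical, and the search is a simple grid over a, not anything the applet does.

```python
import math

def portfolio_sd(a, sd_x, sd_y, rho=0.0):
    """S.d. of aX + (1-a)Y, where rho is the correlation between X and Y."""
    b = 1 - a
    variance = (a * sd_x) ** 2 + (b * sd_y) ** 2 + 2 * a * b * rho * sd_x * sd_y
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative round-off

def min_risk_proportion(sd_x, sd_y, rho=0.0, steps=1000):
    """Grid search over a = 0, 1/steps, ..., 1 for the smallest s.d."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda a: portfolio_sd(a, sd_x, sd_y, rho))

# Hypothetical assets: sd_X = 4, sd_Y = 6, correlation 0.3 (made-up numbers)
a_best = min_risk_proportion(4, 6, 0.3)
print(a_best, portfolio_sd(a_best, 4, 6, 0.3))
```

With rho = 0 the middle term drops out and this is just the independent-variables rule from the handout.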

Sec. 5.1  X Binomial   B(n, p) , continued 
  Mean and s.d. : Binomial X = S1+S2+...+Sn where each of the Si's is a Bernoulli trial, 1 if Success, 0 if Failure
   s     0     1
   prob  1-p   p    Mean = p (p. 340)   Variance = p(1-p) (Homework)
Then  µX = np,   sigmaX^2 = np(1-p) by algebra of means and variances.
Die:  "Success" is getting a 3.  p = 1/6.  Roll 36 dice.
   Mean number of 3's = 36 · 1/6 = 6.  Variance = 36 · 1/6 · 5/6 = 5.  S.d. = sqrt(5) = 2.24
(do this Friday:) Notice standard deviation grows like square root of n, not as fast as the n possibilities. Applet (normal approx to binom)--try with n= 10, n = 100
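The dice numbers, and the square-root growth of the s.d., can be checked with a few lines of Python. This is only a sketch of the formulas µ = np and sigma = sqrt(np(1-p)); the function name is made up.

```python
import math

def binomial_mean_sd(n, p):
    """Mean np and s.d. sqrt(np(1-p)) of a B(n, p) count."""
    return n * p, math.sqrt(n * p * (1 - p))

# The dice example: n = 36 rolls, Success = "rolling a 3", p = 1/6
print(binomial_mean_sd(36, 1 / 6))

# Multiplying n by 100 multiplies the mean by 100 but the s.d. only by 10
for n in (100, 10_000):
    mean, sd = binomial_mean_sd(n, 0.5)
    print(n, mean, sd)
```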

Formula:  P(X = k)  p. 349
   For any branch with k successes, there are exactly (n-k) failures
    So the probability of that single branch is p^k (1-p)^(n-k).
  But (except for k = 0 and k = n) there are several branches with exactly k successes.  How many?   Each is a list of  k S's and (n-k) F's.  How many different lists are there?  Choose which k places in the list of n places get the S's; the others get the F's.
Call the number of different lists "n choose k", written nCk or (n choose k), the "binomial coefficient", the "number of combinations of n things taken k at a time".
We see from the tree that 3C0 = 3C3 = 1, 3C1 =3C2 = 3.  The case of flipping a coin 4 times (fig. p.280) shows that 4C0 = 4C4 = 1,  4C1 = 4C3 = 4, 4C2 = 6.
Each of the branches with k successes has the same probability, so we can multiply that probability by how many there are.
So P(X = k) = nCk · p^k (1-p)^(n-k)
How to calculate nCk without actually writing down all the possibilities? 
The formula is nCk = n!/[k!(n-k)!].  Since n! = n(n-1)(n-2)...3·2·1, this "simplifies" (when you write it down and do a lot of canceling) to n(n-1)(n-2)...(n-k+1)/k! (k factors on top and k on the bottom).  (The study of "counting," or combinatorics, of which these are a part, is very rich.  More in Math 300.)
   Note: 0! = 1, which makes nC0 and nCn work out to = 1, as they should.
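Here is a small Python sketch of the canceled form of nCk and of the binomial formula, as a way to check the counts from the trees. The function names are made up; math.comb is Python's built-in version (Python 3.8 or later) and is used only as a cross-check.

```python
import math

def choose(n, k):
    """nCk via the canceled form n(n-1)...(n-k+1) / k!."""
    top = 1
    for i in range(k):               # k factors on top: n, n-1, ..., n-k+1
        top *= n - i
    return top // math.factorial(k)  # the division always comes out even

def binomial_pmf(n, k, p):
    """P(X = k) for X distributed B(n, p)."""
    return choose(n, k) * p ** k * (1 - p) ** (n - k)

# Counts that can be read off the trees for n = 3 and n = 4
print([choose(3, k) for k in range(4)])
print([choose(4, k) for k in range(5)])

# Cross-check against the standard library
print(all(choose(10, k) == math.comb(10, k) for k in range(11)))

# The probabilities for k = 0, ..., n should add up to 1
print(sum(binomial_pmf(4, k, 0.5) for k in range(5)))
```

When k = 0 the loop puts no factors on top, so the function returns 1/0! = 1, matching the note about 0!.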

Start here Friday. Look back at mean and s.d. of binomial.
Often we're interested in the sample proportion -- p-hat =X/n  , where X is B(n, p)  p. 341-3
  Mean of p-hat: = mean of (X/n) = (np)/n = p
     Sample proportion is an "unbiased estimator" of p
   S.d. = sqrt[p(1-p)/n]  (Homework)
  Applies to results of an SRS from a population, approximately, if (rule of thumb) pop. size N > 20·n.  (In fact, if the sample "uses up" a good chunk of the population, the s.d. only gets smaller than this number, so some authors use 10·n.  The mean still works, no matter how big the sample.)  The exact s.d. is the above, times sqrt[(N-n)/(N-1)], the "finite population correction."  You can see that this is always < 1 (for n > 1), and if N is very large compared to n, it's very close to 1.
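To see how little the correction matters once N is large, here is a short Python sketch. It assumes the s.d. of p-hat is sqrt(p(1-p)/n), the same formula used in the coin example in these notes; the function names and the numbers p = 0.3, n = 100 are made up for illustration.

```python
import math

def fpc(N, n):
    """Finite population correction sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

def sd_p_hat(p, n, N=None):
    """S.d. of the sample proportion; multiplied by the FPC when N is given."""
    sd = math.sqrt(p * (1 - p) / n)
    return sd if N is None else sd * fpc(N, n)

# Illustrative numbers (not from the text): p = 0.3, n = 100,
# with N = 10n, N = 20n, and a very large population
for N in (1_000, 2_000, 1_000_000):
    print(N, round(fpc(N, 100), 4), round(sd_p_hat(0.3, 100, N), 4))
```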

Normal approximation:  Both X and p-hat are approximately Normal, for large n, using their means and s.d.
  Important for theory.  And for quick and dirty calculations.
  Rule of thumb:  Normal approx. is barely OK if both np and n(1-p) > 10; better the farther from 10 they are.

e.g. Flip a fair coin  100 times  B(100, .5).  X =  # of heads, p-hat = X/n = proportion of heads.  np=n(1-p) = 50 >>10.
   µx = 50, sigmax = sqrt(100·.5·.5) =  sqrt(25) = 5   X is approximately N(50, 5)
 Probability that # of heads X is between 40 and 60 =  P(40 < X < 60)
            = P[(40-50)/5 < Z < (60-50)/5]  = P(-2 < Z < 2) = 95%, approximately.

  µp-hat = .5, sigmap-hat = sqrt(.5·.5/100) =  sqrt(.0025) = .05   p-hat is approximately N(.5,.05) 
Probability that the Proportion of heads p-hat is between 40% and  60% =  P(.4 < p-hat < .6)
            = P[(.4-.5)/.05 < Z < (.6-.5)/.05]  = P(-2 < Z < 2) = 95%, approximately.

Approximating a discrete distribution by a continuous one: the ">" vs. "≥" issue gets lost.  ("Continuity correction," pp. 347-8, helps with this; but we don't usually bother to do it with p-hats, so we'll skip it.  If you must be more accurate, use SPSS or Excel or some other computational aid.)
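As one such computational aid, here is a Python sketch (instead of SPSS or Excel) that compares the Normal approximation for the coin example with the exact binomial sums. The helper names are made up; the Normal probabilities come from math.erf, and math.comb needs Python 3.8 or later.

```python
import math

def binomial_pmf(n, k, p):
    """P(X = k) for X distributed B(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 100, 0.5
mu, sigma = n * p, math.sqrt(n * p * (1 - p))

# Normal approximation: the same number for either reading of "between"
approx = normal_cdf((60 - mu) / sigma) - normal_cdf((40 - mu) / sigma)

# Exact binomial sums: the two readings give different answers
strict = sum(binomial_pmf(n, k, p) for k in range(41, 60))     # 40 < X < 60
inclusive = sum(binomial_pmf(n, k, p) for k in range(40, 61))  # 40 <= X <= 60

print(round(approx, 4), round(strict, 4), round(inclusive, 4))
```

The approximation lands between the two exact answers, which is why it cannot tell ">" from "≥".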

Next: 5.2, Sampling dist. of the mean of an SRS


This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.