Math 151 , Fall 2005, Day 30 Friday, Nov. 4 Hit reload After class

Day 30 (Friday, Nov. 4): Reading: Ch. 19, Confidence Intervals for Proportions.  Quite well written, packed with stuff.  ActivStats Ch. 19 does a good job with confidence intervals for proportions.  Next, Ch. 20.
Hand in (All D&V)
For Monday:  A) Repair Day 29 HW, to hand in Monday.
B) If you haven't finished it: Take your p-hat for the 30 slips from the shoebox:
      add and subtract your estimate of the SD, SE(p-hat) = square root of (p-hat·q-hat/n), getting (two numbers)
      Multiply 1.96 times SE(p-hat), then add the result to p-hat and subtract it from p-hat, getting (two numbers).  The thing you're adding and subtracting is the margin of error, ME, for a confidence interval for p.   You've found: # of 1's, p-hat, SE(p-hat), p-hat ± SE, ME for 95% = 1.96·SE, p-hat ± 1.96·SE.
      Bring these to class to add to the pooled class results.

Postpone all HW below:
Ch. 19, p. 386
5, 6 Conclusions. Do these with the "Don't misstate..." section, pp. 361-2.
9 Cars
A.   Use the Normal table to find z* for a 99% CI (This is the z* for which 99% of the area is between -z* and +z*).  Find z* for a 90% CI. (Text p.358 shows that z* = 1.645 to 3 decimal places.  Normal table only gives 2 places.)  Check your results by comparing with the corresponding results in Table T p. A-53.
13 Teenage drivers 
11 Ghosts 
21 Rickets (98% CI)
3, 4 Conditions
16 Local news--new prison?

ME, C, n pp. 356-7, 361-3.  Problems p. 368
7, 8  Relationships, ME, C, n
23 Deer ticks CI + find n.
25 Graduation--find n.  The answers in the back use the 25% as the p to plug in.  Redo part a (only) using 50% as the p (what you would do if you had no idea what p would be).  How many subjects do you "save" by using the 25%?
26 Hiring--find n
28 Hiring again--find n
29  Pilot study--find n Use the pilot study to guesstimate p.  Now: if you want to be  conservative, you'll use a p closer to .5 than this.  What sample size would you recommend if you (conservatively) use p = .25?  How many more cars will it "cost" to be conservative?

Review sampling distribution of p-hat and y-bar. Day 29
Homework questions? Day 29
  Take your p-hat for the 30 slips from the shoebox:
      add and subtract your estimate of the SD, getting (two numbers)
      Multiply 1.96 times SE(p-hat), then add the result to p-hat and subtract it from p-hat, getting (two numbers)
    Add your results to the list circulating.
      Also draw your interval on the graph circulating:    |------o------|

Next job: (Ch. 19) We usually DON'T KNOW the population parameter; use the statistic from our sample to ESTIMATE it.
YOU don't know the real proportion of 1's in the green shoebox.  Each of you has an estimate.  In "real life" you won't have a bunch of classmates with other samples; you'll only have your own. (Also, in this case, I know the real proportion. Not so in "real life")  How "good" is your estimate of the real p?

You know how the sampling distributions of sample proportions (and sample means) behave; we'll use that.  But we want to know how much they are spread, and for that we need the parameter p (and q) for proportions, (and the parameter sigma for means)
And we don't know those!  So we use the sample statistics p-hat and s in place of them.

Standard Error (p. 347):  When we estimate the standard deviation of a sampling distribution of a statistic, using the data from our sample, we call that the Standard Error  of the statistic.
Start here Monday.
Confidence Interval Estimate of p: (Chapter 19)
p-hat is your best guess at p, but it's bound to be wrong, almost always.  (see p. 356)
Make an interval estimate of p, by adding and subtracting a Margin of Error (ME)
   For instance, 39% ± 2%.
Say "This interval contains (captures) the true proportion p."  Wrong.  It may or may not, and you have no way of knowing.

What we can do  is use a rule to construct the ME so that intervals made using the rule will contain p a known proportion of the time.  The "known proportion" is our confidence level.  If our rule makes ME's that capture p 95% of the time, we've made 95% confidence intervals.  "I have 95% confidence that this interval captures the true proportion p"

A level C confidence interval for a parameter is an interval, usually of the form estimate ± margin of error,
  found from data, in such a way that
C% of all random samples will yield intervals that capture the true parameter value.

Rule for ME:   ME = z* SE(p-hat), where z* is the "critical value" from the Standard Normal table that has C% of the area in the symmetric central interval between -z* and +z*.
Level C confidence interval for population proportion p:  "One-proportion z-interval":  p-hat ± z*·SE(p-hat), where SE(p-hat) = square root of (p-hat·q-hat/n).

(Why it works:  later.)
Example:  You drew a sample of size n = 30. p is the (unknown) proportion of 1's in the shoebox. You found the sample proportion, and you calculated the SE for the sample proportion. Use z* = 1.  Then C is about 68%.
Calculate:  if I got 12/30, p-hat = .400.  "q-hat" = 1 - p-hat = 1 - .4 = .6.   SE formula (the SD formula with p-hat and q-hat plugged in for p and q): SE(p-hat) = square root of (p-hat·q-hat/n) = square root of (.4·.6/30) = square root of .008 = .089.
68% Confidence Interval:  .400 ± .089, or  (.311, .489).
Whose intervals captured the real proportion?  (Expect roughly 68% of you to do so.)
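
The arithmetic in this example can be checked with a short Python sketch.  The function name one_prop_z_interval is made up for illustration; it simply applies p-hat ± z*·SE(p-hat), with SE(p-hat) = square root of (p-hat·q-hat/n), to the 12-out-of-30 sample used above.

```python
from math import sqrt

def one_prop_z_interval(successes, n, z_star):
    """Return (p_hat, se, low, high) for a one-proportion z-interval."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    se = sqrt(p_hat * q_hat / n)   # standard error of p-hat
    me = z_star * se               # margin of error
    return p_hat, se, p_hat - me, p_hat + me

# 12 ones in a sample of 30 slips, z* = 1 (about 68% confidence)
p_hat, se, low, high = one_prop_z_interval(12, 30, z_star=1)
print(f"p-hat = {p_hat:.3f}, SE = {se:.3f}, interval = ({low:.3f}, {high:.3f})")
# p-hat = 0.400, SE = 0.089, interval = (0.311, 0.489)
```

Plug in your own count of 1's in place of 12 to check your own interval.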

Usually, want higher Confidence Level:  90%, 95%, 99%....
     For 95%:  z* = 1.96 (approximately 2)
           (How?  95% in the middle.  2.5% in each tail.  .0250 is the area to the left of what?? -1.96.)
           Shortcut: Table T, p. A-53, bottom two rows.  ("infinity" row is the Standard Normal values)
      ME = z*·SE(p-hat) = 1.96·.089 = .174.  95% Confidence Interval:  .400 ± .174, or  (.226, .574).
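
If you would rather not read the table backwards, software can find z* directly.  Here is a minimal sketch, assuming Python 3.8 or later (for statistics.NormalDist); the helper name z_star is made up for illustration.  It uses the same reasoning as above: put half of the leftover area in each tail, then ask what z-value has that much area to its right.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value with `confidence` of the standard Normal area
    between -z* and +z* (confidence given as a decimal, e.g. 0.95)."""
    tail = (1 - confidence) / 2            # area in each tail
    return NormalDist().inv_cdf(1 - tail)  # z with that much area to its right

print(round(z_star(0.95), 3))   # 1.96
print(round(z_star(0.90), 3))   # 1.645

# Margin of error for the 12/30 example, using the rounded SE from the notes
me = z_star(0.95) * 0.089
print(f"{me:.3f}")              # 0.174
```

Carrying the unrounded SE (.0894...) through the calculation gives an ME closer to .175; the small difference from .174 is only rounding.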

Note Trade-off:  Higher Confidence ---Wider interval (bigger ME. Less "precision")

Assumptions/conditions:  Assumes Central Limit Theorem for proportions is appropriate.
  Independence:  Data values shouldn't affect each other.   Randomization helps!   n < 10% of population.
  Sample Size:  Expect at least 10 successes and 10 failures (rephrase of np ≥ 10 and nq ≥ 10)

  Bias?  Here's why we studied bias in sampling.  Biases or other bad sampling methods can make our computations worthless! p. 363.
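
The two numerical conditions above (sample size and the 10% condition) can be checked mechanically; independence and bias cannot, and still need judgment about how the data were collected.  A minimal sketch, with a made-up function name and an optional population_size argument (leave it out when the population size isn't known, as with the shoebox):

```python
def check_conditions(successes, n, population_size=None):
    """Check the success/failure condition and, if the population size
    is supplied, the 10% condition.  Returns a dict of True/False."""
    failures = n - successes
    results = {
        "at least 10 successes": successes >= 10,
        "at least 10 failures": failures >= 10,
    }
    if population_size is not None:
        results["n < 10% of population"] = n < 0.10 * population_size
    return results

# The 12-out-of-30 sample from the example: 12 successes, 18 failures
print(check_conditions(12, 30))
```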
Reprise:  Level C confidence interval for population proportion p: 
 "One-proportion z-interval":  p-hat ± z*·SE(p-hat), where SE(p-hat) = square root of (p-hat·q-hat/n).   Chapter 19

Note Trade-off:  Higher Confidence ---Wider interval (bigger ME. Less "precision")
Desire:  Small Margin of Error ME + High confidence C.  p. 361-2
But they grow and shrink together: High confidence--Low precision ; High precision (small ME)--low confidence.

Way out:  increase n, the sample size.  (Shrinks SE)  How big a sample size for desired ME and C?
   Plan ahead:  Decide on desired ME and C (thus z*).  Guesstimate p (p = 1/2 requires the largest sample size--safest).
Solve the equation ME = z*·square root of (p·q/n) for n:  n = (z*)²·p·q / ME².   (Some results pre-calculated, p. 362)
Notes:  --To cut ME in half, need 4 times the sample size.  Certainty/precision are expensive!
    -- If you're sure your p will be far from 1/2,  you can get a smaller n by using a closer guesstimate for p.

Green shoebox:  To get a 90% CI, ME = .04:  use p = 1/2 = .5.    z* = 1.645.
          n = (1.645²)(.5 · .5) / (.04²)  =  2.706025 · .25/.0016 = .67650625/.0016 = 422.8.  Round UP! to 423.
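
The same calculation as a short Python sketch.  The function name sample_size is made up for illustration; it solves ME = z*·square root of (p·q/n) for n and always rounds up.  It defaults to the safe guess p = .5.

```python
from math import ceil

def sample_size(z_star, me, p=0.5):
    """Smallest n whose margin of error is at most `me`,
    using the guesstimate `p` (0.5 is the conservative choice)."""
    q = 1 - p
    return ceil(z_star**2 * p * q / me**2)   # always round UP

print(sample_size(1.645, 0.04))   # 423   (green shoebox, 90% CI, ME = .04)
print(sample_size(1.645, 0.02))   # 1692  (half the ME: about 4 times the n)
```

The second line illustrates the note above: cutting ME in half roughly quadruples the required sample size.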

Why does it work??  Why does the ME calculated this way give intervals that capture the real p C% of the time??
   Think about the Sampling distribution of p-hat.  It's (approximately) Normal, centered at the real (population) p. SD(p-hat) is its standard deviation. SE(p-hat) approximates SD(p-hat).
Now ME = z*·SE(p-hat),  where ±z* cut off the center C% of the standard normal model.
So, in the Sampling distribution model, real p ± ME spans the center C% of this normal curve.
So the probability that p-hat falls in the range real p ± ME is C%; that is, with many random samples, the proportion of p-hats that fall in the range real p ± ME is C%.
That is, the proportion of p-hats that are within the distance ME of p---is C%

Now:  If p-hat is within ME of p, then p is within ME of p-hat.  The "arms" (± ME) that a p-hat interval sticks out from p-hat will capture p, if and only if p-hat is within ME of p.  But the proportion of p-hats that do that is C%.
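
You can watch this argument play out with a simulation: draw many samples from a population whose p you know, build an interval from each, and count how many capture p.  A minimal sketch follows.  The function name coverage is made up, and true_p = 0.4 is an assumed value chosen only for illustration; it is NOT the real proportion in the green shoebox.

```python
import random
from math import sqrt

def coverage(true_p, n, z_star, trials=10_000, seed=1):
    """Fraction of simulated intervals p-hat +/- z*.SE(p-hat) that capture true_p."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # draw n slips; each is a 1 with probability true_p
        ones = sum(rng.random() < true_p for _ in range(n))
        p_hat = ones / n
        me = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if abs(p_hat - true_p) <= me:  # p-hat within ME of p, so the interval captures p
            hits += 1
    return hits / trials

print(coverage(true_p=0.4, n=30, z_star=1.96))
```

With n = 30 the simulated capture rate tends to land near, and a little under, 95%: the Normal model and the use of SE in place of SD are only approximations for a sample this small.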



 
Sievers home  Math151-Fall05/Dayf30.htm  4pm 11/04/05
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.