Math 151, Spring 2005, Day 23, Wednesday March 30.  Hit reload ... After class

Sample exam 2 available (linked here) & outside my door. (Actual exam would be 5 pages, probably) Solutions outside my door and on reserve (soon.)  7d covers material not required this term.  Everything else is included.

Exam 2 next class (Day 24, Apr.1).  Covers thru Day 22 HW (Not Block or Matched Pair designs). Sign in today if you need a special time to take the exam.
How much computational detail from part II?  You don't need to know the formula for the correlation coefficient, but you should be able to guess roughly the r from a scatterplot, and know and use the properties on pp. 121-2.
You will need to know, among other things, how to find b0 and b1 from the means, standard deviations, and r of the x- and y-values; to give the formula for the regression line (like 17, p. 154); and to graph the regression line on top of the scatterplot.  Also find by hand the value that the line predicts for a particular x.
You should be able to identify and calculate the residual value for a particular x-y point as its vertical distance from the line (negative if the point is below the line), and identify and understand potential influential points.
You should know that the regression line goes through the point given by the two means, and that the regression line "rises" r standard deviations in y for each standard deviation increase in x (pp. 137-8); also that the regression line of "weight" on "height" is not the same line as the regression line of "height" on "weight".  You should be able to describe verbally the meaning of R² in the context of a data set.
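As a study aid, here is a minimal sketch of the by-hand regression calculations listed above, using the standard relationships b1 = r·(s.d. of y)/(s.d. of x) and b0 = (mean of y) - b1·(mean of x). The function names and the summary numbers are made up for illustration; they are not from the text or from any class data set.

```python
def regression_line(x_mean, x_sd, y_mean, y_sd, r):
    """Return (b0, b1) for the regression line y-hat = b0 + b1*x."""
    b1 = r * y_sd / x_sd        # slope: line rises r s.d.'s of y per s.d. of x
    b0 = y_mean - b1 * x_mean   # intercept: line goes through (x_mean, y_mean)
    return b0, b1


def predict(b0, b1, x):
    """Value the line predicts for a particular x."""
    return b0 + b1 * x


def residual(b0, b1, x, y):
    """Observed y minus predicted y (negative if the point is below the line)."""
    return y - predict(b0, b1, x)


# Hypothetical summary statistics (assumed values, for illustration only)
b0, b1 = regression_line(x_mean=10.0, x_sd=2.0, y_mean=50.0, y_sd=8.0, r=0.5)
print(b0, b1)                          # 30.0 2.0
print(predict(b0, b1, 10.0))           # 50.0, the mean of y at the mean of x
print(residual(b0, b1, 12.0, 50.0))    # -4.0, point lies below the line
```

Swapping the roles of x and y in `regression_line` gives a different line, which is the "weight on height" versus "height on weight" point.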

Day 23 (Wed. March 30): Finish: D&V Ch 12, 13. Review part III p. 262.  AS13.
    Next, D&V Part IV: Ch. 14, Ch. 15 thru p. 291 (then Ch. 18 & on).  ActivStats is very good for part IV--Ch. 11 "Randomness" shows the Law of Large Numbers as D&V express it.  Ch. 14, 15 "Intuitive Probability" & "Probability Rules" correspond well with the text and present very good examples.
Hand in: Postpone all till after Monday's lecture
Chapter 13, p257ff.
 1,2,4,5,6,10,11,12 You did the "observational study" ones, and started the "experiment" ones. Finish these for those that are more complex experiments,  add 18
32 Shingles  part d
35 Safety switch

= = = = = = = = = = = = = = 
ActivStats, Ch.11 HW, ACT 1 and ACT 2
Chapter 14, p. 280ff
1 Roulette
Winter
Crash

9 Spinner
11 a Car repairs
13 a M&M's

Using independence:
11 b Car repairs
13 b M&M's
19 Champion bowler
15 Disjoint or indep?  Read p.290 top, with this.

Read, to discuss:

Review Part III: p. 263 ff: 1 thru 17 odds, +12, 18


= = = = =
Chapter 14: Read p.283 
#25, read answers in back. (a) should have 0.001, not 0.00 for the answer.

Optional

Homework questions? Day 21  Day 22
Questions for exam? Took whole class, good discussion.  Start here Monday
Chapter 13: Experiment: Continue Day 21   Brief summary:   All about avoiding BIAS
Principles of designing a comparative experiment (p. 243)

Results:  Measure differences in the response variable for different treatments
 "Statistically Significant" differences--too big to have plausibly occurred by chance

Block designs: (not "completely randomized")
(Randomized) Block design:  Sort experimental units into "Blocks" = groups homogeneous on potentially confounding variables:     Within each block, randomize the treatments. Compare results  within each block, then summarize all results.

Matched pairs is a special case of block design--each pair is a little "block":
Matched pairs: In experiment, to compare Control and experimental treatments (i.e. 2 levels)
   Sort experimental units into "matching" pairs.   One member of pair gets control, other gets experimental.
                Randomize which.  Compare within pair (find difference), then summarize all comparisons.
  Matched with self is common.  Eliminates extraneous variability.
     (Matching is also often used in observational studies)

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Part IV: Randomness and Probability (Why?)

We know that a sample from a population will not exactly represent the population.  If we take a random sample, the behavior of individual samples will not be predictable, but there will be a predictable pattern in many random samples from the same population.  Knowing the pattern is as good as we can do.  Need probability.
Recall (Day 19), p. 227:  a Sample is chosen from a Population.
                      | Sample            | Population               |
   Numerical summary  | Statistic (Latin) | Parameter (Greek letter) |

The actual value of the Statistic will vary, depending on the particular sample. "Sampling variability" = "Sampling error"
The Statistic "estimates" the Parameter.  We hope it is close to the parameter.  If we choose simple random samples, we can understand the pattern of values the statistic can take.
Some examples of  statistics:
    Height:   U.S. young women: pop. mean = 64.5", pop. s.d. = 2.5"  (Moore stats text p. 66.  Caveat: rounded?)
              Math 151, Spring '01,  xbar = 64.2,   s = 3.75
                        Fall '01,    xbar = 65.01,  s = 3.22
                        Spring '02,  xbar = 64.53,  s = 2.91
                        Fall '02,    xbar = 63.89,  s = 2.48
                        Spring '03,  xbar = 64.98,  s = 3.29
                        Spring '04,  xbar = 65.33,  s = 2.25
                        Fall '04,    xbar = 64.68,  s = 3.54
                        Spring '05,  xbar = 64.31,  s = 2.93
    Coin flip: Proportion of heads  p = 1/2 (?)       p-hat =  256/520 = .492  (combined data from many past classes)
    Thumbtack:  Proportion of point-up p =  (??)       p-hat =  441/691 = .6382  (one past class, Math 251)

Chance  behavior (a random phenomenon): Unpredictable in the short run,  predictable regular pattern in the long run.
(Prof. Persi Diaconis (a table magician) can flip a coin so precisely it always comes up the way he wants.  His coin flipping is not a random phenomenon.  Mine is.)
"Probability" of a particular something happening: the proportion of times it would happen in a very long series of independent repetitions (trials) of the phenomenon: "long-run relative frequency".
    (independence:  outcome of one trial must not influence the outcome of any other.)

Law of Large Numbers (LLN):  Relative frequency of repeated independent trials gets closer  to the "true" relative frequency as the number of trials increases.
  (But it may take a long time: Large Numbers of trials. Use  http://www.whfreeman.com/scc -- "Probability " 1 toss at a time--settles down slowly.   )
(&&Another version of  LLN says the mean from a sample of size n gets closer and closer to the true = "population" mean, as you take bigger samples (as n increases).  Activstats presents this, 14-1, and we'll return to this soon.)

Aberrations won't be compensated for; they will only be swamped out.  (Misconception of "law of averages.")
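To see the Law of Large Numbers "settle down slowly," here is a small simulation sketch. It assumes a fair coin (p = 0.5) and uses Python's pseudo-random generator in place of real flips; the function name and the seed are arbitrary choices for illustration.

```python
import random


def running_proportion_of_heads(n_flips, p_heads=0.5, seed=0):
    """Simulate independent flips; return the proportion of heads after each flip."""
    rng = random.Random(seed)   # fixed seed so the run can be repeated
    heads = 0
    proportions = []
    for i in range(1, n_flips + 1):
        if rng.random() < p_heads:   # this flip comes up heads
            heads += 1
        proportions.append(heads / i)
    return proportions


props = running_proportion_of_heads(10_000)
for n in (10, 100, 1_000, 10_000):
    print(n, round(props[n - 1], 3))
```

An early run of heads is never "made up for" by extra tails later; it is only swamped as the denominator grows.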

Probability Model:
A Random phenomenon,
    Sample space S:  set of all possible outcomes (no overlap of descriptions) (def. p. 284)
    Event:  any  set of outcomes(including one outcome, & even the set containing no outcomes)
    Probability model: S, and a way of assigning a probability to each event.
&&Sample space depends on what you want to know:
Phenomenon: Flip coin twice.
    S1 = {HH, HT, TH, TT}     S2 = {0, 1, 2} number of heads   S3 = {Y, N} both are heads?

Probability rules:  (pp. 274-6, in words, then in notation).
For A an event in sample space S, P(A) is "the probability that A occurs".
    These rules are all true for proportions in the long run (Probabilities), proportions of counts, proportions of areas.
    1.  0 ≤ P(A) ≤ 1
    2. P(S) = 1
    3. For any event A, P(A does not occur) = 1 - P(A)
    4.  A and B are  disjoint if they have no outcomes in common (can't happen simultaneously.)
        If A and B are disjoint, their probabilities add:  P(A or B) = P(A) + P(B)

Pick one person from U.S. Pop. (Age 25 +)
Sample space        | No HS degree | HS only | 1-3 yrs College | 4 + yrs College |
Proportion in pop.  | 18.3%        | 33.9%   | 24.8%           | 23.0%           |
Probability         | .183         | .339    | .248            | .230            |
P(No 4-year "degree") = ?
P( HS or less) = ?
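One way to check answers to the two questions above is to apply Rule 3 (complement) and Rule 4 (disjoint events add) directly to the education probabilities .183, .339, .248, .230. This is only a sketch; the dictionary and variable names are made up for illustration.

```python
# Probabilities for the education sample space (U.S. pop., age 25+)
education = {
    "No HS degree": 0.183,
    "HS only": 0.339,
    "1-3 yrs College": 0.248,
    "4+ yrs College": 0.230,
}

# Rule 2: the probabilities over the whole sample space add to 1
total = sum(education.values())

# Rule 3 (complement): P(no 4-year degree) = 1 - P(4+ yrs College)
p_no_4yr = 1 - education["4+ yrs College"]

# Rule 4 (disjoint outcomes add): P(HS or less) = P(No HS degree) + P(HS only)
p_hs_or_less = education["No HS degree"] + education["HS only"]

print(round(total, 3))         # 1.0
print(round(p_no_4yr, 3))      # 0.77
print(round(p_hs_or_less, 3))  # 0.522
```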

Finite sample spaces (you can list the outcomes):
Assign a probability to each outcome (≥ 0) so they add to 1.   (Sometimes equal values--"equally likely"--make sense.)
    Prob. of an event is sum of prob's of its outcomes.

Phenomenon: Flip coin twice.
    S1 = {HH, HT, TH, TT}     S2 = {0, 1, 2} number of heads   S3 = {Y, N} both are heads?
Sample space  | HH | HT | TH | TT |
       Prob's | .25| .25| .25| .25|  P(tail followed by head)=?
Sample space  | 2  |    1    | 0  |  P(at least 1 tail)=?   P(1 of each) = ?
       Prob's | .25|   .50   | .25|  P(at least 1 Head)= ?  P(2 Heads) = ?
Sample space  | Y  |       N      |
       Prob's | .25|     .75      |
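The "?" questions for the two-flip phenomenon can be checked by listing S1 and adding up the probabilities of the outcomes in each event. A minimal sketch, assuming the four outcomes are equally likely (.25 each); the helper `P` and the variable names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# S1 = {HH, HT, TH, TT}, each outcome assigned probability 1/4
outcomes = ["".join(pair) for pair in product("HT", repeat=2)]
prob = {outcome: Fraction(1, 4) for outcome in outcomes}


def P(event):
    """Probability of an event = sum of the probabilities of its outcomes."""
    return sum(prob[o] for o in outcomes if event(o))


p_tail_then_head = P(lambda o: o == "TH")             # 1/4
p_at_least_one_tail = P(lambda o: "T" in o)           # 3/4
p_one_of_each = P(lambda o: o.count("H") == 1)        # 1/2
p_at_least_one_head = P(lambda o: "H" in o)           # 3/4
p_two_heads = P(lambda o: o == "HH")                  # 1/4

print(p_tail_then_head, p_at_least_one_tail, p_one_of_each,
      p_at_least_one_head, p_two_heads)
```

The same answers can be read off S2 (number of heads): for example, "1 of each" is the outcome 1, with probability .50.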

Flipping-coin-twice was built from a simpler phenomenon; flipping coin once: P(H) = .5, P(T) = .5

Rule 5.  If A and B are two independent events, the probability that both A and B occur  is the product of the probabilities of the two events.  P(A and B) = P(A)×P(B), if (and only if) A and B are independent.
   Rule 5 can be used to build probabilities for complex phenomena from simpler ones (Ch. 14); to check structure in existing sample space (Ch. 15.)
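As an illustration of building a complex phenomenon from a simpler one, the sketch below uses Rule 5 to construct the flip-twice probabilities from the flip-once model P(H) = .5, P(T) = .5. It assumes the flips are independent; the function name is made up for illustration.

```python
from itertools import product
from math import prod


def build_independent_model(single_trial, repeats):
    """Multiply single-trial probabilities (Rule 5) to get each sequence's probability."""
    model = {}
    for combo in product(single_trial, repeat=repeats):
        # Independence: P(first and second and ...) is the product of the pieces
        model["".join(combo)] = prod(single_trial[o] for o in combo)
    return model


coin = {"H": 0.5, "T": 0.5}        # flipping the coin once
two_flips = build_independent_model(coin, 2)
print(two_flips)                    # {'HH': 0.25, 'HT': 0.25, 'TH': 0.25, 'TT': 0.25}
```

With an unfair coin the four outcomes would no longer be equally likely, but the same multiplication would still apply as long as the flips are independent.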

e.g. Pick 2 people at random from U.S. pop.  (Pop. is so big that it's hardly changed by removing first. Independence OK)
   P(First has 4+ yrs college, and 2nd didn't graduate HS) = .230×.183 = .042
   P(First didn't graduate HS, and 2nd has 4+ yrs college) = .183×.230 = .042
   P(one didn't graduate HS, and the other has 4+ yrs college) = .042+.042 = .084


Sievers home  Math151-Sp05/Days23.htm 2pm 3/30/05
This page belongs to Sally Sievers, who is solely responsible for its content. Please see our statement of responsibility.