MATH 251, Probability and Statistics I, Fall 2005, Mon. Dec. 3, Day 41 after class

Hand in: Nothing due. Missing or late HW accepted until noon Thurs. of finals week.
Bring any questions from anywhere in the course. (Email me first to get the best quality answer!)
Go over your exam, missing HW.  See you Wed.  I hope to have your Final ready by then.
Read, discuss 

Optional 
Exams are finished.  Many were very good: a J-shaped distribution.

Exam  solutions 

Homework questions? 8.2  Day 40

A glimpse of the wider world of statistics, in your textbook:
3 or more independent samples:  comparing proportions--
           use two-way tables and "Chi-square" statistics (Ch. 9) to test whether the proportions differ.  Extension: are the two "dimensions" of the table (color of medicine package, willingness to buy it) independent? (Research methods of Sociology)
           "Chi-square goodness-of-fit" (9.4)--Biology models.
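
A minimal sketch of the chi-square statistic for a two-way table, in plain Python. The counts are made up for illustration (they are not from the textbook), and the function name is my own. Each expected count is (row total x column total) / grand total, which is what independence of the two "dimensions" would predict.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df


# Made-up counts: rows = three package colors, columns = (would buy, would not buy)
counts = [[30, 20], [25, 25], [15, 35]]
chi2, df = chi_square_statistic(counts)
print(f"chi-square = {chi2:.3f} on {df} degrees of freedom")
```

With 2 degrees of freedom the 5% critical value in the chi-square table is 5.99, so a statistic larger than that would count as evidence that the proportions differ.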
3 or more independent samples:  comparing means--
              Analysis of variance (Ch. 12&13) (Quantitative Research Methods of Psychology)
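
A rough sketch of the one-way analysis of variance F statistic, assuming three small made-up samples (not data from the course). F compares the variation between the group means with the variation within the groups; the function name is my own.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)
    n = len(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Made-up scores for three groups
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F (compared with the F table for those degrees of freedom) suggests the population means are not all the same.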

Got to here Monday.

Inference for regression:  (Ch. 10)  Assuming a population where:
  for each possible x, corresponding y's are normal, same s.d. for all x's ("homoscedasticity"), and means of y's from each x lie on a straight line.
     Is the slope significantly different from 0?
     Confidence intervals: for slope and intercept
                for a specific x:  1) interval for the mean of the y's at that x, i.e. the y-hat value on the line.
                                           2) interval for where 95% of individual y's would be (more scatter): the "Prediction interval".
    Example 10-1, miles per gallon on miles per hour (data from a single car).  Logmph makes it straighter.  SPSS file
       In graph, Insert>fit line>Regression, click on the regression line. Edit >Regression Parameters, "Prediction Interval", mean, individual.
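
The regression inference above can be sketched in plain Python. This is only an illustration under stated assumptions: the x and y values are made up (they are not the Example 10-1 data), the function names are my own, and t* = 2.776 is the t-table value for 95% confidence with n - 2 = 4 degrees of freedom.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares line plus the pieces needed for inference."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))  # estimated common s.d. of the y's about the line
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "slope": slope,
            "intercept": intercept, "s": s, "se_slope": s / sqrt(sxx)}


def intervals_at(fit, x0, t_star):
    """Return (interval for the mean y at x0, prediction interval for one y at x0)."""
    y_hat = fit["intercept"] + fit["slope"] * x0
    spread = 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    half_mean = t_star * fit["s"] * sqrt(spread)
    half_pred = t_star * fit["s"] * sqrt(1 + spread)  # extra 1 = individual scatter
    return ((y_hat - half_mean, y_hat + half_mean),
            (y_hat - half_pred, y_hat + half_pred))


# Made-up data, n = 6
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
T_STAR = 2.776  # t-table value, 95% confidence, 4 degrees of freedom

fit = fit_line(xs, ys)
t_slope = fit["slope"] / fit["se_slope"]  # test of "slope = 0"
slope_ci = (fit["slope"] - T_STAR * fit["se_slope"],
            fit["slope"] + T_STAR * fit["se_slope"])
mean_ci, pred_int = intervals_at(fit, 3.5, T_STAR)
print("slope:", fit["slope"], "t:", t_slope, "95% CI:", slope_ci)
print("mean of y at x = 3.5:", mean_ci)
print("individual y at x = 3.5:", pred_int)
```

The prediction interval always comes out wider than the interval for the mean, which is the "more scatter" noted above.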
Multiple regression: (Ch 11)  Instead of one x-variable, 2 or more all predicting y. (Econometrics)
       Miles per gallon as a linear function of logmph and miles traveled (age of car).
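
A minimal sketch of fitting a multiple regression with two x-variables by solving the least-squares "normal equations" in plain Python. The numbers are made up and were built to lie exactly on a plane, so the recovered coefficients are easy to check; both function names are my own, and statistical software would normally do this step (and the inference) for you.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def least_squares(rows, ys):
    """rows: one tuple of predictor values per case. Returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up cases (x1, x2) with y = 1 + 2*x1 - 0.5*x2 exactly
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [2.0, 4.5, 5.0, 7.5, 8.5]
coefficients = least_squares(rows, ys)
print("intercept, b1, b2 =", [round(c, 6) for c in coefficients])
```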

Chapters on the CD (or downloadable)

Logistic regression (Ch 16) gives an introduction to a way to make predictions where the y is a yes/no, true/false variable (the prediction is a version of the probability of yes).  And where one (or more) of the predictor (x) variables is also a yes/no variable. (Econometrics?)
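
A small sketch of the idea, assuming made-up data with one yes/no predictor x and a yes/no response y. The fit here uses simple gradient ascent on the log-likelihood, which is my choice for illustration and not necessarily how SPSS or the textbook computes it; the function names, step size, and number of steps are also my own.

```python
from math import exp


def sigmoid(z):
    """Turn a value on the log-odds scale into a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(xs, ys, learning_rate=0.5, steps=5000):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            error = y - sigmoid(b0 + b1 * x)  # observed minus predicted probability
            g0 += error
            g1 += error * x
        b0 += learning_rate * g0 / n
        b1 += learning_rate * g1 / n
    return b0, b1


# Made-up data: among 10 cases with x = 0, 2 say yes; among 10 with x = 1, 7 say yes
xs = [0] * 10 + [1] * 10
ys = [1] * 2 + [0] * 8 + [1] * 7 + [0] * 3

b0, b1 = fit_logistic(xs, ys)
p_when_x0 = sigmoid(b0)
p_when_x1 = sigmoid(b0 + b1)
print(f"predicted P(yes | x=0) = {p_when_x0:.3f}, P(yes | x=1) = {p_when_x1:.3f}")
```

With a single yes/no predictor the fitted probabilities simply reproduce the observed proportions in the two groups (0.2 and 0.7 here); the method earns its keep when there are several predictors or a quantitative x.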

"Bootstrap" methods (Ch. 14) -- (nice, fairly new, need powerful software)  "Resampling" methods.  Make no assumptions about underlying population distributions; take a zillion resamples (drawn with replacement, same size as the original) from your sample to get a handle on the variability of your sample, and therefore of the population it "represents."
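
A minimal sketch of a bootstrap interval for a mean, using only Python's standard library. The sample values are made up, the function names are my own, and the simple "percentile" interval shown is just one of several bootstrap intervals.

```python
import random
from statistics import mean


def bootstrap_means(sample, n_resamples=2000, seed=0):
    """Means of many resamples drawn WITH replacement, each the size of the sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(sample)
    return sorted(mean(rng.choices(sample, k=n)) for _ in range(n_resamples))


def percentile_interval(sorted_values, confidence=0.95):
    """Cut off equal tails of the sorted bootstrap values."""
    tail = (1 - confidence) / 2
    lower = sorted_values[int(tail * len(sorted_values))]
    upper = sorted_values[int((1 - tail) * len(sorted_values)) - 1]
    return lower, upper


# Made-up, right-skewed sample
sample = [2.1, 2.4, 2.5, 3.0, 3.2, 3.3, 4.1, 4.8, 6.5, 9.7]
boot = bootstrap_means(sample)
lower, upper = percentile_interval(boot)
print(f"sample mean = {mean(sample):.2f}")
print(f"95% bootstrap percentile interval for the mean: ({lower:.2f}, {upper:.2f})")
```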

Nonparametric tests (Ch. 15)  The most commonly used tests for non-normal populations.  Based mostly on "order statistics" (percentiles), but usually presented as easily computed results. (Sign test is one.) (Biology, other fields)
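
The sign test mentioned above is easy to compute exactly. A minimal sketch, assuming made-up paired differences (after minus before) and a function name of my own: under the null hypothesis each nonzero difference is equally likely to be + or -, so the count of + signs is binomial with p = 1/2.

```python
from math import comb


def sign_test_p_value(differences):
    """Two-sided exact sign test; zero differences are dropped."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    k = min(positives, n - positives)  # count of the less common sign
    # P(X <= k) for X binomial(n, 1/2), then doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up paired differences: 8 positive, 2 negative
differences = [1.2, 0.8, -0.3, 2.1, 0.5, 1.7, 0.9, -0.4, 1.1, 0.6]
p_value = sign_test_p_value(differences)
print(f"two-sided sign test P-value = {p_value:.4f}")
```

Notice that only the signs are used, not the sizes of the differences, which is why no normality assumption is needed.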

Quality control (Ch 17)  for improving the processes involved in producing a product or service, by collecting and monitoring data.  Some nice ideas, not hard mathematically, but a lot of jargon.  American ideas (Mr. Deming), taken up in Japan and very instrumental in Japanese manufacturing success around the world; the ideas then returned to U.S. corporations.  Reached fad level as "Total quality management," but settling into fundamental use now, I think.

And much much more out there; variations and embroideries and refinements and different names for the same things ---  but "Everything" is about
   Exploratory (descriptive) data analysis
or Confirmatory (inference) on data assumed to be collected in some way as free of biases as possible, and as much like a random sample as possible (Design of samples, experiments).


Sievers home  Math251-Fall07/Day2s41.htm    4pm   12/4/07
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.