| Hand in:
Nothing due. Missing or late HW is accepted until noon
Thurs. of finals week. Bring any questions from anywhere in the course. (Email me first to get the best-quality answer!) Go over your exam and any missing HW. See you Wed. I hope to have your Final ready by then. |
Read, discuss |
Optional |
Homework questions? 8.2 Day 40
A glimpse of the wider world of statistics, in your textbook:
3 or more independent samples: comparing proportions--
use two-way tables and "Chi-square" statistics (Ch. 9) to test whether the proportions differ.
Extension: are the two "dimensions" of the table (color of medicine package, willingness to buy it) independent? (Research methods of Sociology)
"Chi-square goodness-of-fit" (9.4)--Biology models.
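The two-way table test above can be sketched in a few lines of plain Python. This is only an illustration: the counts are made up (not from the textbook), the function name `chi_square_independence` is my own, and the p-value shortcut used here is exact only when the table has 2 degrees of freedom.

```python
import math

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Made-up counts: rows = would buy / would not buy, columns = three package colors
counts = [[30, 20, 10],
          [20, 30, 40]]
chi2, df = chi_square_independence(counts)

# For df = 2 ONLY, the upper-tail chi-square probability is exactly exp(-chi2 / 2);
# other df values need a table or statistical software.
p_value = math.exp(-chi2 / 2)
print(chi2, df, p_value)
```

A small p-value here would say the proportions of buyers differ across package colors, which is the same as saying color and willingness to buy are not independent.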
3 or more independent samples: comparing means--
Analysis of variance (Ch. 12 & 13) (Quantitative Research Methods of Psychology)
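To give a feel for what analysis of variance computes, here is a minimal sketch of the one-way F statistic: variation between the group means compared with variation within the groups. The three samples are made up, and `one_way_anova_f` is a name I chose for illustration; getting a p-value from F needs an F table or software.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of observations
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up samples from three groups
samples = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
f_stat, df_between, df_within = one_way_anova_f(samples)
print(f_stat, df_between, df_within)
```

A large F says the group means are spread out more than the scatter inside the groups would explain.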
Analysis of variance nod
Got to here Monday.
Inference for regression: (Ch. 10)
Assuming a population where, for each possible x, the corresponding y's are normal with the same s.d. for all x's ("homoscedasticity"), and the means of the y's for each x lie on a straight line.
Is the slope significantly different from 0?
Confidence intervals: for slope and intercept;
for a specific x: 1) interval for what the predicted y-hat (line) value would be (the mean of the y's at that x).
2) interval for where 95% of individual y's at that x would be (more scatter): "Prediction interval"
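The pieces above (test of the slope, interval for the mean, prediction interval) can be sketched with the standard simple-regression formulas. The data are made up, not the Example 10-1 data; the function name `regression_intervals` is hypothetical; and the critical value 2.306 is the 95% t value for 8 degrees of freedom, which fits only because this toy data set has n = 10.

```python
import math

def regression_intervals(xs, ys, x0, t_star):
    """Least-squares line plus intervals at x0; t_star is the t critical value for n - 2 df."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # s estimates the common s.d. of the y's about the line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    se_slope = s / math.sqrt(sxx)
    y_hat = intercept + slope * x0
    # Standard error for the mean of y at x0 (the line itself)
    se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    # Standard error for one new individual y at x0 (extra "1 +" adds the scatter)
    se_individual = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "t_slope": slope / se_slope,  # test statistic for "slope = 0"
        "slope_interval": (slope - t_star * se_slope, slope + t_star * se_slope),
        "mean_interval": (y_hat - t_star * se_mean, y_hat + t_star * se_mean),
        "prediction_interval": (y_hat - t_star * se_individual,
                                y_hat + t_star * se_individual),
    }

# Made-up data, n = 10, so df = 8 and the 95% t critical value is 2.306
xs = list(range(1, 11))
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
result = regression_intervals(xs, ys, x0=7, t_star=2.306)
for name, value in result.items():
    print(name, value)
```

The prediction interval always comes out wider than the interval for the mean, which is the "more scatter" point in item 2.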
Example 10-1, miles per gallon on miles per hour (data from a single car). Logmph makes it straighter. SPSS file
In graph, Insert>fit line>Regression, click on the regression line. Edit >Regression Parameters, "Prediction Interval", mean, individual.
Multiple regression: (Ch 11) Instead of one x-variable, 2 or more, all predicting y. (Econometrics)
Miles per gallon as a linear function of logmph and miles traveled (age of car).
Chapters on the CD (or downloadable)
Logistic regression (Ch 16) gives an introduction to a way to make predictions where the y is a yes/no, true/false variable (the prediction is a version of the probability of yes), and where one (or more) of the predictor (x) variables may also be a yes/no variable. (Econometrics?)
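A rough sketch of how a logistic model turns x's into a "probability of yes": a linear combination of the predictors is pushed through the logistic function so the result lands between 0 and 1. The coefficients below are hypothetical, chosen only for illustration (a real analysis would estimate them from data with software), and `predicted_probability` is my own name.

```python
import math

def predicted_probability(intercept, coefficients, predictors):
    """Logistic model: probability of 'yes' from a linear combination of the x's."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1 / (1 + math.exp(-linear))

# Hypothetical (not estimated) coefficients:
# first x is a numeric predictor, second x is a yes/no predictor coded 1/0
b0 = -1.0
b = [0.5, 2.0]

p_no = predicted_probability(b0, b, [2.0, 0])   # yes/no predictor = no
p_yes = predicted_probability(b0, b, [2.0, 1])  # yes/no predictor = yes
print(p_no, p_yes)
```

Switching the yes/no predictor from 0 to 1 raises the predicted probability here because its (made-up) coefficient is positive.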
"Bootstrap" methods (Ch. 14) -- (nice, fairly new, need powerful software) "Resampling" methods. Make no assumptions about underlying population distributions; take a zillion subsamples of your sample to get a handle on variability of your sample, therefore of the population it "represents."
Nonparametric tests (Ch. 15) The most commonly used tests for non-normal populations. Based mostly on "order statistics" (percentiles), but usually presented as easily computed results. (Sign test is one.) (Biology, other fields)
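Since the sign test is mentioned, here is a small sketch of it for paired differences: drop the zeros, count the plus signs, and ask how surprising that count is if plus and minus were equally likely (a binomial with p = 1/2). The differences are made up and `sign_test_p_value` is a name chosen for illustration.

```python
from math import comb

def sign_test_p_value(differences):
    """Two-sided sign test p-value for H0: median difference = 0."""
    nonzero = [d for d in differences if d != 0]  # ties (zeros) are dropped
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    k = min(positives, n - positives)
    # Binomial(n, 1/2) probability of a count as extreme as k, in one tail
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Made-up paired differences (after minus before)
differences = [2, 1, 3, -1, 4, 2, 0, 5, 1, 2]
p_value_sign = sign_test_p_value(differences)
print(p_value_sign)
```

Only the signs are used, not the sizes of the differences, which is why no normality assumption is needed.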
Quality control (Ch 17) for improving the processes involved in producing a product or service, by collecting and monitoring data. Some nice ideas, not hard mathematically, but a lot of jargon. American ideas (W. Edwards Deming), taken up in Japan and very instrumental in Japanese manufacturing success around the world; the ideas then returned to U.S. corporations. Reached fad level as "Total quality management" but settling into fundamental use now, I think.
And much, much more out there: variations and embroideries and refinements and different names for the same things --- but "Everything" is about
Exploratory (descriptive) data analysis
or Confirmatory (inference) on data assumed to be collected in some way as free of biases as possible, and as much like a random sample as possible (Design of samples, experiments).
| Sievers home | Math251-Fall07/Day2s41.htm | 4pm | 12/4/07 |