In lieu of class, a few paragraphs (choose one):
B) A paragraph describing one of the workshops/talks you attended,
+ a paragraph or so on a situation where organized data could be useful
to an activist working for a cause (either data which was cited in a
workshop you attended, or a place where you could see that information
could help make or strengthen the "case" for a cause, or be useful in
improving the activist's skill in some way).
C) Find one or more graphs, charts or tables of numbers in the popular
press or on the web. Hand in a copy of it/them. Explain what it's about
and what it says, and critique it as to how well it conveys the
information. If you can do it better, redo it.
D) Research Florence Nightingale, primordial activist and
statistician. Report why/how she fits into this year's
theme of "Got Passion", and why I call her a statistician.
E) Nothing. Counts as a class absence.
- - - - - - - - - - - - - - - - - - - - - - -
Exams: Solutions outside my door, on reserve. Comments

          total   #1   #2     #3   #4   #5     #6     #7   #8
possible    100    8   10     19   16    7     19     17    4
max         100    8   10     19   16    7     19     17    4
Q3           92    8   10     18   16    7  18.75     16    4
Med        87.5    8   10   16.5   16    7   17.5   12.5    2
Q1         78.5    8    9  15.25   14    6     17   7.25    1
min          69    5    5     12   10    4     13      2    0

Stemplot of total scores:
10|0
 9|88
 9|012244
 8|788
 8|0234
 7|68
 7|004
 6|9
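As a check on reading the stemplot, here is a small sketch that turns the stems and leaves back into scores and recomputes the five-number summary for the totals. The "inclusive" interpolation rule for the quartiles is an assumption about how the table was computed; other quartile rules give slightly different Q1 and Q3.

```python
from statistics import quantiles

# Total scores read off the stemplot (stem = tens digit(s), leaf = ones digit).
# Split stems such as 9|88 and 9|012244 are merged here.
stems = {
    10: "0",
    9: "88" + "012244",
    8: "788" + "0234",
    7: "68" + "004",
    6: "9",
}
scores = sorted(
    10 * stem + int(leaf) for stem, leaves in stems.items() for leaf in leaves
)

# Assumption: quartiles by linear interpolation between data points
# ("inclusive" method), which may not be the rule used for the table.
q1, med, q3 = quantiles(scores, n=4, method="inclusive")
five_number = (min(scores), q1, med, q3, max(scores))

print("number of exams:", len(scores))
print("min, Q1, Med, Q3, max:", five_number)
```

With this rule the results agree with the "total" column of the table.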
HW assignment Day 15
Reading: Finish 2.3, read 2.4. Skip 2.5. Read ahead in Ch. 3.
Hand in LATER:
Nothing to hand in Monday. Happy weekend!

With four facts, from Day 14: See details there.

C. govsal on avgpay
   2.33, 2.30, 2.35 -- Note: Text & Excel files are put in order, so look
   different, + Text is MISSING the 23rd point, (5,56). You can just type
   it in.
   2.47, 2.51
E. RSquared

= = = = = = = = = = = = = = = = = =

A. Use ResidualsRSquared from the website or the lab to graph these data
   sets, along with a graph of the residuals. Print the results, and
   describe the shape of the residuals (it may help to connect the dots
   with pencil, to see the pattern).
     a)  x:  1  2  8  4  6  9
         y:  1  3  6  6  7  5
     b)  x:  1  2  7  4  6  9
         y:  7  6  2  4  2  1
   Moore p. 122, 2.36 speed & gas again: a, b, c, d. There is a data file
   for problem 2.36, and its third column is the residuals (check them
   against the book).

B. Use Author's website, http://www.whfreeman.com/scc,
   ...Correlation/regression. Make a cloud of data (about 15 points), put
   in the regression line. Play with an outlier: drag a point to the far
   left (right) and drag it up and down. Try it if it's in the middle
   range of x's. Write answer: Where is it most influential? Now add a
   bunch more points (50 is max). Play with an outlier again. Does the
   outlier have more or less influence with a larger data set?
   Moore p. 123, 2.38 Gesell first word: point in middle of x range. Get
   the data into SPSS, delete child 19, graph and get the regression line
   and r^2. Use the formula on p. 117 and graph the line for the full data
   set by hand on your printout. r^2 for the full data set is on p. 122.
   Moore p. 122, 2.37 Calories (You saved these, I think; or, from Moore's
   files, in TA02-04). Graph and get lines in SPSS with and without the
   outliers. Graph the line for "without outliers" by hand on the printout
   for "with outliers" so you can compare them better. Print one more
   graph (with outliers) and keep it for problem C below.
Read, Optional
Regression-- Review comments
ANY straight line y = a + bx (or bx + a): b, the coefficient of x, is the
slope of the line. If x changes one unit, y changes b units, so b is the
rate of change of y with respect to x. (If y is weight in pounds, and x is
height in inches, b is the number of pounds we expect to see weight go up
by, per inch that height goes up by.)
"Regression line of weight on height":
height = horizontal (x) axis, weight = vertical (y) axis.
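To make the "pounds per inch" reading of the slope concrete, here is a tiny sketch. The intercept and slope below are made-up numbers for illustration only, not values from Moore or from any class data set.

```python
# Hypothetical regression line of weight on height: weight = a + b * height.
# A and B are invented for illustration; they do not come from real data.
A = -200.0  # intercept, in pounds
B = 5.0     # slope, in pounds per inch


def predicted_weight(height_in):
    """Predicted weight (pounds) for a given height (inches)."""
    return A + B * height_in


# Moving one inch along the x-axis changes the prediction by exactly B pounds,
# no matter which height we start from.
change_short = predicted_weight(61) - predicted_weight(60)
change_tall = predicted_weight(73) - predicted_weight(72)
print(change_short, change_tall)
```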
We finished Fact 1; have the rest to go.
Four Facts: Day 14

LEAST SQUARES PROPERTY
"Residual at x" = y - yhat = signed vertical distance between observed y
and predicted y (what's left over after predicting).
(Positive if observed is bigger than predicted,
negative if observed is smaller than predicted.)
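A minimal sketch of the residual calculation, using data set a) from the homework. The line used here, yhat = 2 + 0.5x, is just a trial line picked for easy arithmetic; it is an assumption for illustration and is not claimed to be the regression line.

```python
# Data set a) from the homework.
xs = [1, 2, 8, 4, 6, 9]
ys = [1, 3, 6, 6, 7, 5]

# Trial line yhat = a + b*x (an arbitrary choice, not the fitted line).
a, b = 2.0, 0.5

# Residual = observed y minus predicted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

for x, y, res in zip(xs, ys, residuals):
    side = "above" if res > 0 else "below" if res < 0 else "on"
    print(f"x={x}: observed {y}, predicted {a + b * x}, residual {res:+.1f} ({side} the line)")
```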
Least squares principle: Find the line that minimizes the sum of the
squared residuals. (Here, or in Mac 101,
ClassMaterials\Math151\RegressionDemos\RegressionLine.xls, Squares tab)
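The principle can be tried out numerically. The sketch below uses the standard closed-form least-squares formulas (slope = Sxy / Sxx, intercept = ybar - slope * xbar), which may be written differently from the formula in the text, on data set a) from the homework. It then nudges the line a little in each direction to see that the sum of squared residuals only goes up.

```python
# Data set a) from the homework.
xs = [1, 2, 8, 4, 6, 9]
ys = [1, 3, 6, 6, 7, 5]


def least_squares(xs, ys):
    """Return (a, b) for the line yhat = a + b*x with the smallest sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def sum_sq_resid(a, b, xs, ys):
    """Sum of squared residuals for the line yhat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


a, b = least_squares(xs, ys)
best = sum_sq_resid(a, b, xs, ys)
print(f"fitted line: yhat = {a:.3f} + {b:.3f} x, sum of squares = {best:.3f}")

# Nudge the intercept or the slope; every nudged line should do worse.
nudges = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.1), (0.0, -0.1)]
nudged = [sum_sq_resid(a + da, b + db, xs, ys) for da, db in nudges]
for (da, db), value in zip(nudges, nudged):
    print(f"intercept {da:+.1f}, slope {db:+.1f}: sum of squares = {value:.3f}")
```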
This method of finding a "best fit" straight line for predicting y's from
x's was derived mathematically to work well with "joint normal" data
(elliptical clouds). For data of this sort, the line does give the mean of
the y's for each given x (at least in the abstract).
Drawback if the data is not the "elliptical cloud" type:
Outliers get their residual distance squared, so they may be very
influential in determining where the line sits. Especially at the lowest
or highest x-values, they may change the slope of the line a lot.
Author's website, http://www.whfreeman.com/scc,
...Correlation/regression. Play with an outlier.
(Outliers toward the middle x's may not change the slope, but may affect
r and r^2.)
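A rough numerical version of "play with an outlier", for anyone away from the applet. The base points are invented (nine points lying exactly on y = x), and the two outlier positions are likewise assumptions chosen to show the contrast; the applet itself works with whatever cloud you draw.

```python
from math import sqrt


def slope_and_r(points):
    """Return (least-squares slope, correlation r) for a list of (x, y) points."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    syy = sum((y - y_bar) ** 2 for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Invented base data: nine points exactly on the line y = x.
base = [(x, x) for x in range(1, 10)]

# Same data plus one outlier in the middle of the x range (at the mean of x).
middle = base + [(5, 15)]

# Same data plus one outlier at the far right of the x range.
far_right = base + [(10, 0)]

slope_base, r_base = slope_and_r(base)
slope_mid, r_mid = slope_and_r(middle)
slope_end, r_end = slope_and_r(far_right)

print(f"no outlier:        slope {slope_base:.3f}, r {r_base:.3f}")
print(f"outlier in middle: slope {slope_mid:.3f}, r {r_mid:.3f}")
print(f"outlier far right: slope {slope_end:.3f}, r {r_end:.3f}")
```

In this made-up example the middle outlier sits exactly at the mean of the x's, so it leaves the slope alone and only weakens r; the far-right outlier drags the slope down as well.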
Plotting residuals: This amounts to making the regression line into a new
x-axis. If you plot the residuals themselves vs. the original x values,
without the distraction of the slanted line, outliers and patterns other
than the linear (if any) can emerge.
(Here, or ClassMaterials\Math151\RegressionDemos\ResidualsRSquared.xls,
Graph of Residuals tab. It doesn't have the tiny unlined graph.)
SPSS can make a new variable of residuals, which you then can use
to make a scatterplot. Optional HW.
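To see a non-linear pattern emerge from residuals, here is a sketch on invented curved data (y = x squared for x = 0 to 6, an assumption purely for illustration). The fitted line looks like a passable summary, but the residuals listed in x order show a clear U shape. A crude text "plot" stands in for the spreadsheet or SPSS graph.

```python
# Invented curved data: y = x**2 for x = 0..6 (not from the text or the lab).
xs = list(range(7))
ys = [x ** 2 for x in xs]

# Least-squares line yhat = a + b*x via the standard closed-form formulas.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

# Residuals vs. the original x values.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Crude text plot: one row per x, a '*' placed left or right of the zero line '|'.
for x, res in zip(xs, residuals):
    offset = round(res)
    row = [" "] * 13          # columns for residuals -6 .. +6
    row[6] = "|"              # zero line
    row[6 + offset] = "*"     # mark the residual (overwrites '|' when it is 0)
    print(f"x={x}  {''.join(row)}  {res:+.1f}")
```

Residuals that are positive at both ends and negative in the middle are the signature of a curve that a straight line cannot follow.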
| Sievers home | Math151-Sp04/Days15.htm | 4pm | 3/5/04 |