### MATH 251, P & S I, Fall 2011, F Sept 2, Day 4.  After class, hit reload.

Meet in the classroom Monday; in the Mac 101 lab Wednesday, probably, for the big SPSS intro.  Bring a disk or flash drive.
Unless otherwise noted, all assignments are in IPS.
Day 4 Assigned: (Re)read 1.2 through p. 42 (spreads).  Then pp. 43-45 (linear transformations).
Read ahead: 1.3, pp. 50-54 (densities), then on through Normal distributions, up to Normal quantile plots, p. 65.
Use the SPSS Handout for computation of mean and std. dev., unless the problem says to do it by hand.

B.  Linear transformations algebra.  You have a data set x1, x2, ..., xn, which has mean xbar and standard deviation s.
a) We noted that the sum of all the deviations from the mean, sum(xi - xbar), should always equal 0.  Prove this is true by algebra.  (If you are not skilled at working with big sigmas, do it for n = 3 (x1, x2, x3) and write out all the sums with +'s.)  (This is ex. 1.92 in IPS.)
b) You make a linear transformation xi* = a + b xi on each data point.  (The book uses xnew instead of x*, p. 43.)
a can be + or -, but b should be positive.  (In practical terms, negative b would "flip" the data, reversing the order.)
Show that the mean xbar* of the transformed data set = a + b xbar,
and that the standard deviation of the transformed data set, s* = b s.  (The text shies away from writing formulas...)
(If you are not skilled at working with big sigmas, do it for n = 3 and write out all the sums with +'s.)
Do the proof by starting with the formula for the mean expressed in the xi*'s, e.g. xbar* = (x1* + x2* + x3*)/3.
Plug in xi* = a + b xi, and work the algebra to arrive at the desired expression (a + b xbar) involving the mean of the xi's.  Repeat for the standard deviation formula.  Hint: xbar* appears in the standard deviation expression; substitute a + b xbar for it, since you already proved they were equal.
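For reference, the n = 3 starting expressions for part (b) might be written out as below.  This is only the setup, not the proof; the standard deviation is written with the usual n - 1 divisor, which here is 3 - 1 = 2.

```latex
\bar{x}^* = \frac{x_1^* + x_2^* + x_3^*}{3},
\qquad
s^* = \sqrt{\frac{(x_1^* - \bar{x}^*)^2 + (x_2^* - \bar{x}^*)^2 + (x_3^* - \bar{x}^*)^2}{3 - 1}},
\qquad
x_i^* = a + b\,x_i .
```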

Check for homework questions from Day 3?  Especially 1.76 (0's).  "Read, to discuss" problems?  Remaining #s on board.
Helpers more or less up to date.  Class members posted.  Math251@wells.edu working.

Wednesday is probably the day for SPSS in Mac 101, at class time.  First SPSS Handout, with Morganstore instructions on the back.
Coming Friday (probably): Quiz, in class, closed book: stemplot, 5-number summary, and boxplot.  Mean & s.d. by hand, showing all steps.

- - - - - - - - - - - - - - - - - - - - - - - - - - -
Revisit or meet mean/median, 5-number summary, boxplot (Day 3).
Compare boxplot with histogram: longer boxplot sections mean lower histogram height, and vice versa.
Example from Connexions: Collaborative Statistics, by Barbara Illowsky, Ph.D., and Susan Dean.

Some other "averages"/ measures of middle:   (Many exist)
Trimmed mean: throw away, say, the top and bottom 5%; take the mean of the rest.  Resistant, but hard to work with.  (SPSS, later)
Midrange:  Point midway on the ruler scale between smallest and largest:  Min = 5, Max = 15, Midrange = (5+15)/2= 10.
Highly sensitive, non-resistant, not too useful, but quick!
The Mode/modal class: (Mode: most "popular")  Group with the most individuals; point of peak of the histogram "curve".
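A minimal Python sketch of these three alternatives, for illustration only.  The data values are made up (chosen so that Min = 5 and Max = 15 as in the midrange example), and the helper names `midrange` and `trimmed_mean` are hypothetical.  The trimming rule used here (round the count down, drop that many from each end) is one simple convention; SPSS may trim differently.

```python
from statistics import mean, mode

def midrange(data):
    """Point midway between the smallest and largest observation."""
    return (min(data) + max(data)) / 2

def trimmed_mean(data, fraction):
    """Drop `fraction` of the observations from each end, then average the rest."""
    ordered = sorted(data)
    k = int(len(ordered) * fraction)   # number dropped from each end, rounded down
    return mean(ordered[k:len(ordered) - k])

# Made-up data: min 5, max 15.
data = [5, 7, 7, 8, 9, 10, 11, 12, 13, 15]

print(midrange(data))            # (5 + 15) / 2
print(trimmed_mean(data, 0.10))  # with n = 10, drops one value from each end
print(mode(data))                # most frequent value
```

With only 10 observations a 5% trim would drop nothing under this rule, so the sketch trims 10% instead.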

Standard deviation (goes with the mean): the square root of the variance.
Variance:  (almost) average of squared deviations from the mean.
(deviations sum to 0)
(Divide by (n-1), the "degrees of freedom": the dimension of the vector space spanned by the deviations from the mean.  Because the deviations sum to 0, any n-1 of them determine the last.)
Demo:  1,1,2,4, mean = 2, sum of squared deviations = 6, variance = 6/3 = 2, s = 1.41 (the table at the bottom of this page lays out the computation)
1,1,2,4,12, mean = 4, sum of squared deviations = 86, variance = 21.5, s = 4.64.
(Midcomputation check:  Sum of deviations from the mean (before squaring each) always = 0 )
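The by-hand steps in the demo can be sketched in Python as follows.  The function name `sd_by_hand` is hypothetical; the steps mirror the table method (deviations, check that they sum to 0, square, add, divide by n - 1, take the square root).

```python
from math import sqrt

def sd_by_hand(data):
    """Return (mean, sum of squared deviations, variance, s), step by step."""
    n = len(data)
    xbar = sum(data) / n
    deviations = [x - xbar for x in data]
    # Mid-computation check: deviations from the mean always sum to 0
    # (allowing for tiny floating-point rounding).
    assert abs(sum(deviations)) < 1e-9
    ss = sum(d * d for d in deviations)   # sum of squared deviations
    variance = ss / (n - 1)               # divide by degrees of freedom, n - 1
    return xbar, ss, variance, sqrt(variance)

print(sd_by_hand([1, 1, 2, 4]))
print(sd_by_hand([1, 1, 2, 4, 12]))
```

Rounded to two decimals, the two values of s should match the 1.41 and 4.64 in the demo.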

-- s is always >= 0  (s = 0 only if all observations are equal)
-- s has the same units as the observations (the deviations are squared, then the square root is taken).

s is very sensitive to outliers (the outliers contribute much more than their share to the Sum of Squared Deviations from the Mean).
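To see how lopsided the contributions can be, here is a small Python sketch using the 1,1,2,4,12 data set from the demo.  It prints each observation's squared deviation and its percentage share of the total; the variable names are illustrative only.

```python
data = [1, 1, 2, 4, 12]
xbar = sum(data) / len(data)

squared = [(x - xbar) ** 2 for x in data]   # squared deviations from the mean
total = sum(squared)                        # sum of squared deviations
shares = [round(100 * sq / total, 1) for sq in squared]

for x, sq, share in zip(data, squared, shares):
    print(f"x = {x:2d}  squared deviation = {sq:5.1f}  share = {share:4.1f}%")
```

The single observation 12 is one fifth of the data but accounts for 64 of the 86 units in the sum, roughly three quarters.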

Mean and Standard Deviation are for Symmetric Unimodal  distributions without big outliers.
(ideally "Bell-shaped" = Normal)

SPSS to find mean and s.d.   Handout

We've been looking at SHAPE of distributions, and the ways irregularities can point us to knowledge about the data. (Living histograms.)  Note p.39 middle:  Statistical [summary] measures and methods based on them are generally meaningful only for distributions of sufficiently regular shape. ... [Q]uickly resorting to fancy calculations is the mark of a statistical amateur.  Look, think, and choose your calculations selectively.

Summaries of Middle & Spread "Systems":
-- Midrange, Range  (very sensitive to outliers: they use only the max and min!)
-- Median, IQR  (+ quartiles Q1, Q3, 5-number summary), based on percentiles (the j-th percentile is the value with about j% of the data at or below it)
-- Mean, Standard Deviation: "x-bar" (or "y-bar", etc.), "s"  (good for symmetric unimodal, no outliers)
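A sketch of the median/IQR system in Python, placed next to the range and midrange for comparison.  It assumes the textbook-style quartile rule (Q1 is the median of the observations to the left of the overall median's position, Q3 the median of those to the right).  Other software, possibly including SPSS, uses other percentile rules and may give slightly different quartiles.  The function name `five_number_summary` is hypothetical, and it expects at least two observations.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # observations left of the median's position
    upper = ordered[half + n % 2:]    # observations right of it (skip the middle one if n is odd)
    return min(ordered), median(lower), median(ordered), median(upper), max(ordered)

data = [1, 1, 2, 4, 12]
mn, q1, med, q3, mx = five_number_summary(data)
print(mn, q1, med, q3, mx)
print("IQR =", q3 - q1, " range =", mx - mn, " midrange =", (mn + mx) / 2)
```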

... --------------------------  -----------------------------------------
Linear transformations (pp. 43-45) do not change the shape of a distribution.  A "good" measure of center or spread should "act naturally" if you change units of measurement
-- by shifting (translating): everyone eats one more cookie;
-- or by stretching or shrinking (changing scale): all cookies are broken in half; count half-cookies.
Example: Fahrenheit <--> Celsius.
(Community Handbook?)
New x* = a + bx, for each observation.
Measures of spread are unaffected by the shift a!  They are affected only by the scale change b.  (Measures of center are affected by both.)
Page 44 gives the rules explicitly.  Problem B has you prove them for mean and standard deviation.
(Shifting is often done to put numbers in a nice range, with 0 not too far away.  E.g. years from 1970)
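A small numeric illustration in Python, using the Celsius-to-Fahrenheit change of units (F = 32 + 1.8 C, so a = 32 and b = 1.8).  The temperatures are made up.  This is a spot check on one data set, not a proof of the rules.

```python
from statistics import mean, median, stdev

celsius = [10.0, 15.0, 20.0, 25.0, 40.0]      # made-up temperatures

shifted = [c + 32 for c in celsius]           # a = 32, b = 1: shift only
scaled = [1.8 * c for c in celsius]           # a = 0, b = 1.8: scale change only
fahrenheit = [32 + 1.8 * c for c in celsius]  # both together

for label, values in [("Celsius", celsius), ("shifted", shifted),
                      ("scaled", scaled), ("Fahrenheit", fahrenheit)]:
    spread = max(values) - min(values)        # range
    print(f"{label:10s} mean = {mean(values):6.2f}  median = {median(values):6.2f}  "
          f"s = {stdev(values):6.3f}  range = {spread:6.2f}")
```

In the printout, the "shifted" row should show the same s and range as the Celsius row, while the "scaled" and "Fahrenheit" rows should show both multiplied by 1.8.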

 Sievers home Math251-Fall11/Dayq4.htm 11am 9/2/11
This page belongs to Sally Sievers, who is solely responsible for its content.
- - - - - - - - - - - - - - - - - - - - -
Table for calculating sum of squared deviations, for n = 4 observations.
| x | x - xbar = x - 2 | (x - xbar)^2 |
|---|---|---|
| 1 | -1 | +1 |
| 1 | -1 | +1 |
| 2 | 0 | 0 |
| 4 | 2 | 4 |
| 8 = sum; xbar = 8/4 = 2 | 0 = sum (always!) | 6 = sum of squared deviations |