### MATH 251, P & S I, Fall 2011, Mon. Sept. 5, Day 5. After class.

Meet in Mac 101 lab Wednesday (YES) for the big SPSS intro.  Bring a disk or USB stick, and the book.
Day 5: Reading: IPS7e 1.3 Density curves 50-54,  Normal distribution, pp.54-64.

Meet in Mac 101 lab Wednesday (unless I email otherwise)  Day 6 for big SPSS intro.  Bring a disk or stick.
Friday Day 7 Quiz:  In class, closed book:  Stemplot, five-number summary and boxplot.  Mean & s.d. by hand, showing all steps.
Boxplot Cartoon

Handout: Density handout (solutions, not handed out)
HW questions?  Esp. linear transformations (Notes, Day 4)
Solutions
1.3   Density function or curve: idealized histogram.   "Model"
Area = relative frequency.

Any curve that is above the x-axis and has area exactly 1 under it can be thought of as the idealization of some set of observations, and can be called a Density curve.  We carry over our terms for shape, and our summary measures.
Densities
(When values can take on any of a continuous interval of numbers)
Example:  Spinner:  Label edge with continuous values from 0 to 1. Spinning should produce 1/10 of all spins in each colored sector.  Simulations of 500 and 3000 spins show this is roughly true. More spins would get closer to the Uniform shape.
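To try the spinner idea yourself, here is a minimal Python sketch (an illustration, not part of the class materials). It assumes 10 equal sectors and lets the computer's random number generator stand in for the spinner; the function name `sector_proportions` and the seed value are made up for this example.

```python
import random

def sector_proportions(n_spins, n_sectors=10, seed=0):
    """Spin a 0-to-1 spinner n_spins times; return the share of spins in each equal sector."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [0] * n_sectors
    for _ in range(n_spins):
        x = rng.random()  # uniform on [0, 1)
        # Sector 0 covers [0, 0.1), sector 1 covers [0.1, 0.2), and so on.
        sector = min(int(x * n_sectors), n_sectors - 1)
        counts[sector] += 1
    return [c / n_spins for c in counts]

# Compare small and large numbers of spins.
for n in (500, 3000, 100_000):
    print(n, [round(p, 3) for p in sector_proportions(n)])
```

Each sector's share should hover near 0.1, and the scatter around 0.1 should shrink as the number of spins grows.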

Abstraction, idealized histogram ("Probability Model") =
Density curve. Describes a theoretical distribution of data.
Any such model is a curve
--always on or above the horizontal axis
--has area exactly 1 underneath it.

This allows area to represent proportion (relative frequency) of "histogram" between specified values.
(We will assume the proportion of observations precisely equal to a value is 0.  So "proportion less than 2" is the same number as "proportion less than or equal to 2.")
Many, many models are possible, modeling many phenomena:  (Histograms of data for some models)
• For the spinner, the density  is "Uniform on 0 to 1".
• If you have two spinners like this, spin both at once and add the results--the corresponding density  is "triangular, symmetric, on 0 to 2"
• A more complicated mechanism  produces data corresponding to the model  I've called "trapezoid, -1 to 2"
• A very important one is the "normal" distribution family--bell-shaped.
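To see where the triangular shape comes from, here is a hedged Python sketch (an illustration, not from the handout). It compares the exact area under the "triangular, symmetric, on 0 to 2" density, found with triangle-area geometry, against a simulation of two spinners added together. The names `triangular_area_below` and `simulated_proportion_below` are hypothetical.

```python
import random

def triangular_area_below(s):
    """Area under the symmetric triangular density on 0..2 to the left of s."""
    if s <= 0:
        return 0.0
    if s >= 2:
        return 1.0
    if s <= 1:
        return s * s / 2          # triangle with base s and height s
    return 1 - (2 - s) ** 2 / 2   # whole area (1) minus the right-hand triangle

def simulated_proportion_below(s, n_pairs=100_000, seed=1):
    """Spin two 0-to-1 spinners n_pairs times; share of sums that fall below s."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_pairs) if rng.random() + rng.random() < s)
    return hits / n_pairs

for s in (0.5, 1.0, 1.5):
    print(s, triangular_area_below(s), round(simulated_proportion_below(s), 3))
```

The simulated proportions should land close to the areas 0.125, 0.5 and 0.875.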
Median, mean, percentiles, standard deviation are defined for a density model in analogy to those for a histogram.
-- median has half of area below and half above.
-- mean is balance point.  On the long-tail side of median if distribution is skewed. Same as median if symmetric.
--First quartile has 1/4 of area below, 3/4 above. Etc. for others.
--Greek labels "mu" for mean and "sigma" for std. dev. of a Density.
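The "area below" definitions of median and quartiles can be tried out on the triangular density on 0 to 2. This is a rough Python sketch under that assumption; the helper names `triangular_area_below` and `value_with_area_below` are invented for illustration, and the search method (bisection) is just one convenient choice.

```python
def triangular_area_below(s):
    """Area under the symmetric triangular density on 0..2 to the left of s."""
    if s <= 0:
        return 0.0
    if s >= 2:
        return 1.0
    if s <= 1:
        return s * s / 2
    return 1 - (2 - s) ** 2 / 2

def value_with_area_below(area_below, target, lo, hi, tol=1e-9):
    """Bisection search for the value that has `target` area to its left."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if area_below(mid) < target:
            lo = mid   # not enough area yet: move right
        else:
            hi = mid   # enough (or too much) area: move left
    return (lo + hi) / 2

q1 = value_with_area_below(triangular_area_below, 0.25, 0, 2)
median = value_with_area_below(triangular_area_below, 0.50, 0, 2)
q3 = value_with_area_below(triangular_area_below, 0.75, 0, 2)
print(round(q1, 4), round(median, 4), round(q3, 4))
```

Because this density is symmetric, the median comes out at the middle (1), which is also the mean, and the quartiles sit equally far on either side of it.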
Complex models require tables to find proportions.
Make some cumulative proportion tables: Handout Density  (Solutions)

(Got to here Monday.)
"Normal" Density:
("Gaussian", "Bell-shaped")  Normal Curve Applet  http://www.whfreeman.com/ips7e/

• What "mechanism"?  The thing measured is the result of many small independent influences.
• Family:  all the same basic shape, except for measurement units.
• Mean gives middle, standard deviation gives spread.
• Knowing the parameters mean (µ, "mu") and standard deviation (σ, "sigma") tells you everything. Notation: "N(mean, s.d.)"
• Symmetric.  Mean at middle. Curvature changes at 1 standard deviation on either side of mean.
• "Standard normal": mean = 0, s.d. = 1.  Standardized: how many s.d.'s from the mean.
• "Middle s.d.": a center region one standard deviation wide (from 1/2 s.d. below the mean to 1/2 s.d. above) spans about 38%. (Not in IPS)
• 68-95-99.7 rule:  68% within 1 s.d. of mean, 95% within 2 s.d. of mean, 99.7% within 3 (roughly).

• What percent are further than 3 s.d. from the mean?  What percent are more than 2 s.d. above the mean?  Etc.
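The rule-of-thumb percentages can be checked against the exact standard normal areas. A minimal sketch, assuming Python 3.8 or later so that `statistics.NormalDist` is available; the helper name `within` is made up here.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, s.d. 1

def within(k):
    """Proportion of a normal distribution within k s.d.'s of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} s.d. of mean: {within(k):.4f}")

print(f"middle region one s.d. wide: {within(0.5):.4f}")
print(f"further than 3 s.d. from the mean: {1 - within(3):.4f}")
print(f"more than 2 s.d. above the mean: {1 - Z.cdf(2):.4f}")
```

The rule gives (100% - 95%)/2 = 2.5% above 2 s.d.; the exact area is a bit smaller, about 2.3%, because 95% is itself rounded.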
Back-of-the-envelope:  Sketch the curve, mark mean, ±1, ±2, ±3 s.d.'s -- in real units.
Example:  A standard psychology test "W" has  scores that are approximately N(110, 25).
mean = 110, mean + 1 s.d. = 135, mean + 2 s.d.'s = 160,  mean – 1 s.d. = 85, etc.  See picture below.
Standardizing: A "raw value" x is standardized by telling how many standard deviations above the mean it is.
Find z:  Subtract the mean from x.  Now you know how far "above" the mean x is, in "raw" units. (If it's below the mean, the number will be negative.)  Find how far this is in "standard deviations" by dividing by the standard deviation.
That's the z-score.

Standardizing:   A way of comparing an individual against its pack.
Comparing individuals from different packs, each relative to its own.
Removes "units of measurement" from the discussion.
Enables use of the standard normal table.

Examples: Psychology test "W" scores are approximately N(110, 25)
A score of 85 is 1 s.d. below the mean.  Computation:  z = (85 – 110)/25 = (–25 raw points)/25 = –1 s.d. from mean.
(About the 16th percentile--16% get scores < 85)
145 is how many s.d.'s above the mean?
Computation: z = (145 – 110)/25 = (35 raw points above mean)/25 = 1 2/5 = 1.4 s.d. above mean
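The standardizing recipe (subtract the mean, divide by the s.d.) as a tiny Python sketch, using the "W" test values N(110, 25) from these notes. The function name `z_score` is hypothetical.

```python
def z_score(x, mean, sd):
    """How many standard deviations x lies above the mean (negative if below)."""
    return (x - mean) / sd

print(z_score(85, 110, 25))    # -1.0
print(z_score(145, 110, 25))   # 1.4
```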

~ ~~ ~ ~ ~ ~ ~ ~ ~ ~ First standard normal table use, then with "real" values~ ~ ~ ~ ~ ~ ~ ~ ~

OPTIONAL aids:  Normal curve template (you can count squares), Normal practice handout
Standard Normal N(0, 1).  Our tables give area to the left of a z value--"Cumulative Proportion".  Table A, back flyleaf
Using standard normal table:  p. 77
z   |  .00     .01     .02   ...
... |
1.4 | .9192   .9207   .9222  ...
P(Z < 1.40) = .9192,   P(Z < 1.41) = .9207  P(Z < 1.42) = .9222.
If z has more than 2 decimal places, round to 2.
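The row of the table shown above can be reproduced by computer. A sketch assuming Python 3.8+ (`statistics.NormalDist`); the helper `table_area` is made up to mimic how a printed cumulative table is read, with z rounded to 2 places and area to 4.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal N(0, 1)

def table_area(z):
    """Cumulative proportion (area to the left of z), rounded like a printed table."""
    return round(Z.cdf(round(z, 2)), 4)

for z in (1.40, 1.41, 1.42):
    print(f"P(Z < {z:.2f}) = {table_area(z):.4f}")
```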

Sketch the density, mark the area you're looking for.
Figure out how to get it using areas to the left of one or more z-values.
Think cutting up paper bell-curves. (Remember whole area is 1.)  Like handout.

Example:  Proportion of observations between 0.5 and 1.4  P(0.5 < Z <1.4) =
Proportion of observations below 1.4  minus Proportion of observations below 0.5
P (Z < 1.4)  -  P(Z < 0.5)  = .9192 - .6915 = .2277

Example:  Proportion of observations above  0.5,    P( Z > 0.5) =
ONE minus proportion of observations below 0.5,   1 -  P(Z < 0.5) = 1-.6915 = .3085
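Both "cutting up the bell curve" examples, done with areas to the left only. A hedged Python sketch (Python 3.8+ assumed); the function names `area_between` and `area_above` are invented for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal N(0, 1)

def area_between(a, b):
    """P(a < Z < b): area left of b minus area left of a."""
    return Z.cdf(b) - Z.cdf(a)

def area_above(a):
    """P(Z > a): whole area (1) minus area left of a."""
    return 1 - Z.cdf(a)

print(f"P(0.5 < Z < 1.4) = {area_between(0.5, 1.4):.4f}")
print(f"P(Z > 0.5)       = {area_above(0.5):.4f}")
```

The computer carries more decimal places than the table, so the first answer can differ from the table-based .2277 by 1 in the fourth decimal place.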
What z value has area ..... to the left/right of it?
Sketch  roughly.
Restate (if needed) as "What z value has area A to the LEFT of it."
Look in body of table for the value closest to A.
Go to edge(s) of table to find what z that goes with.
Example:  "What z value has 10%  of the observations above it?"  This is the same z as the one for:
"What z value has 90% of the observations below (to the left of) it." (What z is the 90th percentile.)

Find in the table  .8997 and .9015 --  .9000, our number, is between them.
.8997 is closer to .9000 (off by .0003, versus .0015 for .9015), so use it.
For .8997, the z value is 1.28.   1.28 is the 90th percentile.
1.28 has 10% of the observations above it.
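The same "look in the body of the table" step can be done by computer with an inverse lookup. A minimal sketch assuming Python 3.8+, where `NormalDist.inv_cdf` returns the z value with a given area to its left.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal N(0, 1)

# "10% above" restated as "90% to the LEFT".
area_left = 1 - 0.10
z90 = Z.inv_cdf(area_left)
print(f"z with 90% below it: {z90:.4f}")
print(f"rounded as in the table: {round(z90, 2)}")
```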

Real/Raw data:
Check with
Normal Curve  Applet  http://www.whfreeman.com/ips7e/
"What proportion" problems:
•     Sketch a normal curve. Mark mean, 1, 2 s.d.'s.  Label with raw values, and z-values below.
•     Mark end points for problem, roughly, and shade area desired.
•     Standardize end point(s).  Use standard normal table to find area. (Draw helper sketches if needed)
•     Check picture to see if it's plausible.

Example:  Proportion with scores between 100 and 145?

x = 145 gives z = 1.4  (done above.)      Area to left of z = 1.4 is .9192
x = 100 gives z =  –.4                           Area to left of z = –.4 is  .3446
Desired area = difference = .5746;  about 57%.  Looks about right from the picture.  Or, in symbols:
P ( 100 < X < 145)  = P ( –.4 < Z < 1.4) = P( Z < 1.4) – P(Z < –.4) = .9192 – .3446 = .5746

Read "Proportion of x's with 100 <x<145"  for P(100<X<145)
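The "W" test example, done two ways in a short Python sketch (Python 3.8+ assumed): first by standardizing and using the standard normal, then directly in raw units. The variable names are made up for illustration.

```python
from statistics import NormalDist

mean, sd = 110, 25          # "W" scores are approximately N(110, 25)
Z = NormalDist()            # standard normal
W = NormalDist(mean, sd)    # same curve in raw units

# Way 1: standardize the end points, then use areas to the left.
z_low = (100 - mean) / sd   # -0.4
z_high = (145 - mean) / sd  # 1.4
p_standardized = Z.cdf(z_high) - Z.cdf(z_low)

# Way 2: work directly with raw scores.
p_raw = W.cdf(145) - W.cdf(100)

print(f"{p_standardized:.4f} {p_raw:.4f}")
```

Both ways give about 57%, agreeing with the table answer .5746 to three decimal places.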
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Backward problems"  "What raw (x) value has area ___ to the left/right of it?"
Sketch  the curve, labeled with x values and z values, and the Area, roughly.
Restate (if needed) as "What z value has area A to the LEFT of it."
Look in body of table for the value closest to A.
Go to edge(s) of table to find what z that goes with.
Convert the z to an x: z is the number of standard deviations above the mean.
Multiply z by the size of 1 standard deviation.  Now you have distance above the mean, measured in raw units.
Add the mean.  Now you have the "raw" value x. (You have "unstandardized" it.)
Example: "W" test:  What x value has 10%  of the observations above it?  This is the same x as the one for:
What x value has 90% of the observations below (to the left of) it.

The table gives z = 1.28, approximately.
The "W" score x= mean + z (s.d.) =  110 + 1.28 (25)=  110 + 32  = 142
Percentiles:  a "W" score of 142 has 90% of the scores at or below it.  142 is the 90th percentile.
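The backward problem as a Python sketch (Python 3.8+ assumed), following the steps above: find z from the area to the left, then unstandardize. The function name `unstandardize` is hypothetical.

```python
from statistics import NormalDist

mean, sd = 110, 25  # "W" scores are approximately N(110, 25)

def unstandardize(z, mean, sd):
    """Convert a z-score back to a raw value: x = mean + z * (s.d.)."""
    return mean + z * sd

# 10% above is the same as 90% to the left.
z = round(NormalDist().inv_cdf(0.90), 2)   # 1.28, as read from the table
x = unstandardize(z, mean, sd)
print(f"z = {z}, x = {x:.0f}")

# Direct check in raw units, without rounding z first.
x_direct = NormalDist(mean, sd).inv_cdf(0.90)
print(f"direct: {x_direct:.1f}")
```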

Math251-Fall11/Dayq5.htm 1:30pm 9/5/11
This page belongs to Sally Sievers who is solely responsible for its content.