| Hand in:
Sec. 2.6: Transforming: For the following problems you may need to transform your x- or y-data to a new variable in SPSS. Use Transform > Compute, with the function LG10( ) for log base 10, LN( ) for natural log, and x^3 for x cubed. Use log base 10 unless told otherwise, though it really doesn't matter much. Problems are on the handout. SPSS files will be linked from here once
I get them tracked down and relabeled this afternoon. Solutions
9.9 a, b, c, d nonresponse
|
Read, discuss
2.134, 2.135 strength, weight. |
Optional
|
2.5 Causation: The rooster believes the sun will
not rise if he doesn't crow...
A silly website
spoofs statistical obfuscation in general, especially
mindless data crunching.
- - - - - - - - - - - - - - - - - - - - - - - -
Transforming variables (handout, plus Sec. 2.6) See
Day 12
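A minimal Python sketch of the three transformations named in the hand-in note (log base 10, natural log, cubing), offered only as an illustration outside SPSS. The data values are hypothetical, not from the handout. It also suggests why the choice of log base "doesn't matter much": the two logs differ only by a constant factor, so a plot keeps the same shape.

```python
import math

# Hypothetical positive data values (an assumption; not from the handout).
x = [1.0, 10.0, 100.0, 1000.0]

log10_x = [math.log10(v) for v in x]  # log base 10 (what LG10( ) computes)
ln_x = [math.log(v) for v in x]       # natural log (what LN( ) computes)
cubed_x = [v ** 3 for v in x]         # x cubed

# ln(x) = ln(10) * log10(x): the two log transforms differ by a constant factor.
ratios = [ln_v / lg_v for ln_v, lg_v in zip(ln_x, log10_x) if lg_v != 0]

print(log10_x)
print(ratios)
```

Every entry of `ratios` is ln(10), about 2.303, so switching bases only rescales the axis.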
- - - - - - - - - - - - - - - - - - - - -
Relationships: We know how to analyze/summarize quantitative
vs. quantitative (scatterplot), and categorical vs. quantitative
(side-by-side histograms, stemplots, boxplots). Now:
Categorical vs. Categorical
Sec. 9.1 "Two-way tables"
Three names for the same thing: "Two-way table", "Contingency
table", "Crosstab(ulation)". Example: Hair color vs. Class year.
A thousand people are interviewed by the Census Bureau, and the results
are tabulated in this two-way table.
Working Status vs. Sex.
| | Women | Men | Total |
| In Labor Force | 350 | 450 | 800 |
| Not in Labor Force | 150 | 50 | 200 |
| Total | 500 | 500 | 1000 |
What is the "Percent of women in the labor force"?
Write your answer down on a scrap of paper.
When you write or see percents, be clear what
is on the bottom of the fraction (even if it takes longer to
say)!!
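To see why the bottom of the fraction matters, here is a small Python sketch using the counts from the Working Status vs. Sex table above. The variable names are illustrative only. The same 350 women give three different percents depending on the denominator.

```python
# Counts from the Working Status vs. Sex table.
women_in_lf = 350   # women in the labor force
women_total = 500   # all women
in_lf_total = 800   # everyone in the labor force
grand_total = 1000  # everyone interviewed

# Of all women, what percent are in the labor force?
pct_of_women = 100 * women_in_lf / women_total

# Of everyone in the labor force, what percent are women?
pct_of_labor_force = 100 * women_in_lf / in_lf_total

# Of everyone interviewed, what percent are women in the labor force?
pct_of_everyone = 100 * women_in_lf / grand_total

print(f"{pct_of_women}% of women are in the labor force")
print(f"{pct_of_labor_force}% of the labor force are women")
print(f"{pct_of_everyone}% of all those interviewed are women in the labor force")
```

Each printed sentence names its denominator, which is the habit the note above asks for.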
From the New Yorker magazine, traditionally
the most literary and error-free of all, Feb.14/21, '05:
CORRECTION: The Mail of January 3rd contained the incorrect statistic that four-fifths of Bush voters identified moral values as the most important factor in their decision. In fact, four-fifths of those identifying moral values as the most important factor of their decision were Bush voters.
Marginal distribution: Distribution of one variable, ignoring/summing over the other.
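As a sketch of what "summing over the other variable" means, the Python below computes both marginal distributions from the Working Status vs. Sex counts. The `marginal` helper and the dictionary layout are assumptions made for this illustration.

```python
# Counts from the Working Status vs. Sex table; keys are (status, sex).
counts = {
    ("In Labor Force", "Women"): 350,
    ("In Labor Force", "Men"): 450,
    ("Not in Labor Force", "Women"): 150,
    ("Not in Labor Force", "Men"): 50,
}

def marginal(counts, position):
    """Percent in each level of one variable, summing over the other.

    position 0 selects working status, position 1 selects sex.
    """
    totals = {}
    for key, n in counts.items():
        level = key[position]
        totals[level] = totals.get(level, 0) + n
    grand = sum(totals.values())
    return {level: 100 * n / grand for level, n in totals.items()}

status_marginal = marginal(counts, 0)  # working status, ignoring sex
sex_marginal = marginal(counts, 1)     # sex, ignoring working status

print(status_marginal)
print(sex_marginal)
```

These are the percents that sit in the "Total" row and "Total" column of the table: 80% / 20% for working status and 50% / 50% for sex.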
Conditional distribution: Distribution of one variable,
with the individuals being only those which satisfy a condition in the
other variable.
For women, their conditional distribution
as to working status; For
men, their distribution as to working status.
"Column %s"--columns add to 100%: "conditional distributions
of working status by sex".
| | Women | Men | Total |
| In Labor Force | 350/500 = 70% | 450/500 = 90% | 80% |
| Not in Labor Force | 150/500 = 30% | 50/500 = 10% | 20% |
| Total | 500/500=100% | 500/500=100% | 100% |
For those in the labor force, conditional
distribution as to sex.
For those not in the labor
force, conditional distribution as to sex.
"Row %s"--rows add to 100%: "conditional distributions of sex by working
status."
| | Women | Men | Total |
| In Labor Force | 350/800 = 43.8% | 450/800 = 56.2% | 800/800=100% |
| Not in Labor Force | 150/200 = 75% | 50/200 = 25% | 200/200=100% |
| Total | 50% | 50% | 100% |
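The row and column percents can be reproduced with a short Python sketch. The helper names `row_percents` and `transpose` are made up for this illustration. Column percents are just row percents of the flipped table.

```python
# Counts from the Working Status vs. Sex table: rows are status, columns are sex.
counts = {
    "In Labor Force": {"Women": 350, "Men": 450},
    "Not in Labor Force": {"Women": 150, "Men": 50},
}

def row_percents(table):
    """Divide each cell by its row total, so every row adds to 100%."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: 100 * n / total for col, n in cells.items()}
    return result

def transpose(table):
    """Swap rows and columns."""
    flipped = {}
    for row, cells in table.items():
        for col, n in cells.items():
            flipped.setdefault(col, {})[row] = n
    return flipped

# Row %s: conditional distributions of sex by working status.
sex_given_status = row_percents(counts)

# Column %s: conditional distributions of working status by sex.
status_given_sex = row_percents(transpose(counts))

print(sex_given_status)
print(status_given_sex)
```

The same cell (350) comes out as 43.75% in one table and 70% in the other, because the condition, and so the denominator, has changed.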
Graphs to compare proportions: parallel sets of
bar graphs (see text, p. 603).
Segmented (stacked) bar charts of % (so the
total length is the same; redundant if there are only 2 segments):
% Women O
% Men X
OOOOOOOOOOOOOOXXXXXXXXXXXXXXXXXX In Labor Force
OOOOOOOOOOOOOOOOOOOOOOOOXXXXXXXX Not in Labor Force
Can do segmented bars of raw numbers, which conveys different info:
25 Women O
25 Men X
OOOOOOOOOOOOOOXXXXXXXXXXXXXXXXXX In Labor Force
OOOOOOXX Not in Labor Force
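A rough Python sketch for drawing both kinds of text bars from the table counts. The function name `segmented_bar` and the 32-symbol width for the percent bars are choices made for this illustration.

```python
def segmented_bar(women, men, per_symbol):
    """One text bar: an O per `per_symbol` women, then an X per `per_symbol` men."""
    return "O" * round(women / per_symbol) + "X" * round(men / per_symbol)

# (women, men) counts from the Working Status vs. Sex table.
rows = {"In Labor Force": (350, 450), "Not in Labor Force": (150, 50)}

# Raw numbers: each symbol stands for 25 people, so bar lengths differ.
for label, (women, men) in rows.items():
    print(segmented_bar(women, men, 25), label)

# Percents: rescale each row to 32 symbols, so the bars have equal length.
WIDTH = 32
for label, (women, men) in rows.items():
    total = women + men
    print(segmented_bar(WIDTH * women / total, WIDTH * men / total, 1), label)
```

The raw-number bars show how many fewer people are outside the labor force; the percent bars hide that and show only the split by sex. With other counts, rounding may leave the percent bars a symbol short or long.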
Next: SPSS (first handout, p. 6), Simpson's paradox.
Then back to Ch. 3
| Sievers home | Math251-Fall05/Dayps13.htm | 1:45pm | 9/23/05 |