CS 103 Day 5, Wednesday April 4, 2001
Class notes:
Deriving information from data: Chapter 3, pp. 62-72
-
Organizing batch of quantitative data using frequency table,
cumulative frequency table, displaying as histogram. (Bar chart
or
Pareto (ordered bar chart) for qualitative data.)
-
Summary numbers to describe the data set (these are some):
-
Percentiles: Special ones are Median, 1st, (2nd), 3rd quartile.
Min and Max.
-
Spread: Range (=Max - Min) and Interquartile range (IQR)
=3rd quartile - 1st quartile
-
IQR tells the spread of the middle half of the data.
-
The Mean (p. 62) is the measure of middle (p. 85) we usually
think of as the "average". You find it by summing all the
items and dividing by the number of items. The median is another
measure of middle, with half the items smaller and half larger. The
mode,
or modal interval, is the most common ("popular?") item or interval.
-
"Five-number-summary"= Min, 1st quartile, Median, 3rd quartile, Max.
Used
for boxplot (next)
Learning Excel: Chapter 3, pp. 62-72
-
p. 64 Analysis ToolPak add-ins. Add-ins tend to be more
fragile than the base package--stay mellow. Pitfall:
add in Analysis ToolPak AND Analysis ToolPak VBA.
(VBA stands for Visual Basic--needed to draw the histograms here.
If you leave it off, the error message you get is very misleading!)
-
p.68 Histogram. We'll learn how to change the looks
of charts in more detail later.
-
In the histogram options is "Pareto": "ordered histogram". This
is only suitable for bar-chart (qualitative) data, since the bin
ranges are pretty arbitrary and the order can change. But very nice
for qualitative data.
-
Statistician's quibbles: A histogram of continuous data should
not have bars with spaces between them. Bars with spaces between
are for "bar charts"--for categorical/qualitative data. (In
fact, the "add-in" is using the Excel "bar chart" to make this histogram.)
Also, the histogram labels should actually label the right hand edges of
the bars, not the middles as they do (which would be appropriate
for the bar chart Excel thinks it's making)
-
If you want to try getting rid of the spaces between the bars, try
this: Click on a bar. Then go to the Format menu>Data series>Options:
Set gap width to 0.
-
p. 69 Changing bin intervals:
-
Problem: When
you return from making a histogram, the spreadsheet will have scrolled
way down in the list. Because the top row is in Freeze Pane mode,
you may not notice this. Be sure you have scrolled all the way to
the top (have rows 1, 2, 3...) before starting this section, or your bin
values will be lost in space.
-
#4 You want your bin values to go from 1000 to 7500
by 500's.
-
#7 Error Select
H1:H15 not H2:H15.
You will check the labels box in #8, which will make the first row a label.
So you need to include "New Bin", the label, in your range.
-
p. 71 Add two
rows to the summary statistics:
IQR
(Number)
Mean (Number)
Here's how: In A26 type IQR,
in
B26 type = B23-B21 (i.e. 3rd quartile -1st quartile)
In A27 type Mean, in B26 type =Average(Price)
Next: Ch. 3 continued. Read ahead for language and content.
Errors: p. 79-80, A19 and A22 should be either 1st and 3rd Q (quartile)
OR
25th and 75th P (Percentile)
To
Sievers Home Page
CS103-Sp01/day5.htm 4/4/01
This page belongs to Sally Sievers who is solely responsible
for its content. Please see our statement
of responsibility.