CS 103 Day 6, Friday April 6, 2001

Class notes:
Deriving information from data: Chapter 3, pp. 72-87
Learning Excel: Chapter 3, pp. 72-87
Next: More on logarithms.
Chapter 4: The details of making graphs have changed, so don't try to work ahead. You can read for the goals: what can we learn from these data sets? Look for statements like "the chart shows..."

Assignments available as a handout
From Chapter 3, pp. 87-89
 Unless otherwise noted, print out and hand in the spreadsheet showing what you created. 
 Some questions call for written answers based on your examination of the data.  This is just as
 important as making the machinery produce the right tables!  You may write on the same page as the
 printed spreadsheet. 

Assignment 5, Due Wednesday, April 11 (Day 8, in Week 3)
#1.  Make the boxplot with pencil and paper, by hand!  Do a plain vanilla boxplot.  (boxplot, skewness) 

#9a, b.  Boxplots in bunches give a compact way to compare related data sets. 
       Use the original POLU.XLS file.  Put your boxplot below your data by using "Send output to: Cell: A19". 
       Stretch the graph sideways so each boxplot is more or less under its column of data. 
        PRINT OUT the sheet with the data and the boxplots. 
        For part b:  Describe the general trend in air pollution over time, and discuss the outlier(s). 

#3, all.  (boxplots, sales per employee)  Use the 3WBUS.XLS file from p. 77.
    (You don't need any of the things added to the file in the text to do this problem.)
    "Sales per $1000" means sales measured in thousands of dollars.  The formula should be
    sales/1000/employees
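
As a quick check on the arithmetic in that formula, here is a minimal Python sketch of the same calculation. The function name and the sample figures are made up for illustration; they are not taken from 3WBUS.XLS.

```python
def sales_per_employee_thousands(sales_dollars, employees):
    """Sales per employee, measured in thousands of dollars."""
    # Same order of operations as the spreadsheet formula: sales/1000/employees
    return sales_dollars / 1000 / employees

# Hypothetical firm: $2,500,000 in sales and 20 employees.
print(sales_per_employee_thousands(2_500_000, 20))  # 125.0
```

Dividing by 1000 first and then by the number of employees gives the same result as dividing sales by (1000 * employees).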

Assignment 6, Due Friday, April 13 (Day 9, in Week 3)
        Note: PERCENTILE(range, percentile) needs the percentile in decimal form. Use .20 for 20%, etc.
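
To see why the decimal form matters, here is a rough Python sketch of an interpolated percentile. It follows my understanding of how Excel's PERCENTILE locates a value (position k*(n-1) in the sorted data, with linear interpolation); treat that as an assumption rather than a specification. The name percentile_inc and the data are made up.

```python
import math

def percentile_inc(values, k):
    """Interpolated percentile; k must be a fraction between 0 and 1."""
    if not 0 <= k <= 1:
        # Passing 20 instead of 0.20 is the mistake the note warns about.
        raise ValueError("use a decimal such as 0.20 for 20%")
    data = sorted(values)
    pos = k * (len(data) - 1)          # zero-based fractional position
    lo, hi = math.floor(pos), math.ceil(pos)
    # Interpolate between the two neighboring sorted values.
    return data[lo] + (pos - lo) * (data[hi] - data[lo])

sample = [100, 200, 300, 400, 500]     # made-up data
print(percentile_inc(sample, 0.20))    # 20th percentile
print(percentile_inc(sample, 0.50))    # median
```

Calling percentile_inc(sample, 20) raises an error here, which is the sketch's stand-in for what goes wrong when the percentile is not given as a decimal.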
#4, all.  (Baseball) Print histogram, frequency table, boxplot, statistics in part b, and filtered data. (You can do this before you do logarithms, but hand it in with the rest of Asst. 6.)

#5. (modified)  Transform baseball salary data using the log transformation, making a variable log_salary. 

  • Make and print a boxplot of the transformed data (log_salary), and compare it to the one in problem 4.
  • Print a histogram for the transformed data (log_salary).
  • Answer 5a and 5b.
  • Do 4b and 4c (that is, find the 10th and 90th percentiles, and filter for the players in the upper 10%) on the transformed data (log_salary).  Do you get the same list of players in the top 10% using log_salary as you got for the original salary data?  Why/why not?
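
The mechanics of problem 5 (transform, find a percentile cutoff, filter) can be sketched in Python as follows. This is only an illustration under stated assumptions: the salaries are made up, base-10 logs are assumed (the text may use a different base), and percentile_inc and upper_tail are hypothetical helper names, not Excel functions.

```python
import math

def percentile_inc(values, k):
    """Interpolated percentile; k is a fraction such as 0.90."""
    data = sorted(values)
    pos = k * (len(data) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (pos - lo) * (data[hi] - data[lo])

def upper_tail(by_name, k):
    """Names whose value is at or above the k-th percentile cutoff."""
    cutoff = percentile_inc(by_name.values(), k)
    return sorted(name for name, v in by_name.items() if v >= cutoff)

# Made-up salaries in dollars; NOT the baseball data from the text.
salary = {"A": 150_000, "B": 400_000, "C": 1_200_000,
          "D": 5_000_000, "E": 250_000}

# The transformed variable (assumption: log base 10).
log_salary = {name: math.log10(s) for name, s in salary.items()}

for name in salary:
    print(f"{name}: salary={salary[name]:>9,}  log_salary={log_salary[name]:.3f}")

# Filter for the upper 10% on the transformed data.
print(upper_tail(log_salary, 0.90))
```

The same upper_tail call works on the salary dictionary, so you can run both filters on your own data and compare the lists yourself.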
#2. a,b,e  (Wisc. business: compare skewness, kurtosis for raw and log data) 
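
For reference, here is a small Python sketch of sample skewness and kurtosis. To my understanding these are the formulas behind Excel's SKEW and KURT, but take that as an assumption and check your results against the spreadsheet. The function names and the data are made up; this is not the Wisconsin business file.

```python
import math
from statistics import mean, stdev

def skew(xs):
    """Sample skewness (assumed to match Excel's SKEW)."""
    n = len(xs)
    m, s = mean(xs), stdev(xs)          # stdev is the n-1 version
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in xs)

def kurt(xs):
    """Sample excess kurtosis (assumed to match Excel's KURT)."""
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    z4 = sum(((x - m) / s) ** 4 for x in xs)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    return lead * z4 - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

raw = [1, 2, 3, 4, 50]                  # made-up, long right tail
logged = [math.log10(x) for x in raw]   # assumption: log base 10

print("raw:   ", skew(raw), kurt(raw))
print("logged:", skew(logged), kurt(logged))
```

A positive skewness means the long tail is on the right. Compare the two printed lines to see what the log transformation does to this small example.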


CS103-Sp01/day6.htm   4/10/01
This page belongs to Sally Sievers who is solely responsible for its content. Please see our statement of responsibility.