Assn 1: Simple SELECTs

Due: 5:00pm, 5 Feb 2014. Value: 25 points.

Your assignment is to compose SQL queries for a variety of tasks based on a pre-built database.

The database

To get the SQLite database, you can use one of two techniques:

graddata.db: Download and use the pre-built database.
dbcreate.py: Run the Python program to create the database file.

(I mention the second technique just in case the first file somehow has an incompatible format.)

To start SQLite, enter the directory where the file is located and issue the command “sqlite3 graddata.db”. From there, you can simply type SQL queries at the keyboard. (You can use the arrow keys to go back through your history and to edit the previous queries.) However, I recommend opening a text editor as well, composing your query in the text editor, and then copying and pasting into SQLite.

The database includes three tables based on publicly available data at the U S Department of Education, including data reported by each college on how many students from the college graduated with each possible major. Rather than working with their full data, though, this database includes just nine small Southern private colleges (Birmingham-Southern College, Centenary College of Louisiana, Centre College, Hendrix College, Millsaps College Rhodes College, Sewanee — University of the South, Southwestern University, and Trinity University), with Bachelor's degree data only. It includes data for four years, academic year 2008–09 through 2011–12.

tablecolumns
colleges id a numeric key for the college (integer)
name the name of the college (text)
city the name of city where the college is (text)
state the two-letter postal state abbrevation (text)
zip the zip code (text)
majors: cip a numeric key for the major (integer)
major the name of the major (text)
grads: id a reference to colleges.id (integer)
year the year the data is from, between 2009 and 2012 (integer)
cip a reference to majors.cip (integer)
isfirst 1 if this count is of first majors, 0 for second majors (integer)
num the number of students in this category (integer)

The problems

Your assignment is to compose and test queries that accomplish each of the following using the above-described database. Your solution file, submitted through Moodle, should be an ASCII text file including your solutions. Please number your solutions according to the below list.

  1. For each college, list the number of 2009 graduates with a first major in computer science (CIP 110701), along with the college's name. (Don't worry about colleges with no computer science students.)

  2. Do the same, but include colleges with no computer science students. I found I needed to move the conditions in the WHERE clause into the JOIN's ON clause so that zero-student colleges are included.

  3. For each college, list the total number of students graduating across all majors (count first major only) over the four-year period, along with the college's name.

  4. For each college, list the total number of students majoring in computer science (first or second major) over the four-year period, along with the college's name, sorted in increasing order by the total number of computer science graduates. (Don't worry about colleges with no computer science students.)

  5. List the number of students completing each major at Hendrix (ID 107080) over the four-year period, along with the major name, sorted by the number of students.