Assn 1: Simple SELECTs
Due: 5:00pm, 5 Feb 2014. Value: 25 points.
Your assignment is to compose SQL queries for a variety of tasks based on a pre-built database.
The database
To get the SQLite database, you can use one of two techniques:
graddata.db:Download and use the pre-built database. dbcreate.py:Run the Python program to create the database file.
(I mention the second technique just in case the first file somehow has an incompatible format.)
To start SQLite, enter the directory where the file
is located and issue the command
“sqlite3 graddata.db”.
From there, you can simply type SQL queries at the keyboard. (You
can use the arrow keys to go back through your history and
to edit the previous queries.)
However, I recommend opening a text editor as well,
composing your query in the text editor, and then copying and
pasting into SQLite.
The database includes three tables based on publicly available data at the U S Department of Education, including data reported by each college on how many students from the college graduated with each possible major. Rather than working with their full data, though, this database includes just nine small Southern private colleges (Birmingham-Southern College, Centenary College of Louisiana, Centre College, Hendrix College, Millsaps College Rhodes College, Sewanee — University of the South, Southwestern University, and Trinity University), with Bachelor's degree data only. It includes data for four years, academic year 2008–09 through 2011–12.
table columns colleges:ida numeric key for the college (integer) namethe name of the college (text) citythe name of city where the college is (text) statethe two-letter postal state abbrevation (text) zipthe zip code (text) majors:cipa numeric key for the major (integer) majorthe name of the major (text) grads:ida reference to colleges.id(integer)yearthe year the data is from, between 2009 and 2012 (integer) cipa reference to majors.cip(integer)isfirst1 if this count is of first majors, 0 for second majors (integer) numthe number of students in this category (integer)
The problems
Your assignment is to compose and test queries that accomplish each of the following using the above-described database. Your solution file, submitted through Moodle, should be an ASCII text file including your solutions. Please number your solutions according to the below list.
For each college, list the number of 2009 graduates with a first major in computer science (CIP 110701), along with the college's name. (Don't worry about colleges with no computer science students.)
Do the same, but include colleges with no computer science students. I found I needed to move the conditions in the WHERE clause into the JOIN's ON clause so that zero-student colleges are included.
For each college, list the total number of students graduating across all majors (count first major only) over the four-year period, along with the college's name.
For each college, list the total number of students majoring in computer science (first or second major) over the four-year period, along with the college's name, sorted in increasing order by the total number of computer science graduates. (Don't worry about colleges with no computer science students.)
List the number of students completing each major at Hendrix (ID 107080) over the four-year period, along with the major name, sorted by the number of students.

