More with SQL SELECT

Contents:
1. The ORDER BY clause
2. Joins
3. Options for WHERE clause
4. Table aliases
5. Subqueries
6. The HAVING clause
7. Using NULL
8. Aggregate functions
9. Combining results

We'll examine SQL SELECT queries based on the following three relations for storing information about student grades stored in courses.

students(id, name)
grades(id, num, grade)
sections(num, title, credits, instructor_id)

1. The ORDER BY clause

A SELECT query can optionally have at its end an ORDER BY clause that sorts the output. The following shows all students' names in alphabetical order, for example.

SELECT name FROM students ORDER BY name

Or if we want it in reverse order, we would do the following:

SELECT name FROM students ORDER BY name DESC

By default this does case-sensitive ordering (so that Decker would follow DeLeon, since the ASCII value for c is greater than the ASCII value for L.) If we want it a case-insensitive ordering (so that Decker comes before DeLeon), we can sort of the all-capitals version of each name:

SELECT name FROM students ORDER BY UPPER(nameDESC

2. Joins

Suppose we want to show total credits earned by each student (excluding those who never enrolled, and F's don't count for credit):

SELECT nameSUM(credits)
FROM students JOIN grades ON students.id = grades.id
              JOIN sections ON grades.num = sections.num
WHERE grade <> 'F'
GROUP BY students.id

Both of these, though, would exclude students who never completed any courses successfully. To include these students, we need LEFT JOIN.

SELECT nameSUM(credits)
FROM students LEFT JOIN
        (grades JOIN sections ON grades.num = sections.num)
     ON students.id = grades.id
WHERE grade <> 'F'
GROUP BY students.id

3. Options for WHERE clause

Let's suppose we also want to exclude from our sum courses where the student earns a W (for withdrawing from the course). The simple way is to use:

WHERE grade <> 'F' AND grade <> 'W'

But we could also use NOT IN:

WHERE grade NOT IN ('F''W')

There's also a LIKE operator that provides basic regular expressions: Suppose we want to list all CSCI courses.

SELECT numtitle FROM sections WHERE num LIKE 'CSCI%'

A percent sign stands for any sequence of characters, and an underscore stands for any single character. All other characters stand for themselves, though comparison is case-insensitive.

Another somewhat unusual operator is BETWEEN. This lists courses whose credit value is between 0.01 and 0.99 (inclusive).

SELECT numtitle FROM sections WHERE credits BETWEEN 0.01 AND 0.99

4. Table aliases

Suppose we want to list all students who have taken a course from an instructor named Burch. In this case, we want to join students to grades to sections to students. Since students is listed twice, we need to provide a distinct name to one of them; in this case, we'll use ins to name the second instance of students.

SELECT students.name
FROM students JOIN grades ON students.id = grades.id
              JOIN sections ON grades.num = sections.num
              JOIN students AS ins ON instructor_id = ins.id
WHERE ins.name = 'Burch'

5. Subqueries

We are allowed to nest one SELECT query within another. We can use this to get an alternative way of listing all students who have taken a course taught by Burch.

SELECT students.name
FROM students JOIN grades ON students.id = grades.id
WHERE num IN (SELECT num
              FROM sections JOIN students ON instructor_id = id
              WHERE name = 'Burch')

Or we can include a subquery in the FROM clause:

SELECT students.name
FROM students JOIN grades ON students.id = grades.id
              JOIN (SELECT num
                          FROM sections JOIN students ON instructor_id = id
                          WHERE name = 'Burch'AS bsecs
                 ON sections.num = bsecs.num

6. The HAVING clause

Suppose we want to show all students who never enrolled in a course. We could do this using subqueries, but an alternative appoach would do a left join of students with grades, grouped by student IDs, and find those students who aren't associated with any grades.

SELECT name
FROM students LEFT JOIN grades ON students.id = grades.id
GROUP BY students.id
HAVING COUNT(grades.id) = 0

(This uses the COUNT aggregate function, which like SUM tallies up a column throughout each group. In the case of COUNT, though, it counts up the values appearing in the group.)

The HAVING clause is like the WHERE clause, except that it is performed after breaking the output into groups. This allows us to use aggregate functions, which would not have any meaningful effect in a WHERE clause. By the way, including both a WHERE clause (before GROUP BY) and a HAVING clause (after GROUP BY) is typical.

7. Using NULL

What if we want to exclude courses that the student has not yet completed? We might require that all such courses be marked with a temporary grade like IP (in progress); but we might also choose to use a NULL value for such grades, to be filled in later once the course is complete.

Suppose we use NULL. You might be tempted to write “WHERE grade <> NULL”. This will not work because SQL uses three-values logic, whose values are true, false, and unknown Nearly anything involving NULL results in unknown — even NULL = NULL is considered unknown! And NULL <> NULL is also unknown! The reasoning is that NULL could mean anything, so a database shouldn't consider it as being equal to anything or unequal to anything.

A WHERE clause only selects a row when its value is true; a value of unknown is treated like false. So if we write “WHERE grade <> NULL”, in fact we would get no results at all.

To get around this, you would use the special operator “IS NULL” or “IS NOT NULL”.

SELECT nameSUM(credits)
FROM students LEFT JOIN
        (grades JOIN sections ON grades.num = sections.num)
     ON students.id = grades.id
WHERE grade NOT IN ('F''W'AND grade IS NOT NULL
GROUP BY students.id

8. Aggregate functions

We've seen the use of SUM to add up a column within each group, and we briefly saw the COUNT. These are aggregate functions, which combine cells from a group. SQL defines several.

AVG(x) Yields average of non-NULL values of x in group
COUNT(x) Yields number of non-NULL values of x in group
COUNT(*) Yields number of rows in group
MAX(x) Yields largest value of x in group (ignoring NULL)
MIN(x) Yields largest value of x in group (ignoring NULL)
SUM(x) Yields total of non-NULL values of x in group

With the exception of COUNT with an asterisk in the parentheses, all NULL values are ignored. The result of these aggregate functions is never NULL unless all values in the group are NULL (COUNT yields 0 in this case, but the others yield NULL).

9. Combining results

You can combine the results of two queries using the UNION operator. Following lists the names of all students who have earned an A in a CSCI course or who have taken a course from Burch.

SELECT name
FROM students JOIN grades ON students.id = grades.id
WHERE grade = 'A' AND num LIKE 'CSCI%'
  UNION
SELECT name
FROM students JOIN grades ON students.id = grades.id
WHERE num IN (SELECT num
              FROM sections JOIN students ON instructor_id = id
              WHERE name = 'Burch')

If you use UNION ALL, then the DBMS will not attempt to remove duplicates.

You can use INTERSECT to find the intersection of two sets. This lists the names of all students have have both earned an A in a CSCI course and have taken a course from Burch.

SELECT name
FROM students JOIN grades ON students.id = grades.id
WHERE grade = 'A' AND num LIKE 'CSCI%'
  INTERSECT
SELECT name
FROM students JOIN grades ON students.id = grades.id
WHERE num IN (SELECT num
              FROM sections JOIN students ON instructor_id = id
              WHERE name = 'Burch')

And finally you can use EXCEPT to list those that appear in one set but not another. Following lists all students who received in an A in a CSCI course, but who have not taken a course from Burch.

SELECT name
FROM students JOIN grades ON students.id = grades.id
WHERE grade = 'A' AND num LIKE 'CSCI%'
  EXCEPT
SELECT name
FROM students JOIN grades ON students.id = grades.id
WHERE num IN (SELECT num
              FROM sections JOIN students ON instructor_id = id
              WHERE name = 'Burch')