SQL SELECT

Introduction

This exploration is designed to help guide you through selecting items using SQL syntax. This is the most common operation you are going to run into as most databases tend to be read from a lot more than they are written to. When we are selecting things in SQL we can specify the table, the columns and the conditions we will use to select. In addition we have the ability to aggregate results by taking things like averages or sums across rows based on the content of the row.

SELECT Syntax

In SQL it is convention to capitalize keywords. It isn’t required but helps make it more clear what is a SQL keyword and what is a table or column name. So the basic syntax for a SELECT statement is as follows:

SELECT column1, column2, ... columnN FROM table WHERE condition;

So we have the following parts of this

SELECT
This is the keyword indicating that a SELECT command is being run.
column1 … columnN
This is a list of column names we want to select from our table. We simply specify them by name and separate them with ,. If we want we can rename this by replacing column1 with column1 AS new_name. Assuming there was previously a column called column1 it would show up as new_name in our result set. This is helpful when you are selecting data from several tables and there might be naming conflicts between them.

You can use SELECT * to select all columns. This is fine for working with data. But if you are planning to use the query in a program you should explicitly specify all the columns. If someone alters the structure of the table you can run into trouble because the columns that used to exist don’t any longer.

FROM table
This is where we specify what table we want to select from. table is simply replaced with the name of the table we want to select from.
WHERE condition
Finally this is where we provide conditions to filter out rows we don’t want. You might see things like WHERE salary > 100000. We will talk more about the sorts of conditions we can use shortly.
;
Queries are terminated with a ;. Sometimes the interpreter will be nice and add one for you but it is better to add it yourself and absolutely required if you run multiple queries in a row.

An example query might look like

SELECT fname, lname FROM bsg_people WHERE age > 40;

This would select the first name and last name of people in the bsg_people table only if their age is greater than 40.

A note on ` and ‘

This trips up a lot of people. There are two very similar characters that are used in SQL one is the accent grave ` the other is the single quote '. The accent grave is typically found on the key to the left of the 1 key and it leans to the left. The single quote is usually found to the left of the return key and is vertical.

You can surround names of things in SQL with accent graves. For example `bsg_people` is acceptable and would be interpreted as a table name. You put single quotes around strings. So 'this is a string' would be interpreted as a string as would 'bsg_people'. Neither of these would work as a table name because they are strings.

If you see automatically generated SQL code it will almost always have accent graves around all table and column names.

You might ask, “Why? Why do such a thing!?” Well, the reason is because SQL wants to allow people to do terrible terrible things like create a table name with spaces `our employees` for example. Never do this. Pretend like I never even mentioned it was a possibility. Whoever decided it was a good idea when designing the SQL spec needs to consider other lines of work. Kind of like this guy.

But, back to the topic at hand which involves absolutley no spaces in table or column names.

Conditions

MySQL has documentaion (which is fairly hard to navigate) but does include a list of operators. They are pretty much what you might expect. I will talk about a couple notable exceptions.

=
This is the test of equality. One = not two. Weird right?
<> or !=
The not equal notation. The weird one is <>
IS
This is how we check for a boolean value. We can see if something IS TRUE or IS FALSE.
IS NULL
We will talk about NULL in a second but this is how we see if something is NULL.
LIKE
LIKE is used to do simple pattern matching. You can use the wildcards % to match any number of characters and _ to match exactly one character. So we could search for WHERE fname LIKE 'A%' to find peoples who’s first name starts with the letter ‘A’ or something like A %time ago to match phrases like “A long time ago” or “A long long time ago” or just “A time ago”.

You can string together operators just like in Python to do something like WHERE age > 18 AND age < 65 to select people between 18 and 65.

Null Values

In SQL you need to be a bit cautious around NULL. Essentially logic in SQL is three valued. You can have TRUE, FALSE and NULL. A null value represents an unknown or missing value. So say you were comparing the departments of employees from two tables you might have a condition like WHERE employee1.manager = employee2.manager. You might expect to find that this would return all employees who have the same manager, which it does. It does not, however, return employees who both have a manager of NULL. To also select them you would need a condition like employee1.manager IS NULL AND employee2.manager IS NULL.

So be careful around nulls. If one is involved in a boolean expression it will generally end up NULL rather than TRUE or FALSE.

Some Examples

Lets start with the following table of some truly amazing fictional movies for rent from a video rental store, this table is called demo_table:

film_id title description release_year rental_duration rental_rate length replacement_cost
50 BAKED CLEOPATRA A Stunning Drama of a Forensic Psychologist And a Husband who must Overcome a Waitress in A Monastery 2006 3 2.99 182 20.99
100 BROOKLYN DESERT A Beautiful Drama of a Dentist And a Composer who must Battle a Sumo Wrestler in The First Manned Space Station 2006 7 4.99 161 21.99
150 CIDER DESIRE A Stunning Character Study of a Composer And a Mad Cow who must Succumb a Cat in Soviet Georgia 2006 7 2.99 101 9.99
200 CURTAIN VIDEOTAPE A Boring Reflection of a Dentist And a Mad Cow who must Chase a Secret Agent in A Shark Tank 2006 7 0.99 133 27.99
250 DRAGON SQUAD A Taut Reflection of a Boy And a Waitress who must Outgun a Teacher in Ancient China 2006 4 0.99 170 26.99
300 FALCON VOLUME A Fateful Saga of a Sumo Wrestler And a Hunter who must Redeem a A Shark in New Orleans 2006 5 4.99 102 21.99
350 GARDEN ISLAND A Unbelieveable Character Study of a Womanizer And a Madman who must Reach a Man in The Outback 2006 3 4.99 80 21.99
400 HARDLY ROBBERS A Emotional Character Study of a Hunter And a Car who must Kill a Woman in Berlin 2006 7 2.99 72 15.99
450 IDOLS SNATCHERS A Insightful Drama of a Car And a Composer who must Fight a Man in A Monastery 2006 5 2.99 84 29.99
500 KISS GLORY A Lacklusture Reflection of a Girl And a Husband who must Find a Robot in The Canadian Rockies 2006 5 4.99 163 11.99

Example 1

Lets do a simple query to find the rentals that are less than $1.

SELECT * FROM `demo_table` WHERE rental_rate < 1;
film_id title description release_year rental_duration rental_rate length replacement_cost
200 CURTAIN VIDEOTAPE A Boring Reflection of a Dentist And a Mad Cow who must Chase a Secret Agent in A Shark Tank 2006 7 0.99 133 27.99
250 DRAGON SQUAD A Taut Reflection of a Boy And a Waitress who must Outgun a Teacher in Ancient China 2006 4 0.99 170 26.99

This one was pretty straight forward, it just returned the rows with a rental_rate of $0.99 because they were less than $1.


Example 2

Lets say we are only interested in stunning films. We could use the LIKE keyword as follows:

SELECT * FROM  `demo_table` WHERE description LIKE  '%stunning%';
film_id title description release_year rental_duration rental_rate length replacement_cost
50 BAKED CLEOPATRA A Stunning Drama of a Forensic Psychologist And a Husband who must Overcome a Waitress in A Monastery 2006 3 2.99 182 20.99
150 CIDER DESIRE A Stunning Character Study of a Composer And a Mad Cow who must Succumb a Cat in Soviet Georgia 2006 7 2.99 101 9.99

As we can see LIKE is not case sensitive but we found all the movies which have stunning in their descriptions.


Example 3

Finally lets do a little more complex query. Lets say we want to find the title and description of the movies that cost more than $1 to rent, are longer than 2 hours and that we can rent for 7 days.

SELECT title, description FROM demo_table WHERE rental_rate > 1 AND length > 120 AND rental_duration = 7;
title description
BROOKLYN DESERT A Beautiful Drama of a Dentist And a Composer who must Battle a Sumo Wrestler in The First Manned Space Station

We get but a single result, BROOKLYN DESERT, it being the only row that met all of our conditions.

Sorting

Sorting is straight forward. We add a ORDER BY clause to the end of our SELECT statement.

SELECT * FROM bsg_planets ORDER BY population DESC;
id name population language capital
17 Canceron 6700000000 NULL Hades
3 Caprica 4900000000 Caprican Caprica City
22 Virgon 4300000000 NULL Boskirk
1 Gemenon 2800000000 Old Gemenese Oranu
2 Leonis 2600000000 Leonese Luminere

The syntax is

ORDER BY column1 {DESC|ASC}, column2 {DESC|ASC} ...

So we simply specify the columns we want to sort on in the order we want them to have precedence and then say if we want to sort the items in descending or ascending value.

Limiting

We can also limit the number of results by adding LIMIT X to the end of the query where X is the number of rows we want returned.

SELECT * FROM bsg_planets ORDER BY population DESC LIMIT 1;

Returns only the first result

id name population language capital
17 Canceron 6700000000 NULL Hades

Subqueries

Next we are going to talk about subqueries. These let us us the results of one query in another query. Sometimes you can use a subquery rather than a join and have things be quite a bit faster. If you are writing a program and need a primary key of a row it is a lot better to use a subquery to get it rather than hard code it.

The only interesting thing about subqueries is that you can alias the table so that you can refer to it. We do that by wrapping it parens and using AS. For exmaple, if we wanted to find all the people from Caprica we could use a subquery like this

SELECT fname, lname FROM bsg_people
WHERE homeworld = (SELECT id FROM bsg_planets WHERE name = 'Caprica');
fname lname
William Adama
Lee Adama
Laura Roslin
Kara Thrace
Gaius Baltar

This query returned all the characters from Caprica.

Aggregate functions

These are a bit more advanced and something we won’t probably be using too much of. But we can generate aggregate results using aggregate functions and the GROUP BY clause.

Lets look at an example.

SELECT fname, lname, COUNT( * ) AS certs
FROM bsg_people
INNER JOIN bsg_cert_people ON bsg_cert_people.pid = bsg_people.id
GROUP BY bsg_people.id
ORDER BY fname ASC 

This selects the first name, last name and the number of times someone shows up in our table. In this case we joined people with certifications and got a count of the number of times they showed up. We also renamed the COUNT(*) as to certs using the AS keyword.

fname lname certs
Callandra Henderson 1
Galen Tyrol 1
Kara Thrace 1
Karl Agathon 1
Laura Roslin 1
Lee Adama 2
Margaret Edmondson 1
Saul Tigh 1
Sharon Valerii 1

So this table shows us how many certifications each person has. COUNT(*) just tells us how many times the thing we GROUP BY shows up. In this case we GROUP BY bsg_people.id, so it gives us a count of how many times each id shows up. Because we joined people with the people_cert table there is one row for each person/certification combo that exists.

Finally lets finish this by finding the maximum number of certifications someone has.

SELECT MAX(temp_table.certs) FROM 
(SELECT fname, lname, COUNT( * ) AS certs
FROM bsg_people
INNER JOIN bsg_cert_people ON bsg_cert_people.pid = bsg_people.id
GROUP BY bsg_people.id
ORDER BY fname ASC) AS temp_table;
MAX(temp_table.certs)
2

This illustrates pretty much everything you might want to do in a select statement, we alias things, use aggregate functions and join tables. I will offer one word of warning this query will not return what you think.

SELECT fname, lname, MAX(temp_table.certs) FROM 
(SELECT fname, lname, COUNT( * ) AS certs
FROM bsg_people
INNER JOIN bsg_cert_people ON bsg_cert_people.pid = bsg_people.id
GROUP BY bsg_people.id
ORDER BY fname ASC) AS temp_table;

We want the name of the person with the maximum number of certifications. We saw that was Lee Adama earlier.

But this table returns the following

fname lname MAX(temp_table.certs)
Callandra Henderson 2

What is going on?

Well we didn’t GROUP BY anything. So it looks at the entire result set as a group and finds the maximum, which is 2. Thats fine. The problem is that it assumes any row in the group is representative of the group. So it takes the first row to pull the fname and lname from. Thats bad news.

A correct way to do this would be as follows.

SELECT fname, lname, COUNT( * ) AS certs
FROM bsg_people
INNER JOIN bsg_cert_people ON bsg_cert_people.pid = bsg_people.id
GROUP BY bsg_people.id
ORDER BY certs DESC LIMIT 1;
fname lname certs
Lee Adama 2

That gets us the result set we are looking for.

Activity

I would suggest working with a partner if you can. Take turns where one person comes up with a list of criteria for a selection query and the other implements that query.

This will give you practice with two different things. One is writing selection queries, but the other is interpreting language into a query which, once you get some practice will be the far harder task. When someone says they want PG-13 and G rated movies shorter than 100 minutes, they they want all PG-13 movies and only G movies shorter than 100 minutes? You will find out that they way you communicate these sorts of queries is often ambiguous and requires a lot of clarification.