SQL

Table of Contents

Basic Concepts

SQL is used for managing, manipulating, and retrieving data stored in relational databases.

Relational databases are clusters of data relations usually defined by data sets, their attributes, and their combination. Relations are presented as sequences of unique attributes known as tuples.

When problem domain concepts model data sets as relations such as tables, views, and join tables, their attributes define the columns, as well as the valid data type for attribute values. Instances of a concept are thus presented as rows. A unique attribute combination may indicate the need for a new relation.

Attribute values may be further constrained to make sure only good quality data is stored. For instance, since hardly any concept is properly represented by null it’s consider good practice to define default attribute values.

Another concept that shows up frequently when problem domain concepts model data sets is that of relationship. Although outside of the scope of this cheat sheet, let’s just say that relationships describe how instances of concepts may relate to one another.

Retrieve Data

Retrieve all of <a>, <b>, and <c>’s attribute values from <relation>.

Select <attribute_a>, <attribute_b>
From <relation>

To retrieve all data in <relation>.

Select *
From <relation>

Unique values

Blindly discard duplicate sequences of attributes eg. rows.

Select DISTINCT  <attribute>
FROM <relation>;

Attribute Functions

SQL allows us to analyze attribute values through various functions. These functions compute single results from a set of multiple attribute values. The follow a formula similar to:

Select <attribute>, <func>(<attribute>)
FROM  <relation>;

Whenever we need to

description function
Add all of <attribute>’s values Sum(<attribute>)
Get the average <attribute> value Avg(<attribute>)
Round <attribute>’s values to <N> decimals Round(<attribute>, <N>)
Get highest <attribute> value Max(<attribute>)
Get lowest <attribute> value Min(<attribute>)
Get number of rows with any value except Null Count(<attribute>)

Data Aggregation

Collections

When we need to put identical data together we can:

Select *, Avg(<attribute>)
FROM <relation>
GROUP BY <clause>;

We can further filter that aggregate appending a <restriction> using

Having <restriction>;

Having, unlike other filtering functions, only works on data aggregates.

Filtering and <restriction>s are covered below.

Conditions

SQL is capable of returning different data relations having met one of possibly multiple conditions.

Case
When <restriction_a> Then <relation_a>
When <restriction_b> Then <relation_c>
Else
  <default_relation>
End

Note:

  • Else case is optional.
  • Case returns Null when there’s no matching case.

The resulting relation can be aliased, using As, to make it easier to refer to it.

Aliasing

Aliases can help handling data aggregates, and relations.

As can be used to temporarily rename an attribute or existing data relation. It can also temporarily name resulting relations to make it easier to refer to them.

Select <extremly_long_attribute_name>  As <attribute>
From <extremly_long_relation_name_1> As <relation_1>
Join <extremly_long_relation_name_2> As <relation_2>
  On <relation_1>.id = <relation_2>.id
Order By <attribute> Desc ;

-- another example

Select <attribute>,
  CASE
  WHEN <restriction_1> THEN <relation_1>
  WHEN <restriction_2> THEN <relation_2>
  ELSE <default_relation>
  End As <result_relation>
FROM <relation>

Filter Data

Restrictions

A common way of refining queries is the use of restrictions. Some of the most common restriction operators are for:

Comparison.

=, !=, <>, <, >, <=, >=
-- eg.
<attribute> = <value>

Inclusion.

<attribute> Between <value_a> And <value_b>

Exclusion.

<attribute> Not Between <value_a> And <value_b>

Belonging.

<attribute> In(<a>, <b>, <c>)

Absence.

<attribute> Not In(<a>, <b>, <c>)

Similarity.

<attribute> Like <pattern>

Difference.

<attribute> Not Like <pattern>

Like, and Not Like may include the % wildcard to match zero or more characters, and/or _ which only matches a single character.

Like %on% # matches common, honor
Like _ove # matches love, rove, cove

When for valid reasons we must allow Null as an attribute value, we need to query for its presence with Is Null or Is Not Null. = Null shouldn’t work.

Conditional filters

The most common way to specify the data we wish to retrieve is through Where.

Select <attribute>
From <relation>
Where <restriction>;

To refine our query we can use the conditionals And, Or.

Select <attribute>
From <relation>
Where <restriction_a>
And <restriction_b>
Or <restriction_c>;
Data Arrangement

To present data either in ascending (Asc) or descending (Desc) order:

Select *
From <relation>
Where <restriction_a>
Order By <attribute> Desc;

We can Limit the maximum number of tuples our query returns. Even retrieve it in chunked sections:

Select *
From <relation>
Where <restriction_a>
Order By <attribute> Desc
Limit <section_size> OFFSET <skipped_section_size>;

Data Set Aggregation

As we gather data we might find useful to combine existing relations to retrieve information.

References

We can avoid attribute name clashes by referencing the relation it belongs to.

SELECT composer.first_name
FROM composer
WHERE composer.last_name = 'Doe';

Join

A join aggregates data from two or more relations based on their relationship. That is, how each set of tuples relate to one another.

INNER JOIN

       +--------------+
       |              |
       |              |
+--------------+      |
|      |///////|      |
|      |///////|      |
|      |///////|      |
|      +--------------+
|              |
|              |
+--------------+

Inner joins return all tuples from two or more tables that meet the restriction.

SELECT *
FROM <relation_a>
  INNER JOIN <relation_b>
  ON <relation_a>.<key> = <relation_b>.<foreign_key>;

Although, we used * and = in the example above, the join condition can be any of the operators described in the restrictions section.

LEFT JOIN

       +--------------+
       |              |
       |              |
+--------------+      |
|//////|///////|      |
|//////|///////|      |
|//////|///////|      |
|//////+--------------+
|//////////////|
|//////////////|
+--------------+

Sometimes referred to as the Left Outer Join. Returns all rows from the left-hand relation and the tuples from the right-hand relation that meet the restriction.

SELECT <attributes>
FROM <left_relation>
LEFT JOIN <right_relation>
ON <left_relation>.<attribute> = <right_relation>.<attribute>
No Intersection
       +--------------+
       |              |
       |              |
+--------------+      |
|//////|       |      |
|//////|       |      |
|//////|       |      |
|//////+--------------+
|//////////////|
|//////////////|
+--------------+

A variant on the left join. It only returns tuples from the left-hand relation.

SELECT <attributes>
FROM <left_relation>
LEFT JOIN <right_relation>
ON <left_relation>.<attribute> = <right_relation>.<attribute>
Where <right_relation>.<attribute> Is Null

RIGHT JOIN

       +--------------+
       |//////////////|
       |//////////////|
+--------------+//////|
|      |///////|//////|
|      |///////|//////|
|      |///////|//////|
|      +--------------+
|              |
|              |
+--------------+

Also known as Right Outer Join. Returns all tuples from the right-hand relation, as well as the tuples from the left-hand relation that also meet the restriction.

Select <attributes>
From <left_relation>
Right Join <right_relation>
On <left_relation>.<attribute> = <right_relation>.<attribute>
No intersection
       +--------------+
       |//////////////|
       |//////////////|
+--------------+//////|
|      |       |//////|
|      |       |//////|
|      |       |//////|
|      +--------------+
|              |
|              |
+--------------+

Retrieve only the tuples in the right-hand relation that aren’t also part of the left-hand relation.

Select <attributes>
From <left_relation>
Right Join <right_relation>
On <left_relation>.<attribute> = <right_relation>.<attribute>
Where <left_relation>.<attribute> Is Null

FULL JOIN

       +--------------+
       |\\\\\\\\\\\\\\|
       |\\\\\\\\\\\\\\|
+--------------+\\\\\\|
|//////|XXXXXXX|\\\\\\|
|//////|XXXXXXX|\\\\\\|
|//////|XXXXXXX|\\\\\\|
|//////+--------------+
|//////////////|
|//////////////|
+--------------+

Retrieve all of left-hand, and right hand’s tuples. When tuples in both relation don’t match, SQL returns Null for every attribute value missing.

Select <attributes>
From <left_relation>
Full Join <right_relation>
On <left_relation>.<attribute> = <right_relation>.<attribute>
No intersection
       +--------------+
       |\\\\\\\\\\\\\\|
       |\\\\\\\\\\\\\\|
+--------------+\\\\\\|
|//////|       |\\\\\\|
|//////|       |\\\\\\|
|//////|       |\\\\\\|
|//////+--------------+
|//////////////|
|//////////////|
+--------------+

To retrieve all data related to both relation’s attributes other than their shared ones, and excluding Null:

Select <attributes>
From <left_relation>
Full Join <right_relation>
On <left_relation>.<attribute> = <right_relation>.<attribute>
Where <left_relation>.<attribute> Is Null
Or <right_relation>.<attribute> Is Null

Combination

Another way of aggregating data is by combining together two or more relations’ attributes.

When we need to exclude duplicates:

Select <attributes>
From <relation_a>
Union
  Select <attributes>
  From <relation_b>;

To include duplicates:

Select <attributes>
From <relation_a>
Union All
  Select <attributes>
  From <relation_b>;

Whenever we need to retrieve the set of attributes present in two or more relations:

Select <attributes> From <relation_a>
Intersect
  Select <attributes> From <relation_b>;

In order to retrieve the sequences of attributes present in <relation_a> but not in <relation_b>:

Select <attribute_1>, <attribute_2>
From <relation_a>
Where <restriction_a>
Except
  Select <attribute_1>, <attribute_2>
  From <relation_b>
  Where <restriction_b>;

Manipulation

Add Data

We can add data through:

Insert Into <relation>(<attribute_1>, <attribute_2>)
Values(<value_1>, <value_2>);

Listing attributes is optional but consider best practice for clarity.

Edit Data

To edit several attribute values at once we can do:

UPDATE <relation>
SET <attribute_1> = <value_a>,
    <attribute_2> = <value_b>
WHERE <restriction>;

Excluding the Where filter updates all of an attribute’s values with the same value. As a preventive step, retrieve (Select) data before any changes.

Remove Data

To remove tuples of data based on a restriction

Delete From <relation>
Where <restriction>;

Beware, if the filter Where is left out we’ll delete all tuples in the relation.

Table Management

Add

To create a new table.

Create Table <name> (
  <attribute_1> <data_type_a> <attribute_constrains>,
  <attribute_2> <data_type_b> <attribute_constrains>,
  <attribute_3> <data_type_c> <attribute_constrains>,
);

Data types

Common data types:

Description Data Type
True, false; on, off Bool
action’s date and time Timestamp
fixed length string, spaced padded Char(<N>)
variable length string, up to ; no padding Varchar(<N>)
variable length string, “unlimited” Text
16-bits integers ranging -32768, 32767 SmallInt
32-bits integers ranging -214783648, 214783647 Int
64-bits integers ranging BigInt
min precision , max 8 bytes Float(<N>)
double-precision floating-point number Real
auto-incrementing number Serial
money, 64-bits numbers ranging -9EB, 9EB Money

Data Constraints

These are some of the possible attribute value constraints we can optionally set in a table.

Description Constraint
column is not allowed to contain null values Not Null
column’s fallback value when none is given Default
column values must be unique Unique
Boolean expression verified to add or edit data Check (expression)

We also have Primary Key, and Foreign Key. The Primary Key defines a column or set of columns as the main way of identifying a tuple. The sequence of values must be unique and not null. The Foreign Key must only contain values that match those of the Primary Key of some tuple of the referenced table.

Note: foreign key constraints cannot be defined between temporary tables and permanent tables.

Change

To change a table’s name:

Alter Table <old_name>
Rename To <new_name>;

Add a new table attribute ie. column

Alter Table <name>
Add <attribute> <data_type> <optional_table_constraints>, Default <value>;

Remove an attribute, and all it’s values.

Alter Table <name>
Drop <attribute>;

Delete

Remove table, and all it’s data.

Drop Table <name>;

Resources

Practice online: