Table of Contents
Basic Concepts
SQL is used for managing, manipulating, and retrieving data stored in relational databases.
Relational databases are clusters of data relations usually defined by data sets, their attributes, and their combination. Relations are presented as sequences of unique attributes known as tuples.
When problem domain concepts model data sets as relations such as tables, views, and join tables, their attributes define the columns, as well as the valid data type for attribute values. Instances of a concept are thus presented as rows. A unique attribute combination may indicate the need for a new relation.
Attribute values may be further constrained to make sure only good quality data is stored. For instance, since hardly any concept is properly represented by null
it’s consider good practice to define default attribute values.
Another concept that shows up frequently when problem domain concepts model data sets is that of relationship. Although outside of the scope of this cheat sheet, let’s just say that relationships describe how instances of concepts may relate to one another.
Retrieve Data
Retrieve all of <a>, <b>, and <c>’s attribute values from <relation>.
Select <attribute_a>, <attribute_b>
From <relation>
To retrieve all data in <relation>.
Select *
From <relation>
Unique values
Blindly discard duplicate sequences of attributes eg. rows.
Select DISTINCT <attribute>
FROM <relation>;
Attribute Functions
SQL allows us to analyze attribute values through various functions. These functions compute single results from a set of multiple attribute values. The follow a formula similar to:
Select <attribute>, <func>(<attribute>)
FROM <relation>;
Whenever we need to
description | function |
---|---|
Add all of <attribute>’s values | Sum(<attribute>) |
Get the average <attribute> value | Avg(<attribute>) |
Round <attribute>’s values to <N> decimals | Round(<attribute>, <N>) |
Get highest <attribute> value | Max(<attribute>) |
Get lowest <attribute> value | Min(<attribute>) |
Get number of rows with any value except Null | Count(<attribute>) |
Data Aggregation
Collections
When we need to put identical data together we can:
Select *, Avg(<attribute>)
FROM <relation>
GROUP BY <clause>;
We can further filter that aggregate appending a <restriction> using
Having <restriction>;
Having
, unlike other filtering functions, only works on data aggregates.
Filtering and <restriction>s are covered below.
Conditions
SQL is capable of returning different data relations having met one of possibly multiple conditions.
Case
When <restriction_a> Then <relation_a>
When <restriction_b> Then <relation_c>
Else
<default_relation>
End
Note:
Else
case is optional.Case
returnsNull
when there’s no matching case.
The resulting relation can be aliased, using As
, to make it easier to refer to it.
Aliasing
Aliases can help handling data aggregates, and relations.
As
can be used to temporarily rename an attribute or existing data relation. It can also temporarily name resulting relations to make it easier to refer to them.
Select <extremly_long_attribute_name> As <attribute>
From <extremly_long_relation_name_1> As <relation_1>
Join <extremly_long_relation_name_2> As <relation_2>
On <relation_1>.id = <relation_2>.id
Order By <attribute> Desc ;
-- another example
Select <attribute>,
CASE
WHEN <restriction_1> THEN <relation_1>
WHEN <restriction_2> THEN <relation_2>
ELSE <default_relation>
End As <result_relation>
FROM <relation>
Filter Data
Restrictions
A common way of refining queries is the use of restrictions. Some of the most common restriction operators are for:
Comparison.
=, !=, <>, <, >, <=, >=
-- eg.
<attribute> = <value>
Inclusion.
<attribute> Between <value_a> And <value_b>
Exclusion.
<attribute> Not Between <value_a> And <value_b>
Belonging.
<attribute> In(<a>, <b>, <c>)
Absence.
<attribute> Not In(<a>, <b>, <c>)
Similarity.
<attribute> Like <pattern>
Difference.
<attribute> Not Like <pattern>
Like
, and Not Like
may include the %
wildcard to match zero or more characters, and/or _
which only matches a single character.
Like “%on%” # matches common, honor
Like “_ove” # matches love, rove, cove
When for valid reasons we must allow Null
as an attribute value, we need to query for its presence with Is Null
or Is Not Null
. = Null
shouldn’t work.
Conditional filters
The most common way to specify the data we wish to retrieve is through Where
.
Select <attribute>
From <relation>
Where <restriction>;
To refine our query we can use the conditionals And
, Or
.
Select <attribute>
From <relation>
Where <restriction_a>
And <restriction_b>
Or <restriction_c>;
Data Arrangement
To present data either in ascending (Asc
) or descending (Desc
) order:
Select *
From <relation>
Where <restriction_a>
Order By <attribute> Desc;
We can Limit
the maximum number of tuples our query returns. Even retrieve it in chunked sections:
Select *
From <relation>
Where <restriction_a>
Order By <attribute> Desc
Limit <section_size> OFFSET <skipped_section_size>;
Data Set Aggregation
As we gather data we might find useful to combine existing relations to retrieve information.
References
We can avoid attribute name clashes by referencing the relation it belongs to.
SELECT composer.first_name
FROM composer
WHERE composer.last_name = 'Doe';
Join
A join aggregates data from two or more relations based on their relationship. That is, how each set of tuples relate to one another.
INNER JOIN
+--------------+
| |
| |
+--------------+ |
| |///////| |
| |///////| |
| |///////| |
| +--------------+
| |
| |
+--------------+
Inner joins return all tuples from two or more tables that meet the restriction.
SELECT *
FROM <relation_a>
INNER JOIN <relation_b>
ON <relation_a>.<key> = <relation_b>.<foreign_key>;
Although, we used *
and =
in the example above, the join condition can be any of the operators described in the restrictions section.
LEFT JOIN
+--------------+
| |
| |
+--------------+ |
|//////|///////| |
|//////|///////| |
|//////|///////| |
|//////+--------------+
|//////////////|
|//////////////|
+--------------+
Sometimes referred to as the Left Outer Join
. Returns all rows from the left-hand relation and the tuples from the right-hand relation that meet the restriction.
SELECT <attributes>
FROM <left_relation>
LEFT JOIN <right_relation>
ON <left_relation>.<attribute> = <right_relation>.<attribute>
No Intersection
+--------------+
| |
| |
+--------------+ |
|//////| | |
|//////| | |
|//////| | |
|//////+--------------+
|//////////////|
|//////////////|
+--------------+
A variant on the left join. It only returns tuples from the left-hand relation.
SELECT <attributes>
FROM <left_relation>
LEFT JOIN <right_relation>
ON <left_relation>.<attribute> = <right_relation>.<attribute>
Where <right_relation>.<attribute> Is Null
RIGHT JOIN
+--------------+
|//////////////|
|//////////////|
+--------------+//////|
| |///////|//////|
| |///////|//////|
| |///////|//////|
| +--------------+
| |
| |
+--------------+
Also known as Right Outer Join
. Returns all tuples from the right-hand relation, as well as the tuples from the left-hand relation that also meet the restriction.
Select <attributes>
From <left_relation>
Right Join <right_relation>
On <left_relation>.<attribute> = <right_relation>.<attribute>
No intersection
+--------------+
|//////////////|
|//////////////|
+--------------+//////|
| | |//////|
| | |//////|
| | |//////|
| +--------------+
| |
| |
+--------------+
Retrieve only the tuples in the right-hand relation that aren’t also part of the left-hand relation.
Select <attributes>
From <left_relation>
Right Join <right_relation>
On <left_relation>.<attribute> = <right_relation>.<attribute>
Where <left_relation>.<attribute> Is Null
FULL JOIN
+--------------+
|\\\\\\\\\\\\\\|
|\\\\\\\\\\\\\\|
+--------------+\\\\\\|
|//////|XXXXXXX|\\\\\\|
|//////|XXXXXXX|\\\\\\|
|//////|XXXXXXX|\\\\\\|
|//////+--------------+
|//////////////|
|//////////////|
+--------------+
Retrieve all of left-hand, and right hand’s tuples. When tuples in both relation don’t match, SQL returns Null
for every attribute value missing.
Select <attributes>
From <left_relation>
Full Join <right_relation>
On <left_relation>.<attribute> = <right_relation>.<attribute>
No intersection
+--------------+
|\\\\\\\\\\\\\\|
|\\\\\\\\\\\\\\|
+--------------+\\\\\\|
|//////| |\\\\\\|
|//////| |\\\\\\|
|//////| |\\\\\\|
|//////+--------------+
|//////////////|
|//////////////|
+--------------+
To retrieve all data related to both relation’s attributes other than their shared ones, and excluding Null
:
Select <attributes>
From <left_relation>
Full Join <right_relation>
On <left_relation>.<attribute> = <right_relation>.<attribute>
Where <left_relation>.<attribute> Is Null
Or <right_relation>.<attribute> Is Null
Combination
Another way of aggregating data is by combining together two or more relations’ attributes.
When we need to exclude duplicates:
Select <attributes>
From <relation_a>
Union
Select <attributes>
From <relation_b>;
To include duplicates:
Select <attributes>
From <relation_a>
Union All
Select <attributes>
From <relation_b>;
Whenever we need to retrieve the set of attributes present in two or more relations:
Select <attributes> From <relation_a>
Intersect
Select <attributes> From <relation_b>;
In order to retrieve the sequences of attributes present in <relation_a> but not in <relation_b>:
Select <attribute_1>, <attribute_2>
From <relation_a>
Where <restriction_a>
Except
Select <attribute_1>, <attribute_2>
From <relation_b>
Where <restriction_b>;
Manipulation
Add Data
We can add data through:
Insert Into <relation>(<attribute_1>, <attribute_2>)
Values(<value_1>, <value_2>);
Listing attributes is optional but consider best practice for clarity.
Edit Data
To edit several attribute values at once we can do:
UPDATE <relation>
SET <attribute_1> = <value_a>,
<attribute_2> = <value_b>
WHERE <restriction>;
Excluding the Where
filter updates all of an attribute’s values with the same value. As a preventive step, retrieve (Select
) data before any changes.
Remove Data
To remove tuples of data based on a restriction
Delete From <relation>
Where <restriction>;
Beware, if the filter Where
is left out we’ll delete all tuples in the relation.
Table Management
Add
To create a new table.
Create Table <name> (
<attribute_1> <data_type_a> <attribute_constrains>,
<attribute_2> <data_type_b> <attribute_constrains>,
<attribute_3> <data_type_c> <attribute_constrains>,
);
Data types
Common data types:
Description | Data Type |
---|---|
True, false; on, off | Bool |
action’s date and time | Timestamp |
fixed length string, spaced padded | Char(<N>) |
variable length string, up to | Varchar(<N>) |
variable length string, “unlimited” | Text |
16-bits integers ranging -32768, 32767 | SmallInt |
32-bits integers ranging -214783648, 214783647 | Int |
64-bits integers ranging | BigInt |
min precision | Float(<N>) |
double-precision floating-point number | Real |
auto-incrementing number | Serial |
money, 64-bits numbers ranging -9EB, 9EB | Money |
Data Constraints
These are some of the possible attribute value constraints we can optionally set in a table.
Description | Constraint |
---|---|
column is not allowed to contain null values | Not Null |
column’s fallback value when none is given | Default |
column values must be unique | Unique |
Boolean expression verified to add or edit data | Check (expression) |
We also have Primary Key
, and Foreign Key
. The Primary Key
defines a column or set of columns as the main way of identifying a tuple. The sequence of values must be unique and not null. The Foreign Key
must only contain values that match those of the Primary Key
of some tuple of the referenced table.
Note: foreign key constraints cannot be defined between temporary tables and permanent tables.
Change
To change a table’s name:
Alter Table <old_name>
Rename To <new_name>;
Add a new table attribute ie. column
Alter Table <name>
Add <attribute> <data_type> <optional_table_constraints>, Default <value>;
Remove an attribute, and all it’s values.
Alter Table <name>
Drop <attribute>;
Delete
Remove table, and all it’s data.
Drop Table <name>;
Resources
Practice online: