You can join a table to itself — that is, create a self-join — if you want to find rows in a table that have values in common with other rows in the same table. For example, you can use a self-join to find pairs of authors who live in the same postal code.
As with any join, a self-join requires at least two tables. The difference is that, instead of adding a second table to the query, you add a second instance of the same table. That way, you can compare a column in the first instance of the table to the same column in the second instance, which allows you to compare the values in a column to each other. Each instance of the table must be unique, so the Query Designer assigns an alias to the second instance of the table.
For example, if you are creating a self-join to find all authors with the same postal code, you compare the zip
column in the first instance of the table against the zip
column in the second instance. The resulting join condition might look like the following:
FROM authors INNER JOIN authors authors1 ON
authors.zip = authors1.zip
Creating a self-join often requires multiple join conditions. The primary join condition is the one on which the join is based. In the example of the authors’ postal code, the primary join condition is based on finding an exact match in the zip
column.
However, if you join based on only this condition, each row in the table appears in the result set at least twice. Each row matches itself, resulting in a duplicate. In addition, the join results are reversed for rows that are identical except for the order of the join values.
To eliminate these duplicates, you can include a second join condition to filter out duplicated rows. The second join condition might, for example, compare the primary key (in this example, the au_id
column) with a less than (<) operator. The resulting join condition might look like this:
FROM authors INNER JOIN authors authors1 ON
authors.zip = authors1.zip AND
authors.au_id < authors1.au_id
When you create a self-join, the Query Designer usually creates the second join condition automatically, because it is based on a primary key. You can then manually add the primary join condition.
To create a self-join
The Query Designer assigns an alias to the second instance by adding a sequential number to the table name. In addition, the Query Designer creates a join line between the input source windows based on the primary key. In most cases, this join can function as the second join condition.