What Is a Self Join?
A self join is not a special join keyword — it is simply a regular INNER JOIN or LEFT JOIN where both sides reference the same table. You must give each reference a different alias so the database can distinguish them.
In a self join you must use aliases because SQL cannot tell which instance of the table each column refers to. The convention is to use descriptive aliases like e for employee and m for manager — or a and b for generic row pairs.
Self Join Syntax
-- Template: same table referenced twice with different aliases
SELECT a.col1, b.col2
FROM table_name a
JOIN table_name b ON a.some_col = b.other_col;
-- The join type (INNER, LEFT, etc.) depends on your need
-- INNER JOIN: only rows where both sides match
-- LEFT JOIN: keep rows even when no match (e.g. employees with no manager)
Employee–Manager Hierarchy
The employees table has a manager_id column that references the id of another row in the same table. A self join lets you display the manager's name alongside each employee.
-- employees table:
-- id | name | manager_id
-- 1 | Alice | NULL ← top-level (no manager)
-- 2 | Bob | 1 ← reports to Alice
-- 3 | Carol | 1 ← reports to Alice
-- 4 | Dave | 1 ← reports to Alice
-- INNER JOIN: only employees WITH a manager
SELECT
e.name AS employee,
m.name AS manager
FROM employees e
INNER JOIN employees m ON e.manager_id = m.id;
-- Result:
-- employee | manager
-- Bob | Alice
-- Carol | Alice
-- Dave | Alice
-- (Alice is excluded — no manager match)
-- LEFT JOIN: include ALL employees; NULL for top-level managers
SELECT
e.name AS employee,
COALESCE(m.name, 'No Manager') AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
-- Result:
-- employee | manager
-- Alice | No Manager
-- Bob | Alice
-- Carol | Alice
-- Dave | Alice
Multi-Level Hierarchy
Add additional self join levels to traverse multi-level reporting structures.
-- Three levels: employee → manager → director
SELECT
e.name AS employee,
m.name AS manager,
d.name AS director
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
LEFT JOIN employees d ON m.manager_id = d.id;
-- Employees table extended sample:
-- id | name | manager_id
-- 1 | CEO | NULL
-- 2 | Alice | 1 ← reports to CEO
-- 3 | Bob | 2 ← reports to Alice
-- 4 | Carol | 2 ← reports to Alice
-- Result:
-- employee | manager | director
-- CEO | NULL | NULL
-- Alice | CEO | NULL
-- Bob | Alice | CEO
-- Carol | Alice | CEO
Finding Pairs: Employees in the Same Department
Self joins are also used to find pairs of rows that share a common attribute. The a.id < b.id condition prevents duplicate pairs (A-B and B-A) and a row being paired with itself.
-- All pairs of employees in the same department
SELECT
a.name AS employee_1,
b.name AS employee_2,
a.dept_id
FROM employees a
JOIN employees b ON a.dept_id = b.dept_id
AND a.id < b.id -- avoids (Alice,Bob) AND (Bob,Alice) AND (Alice,Alice)
ORDER BY a.dept_id;
-- Find customers who share the same city
SELECT
a.name AS customer_1,
b.name AS customer_2,
a.city
FROM customers a
JOIN customers b ON a.city = b.city
AND a.id < b.id;
-- Find products with the same price (possible duplicate pricing)
SELECT
a.product_name AS product_1,
b.product_name AS product_2,
a.price
FROM products a
JOIN products b ON a.price = b.price
AND a.id < b.id
ORDER BY a.price;
Summary
📋 Summary
- Self join — Join a table to itself using two different aliases. No special keyword required.
- Aliases are mandatory — Without them SQL cannot distinguish the two instances of the table.
- Employee–manager pattern —
manager_idFK in the same table is the canonical self join example. - LEFT JOIN for top-level rows — Use LEFT JOIN to include employees with no manager (NULL manager_id).
- Chain for depth — Add more self joins to traverse additional hierarchy levels.
- Finding pairs — Use
a.id < b.idto get each unique pair exactly once, excluding self-pairs.
FAQ
A self join performs the same as any other join of similar tables — the database does not know it is the same physical table, it just joins two result sets. Performance depends on table size, indexes, and the join condition. Ensure the self-referencing column (e.g. manager_id) is indexed if the table is large. The optimizer may handle self joins efficiently since statistics for both sides are identical.
For arbitrary-depth hierarchies (trees, org charts, bill-of-materials), a recursive CTE (Common Table Expression) is the modern approach: WITH RECURSIVE org_tree AS (SELECT id, name, manager_id, 0 AS level FROM employees WHERE manager_id IS NULL UNION ALL SELECT e.id, e.name, e.manager_id, t.level + 1 FROM employees e JOIN org_tree t ON e.manager_id = t.id) SELECT * FROM org_tree. This handles any depth without knowing the maximum level in advance.
a.id != b.id prevents a row pairing with itself but still generates both (Alice, Bob) and (Bob, Alice) as separate rows — you get double the pairs. a.id < b.id enforces strict ordering, ensuring each unique pair appears exactly once. Use != when you need directed pairs (order matters), and < when you need undirected pairs (order does not matter).
Yes — you can self join on any column. Common examples include joining on department_id (same-department pairs), zip_code (same-location customers), or product_category (same-category products). The only requirement is that you use different aliases for the two instances of the table and provide a meaningful join condition.