Ad – 728Γ—90
πŸͺ† Subqueries

SQL Correlated Subqueries – Row-by-Row Subqueries

A correlated subquery is a subquery that references a column from the outer query. Because it depends on the outer row, it is re-evaluated once for every row the outer query processes. This makes correlated subqueries powerful but potentially slow on large tables.

⏱️ 20 min read 🎯 Advanced πŸ“… Updated 2026

What Is a Correlated Subquery?

In a regular (non-correlated) subquery, the inner query is independent and runs once. In a correlated subquery, the inner query references a column from the outer query, so it must run once for each row processed by the outer query.

FeatureRegular SubqueryCorrelated Subquery
References outer query?NoYes
Runs how many times?OnceOnce per outer row
Can be cached?Yes (run once)No (changes per row)
PerformanceUsually fastPotentially slow (O(N) executions)

How It Works

The outer query passes the current row's values into the inner query. The inner query uses those values, returns a result, and the outer query decides whether to include that row:

SQL
-- Find employees earning more than their department average
SELECT e.name, e.department, e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department = e.department   -- e.department comes from the outer query
);
-- For each row in the outer query, the inner query computes AVG for THAT department
ℹ️
Spot a correlated subquery

Look for a column reference in the inner query that uses an alias defined in the outer FROM clause (e.g., e.department where e is the outer table alias). If the inner query has no reference to the outer query, it is a regular subquery.

Ad – 336Γ—280

Correlated vs Regular Subquery

SQL
-- Regular subquery: inner query runs ONCE, returns the overall average
SELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

-- Correlated subquery: inner query runs ONCE PER ROW, returns dept-specific average
SELECT name, department, salary
FROM employees e
WHERE salary > (SELECT AVG(salary) FROM employees e2 WHERE e2.department = e.department);

More Correlated Subquery Examples

Find the most recent order for each customer:

SQL
SELECT o.customer_id, o.order_date, o.amount
FROM orders o
WHERE o.order_date = (
    SELECT MAX(o2.order_date)
    FROM orders o2
    WHERE o2.customer_id = o.customer_id   -- correlated reference
);

Find products that have never been ordered:

SQL
SELECT p.name, p.category
FROM products p
WHERE NOT EXISTS (
    SELECT 1
    FROM order_items oi
    WHERE oi.product_id = p.id    -- correlated reference
);

Performance Considerations

Correlated subqueries execute once per outer row. For a table with 100,000 rows, that is 100,000 subquery executions. Most can be rewritten as JOINs for better performance:

SQL
-- Correlated subquery (potentially slow for large tables)
SELECT e.name, e.department, e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary) FROM employees e2 WHERE e2.department = e.department
);

-- Equivalent JOIN with derived table (faster)
SELECT e.name, e.department, e.salary
FROM employees e
JOIN (
    SELECT department, AVG(salary) AS dept_avg
    FROM employees
    GROUP BY department
) dept_stats ON e.department = dept_stats.department
WHERE e.salary > dept_stats.dept_avg;
⚠️
When to keep correlated subqueries

Some correlated subqueries cannot be trivially rewritten as JOINs β€” particularly those using EXISTS/NOT EXISTS for semi-join or anti-join semantics. Modern query optimisers (PostgreSQL, SQL Server, MySQL 8) often convert correlated subqueries to joins internally. Always EXPLAIN/profile your query on real data.

Summary

πŸ“‹ Summary

  • A correlated subquery references a column from the outer query β€” making it dependent on each outer row.
  • It re-executes once per outer row, which can be slow on large datasets.
  • Classic use: find rows where a value compares against a per-group aggregate (e.g., salary vs department average).
  • Most correlated subqueries can be rewritten as JOINs with derived tables for better performance.
  • EXISTS-based correlated subqueries are often the best way to express semi-join or anti-join logic.

FAQ

Can a correlated subquery appear in the SELECT clause? +

Yes. A correlated scalar subquery in SELECT computes a value for each row: SELECT name, (SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.id) AS order_count FROM customers c. This is often called a "correlated column subquery." It is convenient but slow for large tables β€” a LEFT JOIN with GROUP BY is usually faster.

How do I know if my subquery is correlated or not? +

Check whether the inner query references any alias or column from the outer FROM clause. If you can run the inner query independently (without the outer query's context) and get a meaningful result, it is a regular subquery. If it references outer columns and would fail or produce no result standalone, it is correlated.

Do all database engines optimise correlated subqueries? +

Major databases (PostgreSQL, MySQL 8, SQL Server, Oracle) apply "subquery unnesting" to convert many correlated subqueries into more efficient join plans automatically. However, this is not guaranteed for all query patterns. Always check the EXPLAIN plan to see whether the optimiser actually avoided a nested loop execution.