What Is a Subquery?
A subquery (also called an inner query or nested query) is a complete SELECT statement enclosed in parentheses that is embedded within another SQL statement. The outer statement uses the result of the subquery as part of its own logic.
-- Outer query
SELECT name, salary
FROM employees
WHERE salary > (
-- Subquery: computes the average salary
SELECT AVG(salary)
FROM employees
);
The database evaluates the inner query first to get the average salary, then uses that single value in the outer WHERE condition. This is a scalar subquery.
Scalar β returns exactly one value (one row, one column).
Row β returns exactly one row with multiple columns.
Table (multi-row) β returns multiple rows, used like a table.
Scalar Subqueries
A scalar subquery returns a single value and can be used anywhere a literal value is valid: in WHERE, SELECT, or HAVING.
-- Compare each employee salary to company average
SELECT
name,
department,
salary,
(SELECT AVG(salary) FROM employees) AS company_avg,
salary - (SELECT AVG(salary) FROM employees) AS diff
FROM employees
ORDER BY diff DESC;
If a scalar subquery returns more than one row, most databases raise an error. Always use aggregate functions (AVG, MAX, MIN, COUNT) or ensure the subquery is guaranteed to produce a single row.
Row Subqueries
A row subquery returns exactly one row with multiple columns. It is used with row comparison operators:
-- Find the employee with the same (department, job_title) as employee id 42
SELECT name, department, job_title
FROM employees
WHERE (department, job_title) = (
SELECT department, job_title
FROM employees
WHERE id = 42
)
AND id != 42;
Table Subqueries (Multi-Row)
A table subquery returns multiple rows and is typically used with IN, NOT IN, ANY, ALL, or EXISTS:
-- Find employees in departments located in NYC
SELECT name, department
FROM employees
WHERE dept_id IN (
SELECT id
FROM departments
WHERE location = 'NYC'
);
Subqueries in the WHERE Clause
The most common position. Use with comparison operators or IN/EXISTS:
-- Employees earning more than the maximum salary in Marketing
SELECT name, salary
FROM employees
WHERE salary > (
SELECT MAX(salary)
FROM employees
WHERE department = 'Marketing'
);
Subqueries in the FROM Clause (Derived Tables)
A subquery in FROM acts as a temporary table, often called a derived table. It must be given an alias:
-- Find departments whose average salary is above 50000
SELECT dept, avg_sal
FROM (
SELECT department AS dept, AVG(salary) AS avg_sal
FROM employees
GROUP BY department
) dept_summary -- alias is required
WHERE avg_sal > 50000
ORDER BY avg_sal DESC;
Subqueries in the SELECT Clause
A scalar subquery in SELECT computes a derived column for each row:
-- Show each order with the customer's total order count
SELECT
o.id AS order_id,
o.customer_id,
o.amount,
(SELECT COUNT(*)
FROM orders o2
WHERE o2.customer_id = o.customer_id) AS customer_order_count
FROM orders o
ORDER BY customer_order_count DESC;
Summary
π Summary
- A subquery is a SELECT inside another SQL statement, always in parentheses.
- Scalar subqueries return one value β used in WHERE, SELECT, HAVING.
- Row subqueries return one row β used with row comparison operators.
- Table subqueries return multiple rows β used with IN, NOT IN, EXISTS, or in FROM.
- Derived tables (subqueries in FROM) must have an alias and act like temporary tables.
- Subqueries in SELECT compute a value per row β must return exactly one row.
FAQ
Yes β SQL allows subqueries within subqueries to any depth. However, deeply nested subqueries become very hard to read and debug. If you find yourself nesting three or more levels, rewrite using CTEs (WITH clause) which give each subquery a name and make the logic step-by-step readable.
JOINs are generally faster for combining data from multiple tables because the query optimiser can use index-based join algorithms. Use subqueries when: (1) you need to compare against an aggregate; (2) you need EXISTS/NOT EXISTS semantics; (3) you need a derived table for an intermediate aggregation. Modern optimisers often rewrite subqueries as JOINs internally, so the performance difference is smaller than it used to be.
Yes β standard SQL and every major database (MySQL, PostgreSQL, SQL Server, Oracle, SQLite) require a derived table alias. Without the alias the query will fail with a syntax error. Always add AS alias_name or just alias_name after the closing parenthesis of the subquery.