IN with a Subquery
IN checks whether a value appears in the list returned by a subquery:
-- Find users who have placed at least one order
SELECT id, name, email
FROM users
WHERE id IN (
SELECT user_id
FROM orders
);
-- The subquery returns a list of user IDs; IN checks membership
NOT IN excludes rows that match any value in the subquery result:
-- Find users who have NEVER placed an order
SELECT id, name
FROM users
WHERE id NOT IN (
SELECT user_id
FROM orders
WHERE user_id IS NOT NULL -- β οΈ critical: exclude NULLs (see below)
);
EXISTS Syntax
EXISTS returns TRUE if the subquery produces at least one row, FALSE if it produces zero rows. The actual column values in the subquery result do not matter β convention is SELECT 1:
-- Same query with EXISTS: users who have placed at least one order
SELECT u.id, u.name, u.email
FROM users u
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.user_id = u.id -- correlated reference
);
-- EXISTS stops scanning orders as soon as it finds the first matching row
NOT EXISTS
NOT EXISTS is the safe, preferred way to find rows with no matching records in another table (anti-join):
-- Users who have NEVER placed an order (anti-join with NOT EXISTS)
SELECT u.id, u.name
FROM users u
WHERE NOT EXISTS (
SELECT 1
FROM orders o
WHERE o.user_id = u.id
);
-- Products with no order items
SELECT p.id, p.name
FROM products p
WHERE NOT EXISTS (
SELECT 1
FROM order_items oi
WHERE oi.product_id = p.id
);
IN vs EXISTS β Key Differences
| Feature | IN | EXISTS |
|---|---|---|
| What it evaluates | Value membership in a list | Whether subquery returns any rows |
| Short-circuits? | No (full list scanned) | Yes (stops at first match) |
| Correlated? | Usually not | Usually yes |
| NULL handling | Dangerous with NOT IN | Safe with NOT EXISTS |
| Best when subquery is⦠| Small result set | Large result set |
Performance and NULL Trap
If the subquery returns any NULL values, NOT IN returns zero rows β silently, with no error. This is because x NOT IN (1, 2, NULL) evaluates to UNKNOWN for all values of x (due to three-valued logic). Always add WHERE col IS NOT NULL inside the NOT IN subquery, or better: use NOT EXISTS instead.
-- DANGEROUS: if any orders.user_id is NULL, returns zero rows!
SELECT id, name FROM users
WHERE id NOT IN (SELECT user_id FROM orders);
-- SAFE option 1: filter NULLs in subquery
SELECT id, name FROM users
WHERE id NOT IN (SELECT user_id FROM orders WHERE user_id IS NOT NULL);
-- SAFE option 2: use NOT EXISTS (NULL-safe by design)
SELECT u.id, u.name FROM users u
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);
EXISTS is generally faster when the subquery result set is large because it stops at the first matching row. IN is simpler to read for small, static lists or small subquery results.
Summary
π Summary
- IN checks value membership in a subquery result list.
- EXISTS returns TRUE as soon as the subquery finds one row β it short-circuits for efficiency.
- NOT EXISTS is the safe way to do anti-joins (rows with no match in another table).
- NOT IN with NULLs is dangerous β if the subquery returns any NULL, NOT IN returns zero rows. Always add IS NOT NULL or use NOT EXISTS.
- EXISTS is typically faster for large subquery results; IN is fine for small sets.
FAQ
Yes. WHERE status IN ('active', 'pending', 'review') uses a literal list β no subquery. The subquery form WHERE id IN (SELECT ...) is different syntax but uses the same IN semantics. The literal list form does not have the NULL trap issue because you control the values; only the subquery form can return unexpected NULLs.
No. EXISTS only cares whether the subquery returns at least one row. The column values are completely ignored. That is why SELECT 1 is used by convention β it makes the intent clear and avoids unnecessary column lookups. SELECT *, SELECT NULL, or any column all work identically with EXISTS.
Both approaches produce the same result for anti-joins. LEFT JOIN ... WHERE right.id IS NULL is sometimes preferred because it is visually consistent with other join-based queries in the same file. NOT EXISTS is semantically clearer about intent ("no matching row exists"). Performance is usually equivalent after optimisation β check your database's EXPLAIN plan to confirm.