Ad – 728×90
🚀 SQL Basics

SQL DISTINCT – Removing Duplicate Rows

By default, a SQL query returns every row that matches its conditions — including duplicates. If you want to see only the unique values in a column (e.g. "what departments exist?", "which countries do our customers come from?"), use DISTINCT. It filters out duplicate rows from the result set before returning data. This lesson covers single-column DISTINCT, multi-column DISTINCT, COUNT(DISTINCT ...), and when to choose DISTINCT over GROUP BY.

⏱️ 10 min read 🎯 Beginner 📅 Updated 2026

DISTINCT Syntax

Place DISTINCT immediately after the SELECT keyword. It applies to the entire SELECT list — all columns together are used to determine uniqueness.

SQL
-- Without DISTINCT: returns every row (may include duplicates)
SELECT department FROM employees;
-- Returns: Engineering, Marketing, Engineering, Marketing, Engineering
-- (one row per employee, repeated departments)

-- With DISTINCT: returns each unique department once
SELECT DISTINCT department FROM employees;
-- Returns: Engineering, Marketing
Without DISTINCTWith DISTINCT
EngineeringEngineering
MarketingMarketing
Engineering
Marketing
Engineering
ℹ️
Common use cases for DISTINCT

DISTINCT is useful for data exploration: "what unique values exist in this column?" It answers questions like: "what departments do we have?", "which countries have placed orders?", "what product categories exist in the database?". It is also used in joins to collapse repeated rows that result from one-to-many relationships.

DISTINCT with Multiple Columns

When you list multiple columns after SELECT DISTINCT, DISTINCT operates on the combination of all listed columns. A row is a duplicate only if every column value in the row matches another row.

SQL
-- Unique city-country combinations
SELECT DISTINCT city, country
FROM customers
ORDER BY country, city;

-- Example data:
-- (Paris, France), (London, UK), (Paris, France) → removes second Paris/France
-- Returns: (Paris, France), (London, UK)

-- Unique department-role combinations
SELECT DISTINCT department, job_title
FROM employees
ORDER BY department;
-- Returns each unique pairing, even if the same department appears multiple times
-- with different job titles
💡
DISTINCT applies to the entire row

SELECT DISTINCT city, country does not mean "distinct city AND distinct country separately". It means "distinct combinations of (city, country) as a pair". Two rows with the same city but different countries are NOT duplicates and both appear in the result.

Ad – 336×280

COUNT(DISTINCT column)

A very common pattern is using DISTINCT inside the COUNT() aggregate function to count the number of unique values in a column:

SQL
-- How many distinct departments exist?
SELECT COUNT(DISTINCT department) AS num_departments
FROM employees;
-- Returns: 2 (Engineering, Marketing)

-- How many unique customers have placed at least one order?
SELECT COUNT(DISTINCT customer_id) AS active_customers
FROM orders;

-- How many unique products have been sold?
SELECT COUNT(DISTINCT product_id) AS products_sold
FROM order_items;

-- Compare: total rows vs distinct values
SELECT
    COUNT(*)                AS total_employees,
    COUNT(DISTINCT department) AS distinct_departments
FROM employees;
-- total_employees: 5, distinct_departments: 2

DISTINCT vs GROUP BY

For simple deduplication, DISTINCT and GROUP BY on the same columns produce identical results. The choice comes down to intent and whether you need aggregations:

SQL
-- These two queries return the same result:
SELECT DISTINCT department FROM employees ORDER BY department;

SELECT department FROM employees GROUP BY department ORDER BY department;

-- But GROUP BY is more powerful — it can aggregate:
SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
ORDER BY headcount DESC;

-- DISTINCT cannot do this — DISTINCT alone cannot produce aggregates
AspectDISTINCTGROUP BY
Removes duplicatesYesYes
Can aggregate (SUM, COUNT, AVG)NoYes
Can filter groups (HAVING)NoYes
Syntax simplicitySimplerMore explicit
Best forSimple deduplication onlyGrouping with aggregation
ℹ️
Performance: DISTINCT requires a sort or hash

To find unique values, the database must either sort all rows and remove adjacent duplicates, or build a hash table of seen values. Both operations can be expensive on large tables. If you need deduplication frequently on the same column, ensure that column is indexed. For tables with millions of rows, test whether DISTINCT or GROUP BY performs better in your specific database and query plan.

📋 Summary

  • SELECT DISTINCT column — returns only unique values; duplicate rows are removed from the result.
  • With multiple columns, DISTINCT operates on the combination of all columns — a row is a duplicate only if every value matches.
  • COUNT(DISTINCT column) — counts the number of unique values in a column. A common pattern for analytics.
  • DISTINCT and GROUP BY produce the same result for simple deduplication, but GROUP BY is more flexible — it enables aggregation, HAVING filters, and complex grouping.
  • DISTINCT requires a sort or hash operation — it can be slow on large tables with many unique values. Use indexes and test performance.

FAQ

Can I use DISTINCT with SELECT *? +

Yes — SELECT DISTINCT * removes rows where every single column value is identical. In practice, if a table has a primary key (which is unique by definition), SELECT DISTINCT * will never remove any rows because no two rows can be identical when a unique ID column exists. DISTINCT with * is only useful on tables without a unique key, or on result sets derived from joins that produce repeated rows.

How does DISTINCT handle NULL values? +

DISTINCT treats all NULLs as equal to each other. If a column has multiple NULL values, SELECT DISTINCT column returns exactly one NULL in the result. This is an exception to the normal rule that NULL is not equal to NULL in comparison expressions — for DISTINCT deduplication purposes, NULLs are considered duplicates of each other.

Can I use DISTINCT on only one column when selecting multiple? +

Standard SQL does not allow SELECT DISTINCT(col1), col2 to mean "distinct col1 only while also selecting col2". The DISTINCT keyword applies to the entire SELECT list. However, some databases (like MySQL) support GROUP_CONCAT(DISTINCT col) or window functions like ROW_NUMBER() OVER (PARTITION BY col ORDER BY ...) for more nuanced deduplication patterns. For "first row per group" queries, window functions (covered later in this course) are the proper tool.

When should I prefer DISTINCT over GROUP BY? +

Use DISTINCT when your only goal is deduplication and you do not need any aggregation. SELECT DISTINCT department FROM employees is more readable and communicates intent more clearly than SELECT department FROM employees GROUP BY department when you just want the list of unique departments. Use GROUP BY when you need counts, sums, averages, or other aggregate calculations alongside the grouped column.