Ad – 728Γ—90
πŸ“Š Grouping & Aggregation

SQL GROUP BY – Grouping Rows for Aggregation

GROUP BY is one of the most powerful clauses in SQL. It collapses all rows that share the same value in a column into a single group, letting aggregate functions like COUNT, SUM, AVG, MIN, and MAX compute one result per group instead of one result for the whole table.

⏱️ 20 min read 🎯 Intermediate πŸ“… Updated 2026

GROUP BY Syntax

The GROUP BY clause comes after WHERE and before ORDER BY in a SELECT statement:

SQL
SELECT   column1, aggregate_function(column2)
FROM     table_name
WHERE    condition          -- optional: filters rows BEFORE grouping
GROUP BY column1
ORDER BY column1;           -- optional
πŸ’‘
Every non-aggregate column in SELECT must appear in GROUP BY

This is the fundamental rule. If you select department and COUNT(*), then department must be in the GROUP BY clause. Aggregate functions like COUNT, SUM, AVG are exempt β€” they summarise the group.

How Grouping Works

Think of GROUP BY as sorting the rows into buckets. Rows with the same value in the GROUP BY column are placed in the same bucket. Then each aggregate function runs once per bucket, producing one output row per group.

Given this employees table:

idnamedepartmentsalary
1AliceEngineering90000
2BobEngineering85000
3CarolMarketing72000
4DaveMarketing68000
5EveHR60000

Using GROUP BY with Aggregate Functions

Count employees and compute average salary per department:

SQL
SELECT
    department,
    COUNT(*)        AS headcount,
    AVG(salary)     AS avg_salary,
    SUM(salary)     AS total_payroll,
    MIN(salary)     AS lowest_salary,
    MAX(salary)     AS highest_salary
FROM employees
GROUP BY department
ORDER BY headcount DESC;
β–Ά Result
Engineering β†’ headcount: 2, avg_salary: 87500, total_payroll: 175000
Marketing β†’ headcount: 2, avg_salary: 70000, total_payroll: 140000
HR β†’ headcount: 1, avg_salary: 60000, total_payroll: 60000
ℹ️
COUNT(*) vs COUNT(column)

COUNT(*) counts every row in the group including NULLs. COUNT(column) counts only non-NULL values in that column. Use COUNT(*) for headcounts and COUNT(column) when you want to know how many rows have a value in a specific field.

Ad – 336Γ—280

Grouping by Multiple Columns

You can group by more than one column. Each unique combination of the grouped columns forms its own group:

SQL
SELECT
    department,
    job_title,
    COUNT(*)    AS headcount,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department, job_title
ORDER BY department, job_title;

This groups by the combination of department AND job_title. "Engineering / Senior Engineer" and "Engineering / Junior Engineer" become separate groups even though they share the same department.

SQL
-- Sales totals per year and month
SELECT
    YEAR(order_date)  AS yr,
    MONTH(order_date) AS mo,
    SUM(amount)       AS monthly_sales
FROM orders
GROUP BY YEAR(order_date), MONTH(order_date)
ORDER BY yr, mo;

GROUP BY Rules

Follow these rules to write correct GROUP BY queries:

RuleExplanation
Non-aggregate = GROUP BY Every column in SELECT that is not inside an aggregate function must appear in the GROUP BY list.
NULL forms its own group All rows where the GROUP BY column is NULL are placed in a single NULL group.
HAVING for post-group filtering To filter groups (e.g., only departments with more than 5 people), use HAVING, not WHERE.
ORDER BY can use aliases You can ORDER BY an aggregate alias like ORDER BY headcount DESC even though it was defined in SELECT.
⚠️
NULL grouping behaviour

If the GROUP BY column contains NULLs, all NULL rows are grouped together as one group. The aggregate functions compute over those NULL rows just like any other group. Keep this in mind when your data has optional columns.

SQL
-- Demonstrate NULL group: employees with no department assigned
SELECT
    department,          -- will show NULL for unassigned employees
    COUNT(*) AS headcount
FROM employees
GROUP BY department;
-- The NULL group appears as a row where department IS NULL

Summary

πŸ“‹ Summary

  • GROUP BY collapses all rows sharing the same column value into a single output row.
  • Aggregate functions (COUNT, SUM, AVG, MIN, MAX) compute one result per group.
  • Every non-aggregate column in SELECT must also appear in GROUP BY.
  • You can group by multiple columns β€” each unique combination becomes a separate group.
  • NULL values form their own group.
  • Use HAVING (not WHERE) to filter groups after aggregation.

FAQ

Can I use a WHERE clause with GROUP BY? +

Yes. WHERE filters individual rows BEFORE grouping occurs. Only rows that pass the WHERE condition are included in the groups. To filter groups themselves (after aggregation), use HAVING. The order is: FROM β†’ WHERE β†’ GROUP BY β†’ HAVING β†’ SELECT β†’ ORDER BY.

What happens if I GROUP BY a column not in SELECT? +

That is perfectly valid. You can GROUP BY a column without including it in the SELECT list. The grouping still happens by that column β€” you just do not see it in the output. This is useful when you want to group by a technical key but only display a human-readable label.

Can I GROUP BY an expression or function result? +

Yes. You can GROUP BY any scalar expression: GROUP BY YEAR(order_date), GROUP BY UPPER(country), or GROUP BY price * quantity. In PostgreSQL and some other databases you can also use a SELECT alias in GROUP BY, but standard SQL requires repeating the expression.

Is GROUP BY the same as ORDER BY? +

No. GROUP BY organises rows into groups for aggregation β€” it changes the number of output rows. ORDER BY just sorts the output rows without changing them. A GROUP BY query does not guarantee any particular sort order β€” add an explicit ORDER BY if you need sorted results.