Project Overview
An e-commerce platform needs to track:
- Users β registered customers with email and signup date.
- Categories β product taxonomy (Electronics, Books, Clothing, etc.).
- Products β items for sale with price, category, and stock.
- Orders β a customer's checkout event with a total and status.
- Order Items β the individual lines within an order (product, quantity, unit price at time of purchase).
Schema Design
| Table | PK | Key Columns | Foreign Keys |
|---|---|---|---|
| categories | id | name, parent_id (self-referential) | parent_id β categories |
| users | id | email, name, created_at | β |
| products | id | name, price, category_id, stock | category_id β categories |
| orders | id | user_id, status, total, created_at | user_id β users |
| order_items | (order_id, product_id) | quantity, unit_price | order_id β orders, product_id β products |
Create Tables
CREATE TABLE categories (
id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
parent_id INT,
FOREIGN KEY (parent_id) REFERENCES categories(id)
);
CREATE TABLE users (
id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(150) UNIQUE NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE products (
id INT PRIMARY KEY,
name VARCHAR(200) NOT NULL,
category_id INT NOT NULL,
price DECIMAL(10,2) NOT NULL,
stock INT NOT NULL DEFAULT 0,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (category_id) REFERENCES categories(id),
CHECK (price >= 0),
CHECK (stock >= 0)
);
CREATE TABLE orders (
id INT PRIMARY KEY,
user_id INT NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'pending',
total DECIMAL(10,2) NOT NULL DEFAULT 0,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(id),
CHECK (status IN ('pending','processing','shipped','delivered','cancelled'))
);
CREATE TABLE order_items (
order_id INT NOT NULL,
product_id INT NOT NULL,
quantity INT NOT NULL,
unit_price DECIMAL(10,2) NOT NULL,
PRIMARY KEY (order_id, product_id),
FOREIGN KEY (order_id) REFERENCES orders(id) ON DELETE CASCADE,
FOREIGN KEY (product_id) REFERENCES products(id) ON DELETE RESTRICT,
CHECK (quantity > 0)
);
-- Index FK columns for JOIN performance
CREATE INDEX idx_products_category ON products(category_id);
CREATE INDEX idx_orders_user ON orders(user_id);
CREATE INDEX idx_orders_created ON orders(created_at);
CREATE INDEX idx_items_product ON order_items(product_id);
Sample Data
INSERT INTO categories (id, name, parent_id) VALUES
(1, 'Electronics', NULL),
(2, 'Laptops', 1),
(3, 'Phones', 1),
(4, 'Books', NULL),
(5, 'Clothing', NULL);
INSERT INTO users (id, name, email, created_at) VALUES
(1, 'Alice', 'alice@example.com', '2023-01-10 09:00:00'),
(2, 'Bob', 'bob@example.com', '2023-03-22 14:30:00'),
(3, 'Carol', 'carol@example.com', '2023-06-15 11:00:00'),
(4, 'Dave', 'dave@example.com', '2023-09-01 08:00:00'),
(5, 'Eve', 'eve@example.com', '2024-01-05 16:00:00');
INSERT INTO products (id, name, category_id, price, stock) VALUES
(1, 'MacBook Pro 16"', 2, 2499.00, 15),
(2, 'Dell XPS 15', 2, 1899.00, 20),
(3, 'iPhone 15 Pro', 3, 999.00, 50),
(4, 'Samsung Galaxy S24', 3, 849.00, 45),
(5, 'Clean Code (book)', 4, 35.00, 100),
(6, 'Dune (book)', 4, 18.00, 80),
(7, 'T-Shirt (Medium)', 5, 25.00, 200),
(8, 'Jeans (32x32)', 5, 65.00, 150),
(9, 'iPad Pro 12.9"', 1, 1099.00, 30),
(10, 'Wireless Keyboard', 1, 129.00, 75);
INSERT INTO orders (id, user_id, status, total, created_at) VALUES
(1, 1, 'delivered', 2534.00, '2024-01-15 10:00:00'),
(2, 1, 'delivered', 1899.00, '2024-02-20 11:00:00'),
(3, 2, 'delivered', 1048.00, '2024-01-25 09:30:00'),
(4, 2, 'shipped', 107.00, '2024-03-05 14:00:00'),
(5, 3, 'delivered', 2499.00, '2024-02-10 08:00:00'),
(6, 3, 'processing', 194.00, '2024-03-18 15:00:00'),
(7, 4, 'delivered', 848.00, '2024-03-01 12:00:00'),
(8, 5, 'pending', 1228.00, '2024-03-20 16:30:00');
INSERT INTO order_items (order_id, product_id, quantity, unit_price) VALUES
(1, 1, 1, 2499.00), (1, 10, 1, 129.00), -- Alice: MacBook + Keyboard
(2, 2, 1, 1899.00), -- Alice: Dell XPS
(3, 3, 1, 999.00), (3, 5, 1, 35.00), (3, 7, 1, 14.00),
(4, 6, 2, 18.00), (4, 7, 2, 25.00), (4, 10, 1, 21.00),
(5, 1, 1, 2499.00), -- Carol: MacBook
(6, 5, 2, 35.00), (6, 6, 2, 18.00), (6, 8, 1, 65.00), (6, 7, 1, 25.00),
(7, 4, 1, 849.00),
(8, 9, 1, 1099.00), (8, 5, 2, 35.00), (8, 7, 2, 25.00), (8, 6, 1, 9.00);
JOIN Queries
Full order details with line items
SELECT
o.id AS order_id,
u.name AS customer,
p.name AS product,
c.name AS category,
oi.quantity,
oi.unit_price,
oi.quantity * oi.unit_price AS line_total,
o.status,
o.created_at
FROM orders o
JOIN users u ON u.id = o.user_id
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.id = oi.product_id
JOIN categories c ON c.id = p.category_id
ORDER BY o.id, p.name;
Customers with more than one order
SELECT
u.name,
u.email,
COUNT(o.id) AS order_count,
SUM(o.total) AS lifetime_value
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name, u.email
HAVING COUNT(o.id) > 1
ORDER BY lifetime_value DESC;
Analytics Queries
Top-selling products by revenue
SELECT
p.name AS product,
c.name AS category,
SUM(oi.quantity) AS units_sold,
SUM(oi.quantity * oi.unit_price) AS total_revenue,
RANK() OVER (ORDER BY SUM(oi.quantity * oi.unit_price) DESC) AS revenue_rank
FROM order_items oi
JOIN products p ON p.id = oi.product_id
JOIN categories c ON c.id = p.category_id
JOIN orders o ON o.id = oi.order_id
WHERE o.status != 'cancelled'
GROUP BY p.id, p.name, c.name
ORDER BY total_revenue DESC;
Revenue by category
SELECT
c.name AS category,
COUNT(DISTINCT o.id) AS order_count,
SUM(oi.quantity * oi.unit_price) AS revenue,
ROUND(
100.0 * SUM(oi.quantity * oi.unit_price) /
SUM(SUM(oi.quantity * oi.unit_price)) OVER (), 2
) AS pct_of_total
FROM order_items oi
JOIN products p ON p.id = oi.product_id
JOIN categories c ON c.id = p.category_id
JOIN orders o ON o.id = oi.order_id
WHERE o.status != 'cancelled'
GROUP BY c.id, c.name
ORDER BY revenue DESC;
Average order value and order frequency per customer
SELECT
u.name,
COUNT(o.id) AS total_orders,
ROUND(AVG(o.total), 2) AS avg_order_value,
ROUND(SUM(o.total), 2) AS total_spent,
MIN(o.created_at::DATE) AS first_order,
MAX(o.created_at::DATE) AS last_order
FROM users u
LEFT JOIN orders o ON o.user_id = u.id AND o.status != 'cancelled'
GROUP BY u.id, u.name
ORDER BY total_spent DESC NULLS LAST;
Monthly sales trend
SELECT
DATE_TRUNC('month', o.created_at) AS month,
COUNT(DISTINCT o.id) AS orders,
COUNT(DISTINCT o.user_id) AS unique_customers,
SUM(o.total) AS revenue,
ROUND(AVG(o.total), 2) AS avg_order_value
FROM orders o
WHERE o.status != 'cancelled'
GROUP BY 1
ORDER BY 1;
Summary
π Summary
- Designed a 5-table normalized e-commerce schema: categories, users, products, orders, order_items.
- Used composite PK on order_items, CHECK constraints on status/price/stock, ON DELETE CASCADE for items.
- Indexed all FK columns and the orders.created_at column for time-series queries.
- Wrote 4-table JOINs to produce full order detail reports.
- Analytical queries: top-selling products with window RANK, revenue by category with percentage share, monthly trend with DATE_TRUNC.
- This dataset is used as-is in the next project: Analytics Queries.
FAQ
Product prices change over time. If you look up the current price from the products table when querying historical orders, the order total will appear wrong after any price update. Storing unit_price in order_items captures the price at the moment of purchase β this is a standard e-commerce schema pattern. The products table price is used only for the current listing price; the order_items price is the historical record.
If you delete a row from orders, all related rows in order_items are automatically deleted. This is appropriate here because order items have no meaning without the parent order. The opposite direction (products) uses ON DELETE RESTRICT β you cannot delete a product that has been ordered, preserving the historical record. Always think carefully about which direction CASCADE is appropriate.
parent_id creates a self-referential hierarchy in the categories table. "Laptops" (parent_id = 1) and "Phones" (parent_id = 1) are subcategories of "Electronics" (parent_id = NULL). This allows multiple levels of categorization without creating separate tables for each level. To query the full category hierarchy, use a recursive CTE β the same pattern as the employee hierarchy in the Exercises page.
For a production system: (1) Store all amounts as integers in the smallest currency unit (cents/pence) to avoid floating-point rounding errors β $9.99 stored as 999. (2) Add a currency column or a global currency setting. (3) DECIMAL(10,2) is acceptable for learning projects and small-scale apps, but be aware of rounding at scale. Never use FLOAT for money in any production system β floating-point precision issues will eventually cause calculation errors.