Tables – Where Data Lives
A table is the fundamental unit of storage in a relational database. Think of it as a spreadsheet: data is organised into rows (also called records or tuples) and columns (also called fields or attributes). Unlike a spreadsheet, however, a database table enforces strict rules — every column has a defined data type, and the database engine validates every row against those rules before accepting it.
| id | first_name | last_name | department | salary |
|---|---|---|---|---|
| 1 | Alice | Johnson | Engineering | 85000 |
| 2 | Bob | Smith | Marketing | 62000 |
| 3 | Carol | Lee | Engineering | 91000 |
Key terminology to know:
- Row / Record — one complete entry in the table (e.g. one employee).
- Column / Field — one attribute of every row (e.g.
salary). - Cell — the intersection of one row and one column — holds a single value.
- NULL — a special marker meaning "no value" or "unknown". Different from zero or an empty string.
Although a database is ultimately stored on disk, you should never think of a table as a file. The database engine manages how data is physically stored — in pages, using B-tree indexes, with write-ahead logs. As a SQL developer, you only interact with the logical table structure. The physical storage is the engine's concern.
Primary Key – The Unique Row Identifier
Every table should have a primary key — a column (or set of columns) whose value is unique for every row and never NULL. The primary key is the definitive way to identify a specific record. Without a primary key, you cannot reliably reference a single row.
-- The id column is the primary key — unique, auto-incremented integer
CREATE TABLE employees (
id INT PRIMARY KEY AUTO_INCREMENT,
first_name VARCHAR(100) NOT NULL,
last_name VARCHAR(100) NOT NULL,
department VARCHAR(100),
salary DECIMAL(10, 2)
);
Best practices for primary keys:
- Use
AUTO_INCREMENT(MySQL) orSERIAL(PostgreSQL) — let the database assign IDs automatically. You never have to worry about duplicates. - Keep it simple — a single integer column named
idis the most common and practical primary key. - Never use user-provided data as a PK — email addresses, usernames, or phone numbers can change. Integers cannot be taken back once assigned.
- Composite primary keys — using two or more columns together as a PK is valid, but usually a sign that a junction table is the right model.
Foreign Key – Linking Tables Together
A foreign key is a column in one table that references the primary key of another table. This link creates a relationship between the two tables and enforces referential integrity — the database ensures you cannot create a foreign key value that points to a row that does not exist.
-- Users table (parent)
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL
);
-- Orders table (child) — references users
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL,
total DECIMAL(10, 2),
order_date DATE,
-- FK: user_id must match an id in the users table
FOREIGN KEY (user_id) REFERENCES users(id)
);
With the foreign key in place, trying to insert an order with user_id = 999 when no user with id = 999 exists will be rejected by the database with an error. Similarly, trying to delete a user who has existing orders will also fail — unless you configure ON DELETE CASCADE to automatically delete related orders.
Relationship Types
Relationships between tables come in three patterns. Understanding them is essential for designing a good schema:
One-to-One (1:1)
Each row in Table A corresponds to exactly one row in Table B, and vice versa. This is relatively rare — often used to split a table for performance reasons or to separate sensitive data (e.g. a users table and a separate user_profiles table with extended personal information).
One-to-Many (1:N) — Most Common
One row in Table A can relate to many rows in Table B, but each row in Table B relates to only one row in Table A. This is the most common relationship in any database. Examples: one user has many orders; one department has many employees; one blog post has many comments.
| users (1) | orders (many) |
|---|---|
| user id=1 (Alice) | order id=101 (user_id=1) |
| order id=102 (user_id=1) | |
| order id=103 (user_id=1) | |
| user id=2 (Bob) | order id=104 (user_id=2) |
Many-to-Many (M:N)
Rows in Table A can relate to many rows in Table B, and rows in Table B can relate to many rows in Table A. This requires a junction table (also called a bridge table or associative table) that holds pairs of foreign keys.
Example: students and courses — a student can enrol in many courses, and a course can have many students.
CREATE TABLE students (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(100) NOT NULL
);
CREATE TABLE courses (
id INT PRIMARY KEY AUTO_INCREMENT,
title VARCHAR(200) NOT NULL
);
-- Junction table: holds student-course pairs
CREATE TABLE enrollments (
student_id INT NOT NULL,
course_id INT NOT NULL,
enrolled_at DATE,
PRIMARY KEY (student_id, course_id), -- composite PK
FOREIGN KEY (student_id) REFERENCES students(id),
FOREIGN KEY (course_id) REFERENCES courses(id)
);
Schemas – The Blueprint
A schema is the complete structural definition of a database — all its tables, columns, data types, constraints, indexes, and relationships. The schema is the blueprint that tells the database how data must be structured before any actual data is stored.
Schemas serve two purposes:
- Enforcement — the database engine validates every INSERT and UPDATE against the schema and rejects data that violates the rules (wrong type, NULL in a NOT NULL column, duplicate primary key, broken foreign key).
- Communication — a schema is also documentation. Reading a schema tells you exactly what data an application stores and how entities relate to each other.
-- Schema for a minimal e-commerce database
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT,
email VARCHAR(255) UNIQUE NOT NULL,
name VARCHAR(100) NOT NULL,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE products (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(200) NOT NULL,
price DECIMAL(10, 2) NOT NULL,
stock INT DEFAULT 0
);
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL,
total DECIMAL(10, 2),
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(id)
);
CREATE TABLE order_items (
id INT PRIMARY KEY AUTO_INCREMENT,
order_id INT NOT NULL,
product_id INT NOT NULL,
quantity INT NOT NULL,
unit_price DECIMAL(10, 2) NOT NULL,
FOREIGN KEY (order_id) REFERENCES orders(id),
FOREIGN KEY (product_id) REFERENCES products(id)
);
Indexes – Speed Up Lookups
As your table grows to millions of rows, queries that filter on a column (e.g. WHERE email = 'alice@example.com') will slow to a crawl if the database has to scan every single row. An index is a separate data structure (usually a B-tree) that the database builds and maintains alongside your table to enable fast lookups by specific column values — like the index at the back of a book.
-- Without an index, this scans every row (full table scan)
SELECT * FROM users WHERE email = 'alice@example.com';
-- Create an index on the email column
CREATE INDEX idx_users_email ON users(email);
-- Now the query uses the index — fast even with millions of rows
SELECT * FROM users WHERE email = 'alice@example.com';
Every time you INSERT, UPDATE, or DELETE a row, the database must also update every index on that table. For tables with many indexes and high write volume, this overhead becomes significant. Index only the columns you actually filter or sort on — the primary key and foreign key columns are indexed automatically in most databases.
📋 Summary
- A table organises data into rows (records) and columns (fields) — the fundamental unit of a relational database.
- A primary key is a column (or columns) uniquely identifying each row. Every table should have one — typically an auto-incrementing integer
id. - A foreign key is a column that references the primary key of another table, creating a relationship and enforcing referential integrity.
- The three relationship types: one-to-one (rare), one-to-many (most common), many-to-many (requires a junction table).
- A schema is the structural blueprint of a database — tables, columns, types, and constraints defined before any data is stored.
- An index speeds up column lookups dramatically on large tables, at the cost of slower writes.
FAQ
A table can have only one primary key constraint, but that primary key can span multiple columns — this is called a composite primary key. For example, in an enrollments junction table, the combination of student_id and course_id together can serve as the primary key. In practice, most tables use a single auto-incrementing integer column as the primary key for simplicity.
By default, the database will reject the deletion with a referential integrity error. You have three options when defining the foreign key: ON DELETE RESTRICT (default — reject the delete), ON DELETE CASCADE (automatically delete all child rows too), and ON DELETE SET NULL (set the foreign key column to NULL in all child rows). Choose the behaviour that matches your application's data model. CASCADE is convenient but can cause unintended mass deletions if used carelessly.
The terminology varies by database system. In MySQL, "database" and "schema" are used interchangeably — CREATE DATABASE and CREATE SCHEMA do the same thing. In PostgreSQL and SQL Server, a database is a top-level container that can hold multiple schemas, which in turn hold tables. In common usage, "schema" usually refers to the structural definition (tables, columns, constraints) of a database, while "database" refers to the entire instance including the data.
Three reasons. First, integers are compact and fast to compare — joining tables on integers is faster than joining on variable-length strings. Second, user-controlled values can change (people change their email address or username), but a primary key must be stable because it is referenced by foreign keys in other tables. Third, auto-incrementing integers are guaranteed unique by the database with no extra logic. Using a natural key (email, username) as a primary key creates fragile schemas that break when values need to change.