Ad – 728×90
🗄️ Introduction

Database Concepts – Tables, Keys, Schemas & Relationships

Before writing complex SQL queries, you need to understand how relational databases are organised. This lesson covers the fundamental building blocks: tables (where data lives), primary keys (unique row identifiers), foreign keys (links between tables), the three relationship types (one-to-one, one-to-many, many-to-many), schemas (the blueprint), and a brief introduction to indexes. These concepts underpin everything you will do in SQL.

⏱️ 20 min read 🎯 Beginner 📅 Updated 2026

Tables – Where Data Lives

A table is the fundamental unit of storage in a relational database. Think of it as a spreadsheet: data is organised into rows (also called records or tuples) and columns (also called fields or attributes). Unlike a spreadsheet, however, a database table enforces strict rules — every column has a defined data type, and the database engine validates every row against those rules before accepting it.

Example: employees table
idfirst_namelast_namedepartmentsalary
1AliceJohnsonEngineering85000
2BobSmithMarketing62000
3CarolLeeEngineering91000

Key terminology to know:

  • Row / Record — one complete entry in the table (e.g. one employee).
  • Column / Field — one attribute of every row (e.g. salary).
  • Cell — the intersection of one row and one column — holds a single value.
  • NULL — a special marker meaning "no value" or "unknown". Different from zero or an empty string.
ℹ️
Tables are not files

Although a database is ultimately stored on disk, you should never think of a table as a file. The database engine manages how data is physically stored — in pages, using B-tree indexes, with write-ahead logs. As a SQL developer, you only interact with the logical table structure. The physical storage is the engine's concern.

Primary Key – The Unique Row Identifier

Every table should have a primary key — a column (or set of columns) whose value is unique for every row and never NULL. The primary key is the definitive way to identify a specific record. Without a primary key, you cannot reliably reference a single row.

SQL
-- The id column is the primary key — unique, auto-incremented integer
CREATE TABLE employees (
    id         INT          PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(100) NOT NULL,
    last_name  VARCHAR(100) NOT NULL,
    department VARCHAR(100),
    salary     DECIMAL(10, 2)
);

Best practices for primary keys:

  • Use AUTO_INCREMENT (MySQL) or SERIAL (PostgreSQL) — let the database assign IDs automatically. You never have to worry about duplicates.
  • Keep it simple — a single integer column named id is the most common and practical primary key.
  • Never use user-provided data as a PK — email addresses, usernames, or phone numbers can change. Integers cannot be taken back once assigned.
  • Composite primary keys — using two or more columns together as a PK is valid, but usually a sign that a junction table is the right model.

Foreign Key – Linking Tables Together

A foreign key is a column in one table that references the primary key of another table. This link creates a relationship between the two tables and enforces referential integrity — the database ensures you cannot create a foreign key value that points to a row that does not exist.

SQL
-- Users table (parent)
CREATE TABLE users (
    id    INT         PRIMARY KEY AUTO_INCREMENT,
    name  VARCHAR(100) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL
);

-- Orders table (child) — references users
CREATE TABLE orders (
    id         INT          PRIMARY KEY AUTO_INCREMENT,
    user_id    INT          NOT NULL,
    total      DECIMAL(10, 2),
    order_date DATE,
    -- FK: user_id must match an id in the users table
    FOREIGN KEY (user_id) REFERENCES users(id)
);
💡
Referential integrity in action

With the foreign key in place, trying to insert an order with user_id = 999 when no user with id = 999 exists will be rejected by the database with an error. Similarly, trying to delete a user who has existing orders will also fail — unless you configure ON DELETE CASCADE to automatically delete related orders.

Relationship Types

Relationships between tables come in three patterns. Understanding them is essential for designing a good schema:

One-to-One (1:1)

Each row in Table A corresponds to exactly one row in Table B, and vice versa. This is relatively rare — often used to split a table for performance reasons or to separate sensitive data (e.g. a users table and a separate user_profiles table with extended personal information).

One-to-Many (1:N) — Most Common

One row in Table A can relate to many rows in Table B, but each row in Table B relates to only one row in Table A. This is the most common relationship in any database. Examples: one user has many orders; one department has many employees; one blog post has many comments.

users (1)orders (many)
user id=1 (Alice)order id=101 (user_id=1)
order id=102 (user_id=1)
order id=103 (user_id=1)
user id=2 (Bob)order id=104 (user_id=2)

Many-to-Many (M:N)

Rows in Table A can relate to many rows in Table B, and rows in Table B can relate to many rows in Table A. This requires a junction table (also called a bridge table or associative table) that holds pairs of foreign keys.

Example: students and courses — a student can enrol in many courses, and a course can have many students.

SQL
CREATE TABLE students (
    id   INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE courses (
    id    INT PRIMARY KEY AUTO_INCREMENT,
    title VARCHAR(200) NOT NULL
);

-- Junction table: holds student-course pairs
CREATE TABLE enrollments (
    student_id INT NOT NULL,
    course_id  INT NOT NULL,
    enrolled_at DATE,
    PRIMARY KEY (student_id, course_id),          -- composite PK
    FOREIGN KEY (student_id) REFERENCES students(id),
    FOREIGN KEY (course_id)  REFERENCES courses(id)
);
Ad – 336×280

Schemas – The Blueprint

A schema is the complete structural definition of a database — all its tables, columns, data types, constraints, indexes, and relationships. The schema is the blueprint that tells the database how data must be structured before any actual data is stored.

Schemas serve two purposes:

  • Enforcement — the database engine validates every INSERT and UPDATE against the schema and rejects data that violates the rules (wrong type, NULL in a NOT NULL column, duplicate primary key, broken foreign key).
  • Communication — a schema is also documentation. Reading a schema tells you exactly what data an application stores and how entities relate to each other.
SQL – complete schema example
-- Schema for a minimal e-commerce database
CREATE TABLE users (
    id         INT          PRIMARY KEY AUTO_INCREMENT,
    email      VARCHAR(255) UNIQUE NOT NULL,
    name       VARCHAR(100) NOT NULL,
    created_at DATETIME     DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE products (
    id          INT            PRIMARY KEY AUTO_INCREMENT,
    name        VARCHAR(200)   NOT NULL,
    price       DECIMAL(10, 2) NOT NULL,
    stock       INT            DEFAULT 0
);

CREATE TABLE orders (
    id         INT            PRIMARY KEY AUTO_INCREMENT,
    user_id    INT            NOT NULL,
    total      DECIMAL(10, 2),
    created_at DATETIME       DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (user_id) REFERENCES users(id)
);

CREATE TABLE order_items (
    id         INT            PRIMARY KEY AUTO_INCREMENT,
    order_id   INT            NOT NULL,
    product_id INT            NOT NULL,
    quantity   INT            NOT NULL,
    unit_price DECIMAL(10, 2) NOT NULL,
    FOREIGN KEY (order_id)   REFERENCES orders(id),
    FOREIGN KEY (product_id) REFERENCES products(id)
);

Indexes – Speed Up Lookups

As your table grows to millions of rows, queries that filter on a column (e.g. WHERE email = 'alice@example.com') will slow to a crawl if the database has to scan every single row. An index is a separate data structure (usually a B-tree) that the database builds and maintains alongside your table to enable fast lookups by specific column values — like the index at the back of a book.

SQL
-- Without an index, this scans every row (full table scan)
SELECT * FROM users WHERE email = 'alice@example.com';

-- Create an index on the email column
CREATE INDEX idx_users_email ON users(email);

-- Now the query uses the index — fast even with millions of rows
SELECT * FROM users WHERE email = 'alice@example.com';
⚠️
Indexes speed up reads but slow down writes

Every time you INSERT, UPDATE, or DELETE a row, the database must also update every index on that table. For tables with many indexes and high write volume, this overhead becomes significant. Index only the columns you actually filter or sort on — the primary key and foreign key columns are indexed automatically in most databases.

📋 Summary

  • A table organises data into rows (records) and columns (fields) — the fundamental unit of a relational database.
  • A primary key is a column (or columns) uniquely identifying each row. Every table should have one — typically an auto-incrementing integer id.
  • A foreign key is a column that references the primary key of another table, creating a relationship and enforcing referential integrity.
  • The three relationship types: one-to-one (rare), one-to-many (most common), many-to-many (requires a junction table).
  • A schema is the structural blueprint of a database — tables, columns, types, and constraints defined before any data is stored.
  • An index speeds up column lookups dramatically on large tables, at the cost of slower writes.

FAQ

Can a table have more than one primary key? +

A table can have only one primary key constraint, but that primary key can span multiple columns — this is called a composite primary key. For example, in an enrollments junction table, the combination of student_id and course_id together can serve as the primary key. In practice, most tables use a single auto-incrementing integer column as the primary key for simplicity.

What happens if I delete a row that other rows reference via a foreign key? +

By default, the database will reject the deletion with a referential integrity error. You have three options when defining the foreign key: ON DELETE RESTRICT (default — reject the delete), ON DELETE CASCADE (automatically delete all child rows too), and ON DELETE SET NULL (set the foreign key column to NULL in all child rows). Choose the behaviour that matches your application's data model. CASCADE is convenient but can cause unintended mass deletions if used carelessly.

What is the difference between a schema and a database? +

The terminology varies by database system. In MySQL, "database" and "schema" are used interchangeably — CREATE DATABASE and CREATE SCHEMA do the same thing. In PostgreSQL and SQL Server, a database is a top-level container that can hold multiple schemas, which in turn hold tables. In common usage, "schema" usually refers to the structural definition (tables, columns, constraints) of a database, while "database" refers to the entire instance including the data.

Why is the primary key usually an integer and not the email or username? +

Three reasons. First, integers are compact and fast to compare — joining tables on integers is faster than joining on variable-length strings. Second, user-controlled values can change (people change their email address or username), but a primary key must be stable because it is referenced by foreign keys in other tables. Third, auto-incrementing integers are guaranteed unique by the database with no extra logic. Using a natural key (email, username) as a primary key creates fragile schemas that break when values need to change.