Database Concepts – Tables, Keys, Schemas & Relationships

Tables – Where Data Lives

A table is the fundamental unit of storage in a relational database. Think of it as a spreadsheet: data is organised into rows (also called records or tuples) and columns (also called fields or attributes). Unlike a spreadsheet, however, a database table enforces strict rules — every column has a defined data type, and the database engine validates every row against those rules before accepting it.

Example: `employees` table
id	first_name	last_name	department	salary
1	Alice	Johnson	Engineering	85000
2	Bob	Smith	Marketing	62000
3	Carol	Lee	Engineering	91000

Key terminology to know:

Row / Record — one complete entry in the table (e.g. one employee).
Column / Field — one attribute of every row (e.g. salary).
Cell — the intersection of one row and one column — holds a single value.
NULL — a special marker meaning "no value" or "unknown". Different from zero or an empty string.

ℹ️

Tables are not files

Although a database is ultimately stored on disk, you should never think of a table as a file. The database engine manages how data is physically stored — in pages, using B-tree indexes, with write-ahead logs. As a SQL developer, you only interact with the logical table structure. The physical storage is the engine's concern.

Primary Key – The Unique Row Identifier

Every table should have a primary key — a column (or set of columns) whose value is unique for every row and never NULL. The primary key is the definitive way to identify a specific record. Without a primary key, you cannot reliably reference a single row.

SQL

-- The id column is the primary key — unique, auto-incremented integer
CREATE TABLE employees (
    id         INT          PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(100) NOT NULL,
    last_name  VARCHAR(100) NOT NULL,
    department VARCHAR(100),
    salary     DECIMAL(10, 2)
);

Best practices for primary keys:

Use AUTO_INCREMENT (MySQL) or SERIAL (PostgreSQL) — let the database assign IDs automatically. You never have to worry about duplicates.
Keep it simple — a single integer column named id is the most common and practical primary key.
Never use user-provided data as a PK — email addresses, usernames, or phone numbers can change. Integers cannot be taken back once assigned.
Composite primary keys — using two or more columns together as a PK is valid, but usually a sign that a junction table is the right model.

Foreign Key – Linking Tables Together

A foreign key is a column in one table that references the primary key of another table. This link creates a relationship between the two tables and enforces referential integrity — the database ensures you cannot create a foreign key value that points to a row that does not exist.

SQL

-- Users table (parent)
CREATE TABLE users (
    id    INT         PRIMARY KEY AUTO_INCREMENT,
    name  VARCHAR(100) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL
);

-- Orders table (child) — references users
CREATE TABLE orders (
    id         INT          PRIMARY KEY AUTO_INCREMENT,
    user_id    INT          NOT NULL,
    total      DECIMAL(10, 2),
    order_date DATE,
    -- FK: user_id must match an id in the users table
    FOREIGN KEY (user_id) REFERENCES users(id)
);

💡

Referential integrity in action

With the foreign key in place, trying to insert an order with user_id = 999 when no user with id = 999 exists will be rejected by the database with an error. Similarly, trying to delete a user who has existing orders will also fail — unless you configure ON DELETE CASCADE to automatically delete related orders.

Relationship Types

Relationships between tables come in three patterns. Understanding them is essential for designing a good schema:

One-to-One (1:1)

Each row in Table A corresponds to exactly one row in Table B, and vice versa. This is relatively rare — often used to split a table for performance reasons or to separate sensitive data (e.g. a users table and a separate user_profiles table with extended personal information).

One-to-Many (1:N) — Most Common

One row in Table A can relate to many rows in Table B, but each row in Table B relates to only one row in Table A. This is the most common relationship in any database. Examples: one user has many orders; one department has many employees; one blog post has many comments.

users (1)	orders (many)
user id=1 (Alice)	order id=101 (user_id=1)
	order id=102 (user_id=1)
	order id=103 (user_id=1)
user id=2 (Bob)	order id=104 (user_id=2)

Many-to-Many (M:N)

Rows in Table A can relate to many rows in Table B, and rows in Table B can relate to many rows in Table A. This requires a junction table (also called a bridge table or associative table) that holds pairs of foreign keys.

Example: students and courses — a student can enrol in many courses, and a course can have many students.

SQL

CREATE TABLE students (
    id   INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE courses (
    id    INT PRIMARY KEY AUTO_INCREMENT,
    title VARCHAR(200) NOT NULL
);

-- Junction table: holds student-course pairs
CREATE TABLE enrollments (
    student_id INT NOT NULL,
    course_id  INT NOT NULL,
    enrolled_at DATE,
    PRIMARY KEY (student_id, course_id),          -- composite PK
    FOREIGN KEY (student_id) REFERENCES students(id),
    FOREIGN KEY (course_id)  REFERENCES courses(id)
);

Schemas – The Blueprint

A schema is the complete structural definition of a database — all its tables, columns, data types, constraints, indexes, and relationships. The schema is the blueprint that tells the database how data must be structured before any actual data is stored.

Schemas serve two purposes:

Enforcement — the database engine validates every INSERT and UPDATE against the schema and rejects data that violates the rules (wrong type, NULL in a NOT NULL column, duplicate primary key, broken foreign key).
Communication — a schema is also documentation. Reading a schema tells you exactly what data an application stores and how entities relate to each other.

SQL – complete schema example

-- Schema for a minimal e-commerce database
CREATE TABLE users (
    id         INT          PRIMARY KEY AUTO_INCREMENT,
    email      VARCHAR(255) UNIQUE NOT NULL,
    name       VARCHAR(100) NOT NULL,
    created_at DATETIME     DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE products (
    id          INT            PRIMARY KEY AUTO_INCREMENT,
    name        VARCHAR(200)   NOT NULL,
    price       DECIMAL(10, 2) NOT NULL,
    stock       INT            DEFAULT 0
);

CREATE TABLE orders (
    id         INT            PRIMARY KEY AUTO_INCREMENT,
    user_id    INT            NOT NULL,
    total      DECIMAL(10, 2),
    created_at DATETIME       DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (user_id) REFERENCES users(id)
);

CREATE TABLE order_items (
    id         INT            PRIMARY KEY AUTO_INCREMENT,
    order_id   INT            NOT NULL,
    product_id INT            NOT NULL,
    quantity   INT            NOT NULL,
    unit_price DECIMAL(10, 2) NOT NULL,
    FOREIGN KEY (order_id)   REFERENCES orders(id),
    FOREIGN KEY (product_id) REFERENCES products(id)
);

Indexes – Speed Up Lookups

As your table grows to millions of rows, queries that filter on a column (e.g. WHERE email = 'alice@example.com') will slow to a crawl if the database has to scan every single row. An index is a separate data structure (usually a B-tree) that the database builds and maintains alongside your table to enable fast lookups by specific column values — like the index at the back of a book.

SQL

-- Without an index, this scans every row (full table scan)
SELECT * FROM users WHERE email = 'alice@example.com';

-- Create an index on the email column
CREATE INDEX idx_users_email ON users(email);

-- Now the query uses the index — fast even with millions of rows
SELECT * FROM users WHERE email = 'alice@example.com';

⚠️

Indexes speed up reads but slow down writes

Every time you INSERT, UPDATE, or DELETE a row, the database must also update every index on that table. For tables with many indexes and high write volume, this overhead becomes significant. Index only the columns you actually filter or sort on — the primary key and foreign key columns are indexed automatically in most databases.

Core Database Concepts: Tables, Keys, Relationships

Before writing queries, you need the mental model of a relational database: data lives in tables (rows and columns), rows are identified by keys, and tables connect through relationships.

Concept	Is
Table	a grid: columns (fields) × rows (records)
Primary key	column that uniquely identifies each row
Foreign key	column pointing to another table's primary key
Relationship	one-to-many, many-to-many, one-to-one

customers(id PK, name)
orders(id PK, customer_id FK → customers.id, total)
-- one customer → many orders (a one-to-many relationship)

The relational idea: instead of repeating a customer's name and address in every order (wasteful and error-prone), you store customers once and have each order reference the customer by a foreign key. This is normalization — each fact lives in exactly one place, so updating it is a single change. Keys make it work: a primary key (often an auto-incrementing id) uniquely identifies each row; a foreign key stores another table's primary key to link them, and the database enforces that the referenced row exists (no orphans). Relationship types: one-to-many (a customer has many orders) is the most common; many-to-many (students ↔ courses) needs a join/bridge table; one-to-one is rarer. Why it matters: understanding these relationships is what lets you design sane schemas and write correct joins — the whole point of SQL is recombining this deliberately-split data on demand.

🏋️ Practical Exercise

Identify a table, a row, and a column in a sample dataset.
Define a primary key for a table.
Identify a foreign-key relationship.
Sketch a simple two-table schema.
Note the difference between a database and a table.

🔥 Challenge Exercise

Design a minimal schema for a blog: a users table and a posts table, each with a primary key and a foreign key linking posts to users. Explain the roles of the primary key, the foreign key, and the relationship between the tables.

📋 Summary

A table organises data into rows (records) and columns (fields) — the fundamental unit of a relational database.
A primary key is a column (or columns) uniquely identifying each row. Every table should have one — typically an auto-incrementing integer id.
A foreign key is a column that references the primary key of another table, creating a relationship and enforcing referential integrity.
The three relationship types: one-to-one (rare), one-to-many (most common), many-to-many (requires a junction table).
A schema is the structural blueprint of a database — tables, columns, types, and constraints defined before any data is stored.
An index speeds up column lookups dramatically on large tables, at the cost of slower writes.

Interview Questions

What is a relational database?
What is the difference between a table, a row, and a column?
What is a primary key?
What is a foreign key?
What does a one-to-many relationship mean?

FAQ

Can a table have more than one primary key? +

A table can have only one primary key constraint, but that primary key can span multiple columns — this is called a composite primary key. For example, in an enrollments junction table, the combination of student_id and course_id together can serve as the primary key. In practice, most tables use a single auto-incrementing integer column as the primary key for simplicity.

What happens if I delete a row that other rows reference via a foreign key? +

By default, the database will reject the deletion with a referential integrity error. You have three options when defining the foreign key: ON DELETE RESTRICT (default — reject the delete), ON DELETE CASCADE (automatically delete all child rows too), and ON DELETE SET NULL (set the foreign key column to NULL in all child rows). Choose the behaviour that matches your application's data model. CASCADE is convenient but can cause unintended mass deletions if used carelessly.

What is the difference between a schema and a database? +

The terminology varies by database system. In MySQL, "database" and "schema" are used interchangeably — CREATE DATABASE and CREATE SCHEMA do the same thing. In PostgreSQL and SQL Server, a database is a top-level container that can hold multiple schemas, which in turn hold tables. In common usage, "schema" usually refers to the structural definition (tables, columns, constraints) of a database, while "database" refers to the entire instance including the data.

Why is the primary key usually an integer and not the email or username? +

Three reasons. First, integers are compact and fast to compare — joining tables on integers is faster than joining on variable-length strings. Second, user-controlled values can change (people change their email address or username), but a primary key must be stable because it is referenced by foreign keys in other tables. Third, auto-incrementing integers are guaranteed unique by the database with no extra logic. Using a natural key (email, username) as a primary key creates fragile schemas that break when values need to change.