Python List Comprehensions – Concise & Powerful

Introduction – The Problem They Solve

Suppose you want to create a list of squares of numbers from 1 to 10. The traditional approach uses a for loop:

Python

# Traditional for loop
squares = []
for n in range(1, 11):
    squares.append(n ** 2)
print(squares)
# [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

# Same result with a list comprehension – one line!
squares = [n ** 2 for n in range(1, 11)]
print(squares)
# [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

▶ Output

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

The list comprehension is shorter, more expressive, and often faster than the equivalent loop.

The Basic Syntax

The general form of a list comprehension is:

Python

# General form (pseudocode, not runnable as written):
#   result = [expression for variable in iterable]

# Common patterns:
words = ["hello", "world", "python"]

# 1. Transform each element
upper = [w.upper() for w in words]
print(upper)   # ['HELLO', 'WORLD', 'PYTHON']

# 2. Apply a function
lengths = [len(w) for w in words]
print(lengths)  # [5, 5, 6]

# 3. From any iterable
chars = [c for c in "Python"]
print(chars)    # ['P', 'y', 't', 'h', 'o', 'n']

# 4. From a range
evens = [x * 2 for x in range(10)]
print(evens)    # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Comprehensions with Conditions

Add an if clause after the for clause to filter elements. Only items for which the condition is true are included:

Python

# [expression  for  variable  in  iterable  if  condition]

numbers = range(1, 21)

# Only even numbers
evens = [n for n in numbers if n % 2 == 0]
print(evens)   # [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

# Only odd numbers > 10
big_odds = [n for n in numbers if n % 2 != 0 and n > 10]
print(big_odds)  # [11, 13, 15, 17, 19]

# Filter strings by length
words = ["cat", "elephant", "ant", "butterfly", "bee", "hippopotamus"]
long_words = [w for w in words if len(w) > 4]
print(long_words)  # ['elephant', 'butterfly', 'hippopotamus']

# Filter, then aggregate the result
scores = [45, 88, 92, 31, 70, 65, 99, 12]
passing_grades = [s for s in scores if s >= 60]
print(passing_grades)   # [88, 92, 70, 65, 99]
print(f"Average of passing: {sum(passing_grades)/len(passing_grades):.1f}")

▶ Output

[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
[11, 13, 15, 17, 19]
['elephant', 'butterfly', 'hippopotamus']
[88, 92, 70, 65, 99]
Average of passing: 82.8
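
Filtering and transforming can also happen in a single comprehension: the if clause selects the items, and the expression before for reshapes each one that survives. A minimal sketch that reuses the words and scores lists from the example above (the names long_upper and margin are illustrative, not part of the original):

```python
words = ["cat", "elephant", "ant", "butterfly", "bee", "hippopotamus"]
scores = [45, 88, 92, 31, 70, 65, 99, 12]

# Keep only the long words, then upper-case each one that passes the filter
long_upper = [w.upper() for w in words if len(w) > 4]
print(long_upper)  # ['ELEPHANT', 'BUTTERFLY', 'HIPPOPOTAMUS']

# Keep only passing scores, then convert each to points above the pass mark
margin = [s - 60 for s in scores if s >= 60]
print(margin)  # [28, 32, 10, 5, 39]
```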

if-else Inside the Expression

You can use a ternary expression in the expression part (before for) to transform elements differently based on a condition. This is different from the filtering if after for:

Python

# [value_if_true  if  condition  else  value_if_false  for  var  in  iterable]

numbers = range(-5, 6)

# Replace negatives with 0, keep everything else
non_negative = [n if n >= 0 else 0 for n in numbers]
print(non_negative)  # [0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5]

# Label even/odd
labels = ["even" if n % 2 == 0 else "odd" for n in range(1, 8)]
print(labels)  # ['odd', 'even', 'odd', 'even', 'odd', 'even', 'odd']

# Clamp values to a [0, 100] range
raw = [120, -10, 85, 101, 50, -5, 99]
clamped = [max(0, min(100, v)) for v in raw]
print(clamped)  # [100, 0, 85, 100, 50, 0, 99]

▶ Output

[0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5]
['odd', 'even', 'odd', 'even', 'odd', 'even', 'odd']
[100, 0, 85, 100, 50, 0, 99]
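
The difference between the two kinds of if is easiest to see side by side. The sketch below uses a small made-up list (values is an assumption for illustration) to show that the ternary keeps every item, the trailing filter drops items, and the two can be combined:

```python
values = [3, -1, 0, 7, -4]

# Ternary in the expression: every item is kept, so the length is unchanged
replaced = [v if v > 0 else 0 for v in values]
print(replaced)  # [3, 0, 0, 7, 0]

# Filtering if after for: items that fail the test are dropped
filtered = [v for v in values if v > 0]
print(filtered)  # [3, 7]

# Both together: filter first, then choose a label for each surviving item
labels = ["big" if v > 5 else "small" for v in values if v > 0]
print(labels)  # ['small', 'big']

# Note: [v for v in values if v > 0 else 0] is a SyntaxError,
# because the trailing filter does not accept an else branch.
```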

Nested Comprehensions

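A comprehension can contain more than one for clause, which is equivalent to nested for loops (the clauses read left to right, outermost loop first). You can also nest a whole comprehension inside the expression to build a list of lists: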

Python

# Flatten a 2D list (list of lists)
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [num for row in matrix for num in row]
print(flat)   # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Equivalent loop:
# flat = []
# for row in matrix:
#     for num in row:
#         flat.append(num)

# Generate a multiplication table as a 2D list
table = [[i * j for j in range(1, 6)] for i in range(1, 6)]
for row in table:
    print(row)

# Cartesian product of two lists
colors = ["red", "blue"]
sizes  = ["S", "M", "L"]
products = [(color, size) for color in colors for size in sizes]
print(products)
# [('red', 'S'), ('red', 'M'), ('red', 'L'),
#  ('blue', 'S'), ('blue', 'M'), ('blue', 'L')]

▶ Output

[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5]
[2, 4, 6, 8, 10]
[3, 6, 9, 12, 15]
[4, 8, 12, 16, 20]
[5, 10, 15, 20, 25]
[('red', 'S'), ('red', 'M'), ('red', 'L'), ('blue', 'S'), ('blue', 'M'), ('blue', 'L')]

⚠️ Don't Overdo Nesting

A comprehension with two for clauses can still be readable. Three or more nested for clauses usually produce code that is harder to understand than a plain loop. Readability always wins.

Generator Expressions

Replace the square brackets with parentheses and you get a generator expression — it produces values lazily on demand instead of building the entire list in memory. This is critical for large datasets:

Python

import sys

# List comprehension – builds entire list in RAM
squares_list = [n ** 2 for n in range(1_000_000)]
print(f"List size: {sys.getsizeof(squares_list):,} bytes")

# Generator expression – produces one value at a time
squares_gen = (n ** 2 for n in range(1_000_000))
print(f"Generator size: {sys.getsizeof(squares_gen):,} bytes")

# Use a generator when you only need to iterate once
total = sum(n ** 2 for n in range(1_000_000))   # no [] needed inside sum()!
print(f"Sum of squares: {total:,}")

# Generator with condition: squares of the multiples of 7 below 100
squares_of_sevens = (n ** 2 for n in range(100) if n % 7 == 0)
for val in squares_of_sevens:
    print(val, end=" ")
print()

# Generators are exhausted after one pass
gen = (x * 2 for x in range(5))
print(list(gen))  # [0, 2, 4, 6, 8]
print(list(gen))  # []  ← generator is now empty!

▶ Typical Output (exact byte counts vary by Python version and platform)

List size: 8,448,728 bytes
Generator size: 104 bytes
Sum of squares: 333,332,833,333,500,000
0 49 196 441 784 1225 1764 2401 3136 3969 4900 5929 7056 8281 9604
[0, 2, 4, 6, 8]
[]
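
Laziness saves work as well as memory: a consumer that stops early never triggers the remaining computations. The sketch below is illustrative only (the square helper and the evaluated list are assumptions added to make the effect visible); it records which inputs are actually processed when any() short-circuits:

```python
evaluated = []  # records which inputs were actually squared

def square(n):
    evaluated.append(n)
    return n * n

# any() stops pulling from the generator at the first True result
found = any(square(n) > 10 for n in range(1_000_000))
print(found)      # True
print(evaluated)  # [0, 1, 2, 3, 4]
```

With square brackets inside any(), all one million squares would be computed before any() inspected the first one.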

Set and Dict Comprehensions

The same comprehension syntax works for sets ({expr for ...}) and dictionaries ({key: value for ...}):

Python

# Set comprehension – automatically removes duplicates
words = ["apple", "banana", "apple", "cherry", "banana", "date"]
unique_lengths = {len(w) for w in words}
print(unique_lengths)   # {4, 5, 6} (order may vary)

# Dict comprehension
# Build a word → length mapping
word_length = {w: len(w) for w in words}
print(word_length)
# {'apple': 5, 'banana': 6, 'cherry': 6, 'date': 4}

# Invert a dictionary (swap keys and values)
original = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in original.items()}
print(inverted)    # {1: 'a', 2: 'b', 3: 'c'}

# Dict comprehension with condition
scores = {"Alice": 88, "Bob": 42, "Carol": 95, "Dave": 58}
passing = {name: score for name, score in scores.items() if score >= 60}
print(passing)     # {'Alice': 88, 'Carol': 95}

# Square numbers as a dict
square_map = {n: n**2 for n in range(1, 8)}
print(square_map)  # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49}

▶ Output

{4, 5, 6}
{'apple': 5, 'banana': 6, 'cherry': 6, 'date': 4}
{1: 'a', 2: 'b', 3: 'c'}
{'Alice': 88, 'Carol': 95}
{1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49}
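
One thing to watch for: dict keys are unique, so a dict comprehension silently keeps only the last value written for a repeated key. That is why the duplicate words above collapse into one entry each, and it also means inverting a dictionary loses data when two keys share a value. A small sketch with made-up data (the grouped variant is one possible workaround, not the only one):

```python
# Two keys share the value 1, so they collide when the dict is inverted
original = {"a": 1, "b": 2, "c": 1}
inverted = {v: k for k, v in original.items()}
print(inverted)  # {1: 'c', 2: 'b'}  ('a' was overwritten by 'c')

# To keep every key, map each value to a list of the keys that had it.
# This rescans the dict once per distinct value, which is fine for small inputs.
grouped = {
    v: [k for k in original if original[k] == v]
    for v in sorted(set(original.values()))
}
print(grouped)   # {1: ['a', 'c'], 2: ['b']}
```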

Performance: Comprehension vs for Loop

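List comprehensions are generally faster than equivalent for loops in CPython because the interpreter appends each item with a dedicated LIST_APPEND bytecode instead of looking up and calling result.append on every iteration. The comparison below also includes map() with a lambda: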

Python

import timeit

# Method 1: for loop with append
def using_loop(n):
    result = []
    for i in range(n):
        result.append(i ** 2)
    return result

# Method 2: list comprehension
def using_comprehension(n):
    return [i ** 2 for i in range(n)]

# Method 3: map() with lambda
def using_map(n):
    return list(map(lambda i: i ** 2, range(n)))

n = 100_000
t_loop = timeit.timeit(lambda: using_loop(n), number=50)
t_comp = timeit.timeit(lambda: using_comprehension(n), number=50)
t_map  = timeit.timeit(lambda: using_map(n), number=50)

print(f"for loop:        {t_loop:.3f}s")
print(f"comprehension:   {t_comp:.3f}s")
print(f"map(lambda):     {t_map:.3f}s")

▶ Typical Output

for loop:        2.847s
comprehension:   1.923s
map(lambda):     2.541s
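
To see where the difference comes from, you can inspect the bytecode with the standard dis module. The sketch below assumes CPython; the opnames helper is a hypothetical convenience written for this example, and instruction names are an implementation detail that can change between versions:

```python
import dis

def using_loop(n):
    result = []
    for i in range(n):
        result.append(i ** 2)
    return result

def using_comprehension(n):
    return [i ** 2 for i in range(n)]

def opnames(code):
    """Collect bytecode instruction names, including nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested code object, e.g. a comprehension body
            names |= opnames(const)
    return names

# The comprehension appends with a dedicated instruction ...
print("LIST_APPEND" in opnames(using_comprehension.__code__))  # True on CPython
# ... while the loop performs an ordinary method call on each iteration
print("LIST_APPEND" in opnames(using_loop.__code__))           # False on CPython
```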

Syntax	Result type	Memory	Use when…
`[... for ...]`	list	All in RAM	Need to index, sort, or reuse multiple times
`(... for ...)`	generator	One item at a time	Large data; only iterate once
`{... for ...}`	set	All in RAM	Need unique values
`{k: v for ...}`	dict	All in RAM	Build key-value mappings

When NOT to Use List Comprehensions

List comprehensions are powerful but not always the right tool. Avoid them when:

Python

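# (f, g, pred, condition, process, do_something, data and huge_data
#  are placeholders for your own functions and data)

# ❌ BAD: using a comprehension just for side effects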
# This creates a list that's thrown away immediately
[print(x) for x in range(5)]     # Wasteful!

# ✅ GOOD: use a regular loop for side effects
for x in range(5):
    print(x)

# ❌ BAD: comprehension that's too complex to read at a glance
result = [f(x) for x in [g(y) for y in data if pred(y)] if condition(x)]

# ✅ GOOD: break complex logic into named steps
filtered_y   = [g(y) for y in data if pred(y)]
transformed  = [f(x) for x in filtered_y if condition(x)]

# ❌ BAD: building a huge list you only iterate once
for item in [process(x) for x in huge_data]:   # builds entire list first!
    do_something(item)

# ✅ GOOD: use a generator expression
for item in (process(x) for x in huge_data):   # one item at a time
    do_something(item)

Real-World Examples

Python

from pathlib import Path

# 1. Get all Python files in a directory
py_files = [f for f in Path(".").iterdir() if f.suffix == ".py"]

# 2. Parse CSV rows into dicts (without the csv module)
csv_lines = [
    "Alice,30,Engineer",
    "Bob,25,Designer",
    "Carol,35,Manager",
]
headers = ["name", "age", "role"]
records = [
    dict(zip(headers, line.split(",")))
    for line in csv_lines
]
print(records)
# [{'name': 'Alice', 'age': '30', 'role': 'Engineer'}, ...]

# 3. Remove duplicates while preserving order
seen = set()
data = [1, 3, 2, 1, 4, 3, 5, 2]
unique_ordered = [x for x in data if not (x in seen or seen.add(x))]
print(unique_ordered)  # [1, 3, 2, 4, 5]

# 4. Transpose a matrix
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
transposed = [[row[i] for row in matrix] for i in range(len(matrix[0]))]
for row in transposed:
    print(row)
# [1, 4, 7]
# [2, 5, 8]
# [3, 6, 9]

# 5. Extract numbers from mixed list
mixed = [1, "hello", 3.14, True, "world", 42, None, 7]
numbers_only = [x for x in mixed if isinstance(x, (int, float)) and not isinstance(x, bool)]
print(numbers_only)   # [1, 3.14, 42, 7]
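
Examples 3 and 4 above also have built-in alternatives that avoid the side effect inside the condition and the index arithmetic. This is a sketch of one option for each, not a claim that the comprehension versions are wrong:

```python
data = [1, 3, 2, 1, 4, 3, 5, 2]
# dict keys are unique and keep insertion order (guaranteed since Python 3.7)
unique_ordered = list(dict.fromkeys(data))
print(unique_ordered)  # [1, 3, 2, 4, 5]

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# zip(*matrix) yields the columns as tuples; the comprehension turns them into lists
transposed = [list(col) for col in zip(*matrix)]
print(transposed)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```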

List Comprehensions: Concise, but Know the Limit

A comprehension builds a list in one expression — faster and clearer than an append loop for simple transforms. Read it as "collect expression for each item, optionally where condition."

# loop version
squares = []
for n in range(10):
    if n % 2 == 0:
        squares.append(n * n)

# comprehension — same result, one line
squares = [n * n for n in range(10) if n % 2 == 0]

Related forms

{n * n for n in nums}            # set comprehension
{k: v for k, v in pairs}         # dict comprehension
(n * n for n in nums)            # generator — lazy, no list built

When NOT to use one: if you only want side effects (printing, calling an API), write a plain for loop; a comprehension that throws away its list just to run code is misleading. Once you need two ifs plus nesting, readability drops fast and a normal loop wins.

Gotcha: multiple for clauses read left to right, in the same order as the equivalent nested for loops. [x for row in grid for x in row] flattens grid correctly; swapping the two clauses does not, because row would be used before it is defined.
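
The clause-order gotcha can be checked directly. In this sketch the names grid, line and cell are illustrative, and line is deliberately left undefined so the reversed version fails instead of silently picking up an unrelated variable:

```python
grid = [[1, 2], [3, 4], [5, 6]]

# Clauses appear in the same order as the nested loops: outer first, inner second
flat = [x for row in grid for x in row]
print(flat)  # [1, 2, 3, 4, 5, 6]

# Swapping the clauses uses `line` before the second clause defines it
try:
    wrong = [cell for cell in line for line in grid]
except NameError as exc:
    wrong = None
    print(f"Reversed order failed: {exc}")
```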

🏋️ Practical Exercise

Using only comprehensions (no explicit for loops), write solutions for:

Generate all prime numbers between 2 and 50.
From a list of sentences, extract only those that contain the word "Python" (case-insensitive).
Flatten this 3D list: [[[1,2],[3,4]],[[5,6],[7,8]]] into a 1D list.
Build a dict mapping each word in a sentence to the number of vowels it contains.

🔥 Challenge Exercise

Write a function parse_log(filepath) that reads a log file line by line using a generator expression and returns:

A list of all ERROR level messages (list comprehension).
A dict mapping log levels to the count of their occurrences (dict comprehension).
A set of unique IP addresses mentioned in the log (set comprehension).

Process a file of 100,000+ lines memory-efficiently using generators.

📋 Summary

List comprehension syntax: [expression for variable in iterable].
Add a filter: [expression for variable in iterable if condition].
Transform with condition: [a if cond else b for variable in iterable].
Nested comprehension: [expr for x in outer for y in inner] — equivalent to nested for loops.
Generator expressions use () instead of [] and produce values lazily, saving memory.
Set comprehensions use {}; dict comprehensions use {key: value for ...}.
Comprehensions are typically faster than equivalent for loops with .append() in CPython.
Avoid list comprehensions for side effects and deeply nested logic; for very large data that you iterate only once, prefer a generator expression.

Interview Questions

What is a list comprehension and how does it differ from a regular for loop?
What is the syntax for a list comprehension with a filter condition?
How does a generator expression differ from a list comprehension?
When should you use a generator expression instead of a list comprehension?
How do you write a nested list comprehension? What is the equivalent for loop?
What does [x for x in range(5) if x % 2 == 0] produce?
What is the difference between [x if x > 0 else 0 for x in lst] and [x for x in lst if x > 0]?
Why is using a list comprehension for side effects (like printing) considered bad practice?
What is a set comprehension? When is it useful?
How do you invert a dictionary using a dict comprehension?

FAQ

Are list comprehensions always faster than for loops?

Not always, but usually for simple transformations. In CPython, list comprehensions are typically 20–40% faster than for loops with .append(), because they use a specialised LIST_APPEND bytecode internally; the exact gap depends on the Python version and the workload. For complex logic with multiple conditions and function calls, the difference narrows, and readability should be your primary guide.

Can I use the walrus operator (:=) inside a list comprehension?

Yes! Python 3.8+ supports the walrus operator (assignment expression) inside comprehensions. This is useful when you want to both compute a value and filter on it without computing it twice: [y for x in data if (y := expensive(x)) > threshold].
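
A runnable sketch of that pattern follows. The expensive function, the calls list and the sample data are assumptions made up for illustration; the call counts show how many times the function actually runs in each version:

```python
calls = []  # records every input passed to expensive()

def expensive(x):
    calls.append(x)
    return x * x

data = [1, 2, 3, 4, 5]
threshold = 8

# Without the walrus: expensive() runs once to filter and again to produce the value
twice = [expensive(x) for x in data if expensive(x) > threshold]
calls_without = len(calls)

calls.clear()

# With the walrus: the result is computed once, bound to y, and reused
once = [y for x in data if (y := expensive(x)) > threshold]
calls_with = len(calls)

print(twice, calls_without)  # [9, 16, 25] 8
print(once, calls_with)      # [9, 16, 25] 5
```

Unlike the loop variable, the walrus target (y here) is bound in the enclosing scope, so it is still available after the comprehension finishes.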

Do variables defined in a list comprehension leak into the outer scope?

No. In Python 3, list comprehensions have their own scope — the loop variable does not leak. This is different from Python 2 behaviour where the loop variable would be accessible in the outer scope after the comprehension. Generator expressions and set/dict comprehensions have always had their own scope.
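
A quick way to confirm this in Python 3 is to reuse an existing name as the loop variable and check that it is untouched afterwards. The variable names below are illustrative:

```python
x = "outer"
squares = [x * x for x in range(4)]  # this x is local to the comprehension
print(squares)  # [0, 1, 4, 9]
print(x)        # outer

# A regular for loop, by contrast, leaves its variable behind
for y in range(4):
    pass
print(y)        # 3
```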

What is the maximum number of nested for clauses in a comprehension?

Python imposes no hard limit, but readability is the practical constraint. Two levels of nesting are common (e.g., flattening a 2D list). Three or more levels should almost always be refactored into named variables, helper functions, or explicit loops for maintainability.
