Python File Handling – Read, Write, and Manage Files

Introduction – Files in Python

A file is a sequence of bytes stored on disk. Python treats files as objects you interact with through a file handle returned by open(). There are two broad categories:

Text files – contain human-readable characters (e.g., .txt, .csv, .json). Python handles newline translation and encoding for you.
Binary files – contain raw bytes (e.g., images, PDFs, executables). You work with bytes objects instead of strings.
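A minimal sketch of the difference, assuming a throwaway file in a temporary directory (the file name `greeting.txt` is hypothetical). The same four characters come back as a `str` in text mode and as five raw bytes in binary mode, because `é` takes two bytes in UTF-8:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greeting.txt"   # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        f.write("café")

    with open(path, "r", encoding="utf-8") as f:
        as_text = f.read()      # decoded to str

    with open(path, "rb") as f:
        as_bytes = f.read()     # raw bytes, no decoding

print(type(as_text).__name__, len(as_text))     # str 4
print(type(as_bytes).__name__, len(as_bytes))   # bytes 5
```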

Before any operation you must open the file; after you're done you must close it. Python's with statement automates the closing step, even when exceptions occur.

The open() Function

The signature of open() is:

Python

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

Mode	Meaning	Creates file?	Truncates?
`'r'`	Read (default)	No – raises FileNotFoundError	No
`'w'`	Write	Yes	Yes – overwrites existing content
`'a'`	Append	Yes	No – adds to end of file
`'x'`	Exclusive create	Yes – fails if exists	N/A
`'r+'`	Read + write	No	No
`'b'` suffix	Binary mode (e.g., `'rb'`)	—	—
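The behaviours in the table can be checked with a short experiment. This is a sketch that works in a temporary directory; the file name `demo.txt` is hypothetical:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.txt"   # hypothetical file name

    with open(path, "x", encoding="utf-8") as f:   # 'x' creates the file
        f.write("first\n")

    try:
        open(path, "x", encoding="utf-8")          # the file now exists
    except FileExistsError:
        x_failed = True
    else:
        x_failed = False

    with open(path, "a", encoding="utf-8") as f:   # 'a' keeps existing content
        f.write("second\n")
    after_append = path.read_text(encoding="utf-8")

    with open(path, "w", encoding="utf-8") as f:   # 'w' truncates on open
        f.write("third\n")
    after_write = path.read_text(encoding="utf-8")

print(x_failed)             # True
print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_write))    # 'third\n'
```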

💡

Always specify encoding for text files

Use encoding="utf-8" to avoid surprises across different operating systems. Windows defaults to the system codepage (e.g., cp1252), which differs from the UTF-8 default on Linux/macOS.
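To see what such a surprise looks like, the sketch below writes a file as UTF-8 and then reads it back with the wrong encoding on purpose. The file name `menu.txt` is hypothetical, and cp1252 stands in for a typical Windows default:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "menu.txt"   # hypothetical file name
    path.write_text("café", encoding="utf-8")

    right = path.read_text(encoding="utf-8")
    wrong = path.read_text(encoding="cp1252")   # deliberately mismatched

print(right)   # café
print(wrong)   # cafÃ©
```

No exception is raised in the mismatched case; the text is silently garbled, which is why the bug often goes unnoticed until the file reaches another machine.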

Context Managers – The with Statement

The with statement is the recommended way to open files. It guarantees that the file's close() method is called when the block exits, even if an exception is raised:

Python

# ❌ Old-style (risky if an exception occurs before close)
f = open("notes.txt", "r", encoding="utf-8")
content = f.read()
f.close()   # might not run if read() raises!

# ✅ Modern style – close is guaranteed
with open("notes.txt", "r", encoding="utf-8") as f:
    content = f.read()
# f is automatically closed here, even on exception

print(f.closed)   # True
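The guarantee can be verified by raising an exception inside the block on purpose. This is a small sketch using a temporary file; the `ValueError` simply simulates a failure during processing:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"
    path.write_text("hello\n", encoding="utf-8")

    try:
        with open(path, "r", encoding="utf-8") as f:
            raise ValueError("simulated failure mid-read")
    except ValueError:
        pass

    closed_after_error = f.closed   # the name f still exists, but the file is closed

print(closed_after_error)   # True
```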

Reading Files

Python provides several methods for reading file content depending on how much you need at once:

Python

# Assume "haiku.txt" contains three lines of a haiku

# read() – entire file as one string
with open("haiku.txt", "r", encoding="utf-8") as f:
    entire = f.read()
    print(repr(entire))
# 'An old silent pond\nA frog jumps into the pond\nSplash! Silence again.\n'

# readline() – one line at a time (including '\n')
with open("haiku.txt", "r", encoding="utf-8") as f:
    first_line = f.readline()
    second_line = f.readline()
    print(first_line.strip())   # An old silent pond
    print(second_line.strip())  # A frog jumps into the pond

# readlines() – list of all lines
with open("haiku.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
    print(f"Line count: {len(lines)}")  # 3

# Iteration (most memory-efficient – processes one line at a time)
with open("haiku.txt", "r", encoding="utf-8") as f:
    for line_no, line in enumerate(f, start=1):
        print(f"{line_no}: {line.rstrip()}")

▶ Output

1: An old silent pond
2: A frog jumps into the pond
3: Splash! Silence again.

💡

Prefer Iteration for Large Files

For files that are megabytes or gigabytes in size, file.read() loads everything into RAM. Iterating over the file object reads one line at a time, so memory usage is bounded by the longest line rather than by the size of the file.
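As an illustration, the hypothetical helper `count_lines_containing` below scans a generated 10,000-line log while holding only one line in memory at a time. The file name and log format are made up for the sketch:

```python
import tempfile
from pathlib import Path

def count_lines_containing(path, needle):
    """Count lines that contain needle, reading one line at a time."""
    count = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if needle in line:
                count += 1
    return count

with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "big.log"   # hypothetical file name
    with open(log, "w", encoding="utf-8") as f:
        for i in range(10_000):
            level = "ERROR" if i % 100 == 0 else "INFO"
            f.write(f"{i} {level} message\n")

    errors = count_lines_containing(log, "ERROR")

print(errors)   # 100
```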

Writing Files

Use mode 'w' to create or overwrite, and 'a' to append without touching existing content:

Python

# write() – write a string
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("Line 1\n")
    f.write("Line 2\n")
    f.write("Line 3\n")

# writelines() – write a list of strings (does NOT add newlines automatically)
lines = ["alpha\n", "beta\n", "gamma\n"]
with open("output.txt", "w", encoding="utf-8") as f:
    f.writelines(lines)

# Append mode – add to an existing file
with open("log.txt", "a", encoding="utf-8") as f:
    f.write("2024-01-15 09:23:01 INFO  Server started\n")
    f.write("2024-01-15 09:23:05 DEBUG Listening on port 8080\n")

# Write multiple lines cleanly with print()
with open("report.txt", "w", encoding="utf-8") as f:
    for i in range(1, 6):
        print(f"Item {i}: {'★' * i}", file=f)

# Verify the written content
with open("report.txt", "r", encoding="utf-8") as f:
    print(f.read())

▶ Output

Item 1: ★
Item 2: ★★
Item 3: ★★★
Item 4: ★★★★
Item 5: ★★★★★

Working with Binary Files

Use 'rb' and 'wb' modes for images, audio files, compressed archives, or any non-text data:

Python

# Copy a file in binary mode (works for any file type)
def copy_file(src, dst, chunk_size=65536):
    """Copy src to dst reading chunk_size bytes at a time."""
    with open(src, "rb") as source, open(dst, "wb") as dest:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            dest.write(chunk)
    print(f"Copied '{src}' → '{dst}'")

# Check the first 4 bytes of the PNG signature (the full signature is 8 bytes)
def is_png(filepath):
    PNG_MAGIC = b'\x89PNG'
    try:
        with open(filepath, "rb") as f:
            return f.read(4) == PNG_MAGIC
    except OSError:   # includes FileNotFoundError and PermissionError
        return False

File Position – seek() and tell()

Python

import os

with open("output.txt", "r", encoding="utf-8") as f:
    print(f.tell())          # 0 – at the start

    first = f.read(5)
    print(f"Read: {first!r}")
    print(f.tell())          # 5 for ASCII text; in text mode tell() is an opaque position

    f.seek(0)                # rewind to start
    print(f.tell())          # 0

    f.seek(0, os.SEEK_END)   # seek to end (0 bytes from end)
    print(f"File size: {f.tell()} bytes")
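In binary mode, offsets are plain byte counts, and seeking relative to the current position or to the end of the file is allowed. The sketch below assumes a ten-byte temporary file named `data.bin` (a hypothetical name) and uses the `os.SEEK_CUR` and `os.SEEK_END` constants in place of the raw numbers 1 and 2:

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.bin"   # hypothetical file name
    path.write_bytes(b"0123456789")

    with open(path, "rb") as f:
        f.seek(-3, os.SEEK_END)     # 3 bytes before the end
        tail = f.read()

        f.seek(2)                   # absolute offset 2
        f.seek(3, os.SEEK_CUR)      # 3 bytes forward from there
        pos = f.tell()
        middle = f.read(2)

print(tail)     # b'789'
print(pos)      # 5
print(middle)   # b'56'
```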

The os Module – File System Operations

The os module provides functions for creating, removing, and inspecting files and directories:

Python

import os

# Current working directory
print(os.getcwd())           # /home/user/project

# Check existence
print(os.path.exists("notes.txt"))   # True or False
print(os.path.isfile("notes.txt"))   # True if it's a regular file
print(os.path.isdir("data"))         # True if it's a directory

# File metadata
size = os.path.getsize("notes.txt")
print(f"Size: {size} bytes")

# Create a directory (and nested directories)
os.makedirs("data/output", exist_ok=True)

# List directory contents
for entry in os.listdir("."):
    print(entry)

# Rename / move a file
os.rename("old_name.txt", "new_name.txt")

# Delete a file (raises FileNotFoundError if missing)
if os.path.exists("temp.txt"):
    os.remove("temp.txt")

# Walk a directory tree
for root, dirs, files in os.walk("project"):
    for filename in files:
        full_path = os.path.join(root, filename)
        print(full_path)
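The same functions combine naturally. As a sketch, the hypothetical helper `total_size` below uses os.walk() and os.path.getsize() to add up the size of every file under a directory; the directory layout is created only for the demonstration:

```python
import os
import tempfile

def total_size(top):
    """Return the combined size in bytes of all files under top."""
    total = 0
    for root, dirs, files in os.walk(top):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    with open(os.path.join(tmp, "a", "one.bin"), "wb") as f:
        f.write(b"x" * 10)
    with open(os.path.join(tmp, "a", "b", "two.bin"), "wb") as f:
        f.write(b"y" * 32)

    size = total_size(tmp)

print(size)   # 42
```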

Modern File Paths with pathlib

pathlib.Path (introduced in Python 3.4) provides an object-oriented API for paths. It is more readable and cross-platform than string-based os.path manipulation:

Python

from pathlib import Path

# Create a Path object
p = Path("data/notes.txt")

# Path inspection
print(p.name)        # notes.txt
print(p.stem)        # notes
print(p.suffix)      # .txt
print(p.parent)      # data
print(p.absolute())  # /home/user/project/data/notes.txt

# Build paths with / operator (cross-platform!)
base = Path("project")
config = base / "config" / "settings.json"
print(config)        # project/config/settings.json

# Check existence
print(p.exists())    # True / False
print(p.is_file())
print(p.is_dir())

# Read / write text directly
config_path = Path("config.txt")
config_path.write_text("debug=true\nport=8080\n", encoding="utf-8")
content = config_path.read_text(encoding="utf-8")
print(content)

# Read / write bytes directly
Path("data.bin").write_bytes(b'\x00\x01\x02\x03')

# Create directories
Path("logs/2024").mkdir(parents=True, exist_ok=True)

# List files matching a pattern (glob)
project = Path(".")
for py_file in project.glob("**/*.py"):
    print(py_file)

# File size
print(f"Size: {p.stat().st_size} bytes")

ℹ️

pathlib vs os.path

For new code, prefer pathlib. It handles path separators on Windows (\) and Unix (/) automatically, and the dot-attribute API is far more readable than nested os.path.join(os.path.dirname(...)) calls.
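A side-by-side sketch of that readability difference. The path `project/src/app/main.py` is hypothetical and is never touched on disk; both versions compute the location of a sibling `tests` directory two levels up:

```python
import os.path
from pathlib import Path

source = "project/src/app/main.py"   # hypothetical path, not accessed on disk

# os.path: nested function calls, read from the inside out
old_style = os.path.join(
    os.path.dirname(os.path.dirname(source)), "tests", "test_main.py"
)

# pathlib: attribute access and the / operator, read left to right
new_style = Path(source).parent.parent / "tests" / "test_main.py"

print(Path(old_style) == new_style)        # True
print(new_style.name, new_style.suffix)    # test_main.py .py
```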

Working with CSV and JSON Files

Python

import csv
import json

# ── CSV ──────────────────────────────────────────────────────
# Writing a CSV
students = [
    {"name": "Alice", "grade": "A", "score": 95},
    {"name": "Bob",   "grade": "B", "score": 82},
    {"name": "Carol", "grade": "A", "score": 91},
]
with open("students.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "grade", "score"])
    writer.writeheader()
    writer.writerows(students)

# Reading a CSV
with open("students.csv", "r", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(f"{row['name']}: {row['score']}")

# ── JSON ─────────────────────────────────────────────────────
config = {
    "debug": True,
    "port": 8080,
    "allowed_hosts": ["localhost", "127.0.0.1"],
}

# Write JSON
with open("config.json", "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)

# Read JSON
with open("config.json", "r", encoding="utf-8") as f:
    loaded = json.load(f)
    print(loaded["port"])   # 8080

▶ Output

Alice: 95
Bob: 82
Carol: 91
8080
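One detail worth knowing when reading the CSV back: the csv module returns every field as a string, whereas JSON preserves numbers and booleans. The sketch below uses io.StringIO as an in-memory stand-in for the students.csv file and converts the scores before doing arithmetic:

```python
import csv
import io

# In-memory stand-in for students.csv
csv_text = "name,grade,score\nAlice,A,95\nBob,B,82\nCarol,A,91\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))
print(type(rows[0]["score"]).__name__)   # str

# Convert explicitly before calculating
scores = [int(row["score"]) for row in rows]
average = sum(scores) / len(scores)
print(round(average, 2))                 # 89.33
```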

Handling File Errors

Python

from pathlib import Path

def safe_read(filepath):
    """Read a file and return its content or None on failure."""
    try:
        return Path(filepath).read_text(encoding="utf-8")
    except FileNotFoundError:
        print(f"File not found: {filepath}")
    except PermissionError:
        print(f"Permission denied: {filepath}")
    except IsADirectoryError:
        print(f"Expected a file, got a directory: {filepath}")
    except OSError as e:
        print(f"OS error ({e.errno}): {e.strerror}")
    return None

def safe_write(filepath, content):
    """Write content to filepath, creating parent directories as needed."""
    try:
        path = Path(filepath)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
        return True
    except OSError as e:   # PermissionError is a subclass of OSError
        print(f"Could not write {filepath}: {e}")
        return False

result = safe_read("data/report.txt")
if result is not None:   # an empty file returns "", which is falsy
    print(result[:100])

Always Open Files with `with` — Here's Why

If you open a file manually and an error happens before close(), the file handle leaks — and on Windows the file may stay locked. The with statement (a context manager) guarantees the file closes, error or not:

Python

# ❌ leaks the handle if an exception fires mid-read
f = open("data.txt")
data = f.read()
f.close()

# ✅ closes automatically, even on error
with open("data.txt", encoding="utf-8") as f:
    data = f.read()

Mode	Meaning	Notes
`"r"`	read (default)	error if the file is missing
`"w"`	write	truncates an existing file to empty!
`"a"`	append	writes at the end; creates the file if missing
`"x"`	create	error if the file already exists

⚠️

Two real-world footguns

"w" silently erases the file the instant you open it — reach for "a" if you meant to add. And always pass encoding="utf-8": the default encoding is platform-dependent, so code that works on your machine can corrupt text on someone else's.

🏋️ Practical Exercise

Write a word_frequency(filepath) function that:

Reads a text file line by line (efficient for large files).
Splits each line into words, converts to lowercase, and strips punctuation.
Counts how often each word appears using a dictionary.
Returns the 10 most common words with their counts.
Handles FileNotFoundError and PermissionError gracefully.

🔥 Challenge Exercise

Build a simple logging system using only file I/O and pathlib. Create a Logger class that:

Accepts a log directory path and a log level (DEBUG, INFO, WARNING, ERROR).
Writes each log message to a daily log file named YYYY-MM-DD.log.
Rotates (compresses or moves) logs older than 7 days.
Exposes .debug(), .info(), .warning(), and .error() methods.

📋 Summary

Use open(file, mode, encoding="utf-8") to get a file handle.
Always use the with statement — it closes the file automatically, even on exceptions.
Modes: 'r' (read), 'w' (write/overwrite), 'a' (append), 'x' (exclusive create); append 'b' for binary.
Read methods: read() (all at once), readline() (one line), readlines() (list), or iterate the file object (memory-efficient).
Write methods: write(str), writelines(list), or print(..., file=f).
pathlib.Path provides a modern, object-oriented API; use / to join paths.
os.makedirs(..., exist_ok=True) creates nested directories safely.
Always handle FileNotFoundError, PermissionError, and OSError when doing file I/O.

Interview Questions

What does the with statement guarantee when working with files?
What is the difference between 'w' and 'a' file modes?
How do you read a very large file without loading it entirely into memory?
What is the difference between os.path and pathlib.Path?
What does newline="" do when opening a CSV file?
How do you create nested directories in one call in Python?
What exception is raised when you try to open a file that doesn't exist?
What is the difference between read(), readline(), and readlines()?
How do you check if a path exists and is a file (not a directory)?
What does seek(0) do to a file handle?

FAQ

Does Python automatically flush written data to disk?

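By default, Python buffers writes and flushes its buffer when it fills up or when the file is closed. Call f.flush() to push buffered data to the operating system without closing the file. The with block closes the file (and thus flushes) when it exits, so you rarely need to call flush() manually. Note that flush() only hands the data to the OS, which may still cache it; call os.fsync(f.fileno()) after flush() when the data must physically reach the disk.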
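A short sketch of forcing data out while the file is still open. The file name `events.log` is hypothetical, and the os.fsync() call is an extra step beyond flush() that asks the operating system to write its own cache to disk:

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "events.log"   # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        f.write("started\n")
        f.flush()                 # Python's buffer -> operating system
        os.fsync(f.fileno())      # operating system cache -> disk
        size_while_open = path.stat().st_size

print(size_while_open > 0)   # True
```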

How do I read a file that might not be UTF-8?

Use errors='replace' or errors='ignore' in open() to tolerate bytes that cannot be decoded. For unknown encodings, install the chardet library (pip install chardet) and call chardet.detect(raw_bytes) to guess the encoding before opening; the result is a statistical guess, not a guarantee.
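The effect of the two error handlers can be seen on a file that was deliberately written as Latin-1 and then read as UTF-8. This sketch uses only the standard library, and the file name `legacy.txt` is hypothetical:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "legacy.txt"               # hypothetical file name
    path.write_bytes("café".encode("latin-1"))    # b'caf\xe9' is not valid UTF-8

    try:
        path.read_text(encoding="utf-8")          # default errors='strict'
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    replaced = path.read_text(encoding="utf-8", errors="replace")
    ignored = path.read_text(encoding="utf-8", errors="ignore")

print(strict_failed)    # True
print(repr(replaced))   # 'caf\ufffd'
print(repr(ignored))    # 'caf'
```

Both handlers lose information: 'replace' substitutes the U+FFFD replacement character and 'ignore' drops the byte, so they suit display or logging better than round-tripping data.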

Can I open two files at the same time with a single with statement?

Yes. Separate them with a comma: with open("src.txt") as src, open("dst.txt", "w") as dst:. Both files are opened and both are closed when the block exits, regardless of exceptions.

When should I use pathlib instead of os.path?

For new code, always prefer pathlib. It is more Pythonic, cross-platform by design, and supports method chaining. Use os.path only when working with legacy codebases or libraries that require string paths (though Path objects can be passed to most modern APIs).
