Python File Handling – Read, Write, and Manage Files

Nearly every real program touches the filesystem: reading configuration files, saving user data, processing logs, or importing datasets. Python makes file I/O straightforward with the built-in open() function, context managers for safe resource handling, and powerful standard-library modules like os and pathlib. This lesson covers everything from reading a simple text file to navigating directory trees and handling errors gracefully.

⏱️ 25 min read 🎯 Intermediate 📅 Updated 2026

Introduction – Files in Python

A file is a sequence of bytes stored on disk. Python treats files as objects you interact with through a file handle returned by open(). There are two broad categories:

  • Text files – contain human-readable characters (e.g., .txt, .csv, .json). Python handles newline translation and encoding for you.
  • Binary files – contain raw bytes (e.g., images, PDFs, executables). You work with bytes objects instead of strings.

Before any operation you must open the file; after you're done you must close it. Python's with statement automates the closing step, even when exceptions occur.
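
As a small illustration of the text/binary distinction, the sketch below reads the same file both ways. The file name greeting.txt and the temporary directory are assumptions made only so the example is self-contained.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greeting.txt"
    path.write_text("café", encoding="utf-8")

    # Text mode: bytes are decoded into a str.
    with open(path, "r", encoding="utf-8") as f:
        as_text = f.read()

    # Binary mode: the raw bytes come back untouched.
    with open(path, "rb") as f:
        as_bytes = f.read()

print(type(as_text).__name__, len(as_text))    # str 4
print(type(as_bytes).__name__, len(as_bytes))  # bytes 5
```

The lengths differ because "é" is one character but two bytes in UTF-8.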

The open() Function

The most commonly used parameters of open() are shown below (the full signature also accepts buffering, closefd, and opener):

Python
open(file, mode='r', encoding=None, errors=None, newline=None)
| Mode | Meaning | Creates file? | Truncates? |
|---|---|---|---|
| 'r' | Read (default) | No – raises FileNotFoundError | No |
| 'w' | Write | Yes | Yes – overwrites existing content |
| 'a' | Append | Yes | No – adds to end of file |
| 'x' | Exclusive create | Yes – fails if exists | N/A |
| 'r+' | Read + write | No | No |
| 'b' suffix | Binary mode (e.g., 'rb') | – | – |
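
The behaviour in the table can be checked with a short experiment. This is a minimal sketch that assumes a scratch file named modes.txt in a temporary directory; the variable names are illustrative only.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "modes.txt"   # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:   # 'w' creates the file
        f.write("first\n")
    with open(path, "a", encoding="utf-8") as f:   # 'a' keeps existing content
        f.write("second\n")
    after_append = path.read_text(encoding="utf-8")

    with open(path, "w", encoding="utf-8") as f:   # 'w' truncates first
        f.write("replaced\n")
    after_write = path.read_text(encoding="utf-8")

    # 'x' refuses to touch a file that already exists.
    try:
        with open(path, "x", encoding="utf-8") as f:
            f.write("never written\n")
    except FileExistsError:
        exclusive_failed = True
    else:
        exclusive_failed = False

print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_write))    # 'replaced\n'
print(exclusive_failed)     # True
```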
💡
Always specify encoding for text files

Use encoding="utf-8" to avoid surprises across operating systems. When encoding is omitted, Python falls back to a platform-dependent default: on Windows this is typically the system code page (e.g., cp1252), while Linux and macOS usually default to UTF-8.
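
To see why the encoding matters, the sketch below writes a file as UTF-8 and then reads it back with the wrong codec. The file name menu.txt is an assumption; cp1252 stands in for a typical Windows default.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "menu.txt"   # hypothetical file name
    path.write_text("café", encoding="utf-8")

    right = path.read_text(encoding="utf-8")    # decoded with the matching codec
    wrong = path.read_text(encoding="cp1252")   # each UTF-8 byte decoded on its own

print(ascii(right))   # 'caf\xe9'
print(ascii(wrong))   # 'caf\xc3\xa9'
```

No exception is raised in the second read, which is what makes this class of bug easy to miss: the text is silently garbled instead.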

Context Managers – The with Statement

The with statement is the correct way to open files. It guarantees file.close() is called even if an exception is raised:

Python
# ❌ Old-style (risky if an exception occurs before close)
f = open("notes.txt", "r", encoding="utf-8")
content = f.read()
f.close()   # might not run if read() raises!

# ✅ Modern style – close is guaranteed
with open("notes.txt", "r", encoding="utf-8") as f:
    content = f.read()
# f is automatically closed here, even on exception

print(f.closed)   # True
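
For file objects, the with block behaves roughly like a hand-written try/finally. The sketch below compares the two and then simulates a failure inside the block; the temporary notes.txt and the variable names are assumptions for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"   # hypothetical file
    path.write_text("hello\n", encoding="utf-8")

    # Approximately what the with statement does for you:
    f = open(path, "r", encoding="utf-8")
    try:
        content = f.read()
    finally:
        f.close()                    # runs whether or not read() raised
    closed_manually = f.closed

    # The file is closed even when the body of the block raises.
    try:
        with open(path, "r", encoding="utf-8") as g:
            raise ValueError("simulated failure")
    except ValueError:
        pass
    closed_after_error = g.closed

print(closed_manually, closed_after_error)   # True True
```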

Reading Files

Python provides several methods for reading file content depending on how much you need at once:

Python
# Assume "haiku.txt" contains three lines of a haiku

# read() – entire file as one string
with open("haiku.txt", "r", encoding="utf-8") as f:
    entire = f.read()
    print(repr(entire))
# 'An old silent pond\nA frog jumps into the pond\nSplash! Silence again.\n'

# readline() – one line at a time (including '\n')
with open("haiku.txt", "r", encoding="utf-8") as f:
    first_line = f.readline()
    second_line = f.readline()
    print(first_line.strip())   # An old silent pond
    print(second_line.strip())  # A frog jumps into the pond

# readlines() – list of all lines
with open("haiku.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
    print(f"Line count: {len(lines)}")  # 3

# Iteration (most memory-efficient – processes one line at a time)
with open("haiku.txt", "r", encoding="utf-8") as f:
    for line_no, line in enumerate(f, start=1):
        print(f"{line_no}: {line.rstrip()}")
▶ Output
1: An old silent pond
2: A frog jumps into the pond
3: Splash! Silence again.
💡
Prefer Iteration for Large Files

For files that are megabytes or gigabytes in size, file.read() loads the entire content into RAM at once. Iterating over the file object reads one line at a time, so memory usage is bounded by the longest line rather than by the file size.
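
A typical use of this pattern is scanning a log for matching lines. The helper below, count_matching_lines, is a hypothetical name, and the tiny app.log is a stand-in for a much larger file.

```python
import tempfile
from pathlib import Path

def count_matching_lines(path, needle):
    """Count lines containing needle, holding only one line in memory at a time."""
    count = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:               # the file object yields lines lazily
            if needle in line:
                count += 1
    return count

with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "app.log"      # hypothetical log file
    log.write_text(
        "INFO start\nERROR disk full\nINFO retry\nERROR timeout\n",
        encoding="utf-8",
    )
    error_count = count_matching_lines(log, "ERROR")

print(error_count)   # 2
```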

Writing Files

Use mode 'w' to create or overwrite, and 'a' to append without touching existing content:

Python
# write() – write a string
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("Line 1\n")
    f.write("Line 2\n")
    f.write("Line 3\n")

# writelines() – write a list of strings (does NOT add newlines automatically)
lines = ["alpha\n", "beta\n", "gamma\n"]
with open("output.txt", "w", encoding="utf-8") as f:
    f.writelines(lines)

# Append mode – add to an existing file
with open("log.txt", "a", encoding="utf-8") as f:
    f.write("2024-01-15 09:23:01 INFO  Server started\n")
    f.write("2024-01-15 09:23:05 DEBUG Listening on port 8080\n")

# Write multiple lines cleanly with print()
with open("report.txt", "w", encoding="utf-8") as f:
    for i in range(1, 6):
        print(f"Item {i}: {'★' * i}", file=f)

# Verify the written content
with open("report.txt", "r", encoding="utf-8") as f:
    print(f.read())
▶ Output
Item 1: ★
Item 2: ★★
Item 3: ★★★
Item 4: ★★★★
Item 5: ★★★★★
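
Because writelines() does not add newlines, a list of bare strings ends up run together. The sketch below shows the pitfall and one possible fix using a generator expression; words.txt is an assumed file name.

```python
import tempfile
from pathlib import Path

words = ["alpha", "beta", "gamma"]   # note: no trailing newlines

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "words.txt"   # hypothetical file name

    # Pitfall: the strings are written back to back.
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(words)
    run_together = path.read_text(encoding="utf-8")

    # One fix: append the newline while writing.
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(f"{word}\n" for word in words)
    one_per_line = path.read_text(encoding="utf-8")

print(repr(run_together))   # 'alphabetagamma'
print(repr(one_per_line))   # 'alpha\nbeta\ngamma\n'
```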

Working with Binary Files

Use 'rb' and 'wb' modes for images, audio files, compressed archives, or any non-text data:

Python
# Copy a file in binary mode (works for any file type)
def copy_file(src, dst, chunk_size=65536):
    """Copy src to dst reading chunk_size bytes at a time."""
    with open(src, "rb") as source, open(dst, "wb") as dest:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            dest.write(chunk)
    print(f"Copied '{src}' → '{dst}'")

# Read the first 4 bytes (magic bytes) of a PNG file
def is_png(filepath):
    PNG_MAGIC = b'\x89PNG'
    try:
        with open(filepath, "rb") as f:
            return f.read(4) == PNG_MAGIC
    except OSError:  # covers FileNotFoundError, PermissionError, etc. (IOError is an alias of OSError)
        return False

File Position – seek() and tell()

Python
with open("output.txt", "r", encoding="utf-8") as f:
    print(f.tell())          # 0 – at the start

    first = f.read(5)
    print(f"Read: {first!r}")
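    print(f.tell())          # 5 (the first five characters are ASCII, one byte each)

    f.seek(0)                # rewind to start
    print(f.tell())          # 0

    # In text mode, tell() returns an opaque position: pass seek() only 0 or a
    # value that tell() returned. Seeking to the end with offset 0 is allowed.
    f.seek(0, 2)             # whence=2 (os.SEEK_END): 0 bytes from the end
    print(f"File size: {f.tell()} bytes")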
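
In binary mode, offsets are plain byte counts, so arbitrary seeks are safe. As a sketch, the hypothetical helper read_tail below uses seek() and tell() to fetch the last few bytes of a file; data.bin is an assumed scratch file.

```python
import os
import tempfile
from pathlib import Path

def read_tail(path, n):
    """Return the last n bytes of a file (or the whole file if it is shorter)."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)       # jump to the end
        size = f.tell()              # position at the end == size in bytes
        f.seek(max(size - n, 0))     # step back n bytes, but not past the start
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.bin"    # hypothetical file name
    path.write_bytes(b"0123456789")
    tail = read_tail(path, 4)
    whole = read_tail(path, 100)

print(tail)    # b'6789'
print(whole)   # b'0123456789'
```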

The os Module – File System Operations

The os module provides functions for creating, removing, and inspecting files and directories:

Python
import os

# Current working directory
print(os.getcwd())           # /home/user/project

# Check existence
print(os.path.exists("notes.txt"))   # True or False
print(os.path.isfile("notes.txt"))   # True if it's a regular file
print(os.path.isdir("data"))         # True if it's a directory

# File metadata
size = os.path.getsize("notes.txt")
print(f"Size: {size} bytes")

# Create a directory (and nested directories)
os.makedirs("data/output", exist_ok=True)

# List directory contents
for entry in os.listdir("."):
    print(entry)

# Rename / move a file
os.rename("old_name.txt", "new_name.txt")

# Delete a file (raises FileNotFoundError if missing)
if os.path.exists("temp.txt"):
    os.remove("temp.txt")

# Walk a directory tree
for root, dirs, files in os.walk("project"):
    for filename in files:
        full_path = os.path.join(root, filename)
        print(full_path)

Modern File Paths with pathlib

pathlib.Path (introduced in Python 3.4) provides an object-oriented API for paths. It is more readable and more portable across platforms than string-based os.path manipulation:

Python
from pathlib import Path

# Create a Path object
p = Path("data/notes.txt")

# Path inspection
print(p.name)        # notes.txt
print(p.stem)        # notes
print(p.suffix)      # .txt
print(p.parent)      # data
print(p.absolute())  # /home/user/project/data/notes.txt

# Build paths with / operator (cross-platform!)
base = Path("project")
config = base / "config" / "settings.json"
print(config)        # project/config/settings.json (backslashes on Windows)

# Check existence
print(p.exists())    # True / False
print(p.is_file())
print(p.is_dir())

# Read / write text directly
config_path = Path("config.txt")
config_path.write_text("debug=true\nport=8080\n", encoding="utf-8")
content = config_path.read_text(encoding="utf-8")
print(content)

# Read / write bytes directly
Path("data.bin").write_bytes(b'\x00\x01\x02\x03')

# Create directories
Path("logs/2024").mkdir(parents=True, exist_ok=True)

# List files matching a pattern (glob)
project = Path(".")
for py_file in project.glob("**/*.py"):
    print(py_file)

# File size (stat() raises FileNotFoundError if the path does not exist)
if p.is_file():
    print(f"Size: {p.stat().st_size} bytes")
ℹ️
pathlib vs os.path

For new code, prefer pathlib. It handles path separators on Windows (\) and Unix (/) automatically, and the dot-attribute API is far more readable than nested os.path.join(os.path.dirname(...)) calls.
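
For a side-by-side feel of the two styles, the sketch below builds the same sibling path both ways. The path project/src/app/main.py is made up for illustration and no files are touched.

```python
import os.path
from pathlib import Path

source = "project/src/app/main.py"   # hypothetical path, never opened

# os.path: nested function calls on plain strings
parent_dir = os.path.dirname(os.path.dirname(source))
old_style = os.path.join(parent_dir, "tests", "test_main.py")

# pathlib: attribute access and the / operator
new_style = Path(source).parent.parent / "tests" / "test_main.py"

print(Path(old_style) == new_style)   # True
print(new_style.name)                 # test_main.py
```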

Working with CSV and JSON Files

Python
import csv
import json

# ── CSV ──────────────────────────────────────────────────────
# Writing a CSV
students = [
    {"name": "Alice", "grade": "A", "score": 95},
    {"name": "Bob",   "grade": "B", "score": 82},
    {"name": "Carol", "grade": "A", "score": 91},
]
with open("students.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "grade", "score"])
    writer.writeheader()
    writer.writerows(students)

# Reading a CSV
with open("students.csv", "r", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(f"{row['name']}: {row['score']}")

# ── JSON ─────────────────────────────────────────────────────
config = {
    "debug": True,
    "port": 8080,
    "allowed_hosts": ["localhost", "127.0.0.1"],
}

# Write JSON
with open("config.json", "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)

# Read JSON
with open("config.json", "r", encoding="utf-8") as f:
    loaded = json.load(f)
    print(loaded["port"])   # 8080
▶ Output
Alice: 95
Bob: 82
Carol: 91
8080

Handling File Errors

Python
from pathlib import Path

def safe_read(filepath):
    """Read a file and return its content or None on failure."""
    try:
        return Path(filepath).read_text(encoding="utf-8")
    except FileNotFoundError:
        print(f"File not found: {filepath}")
    except PermissionError:
        print(f"Permission denied: {filepath}")
    except IsADirectoryError:
        print(f"Expected a file, got a directory: {filepath}")
    except OSError as e:
        print(f"OS error ({e.errno}): {e.strerror}")
    return None

def safe_write(filepath, content):
    """Write content to filepath, creating parent directories as needed."""
    try:
        path = Path(filepath)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
        return True
    except OSError as e:  # PermissionError is a subclass of OSError
        print(f"Could not write {filepath}: {e}")
        return False

result = safe_read("data/report.txt")
if result is not None:   # an empty file returns "", which is falsy but still valid
    print(result[:100])

πŸ‹οΈ Practical Exercise

Write a word_frequency(filepath) function that:

  1. Reads a text file line by line (efficient for large files).
  2. Splits each line into words, converts to lowercase, and strips punctuation.
  3. Counts how often each word appears using a dictionary.
  4. Returns the 10 most common words with their counts.
  5. Handles FileNotFoundError and PermissionError gracefully.
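
If you want something to compare your attempt against, the steps above can be sketched as follows. This is one possible approach, not a reference solution: it assumes collections.Counter (a dict subclass) counts as "a dictionary", that string.punctuation is an adequate definition of punctuation, and that returning an empty list is an acceptable way to signal failure. The top_n parameter and the sample file are additions for illustration.

```python
import string
import tempfile
from collections import Counter
from pathlib import Path

def word_frequency(filepath, top_n=10):
    """Return the top_n most common words as (word, count) pairs."""
    counts = Counter()
    try:
        with open(filepath, "r", encoding="utf-8") as f:
            for line in f:                                   # step 1: line by line
                for word in line.lower().split():            # step 2: split + lowercase
                    word = word.strip(string.punctuation)    # step 2: strip punctuation
                    if word:
                        counts[word] += 1                    # step 3: count
    except (FileNotFoundError, PermissionError) as e:        # step 5: handle errors
        print(f"Could not read {filepath}: {e}")
        return []
    return counts.most_common(top_n)                         # step 4: top N

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"    # hypothetical input file
    sample.write_text("The cat and the hat.\nThe end!\n", encoding="utf-8")
    top = word_frequency(sample, top_n=2)
    missing = word_frequency(Path(tmp) / "no_such_file.txt")

print(top[0])    # ('the', 3)
print(missing)   # []
```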

🔥 Challenge Exercise

Build a simple logging system using only file I/O and pathlib. Create a Logger class that:

  • Accepts a log directory path and a log level (DEBUG, INFO, WARNING, ERROR).
  • Writes each log message to a daily log file named YYYY-MM-DD.log.
  • Rotates (compresses or moves) logs older than 7 days.
  • Exposes .debug(), .info(), .warning(), and .error() methods.

Interview Questions

  • What does the with statement guarantee when working with files?
  • What is the difference between 'w' and 'a' file modes?
  • How do you read a very large file without loading it entirely into memory?
  • What is the difference between os.path and pathlib.Path?
  • What does newline="" do when opening a CSV file?
  • How do you create nested directories in one call in Python?
  • What exception is raised when you try to open a file that doesn't exist?
  • What is the difference between read(), readline(), and readlines()?
  • How do you check if a path exists and is a file (not a directory)?
  • What does seek(0) do to a file handle?

📋 Summary

  • Use open(file, mode, encoding="utf-8") to get a file handle.
  • Always use the with statement: it closes the file automatically, even on exceptions.
  • Modes: 'r' (read), 'w' (write/overwrite), 'a' (append), 'x' (exclusive create); append 'b' for binary.
  • Read methods: read() (all at once), readline() (one line), readlines() (list), or iterate the file object (memory-efficient).
  • Write methods: write(str), writelines(list), or print(..., file=f).
  • pathlib.Path provides a modern, object-oriented API; use / to join paths.
  • os.makedirs(..., exist_ok=True) creates nested directories safely.
  • Always handle FileNotFoundError, PermissionError, and OSError when doing file I/O.

Frequently Asked Questions

Does Python automatically flush written data to disk?

By default, Python buffers writes and flushes when the buffer is full or when the file is closed. Call f.flush() to force an immediate flush without closing. The with block closes the file (and thus flushes) when it exits, so you rarely need to call flush() manually.
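
The sketch below makes the buffering visible by checking the file size before and after an explicit flush. The file name buffered.txt is an assumption, and the os.fsync() call is an extra step beyond flush() that asks the operating system to commit its own cache to disk as well.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "buffered.txt"   # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        f.write("pending")
        # Usually 0 here: the data is still sitting in Python's buffer.
        before_flush = path.stat().st_size

        f.flush()                 # hand the buffer to the operating system
        os.fsync(f.fileno())      # ask the OS to commit it to disk
        after_flush = path.stat().st_size

print(before_flush, after_flush)
```

After the flush the size is 7 bytes, one per ASCII character written.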

How do I read a file that might not be UTF-8?

Use errors='replace' (substitutes U+FFFD for undecodable bytes) or errors='ignore' (silently drops them) in open() to tolerate bad bytes. For unknown encodings, install the chardet library (pip install chardet) and call chardet.detect(raw_bytes) to guess the encoding before opening.
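
Here is a small sketch of the error handlers in action, using only the standard library. It assumes a file legacy.txt that was saved as Latin-1 and is then read as UTF-8.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "legacy.txt"               # hypothetical file name
    path.write_bytes("café".encode("latin-1"))    # b'caf\xe9' is not valid UTF-8

    # Default (strict) error handling raises.
    try:
        path.read_text(encoding="utf-8")
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    replaced = path.read_text(encoding="utf-8", errors="replace")
    ignored = path.read_text(encoding="utf-8", errors="ignore")
    correct = path.read_text(encoding="latin-1")  # the real encoding recovers the text

print(strict_failed)      # True
print(ascii(replaced))    # 'caf\ufffd'
print(ascii(ignored))     # 'caf'
print(ascii(correct))     # 'caf\xe9'
```

Both handlers lose information, so they are best treated as a fallback for when the real encoding cannot be determined.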

Can I open two files at the same time with a single with statement?

Yes. Separate them with a comma: with open("src.txt") as src, open("dst.txt", "w") as dst:. Both files are opened and both are closed when the block exits, regardless of exceptions.
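
As a runnable sketch of that pattern, the example below copies one file to another while upper-casing each line. The file names mirror the ones in the answer but live in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src_path = Path(tmp) / "src.txt"   # hypothetical input
    dst_path = Path(tmp) / "dst.txt"   # hypothetical output
    src_path.write_text("one\ntwo\n", encoding="utf-8")

    # Two context managers in one with statement.
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line.upper())

    both_closed = src.closed and dst.closed
    copied = dst_path.read_text(encoding="utf-8")

print(both_closed)    # True
print(repr(copied))   # 'ONE\nTWO\n'
```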

When should I use pathlib instead of os.path?

For new code, always prefer pathlib. It is more Pythonic, cross-platform by design, and supports method chaining. Use os.path only when working with legacy codebases or libraries that require string paths (though Path objects can be passed to most modern APIs).