Python - CSV

Overview

Estimated time: 20–30 minutes

CSV is a common interchange format. Learn how to read and write CSV safely using the standard library’s csv module.

Learning Objectives

  • Read and write CSV files with csv.reader/csv.writer and DictReader/DictWriter.
  • Handle newlines and encoding correctly across platforms.
  • Customize dialects (delimiter, quotechar) and avoid common pitfalls.

Prerequisites

  • Files & I/O basics (open, context managers)

Examples

import csv
from pathlib import Path

rows = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Alan"},
]

# Write
with Path("people.csv").open("w", newline="", encoding="utf-8") as f:
    w = csv.DictWriter(f, fieldnames=["id", "name"])
    w.writeheader()
    w.writerows(rows)

# Read
with Path("people.csv").open("r", newline="", encoding="utf-8") as f:
    r = csv.DictReader(f)
    for row in r:
        print(row)

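Expected Output: one dict per data row (plain dicts on Python 3.8+). DictReader returns every value as a string, so id comes back as '1', not 1:

{'id': '1', 'name': 'Ada'}
{'id': '2', 'name': 'Alan'}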
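The example above uses the default dialect. As a minimal sketch of the dialect options named in the learning objectives, the block below writes and reads semicolon-delimited data with csv.writer and csv.reader. The in-memory buffer and the sample "note" column are assumptions made for illustration; a real script would use a file opened with newline="".

```python
import csv
import io

# In-memory buffer standing in for a file opened with newline="".
buf = io.StringIO(newline="")

# Write semicolon-delimited rows. QUOTE_MINIMAL (the default) quotes only
# fields that contain the delimiter, the quote character, or a newline.
writer = csv.writer(buf, delimiter=";", quotechar='"', quoting=csv.QUOTE_MINIMAL)
writer.writerow(["id", "name", "note"])
writer.writerow([1, "Ada", "likes; semicolons"])

text = buf.getvalue()
print(repr(text))

# Read the data back with the same delimiter and quote character.
buf.seek(0)
parsed = list(csv.reader(buf, delimiter=";", quotechar='"'))
print(parsed)
```

The field containing a semicolon is quoted on the way out and restored intact on the way in. Note that the integer 1 comes back as the string '1'.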

Common Pitfalls

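  • Pass newline="" to open on every platform when using csv. Without it, Windows adds an extra \r to each row on write (shown as blank lines in many viewers), and newlines embedded in quoted fields may be misread.
  • Always specify the encoding (usually utf-8). Files from Excel or legacy systems may start with a byte order mark (read them with utf-8-sig) or use a legacy code page such as cp1252.
  • Don’t parse CSV with split; use the csv module, which handles quoted delimiters and escaped quotes.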
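To make two of these pitfalls concrete, here is a small sketch that compares str.split with csv.reader on a quoted line, then shows what a UTF-8 byte order mark does to the first header name. The sample line and sample bytes are made up for illustration.

```python
import csv
import io

# A line with a comma inside a quoted field and a doubled (escaped) quote.
line = '1,"Lovelace, Ada","She said ""hello"""\r\n'

# Naive splitting breaks the quoted field apart and keeps the quote marks.
naive = line.rstrip("\r\n").split(",")

# csv.reader honors quoting and unescapes the doubled quotes.
parsed = next(csv.reader(io.StringIO(line, newline="")))

print(len(naive), naive)
print(len(parsed), parsed)

# Bytes that begin with a UTF-8 byte order mark (BOM), as some exports do.
raw = "id,name\r\n1,Ada\r\n".encode("utf-8-sig")

# Decoding as plain utf-8 leaves the BOM glued to the first header name.
header_utf8 = next(csv.reader(io.StringIO(raw.decode("utf-8"), newline="")))
# Decoding as utf-8-sig strips it.
header_sig = next(csv.reader(io.StringIO(raw.decode("utf-8-sig"), newline="")))

print(header_utf8, header_sig)
```

The naive split yields four pieces instead of three fields, and the plain utf-8 decode produces a header named '\ufeffid', which would make row["id"] fail with a KeyError in a DictReader.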

Best Practices

  • Prefer DictReader/DictWriter for clearer code.
  • Validate and sanitize data; be robust to missing columns and extra whitespace.
  • Document delimiters and quote conventions if you deviate from the default dialect.
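One way to approach the validation advice above is sketched below: a generator that trims header names and values, checks for required columns, and fills short rows with empty strings. The helper name clean_rows, the required-column tuple, and the sample data are hypothetical; restval and the fieldnames attribute are part of csv.DictReader.

```python
import csv
import io

# Hypothetical messy input: padded fields and a short second data row.
raw = "id, name , email\r\n1, Ada , ada@example.org\r\n2, Alan\r\n"


def clean_rows(f, required=("id", "name")):
    """Yield dicts with trimmed keys and values; fail early on missing columns."""
    # restval fills in columns that a short row leaves out.
    reader = csv.DictReader(f, restval="")
    # Reading fieldnames consumes the header row; strip stray whitespace.
    reader.fieldnames = [name.strip() for name in (reader.fieldnames or [])]
    missing = [col for col in required if col not in reader.fieldnames]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        # Extra fields beyond the header land under the key None; drop them.
        yield {key: value.strip() for key, value in row.items() if key is not None}


rows = list(clean_rows(io.StringIO(raw, newline="")))
for row in rows:
    print(row)
```

Failing on a missing required column before any row is processed tends to give clearer errors than a KeyError halfway through a file. Whether to drop or report extra fields is a design choice; this sketch silently drops them.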

Checks for Understanding

  1. Why is newline="" important when writing CSV on Windows?
  2. When should you prefer DictWriter over writer?
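Answers
  1. The csv writer already ends each row with \r\n. Without newline="", Windows text mode translates the \n again, producing \r\r\n, which shows up as an extra blank line after every row.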
  2. When you want to reference columns by name and ensure headers are written consistently.
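The first answer can be reproduced on any platform. The sketch below wraps a byte buffer in io.TextIOWrapper with newline="\r\n" to mimic the translation that Windows text mode applies by default, then compares the result with newline="". The helper name write_csv is hypothetical.

```python
import csv
import io


def write_csv(newline):
    """Write one CSV row through a text layer and return the raw bytes."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    csv.writer(text).writerow(["id", "name"])
    text.flush()
    return raw.getvalue()


# newline="\r\n" translates every "\n" written, like Windows text mode does.
translated = write_csv("\r\n")
# newline="" disables translation, so the writer's own "\r\n" passes through.
untranslated = write_csv("")

print(translated)
print(untranslated)
```

The translated output ends in \r\r\n, while the untranslated output ends in the single \r\n that the csv writer intended.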

Exercises

  1. Read a CSV with a semicolon delimiter; write a cleaned version with a comma delimiter and trimmed fields.
  2. Aggregate a CSV of transactions by user, outputting totals.