Python - CSV
Overview
Estimated time: 20–30 minutes
CSV is a common interchange format. Learn how to read and write CSV safely using the standard library’s csv module.
Learning Objectives
- Read and write CSV files with csv.reader/csv.writer and DictReader/DictWriter.
- Handle newlines and encoding correctly across platforms.
- Customize dialects (delimiter, quotechar) and avoid common pitfalls.
Prerequisites
- Files & I/O basics (open, context managers)
Examples
import csv
from pathlib import Path

rows = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Alan"},
]

# Write
with Path("people.csv").open("w", newline="", encoding="utf-8") as f:
    w = csv.DictWriter(f, fieldnames=["id", "name"])
    w.writeheader()
    w.writerows(rows)

# Read
with Path("people.csv").open("r", newline="", encoding="utf-8") as f:
    r = csv.DictReader(f)
    for row in r:
        print(row)
Expected Output: one dict per row with id and name keys. Values are read back as strings, e.g. {'id': '1', 'name': 'Ada'} on Python 3.8+.
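The learning objectives also mention csv.reader and csv.writer, which work with plain lists instead of dicts. The following is a minimal sketch of the same round trip, assuming an in-memory io.StringIO buffer in place of a real file; the buffer and the variable names are illustrative, not part of the lesson's original example.

```python
import csv
import io

# In-memory buffer standing in for a file opened with newline="".
buffer = io.StringIO(newline="")

writer = csv.writer(buffer)
writer.writerow(["id", "name"])               # header row, written by hand
writer.writerows([[1, "Ada"], [2, "Alan"]])   # data rows as plain lists

buffer.seek(0)                                # rewind before reading back
reader = csv.reader(buffer)
header = next(reader)                         # first row is the header
records = list(reader)                        # remaining rows; every value is a string

print(header)
print(records)
```

With the plain reader and writer, the header is just another row, so keeping it in sync with the data is the caller's job.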
Common Pitfalls
- On Windows, pass newline="" to open when using csv to avoid blank lines.
- Always specify the encoding (usually utf-8); be deliberate with files from Excel/legacy systems.
- Don’t parse CSV with split; use the csv module for proper quoting/escaping.
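To illustrate the last pitfall, here is a small sketch comparing str.split with csv.reader on a row whose fields contain a comma and escaped quotes. The sample data is made up for illustration.

```python
import csv
import io

# Hypothetical sample: the name contains a comma, the note contains doubled quotes.
text = 'id,name,note\r\n1,"Lovelace, Ada","said ""hello"""\r\n'

# Naive approach: splitting on commas cuts the quoted name in two
# and leaves the quote characters in place.
naive = text.splitlines()[1].split(",")

# csv.reader honours the quoting rules and unescapes the doubled quotes.
parsed = list(csv.reader(io.StringIO(text, newline="")))[1]

print(naive)   # four fragments, quotes still attached
print(parsed)  # three clean fields
```

The naive split yields four pieces for a three-column row, which is the kind of silent corruption the csv module is meant to prevent.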
Best Practices
- Prefer DictReader/DictWriter for clearer code.
- Validate and sanitize data; be robust to missing columns and extra whitespace.
- Document delimiters and quote conventions if you deviate from the default dialect.
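One possible way to combine these practices is sketched below: a named dialect records the non-default delimiter and quote character in one place, and the reader is configured to tolerate short rows and stray spaces. The dialect name "pipes", the sample data, and the variable names are all assumptions made for illustration.

```python
import csv
import io

# Register the non-default conventions once, under a descriptive name.
csv.register_dialect(
    "pipes",
    delimiter="|",
    quotechar="'",
    skipinitialspace=True,  # ignore spaces right after each delimiter
)

# Hypothetical input: a quoted field containing the delimiter,
# a trailing space, and a second row with the email column missing.
raw = (
    "id| name| email\r\n"
    "1| 'Lovelace | Ada'| ada@example.com\r\n"
    "2| Alan \r\n"
)

reader = csv.DictReader(
    io.StringIO(raw, newline=""),
    dialect="pipes",
    restval="",  # value used for columns missing from a short row
)

# Trim any remaining whitespace around each value.
cleaned = [{key: value.strip() for key, value in row.items()} for row in reader]

for row in cleaned:
    print(row)
```

This sketch does not handle rows with more fields than headers. DictReader collects the surplus values as a list under the restkey key (None by default), so the strip call would need a guard for that case.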
Checks for Understanding
- Why is newline="" important when writing CSV on Windows?
- When should you prefer DictWriter over writer?
Answers
- To prevent extra blank lines due to newline translation.
- When you want to reference columns by name and ensure headers are written consistently.
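The first answer can be observed on any platform with a small simulation. The csv writer ends each row with "\r\n" by default; if the file object then translates "\n" to "\r\n" (the default behaviour on Windows), each row ends in "\r\r\n". The sketch below forces that translation with newline="\r\n" on an in-memory stream; the helper name written_bytes is hypothetical.

```python
import csv
import io

def written_bytes(newline):
    """Write one CSV row through a text layer and return the raw bytes."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    csv.writer(text).writerow(["id", "name"])
    text.flush()  # push buffered text down to the byte stream
    return raw.getvalue()

# newline="\r\n" mimics the translation Windows applies by default.
translated = written_bytes("\r\n")
# newline="" disables translation, as the csv documentation recommends.
untranslated = written_bytes("")

print(translated)
print(untranslated)
```

The doubled carriage return in the translated output is what many tools render as an empty line between rows.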
Exercises
- Read a CSV with a semicolon delimiter; write a cleaned version with a comma delimiter and trimmed fields.
- Aggregate a CSV of transactions by user, outputting totals.