Python - RegEx
Overview
Estimated time: 25–35 minutes
Use the re module to search, extract, and replace text using patterns.
Learning Objectives
- Use re.search, re.findall, and re.sub.
- Write readable patterns with raw strings and verbose mode.
- Compile patterns for reuse.
Prerequisites
Basic search and groups
import re
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", "Date: 2025-09-05")
if m:
    year, month, day = m.groups()
    print(year, month, day)
Find all and replace
text = "Emails: alice@example.com, bob@test.org"  # placeholder addresses for illustration
print(re.findall(r"[\w.-]+@[\w.-]+", text))
print(re.sub(r"test\.org", "example.org", text))  # escape the dot so it matches a literal "."
Compiled and verbose patterns
pattern = re.compile(r"""
    ^                  # start
    [A-Za-z0-9_.-]+    # local part
    @
    [A-Za-z0-9.-]+     # domain
    $                  # end
""", re.VERBOSE)
print(bool(pattern.match("user@example.com")))  # placeholder address for illustration
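A compiled pattern object exposes the same search, findall, and sub operations as the module-level functions, so it can be built once and reused. The following is a minimal sketch of that idea; the name DATE and the sample lines are made up for illustration.

```python
import re

# Compile once, then reuse the same pattern object for several operations.
DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# Hypothetical sample input.
lines = [
    "Released 2025-09-05",
    "No date here",
    "Patched 2025-10-01 and 2025-10-15",
]

# findall returns one tuple per match because the pattern has three groups.
found = [DATE.findall(line) for line in lines]

# sub can reorder captured groups with backreferences (\1 is the year, \3 the day).
reordered = DATE.sub(r"\3/\2/\1", lines[0])

print(found)
print(reordered)
```

Compiling mainly helps readability when one pattern is used in several places, since the pattern gets a name and is defined only once.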
Common Pitfalls
- Forgetting raw strings (r"...") when backslashes are used.
- Overcomplicated regex when a simple split or parse would do.
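Both pitfalls can be seen in a few lines. This is a small sketch with made-up sample strings: the first part shows how a non-raw string silently changes the meaning of \b, and the second shows a case where str.split is enough.

```python
import re

# Pitfall 1: without a raw string, "\b" is a backspace character, not a word boundary.
text = "a cat sat"
not_raw = re.findall("\bcat\b", text)   # searches for backspace + "cat" + backspace
raw = re.findall(r"\bcat\b", text)      # searches for the whole word "cat"

# Pitfall 2: a fixed delimiter does not need a regex at all.
record = "2025-09-05"
year, month, day = record.split("-")

print(not_raw, raw)
print(year, month, day)
```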
Checks for Understanding
- What does re.VERBOSE enable?
- How do you capture groups in a pattern?
Answers
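- Whitespace and # comments outside character classes are ignored, so a long pattern can be split across lines and annotated for readability.
- Wrap the subpattern in parentheses, then read the captured text with m.group(n) or m.groups().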
Exercises
- Extract all dates in YYYY-MM-DD from a block of text.
- Write a regex to validate simple IPv4 addresses.
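One possible approach to both exercises is sketched below; it is not the only reasonable answer. The helper names extract_dates and is_ipv4 are hypothetical. The sketch assumes that "simple" IPv4 means four dot-separated decimal octets in the range 0-255 with no leading zeros, and that dates only need to have the right shape (it does not check that the month or day is valid).

```python
import re

# Exercise 1: YYYY-MM-DD dates. \b keeps a match from starting or ending mid-number.
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

def extract_dates(text):
    """Return every YYYY-MM-DD-shaped substring (shape only, not calendar validity)."""
    return DATE.findall(text)

# Exercise 2: four dot-separated octets, each 0-255, no leading zeros (an assumption).
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")

def is_ipv4(candidate):
    """Return True if the whole string is a dotted-quad IPv4 address."""
    return IPV4.fullmatch(candidate) is not None

print(extract_dates("Start 2025-09-05, end 2025-12-31"))
print(is_ipv4("192.168.0.1"), is_ipv4("256.1.1.1"))
```

Using fullmatch instead of match with ^ and $ avoids accepting a trailing newline, which $ would otherwise allow.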