Python - Pandas
Overview
Estimated time: 35–50 minutes
Pandas provides two core data structures for tabular data analysis: Series (a one-dimensional labeled array) and DataFrame (a two-dimensional labeled table).
Learning Objectives
- Load data from CSV/JSON and inspect DataFrames.
- Select/filter with loc/iloc; perform joins and groupby aggregations.
- Understand tidy data principles and common pitfalls.
Examples
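import pandas as pd

# Small example frame: one row per person.
df = pd.DataFrame({"name": ["Ada", "Alan"], "score": [95, 88]})
# Filter rows with a boolean mask and pick columns in a single loc call.
print(df.loc[df["score"] > 90, ["name", "score"]])
# Mean score per name.
print(df.groupby("name")["score"].mean())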
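Loading and inspecting data, the first learning objective, might look like the sketch below. The CSV and JSON contents are made-up sample data held in memory so the snippet runs without any files; in practice a file path would normally be passed to read_csv or read_json instead.

```python
import io

import pandas as pd

# Hypothetical sample data; replace io.StringIO(...) with a real file path.
csv_text = "name,score\nAda,95\nAlan,88\nGrace,91\n"
scores = pd.read_csv(io.StringIO(csv_text))

# Quick inspection of what was loaded.
print(scores.head())   # first rows
print(scores.dtypes)   # column types inferred on read
print(scores.shape)    # (number of rows, number of columns)

# JSON records can be loaded in a similar way.
json_text = '[{"name": "Ada", "score": 95}, {"name": "Alan", "score": 88}]'
scores_json = pd.read_json(io.StringIO(json_text))
print(scores_json)
```

Checking dtypes and shape right after loading is a cheap way to catch a column that was read as text when numbers were expected.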
Guidance & Patterns
- Use loc (label-based) or iloc (position-based) explicitly; avoid chained indexing, which can silently assign to a temporary copy.
- Prefer vectorized operations; avoid row-wise apply when possible.
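The two patterns above can be illustrated with a small sketch. It reuses the example frame from this lesson; the "scaled" column and the threshold are illustrative assumptions, not part of the original material.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ada", "Alan"], "score": [95, 88]})

# Chained indexing such as df[df["score"] > 90]["score"] = 100 may modify
# a temporary copy. A single loc call with a row mask and a column label
# assigns on the original frame.
df.loc[df["score"] > 90, "score"] = 100

# Vectorized: one arithmetic operation over the whole column.
df["scaled"] = df["score"] / 100

# Row-wise apply gives the same values but calls a Python function per row,
# which is usually much slower on large frames. Shown for comparison only.
scaled_rowwise = df.apply(lambda row: row["score"] / 100, axis=1)

print(df)
```

On a two-row frame the speed difference is invisible; it tends to matter once the frame has many thousands of rows.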
Best Practices
- Check that each column has the intended dtype; parse dates on read (for example with the parse_dates argument of read_csv).
- Document assumptions; validate inputs and handle missing values.
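A minimal sketch of these practices, assuming a hypothetical CSV with a "day" column and a "score" column that has one missing value. The column names, the required-column check, and the choice to fill with the mean are illustrative assumptions; the right handling of missing values depends on the data.

```python
import io

import pandas as pd

# Hypothetical sample data with one missing score.
csv_text = "day,score\n2024-01-01,95\n2024-01-02,\n2024-01-03,88\n"
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["day"])

# Validate inputs: fail early if an expected column is absent.
required = {"day", "score"}
missing_cols = required - set(df.columns)
if missing_cols:
    raise ValueError(f"missing columns: {sorted(missing_cols)}")

# Confirm the date column was actually parsed as datetimes.
assert pd.api.types.is_datetime64_any_dtype(df["day"])

# Count missing values, then fill them (here with the column mean).
missing = df["score"].isna().sum()
filled = df["score"].fillna(df["score"].mean())

print(df.dtypes)
print(f"missing scores: {missing}")
print(filled)
```

Note that a missing entry makes the integer column load as float, which is one reason to inspect dtypes after reading.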
Exercises
- Join two DataFrames on a key and compute summary stats.
- Reshape data with pivot/melt to achieve tidy form.
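One possible starting point for both exercises is sketched below. All frames, column names, and values are invented for illustration; the exercises can be solved with any data that shares a key column.

```python
import pandas as pd

# Exercise 1: join on a key, then summarize.
people = pd.DataFrame({"id": [1, 2, 3], "name": ["Ada", "Alan", "Grace"]})
results = pd.DataFrame({"id": [1, 1, 2], "score": [95, 85, 88]})

# Inner join keeps only ids present in both frames (Grace has no results).
merged = people.merge(results, on="id", how="inner")
summary = merged.groupby("name")["score"].agg(["mean", "max"])
print(summary)

# Exercise 2: reshape a wide table into tidy (long) form and back.
wide = pd.DataFrame({"name": ["Ada", "Alan"], "math": [95, 88], "art": [70, 92]})

# melt: one row per (name, subject) observation.
tidy = wide.melt(id_vars="name", var_name="subject", value_name="score")
print(tidy)

# pivot: back to one column per subject.
back = tidy.pivot(index="name", columns="subject", values="score")
print(back)
```

Trying how="left" or how="outer" in the merge is a useful variation, since it shows how unmatched keys produce missing values.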