Machine Learning - Introduction

Overview

Machine Learning (ML) builds models that learn patterns from data to make predictions or decisions. This section orients you to common problem types, the basic workflow, and where to go next.

Typical workflow

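  1. Define the problem and the success metrics.
  2. Collect and explore the data; split it into train, validation, and test sets.
  3. Preprocess (cleaning, scaling, encoding) inside a pipeline so the same steps run at training and prediction time.
  4. Train a baseline, evaluate it on the validation set, and iterate on improvements.
  5. Confirm the final model on the held-out test set, package it for deployment, and monitor it in production.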
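
Steps 2 through 5 can be sketched in a few lines of scikit-learn. This is a minimal illustration under stated assumptions, not a prescribed recipe: the 60/20/20 split ratio, the majority-class baseline, the random seed, and all variable names are choices made for this example only.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 2: load the data and split it roughly 60/20/20 (assumed ratio).
X, y = load_breast_cancer(return_X_y=True)
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.25, random_state=0, stratify=y_tmp
)

# Step 4 (baseline): always predict the most frequent training class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_val_acc = accuracy_score(y_val, baseline.predict(X_val))

# Steps 3 and 4: scaling lives inside the pipeline, so the scaler is
# fit on the training data only and reused unchanged at prediction time.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
model_val_acc = accuracy_score(y_val, model.predict(X_val))

# Step 5: score on the test set once, after model choices are settled.
test_acc = accuracy_score(y_test, model.predict(X_test))

print({
    "baseline_val": baseline_val_acc,
    "model_val": model_val_acc,
    "test": test_acc,
})
```

Comparing against the baseline on the validation set shows whether the model has learned anything beyond the class proportions; the test score is kept aside so that it is not influenced by iteration.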

Quick start (classification)

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

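# Load the features and binary labels as NumPy arrays.
X, y = load_breast_cancer(return_X_y=True)
# Hold out 20% for testing; stratify keeps class proportions equal in both splits.
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
# Scale the features, then fit a logistic regression, as one pipeline.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(Xtr, ytr)
# Report accuracy on the held-out test split.
print({"accuracy": accuracy_score(yte, clf.predict(Xte))})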
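
The accuracy printed above comes from a single train/test split, so it depends on the chosen random_state. One way to see how stable that estimate is would be k-fold cross-validation; the sketch below assumes 5 stratified folds and a fixed seed, both picked for illustration rather than taken from the quick start.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Five stratified folds (assumed); the pipeline is refit on each training
# fold, so the scaler never sees the fold it is scored on.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")

print({"mean_accuracy": scores.mean(), "std": scores.std()})
```

A small standard deviation across folds suggests the single-split accuracy is representative; a large one suggests the estimate is sensitive to how the data happened to be divided.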

Next steps