Machine Learning - Recurrent Neural Networks (RNN)
Overview
RNNs model sequences by maintaining a hidden state that is updated at every time step. LSTMs and GRUs add gating mechanisms that mitigate vanishing gradients, so they often perform better than plain RNNs on longer contexts.
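To make the idea of a hidden state concrete, here is a minimal sketch of a plain (non-gated) RNN in pure Python, without PyTorch. It assumes the common formulation h_t = tanh(W_x x_t + W_h h_{t-1} + b). The function name `rnn_step` and all weight values are hypothetical, chosen only for illustration.

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One plain RNN step: h_t = tanh(W_x @ x_t + W_h @ h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        pre = b[i]
        pre += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        pre += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_t.append(math.tanh(pre))
    return h_t

# Hypothetical toy weights: 2 input features, 2 hidden units.
W_x = [[0.5, -0.3], [0.1, 0.8]]
W_h = [[0.2, 0.0], [0.0, 0.2]]
b = [0.0, 0.0]

sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 time steps
h = [0.0, 0.0]  # initial hidden state
states = []
for x_t in sequence:
    # The same weights are reused at every step; only h carries history.
    h = rnn_step(x_t, h, W_x, W_h, b)
    states.append(h)
```

Each entry of `states` depends on all earlier inputs through `h`, which is what lets the network model order. Backpropagating through many such steps is where gradients tend to vanish.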
Minimal LSTM (PyTorch)
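import torch
import torch.nn as nn

# Sequence length, batch size, input features per step, hidden units
seq_len, batch, feat, hidden = 20, 4, 8, 16
# Default layout (batch_first=False) is (seq_len, batch, feat)
x = torch.randn(seq_len, batch, feat)
lstm = nn.LSTM(input_size=feat, hidden_size=hidden, num_layers=1)
# Omitting the initial state makes h_0 and c_0 default to zeros
out, (h, c) = lstm(x)
# out: (seq_len, batch, hidden); final hidden state h and cell state c: (1, batch, hidden)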
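As a follow-up sketch, the snippet below checks how `out` and `h` relate and shows two common variations: a GRU, and the `batch_first=True` layout. It repeats the same sizes as above. The variable names (`gru`, `lstm_bf`, and so on) are illustrative choices, not part of the original example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # reproducible random input and weights
seq_len, batch, feat, hidden = 20, 4, 8, 16
x = torch.randn(seq_len, batch, feat)

lstm = nn.LSTM(input_size=feat, hidden_size=hidden, num_layers=1)
out, (h, c) = lstm(x)

# For a single-layer, unidirectional LSTM, the last time step of `out`
# holds the same values as the final hidden state.
last_step_matches = torch.allclose(out[-1], h[-1])

# A GRU follows the same call pattern but has no separate cell state,
# so it returns only (output, final hidden state).
gru = nn.GRU(input_size=feat, hidden_size=hidden, num_layers=1)
gru_out, gru_h = gru(x)

# With batch_first=True the input and output are (batch, seq_len, ...),
# while the returned hidden state keeps the (num_layers, batch, hidden) layout.
lstm_bf = nn.LSTM(input_size=feat, hidden_size=hidden, batch_first=True)
out_bf, (h_bf, c_bf) = lstm_bf(x.transpose(0, 1))
```

Using `h[-1]` (or `out[-1]`) as a fixed-size summary of the sequence is a common starting point for classification heads. That equivalence no longer holds as-is for bidirectional models, where the backward direction finishes at the first time step.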