A GPT-2–style walkthrough of causal self-attention—from a small worked example through single-head and multi-head formulations and their computational graphs.
An exploration of Sparse Autoencoders as a tool for decomposing polysemantic neural network representations into interpretable, monosemantic features.