This framework asks: How similar are two representations in terms of their learned feature structure, independent of rotation or scale?

CKA measures the alignment between two sets of representations by comparing their Gram matrices (or linear kernels). Unlike simple correlation of flattened activations, CKA is invariant to orthogonal transformations and isotropic scaling — making it ideal for comparing representations across different layers, training runs, or model architectures.

For circuit analysis, CKA reveals whether two circuit heads learn similar representational structure even when their specific weight matrices differ. It also tracks how representations evolve across layers, identifying computational phases in the network.

| Source | Year | Key contribution |
| --- | --- | --- |
| Kornblith et al., “Similarity of Neural Network Representations Revisited” | 2019 | Introduced CKA; showed superiority over CCA and PWCCA |
| Cortes et al., “Algorithms for Learning Kernels Based on Centered Alignment” | 2012 | Original centered alignment formulation |
| Nguyen et al., “Do Wide Neural Networks Really Need to Be Wide?” | 2020 | Used CKA to analyze width vs. representation similarity |
| Raghu et al., “Do Vision Transformers See Like Convolutional Neural Networks?” | 2021 | CKA for cross-architecture comparison |

Given representations $X \in \mathbb{R}^{n \times p}$ and $Y \in \mathbb{R}^{n \times q}$ for $n$ inputs, linear CKA computes:

$$
\text{CKA}(X, Y) = \frac{\|Y^\top X\|_F^2}{\|X^\top X\|_F \,\|Y^\top Y\|_F}
$$

after centering both $X$ and $Y$. This is equivalent to HSIC (the Hilbert-Schmidt Independence Criterion) normalized by the individual kernel norms. CKA = 1 when the representations encode identical structure up to rotation and isotropic scaling; CKA = 0 when they are independent.

The linear kernel suffices for most analyses. For nonlinear structure, RBF-kernel CKA replaces the Gram matrix $XX^\top$ with $K_{ij} = \exp(-\|x_i - x_j\|^2 / 2\sigma^2)$.
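The linear-CKA formula is a few lines in NumPy. A minimal sketch for reference (the function name `linear_cka` is illustrative, not the repo's actual API):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between representations X (n x p) and Y (n x q)."""
    # Center each feature dimension across the n examples.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # With linear kernels, HSIC reduces to squared Frobenius norms of
    # the cross- and self-covariance matrices.
    hsic_xy = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic_xy / (norm_x * norm_y)
```

Because centering happens inside the function, `linear_cka(X, X @ Q)` returns 1 for any orthogonal `Q`, and `linear_cka(X, c * X)` returns 1 for any nonzero scalar `c`, matching the stated invariances.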

Computes pairwise CKA between all layers, producing a similarity heatmap that reveals representational phases.

What it establishes: Which layers share representational structure and where phase transitions occur. What it does not establish: What information content changes at each transition.
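The layer-pairwise heatmap can be sketched as follows; `layer_cka_matrix` and the inline `linear_cka` helper are illustrative names under the linear-CKA definition above, not the repo's actual API:

```python
import numpy as np

def linear_cka(X, Y):
    # Centered linear CKA, as in the formula above.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    return (np.linalg.norm(Y.T @ X, "fro") ** 2
            / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")))

def layer_cka_matrix(layer_acts):
    """Pairwise CKA over per-layer activations, each of shape (n, d_layer).

    Returns a symmetric (L, L) matrix; block structure in a heatmap of this
    matrix indicates distinct representational phases.
    """
    L = len(layer_acts)
    M = np.eye(L)  # CKA of a layer with itself is 1
    for i in range(L):
        for j in range(i + 1, L):
            M[i, j] = M[j, i] = linear_cka(layer_acts[i], layer_acts[j])
    return M
```

Note that CKA compares the $n \times n$ similarity structure over inputs, so layers with different widths ($d_\text{layer}$) are directly comparable.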

Usage:

```shell
uv run python cka_analysis.py --tasks ioi sva --kernel linear
```

Compares representations between the full model and ablated circuits to quantify how much representational structure a circuit accounts for.

What it establishes: Whether a subset of circuit heads captures the full model’s representational geometry. What it does not establish: Whether the captured structure is task-relevant (combine with RSA for that).

Usage:

```shell
uv run python cka_analysis.py --tasks ioi sva --compare-ablated
```
| Pattern | What it means |
| --- | --- |
| CKA ~ 1.0 across adjacent layers | Representational continuity; gradual refinement |
| CKA block structure (high within, low across blocks) | Distinct computational phases in the network |
| CKA(full, ablated) > 0.9 | Circuit subset preserves nearly all representational structure |
| CKA(full, ablated) < 0.5 | Ablation destroys significant representational geometry |
| Early-late CKA near zero | Deep transformation; early and late layers share little structure |