
This framework asks: How many independent directions does a circuit subspace actually use in practice?

PCA dimensionality measures the effective rank of a representation by analyzing the eigenvalue spectrum of its covariance matrix. A representation that concentrates variance in few principal components is low-dimensional (even if embedded in a high-dimensional space), suggesting compact, interpretable structure. A flat spectrum indicates the representation uses all available dimensions.

For circuit analysis, PCA dimensionality reveals whether a circuit head’s computations are intrinsically low-rank — concentrated in a small subspace — or genuinely high-dimensional. Low effective dimensionality suggests the head performs a simple, potentially interpretable operation.

| Source | Year | Key contribution |
| --- | --- | --- |
| Ansuini et al., “Intrinsic dimension of data representations in deep neural networks” | 2019 | Measured intrinsic dimension across layers |
| Aghajanyan et al., “Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning” | 2020 | Connected low intrinsic dimension to fine-tuning |
| Papyan et al., “Prevalence of Neural Collapse” | 2020 | Spectral collapse in final layers |
| Hu et al., “LoRA: Low-Rank Adaptation” | 2021 | Exploited low intrinsic dimensionality for efficient adaptation |

Given centered activations $H \in \mathbb{R}^{n \times d}$, the eigenvalues $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_d$ of the covariance matrix define several dimensionality measures:

$$
d_{\text{eff}}^{(90\%)} = \min\left\{ k : \frac{\sum_{i=1}^k \lambda_i}{\sum_{i=1}^d \lambda_i} \geq 0.9 \right\}
$$

The participation ratio offers a softer measure:

$$
d_{\text{PR}} = \frac{\left(\sum_i \lambda_i\right)^2}{\sum_i \lambda_i^2}
$$

A low $d_{\text{eff}}$ relative to the ambient dimension $d$ indicates a low-rank representation. The shape of the spectrum (exponential decay vs. power law vs. plateau) reveals the underlying structure of the computation.
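Both measures fall out of a single SVD of the centered activation matrix. A minimal NumPy sketch (the function names and the synthetic rank-3 data are illustrative, not part of the framework):

```python
import numpy as np

def effective_dim(H: np.ndarray, threshold: float = 0.9) -> int:
    """Smallest k whose top-k eigenvalues capture `threshold` of total variance."""
    Hc = H - H.mean(axis=0)  # center the activations
    # Covariance eigenvalues are squared singular values (the 1/(n-1)
    # factor cancels in the variance ratio).
    lam = np.linalg.svd(Hc, compute_uv=False) ** 2
    ratios = np.cumsum(lam) / lam.sum()
    return int(np.searchsorted(ratios, threshold) + 1)

def participation_ratio(H: np.ndarray) -> float:
    """d_PR = (sum lambda)^2 / (sum lambda^2): a soft count of active directions."""
    Hc = H - H.mean(axis=0)
    lam = np.linalg.svd(Hc, compute_uv=False) ** 2
    return float(lam.sum() ** 2 / (lam ** 2).sum())

# Rank-3 signal embedded in a 64-dimensional ambient space
rng = np.random.default_rng(0)
H = rng.normal(size=(1000, 3)) @ rng.normal(size=(3, 64))
print(effective_dim(H))        # 3: variance concentrates in three directions
print(participation_ratio(H))  # close to 3, far below the ambient d = 64
```

Note that $d_{\text{eff}}$ is a hard threshold while $d_{\text{PR}}$ is smooth, so the two can disagree when the spectrum has a long shallow tail.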

## Spectral Analysis (spectral_dimensionality.py)

Computes the eigenvalue spectrum of the per-head activation covariance and reports the effective dimension at the 90% and 95% variance thresholds.

**What it establishes:** How many principal directions capture most of the activation variance per circuit component.

**What it does not establish:** Whether the top directions are task-relevant (combine with E01/E02 for that).

Usage:

```sh
uv run python spectral_dimensionality.py --tasks ioi sva --threshold 0.9
```

Tracks effective dimensionality across layers, revealing whether representations compress or expand.

**What it establishes:** The dimensionality trajectory through the network; compression points suggest information bottlenecks.

**What it does not establish:** What information is discarded during compression.

Usage:

```sh
uv run python spectral_dimensionality.py --tasks ioi sva --layer-profile
```

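A plausible sketch of what a layer profile computes, assuming per-layer activations have already been cached (the `layer_dim_profile` helper and the synthetic activation list are illustrative; substitute activations hooked from your own model):

```python
import numpy as np

def layer_dim_profile(layer_acts, threshold=0.9):
    """Effective dimension at each layer; a dip marks a candidate bottleneck."""
    profile = []
    for H in layer_acts:
        Hc = H - H.mean(axis=0)
        lam = np.linalg.svd(Hc, compute_uv=False) ** 2
        ratios = np.cumsum(lam) / lam.sum()
        profile.append(int(np.searchsorted(ratios, threshold) + 1))
    return profile

# Synthetic stand-in for cached per-layer activations: the "network"
# compresses from rank 32 to rank 4 at the third layer.
rng = np.random.default_rng(1)
acts = [rng.normal(size=(500, r)) @ rng.normal(size=(r, 64))
        for r in (32, 32, 4, 4)]
print(layer_dim_profile(acts))  # effective dim per layer; dips at the compression point
```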
| Pattern | What it means |
| --- | --- |
| $d_{\text{eff}} \ll d$ | Low-rank representation; the head performs a simple projection |
| $d_{\text{eff}} \approx d$ | Full-rank; the head uses all available dimensions |
| Dimensionality drops at layer $\ell$ | Information bottleneck: representations compress |
| Exponential spectral decay | Single dominant direction with rapid falloff |
| Power-law spectrum | Multi-scale structure without a clear cutoff |
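The last two spectral shapes can be told apart quantitatively: at the same ambient dimension, an exponentially decaying spectrum yields a much lower participation ratio than a power-law one. A toy comparison (the decay constant and exponent are chosen purely for illustration):

```python
import numpy as np

d = 64
i = np.arange(1, d + 1)

exp_spec = np.exp(-i / 4.0)  # exponential decay: rapid falloff
pow_spec = 1.0 / i           # power law: heavy multi-scale tail

def pr(lam):
    """Participation ratio of an eigenvalue spectrum."""
    return lam.sum() ** 2 / (lam ** 2).sum()

# The exponential spectrum concentrates variance in fewer directions
# than the power-law spectrum at the same ambient dimension d.
print(pr(exp_spec))  # ~8
print(pr(pow_spec))  # ~14
```

Neither toy spectrum comes close to the full-rank value $d_{\text{PR}} = d = 64$, which a flat spectrum would attain.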