B06 — Template Distance
This framework asks: how structurally similar are circuits for different tasks, and do related tasks share circuit topology?
Template distance measures the structural distance between two circuits: the minimum number of node or edge additions, deletions, or substitutions needed to transform one circuit graph into the other. When circuits for semantically related tasks (e.g., IOI and colored objects) have low template distance, this suggests shared computational infrastructure. When unrelated tasks have high template distance, this is evidence that circuit discovery is identifying task-specific structure rather than generic attention patterns.
This instrument operates at the graph level rather than the weight level: it compares circuit topologies (which heads, which edges) rather than individual weight matrices. It is the structural complement to behavioral generalization tests — template distance asks whether the structure generalizes, not just the performance.
Theoretical grounding
| Source | Year | Key contribution |
|---|---|---|
| Conmy et al., arXiv 2304.14997 | 2023 | ACDC produces circuit graphs amenable to structural comparison |
| Sanchez-Lengeling et al., arXiv 2010.00321 | 2020 | Graph-edit distance and graph kernels for comparing computational graphs |
| Meister & Cotterell, arXiv 2305.15054 | 2023 | Circuit universality — shared structure across models and tasks |
| Hanna et al., arXiv 2305.00586 | 2023 | Cross-task circuit comparison in GPT-2 |
Core concept
Let circuits \( C_1 = (V_1, E_1) \) and \( C_2 = (V_2, E_2) \) be directed graphs whose vertices are model components (heads, MLPs, residual stream positions) and whose edges represent information flow. The graph-edit distance is:
\[ d_{\text{GED}}(C_1, C_2) = \min_{\text{edit sequence}} \sum_{i} \text{cost}(e_i) \]
where the minimum runs over edit sequences \( (e_1, e_2, \dots) \) that transform \( C_1 \) into \( C_2 \), and each edit operation \( e_i \) is a node or edge insertion, deletion, or substitution. For circuits with labeled nodes (layer, head index), a natural cost function assigns zero cost to matching nodes and unit cost to mismatches.
A normalized version, circuit metric distance (CMD), scales by circuit size:
\[ d_{\text{CMD}}(C_1, C_2) = \frac{d_{\text{GED}}(C_1, C_2)}{|V_1| + |V_2| + |E_1| + |E_2|} \]
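Values near 0 indicate nearly identical circuits; values near 0.5 indicate completely disjoint circuits of similar size, where every element of one circuit must be substituted (one unit-cost edit each) while the denominator counts the elements of both circuits.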
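The normalization above can be illustrated with a minimal Python sketch. It is not the implementation in `26_cmd.py`. It assumes that nodes and edges are identified by their labels, that matching labels cost nothing, and that leftover elements are paired off as unit-cost substitutions before any surplus is inserted or deleted. This is a simplification of exact graph-edit distance, because it does not check that edge edits are consistent with the node mapping. The function names and the toy circuits are hypothetical.

```python
from typing import Hashable, Iterable, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def _unit_cost(a: Set, b: Set) -> int:
    """Unit-cost edit distance between two label sets.

    Labels present in both sets match at zero cost. Leftover labels are
    paired off as substitutions (cost 1 each), and any surplus is inserted
    or deleted (cost 1 each), giving max(|a - b|, |b - a|) in total.
    """
    return max(len(a - b), len(b - a))


def circuit_metric_distance(
    nodes1: Iterable[Node],
    edges1: Iterable[Edge],
    nodes2: Iterable[Node],
    edges2: Iterable[Edge],
) -> float:
    """Label-matched approximation of GED, divided by |V1|+|V2|+|E1|+|E2|."""
    v1, v2 = set(nodes1), set(nodes2)
    e1, e2 = set(edges1), set(edges2)
    size = len(v1) + len(v2) + len(e1) + len(e2)
    if size == 0:
        return 0.0  # two empty circuits are treated as identical
    approx_ged = _unit_cost(v1, v2) + _unit_cost(e1, e2)
    return approx_ged / size


# Hypothetical toy circuits; nodes are (layer, head) labels.
nodes_a = {(0, 1), (5, 5), (9, 9)}
edges_a = {((0, 1), (5, 5)), ((5, 5), (9, 9))}
nodes_b = {(0, 1), (5, 5), (10, 7)}
edges_b = {((0, 1), (5, 5)), ((5, 5), (10, 7))}

print(circuit_metric_distance(nodes_a, edges_a, nodes_b, edges_b))
```

Under these assumptions, two equal-sized circuits with no labels in common score exactly 0.5, which matches the upper reference point described above.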
Instruments under B06
Circuit Metric Distance (26_cmd.py)
Computes pairwise CMD between circuits discovered for different tasks. Reports: (1) the full distance matrix, (2) hierarchical clustering of tasks by circuit similarity, (3) identification of shared “backbone” components present in multiple circuits.
What it establishes: Quantitative structural similarity between task circuits, enabling claims about shared vs. task-specific computational infrastructure.
What it does not establish: Whether shared structure implies shared mechanism (two circuits may share topology but implement different computations via different weights).
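As a rough illustration of two of the reported quantities (the pairwise distance matrix and the shared backbone components), the sketch below works on toy data. It does not reproduce `26_cmd.py`: the `Circuit` type, the function names, the `min_circuits` parameter, and the label-matched distance (matching labels cost zero, leftovers are unit-cost substitutions) are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Hashable, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]
Circuit = Tuple[Set[Node], Set[Edge]]  # (nodes, edges)


def label_matched_cmd(c1: Circuit, c2: Circuit) -> float:
    """Approximate CMD, assuming components are matched by label."""
    (v1, e1), (v2, e2) = c1, c2
    size = len(v1) + len(v2) + len(e1) + len(e2)
    if size == 0:
        return 0.0
    cost = max(len(v1 - v2), len(v2 - v1)) + max(len(e1 - e2), len(e2 - e1))
    return cost / size


def pairwise_distances(circuits: Dict[str, Circuit]) -> Dict[Tuple[str, str], float]:
    """Upper triangle of the distance matrix, keyed by sorted task-name pairs."""
    return {
        (a, b): label_matched_cmd(circuits[a], circuits[b])
        for a, b in combinations(sorted(circuits), 2)
    }


def backbone_nodes(circuits: Dict[str, Circuit], min_circuits: int = 2) -> Set[Node]:
    """Nodes that appear in at least `min_circuits` of the given circuits."""
    counts = Counter(node for nodes, _ in circuits.values() for node in nodes)
    return {node for node, n in counts.items() if n >= min_circuits}


# Hypothetical toy circuits; nodes are (layer, head) labels.
toy_circuits: Dict[str, Circuit] = {
    "task_a": ({(0, 1), (5, 5), (9, 9)}, {((0, 1), (5, 5)), ((5, 5), (9, 9))}),
    "task_b": ({(0, 1), (5, 5), (10, 7)}, {((0, 1), (5, 5)), ((5, 5), (10, 7))}),
    "task_c": ({(0, 1), (3, 0), (7, 2)}, {((0, 1), (3, 0)), ((3, 0), (7, 2))}),
}

print(pairwise_distances(toy_circuits))
print(backbone_nodes(toy_circuits, min_circuits=2))
```

Setting `min_circuits` to the number of circuits restricts the backbone to components present in every circuit.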
Usage:

    uv run python 26_cmd.py --tasks ioi sva greater_than

Reading the scores
| Pattern | What it means |
|---|---|
| Low CMD between related tasks | Shared computational infrastructure — potential circuit universals |
| High CMD between all task pairs | Each task uses genuinely distinct circuitry |
| Cluster of low-CMD tasks | Family of tasks sharing a computational backbone |
| One component in all circuits | Potential “hub” component — structurally universal |
| CMD near 0.5 | Circuits are maximally different — no structural overlap |
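To make the "cluster of low-CMD tasks" pattern concrete, the sketch below groups tasks whose pairwise CMD falls at or below a cutoff. This is single-linkage grouping at a fixed threshold, used here as a simple stand-in for full hierarchical clustering. The function name, the threshold value, and the distance values are illustrative assumptions, not measured results.

```python
from typing import Dict, List, Tuple


def low_cmd_clusters(
    distances: Dict[Tuple[str, str], float], threshold: float
) -> List[List[str]]:
    """Group tasks connected by any chain of pairs with CMD <= threshold."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        # Union-find lookup with path halving; registers unseen tasks.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        root_a, root_b = find(a), find(b)
        if d <= threshold and root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups: Dict[str, List[str]] = {}
    for task in list(parent):
        groups.setdefault(find(task), []).append(task)
    return sorted(sorted(g) for g in groups.values())


# Made-up distances for three hypothetical tasks.
toy_distances = {
    ("task_a", "task_b"): 0.2,
    ("task_a", "task_c"): 0.4,
    ("task_b", "task_c"): 0.4,
}

print(low_cmd_clusters(toy_distances, threshold=0.25))
```

The cutoff is a judgment call: a value close to 0.5 merges everything into one group, while a value close to 0 leaves every task on its own.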