Origin: Craver (2007), organization; neuroscience connectomics (Sporns et al. 2005)
Question: How are the identified components wired to each other?
Licensing evidence: Path patching (edge-level) + specificity of claimed pathways vs. alternatives
Interpretive-validity risk: Confusing temporal ordering (layer sequence) with causal wiring (information flow)
Position in partial order: $I_{\text{con}} > I_{\text{top}}$; asserts structure (a graph), not just membership (a set)

A verdict tagged [implementational-connectomic] identifies directed connections between components: which feeds into which, through what pathway. This is stronger than topographic because it asserts structure (a directed graph) rather than just membership (a set). The connectomic claim says: “head $A$ sends information to head $B$ through the residual stream, and $B$’s computation depends on receiving $A$’s output.”

The analogy to neuroscience is deliberate: a connectome is a wiring diagram. It tells you the structure of the network without specifying what each neuron computes. In MI, a circuit graph (nodes = heads, edges = information-flow dependencies) is a connectomic claim. It is more than a list of important heads, and less than an algorithm.

Let $G = (V, E)$ be a directed graph where $V = C$ (the circuit components) and $E \subseteq V \times V$ (directed edges). A connectomic claim asserts:

$$\forall (c_i, c_j) \in E: \quad \text{PathEffect}(c_i \to c_j) > \delta \quad \text{AND} \quad \text{PathEffect}(c_i \to c_j) \gg \text{PathEffect}(c_i \to c_k) \text{ for } (c_i, c_k) \notin E$$

where $\text{PathEffect}(c_i \to c_j)$ is the causal effect of patching $c_i$’s output specifically at the path to $c_j$ (holding other paths fixed). This is what path patching measures.
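To make "patching at one path while holding other paths fixed" concrete, here is a minimal sketch on a toy scalar model. Everything in it is a hypothetical stand-in chosen so the arithmetic is easy to follow: the components `head_a` and `head_b`, the `readout` weights, and the clean and corrupted inputs are assumptions, not the procedure of any particular library or paper.

```python
# Toy model (all names and weights are illustrative assumptions):
#   a   = head_a(x)
#   b   = head_b(x + a)      # B reads the residual stream, which contains A's output
#   out = readout(x, a, b)   # the readout sees x, A's output, and B's output

def head_a(x: float) -> float:
    return 2.0 * x

def head_b(resid: float) -> float:
    return -1.0 * resid

def readout(x: float, a: float, b: float) -> float:
    # A's direct contribution to the output is deliberately small (weight 0.1).
    return x + 0.1 * a + 1.0 * b

def run(x_clean: float, x_corrupt: float,
        patch_a_to_b: bool = False, patch_a_to_out: bool = False) -> float:
    """Forward pass on x_clean, replacing A's output with its corrupted value
    only along the selected path(s). All other paths keep clean values."""
    a_clean = head_a(x_clean)
    a_corrupt = head_a(x_corrupt)
    a_seen_by_b = a_corrupt if patch_a_to_b else a_clean
    a_seen_by_out = a_corrupt if patch_a_to_out else a_clean
    b = head_b(x_clean + a_seen_by_b)
    return readout(x_clean, a_seen_by_out, b)

def path_effect(x_clean: float, x_corrupt: float, **patch: bool) -> float:
    """Absolute change in the output caused by patching the selected path."""
    return abs(run(x_clean, x_corrupt) - run(x_clean, x_corrupt, **patch))

x_clean, x_corrupt = 1.0, 0.0
effect_a_to_b = path_effect(x_clean, x_corrupt, patch_a_to_b=True)
effect_a_to_out = path_effect(x_clean, x_corrupt, patch_a_to_out=True)
print(effect_a_to_b, effect_a_to_out)
```

In this toy setup the path through B carries an effect of about 2.0, while the direct path from A to the readout carries about 0.2. That gap between a claimed edge and an alternative path is the kind of contrast the definition asks for.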

The key distinction from topographic: a topographic claim is invariant to permutation of the component set. A connectomic claim is not — it asserts directed relationships between specific pairs.
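The formal condition can be sketched as a check over a matrix of measured path effects. This is an illustration under stated assumptions: the function name `check_connectomic_claim`, the `margin` parameter (a numeric stand-in for the informal $\gg$), and the effect values are all hypothetical.

```python
import numpy as np

def check_connectomic_claim(path_effect, edges, delta, margin=5.0):
    """Return True if every claimed edge (i, j) exceeds the threshold delta and
    is at least `margin` times larger than every unclaimed edge (i, k) leaving
    the same source component.

    path_effect[i, j] holds the measured PathEffect(c_i -> c_j).
    `margin` is an assumed numeric reading of '>>' in the definition.
    """
    n = path_effect.shape[0]
    edge_set = set(edges)
    for (i, j) in edge_set:
        if path_effect[i, j] <= delta:
            return False
        for k in range(n):
            if k == i or (i, k) in edge_set:
                continue  # skip self-pairs and other claimed edges
            if path_effect[i, j] < margin * path_effect[i, k]:
                return False
    return True

# Hypothetical measurements for three components c_0, c_1, c_2.
effects = np.array([
    [0.00, 0.60, 0.03],
    [0.02, 0.00, 0.50],
    [0.01, 0.02, 0.00],
])

forward = [(0, 1), (1, 2)]
reverse = [(1, 0), (2, 1)]  # same components, edge directions flipped

forward_ok = check_connectomic_claim(effects, forward, delta=0.1)
reverse_ok = check_connectomic_claim(effects, reverse, delta=0.1)
print(forward_ok, reverse_ok)
```

Both edge lists involve the same three components, so a topographic claim could not tell them apart. The check passes for the forward edges and fails for the reversed ones, which is the sense in which a connectomic claim depends on specific directed pairs.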

What licenses an [implementational-connectomic] tag

  1. Path-level causal evidence: path patching (Goldowsky-Dill et al. 2023) or edge attribution demonstrating that the claimed pathway is load-bearing. Activation patching alone establishes node importance (topographic), not edge importance (connectomic).

  2. Specificity of claimed paths: patching along the claimed path has substantially more effect than patching along alternative paths of the same length. If every path from $A$ to downstream components has a similar effect, the claim is not connectomic; it shows only that $A$ matters (topographic).

  3. Directionality: the effect must be asymmetric. If corrupting $A$ affects $B$ and corrupting $B$ equally affects $A$ (after controlling for layer order), the “wiring” claim is not established.

  4. Convergent structural evidence (optional but strengthening): weight-space composition scores (QK/OV composition, virtual weights) that agree with the causal graph. When path patching and weight-space analysis agree on the same edges, the connectomic claim is substantially stronger.
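Criteria 2 and 3 can each be reduced to a single number for reporting. The sketch below is one possible way to do that. Both metrics (`specificity_ratio` and `asymmetry`) are hypothetical conventions introduced here for illustration, the inputs are assumed to be non-negative effect magnitudes, and the example values are invented.

```python
def specificity_ratio(claimed_effect: float, alternative_effects: list[float]) -> float:
    """Claimed-path effect divided by the largest effect among alternative
    paths of the same length. Larger means more specific."""
    strongest_alternative = max(alternative_effects)
    if strongest_alternative == 0.0:
        return float("inf")
    return claimed_effect / strongest_alternative

def asymmetry(effect_a_on_b: float, effect_b_on_a: float) -> float:
    """Signed asymmetry in [-1, 1] for non-negative effects:
    +1 means only A affects B, 0 means the two directions are equal."""
    total = effect_a_on_b + effect_b_on_a
    if total == 0.0:
        return 0.0
    return (effect_a_on_b - effect_b_on_a) / total

# Hypothetical measurements: one claimed path and three alternatives.
ratio = specificity_ratio(0.6, [0.04, 0.05, 0.03])
direction = asymmetry(0.5, 0.0)
print(ratio, direction)
```

What counts as "substantially more" or "asymmetric enough" is a judgment call that should be fixed before looking at the results. The sketch only makes the comparison explicit.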

What does NOT license an [implementational-connectomic] tag

  • Layer ordering alone. In a transformer, every earlier layer’s output is accessible to every later layer via the residual stream. The fact that $A$ is in layer 5 and $B$ is in layer 9 does not mean $A$ connects to $B$: every layer-5 head “connects” to every layer-9 head in this trivial sense. A connectomic claim must show specific, load-bearing connections above this background.
  • Attention pattern inspection. That head $B$ attends to positions where head $A$ has written is suggestive but not causal. The residual stream contains contributions from many components at each position.
  • Correlation between head activations. Two heads being co-active is a statistical fact, not a structural one. They might both respond to the same input feature without being wired to each other.
  • ACDC edges without effect validation. ACDC discovers edges via iterative patching, but the discovered graph should be validated with path patching on held-out data to confirm edge-level effects.
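The correlation point can be illustrated with a small simulation. In this sketch two hypothetical "heads" both read a shared input feature but are not wired to each other, so their activations correlate strongly while an intervention on one leaves the other unchanged. The model, the noise scale, and the function name `simulate` are assumptions made for illustration.

```python
import numpy as np

N_SAMPLES = 5000

def simulate(a_override=None):
    """Two hypothetical heads that both read a shared input feature x.
    Head B never reads head A's output, so there is no A -> B edge."""
    rng = np.random.default_rng(0)  # fixed seed: runs differ only by the intervention
    x = rng.normal(size=N_SAMPLES)
    a = x + 0.1 * rng.normal(size=N_SAMPLES)
    b = x + 0.1 * rng.normal(size=N_SAMPLES)
    if a_override is not None:
        a = np.full(N_SAMPLES, a_override)  # intervene on A's activation
    return a, b

a, b = simulate()
correlation = np.corrcoef(a, b)[0, 1]

# Intervene on A and measure how much B's activations move.
_, b_after = simulate(a_override=0.0)
max_change_in_b = np.max(np.abs(b_after - b))
print(correlation, max_change_in_b)
```

The correlation is close to 1, yet the intervention changes nothing in B, because the dependence runs through the common input rather than through an edge between the heads.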
Worked example: IOI QK composition edges

Claim. In the IOI circuit, the S-inhibition heads (L7H3, L7H10, L8H6, L8H10, L8H11) receive directed input from the duplicate-token heads (L5H1, L5H5) via the residual stream, and this connection is load-bearing for the suppression of the repeated name. [implementational-connectomic]

Evidence:

  • Path patching from L5H1/L5H5 output specifically at the path to the layer 7-8 S-inhibition heads shows $\Delta \text{logit diff} = 0.4$ to $0.7$
  • Alternative paths (L5H1 → name-mover heads directly) show $\Delta < 0.05$, so the information flows through S-inhibition first
  • QK composition scores: $\langle W_{OV}^{5.1}, W_{QK}^{7.3} \rangle$ is high relative to random head pairs (top 5% of all pairwise compositions)
  • Directionality: corrupting L5H1 degrades L7H3’s attention pattern; corrupting L7H3 does not affect L5H1’s attention pattern
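The composition-score evidence can be sketched as follows, assuming the Frobenius-norm convention $\|W_{QK} W_{OV}\|_F / (\|W_{QK}\|_F \, \|W_{OV}\|_F)$. The matrices here are random stand-ins rather than weights from any real model, the dimensions are arbitrary, and whether a transpose is applied depends on the composition type and weight layout, so treat the exact formula as an assumption.

```python
import numpy as np

def composition_score(w_qk: np.ndarray, w_ov: np.ndarray) -> float:
    """Frobenius-norm composition score; lies in [0, 1] by submultiplicativity."""
    return float(np.linalg.norm(w_qk @ w_ov)
                 / (np.linalg.norm(w_qk) * np.linalg.norm(w_ov)))

def percentile_rank(value: float, population: np.ndarray) -> float:
    """Fraction of the population strictly below `value`."""
    return float(np.mean(population < value))

rng = np.random.default_rng(0)
d_model, d_head, n_heads = 64, 8, 12

def random_low_rank() -> np.ndarray:
    # Rank-d_head matrix of shape (d_model, d_model), like a head's OV or QK circuit.
    return rng.normal(size=(d_model, d_head)) @ rng.normal(size=(d_head, d_model))

ov = [random_low_rank() for _ in range(n_heads)]  # hypothetical upstream heads
qk = [random_low_rank() for _ in range(n_heads)]  # hypothetical downstream heads

# Plant one aligned pair: downstream head 0 reads the subspace upstream head 0 writes.
u = rng.normal(size=(d_model, d_head))
ov[0] = u @ rng.normal(size=(d_head, d_model))
qk[0] = rng.normal(size=(d_model, d_head)) @ u.T

scores = np.array([[composition_score(q, o) for o in ov] for q in qk])
rank = percentile_rank(scores[0, 0], scores.flatten())
print(scores[0, 0], rank)
```

Ranking a pair against all pairwise scores, as in the "top 5%" statement above, gives the score a baseline. A raw composition score is hard to interpret without one.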

What this is not: This does not specify what the duplicate-token heads compute or what operation S-inhibition performs. It says only that information flows directionally from one group to the other, and that this flow is necessary for the behavior.

| Direction | What’s required |
|---|---|
| $I_{\text{top}} \to I_{\text{con}}$ (upgrade from topographic) | Path-level causal evidence that specific edges carry information, not just that nodes matter. |
| $I_{\text{con}} \to I_{\text{fun}}$ (upgrade to functional) | Specify what each node in the graph does to its input to produce its output. The graph tells you the wiring; the functional claim tells you what the components do. |
| $I_{\text{con}} \to A$ (upgrade to algorithmic) | Combine the graph (connectomic) with the node functions (functional) and demonstrate sufficiency of the procedure. An algorithm = wiring + operations + sufficiency. |

Instruments that provide connectomic-level evidence

  • A02 (Path patching) — direct causal evidence of edge-level effects
  • B02 (OV/QK composition) — weight-space evidence for compositional wiring
  • B08 (Edge Jaccard) — agreement between methods on the edge set
  • A13 (PC algorithm) — observational causal discovery of the graph structure
  • C01 (Transfer entropy) — directional information flow between components
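As an illustration of the edge-set agreement idea behind the Edge Jaccard instrument, here is a minimal sketch of a Jaccard index over directed edges. The function name `edge_jaccard`, the component labels, and the convention of returning 1.0 for two empty edge sets are assumptions, not the definition used by instrument B08.

```python
def edge_jaccard(edges_a: set[tuple[str, str]], edges_b: set[tuple[str, str]]) -> float:
    """Jaccard index |A & B| / |A | B| over directed edges (source, target)."""
    union = edges_a | edges_b
    if not union:
        return 1.0  # assumed convention: two empty graphs agree trivially
    return len(edges_a & edges_b) / len(union)

# Hypothetical edge sets recovered by two different methods.
path_patching_edges = {("A", "B"), ("B", "C"), ("A", "C")}
composition_edges = {("A", "B"), ("B", "C"), ("C", "D")}

agreement = edge_jaccard(path_patching_edges, composition_edges)
reversed_agreement = edge_jaccard({("A", "B")}, {("B", "A")})
print(agreement, reversed_agreement)
```

Because edges are ordered pairs, an edge and its reverse count as a disagreement, which keeps the measure sensitive to direction as well as to which components are involved.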