Origin: Marr (1982), Level 2
Question: What procedure does the system execute to produce the output?
Licensing evidence: Path patching demonstrating information flow + operation specification + sufficiency of the claimed procedure
Interpretive-validity risk: Naming components sequentially and calling it an “algorithm” without specifying what each step does
Position in partial order: $C > A > R > I_{\text{fun}} > I_{\text{con}} > I_{\text{top}}$ — second highest

A verdict tagged [algorithmic] specifies a sequence of operations, their ordering, and how intermediate representations flow between them. The algorithm is stated with enough precision that it could be re-implemented — it is a description of a procedure, not just a naming of components.

An algorithmic claim says the model follows a specific step-by-step procedure. It must specify: (1) what operation each step performs on its input, (2) what output that operation produces, (3) how outputs flow between steps, and (4) that executing the claimed procedure on the claimed inputs produces the claimed outputs.

An algorithmic claim asserts the existence of a sequence of operations $\{o_1, \ldots, o_k\}$ such that:

$$y = o_k \circ o_{k-1} \circ \cdots \circ o_1(x)$$

where each $o_i$ is specified at the component level (which head/MLP performs it), with its input domain (what it reads from the residual stream) and output range (what it writes). The composition must be sufficient: executing the claimed procedure should reproduce the behavior without the rest of the model (Craver 2007’s sufficiency criterion).
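As an illustrative sketch (all names hypothetical, not components of any real model), the claim structure can be made executable: represent each $o_i$ as a function over a toy residual stream, compose them in order, and check sufficiency by running the composition in isolation.

```python
# Hedged sketch: an algorithmic claim as an explicit composition of
# component-level operations, plus a sufficiency check that executes the
# claimed procedure without the rest of the model. Names are illustrative.
from typing import Callable, Sequence

Op = Callable[[dict], dict]  # each op reads/writes a toy "residual stream"

def compose(ops: Sequence[Op]) -> Op:
    """Return o_k ∘ ... ∘ o_1 as a single procedure."""
    def procedure(stream: dict) -> dict:
        for op in ops:  # apply o_1 first, o_k last
            stream = op(stream)
        return stream
    return procedure

# Two toy operations standing in for specified head/MLP computations.
def o1_copy_prev(stream: dict) -> dict:     # e.g. a previous-token head
    return {**stream, "prev": stream["tokens"][-2]}

def o2_predict_copy(stream: dict) -> dict:  # e.g. an OV copy to the logits
    return {**stream, "prediction": stream["prev"]}

claimed = compose([o1_copy_prev, o2_predict_copy])

# Sufficiency check: the isolated procedure must reproduce the behavior.
out = claimed({"tokens": ["A", "B"]})
assert out["prediction"] == "A"
```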

The ordering is testable: if the algorithm claims step $A$ feeds step $B$, then interventions on $A$ at the correct position should affect $B$’s output, and interventions at other positions should not.

  1. Operation specification — each step states what transformation is applied, not just which component is active. “L5H1 attends to the previous token” is not enough; “L5H1 copies position information backward via its OV circuit, writing the attended token’s embedding into the residual stream at the current position” specifies the operation.

  2. Path-level causal evidence — path patching (not just activation patching) demonstrating the claimed information flow. Activation patching establishes necessity; path patching establishes directed dependency between specific steps.

  3. Timing consistency — if step $A$ at layer $l_1$ feeds step $B$ at layer $l_2 > l_1$, then corrupting $A$’s output at layer $l_1$ should degrade $B$’s behavior, and corrupting at $l_2$ directly should have a different (larger) effect.

  4. Sufficiency — the claimed procedure, executed in isolation (minimal circuit with complement ablated), must reproduce the behavior. This is the difference between “these components are important” (implementational) and “they jointly execute this procedure” (algorithmic).
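The logic of requirements 2–3 can be mocked up in a toy program. This is not real path patching (which intervenes on edges of a transformer’s computational graph); it only mirrors the shape of the test, with hypothetical functions standing in for claimed steps.

```python
# Hedged sketch: directed-dependency testing. Corrupting step A's output
# should change step B's output (the A -> B edge exists); corrupting a
# component off the claimed path should not. All functions are toy stand-ins.

def step_a(x):                      # claimed step A: extract the subject
    return x["subject"]

def step_b(a_out, x):               # claimed step B: consumes A's output
    return f"{a_out} -> {x['verb']}"

def off_path(x):                    # active component off the A -> B path
    return len(x["verb"])

def run(x, corrupt_a=False, corrupt_off_path=False):
    a_out = "CORRUPT" if corrupt_a else step_a(x)
    _ = -1 if corrupt_off_path else off_path(x)  # never feeds step B
    return step_b(a_out, x)

clean = {"subject": "Mary", "verb": "runs"}
assert run(clean) != run(clean, corrupt_a=True)         # A -> B edge exists
assert run(clean) == run(clean, corrupt_off_path=True)  # off-path: no effect
```

The asymmetry between the two assertions is the algorithmic content: the claim is not that components matter, but that a specific edge carries the information.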

What does NOT license an [algorithmic] tag

  • Naming components and calling their sequential activation an “algorithm.” An algorithm must specify what operation each step performs on its input to produce its output, not just which components are active in which order.
  • Activation patching alone. Activation patching establishes causal necessity (implementational). It does not establish directed information flow between steps (algorithmic).
  • Post-hoc narrative. “First the model attends to X, then it processes Y, then it outputs Z” narrated from attention patterns is not an algorithm unless each step’s operation is specified and the information flow is causally demonstrated.
Worked example: Induction heads as algorithmic

Olsson et al. (2022) describe induction heads with an algorithmic claim:

Step 1 (previous-token heads, L0-L1): Copy position information backward — the OV circuit writes the embedding of the previous token into the current position’s residual stream.

Step 2 (induction heads, L5-L6): Use QK composition with step 1’s output to attend to the token following the previous occurrence of the current token. The Q vectors at the current position compose with K vectors that now contain previous-token information (from step 1), creating an attention pattern that targets the post-match position.

Step 3 (copying via OV): The OV circuit of the induction head copies the attended token’s embedding to the output, boosting its logit.

Why this is algorithmic, not just implementational: Each step specifies an operation (copy, compose, attend). The information flow is directional (step 1 output feeds step 2 input via QK composition). The procedure is sufficient — the two-layer induction circuit reproduces in-context learning behavior in isolation.

Evidence: QK composition scores demonstrate the directed dependency. Patching step 1’s output disrupts step 2’s attention pattern. The minimal two-layer circuit reproduces the behavior.
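Because the three steps are fully specified, the claimed procedure can be implemented literally, independent of any trained weights. A toy sketch (illustrative, not Olsson et al.’s code): if the algorithmic claim is right, this direct implementation reproduces the in-context behavior induction heads exhibit.

```python
# Hedged sketch: the three-step induction procedure executed directly on a
# token sequence. Predicts the token that followed the previous occurrence
# of the current token — the behavior attributed to induction heads.

def induction_predict(tokens: list[str]):
    current = tokens[-1]
    # Step 1 (previous-token heads): pair each token with its predecessor.
    prev_pairs = [(tokens[i - 1], tokens[i]) for i in range(1, len(tokens))]
    # Step 2 (QK composition): attend to positions whose *previous* token
    # equals the current token — i.e. just after an earlier occurrence.
    # (Exclude the final pair, which is the current position itself.)
    matches = [nxt for prev, nxt in prev_pairs[:-1] if prev == current]
    # Step 3 (OV copy): copy the attended token as the prediction.
    return matches[-1] if matches else None

# "A B C ... A" -> predict "B", the token that followed the earlier "A".
assert induction_predict(["A", "B", "C", "A"]) == "B"
```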

Anti-pattern: IOI “algorithm” that’s actually topographic

Pseudo-algorithmic claim: “The IOI algorithm works as follows: duplicate-token heads fire, then S-inhibition heads fire, then name-mover heads fire.”

Why this fails: It lists components in layer order and calls it an algorithm. But it doesn’t specify: What operation do duplicate-token heads perform on their input? What representation do they write? How does S-inhibition read that output? The temporal sequence of layer computation is architecture, not algorithm. Every transformer processes layers sequentially — that fact alone is not an algorithmic claim.

Correct version: “Duplicate-token heads compute the identity of repeated name tokens by attending from the second occurrence to the first and writing a same-token signal into the residual stream. S-inhibition heads read this signal and suppress the corresponding name’s contribution to the output logits by writing a negative direction aligned with that name’s unembedding vector. Name-mover heads then copy the remaining (unsuppressed) name to the output.”

Now each step has a specified operation, input, and output.
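That difference is checkable: the corrected version is precise enough to implement directly. A hedged toy sketch (illustrative names, operating on tokens rather than the real circuit’s residual-stream activations) of the corrected procedure:

```python
# Hedged sketch of the corrected IOI procedure: each step is an explicit
# operation with an explicit input and output, so the composition can be
# executed and checked. Toy stand-in, not the actual circuit.

def ioi_predict(tokens: list[str], names: set[str]) -> str:
    mentioned = [t for t in tokens if t in names]
    # Step 1 (duplicate-token heads): identify the repeated name (subject S).
    duplicated = next(n for n in mentioned if mentioned.count(n) > 1)
    # Step 2 (S-inhibition heads): suppress S's contribution to the output.
    candidates = [n for n in set(mentioned) if n != duplicated]
    # Step 3 (name-mover heads): copy the remaining, unsuppressed name.
    return candidates[0]

sent = "When Mary and John went to the store , John gave a drink to".split()
assert ioi_predict(sent, {"Mary", "John"}) == "Mary"
```

The pseudo-algorithmic version (“heads fire in this order”) cannot be executed at all, which is exactly the distinction being drawn.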

Direction and what’s required:

$I_{\text{fun}} \to A$ (upgrade from functional): You have component-level functions — now demonstrate ordering, composition, and sufficiency: the functions flow into each other through a specific directed procedure that jointly produces the behavior.

$A \to C$ (upgrade to computational): Provide the normative account: why does this algorithm solve the right problem? Show error analysis consistent with problem boundaries.

$A \to I_{\text{fun}}$ (downgrade): If path-level causal evidence fails — information flow is not directional or the procedure is not sufficient in isolation — the claim is at best functional (individual operations) without the algorithmic composition.

Instruments that provide algorithmic-level evidence

  • A02 (Path patching) — directed causal evidence of information flow
  • A04 (Resample ablation) — sufficiency of the proposed procedure
  • B02 (OV/QK composition) — weight-space evidence for step composition
  • B08 (Edge Jaccard) — stability of the claimed information-flow graph