Origin. Marr (1982), Level 1

Question. What function does the system compute, and why is it the right function?

Licensing evidence. Task specification + normative account + behavioral coverage across the full variation space

Interpretive-validity risk. Level inflation — claiming computational understanding when the evidence licenses only algorithmic or implementational

Position in partial order. $C > A > R > I_{\text{fun}} > I_{\text{con}} > I_{\text{top}}$ — highest commitment

A verdict tagged [computational] specifies an input-output mapping $f: \mathcal{X} \to \mathcal{Y}$ and a normative account of why $f$ is the correct or optimal solution to an environmental problem. Marr’s canonical example: early vision computes edge detection because edges correspond to depth discontinuities, object boundaries, and illumination changes. The explanation is why edges are the right thing to compute — not just that the system computes them.

In MI, a computational-mode claim says the model solves a specific problem (indirect object identification, ordinal comparison, in-context sequence completion) and that this solution is the solution to an identifiable subproblem of language modeling. The claim is not just that the model produces the right outputs — any sufficiently trained model does that. The claim is that the model’s internal organization reflects the structure of the problem in a way that makes the solution intelligible.

Let $\mathcal{T}$ denote a task (a behavioral regularity in the data). A computational-mode claim asserts:

  1. There exists a function $f_C: \mathcal{X} \to \mathcal{Y}$ that characterizes the task
  2. The circuit $C$ approximates $f_C$ across the relevant variation: $\Pr_{x \sim \mathcal{D}_{\text{test}}}[m(C, x) = f_C(x)] \geq 1 - \epsilon$
  3. $f_C$ is the right function to compute given the structure of language — there is a normative account of why this subproblem is separable and why the model should solve it

Condition (3) is what distinguishes computational from algorithmic. An algorithm can be stated without justifying why it solves the right problem.
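Condition (2) is an empirical coverage estimate and can be sketched in a few lines. Everything below is a hypothetical stand-in: `f_C` is a toy task function and `circuit_output` plays the role of $m(C, \cdot)$ on a toy input distribution.

```python
import random

def behavioral_coverage(circuit_output, f_C, test_inputs):
    """Empirical estimate of Pr[m(C, x) = f_C(x)] over a sampled test set."""
    matches = sum(circuit_output(x) == f_C(x) for x in test_inputs)
    return matches / len(test_inputs)

# Toy stand-ins (illustrative only): the task is "second element greater than first".
f_C = lambda pair: pair[1] > pair[0]
circuit_output = lambda pair: pair[1] > pair[0]  # plays m(C, .); here a perfect match
test_inputs = [(random.randint(1000, 1999), random.randint(1000, 1999))
               for _ in range(1000)]

epsilon = 0.05
assert behavioral_coverage(circuit_output, f_C, test_inputs) >= 1 - epsilon
```

The point of the sketch is that coverage is a property of a distribution, not of a handful of prompts: `test_inputs` must sample the relevant variation space, not just the discovery set.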

Three requirements, all mandatory:

  1. Specification of $f$ with domain and range characterized — precise enough that a reader could build a lookup table implementing it without knowing transformers exist
  2. Normative account — why this function is the correct solution to a genuine subproblem of language modeling. What structural properties of language make the problem separable?
  3. Error analysis — the model’s failure modes correspond to cases where the problem specification is genuinely ambiguous or ill-defined, not arbitrary failures. Edge-case behavior consistent with the normative theory.

Optionally: a formal optimality argument showing the model’s solution is efficient or rational given its constraints.
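Requirement (1) can be made concrete: a sketch of a task specification as a literal lookup table over a toy domain. The year-pair domain here is an illustrative assumption, not any published task, but it shows the bar — the mapping is fully determined with no reference to any model.

```python
# A task specification precise enough to implement as a lookup table,
# with no mention of transformers. Domain and range are toy choices.
DOMAIN = [(y1, y2) for y1 in range(1700, 1710) for y2 in range(1700, 1710)]
RANGE = {"valid", "invalid"}

LOOKUP = {(y1, y2): ("valid" if y2 > y1 else "invalid") for (y1, y2) in DOMAIN}

def f(x):
    """The specified input-output mapping, implemented without any model."""
    return LOOKUP[x]

assert f((1700, 1705)) == "valid"
assert f((1705, 1700)) == "invalid"
assert set(LOOKUP.values()) <= RANGE
```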

What does NOT license a [computational] tag

  • Naming a function without the normative account. “The model computes IOI” is not computational unless accompanied by an account of why IOI is a well-posed subproblem and what structural properties of language make it separable from broader coreference.
  • High faithfulness on one prompt distribution. Faithfulness is a behavioral metric on tested prompts, not a task-level characterization.
  • Showing ablation degrades performance. That is implementational (locus of causal necessity). It does not establish what problem the circuit solves.
  • Teleological inflation. “The model needs this circuit to perform the task” — need is implementational, not computational.

Worked example: IOI as computational vs. algorithmic

Wang et al. (2022) identify a circuit for indirect object identification. Is the claim computational or algorithmic?

Computational version: “The model solves the coreference resolution subproblem of identifying which named entity fills the indirect object role, treating this as a constraint-satisfaction problem over syntactic roles and entity mentions. This is a well-posed subproblem because indirect objects in English are systematically predictable from the verb’s argument structure and prior entity mentions.”

Algorithmic version: “The model identifies the indirect object through a specific procedure: duplicate-token heads mark repeated names, S-inhibition heads suppress the subject, and name-mover heads copy the remaining name to the output position.”

The first makes a commitment about what problem is being solved and why. The second makes a commitment about how it is solved step by step. The evidence for the second (activation patching, path patching) does not establish the first — the first requires additionally showing that the problem decomposition is correct (that IOI is genuinely separable from broader coreference) and that the solution structure reflects the problem structure.
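The contrast can be caricatured in code. Below is a minimal sketch of the algorithmic version's three-step procedure, with grossly simplified heuristics standing in for the heads; the function and its internals are illustrative assumptions, not the actual circuit.

```python
def ioi_sketch(tokens):
    """Toy caricature of the algorithmic-level claim:
    duplicate-token detection -> subject inhibition -> name moving."""
    names = [t for t in tokens if t.istitle()]               # crude proper-name heuristic
    duplicated = {n for n in names if names.count(n) > 1}    # "duplicate-token heads"
    remaining = [n for n in dict.fromkeys(names)             # "S-inhibition": drop the
                 if n not in duplicated]                     # repeated subject
    return remaining[0]                                      # "name-mover": copy to output

sentence = "when Mary and John went to the store , John gave a drink to"
assert ioi_sketch(sentence.split()) == "Mary"
```

Note what the sketch does not contain: nothing in it says why identifying the indirect object is a well-posed subproblem of language modeling. That is exactly the gap between the algorithmic and computational versions.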

Most published IOI work operates at the algorithmic level, using computational-level language.

Worked example: Greater-Than as a valid computational claim

Hanna et al. (2023) characterize the Greater-Than circuit:

Task specification. $f_{\text{GT}}(\text{“The war lasted from } y_1 \text{ to } y_2\text{”}) = \text{valid iff } y_2 > y_1$

Normative account. Temporal ordering is a genuine subproblem of language modeling because English systematically constrains year sequences in “from X to Y” constructions. The model should solve this because violating temporal order would produce low-probability continuations across a wide class of natural text.

Behavioral coverage. Tested across 11 sentence frames and year pairs spanning 1000-2000, with accuracy $\geq 0.92$ on all frames, including those not used during discovery.

Error analysis. Failures cluster at the boundary ($y_2 \approx y_1$), where the problem is genuinely ambiguous, consistent with the normative theory.

This satisfies all three requirements: specification, normative account, and error analysis.
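The task specification is simple enough to implement directly. A sketch, assuming a naive regex parser as the input-reading helper (an assumption of this illustration, not part of Hanna et al.'s setup):

```python
import re

def f_GT(sentence):
    """Greater-Than task specification: valid iff y2 > y1."""
    y1, y2 = (int(y) for y in re.findall(r"\b\d{4}\b", sentence))
    return "valid" if y2 > y1 else "invalid"

assert f_GT("The war lasted from 1742 to 1751") == "valid"
assert f_GT("The war lasted from 1751 to 1742") == "invalid"
# Boundary case (y2 == y1): the specification itself is unambiguous here,
# but per the error analysis this is where the model's failures cluster.
assert f_GT("The war lasted from 1745 to 1745") == "invalid"
```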

  • $A \to C$ (upgrade from algorithmic): requires a normative account of why this algorithm solves the right problem, plus error analysis showing that failures correspond to problem boundaries, not arbitrary implementation limits.
  • $C \to A$ (downgrade to algorithmic): if the normative account is missing, or the error analysis shows failures unrelated to problem structure, the claim is algorithmic at best.

Instruments that provide computational-level evidence

  • D01 (Behavioral: logit attribution) — measures output quality across conditions
  • D07 (Generalization gap) — tests whether the circuit generalizes beyond its discovery distribution
  • A06 (Probabilistic specificity) — tests whether the circuit is task-specific or a general bottleneck
  • F03 (Nomological validity) — whether the circuit obeys theoretical predictions about the task