| Tier | Lateral (outside progression) |
| --- | --- |
| What it means | Multiple mechanistic accounts are consistent with all available evidence; the data does not distinguish between them |
| When to assign | Two or more hypotheses have comparable evidential support and no available experiment has been performed to separate them |
| Resolution path | Identify the discriminating experiment and perform it; the claim then moves to the appropriate progressive tier |
| Relationship to progressive tiers | A claim at any progressive tier can additionally be flagged as Underdetermined if competing accounts exist at that tier |

Underdetermined is not a failure. It is a precise characterization of the epistemic state: the evidence is real, the measurements are sound, but the data is equally consistent with multiple distinct mechanistic accounts. The informative response is to name the competing accounts and identify what experiment would distinguish them.

This verdict prevents premature commitment to one mechanistic story when the evidence does not support that commitment. It also provides a clear research direction: the discriminating experiment.

Formally, let $H_1, H_2, \ldots, H_n$ be competing mechanistic hypotheses for behavior $B$. Underdetermination holds when the available evidence $\mathcal{E}$ is approximately equally likely under all competing hypotheses, and the posterior ratio is determined primarily by priors rather than evidence.
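
One way to make this condition concrete is to write it as posterior odds for any pair of hypotheses $H_i$ and $H_j$. This assumes a Bayesian reading: the probability notation $P(\cdot)$ is introduced here for illustration and is not part of the definition above.

$$
\frac{P(H_i \mid \mathcal{E})}{P(H_j \mid \mathcal{E})}
= \frac{P(\mathcal{E} \mid H_i)}{P(\mathcal{E} \mid H_j)} \cdot \frac{P(H_i)}{P(H_j)},
\qquad
\frac{P(\mathcal{E} \mid H_i)}{P(\mathcal{E} \mid H_j)} \approx 1
\;\Longrightarrow\;
\frac{P(H_i \mid \mathcal{E})}{P(H_j \mid \mathcal{E})} \approx \frac{P(H_i)}{P(H_j)}
$$

Under this reading, the likelihood ratio carries almost no information, so whatever preference remains between $H_i$ and $H_j$ comes from the prior odds. How close to 1 the likelihood ratio must be to count as "approximately equal" is not specified here and would need to be fixed per analysis.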

    Verdict: Underdetermined [implementational-topographic]
    Claim: The Docstring Circuit implements variable binding.
    Competing accounts:
      (1) Variable binding: tracking which variable name maps to which argument position.
      (2) Positional copying: copying from a fixed offset regardless of variable identity.
    Evidence status: Activation patching results are consistent with both accounts. Neither predicts distinct behavior on the tested prompts.
    Discriminating experiment: Test on prompts where variable names are reordered relative to argument positions; the two accounts predict different outputs.
    Scope: GPT-2 Small, Python docstring completion, Heimersheim & Janiak prompt distribution
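
The logic of the discriminating experiment in the Docstring example can be sketched with two toy predictors. This is a minimal illustration under strong assumptions, not the actual circuit and not the method of Heimersheim & Janiak: the functions `predict_variable_binding` and `predict_positional_copying`, the argument names, and the list-based prompt representation are all hypothetical stand-ins for the two accounts.

```python
from typing import List


def predict_variable_binding(signature: List[str], documented: List[str]) -> str:
    """Toy account 1 (hypothetical): look up the last documented name in the
    signature and return the argument that follows it (identity-based)."""
    idx = signature.index(documented[-1])
    return signature[(idx + 1) % len(signature)]


def predict_positional_copying(signature: List[str], documented: List[str]) -> str:
    """Toy account 2 (hypothetical): copy from a fixed offset, namely the
    number of arguments documented so far, ignoring which names they were."""
    return signature[len(documented) % len(signature)]


def discriminates(signature: List[str], documented: List[str]) -> bool:
    """A prompt is discriminating when the two accounts predict different outputs."""
    return (predict_variable_binding(signature, documented)
            != predict_positional_copying(signature, documented))


# Hypothetical function signature and two docstring prefixes.
signature = ["load", "size", "files"]
prompts = {
    "in_order": ["load", "size"],   # docstring follows the signature order
    "reordered": ["size", "load"],  # names reordered relative to positions
}

for label, documented in prompts.items():
    print(
        label,
        predict_variable_binding(signature, documented),
        predict_positional_copying(signature, documented),
        discriminates(signature, documented),
    )
```

In this toy setup the in-order prompt yields the same prediction from both accounts, which mirrors the underdetermined state. Only the reordered prompt separates them, which is why it serves as the discriminating experiment.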

  • All competing hypotheses stated explicitly with their predictions
  • Evidence that supports each hypothesis listed
  • Explanation of why the available evidence does not discriminate
  • At least one discriminating experiment identified (what would the competing accounts predict differently?)
  • Current progressive tier of the evidence (Underdetermined is overlaid on a progressive tier)
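
The checklist above could be encoded as a small record type that reports which items are still missing. This is only a sketch: the class names, field names, and the `missing_items` method are hypothetical and do not correspond to any existing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypothesis:
    name: str
    prediction: str
    supporting_evidence: List[str] = field(default_factory=list)


@dataclass
class UnderdeterminedVerdict:
    claim: str
    hypotheses: List[Hypothesis]
    why_not_discriminated: str
    discriminating_experiments: List[str]
    progressive_tier: str

    def missing_items(self) -> List[str]:
        """Return the checklist items this record does not yet satisfy."""
        missing = []
        if len(self.hypotheses) < 2 or any(not h.prediction for h in self.hypotheses):
            missing.append("competing hypotheses with predictions")
        if any(not h.supporting_evidence for h in self.hypotheses):
            missing.append("supporting evidence for each hypothesis")
        if not self.why_not_discriminated.strip():
            missing.append("why the evidence does not discriminate")
        if not self.discriminating_experiments:
            missing.append("at least one discriminating experiment")
        if not self.progressive_tier.strip():
            missing.append("current progressive tier")
        return missing


# Example record built from the Docstring Circuit verdict. The tier is left
# empty on purpose, so the check flags it as missing.
verdict = UnderdeterminedVerdict(
    claim="The Docstring Circuit implements variable binding.",
    hypotheses=[
        Hypothesis(
            name="Variable binding",
            prediction="Output follows variable identity on reordered prompts",
            supporting_evidence=["Activation patching results"],
        ),
        Hypothesis(
            name="Positional copying",
            prediction="Output follows a fixed offset on reordered prompts",
            supporting_evidence=["Activation patching results"],
        ),
    ],
    why_not_discriminated="Neither account predicts distinct behavior on the tested prompts.",
    discriminating_experiments=["Reorder variable names relative to argument positions"],
    progressive_tier="",
)

print(verdict.missing_items())
```

Returning the list of unmet items, rather than raising on the first one, lets a reviewer see everything that still needs to be filled in before the verdict is complete.
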

| Resolution | Outcome |
| --- | --- |
| Discriminating experiment favors $H_i$ | Claim moves to the appropriate progressive tier under $H_i$; other hypotheses become Disconfirmed or deprioritized |
| All hypotheses shown to be equivalent | The accounts are notational variants; dissolve them into a single claim at its progressive tier |
| New hypothesis $H_{n+1}$ subsumes all others | The more general account replaces the competing specific ones |

  • Docstring Circuit (Heimersheim & Janiak, 2023) — variable binding vs. positional copying accounts both consistent with observed patching results
  • IOI S-inhibition heads — inhibition vs. attention-sink accounts both predict the observed attention patterns; targeted intervention on sink tokens would discriminate
  • Superposition vs. polysemanticity in real (non-toy) models — whether observed feature interference is superposition (geometric packing) or polysemanticity (intrinsic multi-functionality) remains underdetermined in many cases

Underdetermined does not mean “bad evidence.” A claim can have strong causal evidence (Tier 2 or 3 level) while simultaneously being underdetermined between competing accounts. The progressive tier captures evidential strength; the Underdetermined flag captures interpretive ambiguity. Both dimensions matter.