Mechanistic Validity Taxonomy
| Role | Organizing spine of the entire framework |
| Layers | A (Instruments) → B (Evidence families) → C (Criteria) → D (Validity types) → E (Verdicts) |
| Verdict annotation | Mode tag: the seven description levels |
| Last updated | 16 May 2026 |
Every circuit claim in mechanistic interpretability is a chain from a concrete measurement to a conclusion. The taxonomy names every link in that chain. Reading bottom-up is how you build a claim. Reading top-down is how you evaluate one.
The five-layer hierarchy

    Mode tag ── [computational] [algorithmic] [representational] [implementational] [architectural] [structural] [transportable]
        │
        ▼
    Layer E ── Verdict ────────── The claim, stated with explicit scope and mode tag
        │
    Layer D ── Validity types ─── The five abstract questions a claim must answer
        │        ├── Construct
        │        ├── Internal
        │        ├── External
        │        ├── Measurement
        │        └── Interpretive
        │
    Layer C ── Criteria ───────── ~27 specific, falsifiable conditions, grouped by type
        │
    Layer B ── Evidence families ─ The six kinds of signal an instrument can produce
        │        Causal | Structural | Representational | Behavioral | Info-theoretic | Measurement
        │
    Layer A ── Instruments ────── The concrete runnable tests

A claim that skips Layer D is not a finding. It is a measurement with a story attached.
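One way to make the hierarchy concrete is to record a claim as a small data structure. The sketch below is illustrative only: the `Claim` class, its field names, and the `is_finding` rule are assumptions, not part of the framework. It assumes that "skipping Layer D" means leaving at least one of the five validity types unanswered, and it omits Layer C because the individual criteria are not listed on this page.

```python
from dataclasses import dataclass, field

# Names copied from the hierarchy diagram; the identifiers are illustrative.
VALIDITY_TYPES = ("construct", "internal", "external", "measurement", "interpretive")
EVIDENCE_FAMILIES = (
    "causal", "structural", "representational",
    "behavioral", "info-theoretic", "measurement",
)


@dataclass
class Claim:
    """Hypothetical record of one circuit claim, from Layer A up to Layer E."""
    instrument: str                                # Layer A: the test that was run
    evidence_family: str                           # Layer B: one of EVIDENCE_FAMILIES
    validity: dict = field(default_factory=dict)   # Layer D: validity type -> answer
    verdict: str = ""                              # Layer E: the scoped claim

    def __post_init__(self):
        if self.evidence_family not in EVIDENCE_FAMILIES:
            raise ValueError(f"unknown evidence family: {self.evidence_family!r}")

    def skipped_validity_types(self):
        """Validity types with no recorded answer, in diagram order."""
        return [v for v in VALIDITY_TYPES if v not in self.validity]

    def is_finding(self):
        """Assumed rule: a finding has an answer recorded for every validity type."""
        return not self.skipped_validity_types()


# Usage: a measurement with a verdict attached but no Layer D answers.
claim = Claim(
    instrument="hypothetical_patching_test",  # placeholder instrument name
    evidence_family="causal",
    verdict="Component X implements behavior Y",
)
print(claim.is_finding())              # False
print(claim.skipped_validity_types())  # all five validity types
```

Whether a recorded answer must also be positive for the claim to count as a finding is not settled here; the sketch only checks that no validity type was skipped.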
How to read the taxonomy
| Direction | Use case |
|---|---|
| Bottom-up (A → E) | Building a new claim: what does my instrument establish? |
| Top-down (E → A) | Auditing an existing claim: what evidence would this verdict require? |
| Sideways (across Layer B) | Checking convergent validity: do independent evidence families agree? |
| Mode tag last | After the verdict is assembled, check whether the declared description level is licensed |
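The reading directions in the table can be sketched as simple traversals. The function names below are hypothetical, and the convergence rule (at least two evidence families measured, all pointing the same way) is an assumed reading of "do independent evidence families agree?", not a definition taken from the framework.

```python
LAYERS = ("A", "B", "C", "D", "E")


def reading_order(direction):
    """Return the layers in the order they are visited for a given direction."""
    if direction == "bottom-up":   # building a new claim
        return list(LAYERS)
    if direction == "top-down":    # auditing an existing claim
        return list(reversed(LAYERS))
    raise ValueError(f"unknown direction: {direction!r}")


def families_agree(results):
    """Sideways check across Layer B.

    `results` maps an evidence family name to True (supports the claim) or
    False (contradicts it). Assumed rule: convergence needs at least two
    families, and every measured family must give the same answer.
    """
    return len(results) >= 2 and len(set(results.values())) == 1


print(reading_order("top-down"))                              # ['E', 'D', 'C', 'B', 'A']
print(families_agree({"causal": True, "structural": True}))   # True
print(families_agree({"causal": True, "behavioral": False}))  # False
```

A single evidence family returns `False` here by design: with only one family there is nothing for it to converge with.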
How the layers gate each other
The layers form a dependency order:
- Layer A must be run before Layer B can be assigned.
- Layer C cannot be assessed until Layer B has been assigned.
- Layer D cannot be satisfied until Layer C has been assessed.
- Layer E cannot be written until Layer D has been satisfied.
- The mode tag is applied to Layer E last.
The dependency order is the reason the audit procedure runs construct validity first and interpretive validity last.
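The gating rules above amount to a strict linear order, which a checker can enforce in a few lines. This is a minimal sketch under that assumption; `STEPS` and `blocking_step` are hypothetical names, and the mode tag is treated as a sixth step that follows Layer E.

```python
from typing import Optional

# Layers in dependency order, with the mode tag applied last.
STEPS = ("A", "B", "C", "D", "E", "mode_tag")


def blocking_step(step: str, completed: set) -> Optional[str]:
    """Return the earliest unfinished prerequisite of `step`.

    Returns None when every earlier step is in `completed`, meaning `step`
    may start.
    """
    if step not in STEPS:
        raise ValueError(f"unknown step: {step!r}")
    for prev in STEPS[: STEPS.index(step)]:
        if prev not in completed:
            return prev
    return None


print(blocking_step("D", {"A", "B"}))                  # C
print(blocking_step("mode_tag", {"A", "B", "C", "D"})) # E
print(blocking_step("A", set()))                       # None
```

Returning the blocking step rather than a bare boolean tells the author which layer to go back to before continuing.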
The mode tag
Every verdict carries a bracketed description-mode tag that names the level of description the claim operates at.
| Tag | Claim type | Page |
|---|---|---|
| [computational] | What is being computed and why | A_computational.md |
| [algorithmic] | What operation is performed | B_algorithmic.md |
| [representational] | What is encoded, where, how | C_representational.md |
| [implementational] | Which weights/components carry it | D_implementational.md |
| [architectural] | How computational labor is distributed | E_architectural.md |
| [structural] | What the weights say before any input | F_structural.md |
| [transportable] | Which features survive cross-model shift | G_transportable.md |
For how the seven modes extend Marr’s three levels, see H_marr-comparison.md.
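A tagged verdict can be validated mechanically against the seven modes. In the sketch below the tag-to-page mapping is copied from the table above, while the function names and the choice to write the tag as a prefix (`[mode] verdict text`) are assumptions; the framework only states that the tag is bracketed.

```python
# Mode tag -> documentation page, as listed in the mode tag table.
MODE_PAGES = {
    "computational": "A_computational.md",
    "algorithmic": "B_algorithmic.md",
    "representational": "C_representational.md",
    "implementational": "D_implementational.md",
    "architectural": "E_architectural.md",
    "structural": "F_structural.md",
    "transportable": "G_transportable.md",
}


def tag_verdict(verdict, mode):
    """Prefix a verdict with its bracketed mode tag (prefix placement is assumed)."""
    if mode not in MODE_PAGES:
        raise ValueError(f"unknown mode tag: {mode!r}")
    return f"[{mode}] {verdict}"


def split_tag(tagged):
    """Split '[mode] verdict' into (mode, verdict), rejecting unknown tags."""
    if not tagged.startswith("["):
        raise ValueError("verdict has no mode tag")
    mode, closing, rest = tagged[1:].partition("]")
    if not closing or mode not in MODE_PAGES:
        raise ValueError(f"unknown or malformed mode tag in: {tagged!r}")
    return mode, rest.strip()


tagged = tag_verdict("Component X encodes feature Y", "representational")
print(tagged)                # [representational] Component X encodes feature Y
mode, text = split_tag(tagged)
print(MODE_PAGES[mode])      # C_representational.md
```

This only checks that the tag is one of the seven; deciding whether the evidence licenses the declared level remains a judgment made against the corresponding page.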
Where to go from here
- Dependency order in detail: B_dependency-order.md
- How to read and write a verdict: C_verdict-anatomy.md
- Design commitments behind the framework: D_design-principles.md