Skip to content
Validity typeInternal
Pass conditionAblating the proposed component reliably degrades the target behavior, across ≥2 ablation methods
Evidence familyCausal
Minimum reportingAblation method(s), metric, delta value(s), number of prompt samples
Common failure modeReporting only one ablation method; not comparing zero, resample, and mean ablation

Necessity establishes that the proposed component is causally required for the behavior: when it is removed or disrupted, the behavior degrades.

Satisfied when:

  1. Ablating the component degrades the target behavior, measured on the target metric.
  2. The result holds across ≥2 ablation methods. Zero, resample, and mean ablation each have different failure modes. Consistent degradation across all three is robustly necessary.
  3. Result is computed over ≥30–50 prompts with variance estimated.
MethodWhat it doesFailure mode
Zero ablationSets activation to zeroUnnatural zero can trigger unexpected downstream effects
Resample ablationReplaces with activation from shuffled promptMay understate effects for correlated heads
Mean ablationReplaces with dataset meanMean carries task-relevant information; can overstate effects via mean-field disruption
Patch ablationReplaces with corrupted-input run activationMost controlled; establishes necessity under specific counterfactual

Using all three unconditional methods together and showing consistent degradation across all three is the gold standard.

Necessity establishes the component is required. It does not establish that restoring it alone recovers the behavior. Necessity alone licenses Causally suggestive tier at best. Mechanistically supported requires sufficiency (I2) as well.

This is the most common understated result in MI: showing ablation degrades behavior, calling the component “a core circuit member,” and implicitly claiming more than necessity licenses.

  • Ablation method(s), metric, delta for each method.
  • Number of prompts and whether variance was estimated.
  • If only one ablation method used: flag necessity as partially satisfied (method variance untested).