A12 — Transportability (Pearl & Bareinboim)
Section titled “A12 — Transportability (Pearl & Bareinboim)”This framework asks: under what formal conditions can a circuit discovered in one model (or scale, or distribution) be validly transferred to another?
Transportability theory (Pearl & Bareinboim 2011) provides a calculus for determining when causal conclusions derived in one domain (the “source”) are valid in a different domain (the “target”). In MI, this addresses the generalization question: if we discover that heads 9.9 and 10.0 form the name-mover circuit in GPT-2 Small, does this conclusion transfer to GPT-2 Medium? To a model trained on different data? To a different task formulation? Transportability theory formalizes exactly which structural assumptions must hold for the transfer to be valid, and identifies when additional target-domain data is needed.
This is not merely an empirical question (“does the same circuit show up?”) but a formal one: transportability tells you which differences between source and target can be tolerated and which break the transfer. When a circuit fails to transport, the theory identifies the specific causal assumptions that were violated.
Theoretical grounding
Section titled “Theoretical grounding”| Source | Year | Key contribution |
|---|---|---|
| Pearl & Bareinboim, “Transportability of Causal and Statistical Relations” | 2011 | Formal calculus for transporting causal conclusions across domains |
| Pearl, Causality | 2000/2009 | Structural causal models underlying transportability |
| Méloux et al., arXiv 2411.10442 | 2025 | Cross-model circuit comparison in language models |
| Mueller et al., arXiv 2406.14673 | 2024 | MIB benchmark: standardized evaluation enabling cross-model comparison |
Core concept: selection diagrams and S-nodes
Section titled “Core concept: selection diagrams and S-nodes”Transportability uses selection diagrams — causal graphs augmented with S-nodes that mark variables whose mechanisms differ between source and target domains. A causal conclusion is transportable if the do-calculus can express the target-domain quantity using only source-domain experimental data plus target-domain observational data, accounting for the S-nodes.
In MI terms: if the source model is GPT-2 Small and the target is GPT-2 Medium, the S-nodes mark components whose mechanisms differ (e.g., different head counts, different learned representations). The circuit is transportable if its causal structure does not depend on variables affected by S-nodes. If it does, transportability theory identifies what additional target-domain experiments are needed.
The cross-model invariance test empirically checks transportability by measuring whether the same causal structure (same edges, same IIA scores) holds across models. Formal transportability analysis can then explain why certain structures transfer (they don’t depend on S-variables) and why others don’t.
Instruments under A12
Section titled “Instruments under A12”C38 — Cross-Model Invariance (38_cross_model_invariance.py)
Section titled “C38 — Cross-Model Invariance (38_cross_model_invariance.py)”Tests whether a circuit’s causal structure is invariant across model variants. Evaluates the same circuit hypothesis (same component roles, same causal edges) in multiple models and measures structural agreement:
[ \text{Invariance}(M_1, M_2) = 1 - \frac{\text{SHD}(\mathcal{C}{M_1}, \mathcal{C}{M_2})}{|\mathcal{C}{M_1}| + |\mathcal{C}{M_2}|} ]
where SHD is structural Hamming distance between the circuits discovered in each model.
What it establishes: Whether circuit structure is model-invariant (evidence of transportability) or model-specific.
What it does not establish: Why the structure transfers or fails to transfer (requires formal transportability analysis with selection diagrams).
Usage:
uv run python 38_cross_model_invariance.py --tasks ioi sva --n-prompts 40C41 — Transportability Analysis (41_transportability.py)
Section titled “C41 — Transportability Analysis (41_transportability.py)”Instrument status: This script is a stub awaiting implementation. The directory listing confirms it does not yet exist.
An implementation would require:
- Selection diagram construction: Identify which variables differ between source and target domains (model architecture, training data, scale).
- Do-calculus derivation: Check whether the target-domain causal quantity can be expressed using source-domain experiments + target-domain observations.
- Empirical verification: If transportable, verify the transported conclusion holds empirically. If not, identify the minimal additional target-domain interventions needed.
- Output: Transportability verdict (yes/no/conditional) with the specific S-variables that block or enable transfer.
Planned usage:
uv run python 41_transportability.py --tasks ioi --source-model gpt2 --target-model gpt2-mediumReading the scores
Section titled “Reading the scores”| Pattern | What it means |
|---|---|
| High cross-model invariance (> 0.8) | Circuit structure is transportable; generalizes across models |
| Low invariance (< 0.4) | Model-specific circuit; cannot transfer conclusions |
| High invariance on some edges, low on others | Partially transportable; core structure generalizes but periphery is model-specific |
| Transportability fails on scale-dependent S-nodes | Architecture differences (head count, width) break transfer |
Connection to other frameworks
Section titled “Connection to other frameworks”A12 formalizes the generalization claims tested empirically by A07 (cross-task IIA transfer). Where A07 measures whether causal structure transfers across tasks within a model, A12 asks whether it transfers across models/domains with formal guarantees. A01 (SCM) provides the structural models that transportability operates on. A04 (Woodward) ensures that the interventions used in the source domain satisfy the quality criteria needed for valid transport.