Circuit-level analysis provides causal evidence that feature-level analysis cannot

active · tested · methodological · positive · falsifiable
v1 · Methodological comparison claim

summary
While SAE-based feature analysis identifies what a model represents, circuit-level analysis reveals how computations flow, providing causal evidence about model behavior that feature-level analysis alone cannot establish.
Wang et al. (2023) demonstrated that identifying circuits (connected subgraphs of model components) enables causal interventions: ablating a circuit changes model behavior in predicted ways, while ablating individual features may not. This suggests feature decomposition is necessary but insufficient for mechanistic understanding.
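The causal-intervention logic described above can be illustrated with a minimal ablation sketch. This is not Wang et al.'s code; it is a toy stand-in (all module names are illustrative) showing the mechanics: zero out one component's contribution via a forward hook and measure how the model's output shifts, which is the kind of behavioral evidence circuit ablation provides.

```python
import torch
import torch.nn as nn

# Toy two-component model standing in for a transformer. The point is the
# ablation mechanics, not the architecture; names are hypothetical.
torch.manual_seed(0)

class ToyModel(nn.Module):
    def __init__(self, d=8):
        super().__init__()
        self.head_a = nn.Linear(d, d)   # component hypothesized to be in the circuit
        self.head_b = nn.Linear(d, d)   # component hypothesized to be outside it
        self.readout = nn.Linear(d, 2)

    def forward(self, x):
        return self.readout(self.head_a(x) + self.head_b(x))

model = ToyModel()
x = torch.randn(4, 8)

def zero_ablate(module, inputs, output):
    # Returning a value from a forward hook replaces the module's output.
    # Zero-ablation is one common choice; mean-ablation is another.
    return torch.zeros_like(output)

baseline = model(x)

handle = model.head_a.register_forward_hook(zero_ablate)
ablated = model(x)
handle.remove()

# Behavioral effect of the intervention: if head_a carries the circuit,
# ablating it should move the logits in the predicted direction.
effect = (baseline - ablated).abs().mean().item()
print(f"mean |delta logit| from ablating head_a: {effect:.4f}")
```

Removing the hook restores the original computation, so ablations can be run as clean, reversible interventions and compared against the unablated baseline.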
trust profile
dimensions
evid 76% · repl 48% · cons 62% · meth 88% · cred 79% · scop 72% · brdg 35% · cont 18%
derived scores
supp 78% · fron 52% · stab 62%
claim_support_vector v1.0 · 2026-03-09 19:58 UTC
evidence 2
↑ supporting 1
supports · artifact
Circuit Analysis in GPT-2 (Wang et al., 2023)
Wang et al. demonstrate that circuits provide causal evidence through ablation studies.
Circuit Analysis in GPT-2 (Wang et al., 2023) · 89% · Circuit ablation results
• asserting 1
asserts · artifact
Circuit Analysis in GPT-2 (Wang et al., 2023)
Direct assertion from the circuit analysis paper.
neighborhood 1
supports Circuit-level interpretability concept
attestations
Curator (Human) verifies 0.85
Methodological claim accurately distinguishes feature-level from circuit-level evidence.
domains
Mechanistic Interpretability 100%
view status
Strict Empirical included
computation trace
{
  "note": "Strong methodological claim with high method quality",
  "inputs": {
    "artifacts": 1,
    "attestations": 1,
    "disputing_edges": 0,
    "supporting_edges": 1
  }
}