While SAE-based feature analysis identifies what a model represents, circuit-level analysis reveals how computations flow, providing causal evidence about model behavior that feature-level analysis alone cannot establish.
Wang et al. (2023) demonstrated that identifying circuits (connected subgraphs of model components, such as attention heads and MLP layers) enables causal interventions: ablating a circuit changes model behavior in predictable ways, while ablating individual features may not. This suggests that feature decomposition is necessary but not sufficient for mechanistic understanding.
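The ablation logic described above can be sketched in a toy setting. The following is a minimal illustration, not Wang et al.'s actual methodology: it assumes a hypothetical model whose output is a sum of per-component contributions (loosely mimicking a residual-stream decomposition), with components `head_A` and `head_B` standing in for a candidate circuit and `head_C` for a component outside it. Zero-ablation is used for simplicity; mean-ablation is a common alternative.

```python
import numpy as np

# Toy "model": the output is a sum of contributions from named components.
# "head_A" and "head_B" play the role of a hypothetical circuit; "head_C"
# lies outside it. All names and shapes here are illustrative assumptions.
rng = np.random.default_rng(0)
W = {name: rng.normal(size=(4, 4)) for name in ["head_A", "head_B", "head_C"]}

def forward(x, ablate=()):
    # Sum component outputs, zeroing out any component named in `ablate`
    # (zero-ablation). Each component contributes W[name] @ x.
    return sum(np.zeros(4) if name in ablate else W[name] @ x
               for name in W)

x = rng.normal(size=4)
baseline = forward(x)

# Causal test: compare ablating the full circuit against ablating
# a single component, measuring how far the output moves.
circuit_effect = np.linalg.norm(baseline - forward(x, ablate=("head_A", "head_B")))
single_effect = np.linalg.norm(baseline - forward(x, ablate=("head_A",)))
print(f"circuit ablation effect: {circuit_effect:.3f}")
print(f"single-component effect: {single_effect:.3f}")
```

In this sketch, the causal claim takes the form of a prediction: if the two components jointly implement the behavior, ablating them together should shift the output in the predicted direction, whereas ablating one alone tests whether that component is individually necessary.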