Individual neurons rarely encode single concepts

active · contested · empirical · positive · falsifiable
v1 · Cross-referenced from multiple sources

summary
In standard transformers, most neurons respond to multiple unrelated concepts (polysemanticity), and single-concept neurons are rare exceptions rather than the norm.
Multiple studies find that individual neurons, when probed, show mixed selectivity. However, recent work on larger models suggests that some neurons become more monosemantic at scale, so it remains disputed whether polysemanticity is a fundamental property or a scale-dependent phenomenon.
trust profile
dimensions
evid 72%
repl 55%
cons 48%
meth 71%
cred 82%
scop 45%
brdg 22%
cont 58%
derived scores
supp 60%
fron 44%
stab 50%
claim_support_vector v1.0 · 2026-03-09 19:58 UTC
evidence 3
↑ supporting 2
supports · artifact
Towards Monosemanticity (Anthropic, 2023)
The paper shows that many neurons are polysemantic in conventionally trained models.
Towards Monosemanticity (Anthropic, 2023) · 84% — Polysemanticity observations
supports · artifact
Toy Models of Superposition (Elhage et al., 2022)
The paper shows that polysemanticity is the default outcome when a model represents more features than it has dimensions.
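The link between "more features than dimensions" and polysemantic neurons can be illustrated with a small hand-built example. The sketch below is not a trained model and is not taken from either paper: it assumes five feature directions spread evenly around the unit circle in a two-neuron space, and the helper `neuron_feature_counts` and its `threshold` value are hypothetical names chosen for illustration. It counts how many features drive each neuron noticeably.

```python
import numpy as np

def neuron_feature_counts(W, threshold=0.25):
    """For each neuron (row of W), count the features (columns) whose
    weight magnitude exceeds `threshold`. The threshold is an arbitrary
    illustrative cutoff, not a standard value."""
    return (np.abs(W) > threshold).sum(axis=1)

# Assumption: 5 features packed into 2 neurons, directions evenly spaced
# on the unit circle (a hand-built superposition-style layout).
n_features = 5
angles = 2 * np.pi * np.arange(n_features) / n_features
W = np.stack([np.cos(angles), np.sin(angles)])  # shape (2, 5)

counts = neuron_feature_counts(W)

# Contrast case: as many neurons as features, one neuron per feature.
W_identity = np.eye(2)
counts_identity = neuron_feature_counts(W_identity)

print("features per neuron, 5 features in 2 dims:", counts.tolist())
print("features per neuron, 2 features in 2 dims:", counts_identity.tolist())
```

In this layout every feature has a unit-length direction, yet each neuron carries substantial weight for several features, whereas the identity layout gives each neuron exactly one. With more features than neurons, at least one neuron must respond to two or more features, because every nonzero feature direction has to load on some neuron.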
↓ disputing 1
disputes · artifact
Scaling Monosemanticity (Anthropic, 2024)
The scaling paper shows that some neurons become more monosemantic at larger scales, partially disputing the claim that monosemanticity is always rare.
Scaling Monosemanticity (Anthropic, 2024) · 78% — Scale-dependent monosemanticity observed
scope
holds_in Small toy models (<10M parameters) model_scale
neighborhood 1
holds_in Small toy models (<10M parameters) context
attestations
Curator (Human) verifies 0.75
Claim wording acceptable but note the contested status due to scale-dependence.
domains
Mechanistic Interpretability 100%
view status
Strict Empirical included
computation trace
{
  "note": "Contested: scale-dependent monosemanticity partially disputes the claim",
  "inputs": {
    "artifacts": 3,
    "attestations": 1,
    "disputing_edges": 1,
    "supporting_edges": 2
  }
}
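The `inputs` counts in the trace match the evidence and attestations listed on this page. The sketch below shows one way such a tally could be produced. The record layout (`artifact`, `relation`, `by`, `value`) and the function `trace_inputs` are hypothetical, since the page does not document its underlying schema, and the sketch does not attempt to reproduce the derived scores, whose formula is not given.

```python
from collections import Counter

# Hypothetical edge records mirroring the evidence list on this page.
edges = [
    {"artifact": "Towards Monosemanticity (Anthropic, 2023)", "relation": "supports"},
    {"artifact": "Toy Models of Superposition (Elhage et al., 2022)", "relation": "supports"},
    {"artifact": "Scaling Monosemanticity (Anthropic, 2024)", "relation": "disputes"},
]

# Hypothetical attestation record; the meaning of 0.75 is not documented.
attestations = [{"by": "Curator (Human)", "action": "verifies", "value": 0.75}]

def trace_inputs(edges, attestations):
    """Tally the counts reported under "inputs" in the raw trace."""
    by_relation = Counter(edge["relation"] for edge in edges)
    return {
        "artifacts": len({edge["artifact"] for edge in edges}),
        "attestations": len(attestations),
        "disputing_edges": by_relation["disputes"],
        "supporting_edges": by_relation["supports"],
    }

inputs = trace_inputs(edges, attestations)
print(inputs)
```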