Natarajan dimension

In the theory of probably approximately correct (PAC) learning, the Natarajan dimension characterizes the complexity of learning a set of functions, generalizing the Vapnik–Chervonenkis dimension from boolean-valued to multi-class functions. Originally introduced as the Generalized Dimension by Natarajan,[1] it was subsequently renamed the Natarajan Dimension by Haussler and Long.[2]

Definition

Let H be a set of functions from a set X to a set Y. H N-shatters a set C ⊆ X if there exist two functions f₀, f₁ : C → Y such that

  • For every x ∈ C, f₀(x) ≠ f₁(x).
  • For every subset B ⊆ C, there exists a function h ∈ H such that

h(x) = f₀(x) for all x ∈ B, and h(x) = f₁(x) for all x ∈ C ∖ B.

The Natarajan dimension of H is the maximal cardinality of a set N-shattered by H.

It is easy to see that if |Y| = 2, the Natarajan dimension coincides with the Vapnik–Chervonenkis dimension.
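For a finite hypothesis class over a finite domain, the definition above can be checked directly by brute force: search for two witness functions f₀, f₁ that disagree on every point of a candidate set C, and verify that every "mix" of their values is realized by some hypothesis. A minimal sketch (the function names and the encoding of hypotheses as label tuples are illustrative choices, not from the source):

```python
from itertools import product, combinations

def n_shatters(hypotheses, C):
    """Check whether `hypotheses` N-shatters the index set C.

    `hypotheses` is a collection of label tuples over a finite domain
    {0, ..., n-1}; C is a list of domain indices.  Note that any valid
    witness pair (f0, f1) must itself appear among the hypotheses
    restricted to C (take B = C and B = empty set), so it suffices to
    search witnesses among those restrictions.
    """
    restricted = {tuple(h[x] for x in C) for h in hypotheses}
    for f0 in restricted:
        for f1 in restricted:
            if any(a == b for a, b in zip(f0, f1)):
                continue  # f0 and f1 must disagree on every point of C
            # every choice of f0-or-f1 per point must be realized in H
            if all(tuple(f0[i] if pick else f1[i]
                         for i, pick in enumerate(bits)) in restricted
                   for bits in product([0, 1], repeat=len(C))):
                return True
    return False

def natarajan_dim(hypotheses, n_points):
    """Largest |C| that is N-shattered (brute force over all subsets)."""
    best = 0
    for size in range(1, n_points + 1):
        if any(n_shatters(hypotheses, list(C))
               for C in combinations(range(n_points), size)):
            best = size
    return best
```

For example, the class of all functions from a two-point domain into {0, 1, 2} has Natarajan dimension 2, whereas a class containing only the two constant labelings (0, 0) and (1, 1) has Natarajan dimension 1, since no witness pair can realize the mixed patterns on both points.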

Shalev-Shwartz and Ben-David[3] present comprehensive material on multi-class learning and the Natarajan dimension, including uniform convergence and learnability. Recently, Cohen et al.[4][5] showed that the Natarajan dimension is the dominant term governing agnostic multi-class PAC learnability.

References

  1. ^ Natarajan, Balas Kausik (1989). "On learning sets and functions". Machine Learning. 4: 67–97. doi:10.1007/BF00114804.
  2. ^ Haussler, David; Long, Philip (1995). "A Generalization of Sauer's Lemma". Journal of Combinatorial Theory. 71 (2): 219–240. doi:10.1016/0097-3165(95)90001-2.
  3. ^ Shalev-Shwartz, Shai; Ben-David, Shai (2013). Understanding machine learning. From theory to algorithms. Cambridge University Press.
  4. ^ "STOC 2026 - 58th ACM Symposium on Theory of Computing". acm-stoc.org. Retrieved 2026-02-13.
  5. ^ Cohen, Alon; Erez, Liad; Hanneke, Steve; Koren, Tomer; Mansour, Yishay; Moran, Shay; Zhang, Qian (2025-11-16). "Sample Complexity of Agnostic Multiclass Classification: Natarajan Dimension Strikes Back". arXiv:2511.12659 [cs.LG].