Pragmatic Constraint on Distributional Semantics
Elizaveta Zhemchuzhina, N. Filippov, Ivan P. Yamshchikov
Published 2022 in arXiv.org
ABSTRACT
This paper studies the limits of language models' statistical learning in the context of Zipf's law. First, we demonstrate that a Zipf-law token distribution emerges irrespective of the chosen tokenization. Second, we show that the Zipf distribution is characterized by two distinct groups of tokens that differ both in frequency and in semantics: tokens with a one-to-one correspondence to a single semantic concept have different statistical properties than semantically ambiguous ones. Finally, we demonstrate how these properties interfere with statistical learning procedures motivated by distributional semantics.
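The abstract's first claim, that a Zipf-like rank-frequency distribution appears regardless of tokenization, can be illustrated with a minimal sketch. This is not the paper's methodology; it is a toy example on a synthetic corpus that estimates the rank-frequency slope in log-log space for two different tokenizations (whitespace words vs. character bigrams) and checks that both curves are decreasing power-law-like.

```python
from collections import Counter
import math

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) vs. log(rank).

    A roughly Zipfian distribution yields a negative slope
    (ideal Zipf's law would give a slope near -1).
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# Toy corpus (an assumption for illustration, not the paper's data).
text = ("the quick brown fox jumps over the lazy dog " * 50
        + "language models learn token statistics from data " * 30)

# Two different tokenizations of the same text.
word_tokens = text.split()
bigram_tokens = [text[i:i + 2] for i in range(len(text) - 1)]

# Both rank-frequency curves decay, i.e. both slopes are negative.
print(zipf_slope(word_tokens), zipf_slope(bigram_tokens))
```

On real corpora the word-level slope is typically close to -1; the toy corpus here only shows that the decaying rank-frequency shape survives a change of tokenization, which is the qualitative point of the claim.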
PUBLICATION RECORD
- Publication year
2022
- Venue
arXiv.org
- Publication date
2022-11-20
- Fields of study
Mathematics, Linguistics, Computer Science
- Source metadata
Semantic Scholar
REFERENCES
23 references
CITED BY
2 citing papers