Language Evolution and Computation Bibliography

Our site (www.isrl.uiuc.edu/amag/langev) has been retired; please use https://langev.com instead.
Steven T. Piantadosi
2017
Infinitely productive language can arise from chance under communicative pressure
Journal of Language Evolution 2(2):141-147, 2017
Human communication is unparalleled in the animal kingdom. The key distinctive feature of our language is productivity: we are able to express an infinite number of ideas using a limited set of words. Traditionally, it has been argued or assumed that productivity emerged as a consequence of very specific, innate grammatical systems. Here we formally develop an alternative hypothesis: productivity may instead have arisen solely as a consequence of increasing the number of signals (e.g., sentences) in a communication system, under the additional assumption that the processing mechanisms are algorithmically unconstrained. Using tools from algorithmic information theory, we examine the consequences of two intuitive constraints on the probability that a language will be infinitely productive. We prove that under maximum entropy assumptions, increasing the complexity of a language will not strongly pressure it to be finite or infinite. In contrast, increasing the number of signals in a language increases the probability of languages that in fact have infinite cardinality. Thus, across evolutionary time, the productivity of human language could have arisen solely from algorithmic randomness combined with a communicative pressure for a large number of signals.
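The paper's formal argument uses algorithmic information theory over algorithmically unconstrained (Turing-complete) processors. As a rough, hedged illustration only, and not the authors' construction, the Python sketch below swaps in random deterministic finite automata as a tractable stand-in and estimates how the chance that a randomly sampled language is infinite grows with the number of distinct signals it supports; all names and parameters here are hypothetical.

```python
# Illustrative sketch only: random DFAs stand in for the paper's
# algorithmically unconstrained processors. We estimate P(language is
# infinite) as a function of how many distinct signals (accepted strings
# up to a length cap) the sampled language supports.
import random
from collections import defaultdict
from itertools import product

ALPHABET = (0, 1)

def random_dfa(n_states, rng):
    """Sample a uniform random complete DFA with start state 0."""
    delta = {(q, a): rng.randrange(n_states)
             for q in range(n_states) for a in ALPHABET}
    accepting = {q for q in range(n_states) if rng.random() < 0.5}
    return delta, accepting

def accepts(delta, accepting, word):
    q = 0
    for a in word:
        q = delta[(q, a)]
    return q in accepting

def count_signals(delta, accepting, max_len):
    """Number of accepted strings of length <= max_len (signal-count proxy)."""
    return sum(accepts(delta, accepting, w)
               for length in range(max_len + 1)
               for w in product(ALPHABET, repeat=length))

def is_infinite(delta, accepting, n_states):
    """An n-state DFA's language is infinite iff it accepts some string
    of length in [n, 2n - 1] (standard pumping-lemma bound)."""
    return any(accepts(delta, accepting, w)
               for length in range(n_states, 2 * n_states)
               for w in product(ALPHABET, repeat=length))

rng = random.Random(0)
N_STATES, MAX_LEN, TRIALS = 4, 6, 5000
stats = defaultdict(lambda: [0, 0])  # signal-count bin -> [n_infinite, n_total]
for _ in range(TRIALS):
    delta, acc = random_dfa(N_STATES, rng)
    bin_ = count_signals(delta, acc, MAX_LEN) // 16  # 0..127 signals, 8 bins
    stats[bin_][0] += is_infinite(delta, acc, N_STATES)
    stats[bin_][1] += 1
for b in sorted(stats):
    n_inf, n_tot = stats[b]
    print(f"{16 * b:3d}-{16 * b + 15:3d} signals: "
          f"P(infinite) ~ {n_inf / n_tot:.2f}  (n={n_tot})")
```

Under these hypothetical settings, automata that accept few short strings tend to have finite languages, while those supporting many signals almost always contain a pumpable accepting path, echoing the abstract's claim that pressure for many signals favors infinite cardinality.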
2011
Word lengths are optimized for efficient communication
PNAS 108(9):3526-3529, 2011
We demonstrate a substantial improvement on one of the most celebrated empirical laws in the study of language, Zipf's 75-year-old theory that word length is primarily determined by frequency of use. In accord with rational theories of communication, we show across 10 languages that average information content is a much better predictor of word length than frequency. This indicates that human lexicons are efficiently structured for communication by taking into account interword statistical dependencies. Lexical systems result from an optimization of communicative pressures, coding meanings efficiently given the complex statistics of natural language use.
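The key quantity here is a word's average information content: its mean surprisal, -log2 P(word | context), across its occurrences, as opposed to the frequency-based code length -log2 P(word). As a toy, hedged sketch of how the two measures are computed (the miniature corpus, bigram context, and add-one smoothing below are stand-ins, not the paper's large-scale n-gram estimates), in Python:

```python
# Toy sketch: compute, for each word, (a) the frequency-based code length
# -log2 P(w) and (b) the mean in-context surprisal -log2 P(w | previous word)
# under a bigram model with add-one smoothing. Corpus and model are stand-ins.
import math
from collections import Counter

corpus = ("the cat sat on the mat the dog sat on the log "
          "a cat and a dog met on the mat").split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
total = len(corpus)
vocab = len(unigrams)

def frequency_surprisal(w):
    """Zipf-style predictor: -log2 P(w) from raw frequency."""
    return -math.log2(unigrams[w] / total)

def mean_context_surprisal(w):
    """Contextual predictor (toy version): average -log2 P(w | prev)
    over all occurrences of w that follow another word."""
    surprisals = [
        -math.log2((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab))
        for prev, cur in zip(corpus, corpus[1:]) if cur == w
    ]
    return sum(surprisals) / len(surprisals) if surprisals else float("nan")

for w in sorted(unigrams):
    print(f"{w:4s} len={len(w)}  -log2 P(w)={frequency_surprisal(w):5.2f}  "
          f"mean -log2 P(w|ctx)={mean_context_surprisal(w):5.2f}")
```

On real corpora the paper reports that the contextual measure predicts word length better than frequency across 10 languages; this toy only demonstrates how the two predictors are computed, not that result.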