Language Evolution and Computation Bibliography

Our site (www.isrl.uiuc.edu/amag/langev) retired, please use https://langev.com instead.
Journal :: Scientific Reports
2018
Scientific reports 8:243-313, 2018
The innovation of iconic gestures is essential to establishing the vocabularies of signed languages, but might iconicity also play a role in the origin of spoken words? Can people create novel vocalizations that are comprehensible to naïve listeners without prior convention? We ...MORE ⇓
The innovation of iconic gestures is essential to establishing the vocabularies of signed languages, but might iconicity also play a role in the origin of spoken words? Can people create novel vocalizations that are comprehensible to naïve listeners without prior convention? We launched a contest in which participants submitted non-linguistic vocalizations for 30 meanings spanning actions, humans, animals, inanimate objects, properties, quantifiers and demonstratives. The winner was determined by the ability of naïve listeners to infer the meanings of the vocalizations. We report a series of experiments and analyses that evaluated the vocalizations for: (1) comprehensibility to naïve listeners; (2) the degree to which they were iconic; (3) agreement between producers and listeners in iconicity; and (4) whether iconicity helps listeners learn the vocalizations as category labels. The results show contestants were able to create successful iconic vocalizations for most of the meanings, which were largely comprehensible to naïve listeners, and easier to learn as category labels. These findings demonstrate how iconic vocalizations can enable interlocutors to establish understanding in the absence of conventions. They suggest that, prior to the advent of full-blown spoken languages, people could have used iconic vocalizations to ground a spoken vocabulary with considerable semantic breadth.
2013
Scientific Reports 3(1082), 2013
Zipf's law on word frequency and Heaps' law on the growth of distinct words are observed in Indo-European language family, but it does not hold for languages like Chinese, Japanese and Korean. These languages consist of characters, and are of very limited dictionary sizes. ...MORE ⇓
Zipf's law on word frequency and Heaps' law on the growth of distinct words are observed in Indo-European language family, but it does not hold for languages like Chinese, Japanese and Korean. These languages consist of characters, and are of very limited dictionary sizes. Extensive experiments show that: (i) The character frequency distribution follows a power law with exponent close to one, at which the corresponding Zipf's exponent diverges. Indeed, the character frequency decays exponentially in the Zipf's plot. (ii) The number of distinct characters grows with the text length in three stages: It grows linearly in the beginning, then turns to a logarithmical form, and eventually saturates. A theoretical model for writing process is proposed, which embodies the rich-get-richer mechanism and the effects of limited dictionary size. Experiments, simulations and analytical solutions agree well with each other. This work refines the understanding about Zipf's and Heaps' laws in human language systems.
2012
Scientific Reports 2(313), 2012
We analyze the dynamic properties of 10,000,000 words recorded in English, Spanish and Hebrew over the period 1800-2008 in order to gain insight into the coevolution of language and culture. We report language independent patterns useful as benchmarks for theoretical models of ...MORE ⇓
We analyze the dynamic properties of 10,000,000 words recorded in English, Spanish and Hebrew over the period 1800-2008 in order to gain insight into the coevolution of language and culture. We report language independent patterns useful as benchmarks for theoretical models of language evolution. A significantly decreasing (increasing) trend in the birth (death) rate of words indicates a recent shift in the selection laws governing word use. For new words, we observe a peak in the growth-rate fluctuations around 40 years after introduction, consistent with the typical entry time into standard dictionaries and the human generational timescale. Pronounced changes in the dynamics of language during periods of war shows that word correlations, occurring across time and between words, are largely influenced by coevolutionary social, technological, and political factors. We quantify cultural memory by analyzing the long-term correlations in the use of individual words using detrended fluctuation analysis.
Scientific Reports 2(943), 2012
We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word ...MORE ⇓
We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use which has a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This cooling pattern forms the basis of a third statistical regularity, which unlike the Zipf and the Heaps law, is dynamical in nature.