Language Evolution and Computation Bibliography

Our site (www.isrl.uiuc.edu/amag/langev) has been retired; please use https://langev.com instead.
Martin A. Nowak
2012
Journal of Theoretical Biology 301:161-173, 2012
We study evolutionary game theory in a setting where individuals learn from each other. We extend the traditional approach by assuming that a population contains individuals with different learning abilities. In particular, we explore the situation where individuals have ...
2011
Science 331(6014):176-182, 2011
We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of 'culturomics,' focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
2008
PNAS 105(3):833-838, 2008
When people speak, they often insinuate their intent indirectly rather than stating it as a bald proposition. Examples include sexual come-ons, veiled threats, polite requests, and concealed bribes. We propose a three-part theory of indirect speech, based on the idea that human communication involves a mixture of cooperation and conflict. First, indirect requests allow for plausible deniability, in which a cooperative listener can accept the request, but an uncooperative one cannot react adversarially to it. This intuition is supported by a game-theoretic model that predicts the costs and benefits to a speaker of direct and indirect requests. Second, language has two functions: to convey information and to negotiate the type of relationship holding between speaker and hearer (in particular, dominance, communality, or reciprocity). The emotional costs of a mismatch in the assumed relationship type can create a need for plausible deniability and, thereby, select for indirectness even when there are no tangible costs. Third, people perceive language as a digital medium, which allows a sentence to generate common knowledge, to propagate a message with high fidelity, and to serve as a reference point in coordination games. This feature makes an indirect request qualitatively different from a direct one even when the speaker and listener can infer each other's intentions with high confidence.
2007
Nature 449(7163):713-716, 2007
Human language is based on grammatical rules. Cultural evolution allows these rules to change over time. Rules compete with each other: as new rules rise to prominence, old ones die away. To quantify the dynamics of language evolution, we studied the regularization of English verbs over the past 1,200 years. Although an elaborate system of productive conjugations existed in English's proto-Germanic ancestor, Modern English uses the dental suffix, '-ed', to signify past tense. Here we describe the emergence of this linguistic rule amidst the evolutionary decay of its exceptions, known to us as irregular verbs. We have generated a data set of verbs whose conjugations have been evolving for more than a millennium, tracking inflectional changes to 177 Old-English irregular verbs. Of these irregular verbs, 145 remained irregular in Middle English and 98 are still irregular today. We study how the rate of regularization depends on the frequency of word usage. The half-life of an irregular verb scales as the square root of its usage frequency: a verb that is 100 times less frequent regularizes 10 times as fast. Our study provides a quantitative analysis of the regularization process by which ancestral forms gradually yield to an emerging linguistic rule.
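The square-root scaling quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes only the stated proportionality (half-life proportional to the square root of usage frequency); the function name is hypothetical and no fitted constants from the paper are used.

```python
import math


def relative_half_life(frequency_ratio: float) -> float:
    """Half-life of a verb relative to a reference verb.

    Assumes half-life is proportional to sqrt(usage frequency), so only
    the ratio of the two frequencies matters.
    """
    if frequency_ratio <= 0:
        raise ValueError("frequency_ratio must be positive")
    return math.sqrt(frequency_ratio)


# A verb used 100 times less often than the reference verb.
ratio = relative_half_life(1 / 100)
# A shorter half-life means faster regularization by the inverse factor.
speedup = 1 / ratio
```

Under this assumption the rarer verb has one tenth of the reference half-life, which matches the abstract's statement that it regularizes 10 times as fast.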
2006
Evolutionary Dynamics: Exploring the Equations of Life
Harvard University Press, 2006
TABLE OF CONTENTS
Preface
1. Introduction
2. What Evolution Is
3. Fitness Landscapes and Sequence Spaces
4. Evolutionary Games
5. Prisoners of the Dilemma
6. Finite Populations
7. Games in Finite Populations
8. Evolutionary Graph Theory
9. Spatial Games
10. HIV Infection
11. The Evolution of Virulence
12. The Evolutionary Dynamics of Cancer
13. Language Evolution
14. Conclusion
2005
The evolution of altruism: from game theory to human language
Spiritual Information: 100 perspectives, 2005
2004
PNAS 101(52):18053-18057, 2004
Traditional language learning theory explores an idealized interaction between a teacher and a learner. The teacher provides sentences from a language, while the learner has to infer the underlying grammar. Here, we study a new approach by considering a population of individuals that learn from each other. There is no designated teacher. We are inspired by the observation that children grow up to speak the language of their peers, not of their parents. Our goal is to characterize learning strategies that generate "linguistic coherence," which means that most individuals use the same language. We model the resulting learning dynamics as a random walk of a population on a graph. Each vertex represents a candidate language. We find that a simple strategy using a certain aspiration level with the principle of win-stay, lose-shift does extremely well: stay with your current language, if at least three others use that language; otherwise, shift to an adjacent language on the graph. This strategy guarantees linguistic coherence on all nearly regular graphs, in the relevant limit where the number of candidate languages is much greater than the population size. Moreover, for many graphs, it is sufficient to have an aspiration level demanding only two other individuals to use the same language.
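The win-stay, lose-shift rule described in this abstract can be sketched as a small simulation. Everything beyond the rule itself is an assumption made for illustration: the cycle graph, the choice of updating one randomly selected individual per step, the uniform choice among adjacent languages, and all function names.

```python
import random
from collections import Counter


def cycle_graph(n):
    """Hypothetical language graph: vertex v is adjacent to v-1 and v+1 (mod n)."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}


def wsls_step(languages, neighbors, aspiration, rng):
    """Update one randomly chosen individual in place.

    languages:  list mapping individual -> current language (a graph vertex)
    neighbors:  dict mapping vertex -> list of adjacent vertices
    aspiration: minimum number of OTHER individuals that must share the language
    """
    i = rng.randrange(len(languages))
    current = languages[i]
    others = sum(1 for j, lang in enumerate(languages) if j != i and lang == current)
    if others < aspiration:
        # Lose-shift: move to a randomly chosen adjacent language.
        languages[i] = rng.choice(neighbors[current])
    # Win-stay: otherwise the individual keeps its current language.
    return languages


def largest_share(languages):
    """Fraction of the population using the most common language."""
    return Counter(languages).most_common(1)[0][1] / len(languages)


rng = random.Random(0)
graph = cycle_graph(50)
population = [rng.randrange(50) for _ in range(6)]
for _ in range(1000):
    wsls_step(population, graph, aspiration=3, rng=rng)
share = largest_share(population)
```

One property of the rule is visible directly in the code: once every individual uses the same language and the population has more than `aspiration` members, no one ever shifts again, so full coherence is an absorbing state.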
Proceedings of the Royal Society B: Biological Sciences 271(1540):701-704, 2004
Human language is a complex and expressive communication system. Children spontaneously develop a native language from speech they hear in their community. Languages change dramatically and unpredictably by accumulating small changes over time and by interacting with other languages. This paper describes a mathematical model illustrating language change. Children learn their parents' language imperfectly, and in the case presented here, the result is a simulated population that maintains an ever-changing mixture of grammars. This research is part of a growing attempt to use mathematical models to better understand the social and biological history of language.
2003
Journal of Theoretical Biology 221(3):445-457, 2003
Any mechanism of language acquisition can only learn a restricted set of grammars. The human brain contains a mechanism for language acquisition which can learn a restricted set of grammars. The theory of this restricted set is universal grammar (UG). UG has to be sufficiently specific to induce linguistic coherence in a population. This phenomenon is known as "coherence threshold". Previously, we have calculated the coherence threshold for deterministic dynamics and infinitely large populations. Here, we extend the framework to stochastic processes and finite populations. If there is selection for communicative function (selective language dynamics), then the analytic results for infinite populations are excellent approximations for finite populations; as expected, finite populations need a slightly higher accuracy of language acquisition to maintain coherence. If there is no selection for communicative function (neutral language dynamics), then linguistic coherence is only possible for finite populations.
Language, Learning, and Evolution
Language Evolution: The States of the Art, 2003
Bulletin of Mathematical Biology 65(1):67-93, 2003
Universal grammar (UG) is a list of innate constraints that specify the set of grammars that can be learned by the child during primary language acquisition. UG of the human brain has been shaped by evolution. Evolution requires variation. Hence, we have to postulate and study variation of UG. We investigate evolutionary dynamics and language acquisition in the context of multiple UGs. We provide examples for competitive exclusion and stable coexistence of different UGs. More specific UGs admit fewer candidate grammars, and less specific UGs admit more candidate grammars. We will analyze conditions for more specific UGs to outcompete less specific UGs and vice versa. An interesting finding is that less specific UGs can resist invasion by more specific UGs if learning is more accurate. In other words, accurate learning stabilizes UGs that admit large numbers of candidate grammars.
2002
Population dynamics of grammar acquisition
Simulating the Evolution of Language 7.0:149-164, 2002
The most fascinating aspect of human language is grammar. Grammar is a computational system that mediates a mapping between linguistic form and meaning. Grammar is the machinery that gives rise to the unlimited expressibility of human language. Children ...
Nature 417:611-617, 2002
Language is our legacy. It is the main evolutionary contribution of humans, and perhaps the most interesting trait that has emerged in the past 500 million years. Understanding how darwinian evolution gives rise to human language requires the integration of formal language theory, learning theory and evolutionary dynamics. Formal language theory provides a mathematical description of language and grammar. Learning theory formalizes the task of language acquisition--it can be shown that no procedure can learn an unrestricted set of languages. Universal grammar specifies the restricted set of languages learnable by the human brain. Evolutionary dynamics can be formulated to describe the cultural evolution of language and the biological evolution of universal grammar.
From quasispecies to universal grammar
Z. Phys. Chem. 16:5-20, 2002
The perspective of this paper is to compare mathematical models for the evolutionary dynamics of genomes and languages. The quasispecies equation describes the evolution of genetic sequences under the influence of mutation and selection. A central result is an error threshold which specifies the minimum replication accuracy required for maintaining genetic information of a certain length. The language equation describes the evolution of communication, including the cultural evolution of grammar and the biological evolution of universal grammar. A central result is a coherence threshold which specifies certain conditions that universal grammar has to fulfill in order to induce coherent communication in a population.
2001
Proceedings of the Royal Society B: Biological Sciences 268(1472):1189-1196, 2001
The language acquisition period in humans lasts about 13 years. After puberty it becomes increasingly difficult to learn a language. We explain this phenomenon by using an evolutionary framework. We present a dynamical system describing competition between language acquisition devices, which differ in the length of the learning period. There are two selective forces that play a role in determining the critical learning period: (i) having a longer learning period increases the accuracy of language acquisition; (ii) learning is associated with certain costs that affect fitness. As a result, there exists a limited learning period which is evolutionarily stable. This result is obtained analytically by means of a Nash equilibrium analysis of language acquisition devices. Interestingly, the evolutionarily stable learning period does not maximize the average fitness of the population.
Journal of Theoretical Biology 209(1):43-59, 2001
Grammar is the computational system of language. It is a set of rules that specifies how to construct sentences out of words. Grammar is the basis of the unlimited expressibility of human language. Children acquire the grammar of their native language without formal education simply by hearing a number of sample sentences. Children could not solve this learning task if they did not have some pre-formed expectations. In other words, children have to evaluate the sample sentences and choose one grammar out of a limited set of candidate grammars. The restricted search space and the mechanism that allows children to evaluate the sample sentences is called universal grammar. Universal grammar cannot be learned; it must be in place when the learning process starts. In this paper, we design a mathematical theory that places the problem of language acquisition into an evolutionary context. We formulate equations for the population dynamics of communication and grammar learning. We ask how accurately children have to learn the grammar of their parents' language for a population of individuals to evolve and maintain a coherent grammatical system. It turns out that there is a maximum error tolerance for which a predominant grammar is stable. We calculate the maximum size of the search space that is compatible with coherent communication in a population. Thus, we specify the conditions for the evolution of universal grammar.
Bulletin of Mathematical Biology 63(3):451-485, 2001
The lexical matrix is an integral part of the human language system. It provides the link between word form and word meaning. A simple lexical matrix is also at the center of any animal communication system, where it defines the associations between form and meaning of animal signals. We study the evolution and population dynamics of the lexical matrix. We assume that children learn the lexical matrix of their parents. This learning process is subject to mistakes: (i) children may not acquire all lexical items of their parents (incomplete learning); and (ii) children might acquire associations between word forms and word meanings that differ from their parents' lexical items (incorrect learning). We derive an analytic framework that deals with incomplete learning. We calculate the maximum error rate that is compatible with a population maintaining a coherent lexical matrix of a given size. We calculate the equilibrium distribution of the number of lexical items known to individuals. Our analytic investigations are supplemented by numerical simulations that describe both incomplete and incorrect learning, and other extensions.
Science 291:114-118, 2001
Universal grammar specifies the mechanism of language acquisition. It determines the range of grammatical hypotheses that children entertain during language learning and the procedure they use for evaluating input sentences. How universal grammar arose is a major challenge for evolutionary biology. We present a mathematical framework for the evolutionary dynamics of grammar learning. The central result is a coherence threshold, which specifies the condition for a universal grammar to induce coherent communication within a population. We study selection of grammars within the same universal grammar and competition between different universal grammars. We calculate the condition under which natural selection favors the emergence of rule-based, generative grammars that underlie complex language.
Trends in Cognitive Sciences 5(7):288-295, 2001
Language is a biological trait that radically changed the performance of one species and the appearance of the planet. Understanding how human language came about is one of the most interesting tasks for evolutionary biology. Here we discuss how natural selection can guide the emergence of some basic features of human language, including arbitrary signs, words, syntactic communication and grammar. We show how natural selection can lead to the duality of patterning of human language: sequences of phonemes form words; sequences of words form sentences. Finally, we present a framework for the population dynamics of grammar acquisition, which allows us to study the cultural evolution of grammar and the biological evolution of universal grammar.
Entropy 3(4):227-246, 2001
Language is the most important evolutionary invention of the last few million years. How human language evolved from animal communication is a challenging question for evolutionary biology. In this paper we use mathematical models to analyze the major transitions in language evolution. We begin by discussing the evolution of coordinated associations between signals and objects in a population. We then analyze word-formation and its relationship to Shannon's noisy coding theorem. Finally, we model the population dynamics of words and the adaptive emergence of syntax.
2000
Philosophical Transactions of the Royal Society B: Biological Sciences 355(1403):1615-1622, 2000
Language is the most important evolutionary invention of the last few million years. It was an adaptation that helped our species to exchange information, make plans, express new ideas and totally change the appearance of the planet. How human language evolved from animal communication is one of the most challenging questions for evolutionary biology. The aim of this paper is to outline the major principles that guided language evolution in terms of mathematical models of evolutionary dynamics and game theory. I will discuss how natural selection can lead to the emergence of arbitrary signs, the formation of words and syntactic communication.
Homo grammaticus
Natural History 109:36-44, 2000
Journal of Theoretical Biology 204(2):179-189, 2000
Language is about words and rules. While there is some discussion to what extent rules are learned or innate, it is clear that words have to be learned. Here I construct a mathematical framework for the population dynamics of language evolution with particular emphasis on how words are propagated over generations. I define the basic reproductive ratio of a word, R, and show that R>1 is required for words to be maintained in the lexicon of a language. Assuming that the frequency distribution of words follows Zipf's law, an upper limit is obtained for the number of words in a language that relies exclusively on oral transmission.
Nature 404:495-498, 2000
Animal communication is typically non-syntactic, which means that signals refer to whole situations. Human language is syntactic, and signals consist of discrete components that have their own meaning. Syntax is a prerequisite for taking advantage of combinatorics, that is, 'making infinite use of finite means'. The vast expressive power of human language would be impossible without syntax, and the transition from non-syntactic to syntactic communication was an essential step in the evolution of human language. We aim to understand the evolutionary dynamics of this transition and to analyse how natural selection can guide it. Here we present a model for the population dynamics of language evolution, define the basic reproductive ratio of words and calculate the maximum size of a lexicon. Syntax allows larger repertoires and the possibility to formulate messages that have not been learned beforehand. Nevertheless, according to our model natural selection can only favour the emergence of syntax if the number of required signals exceeds a threshold value. This result might explain why only humans evolved syntactic communication and hence complex language.
Journal of Theoretical Biology 205(1):147-159, 2000
This paper places models of language evolution within the framework of information theory. We study how signals become associated with meaning. If there is a probability of mistaking signals for each other, then evolution leads to an error limit: increasing the number of signals does not increase the fitness of a language beyond a certain limit. This error limit can be overcome by word formation: a linear increase of the word length leads to an exponential increase of the maximum fitness. We develop a general model of word formation and demonstrate the connection between the error limit and Shannon's noisy coding theorem.
Journal of Mathematical Biology 41(2):172-188, 2000
We study an evolutionary language game that describes how signals become associated with meaning. In our context, a language, L, is described by two matrices: the P matrix contains the probabilities that for a speaker certain objects are associated with certain signals, while the Q matrix contains the probabilities that for a listener certain signals are associated with certain objects. We define the payoff in our evolutionary language game as the total amount of information exchanged between two individuals. We give a formal classification of all languages, L(P, Q), describing the conditions for Nash equilibria and evolutionarily stable strategies (ESS). We describe an algorithm for generating all languages that are Nash equilibria. Finally, we show that starting from any random language, there exists an evolutionary trajectory using selection and neutral drift that ends up with a strategy that is a strict Nash equilibrium (or very close to a strict Nash equilibrium).
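The two-matrix description of a language in this abstract lends itself to a short illustration. The abstract does not give the payoff formula, so the sketch below assumes the symmetric form commonly written for such language games, F(L, L') = 1/2 * sum over i, j of (p_ij * q'_ji + p'_ij * q_ji); the function name and the example matrices are hypothetical.

```python
def payoff(p_a, q_a, p_b, q_b):
    """Assumed symmetric payoff between languages A = (p_a, q_a) and B = (p_b, q_b).

    p_x[i][j]: probability that a speaker of X uses signal j for object i
    q_x[j][i]: probability that a listener of X maps signal j to object i
    """
    n_objects = len(p_a)
    n_signals = len(p_a[0])
    total = 0.0
    for i in range(n_objects):
        for j in range(n_signals):
            # A speaks and B listens, plus B speaks and A listens.
            total += p_a[i][j] * q_b[j][i] + p_b[i][j] * q_a[j][i]
    return 0.5 * total


# Three objects, three signals, one dedicated signal per object.
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
# A second language that uses signal (i + 1) mod 3 for object i.
shifted_p = [[1.0 if j == (i + 1) % 3 else 0.0 for j in range(3)] for i in range(3)]
# Its listener matrix decodes signal j back to object (j - 1) mod 3.
shifted_q = [[1.0 if j == (i + 1) % 3 else 0.0 for i in range(3)] for j in range(3)]

same_language = payoff(identity, identity, identity, identity)
different_languages = payoff(identity, identity, shifted_p, shifted_q)
```

With these example matrices, two users of the same one-to-one language exchange all three objects successfully, while users of the two mutually permuted languages never understand each other under the assumed formula.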
1999
Proceedings of the Royal Society B: Biological Sciences 266(1433):2131-2136, 1999
On the evolutionary trajectory that led to human language there must have been a transition from a fairly limited to an essentially unlimited communication system. The structure of modern human languages reveals at least two steps that are required for such a transition: in all languages (i) a small number of phonemes are used to generate a large number of words; and (ii) a large number of words are used to produce an unlimited number of sentences. The first (and simpler) step is the topic of the current paper. We study the evolution of communication in the presence of errors and show that this limits the number of objects (or concepts) that can be described by a simple communication system. The evolutionary optimum is achieved by using only a small number of signals to describe a few valuable concepts. Adding more signals does not increase the fitness of a language. This represents an error limit for the evolution of communication. We show that this error limit can be overcome by combining signals (phonemes) into words. The transition from an analogue to a digital system was a necessary step toward the evolution of human language.
PNAS 96(14):8028-8033, 1999
The emergence of language was a defining moment in the evolution of modern humans. It was an innovation that changed radically the character of human society. Here, we provide an approach to language evolution based on evolutionary game theory. We explore the ways in which protolanguages can evolve in a nonlinguistic society and how specific signals can become associated with specific objects. We assume that early in the evolution of language, errors in signaling and perception would be common. We model the probability of misunderstanding a signal and show that this limits the number of objects that can be described by a protolanguage. This 'error limit' is not overcome by employing more sounds but by combining a small set of more easily distinguishable sounds into words. The process of 'word formation' enables a language to encode an essentially unlimited number of objects. Next, we analyze how words can be combined into sentences and specify the conditions for the evolution of very simple grammatical rules. We argue that grammar originated as a simplified rule system that evolved by natural selection to reduce mistakes in communication. Our theory provides a systematic approach for thinking about the origin and evolution of human language.
Journal of Theoretical Biology 200(2):147-162, 1999
We explore how evolutionary game dynamics have to be modified to accommodate a mathematical framework for the evolution of language. In particular, we are interested in the evolution of vocabulary, that is associations between signals and objects. We assume that successful communication contributes to biological fitness: individuals who communicate well leave more offspring. Children inherit from their parents a strategy for language learning (a language acquisition device). We consider three mechanisms whereby language is passed from one generation to the next: (i) parental learning: children learn the language of their parents; (ii) role model learning: children learn the language of individuals with a high payoff; and (iii) random learning: children learn the language of randomly chosen individuals. We show that parental and role model learning outperform random learning. Then we introduce mistakes in language learning and study how this process changes language over time. Mistakes increase the overall efficacy of parental and role model learning: in a world with errors evolutionary adaptation is more efficient. Our model also provides a simple explanation why homonymy is common while synonymy is rare.