Language Evolution and Computation Bibliography

Our site (www.isrl.uiuc.edu/amag/langev) has been retired; please use https://langev.com instead.
W. Garrett Mitchener
2014
Artificial Life 20:491-530, 2014
I describe the Utrecht Machine (UM), a discrete artificial regulatory network designed for studying how evolution discovers biochemical computation mechanisms. The corresponding binary genome format is compatible with gene deletion, duplication, and recombination. In the simulation presented here, an agent consisting of two UMs, a sender and a receiver, must encode, transmit, and decode a binary word over time using the narrow communication channel between them. This communication problem has chicken-and-egg structure in that a sending mechanism is useless without a corresponding receiving mechanism. An in-depth case study reveals that a coincidence creates a minimal partial solution, from which a sequence of partial sending and receiving mechanisms evolve. Gene duplications contribute by enlarging the regulatory network. Analysis of 60,000 sample runs under a variety of parameter settings confirms that crossover accelerates evolution, that stronger selection tends to find clumsier solutions and finds them more slowly, and that there is implicit selection for robust mechanisms and genomes at the codon level. Typical solutions associate each input bit with an activation speed and combine them almost additively. The parents of breakthrough organisms sometimes have lower fitness scores than others in the population, indicating that populations can cross valleys in the fitness landscape via outlying members. The simulation exhibits back mutations and population-level memory effects not accounted for in traditional population genetics models. All together, these phenomena suggest that new evolutionary models are needed that incorporate regulatory network structure.
2011
Journal of Logic, Language and Information 20(3):385-396, 2011
I discuss a stochastic model of language learning and change. During a syntactic change, each speaker makes use of constructions from two different idealized grammars at variable rates. The model incorporates regularization in that speakers have a slight ...
2007
Bulletin of Mathematical Biology 69(3):1093-1118, 2007
We investigate a model of language evolution, based on population game dynamics with learning. Specifically, we examine the case of two genetic variants of universal grammar (UG), the heart of the human language faculty, assuming each admits two possible grammars. The dynamics are driven by a communication game. We prove using dynamical systems techniques that if the payoff matrix obeys certain constraints, then the two UGs are stable against invasion by each other, that is, they are evolutionarily stable. These constraints are independent of the learning process. Intuitively, if a mutation in UG results in grammars that are incompatible with the established languages, then it will die out because individuals with the mutation will be unable to communicate and therefore unable to realize any potential benefit of the mutation. An example for which the proofs do not apply shows that compatible mutations may or may not be able to invade, depending on the population's history and the learning process. These results suggest that the genetic history of language is constrained by the need for compatibility and that mutations in the language faculty may have died out or taken over depending more on historical accident than on any simple notion of relative fitness.
2006
A Mathematical Model of the Loss of Verb-Second in Middle English
Medieval English and its Heritage: Structure, Meaning and Mechanisms of Change, 2006
Lightfoot (1999) proposes the following explanation for the loss of the verb-second rule in Middle English: There were two regional dialects of Middle English, a northern dialect influenced by Old Norse with a verb-second rule, and a southern dialect with a slightly different word order. Children acquire the verb-second rule based on hearing some critical fraction of cue sentences requiring such a rule. As the dialects experienced increased contact, northern children were less likely to hear enough cue sentences, and consequently acquired a different grammar, resulting in the extinction of the northern dialect.

This hypothesis can be modeled with differential equations. By using dynamical systems methods, the catastrophe in question may be modeled by a mathematical event known as a saddle-node bifurcation. A key part of the model is the function that gives the probability of learning the northern dialect given that a fraction of the local population uses it. Other acquisition models, such as the memoryless learner (Niyogi & Berwick 1996), give the mysterious result that verb-second languages should be extremely stable, in contrast to the history of English. This new model provides an explanation for that behavior: Memoryless learners are more sensitive to noise, resulting in a differently shaped function that does not allow the northern grammar to disappear. This model demonstrates how dynamical systems theory can be used to study language change and learning models.

2005
A Simulation of Language Change in the Presence of Non-Idealized Syntax
Proceedings of the workshop Psychocomputational Models of Human Language Acquisition, ACL-2005, 2005
Both Middle English and Old French had a syntactic property called verb-second or V2 that disappeared. This paper describes a simulation being developed to shed light on the question of why V2 is stable in some languages, but not others. The simulation, based on a Markov chain, uses fuzzy grammars where speakers can use an arbitrary mixture of idealized grammars. Thus, it can mimic the variable syntax observed in Middle English manuscripts. The simulation supports the hypotheses that children use the topic of a sentence for word order acquisition, that acquisition takes into account the ambiguity of grammatical information available from sample sentences, and that speakers prefer to speak with more regularity than they observe in the primary linguistic data.
2004
Proceedings of the Royal Society B: Biological Sciences 271(1540):701-704, 2004
Human language is a complex and expressive communication system. Children spontaneously develop a native language from speech they hear in their community. Languages change dramatically and unpredictably by accumulating small changes over time and by interacting with other languages. This paper describes a mathematical model illustrating language change. Children learn their parents' language imperfectly, and in the case presented here, the result is a simulated population that maintains an ever-changing mixture of grammars. This research is part of a growing attempt to use mathematical models to better understand the social and biological history of language.
2003
Bulletin of Mathematical Biology 65(1):67-93, 2003
Universal grammar (UG) is a list of innate constraints that specify the set of grammars that can be learned by the child during primary language acquisition. UG of the human brain has been shaped by evolution. Evolution requires variation. Hence, we have to postulate and study variation of UG. We investigate evolutionary dynamics and language acquisition in the context of multiple UGs. We provide examples for competitive exclusion and stable coexistence of different UGs. More specific UGs admit fewer candidate grammars, and less specific UGs admit more candidate grammars. We will analyze conditions for more specific UGs to outcompete less specific UGs and vice versa. An interesting finding is that less specific UGs can resist invasion by more specific UGs if learning is more accurate. In other words, accurate learning stabilizes UGs that admit large numbers of candidate grammars.
Bifurcation Analysis of the Fully Symmetric Language Dynamical Equation
Journal of Mathematical Biology 46(3):265-285, 2003
In this paper, I study a continuous dynamical system that describes language acquisition and communication in a group of individuals. Children inherit from their parents a mechanism to learn their language. This mechanism is constrained by a universal ...
A Mathematical Model of Human Languages: The Interaction of Game Dynamics and Learning Processes
Program in Applied and Computational Mathematics, Princeton University, 2003
Human language is a remarkable communication system, apparently unique among animals. All humans have a built-in learning mechanism known as universal grammar or UG. Languages change in regular yet unpredictable ways due to many factors, including properties of UG and contact with other languages. This dissertation extends the standard replicator equation used in evolutionary biology to include a learning process. The resulting language dynamical equation models language change at the population level. In a further extension, members of the population may have different UGs. It models evolution of the language faculty itself.

We begin by examining the language dynamical equation in the case where the parameters are fully symmetric. When learning is very error prone, the population always settles at an equilibrium where all grammars are present. For more accurate learning, coherent equilibria appear, where one grammar dominates the population. We identify all bifurcations that take place as learning accuracy increases. This alternation between incoherence and coherence provides a mechanism for understanding how language contact can trigger change.

We then relax the symmetry assumptions, and demonstrate that the language dynamical equation can exhibit oscillations and chaos. Such behavior is consistent with the regular, spontaneous, and unpredictable changes observed in actual languages, and with the sensitivity exhibited by changes triggered by language contact. From there, we move to the extended model with multiple UGs. The first stage of analysis focuses on UGs that admit only a single grammar. These are stable, immune to invasion by other UGs with imperfect learning. They can invade a population that uses a similar grammar with a multi-grammar UG. This analysis suggests that in the distant past, human UG may have admitted more languages than it currently does, and that over time variants with more built-in information have taken over. Finally, we address a low-dimensional case of competition between two UGs, and find conditions where they are stable against one another, and where they can coexist. These results imply that evolution of UG must have been incremental, and that similar variants may coexist.

This research was conducted under the supervision of Dr. Martin A. Nowak (Program in Theoretical Biology at the Institute for Advanced Study, and Program in Applied and Computational Mathematics at Princeton University).
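
The fully symmetric case described in the abstract above can be illustrated with a small numerical sketch. It is not taken from the dissertation. It assumes the commonly used form of the language dynamical equation, dx_j/dt = sum_i x_i f_i Q_ij - phi x_j, with fitness f_i = sum_j B_ij x_j and average fitness phi = sum_i f_i x_i, which the abstract does not spell out. The parameter names (n grammars, off-diagonal payoff a, learning accuracy q) and the function name simulate are hypothetical choices for illustration, and the integrator is plain forward Euler.

```python
def simulate(n=3, a=0.5, q=0.9, x0=None, dt=0.01, steps=20000):
    """Forward-Euler integration of an assumed fully symmetric
    language dynamical equation.

    Assumptions (not stated in the abstract):
      B_ii = 1, B_ij = a for i != j          (communication payoff)
      Q_ii = q, Q_ij = (1 - q) / (n - 1)     (learning matrix)
    Returns the final vector of grammar frequencies.
    """
    x = list(x0) if x0 is not None else [1.0 / n] * n
    off = (1.0 - q) / (n - 1)  # probability of learning a specific wrong grammar
    for _ in range(steps):
        total = sum(x)
        # f_i = sum_j B_ij x_j = x_i + a * (total - x_i)
        f = [xi + a * (total - xi) for xi in x]
        # w_i = x_i f_i is the reproductive output of speakers of grammar i
        w = [xi * fi for xi, fi in zip(x, f)]
        wsum = sum(w)
        phi = wsum  # average fitness, sum_i f_i x_i
        # inflow_j = sum_i w_i Q_ij = q * w_j + off * (wsum - w_j)
        dx = [q * wj + off * (wsum - wj) - phi * xj for wj, xj in zip(w, x)]
        x = [xj + dt * dxj for xj, dxj in zip(x, dx)]
    return x


if __name__ == "__main__":
    start = [0.5, 0.3, 0.2]
    print("error-prone learning:", simulate(q=0.5, x0=start))
    print("accurate learning:   ", simulate(q=0.95, x0=start))
```

Under these assumptions, a low value of q should drive the population toward the mixed state in which all grammars are present, while a high value of q should let the initially most common grammar dominate, matching the incoherent and coherent regimes the abstract describes.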