Language Evolution and Computation Bibliography

Journal :: Proceedings of the Royal Society B: Biological Sciences
2018
Proceedings of the Royal Society B: Biological Sciences 285(1871):e8559-578, 2018
Languages with many speakers tend to be structurally simple while small communities sometimes develop languages with great structural complexity. Paradoxically, the opposite pattern appears to be observed for non-structural properties of language such as vocabulary size. These apparently opposite patterns pose a challenge for theories of language change and evolution. We use computational simulations to show that this inverse pattern can depend on a single factor: ease of diffusion through the population. A population of interacting agents was arranged on a network, passing linguistic conventions to one another along network links. Agents can invent new conventions, or replicate conventions that they have previously generated themselves or learned from other agents. Linguistic conventions are either Easy or Hard to diffuse, depending on how many times an agent needs to encounter a convention to learn it. In large groups, only linguistic conventions that are easy to learn, such as words, tend to proliferate, whereas small groups where everyone talks to everyone else allow for more complex conventions, like grammatical regularities, to be maintained. Our simulations thus suggest that language, and possibly other aspects of culture, may become simpler at the structural level as our world becomes increasingly interconnected.
2013
Proceedings of the Royal Society B: Biological Sciences 280, 2013
Despite a burgeoning science of cultural evolution, relatively little work has focused on the population structure of human cultural variation. By contrast, studies in human population genetics use a suite of tools to quantify and analyse spatial and temporal patterns of genetic variation within and between populations. Human genetic diversity can be explained largely as a result of migration and drift giving rise to gradual genetic clines, together with some discontinuities arising from geographical and cultural barriers to gene flow. Here, we adapt theory and methods from population genetics to quantify the influence of geography and ethnolinguistic boundaries on the distribution of 700 variants of a folktale in 31 European ethnolinguistic populations. We find that geographical distance and ethnolinguistic affiliation exert significant independent effects on folktale diversity and that variation between populations supports a clustering concordant with European geography. This pattern of geographical clines and clusters parallels the pattern of human genetic diversity in Europe, although the effects of geographical distance and ethnolinguistic boundaries are stronger for folktales than genes. Our findings highlight the importance of geography and population boundaries in models of human cultural variation and point to key similarities and differences between evolutionary processes operating on human genes and culture.
Proceedings of the Royal Society B: Biological Sciences 280(1758), 2013
As in biological evolution, multiple forces are involved in cultural evolution. One force is analogous to selection, and acts on differences in the fitness of aspects of culture by influencing who people choose to learn from. Another force is analogous to mutation, and influences how culture changes over time owing to errors in learning and the effects of cognitive biases. Which of these forces need to be appealed to in explaining any particular aspect of human cultures is an open question. We present a study that explores this question empirically, examining the role that the cognitive biases that influence cultural transmission might play in universals of colour naming. In a large-scale laboratory experiment, participants were shown labelled examples from novel artificial systems of colour terms and were asked to classify other colours on the basis of those examples. The responses of each participant were used to generate the examples seen by subsequent participants. By simulating cultural transmission in the laboratory, we were able to isolate a single evolutionary force—the effects of cognitive biases, analogous to mutation—and examine its consequences. Our results show that this process produces convergence towards systems of colour terms similar to those seen across human languages, providing support for the conclusion that the effects of cognitive biases, brought out through cultural transmission, can account for universals in colour naming.
Proceedings of the Royal Society B: Biological Sciences 280(1762), 2013
There is disagreement about the routes taken by populations speaking Bantu languages as they expanded to cover much of sub-Saharan Africa. Here, we build phylogenetic trees of Bantu languages and map them onto geographical space in order to assess the likely pathway of expansion and test between dispersal scenarios. The results clearly support a scenario in which groups first moved south through the rainforest from a homeland somewhere near the Nigeria–Cameroon border. Emerging on the south side of the rainforest, one branch moved south and west. Another branch moved towards the Great Lakes, eventually giving rise to the monophyletic clade of East Bantu languages that inhabit East and Southeastern Africa. These phylogenies also reveal information about more general processes involved in the diversification of human populations into distinct ethnolinguistic groups. Our study reveals that Bantu languages show a latitudinal gradient in covering greater areas with increasing distance from the equator. Analyses suggest that this pattern reflects a true ecological relationship rather than merely being an artefact of shared history. The study shows how a phylogeographic approach can address questions relating to the specific histories of certain groups, as well as general cultural evolutionary processes.
2012
Proceedings of the Royal Society B: Biological Sciences, 2012
It is generally assumed that hierarchical phrase structure plays a central role in human language. However, considerations of simplicity and evolutionary continuity suggest that hierarchical structure should not be invoked too hastily. Indeed, recent ...
Proceedings of the Royal Society B: Biological Sciences 279(1733):1606-1612, 2012
Human cultural traits, such as languages, musics, rituals and material objects, vary widely across cultures. However, the majority of comparative analyses of human cultural diversity focus on between-culture variation without consideration for within-culture ...
Proceedings of the Royal Society B: Biological Sciences 279(1735):1943-1949, 2012
Communication involves a pair of behaviours—a signal and a response—that are functionally interdependent. Consequently, the emergence of communication involves a chicken-and-egg problem: if signals and responses are dependent on one another, then how does such a relationship emerge in the first place? The empirical literature suggests two solutions to this problem: ritualization and sensory manipulation; and instances of ritualization appear to be more common. However, it is not clear from a theoretical perspective why this should be the case, nor if there are any other routes to communication. Here, we develop an analytical model to examine how communication can emerge. We show that: (i) a state of non-interaction is evolutionarily stable, and so communication will not necessarily emerge even when it is in both parties' interest; (ii) the conditions for sensory manipulation are more stringent than for ritualization, and hence ritualization is likely to be more common; and (iii) communication can arise by a third route, when the intention to communicate can itself be communicated, but this may be limited to humans. More generally, our results demonstrate the utility of a functional approach to communication.
Proceedings of the Royal Society B: Biological Sciences 279(1741):3256-3263, 2012
The expansion of Bantu languages represents one of the most momentous events in the history of Africa. While it is well accepted that Bantu languages spread from their homeland (Cameroon/Nigeria) approximately 5000 years ago (ya), there is no consensus about the timing and geographical routes underlying this expansion. Two main models of Bantu expansion have been suggested: The ‘early-split’ model claims that the most recent ancestor of Eastern languages expanded north of the rainforest towards the Great Lakes region approximately 4000 ya, while the ‘late-split’ model proposes that Eastern languages diversified from Western languages south of the rainforest approximately 2000 ya. Furthermore, it is unclear whether the language dispersal was coupled with the movement of people, raising the question of language shift versus demic diffusion. We use a novel approach taking into account both the spatial and temporal predictions of the two models and formally test these predictions with linguistic and genetic data. Our results show evidence for a demic diffusion in the genetic data, which is confirmed by the correlations between genetic and linguistic distances. While there is little support for the early-split model, the late-split model shows a relatively good fit to the data. Our analyses demonstrate that subsequent contact among languages/populations strongly affected the signal of the initial migration via isolation by distance.
Proceedings of the Royal Society B: Biological Sciences 279(1747):4643-4651, 2012
Joint attention (JA) is important to many social, communicative activities, including language, and humans exhibit a considerably high level of JA compared with non-human primates. We propose a coevolutionary hypothesis to explain this degree-difference in JA: once JA started to aid linguistic comprehension, along with language evolution, communicative success (CS) during cultural transmission could enhance the levels of JA among language users. We illustrate this hypothesis via a multi-agent computational model, where JA boils down to a genetically transmitted ability to obtain non-linguistic cues aiding comprehension. The simulation results and statistical analysis show that: (i) the level of JA is correlated with the understandability of the emergent language; and (ii) CS can boost an initially low level of JA and ratchet it up to a stable high level. This coevolutionary perspective helps explain the degree-difference in many language-related competences between humans and non-human primates, and reflects the importance of biological evolution, individual learning and cultural transmission to language evolution.
2011
Proceedings of the Royal Society B: Biological Sciences 278(1704):474-479, 2011
Language is a hallmark of our species and understanding linguistic diversity is an area of major interest. Genetic factors influencing the cultural transmission of language provide a powerful and elegant explanation for aspects of the present day linguistic ...
Proceedings of the Royal Society B: Biological Sciences 278(1710):1399-1404, 2011
Reconstructing the rise and fall of social complexity in human societies through time is fundamental for understanding some of the most important transformations in human history. Phylogenetic methods based on language diversity provide a means to ...
Proceedings of the Royal Society B: Biological Sciences 278(1713):1794-1803, 2011
Language evolution is traditionally described in terms of family trees with ancestral languages splitting into descendent languages. However, it has long been recognized that language evolution also entails horizontal components, most commonly through lexical ...
Proceedings of the Royal Society B: Biological Sciences 278(1718):2562-2567, 2011
Phylogenetic inference based on language is a vital tool for tracing the dynamics of human population expansions. The timescale of agriculture-based expansions around the world provides an informative amount of linguistic change ideal for reconstructing ...
Proceedings of the Royal Society B: Biological Sciences 278(1725):3662-3669, 2011
Languages, like genes, evolve by a process of descent with modification. This striking similarity between biological and linguistic evolution allows us to apply phylogenetic methods to explore how languages, as well as the people who speak them, are related to ...
2010
Proceedings of the Royal Society B: Biological Sciences 277(1680):429-436, 2010
Scientists studying how languages change over time often make an analogy between biological and cultural evolution, with words or grammars behaving like traits subject to natural selection. Recent work has exploited this analogy by using models of biological evolution to explain the properties of languages and other cultural artefacts. However, the mechanisms of biological and cultural evolution are very different: biological traits are passed between generations by genes, while languages and concepts are transmitted through learning. Here we show that these different mechanisms can have the same results, demonstrating that the transmission of frequency distributions over variants of linguistic forms by Bayesian learners is equivalent to the Wright-Fisher model of genetic drift. This simple learning mechanism thus provides a justification for the use of models of genetic drift in studying language evolution. In addition to providing an explicit connection between biological and cultural evolution, this allows us to define a neutral model that indicates how languages can change in the absence of selection at the level of linguistic variants. We demonstrate that this neutral model can account for three phenomena: the s-shaped curve of language change, the distribution of word frequencies, and the relationship between word frequencies and extinction rates.
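For readers unfamiliar with the drift model named in this abstract, the following is a minimal sketch of plain Wright-Fisher resampling applied to counts of competing linguistic variants. It is an illustration only, not the paper's Bayesian-learner formulation: the function names (`wright_fisher_step`, `simulate`), the absence of mutation, and the fixed population size are all assumptions made here.

```python
import random


def wright_fisher_step(counts, rng):
    """Draw the next generation by sampling, with replacement, from the
    current variant frequencies. Population size stays constant."""
    n = sum(counts)
    variants = range(len(counts))
    sample = rng.choices(variants, weights=counts, k=n)
    new_counts = [0] * len(counts)
    for v in sample:
        new_counts[v] += 1
    return new_counts


def simulate(counts, generations, seed=0):
    """Return the list of variant counts for each generation, including
    the initial one. No selection and no mutation (assumed here)."""
    rng = random.Random(seed)
    history = [list(counts)]
    for _ in range(generations):
        counts = wright_fisher_step(counts, rng)
        history.append(counts)
    return history


if __name__ == "__main__":
    # Three hypothetical variants of one linguistic form, 100 tokens in total.
    for generation, c in enumerate(simulate([50, 30, 20], 10, seed=1)):
        print(generation, c)
```

Because nothing in the sketch favours any variant, changes in frequency across generations arise from sampling alone, which is the sense of "neutral" used in the abstract.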
Proceedings of the Royal Society B: Biological Sciences 277(1684):1003-1009, 2010
Humans readily distinguish spoken words that closely resemble each other in acoustic structure, irrespective of audible differences between individual voices or sex of the speakers. There is an ongoing debate about whether the ability to form phonetic ...
Proceedings of the Royal Society B: Biological Sciences 277(1693):2443-2450, 2010
There are approximately 7000 languages spoken in the world today. This diversity reflects the legacy of thousands of years of cultural evolution. How far back we can trace this history depends largely on the rate at which the different components of language evolve. Rates of lexical evolution are widely thought to impose an upper limit of 6000-10 000 years on reliably identifying language relationships. In contrast, it has been argued that certain structural elements of language are much more stable. Just as biologists use highly conserved genes to uncover the deepest branches in the tree of life, highly stable linguistic features hold the promise of identifying deep relationships between the world's languages. Here, we present the first global network of languages based on this typological information. We evaluate the relative evolutionary rates of both typological and lexical features in the Austronesian and Indo-European language families. The first indications are that typological features evolve at similar rates to basic vocabulary but their evolution is substantially less tree-like. Our results suggest that, while rates of vocabulary change are correlated between the two language families, the rates of evolution of typological features and structural subtypes show no consistent relationship across families.
2009
Proceedings of the Royal Society B: Biological Sciences 276(1664):1957-1964, 2009
The nature of social life in human prehistory is elusive, yet knowing how kinship systems evolve is critical for understanding population history and cultural diversity. Post-marital residence rules specify sex-specific dispersal and kin association, influencing the ...
Proceedings of the Royal Society B: Biological Sciences 276(1665):2299-2306, 2009
Phylogenetic methods have recently been applied to studies of cultural evolution. However, it has been claimed that the large amount of horizontal transmission that sometimes occurs between cultural groups invalidates the use of these methods. Here, we use a natural model of linguistic evolution to simulate borrowing between languages. The results show that tree topologies constructed with Bayesian phylogenetic methods are robust to realistic levels of borrowing. Inferences about divergence dates are slightly less robust and show a tendency to underestimate dates. Our results demonstrate that realistic levels of reticulation between cultures do not invalidate a phylogenetic approach to cultural and linguistic evolution.
Proceedings of the Royal Society B: Biological Sciences 276(1668):2703-2710, 2009
The evolution of languages provides a unique opportunity to study human population history. The origin of Semitic and the nature of dispersals by Semitic-speaking populations are of great importance to our understanding of the ancient history of the Middle East and Horn of Africa. Semitic populations are associated with the oldest written languages and urban civilizations in the region, which gave rise to some of the world's first major religious and literary traditions. In this study, we employ Bayesian computational phylogenetic techniques recently developed in evolutionary biology to analyse Semitic lexical data by modelling language evolution and explicitly testing alternative hypotheses of Semitic history. We implement a relaxed linguistic clock to date language divergences and use epigraphic evidence for the sampling dates of extinct Semitic languages to calibrate the rate of language evolution. Our statistical tests of alternative Semitic histories support an initial divergence of Akkadian from ancestral Semitic over competing hypotheses (e.g. an African origin of Semitic). We estimate an Early Bronze Age origin for Semitic approximately 5750 years ago in the Levant, and further propose that contemporary Ethiosemitic languages of Africa reflect a single introduction of early Ethiosemitic from southern Arabia approximately 2800 years ago.
Proceedings of the Royal Society B: Biological Sciences 276(1674):3835-3843, 2009
The question as to whether cultures evolve in a manner analogous to that of genetic evolution can be addressed by attempting to reconstruct population histories using cultural data. As others have argued, this can only succeed if cultures are isolated enough to maintain and pass on a central core of traditions that can be modified over time. In this study we used a set of cultural data (canoe design traits from Polynesia) to look for the kinds of patterns and relationships normally found in population genetic studies. After developing new techniques to accommodate the peculiarities of cultural data, we were able to infer an ancestral region (Fiji) and a sequence of cultural origins for these Polynesian societies. In addition, we found evidence of cultural exchange, migration and a serial founder effect. Results were stronger when analyses were based on functional traits (presumably subject to natural selection and convergence) rather than symbolic or stylistic traits (probably subject to cultural selection for rapid divergence). These patterns strongly suggest that cultural evolution, while clearly affected by cultural exchange, is also subject to some of the same processes and constraints as genetic evolution.
2005
Proceedings of the Royal Society B: Biological Sciences, 2005
Although many species possess rudimentary communication systems, humans seem to be unique with regard to making use of syntax and symbolic reference. Recent approaches to the evolution of language formalize why syntax is selectively advantageous compared with isolated signal communication systems, but do not explain how signals naturally combine. Even more recent work has shown that if a communication system maximizes communicative efficiency while minimizing the cost of communication, or if a communication system constrains ambiguity in a non-trivial way while a certain entropy is maximized, signal frequencies will be distributed according to Zipf's law. Here we show that such communication principles give rise not only to signals that have many traits in common with the linking words in real human languages, but also to a rudimentary sort of syntax and symbolic reference.
2004
Proceedings of the Royal Society B: Biological Sciences 271(1540):701-704, 2004
Human language is a complex and expressive communication system. Children spontaneously develop a native language from speech they hear in their community. Languages change dramatically and unpredictably by accumulating small changes over time and by interacting with other languages. This paper describes a mathematical model illustrating language change. Children learn their parents' language imperfectly, and in the case presented here, the result is a simulated population that maintains an ever-changing mixture of grammars. This research is part of a growing attempt to use mathematical models to better understand the social and biological history of language.
2003
Proceedings of the Royal Society B: Biological Sciences 270(1510):69-76, 2003
We investigate how the evolution of communication strategies affects signal credibility when there is common interest as well as a conflict between communicating individuals. Taking alarm calls as an example, we show that if the temptation to cheat is low, a single signal is used in the population. If the temptation increases cheaters will erode the credibility of a signal, and an honest mutant using a different signal ('a private code') will be very successful until this, in turn, is cracked by cheaters. In such a system, signal use fluctuates in time and space and hence the meaning of a given signal is not constant. When the temptation to cheat is too large, no honest communication can maintain itself in a Tower of Babel of many signals. We discuss our analysis in the light of the Green Beard mechanism for the evolution of altruism.
2001
Proceedings of the Royal Society B: Biological Sciences 268(1472):1189-1196, 2001
The language acquisition period in humans lasts about 13 years. After puberty it becomes increasingly difficult to learn a language. We explain this phenomenon by using an evolutionary framework. We present a dynamical system describing competition between language acquisition devices, which differ in the length of the learning period. There are two selective forces that play a role in determining the critical learning period: (i) having a longer learning period increases the accuracy of language acquisition; (ii) learning is associated with certain costs that affect fitness. As a result, there exists a limited learning period which is evolutionarily stable. This result is obtained analytically by means of a Nash equilibrium analysis of language acquisition devices. Interestingly, the evolutionarily stable learning period does not maximize the average fitness of the population.
Proceedings of the Royal Society B: Biological Sciences 268(1482):2261-2265, 2001
Words in human language interact in sentences in non-random ways, and allow humans to construct an astronomic variety of sentences from a limited number of discrete units. This construction process is extremely fast and robust. The co-occurrence of words in sentences reflects language organization in a subtle manner that can be described in terms of a graph of word interactions. Here, we show that such graphs display two important features recently found in a disparate number of complex systems. (i) The so-called small-world effect. In particular, the average distance between two words, d (i.e. the average minimum number of links to be crossed from an arbitrary word to another), is shown to be d ≈ 2-3, even though the human brain can store many thousands. (ii) A scale-free distribution of degrees. The known pronounced effects of disconnecting the most connected vertices in such networks can be identified in some language disorders. These observations indicate some unexpected features of language organization that might reflect the evolutionary and social history of lexicons and the origins of their flexibility and combinatorial nature.
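The average distance d defined in this abstract can be illustrated with a small sketch that builds a word graph and measures the mean shortest-path length by breadth-first search. The linking rule used here (an edge between adjacent words in a sentence) is an assumption for illustration; the abstract does not state the paper's exact co-occurrence criterion, and the function names are hypothetical.

```python
from collections import deque


def cooccurrence_graph(sentences):
    """Build an undirected graph linking words that appear next to each
    other in a sentence (an assumed, simplified co-occurrence rule)."""
    graph = {}
    for sentence in sentences:
        words = sentence.lower().split()
        for a, b in zip(words, words[1:]):
            if a != b:
                graph.setdefault(a, set()).add(b)
                graph.setdefault(b, set()).add(a)
    return graph


def average_distance(graph):
    """Mean shortest-path length over all ordered pairs of distinct,
    mutually reachable words, computed with one BFS per word."""
    total = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    toy_corpus = ["the cat sat on the mat", "the dog sat on the cat"]
    print(average_distance(cooccurrence_graph(toy_corpus)))
```

On a toy corpus the value says little; the small-world claim concerns d staying small as the vocabulary grows to many thousands of words.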
Proceedings of the Royal Society B: Biological Sciences 268(1485):2603-2606, 2001
Human language may be described as a complex network of linked words. In such a treatment, each distinct word in language is a vertex of this web, and interacting words in sentences are connected by edges. The empirical distribution of the number of connections of words in this network is of a peculiar form that includes two pronounced power-law regions. Here we propose a theory of the evolution of language, which treats language as a self-organizing network of interacting words. In the framework of this concept, we completely describe the observed word web structure without any fitting. We show that the two regimes in the distribution naturally emerge from the evolutionary dynamics of the word web. It follows from our theory that the size of the core part of language, the 'kernel lexicon', does not vary as language evolves.
1999
Proceedings of the Royal Society B: Biological Sciences 266(1433):2131-2136, 1999
On the evolutionary trajectory that led to human language there must have been a transition from a fairly limited to an essentially unlimited communication system. The structure of modern human languages reveals at least two steps that are required for such a transition: in all languages (i) a small number of phonemes are used to generate a large number of words; and (ii) a large number of words are used to produce an unlimited number of sentences. The first (and simpler) step is the topic of the current paper. We study the evolution of communication in the presence of errors and show that this limits the number of objects (or concepts) that can be described by a simple communication system. The evolutionary optimum is achieved by using only a small number of signals to describe a few valuable concepts. Adding more signals does not increase the fitness of a language. This represents an error limit for the evolution of communication. We show that this error limit can be overcome by combining signals (phonemes) into words. The transition from an analogue to a digital system was a necessary step toward the evolution of human language.
1995
Spatial structure and the evolution of honest cost-free signalling
Proceedings of the Royal Society B: Biological Sciences 260:365-372, 1995
Models of animal signalling stress that among unrelated individuals the transfer of honest information normally requires that signals are costly, and costly in a way related to the true information revealed by the signal. In the absence of such a cost, 'cheats' that lie about their states or needs are able to evolve and exploit the preferences of receivers. We show here that spatial constraints imposed on the interactions between signallers and receivers favour honest signalling even in the absence of any costs: 'islands' of honesty coexist in 'seas' of dishonesty. The extent to which honest or dishonest strategies are favoured is shown to depend upon the relative payoffs from signalling and receiving. As the receiving component of fitness becomes greater than the signalling component of fitness, as might be true in 'life-dinner' type interactions, honesty is increasingly favoured. In addition, in spatial populations, honesty can be favoured locally even when the mean global payoffs to honesty are lower than the mean payoffs to dishonesty. Our model provides a general framework for analysing signals in spatially structured populations and might therefore apply to signalling in both natural and cultural situations.