Language Evolution and Computation Bibliography

Our site (www.isrl.uiuc.edu/amag/langev) has been retired; please use https://langev.com instead.
Richard A. Blythe
2018
Quantifying the dynamics of topical fluctuations in language
arXiv, 2018
The availability of large diachronic corpora has provided the impetus for a growing body of quantitative research on language evolution and meaning change. The central quantities in this research are token frequencies of linguistic elements in the texts, with changes in frequency taken to reflect the popularity or selective fitness of an element. However, corpus frequencies may change for a wide variety of reasons, including purely random sampling effects, or because corpora are composed of contemporary media and fiction texts within which the underlying topics ebb and flow with cultural and socio-political trends. In this work, we introduce a computationally simple model for controlling for topical fluctuations in corpora—the topical-cultural advection model—and demonstrate how it provides a robust baseline of variability in word frequency changes over time. We validate the model on a diachronic corpus spanning two centuries, and a carefully-controlled artificial language change scenario, and then use it to correct for topical fluctuations in historical time series. Finally, we show that the model can be used to show that emergence of new words typically corresponds with the rise of a trending topic. This suggests that some lexical innovations occur due to growing communicative need in a subspace of the lexicon, and that the topical-cultural advection model can be used to quantify this.
2015
PloS one 10, 2015
How communication systems emerge is a topic of relevance to several academic disciplines. Numerous existing models, both mathematical and computational, study this emergence. However, with few exceptions, these models all build some form of communication into their initial specification. Consequently, what these models study is how communication systems transition from one form to another, and not how communication itself emerges in the first place. Here we present a new computational model of the emergence of communication which, unlike previous models, does not pre-specify the existence of communication. We conduct two experiments using this model, in order to derive general statements about how communication systems emerge. The two main routes to communication that we identify correspond with findings from the empirical literature on the evolution of animal signals. We use this finding to explain when and why we should expect communication to emerge in nature. We also compare our model to experimental research on the origins of human communication systems, and hence show that humans are an important exception to the general trends we observe. We argue that this is because humans, and probably only humans, are able to ‘signal signalhood’, i.e. to express communicative intentions.
2012
Advances in Complex Systems 15(03n04):1150015, 2012
We review the task of aligning simple models for language dynamics with relevant empirical data, motivated by the fact that this is rarely attempted in practice despite an abundance of abstract models. We propose that one way to meet this challenge is through the careful construction of null models. We argue in particular that rejection of a null model must have important consequences for theories about language dynamics if modeling is truly to be worthwhile. Our main claim is that the stochastic process of neutral evolution (also known as genetic drift or random copying) is a viable null model for language dynamics. We survey empirical evidence in favor and against neutral evolution as a mechanism behind historical language changes, highlighting the theoretical implications in each case.
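The neutral evolution (random copying) null model named in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the two variant labels "A" and "B", and all parameter values are assumptions chosen for illustration. Each generation, every token is replaced by a copy of a token drawn uniformly at random from the previous generation, with no bias toward either variant.

```python
import random

def neutral_copying_step(population, rng):
    """One generation of random copying: every new token copies a
    uniformly chosen token from the previous generation."""
    return [rng.choice(population) for _ in population]

def generations_to_fixation(n_tokens=50, initial_a=25, seed=0,
                            max_generations=100_000):
    """Run neutral copying until a single variant remains.

    Returns (generation count, surviving variant), or
    (max_generations, None) if no variant has fixed by then.
    All parameter values are hypothetical defaults.
    """
    rng = random.Random(seed)
    population = ["A"] * initial_a + ["B"] * (n_tokens - initial_a)
    for generation in range(max_generations):
        if len(set(population)) == 1:  # only one variant left
            return generation, population[0]
        population = neutral_copying_step(population, rng)
    return max_generations, None

if __name__ == "__main__":
    print(generations_to_fixation())
```

Even with no selective advantage, one variant eventually takes over through sampling noise alone, which is why such a process can serve as a baseline against which claims of selection are tested.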
S-curves and the mechanisms of propagation in language change
Language 88(2):269--304, 2012
A variety of mechanisms have been proposed in sociolinguistics for the propagation of an innovation through the speech community. The complexity of social systems makes it difficult to evaluate the different mechanisms empirically. We use the four-way typology of ...
Proceedings of the Royal Society B: Biological Sciences 279(1735):1943--1949, 2012
Communication involves a pair of behaviours—a signal and a response—that are functionally interdependent. Consequently, the emergence of communication involves a chicken-and-egg problem: if signals and responses are dependent on one another, then how does such a relationship emerge in the first place? The empirical literature suggests two solutions to this problem: ritualization and sensory manipulation; and instances of ritualization appear to be more common. However, it is not clear from a theoretical perspective why this should be the case, nor if there are any other routes to communication. Here, we develop an analytical model to examine how communication can emerge. We show that: (i) a state of non-interaction is evolutionarily stable, and so communication will not necessarily emerge even when it is in both parties' interest; (ii) the conditions for sensory manipulation are more stringent than for ritualization, and hence ritualization is likely to be more common; and (iii) communication can arise by a third route, when the intention to communicate can itself be communicated, but this may be limited to humans. More generally, our results demonstrate the utility of a functional approach to communication.
Fast fixation with a generic network structure
Physical Review E 86(3):031142, 2012
We investigate the dynamics of a broad class of stochastic copying processes on a network that includes examples from population genetics (spatially structured Wright-Fisher models), ecology (Hubbell-type models), linguistics (the utterance selection model), and opinion dynamics (the voter model) as special cases. These models all have absorbing states of fixation where all the nodes are in the same state. Earlier studies of these models showed that the mean time when this occurs can be made to grow as different powers of the network size by varying the degree distribution of the network. Here we demonstrate that this effect can also arise if one varies the asymmetry of the copying dynamics while holding the degree distribution constant. In particular, we show that the mean time to fixation can be accelerated even on homogeneous networks when certain nodes are very much more likely to be copied from than copied to. We further show that there is a complex interplay between degree distribution and asymmetry when they may covary, and that the results are robust to correlations in the network or the initial condition.
2011
The Interplay of Replication, Variation and Selection in the Dynamics of Evolving Populations
Principles of Evolution, pages 81--118, 2011
Evolution is a process by which change occurs through replication. Variation can be introduced into a population during the replication process. Some of the resulting variants may be replicated more rapidly than others, and so the characteristics of the population – and individuals within it – change over time. These processes can be recognised most obviously in genetics and ecology; but they also arise in the context of cultural change. We discuss two key questions that are crucial to the development of evolutionary theory. First, we consider how different application domains may be usefully placed within a single framework; and second, we ask how one can distinguish directed, deterministic change from changes that occur purely because of the stochastic nature of the underlying replication process.
2010
Adaptive Behavior 18(1):12-20, 2010
Using our interdisciplinary research collaboration as a case study, we discuss the question of whether formal modeling and empirical approaches can be successfully integrated into a single line of research. We argue that to avoid an undesirable disconnect between the two, one needs considerable time and patience for a science-humanities collaboration to bear fruit. In our collaboration and, we believe, in science-humanities collaborations in general, certain shared goals are required for success, including: starting with simple models before moving to more complex models; the importance of continually comparing models with empirical data where possible; and a focus on explaining statistical patterns rather than accounting for single data points individually.
2009
Language Variation and Change 21(2):257-296, 2009
Trudgill (2004) proposed that the emergence of New Zealand English, and of isolated new dialects generally, is purely deterministic. It can be explained solely in terms of the frequency of occurrence of particular variants and the frequency of interactions between different speakers in the society. Trudgill's theory is closely related to usage-based models of language, in which frequency plays a role in the representation of linguistic knowledge and in language change. Trudgill's theory also corresponds to a neutral evolution model of language change. We use a mathematical model based on Croft's usage-based evolutionary framework for language change (Baxter, Blythe, Croft, & McKane, 2006), and investigate whether Trudgill's theory is a plausible model of the emergence of new dialects. The results of our modeling indicate that determinism cannot be a sufficient mechanism for the emergence of a new dialect. Our approach illustrates the utility of mathematical modeling of theories and of empirical data for the study of language change.
Journal of Statistical Mechanics: Theory and Experiment, pages P02059, 2009
We introduce a class of stochastic models for the dynamics of two linguistic variants that are competing to become the single, shared convention within an unstructured community of speakers. Different instances of the model are distinguished by the way agents handle variability in the language (i.e., multiple forms for the same meaning). The class of models includes as special cases two previously-studied models of language dynamics, the Naming Game, in which agents tend to standardise on variants they have encountered most frequently, and the Utterance Selection Model, in which agents tend to preserve variability by uniform sampling of a pool of utterances. We reduce the full complexities of the dynamics to a single-coordinate stochastic model which allows the probability and time taken for speakers to reach consensus on a single variant to be calculated for large communities. This analysis suggests that in the broad class of models considered, consensus is formed in one of three generic ways, according to whether agents tend to eliminate, accentuate or sample neutrally the variability in the language. These different regimes are observed in simulations of the full dynamics, and for which the simplified model in some cases makes good quantitative predictions. We use these results, along with comparisons with related models, to conjecture the likely behaviour of more general models, and further make use of empirical data to argue that in reality, biases away from neutral sampling behaviour are likely to be small.
Language Learning 59(s1):47-63, 2009
Language is a complex adaptive system: Speakers are agents who interact with each other, and their past and current interactions feed into speakers' future behavior in complex ways. In this article, we describe the social cognitive linguistic basis for this analysis of language and a mathematical model developed in collaboration between researchers in linguistics and statistical physics. The model has led us to posit two mechanisms of selection (neutral interactor selection and weighted interactor selection) in addition to neutral evolution and replicator selection (fitness). We describe current results in modeling language change in terms of neutral interactor selection and weighted interactor selection.
Language Learning 59(s1):1-26, 2009
Language has a fundamentally social function. Processes of human interaction along with domain-general cognitive processes shape the structure and knowledge of language. Recent research in the cognitive sciences has demonstrated that patterns of use strongly affect how language is acquired, is used, and changes. These processes are not independent of one another but are facets of the same complex adaptive system (CAS). Language as a CAS involves the following key features: The system consists of multiple agents (the speakers in the speech community) interacting with one another. The system is adaptive; that is, speakers' behavior is based on their past interactions, and current and past interactions together feed forward into future behavior. A speaker's behavior is the consequence of competing factors ranging from perceptual constraints to social motivations. The structures of language emerge from interrelated patterns of experience, social interaction, and cognitive mechanisms. The CAS approach reveals commonalities in many areas of language research, including first and second language acquisition, historical linguistics, psycholinguistics, language evolution, and computational modeling.
Trends in Cognitive Sciences 13(11):464-469, 2009
Studies of language change have begun to contribute to answering several pressing questions in cognitive sciences, including the origins of human language capacity, the social construction of cognition and the mechanisms underlying culture change in general. Here, we describe recent advances within a new emerging framework for the study of language change, one that models such change as an evolutionary process among competing linguistic variants. We argue that a crucial and unifying element of this framework is the use of probabilistic, data-driven models both to infer change and to compare competing claims about social and cognitive influences on language change.
2008
Physical Review Letters 101(25):258701, 2008
We investigate a set of stochastic models of biodiversity, population genetics, language evolution and opinion dynamics on a network within a common framework. Each node has a state, 0 < x_i < 1, with interactions specified by strengths m_ij. For any set of m_ij we derive an approximate expression for the mean time to reach fixation or consensus (all x_i = 0 or 1). Remarkably in a case relevant to language change this time is independent of the network structure.
2006
Physical Review E 73:046118, 2006
We present a mathematical formulation of a theory of language change. The theory is evolutionary in nature and has close analogies with theories of population genetics. The mathematical structure we construct similarly has correspondences with the Fisher-Wright model of population genetics, but there are significant differences. The continuous time formulation of the model is expressed in terms of a Fokker-Planck equation. This equation is exactly soluble in the case of a single speaker and can be investigated analytically in the case of multiple speakers who communicate equally with all other speakers and give their utterances equal weight. Whilst the stationary properties of this system have much in common with the single-speaker case, time-dependent properties are richer. In the particular case where linguistic forms can become extinct, we find that the presence of many speakers causes a two-stage relaxation, the first being a common marginal distribution that persists for a long time as a consequence of ultimate extinction being due to rare fluctuations.
Symbol Grounding and Beyond: Proceedings of the Third International Workshop on the Emergence and Evolution of Linguistic Communication, pages 31-44, 2006
We present a mathematical model of cross-situational learning, in which we quantify the learnability of words and vocabularies. We find that high levels of uncertainty are not an impediment to learning single words or whole vocabulary systems, as long as the level of uncertainty is somewhat lower than the total number of meanings in the system. We further note that even large vocabularies are learnable through cross-situational learning.
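The idea of cross-situational learning can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the model from the paper: it uses a simple elimination learner, and the function name, the parameters `n_meanings` and `context_size`, and the default values are all hypothetical. In each episode the learner hears the word alongside its true meaning plus `context_size` distractor meanings, and keeps only those candidate meanings that have appeared in every episode so far.

```python
import random

def episodes_to_learn(n_meanings=50, context_size=10, seed=0,
                      max_episodes=10_000):
    """Count episodes until an elimination learner isolates the target.

    Meaning 0 is the word's true meaning. Each episode presents it
    together with `context_size` distinct distractors drawn at random
    (requires context_size <= n_meanings - 1). Returns None if the
    target is still ambiguous after `max_episodes` episodes.
    """
    rng = random.Random(seed)
    target = 0
    distractors = list(range(1, n_meanings))
    candidates = set(range(n_meanings))  # every meaning starts as possible
    for episode in range(1, max_episodes + 1):
        context = {target, *rng.sample(distractors, context_size)}
        candidates &= context  # drop meanings absent from this episode
        if candidates == {target}:
            return episode
    return None

if __name__ == "__main__":
    print(episodes_to_learn())
```

In this toy version, learning fails only when every distractor appears in every episode (`context_size == n_meanings - 1`), which loosely mirrors the abstract's condition that the level of uncertainty be lower than the total number of meanings.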