Language Evolution and Computation Bibliography

Our site (www.isrl.uiuc.edu/amag/langev) has been retired; please use https://langev.com instead.
Dietrich Stauffer
2009
Annals of the New York Academy of Sciences, pages 221-229, 2009
We use agent-based Monte Carlo simulations to address the problem of language choice dynamics in a tripartite community which is linguistically homogeneous but politically divided. We observe the process of non-local pattern formation that causes populations to self-organize into stable antagonistic groups due to the local dynamics of attraction and influence between individual computational agents. Our findings uncover some of the unique properties of opinion formation in social groups when the process is affected by asymmetric noise distribution, unstable inter-group boundaries, and different migratory behaviors. Although we focus on one particular study, the proposed stochastic dynamic models can be easily generalized and applied to investigate the evolution of other complex and nonlinear features of human collective behavior.
2008
Journal of Linguistics 44(3):659-675, 2008
This paper presents computer simulations of language populations and the development of language families, showing how a simple model can lead to distributions similar to those observed empirically by Wichmann (2005) and others. The model combines features of two models used in earlier work for the simulation of competition among languages: the 'Viviane' model for the migration of peoples and the propagation of languages, and the 'Schulze' model, which uses bit-strings as a way of characterising structural features of languages.
Physica A: Statistical Mechanics and its Applications 387(13):3242-3252, 2008
The standard three-state voter model is enlarged by including the outside pressure favouring one of the three language choices and by adding some biased internal random noise. The Monte Carlo simulations are motivated by states with the population divided into three groups of various affinities to each other. We show the crucial influence of the boundaries for moderate lattice sizes like 500 x 500. By removing the fixed boundary at one side, we demonstrate that this can lead to the victory of one single choice. Noise in contrast stabilizes the choices of all three populations. In addition, we compute the persistence probability, i.e., the number of sites who have never changed their opinion during the simulation, and we consider the case of "rigid-minded" decision makers.
Birth, survival and death of languages by Monte Carlo simulation
Communications in Computational Physics 3(2):271-294, 2008
Simulations mostly by physicists of the competition between adult languages since 2003 are reviewed. The Viviane and Schulze models give good and reasonable agreement, respectively, with the empirical histogram of language sizes. Also the number of different languages within one language family is modeled reasonably in an intermediate range. Bilingualism is now incorporated into the Schulze model. Also the rate at which the majority shifts from one language to another is found to be nearly independent of the population size, or to depend strongly on it, according to details of the Schulze model. Other simulations, like Nettle-Culicover-Nowak, are reviewed more briefly.
Advances in Complex Systems 11(3):357-369, 2008
An earlier study [24] concluded, based on computer simulations and some inferences from empirical data, that languages will change the more slowly the larger the population gets. We replicate this study using a more complete language model for simulations (the Schulze model combined with a Barabasi-Albert network) and a richer empirical dataset [12]. Our simulations show either a negligible or a strong dependence of language change on population sizes, depending on the parameter settings; while empirical data, like some of the simulations, show a negligible dependence.
2007
Linguistic Typology 11(2):395-423, 2007
Modern linguistic typology is increasingly less concerned with what is possible in human languages (universals) and increasingly more with the question "what's where why?" (Bickel 2007). Moreover, as several recent papers in this journal show, typologists increasingly turn to quantitative approaches as a means to understanding typological distributions. In order to provide the quantitative study of typological distributions with a firm methodological foundation it is preferable to gain a grasp of simple facts before starting to ask the more complicated questions. In this article the only assumptions we make about languages are that (i) they may be partly described by a set of typological characteristics, each of which may either be found or not found in any given language; that (ii) languages may be genealogically related or not; and that (iii) languages are spoken in certain places. Given these minimal assumptions we can begin to ask how to express the differences and similarities among languages as functions of the geographical distances among them, whether different functions apply to genealogically related and unrelated languages, and whether it is possible to distinguish in some quantitative way between languages that are related and languages that are not, even when the languages in question are spoken at great distances from one another. Moreover, we may investigate the effects that factors such as ecology, migration, and rates of linguistic change or diffusion have on the degree of similarities among languages in cases where they are either related or unrelated. We will approach these questions from two perspectives. The first perspective is an empirical one, where observations primarily derive from analyses of the data of Haspelmath et al. (eds.) (2005). The second perspective is a computational one, where simulations are drawn upon to test the effects of different parameters on the development of structural linguistic diversity.
Physica A: Statistical Mechanics and its Applications 376:609-616, 2007
The language competition model of Viviane de Oliveira et al. is modified by associating with each language a string of 32 bits. Whenever a language changes in this Viviane model, also one randomly selected bit is flipped. If then only languages with different bit-strings are counted as different, the resulting size distribution of languages agrees with the empirically observed slightly asymmetric log-normal distribution. Several other modifications were also tried but either had more free parameters or agreed less well with reality.
Physica A: Statistical Mechanics and its Applications 379(2):661-664, 2007
Using the Schulze model for Monte Carlo simulations of language competition, we include a barrier between the top half and the bottom half of the lattice. We check under which conditions two different languages evolve as dominating in the two halves.
Computer simulation of language competition by physicists
Econophysics and Sociophysics, pages 3807-3819, 2007
Physica A: Statistical Mechanics and its Applications 374(2):835-842, 2007
The differential equation of Abrams and Strogatz for the competition between two languages is compared with agent-based Monte Carlo simulations for fully connected networks as well as for lattices in one, two and three dimensions, with up to $10^9$ agents. In the case of socially equivalent languages, agent-based models and a mean-field approximation give grossly different results.
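For orientation, the mean-field side of this comparison can be sketched in a few lines. The abstract does not state the equation; the form used here, dx/dt = s (1 - x) x^a - (1 - s) x (1 - x)^a, is the one usually attributed to Abrams and Strogatz (2003), with x the fraction of speakers of one language, s its relative status, and a ≈ 1.31 their fitted exponent. The function names, the forward-Euler integrator, and the step sizes are illustrative assumptions, not taken from the paper listed above.

```python
def abrams_strogatz_rate(x, s=0.5, a=1.31):
    """Mean-field dx/dt for the fraction x speaking language X.

    s is the relative status of X (s = 0.5 means socially equivalent
    languages); a is the exponent, assumed here to be 1.31.
    """
    return s * (1.0 - x) * x**a - (1.0 - s) * x * (1.0 - x)**a


def integrate(x0, s=0.5, a=1.31, dt=0.01, steps=10000):
    """Forward-Euler integration (an assumed, deliberately simple scheme)."""
    x = x0
    for _ in range(steps):
        x += dt * abrams_strogatz_rate(x, s, a)
        x = min(1.0, max(0.0, x))  # keep the fraction inside [0, 1]
    return x


if __name__ == "__main__":
    for x0 in (0.4, 0.5, 0.6):
        print(x0, integrate(x0))
```

With s = 0.5 and a > 1, the point x = 0.5 is an unstable fixed point of this equation, so in the mean-field picture whichever language starts with the majority gains speakers over time.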
Transactions of the Philological Society 105(2):126-147, 2007
This paper presents the results of the application of a bit-string model of languages (Schulze and Stauffer 2005) to problems of taxonomic patterns. The questions addressed include the following: (1) Which parameters are minimally needed for the development of a taxonomic dynamics leading to the type of distribution of language family sizes currently attested (as measured in the number of languages per family), which appears to be a power-law? (2) How may such a model be coupled with one of the dynamics of speaker populations leading to the type of language size seen today, which appears to follow a log-normal distribution?
2006
Computing in Science and Engineering 8(3):60-67, 2006
Will we all eventually speak the same language and its dialects? Here, we summarize several language models and present variants of our own language model in greater detail.
Advances in Complex Systems 9(3):183-191, 2006
Our earlier language model is modified to allow for the survival of a minority language without higher status, just because of the pride of its speakers in their linguistic identity. An appendix studies the roughness of the interface for linguistic regions when one language conquers the whole territory.
Physica A: Statistical Mechanics and its Applications 371(2):719-724, 2006
The bit-string model of Schulze and Stauffer (2005) is applied to non-equilibrium situations and then gives better agreement with the empirical distribution of language sizes. Here the size is the number of people having this language as mother tongue. In contrast, when equilibrium is combined with irreversible mutations of languages, one language always dominates and is spoken by at least 80 percent of the population.
2005
International Journal of Modern Physics C 16(5):781-787, 2005
Similar to biological evolution and speciation we define a language through a string of 8 or 16 bits. The parent gives its language to its children, apart from a random mutation from zero to one or from one to zero; initially all bits are zero. The Verhulst deaths are taken as proportional to the total number of people, while in addition languages spoken by many people are preferred over small languages. For a fixed population size, a sharp phase transition is observed: For low mutation rates, one language contains nearly all people; for high mutation rates, no language dominates and the size distribution of languages is roughly log-normal as for present human languages. A simple scaling law is valid.
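The inheritance-plus-mutation step described in this abstract can be illustrated with a toy sketch. It covers only the parts the abstract states directly: a language is a string of bits, all bits start at zero, and a child inherits its parent's language apart from a random bit flip. The fixed population with one random replacement per step, the function name, and all parameter values are simplifying assumptions; the Verhulst deaths and the preference for widely spoken languages are left out, so this sketch should not be expected to reproduce the phase transition reported in the paper.

```python
import random
from collections import Counter


def simulate_bitstring_languages(pop_size=200, n_bits=8, mutation_rate=0.01,
                                 steps=5000, seed=0):
    """Toy bit-string language model (hypothetical simplification).

    Each individual's language is an integer holding n_bits bits.
    Returns a Counter mapping language -> number of speakers.
    """
    rng = random.Random(seed)
    population = [0] * pop_size  # initially all bits are zero
    for _ in range(steps):
        child = rng.choice(population)  # child inherits a random parent's language
        if rng.random() < mutation_rate:
            # mutation: flip one randomly chosen bit (0 -> 1 or 1 -> 0)
            child ^= 1 << rng.randrange(n_bits)
        # assumption: the child replaces a random individual, so the
        # population size stays fixed
        population[rng.randrange(pop_size)] = child
    return Counter(population)


if __name__ == "__main__":
    sizes = simulate_bitstring_languages()
    print(len(sizes), "languages; largest has", max(sizes.values()), "speakers")
```

Setting `mutation_rate=0.0` leaves every speaker on the all-zero language, which is a quick sanity check that new languages arise only through the bit flips.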
Physics of Life Reviews 2(2):89-116, 2005
The similarity of the evolution of human languages (or alphabets, bird songs, ...) to biological evolution of species is utilized to study with up to $10^9$ people the rise and fall of languages either by macroscopic differential equations similar to biological Lotka-Volterra equation, or by microscopic Monte Carlo simulations of bit-strings incorporating the birth, maturity, and death of every individual. For our bit-string model, depending on parameters either one language comprises the majority of speakers (dominance), or the population splits into many languages having in order of magnitude the same number of speakers (fragmentation); in the latter case the size distribution is log-normal, with upward deviations for small sizes, just as in reality for human languages. On a lattice two different dominating languages can coexist in neighbouring regions, without being favoured or disfavoured by different status. We deal with modifications and competition for existing languages, not with the evolution or learning of one language.