Language Evolution and Computation Bibliography

Jinyun Ke
2009
Language Learning 59(s1):1-26, 2009
Language has a fundamentally social function. Processes of human interaction along with domain-general cognitive processes shape the structure and knowledge of language. Recent research in the cognitive sciences has demonstrated that patterns of use strongly affect how language is acquired, is used, and changes. These processes are not independent of one another but are facets of the same complex adaptive system (CAS). Language as a CAS involves the following key features: The system consists of multiple agents (the speakers in the speech community) interacting with one another. The system is adaptive; that is, speakers' behavior is based on their past interactions, and current and past interactions together feed forward into future behavior. A speaker's behavior is the consequence of competing factors ranging from perceptual constraints to social motivations. The structures of language emerge from interrelated patterns of experience, social interaction, and cognitive mechanisms. The CAS approach reveals commonalities in many areas of language research, including first and second language acquisition, historical linguistics, psycholinguistics, language evolution, and computational modeling.
2008
Language change and social networks
Communications in Computational Physics 3(4):935-949, 2008
Social networks play an important role in determining the dynamics and outcome of language change. Early empirical studies only examine small-scale local social networks, and focus on the relationship between the individual speakers' linguistic behaviors and their characteristics in the network. In contrast, computer models can provide an efficient tool to consider large-scale networks with different structures and discuss the long-term effect of individuals' learning and interaction on language change. This paper presents an agent-based computer model which simulates language change as a process of innovation diffusion, to address the threshold problem of language change. In the model, the population is implemented as a network of agents with age differences and different learning abilities, and the population is changing, with new agents born periodically to replace old ones. Four typical types of networks and their effect on the diffusion dynamics are examined. When the functional bias is sufficiently high, innovations always diffuse to the whole population in a linear manner in regular and small-world networks, but diffuse quickly in a sharp S-curve in random and scale-free networks. The success rate of diffusion is higher in regular and small-world networks than in random and scale-free networks. In addition, the model shows that as long as the population contains a small number of statistical learners who can learn and use both linguistic variants statistically according to the impact of these variants in the input, there is a very high probability for linguistic innovations with only small functional advantage to overcome the threshold of diffusion.
2006
Applied Linguistics 27(4):691-716, 2006
In recent decades, there has been a surge of interest in the origin of language across a wide range of disciplines. Emergentism provides a new perspective to integrate investigations from different areas of study. This paper discusses how the study of language acquisition can contribute to the inquiry, in particular when computer modeling is adopted as the research methodology. An agent-based model is described as an illustration, which simulates how word order in a language could have emerged at the very beginning of language origin. Two important features of emergence, heterogeneity and nonlinearity, are demonstrated in the model, and their implications for applied linguistics are discussed.
2005
Complexity 10(6):50-62, 2005
Whether simple syntax (in the form of simple word order) can emerge during the emergence of lexicon is studied from a simulation perspective; a multiagent computational model is adopted to trace a lexicon-syntax coevolution through iterative communications. Several factors that may affect this self-organizing process are discussed. An indirect meaning transference is simulated to study the effect of nonlinguistic information in the listener's comprehension. Besides the theoretical and empirical arguments, this computational model, following Emergentism, demonstrates an adaptation of syntax from some domain-general abilities, which provides an argument against Innatism.
2004
A computational framework to simulate the co-evolution of language and social structure
Artificial Life IX, 2004
In this paper, a multi-agent computational model is proposed to simulate the coevolution of social structure and compositional protolanguage from a holistic signaling system through iterative interactions within a heterogeneous population. We implement an indirect meaning transference based on both linguistic and nonlinguistic information in communications, together with a feedback without direct meaning check. The emergent social structure, triggered by two locally selective strategies, friendship and popularity, has small-world characteristics. The influence of these selective strategies on the emergent language and the emergent social structure is discussed.
Self-organization and Language Evolution: System, Population and Individual
Department of Electronic Engineering, City University of Hong Kong, 2004
This thesis proposes a framework adopting the self-organization theory for the study of language evolution. Self-organization explains collective behaviors and evolution with the observation that the patterns at the global level in a complex system are often properties spontaneously emergent from the numerous local interactions among the individual components, and they cannot be understood by only examining the individual components.

Language can be viewed as such emergent properties instead of products from some innate blueprint in humans. We highlight the importance of recognizing language at two distinctive but inter-dependent levels of existence, i.e. in the idiolect and in the communal language, and a self-organizing process existing at each of the two levels. It is necessary to clarify what phenomena are properties of the idiolects, and what properties are the collective behaviors at the population level.

In linguistics, however, very often an abstract language system is taken as the object of analysis. This level of analysis disregards the distinction between idiolect and communal language, and neglects the heterogeneous nature of language at both levels. As a consequence, explanations for observed patterns based on this abstract level of analysis are often inadequate. However, this is a necessary step for linguists to identify interesting phenomena in the first place. At this abstract level of analysis, the self-organization framework can also be applied. It is assumed that the abstract language system self-organizes. A study on homophony in languages is taken as an example to illustrate the analysis at this level. It is shown that the existence of homophony reflects several self-organization characteristics in a dynamic process of language evolution, such as the predictable degree of homophony, the disyllabification in Chinese dialects, the differentiation of homophone pairs in grammatical class.

We are further interested in how the self-organization is implemented. To answer this question, we need to look into the idiolects in this self-organizing process, to know how the idiolects are formed and affect each other. Language change provides an informative window in addressing these issues. Language change is the result of the collective behaviors of idiolects, even as it affects the idiolects. The heterogeneity among idiolects is exposed to the greatest extent in on-going changes.

An on-going sound change in Cantonese is taken as a case study to scrutinize the heterogeneity in the self-organizing processes. The fieldwork data reveal a large degree of variation both in the population (VT-I) and in the set of words (VT-II). Another type of variation (VT-III) is highlighted, that is, a word may also show variation within one single speaker. However, VT-III exists in only a proportion, not all, of the words subject to the change. We also find that if a speaker has some words consistently in the unchanged state and some words in the changed state, then this speaker must have some other words in the variation state. Most speakers show VT-III, but they vary in degree. The observed individual differences in the degree of VT-III suggest that the large heterogeneity may be accounted for not only by the variability of linguistic input, but also by individuals' different learning styles. We hypothesize two types of lexical learning styles, i.e. probabilistic and categorical learning. These differences in learning styles suggest that when we examine the agent's internal properties in the self-organization framework, it is necessary to examine not only the commonalities among agents, but also the differences among them.

In addition to empirical studies, this thesis employs computational modeling as a major tool for investigation, as modeling provides effective ways to test hypotheses beyond empirical studies, and suggests new questions. After a brief review of the modeling studies in the field, some models developed in this thesis for language origin and language change are reported.

The first model simulates the emergence of a consistent vocabulary from a set of random mappings between meanings and forms. It emphasizes the importance of implementing the actual process of interaction among agents, and the cumulative effect on agents' linguistic behaviors. The model suggests that the Saussurean sign with identical speaking and listening mappings may not be a biological predisposition from natural selection, but rather a result of the process of language learning and use. The process exhibits a phase transition from a long period of small oscillation to an abrupt convergence. Such a phase transition is often observed in self-organizing systems.

The second model simulates language change as innovation diffusion, and examines the effects of various factors, including some concerning properties of agents and some affecting agents' interactions. By comparing the outcome under different conditions, the model illustrates the importance of incorporating realistic assumptions, such as finite population size, age-dependent propensity to change, different learning environments in a social network, etc. The model compares the dynamics of language change in different types of network structures and shows that in non-regular networks, the rate of innovation diffusion increases little as population size increases. The model also tests the effect of the two types of hypothesized learning styles, and shows that in a population with the presence of probabilistic learners, an innovation with a small advantage will easily spread into the population and lead to a change. This may explain why language changes are so frequent.

This thesis demonstrates that both empirical and modeling studies on language evolution can greatly benefit from adopting a self-organization framework. The convergence and interplay of the two lines of exploration, i.e. biological bases in agents and the long term effect of interactions among them, should bring us a deeper understanding of how language has evolved and is evolving.

Computational studies of language evolution
Computational Linguistics and Beyond: Perspectives at the beginning of the 21st Century, Frontiers in Linguistics 1. Language and Linguistics, pages 65-106, 2004
The study of language evolution has been revitalized recently due to converging interests from many disciplines. Computational modeling is one such fruitful area. Various aspects of language evolution have been studied using mathematical modeling and simulation. In this paper we discuss several computational studies in language change and language emergence.
2003
Modeling evolution of sound systems with genetic algorithm
Computational Linguistics 29(1):1-18, 2003
In this study, optimization models using Genetic Algorithms are proposed to study the configuration of vowel and tone systems. Similar to previous explanatory models that have been used to study vowel systems, certain criteria, which are assumed to be the principles governing the structure of sound systems, are used to predict optimal vowel and tone systems. In most of the earlier studies only one criterion has been considered. When two criteria are considered, they are often combined into one scalar function. The GA model proposed for the study of tone systems uses a Pareto-ranking method which is highly applicable for dealing with optimization problems having multiple criteria. For optimization of tone systems, perceptual contrast and markedness complexity are considered simultaneously. Although the consistency between the predicted systems and the observed systems is not as significant as that obtained for vowel systems, further investigation along this line is promising.
2002
Complexity 7(3):41-54, 2002
Human language may have started from a consistent set of mappings between meanings and signals. These mappings, referred to as the early vocabulary, are considered to be the results of conventions established among the agents of a population. In this study, we report simulation models for investigating how such conventions can be reached. We propose that convention is essentially the product of self-organization of the population through interactions among the agents; and that cultural selection is another mechanism that speeds up the establishment of convention. Whereas earlier studies emphasized either one or the other of these two mechanisms, our focus is to integrate them into one hybrid model. The combination of these two complementary mechanisms, i.e. self-organization and cultural selection, provides a plausible explanation for cultural evolution which progresses with a high transmission rate. Furthermore, we observe that as the vocabulary tends toward convergence there is a uniform tendency to exhibit a sharp phase transition.