Language Evolution and Computation Bibliography

Our site (www.isrl.uiuc.edu/amag/langev) has been retired; please use https://langev.com instead.
Journal :: Advances in Complex Systems
2012
Advances in Complex Systems 15(01n02), 2012
Humans divide themselves up into groups based on a shared cultural identity and common descent. Culturally inherited differences in dress, language, and institutions are often used as symbolic markers of the boundaries of these ethnic groups. Relatively little is known ...
Advances in Complex Systems 15(03n04):1150015, 2012
We review the task of aligning simple models for language dynamics with relevant empirical data, motivated by the fact that this is rarely attempted in practice despite an abundance of abstract models. We propose that one way to meet this challenge is through the careful construction of null models. We argue in particular that rejection of a null model must have important consequences for theories about language dynamics if modeling is truly to be worthwhile. Our main claim is that the stochastic process of neutral evolution (also known as genetic drift or random copying) is a viable null model for language dynamics. We survey empirical evidence in favor and against neutral evolution as a mechanism behind historical language changes, highlighting the theoretical implications in each case.
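The random-copying process named in this abstract can be illustrated with a minimal sketch. This is not the authors' model: the population size, the two starting variants, the `innovation_rate` parameter and the `neutral_drift` function name are all assumptions made for illustration. Each token in a new generation simply copies a uniformly chosen token from the previous one, so variant frequencies change by chance alone.

```python
import random
from collections import Counter


def neutral_drift(initial, generations, innovation_rate=0.0, seed=0):
    """Random copying: every token in the new generation copies a token
    drawn uniformly from the previous generation, or, with probability
    innovation_rate, is a brand-new variant (hypothetical labelling scheme)."""
    rng = random.Random(seed)
    population = list(initial)
    next_new = 0  # counter used only to name innovations
    history = [Counter(population)]
    for _ in range(generations):
        new_population = []
        for _ in range(len(population)):
            if rng.random() < innovation_rate:
                new_population.append(f"new{next_new}")
                next_new += 1
            else:
                new_population.append(rng.choice(population))
        population = new_population
        history.append(Counter(population))
    return history


# Two variants of one linguistic feature, initially equally common.
history = neutral_drift(["a"] * 50 + ["b"] * 50, generations=200)
print(history[0], history[-1])
```

Used as a null model, the simulated frequency trajectories would be compared with observed ones; a systematic mismatch would count as evidence against purely neutral change.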
Advances in Complex Systems 15(03n04):1150016, 2012
It is widely known that color names across the world's languages tend to be organized into a neat hierarchy with a small set of 'basic names' featuring in a comparatively fixed order across linguistic societies. However, to date, the basic names have only been defined through a set of linguistic principles. There is no statistical definition that quantitatively separates the basic names from the rest of the color words across languages. Here we present a rigorous statistical analysis of the World Color Survey database hosting color word information from 110 non-industrialized languages. The central result is that those names for which a population of individuals show a larger overall agreement across languages turn out to be the basic ones exactly reproducing the color name hierarchy and, thereby, providing, for the first time, an empirical definition of the basic color names.
Advances in Complex Systems 15(03n04):1150017, 2012
Human language is unparalleled in both its expressive capacity and its diversity. What accounts for the enormous diversity of human languages [13]? Recent evidence suggests that the structure of languages may be shaped by the social and demographic environment in which the languages are learned and used. In an analysis of over 2000 languages Lupyan and Dale [25] demonstrated that socio-demographic variables, such as population size, significantly predicted the complexity of inflectional morphology. Languages spoken by smaller populations tend to employ more complex inflectional systems. Languages spoken by larger populations tend to avoid complex morphological paradigms, employing lexical constructions instead. This relationship may exist because of how language learning takes place in these different social contexts [44, 45]. In a smaller population, a tightly-knit social group combined with exclusive or almost exclusive language acquisition by infants permits accumulation of complex inflectional forms. In larger populations, adult language learning and more extensive cross-group interactions produce pressures that lead to morphological simplification. In the current paper, we explore this learning-based hypothesis in two ways. First, we develop an agent-based simulation that serves as a simple existence proof: As adult interaction increases, languages lose inflections. Second, we carry out a correlational study showing that English-speaking adults who had more interaction with non-native speakers as children showed a relative preference for over-regularized (i.e. morphologically simpler) forms. The results of the simulation and experiment lend support to the linguistic niche hypothesis: Languages may vary in the ways they do in part due to different social environments in which they are learned and used. In short, languages adapt to the learning constraints and biases of their learners.
Advances in Complex Systems 15(03n04):1150018, 2012
Evolutionary game theory is used to form a finite partition of a continuous hue circle in which perceptually similar hues are each represented by an icon chip and the circle by a finite but game dynamically determined number of icon chips. On the basis of such icon chip structures, a color categorization for both an individual learner and a population of learners is then evolved. These results remove limitations of some particular previous color categorization simulation work which assumed a fixed number of color stimuli and a maximal number of predefined color categories. These simulations are extended to demonstrate that learners need neither to share the same icon chip structures, nor do these structures have to be fully developed for a population of learners to produce a stable color categorization system. Additionally, when a naive learner is introduced into a population with a stable color categorization, the game dynamics result in the learner's adopting the existing categorization. All results are shown to hold while the underlying icon chip structures evolve continuously in response to novel stimuli. The usefulness of the approach as well as some of the potential implications of the results for human learning of color categories are discussed.
Advances in Complex Systems 15(03n04):1150019, 2012
The paper investigates the quantitative distribution of language types across languages of the world. The studies are based on three large-scale typological data bases: The World Color Survey, the Automated Similarity Judgment Project data base, and the World Atlas of Language Structures. The main finding is that a surprisingly large and varied collection of linguistic typologies show power law behavior. The bulk of the paper deals with the statistical validation of these findings.
Advances in Complex Systems 15(03n04):1150020, 2012
configure a pattern on a board to communicate with each other. Distinct from related studies, players in this game have no explicit game scores or tasks to optimize. Any dynamics occurring in this game are therefore ad-hoc and on-going processes. There were three major findings in this paper. (i) The subjects mainly interacted in two modes: a dynamic mode where players proceed through the game without assigning any meanings to the pattern, and a metaphoric mode, where players process with narrative reflection. (ii) Subjects spontaneously switch between the two modes, but this switching is suppressed when playing alone. (iii) A transition diagram of the board pattern can be used to label the two modes, e.g. linearity of the diagram is correlated with the metaphoric mode. One of the main features of grammar is to display subjects' intentionality in a systematic way. We argue that the switching between the two modes observed in our experiment can be taken as a grammatical aspect that emerged in the process. These modes express the speaker's perspective in the same manner as grammatical elements do in natural language. The switching behavior should be seen as a process that embodies a player's intention using the medium (in this case, the patterns in the wall game), and a player's exploration of the medium is a necessary step before generating a grammar structure.
Advances in Complex Systems 15(03n04):1150021, 2012
This paper reviews how the structure of form and meaning spaces influences the nature and the dynamics of the form-meaning mappings in language. In general, in a structured form or meaning space, not all forms and meanings are equivalent: some forms and some meanings are more easily confused with each other than with other forms or meanings. We first give a formalization of this idea, and explore how it influences robust form-meaning mappings. It is shown that some fundamental properties of human language, such as discreteness and combinatorial structure as well as universals of sound systems of human languages follow from optimal communication in structured form and meaning spaces. We also argue that some properties of human language follow less from these fundamental issues, and more from cognitive constraints.

We then show that it is possible to experimentally investigate the relative contribution of functional constraints and of cognitive constraints. We illustrate this with an example of one of our own experiments, in which experimental participants have to learn a set of complex form-meaning mappings that have been produced by a previous generation of participants. Theoretically predicted properties appear in the sets of signals that emerge in this iterated learning experiment.

Advances in Complex Systems 15(03n04):1150022, 2012
Linguistic meaning is a convention. This article investigates how such conventions can arise for color categories in populations of simulated 'agents'. The method uses concepts from evolutionary game theory: a language game in which agents assign names to color patches is played repeatedly by members of a population. The evolutionary dynamics employed make minimal assumptions about agents' perceptions and learning processes. Through various simulations it is shown that under different kinds of reasonable conditions involving outcomes of individual games, the evolutionary dynamics push populations to stationary equilibria, which can be interpreted as achieving shared population meaning systems. Optimal population agreement for meaning is characterized through a mathematical formula, and the simulations presented reveal that for a wide variety of situations, optimality is achieved.
Advances in Complex Systems 15(03n04):1150026, 2012
The recent growth of Experimental Semiotics (ES) offers us a new option to investigate human communication. We briefly introduce ES, presenting results from three themes of research which emerged within it. Then we illustrate the contribution ES can make to the investigation of human communication systems, particularly in comparison with the other existing options. This comparison highlights how ES can provide an engine of discovery for understanding human communication. In fact, in complementing the other options, ES offers us unique opportunities to test assumptions about communicative behavior, both through the experimenters' planned manipulations and through the unexpected behaviors humans exhibit in experimental settings. We provide three examples of such opportunities, one from each of the three research themes we present.
Advances in Complex Systems 15(03n04):1203002, 2012
Thirty authors of different disciplines, ranging from cognitive science and linguistics to mathematics and physics, address the topic of language origin and evolution. Language dynamics is investigated through an interdisciplinary effort, involving field and synthetic experiments, modelling and comparison of the theoretical predictions with empirical data. The result consists in new insights that significantly contribute to the ongoing debate on the origin and the evolution of language. In this Topical Issue the state of the art of this novel and fertile approach is reported by major experts of the field.
Advances in Complex Systems 15(03n04):1250031, 2012
The problem of how young learners acquire the meaning of words is fundamental to language development and cognition. A host of computational models exist which demonstrate various mechanisms by which words and their meanings can be transferred between a teacher and learner. However, these models often assume that the learner can easily distinguish between the referents of words, and do not show whether the learning mechanisms still function when there is perceptual ambiguity about the referent of a word. This paper presents two models that acquire meaning-word mappings in a continuous semantic space. The first model is a cross-situational learning model in which the learner induces word-meaning mappings through statistical learning from repeated exposures. The second model is a social model, in which the learner and teacher engage in a dyadic learning interaction to transfer word-meaning mappings. We show that cross-situational learning can still succeed even though the learner has no information about the exact referent of a word during learning. However, social learning outperforms cross-situational strategies both in speed of acquisition and performance. The results suggest that cross-situational learning is efficient for situations where referential ambiguity is limited, but in more complex situations social learning is the better strategy.
Advances in Complex Systems 15(03n04):1250039, 2012
The question of how a shared vocabulary can arise in a multi-agent population despite the fact that each agent autonomously invents and acquires words has been solved. The solution is based on alignment: Agents score all associations between words and meanings in their lexicons and update these preference scores based on communicative success. A positive feedback loop between success and use thus arises which causes the spontaneous self-organization of a shared lexicon. The same approach has been proposed for explaining how a population can arrive at a shared grammar, in which we get the same problem of variation because each agent invents and acquires their own grammatical constructions. However, a problem arises if constructions reuse parts that can also exist on their own. This happens particularly when frequent usage patterns, which are based on compositional rules, are stored as such. The problem is how to maintain systematicity. This paper identifies this problem and proposes a solution in the form of multilevel alignment. Multilevel alignment means that the updating of preference scores is not restricted to the constructions that were used in the utterance but also extends downward and upward in the subsumption hierarchy.
Advances in Complex Systems 15(03n04):1250048, 2012
During the last decade, much attention has been paid to language competition in the complex systems community, that is, how the fractions of speakers of several competing languages evolve in time. In this paper, we review recent advances in this direction and focus on three aspects. First, we consider the shift from two-state models to three-state models that include the possibility of bilingual individuals. The understanding of the role played by bilingualism is essential in sociolinguistics. In particular, the question addressed is whether bilingualism facilitates the coexistence of languages. Second, we will analyze the effect of social interaction networks and physical barriers. Finally, we will show how to analyze the issue of bilingualism from a game theoretical perspective.
Advances in Complex Systems 15(03n04):1250051, 2012
In this position paper we discuss how language influences the mind by comparing robots that have language with robots that do not have language. Robots with language respond more adaptively to objects belonging to different categories and requiring different behaviors compared to robots without language, and it is possible to show that categories of objects are represented differently in the neural network which controls the behavior of the two types of robots. By exposing the robots to sounds which co-vary systematically with specific aspects of their experience, the robots can distinguish nouns from verbs and can respond appropriately to simple noun-verb sentences. Robots can also be used to show that, while all animals develop a mental (neural) model of their environment which incorporates the co-variations among different aspects of their experiences, human beings develop a more analytical and modular model because specific sounds co-vary with different aspects of their experiences and this may explain why human beings have a more articulated and creative behavioral repertoire.
Advances in Complex Systems 15(03n04):1250054, 2012
We investigate the directed and weighted complex network of free word associations in which players write a word in response to another word given as input. We analyze in detail two large datasets resulting from two very different experiments: on the one hand the massive multiplayer web-based Word Association Game known as Human Brain Cloud, and on the other hand the South Florida Free Association Norms experiment. In both cases, the networks of associations exhibit quite robust properties like the small world property, a slight assortativity and a strong asymmetry between in-degree and out-degree distributions. A particularly interesting result concerns the existence of a characteristic scale for the word association process, arguably related to specific conceptual contexts for each word. After mapping the Human Brain Cloud network onto the WordNet semantics network, we point out the basic cognitive mechanisms underlying word associations when they are represented as paths in an underlying semantic network. We derive in particular an expression describing the growth of the HBC graph and we highlight the existence of a characteristic scale for the word association process.
2010
Advances in Complex Systems 13(02):135-153, 2010
Written language is a complex communication signal capable of conveying information encoded in the form of ordered sequences of words. Beyond the local order ruled by grammar, semantic and thematic structures affect long-range patterns in word usage. Here ...
Advances in Complex Systems 13(04):469-482, 2010
This paper reports the results of a multi-agent simulation designed to study the emergence and evolution of symbolic communication. The novelty of this model is that it considers some interactional and spatial constraints to this process that have been disregarded by ...
2009
Advances in Complex Systems 12(3):371-392, 2009
Language development in children provides a window to understand the transition from protolanguage to language. Here we present the first analysis of the emergence of syntax in terms of complex networks. A previously unreported, sharp transition is shown to occur around two years of age from a (pre-syntactic) tree-like structure to a scale-free, small world syntax network. The development of these networks thus reveals a nonlinear dynamical pattern where the global topology of syntax graphs shifts from a hierarchical, tree-like pattern to a scale-free organization. Such a change seems difficult to explain under a self-organization framework. Instead, it supports the presence of some underlying innate component, as suggested early on by some authors.
2008
Advances in Complex Systems 11(3):357-369, 2008
An earlier study [24] concluded, based on computer simulations and some inferences from empirical data, that languages will change the more slowly the larger the population gets. We replicate this study using a more complete language model for simulations (the Schulze model combined with a Barabasi-Albert network) and a richer empirical dataset [12]. Our simulations show either a negligible or a strong dependence of language change on population sizes, depending on the parameter settings; while empirical data, like some of the simulations, show a negligible dependence.
Advances in Complex Systems 11(3):371-392, 2008
In this work, we attempt to capture patterns of co-occurrence across vowel systems and at the same time figure out the nature of the force leading to the emergence of such patterns. For this purpose we define a weighted network where the vowels are the nodes and an edge between two nodes (read vowels) signifies their co-occurrence likelihood over the vowel inventories. Through this network we identify communities of vowels, which essentially reflect their patterns of co-occurrence across languages. We observe that in the assortative vowel communities the constituent nodes (read vowels) are largely uncorrelated in terms of their features and show that they are formed based on the principle of maximal perceptual contrast. However, in the rest of the communities, strong correlations are reflected among the constituent vowels with respect to their features, indicating that it is the principle of feature economy that binds them together. We validate the above observations by proposing a quantitative measure of perceptual contrast as well as feature economy and subsequently comparing the results obtained due to these quantifications with those where we assume that the vowel inventories had evolved just by chance.
Advances in Complex Systems 11(3):393-414, 2008
In this paper, we propose a mathematical framework for studying word order optimization. The framework relies on the well-known positive correlation between cognitive cost and the Euclidean distance between the elements (e.g. words) involved in a syntactic link. We study the conditions under which a certain word order is more economical than an alternative word order by proposing a mathematical approach. We apply our methodology to two different cases: (a) the ordering of subject (S), verb (V) and object (O), and (b) the covering of a root word by a syntactic link. For the former, we find that SVO and its symmetric, OVS, are more economical than OSV, SOV, VOS and VSO at least 2/3 of the time. For the latter, we find that uncovering the root word is more economical than covering it at least 1/2 of the time. With the help of our framework, one can explain some Greenbergian universals. Our findings provide further theoretical support for the hypothesis that the limited resources of the brain introduce biases toward certain word orders. Our theoretical findings could inspire or illuminate future psycholinguistics or corpus linguistics studies.
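The distance-based cost idea in this abstract can be made concrete with a small worked sketch. It is a deliberate simplification, not the paper's framework: it assumes exactly two syntactic links (subject to verb, object to verb), places the three words at consecutive integer positions, and sums the absolute position differences, which is the Euclidean distance in one dimension. It does not reproduce the "at least 2/3 of the time" bound, which depends on the paper's own assumptions.

```python
from itertools import permutations

# Assumed dependency links: the verb is linked to the subject and to the object.
LINKS = [("S", "V"), ("O", "V")]


def total_link_length(order):
    """Sum of distances between linked words when words sit at positions 0, 1, 2."""
    position = {word: i for i, word in enumerate(order)}
    return sum(abs(position[a] - position[b]) for a, b in LINKS)


costs = {"".join(p): total_link_length(p) for p in permutations("SVO")}
for order, cost in sorted(costs.items(), key=lambda kv: (kv[1], kv[0])):
    print(order, cost)
```

Under these assumptions the two verb-medial orders (SVO, OVS) have total length 2 and the other four orders have total length 3, since placing the verb at an edge forces one link to span the whole sentence.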
Advances in Complex Systems 11(3):415-420, 2008
This is a reply to Ramon Ferrer-I-Cancho's paper in this issue "Some Word Order Biases from Limited Brain Resources: A Mathematical Approach." In this reply, I challenge the Euclidean distance model proposed in that paper by proposing a simple alternative model based on linear ordering.
Advances in Complex Systems 11(3):421-432, 2008
This article is a critical analysis of Michael Cysouw's comment "Linear Order as a Predictor of Word Order Regularities."
2006
Advances in Complex Systems 9(3):183-191, 2006
Our earlier language model is modified to allow for the survival of a minority language without higher status, just because of the pride of its speakers in their linguistic identity. An appendix studies the roughness of the interface for linguistic regions when one language conquers the whole territory.
2003
Advances in Complex Systems 6(4):537-558, 2003
Language arises from the interaction of three complex adaptive systems -- biological evolution, learning, and culture. We focus here on cultural evolution, and present an Iterated Learning Model of the emergence of compositionality, a fundamental structural property of language. Our main result is to show that the poverty of the stimulus available to language learners leads to a pressure for linguistic structure. When there is a bottleneck on cultural transmission, only a language which is generalizable from sparse input data is stable. Language itself evolves on a cultural time-scale, and compositionality is language's adaptation to stimulus poverty.
2002
Advances in Complex Systems 5(1):1-6, 2002
Random-text models have been proposed as an explanation for the power law relationship between word frequency and rank, the so-called Zipf's law. They are generally regarded as null hypotheses rather than models in the strict sense. In this context, recent theories of language emergence and evolution assume this law as a priori information with no need of explanation. Here, random texts and real texts are compared through (a) the so-called lexical spectrum and (b) the distribution of words having the same length. It is shown that real texts fill the lexical spectrum much more efficiently and regardless of the word length, suggesting that the meaningfulness of Zipf's law is high.
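For readers unfamiliar with random-text models, the following is a minimal sketch of one common variant, in which characters (including the space) are drawn independently at random and words are whatever falls between spaces. The alphabet, the `space_prob` value and the function name are assumptions for illustration; the sketch only produces the rank-frequency list and does not implement the paper's lexical-spectrum or word-length comparisons.

```python
import random
from collections import Counter


def random_text_rank_frequency(n_chars, alphabet="abcde", space_prob=0.2, seed=0):
    """Generate a random character stream and return word frequencies
    ordered by rank (most frequent first)."""
    rng = random.Random(seed)
    chars = [
        " " if rng.random() < space_prob else rng.choice(alphabet)
        for _ in range(n_chars)
    ]
    words = "".join(chars).split()  # split() drops empty words between repeated spaces
    counts = Counter(words)
    return [freq for _, freq in counts.most_common()]


freqs = random_text_rank_frequency(100_000)
for rank in (1, 10, 100):
    if rank <= len(freqs):
        print(rank, freqs[rank - 1])
```

Plotting `freqs` against rank on log-log axes is the usual way to inspect how closely such a text follows a power law before comparing it with a real corpus.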
1999
Advances in Complex Systems 1(4):301-323, 1999
The paper investigates the dynamical properties of spatially distributed naming games. Naming games are interactions between two agents, a speaker and a hearer, in which the speaker identifies an object using a name. Adaptive naming games imply that speaker and hearer update their lexicons to become better in future games. By engaging in adaptive naming games, a coherent shared vocabulary arises through self-organisation in a population of distributed agents. When the agents are spatially distributed, diversity can be shown to arise, and changes in population contact lead to language changes.
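The speaker/hearer interaction described here can be sketched with a much simpler, non-spatial variant usually called the minimal naming game. This is an assumption-laden illustration rather than the paper's model: there is a single object, agents hold plain sets of candidate names instead of scored lexicons, any two agents may interact, and the function name and parameters are hypothetical.

```python
import random


def minimal_naming_game(n_agents, n_games, seed=0):
    """Play n_games pairwise games about a single object.

    Speaker utters a name from its inventory (inventing one if empty).
    On success (hearer knows the name) both keep only that name;
    on failure the hearer adds the name to its inventory."""
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    invented = 0
    successes = 0
    for _ in range(n_games):
        speaker, hearer = rng.sample(range(n_agents), 2)
        if not inventories[speaker]:
            inventories[speaker].add(f"w{invented}")
            invented += 1
        # sorted() gives a reproducible order before the random choice
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:
            inventories[speaker] = {word}
            inventories[hearer] = {word}
            successes += 1
        else:
            inventories[hearer].add(word)
    return inventories, successes


inventories, successes = minimal_naming_game(n_agents=10, n_games=5000)
distinct = set().union(*inventories)
print(len(distinct), successes)
```

Restricting which pairs may interact, for example to neighbours on a grid, would be one way to explore the spatial effects the abstract refers to.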
Advances in Complex Systems 2(2):143-172, 1999
Animal behavior is often altruistic. In the frame of the theory of natural selection, altruism can only exist under specific conditions like kin selection or reciprocal cooperation. We show that reciprocal cooperation, which is generally invoked to explain non-kin altruism, requires very restrictive conditions to be evolutionarily stable. Some of these conditions are not met in many cases of altruism observed in nature. In search of another explanation of non-kin altruism, we consider Zahavi's theory of prestige. We extend it to propose a 'political' model of altruism. We give evidence showing that non-kin altruism can evolve in the context of inter-subgroup competition. Under such circumstances, altruistic behavior can be used by individuals to advertise their quality as efficient coalition members. In this model, only abilities which positively correlate with the subgroup success can evolve into altruistic behaviors.