Language Evolution and Computation Bibliography

Our site (www.isrl.uiuc.edu/amag/langev) has been retired; please use https://langev.com instead.
Ricard V. Sole
2016
Philosophical Transactions of the Royal Society B: Biological Sciences 371(1701), 2016
Evolution is marked by well-defined events involving profound innovations that are known as 'major evolutionary transitions'. They involve the integration of autonomous elements into a new, higher-level organization whereby the formerly isolated units interact in novel ways, losing their original autonomy. All major transitions, which include the origin of life, cells, multicellular systems, societies or language (among other examples), took place millions of years ago. Are these transitions unique, rare events? Or do they have universal traits that make them almost inevitable once the right pieces are in place? Are there general laws of evolutionary innovation? To approach this problem from a novel perspective, we argue that a parallel class of evolutionary transitions can be studied through artificial evolutionary experiments in which alternative paths to innovation can be explored. These 'synthetic' transitions include, for example, the artificial evolution of multicellular systems or the emergence of language in evolved communicating robots. These alternative scenarios could help us to understand the underlying laws that predate the rise of major innovations and the possibility of general laws of evolved complexity. Several key examples and theoretical approaches are summarized and future challenges are outlined. This article is part of the themed issue 'The major synthetic evolutionary transitions'.
Philosophical Transactions of the Royal Society B: Biological Sciences 371(1701), 2016
The evolution of life in our biosphere has been marked by several major innovations. Such major complexity shifts range from the origin of cells, genetic codes or multicellularity to the emergence of non-genetic information, language or even consciousness. Understanding the nature and conditions for their rise and success is a major challenge for evolutionary biology. Along with data analysis, phylogenetic studies and dedicated experimental work, theoretical and computational studies are an essential part of this exploration. With the rise of synthetic biology, evolutionary robotics, artificial life and advanced simulations, novel perspectives on these problems have led to a rather interesting scenario, in which not only can the major transitions be studied or even reproduced, but new ones might potentially be identified. In both cases, transitions can be understood in terms of phase transitions, as defined in physics. Such a mapping (if correct) would help in defining a general framework for establishing a theory of major transitions, both natural and artificial. Here, we review some advances made at the crossroads between statistical physics, artificial life, synthetic biology and evolutionary robotics. This article is part of the themed issue 'The major synthetic evolutionary transitions'.
2011
Physical Review E 83(3):036115, 2011
Zipf’s law seems to be ubiquitous in human languages and appears to be a universal property of complex communicating systems. Following the early proposal made by Zipf concerning the presence of a tension between the efforts of speaker and hearer in a communication system, we introduce evolution by means of a variational approach to the problem based on Kullback’s Minimum Discrimination of Information Principle. Therefore, using a formalism fully embedded in the framework of information theory, we demonstrate that Zipf’s law is the only expected outcome of an evolving communicative system under a rigorous definition of the communicative tension described by Zipf.
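As a concrete anchor for the rank-frequency statement of Zipf's law, here is a minimal sketch (not the paper's variational machinery) that estimates the exponent α in f(r) ∝ r^(−α) from word counts; numpy is assumed, and the toy text is a placeholder for a real corpus, where α ≈ 1 would be expected.

```python
from collections import Counter
import numpy as np

def zipf_exponent(words):
    """Least-squares slope of log frequency versus log rank."""
    freqs = np.array(sorted(Counter(words).values(), reverse=True), dtype=float)
    ranks = np.arange(1, len(freqs) + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return -slope                     # Zipf's law predicts a value near 1

toy = "the quick brown fox jumps over the lazy dog the fox and the dog".split()
print(zipf_exponent(toy))
```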
2010
Evolution of Communication and Language in Embodied Agents, pages 83-101, 2010
The evolution of human language allowed the efficient propagation of nongenetic information, thus creating a new form of evolutionary change. Language development in children offers the opportunity to explore the emergence of such a complex communication system and provides a window onto the transition from protolanguage to language. Here we present the first analysis of the emergence of syntax in terms of complex networks. A previously unreported, sharp transition is shown to occur around two years of age, from a (pre-syntactic) tree-like structure to a scale-free, small-world syntax network. The observed combinatorial patterns provide valuable data for understanding the nature of the cognitive processes involved in the acquisition of syntax, introducing a new ingredient for understanding the possible biological endowment of human beings that results in the emergence of complex language. We explore this problem using a minimal, data-driven model that is able to capture several statistical traits, although some key features related to the emergence of syntactic complexity display important divergences.
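A minimal sketch of the kind of graph diagnostics behind the reported tree-to-small-world transition: trees have zero clustering, whereas small-world graphs combine short paths with nonzero clustering and heavy-tailed degrees. The networkx library and the toy edge list are assumptions standing in for real child-speech syntax data.

```python
# Toy syntax-network diagnostics; the edge list is illustrative, not real data.
import networkx as nx

edges = [("I", "want"), ("want", "ball"), ("I", "see"), ("see", "ball"),
         ("see", "dog"), ("dog", "ball"), ("want", "dog")]
G = nx.Graph(edges)

print("clustering:", nx.average_clustering(G))            # 0 for a pure tree
print("mean path length:", nx.average_shortest_path_length(G))
print("degrees:", dict(G.degree()))                       # heavy tail => hubs
```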
Complexity 15(6):20-26, 2010
Human language is the key evolutionary innovation that makes humans different from other species. And yet, the fabric of language is tangled, and all levels of description (from semantics to syntax) involve multiple layers of complexity. Recent work indicates that the global traits displayed by such levels can be analyzed in terms of networks of connected words. Here, we review the state of the art on language webs and their potential relevance to cognitive science. The emergence of syntax through language acquisition is used as a case study to illustrate how the approach can shed light on relevant questions concerning language organization and its evolution.
Journal of The Royal Society Interface 7(53):1647--1664, 2010
As noted early on by Charles Darwin, languages behave and change very much like living species. They display high diversity, differentiate in space and time, emerge and disappear. A large body of literature has explored the role of information exchanges and ...
2009
Advances in Complex Systems 12(3):371-392, 2009
Language development in children provides a window onto the transition from protolanguage to language. Here we present the first analysis of the emergence of syntax in terms of complex networks. A previously unreported, sharp transition is shown to occur around two years of age, from a (pre-syntactic) tree-like structure to a scale-free, small-world syntax network. The development of these networks thus reveals a nonlinear dynamical pattern in which the global topology of syntax graphs shifts from a hierarchical, tree-like pattern to a scale-free organization. Such a change seems difficult to explain within a self-organization framework. Instead, it actually supports the presence of some underlying innate component, as suggested early on by some authors.
2006
Journal of Theoretical Biology 241(2):438-441, 2006
Scaling laws in language evolution
Power Laws in the Social Sciences, 2006
The emergence of complex language is one of the fundamental hallmarks of human evolution. It shaped and constrained the emergence of social structures and makes us different from other animals. Beyond their differences, several remarkable features indicate the presence of fundamental principles of organization shared by all known languages. The best known is the so-called Zipf's law, which states that the frequency of a word decays as a (universal) power law of its rank. A different, but related, property of human language involves the architecture of word interactions. It has recently been shown that linguistic webs of different types display a global organization that is not very different from the ones observed in other natural and artificial complex networks, from the genome to the internet. In this chapter we explore the statistical features displayed by these seemingly universal laws and their possible origins. It is shown that fundamental principles of organization pervade the origin of power laws in human language and shape its evolutionary history.
2005
Language Networks: their structure, function and evolution
Trends in Cognitive Sciences, 2005
Several important recent advances in various sciences (particularly biology and physics) are based on complex network analysis, which provides tools for characterizing the statistical properties of networks and explaining how they may arise. This article examines the relevance of this trend for the study of human languages. We review some early efforts to build language networks, characterize their properties, and show in which directions models are being developed to explain them. These insights are relevant both for studying fundamental unsolved puzzles in cognitive science, in particular the origins and evolution of language, and for recent data-driven statistical approaches to natural language.
Nature 434:289, 2005
Human language is based on syntax, a complex set of rules about how words can be combined. In theory, the emergence of syntactic communication might have been a comparatively straightforward process.
2004
Physical Review E 69:051915, 2004
Many languages are spoken on Earth. Despite their diversity, many robust language universals are known to exist. All languages share syntax, i.e., the ability to combine words to form sentences. The origin of such traits is an issue of open debate. By using recent developments from the statistical physics of complex networks, we show that different syntactic dependency networks (from Czech, German, and Romanian) share many nontrivial statistical patterns, such as the small-world phenomenon, scaling in the distribution of degrees, and disassortative mixing. Such previously unreported features of syntax organization are not a trivial consequence of the structure of sentences, but an emergent trait at the global scale.
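A minimal sketch of the three statistics the abstract reports, computed with networkx on a Barabási-Albert graph that stands in for a syntactic dependency network (the treebank data themselves are not reproduced here); size and seed are illustrative.

```python
import networkx as nx

# Scale-free toy graph as a stand-in for a syntactic dependency network.
G = nx.barabasi_albert_graph(500, 2, seed=42)

print("mean shortest path:", nx.average_shortest_path_length(G))  # small world
print("clustering:", nx.average_clustering(G))
print("assortativity:", nx.degree_assortativity_coefficient(G))
# A negative coefficient indicates disassortative mixing:
# high-degree words tend to connect to low-degree words.
```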
2003
PNAS 100:788-791, 2003
The emergence of a complex language is one of the fundamental events of human evolution, and several remarkable features suggest the presence of fundamental principles of organization. These principles seem to be common to all languages. The best known is the so-called Zipf's law, which states that the frequency of a word decays as a (universal) power law of its rank. The possible origins of this law have been controversial, and its meaningfulness is still an open question. In this article, Zipf's early hypothesis of a principle of least effort for explaining the law is shown to be sound. Simultaneous minimization of the effort of both hearer and speaker is formalized with a simple optimization process operating on a binary matrix of signal-object associations. Zipf's law is found at the transition between referentially useless systems and indexical reference systems. Our finding strongly suggests that Zipf's law is a hallmark of symbolic reference and not a meaningless feature. The implications for the evolution of language are discussed. We explain how language evolution can take advantage of a communicative phase transition.
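A minimal sketch of a least-effort search of the kind the abstract describes. The cost Ω(λ) = λ·H(R|S) + (1−λ)·H(S), with the joint distribution taken proportional to the matrix entries, is one common formalization of the speaker-hearer tension and is an assumption here, as are the matrix size, λ and the iteration count.

```python
import numpy as np

rng = np.random.default_rng(0)
n_signals, n_objects, lam = 20, 20, 0.5

def omega(A):
    """Combined cost lam * H(R|S) + (1 - lam) * H(S) for a binary
    signal-object matrix A, with p(s, r) taken proportional to A."""
    p_sr = A / A.sum()
    p_s = p_sr.sum(axis=1)
    H_S = -np.sum(p_s[p_s > 0] * np.log2(p_s[p_s > 0]))
    H_SR = -np.sum(p_sr[p_sr > 0] * np.log2(p_sr[p_sr > 0]))
    return lam * (H_SR - H_S) + (1 - lam) * H_S   # H(R|S) = H(S,R) - H(S)

A = rng.integers(0, 2, (n_signals, n_objects))
best = omega(A)
for _ in range(5000):
    i, j = rng.integers(n_signals), rng.integers(n_objects)
    A[i, j] ^= 1                                  # propose flipping one link
    new = omega(A) if A.sum() > 0 else np.inf     # empty matrices are invalid
    if new > best:
        A[i, j] ^= 1                              # reject: revert the flip
    else:
        best = new
print("final cost:", best, "| active associations:", int(A.sum()))
```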
2002
Advances in Complex Systems 5(1):1-6, 2002
Random-text models have been proposed as an explanation for the power law relationship between word frequency and rank, the so-called Zipf's law. They are generally regarded as null hypotheses rather than models in the strict sense. In this context, recent theories of language emergence and evolution assume this law as a priori information with no need of explanation. Here, random texts and real texts are compared through (a) the so-called lexical spectrum and (b) the distribution of words having the same length. It is shown that real texts fill the lexical spectrum much more efficiently and regardless of the word length, suggesting that the meaningfulness of Zipf's law is high.
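A minimal sketch of the random-text null model being compared against: characters typed at random with a fixed space probability, with the lexical spectrum (the number of distinct words occurring exactly f times) computed from the result. Alphabet size, space probability and text length are illustrative assumptions.

```python
import random
from collections import Counter

random.seed(1)
alphabet = "abcdefghij"     # 10 letters; a space ends the current word
p_space = 0.18

def random_text(n_chars):
    chars = [" " if random.random() < p_space else random.choice(alphabet)
             for _ in range(n_chars)]
    return "".join(chars).split()

freq = Counter(random_text(200_000))       # word frequencies
spectrum = Counter(freq.values())          # lexical spectrum
for f in sorted(spectrum)[:5]:
    print(f"{spectrum[f]} distinct words occur exactly {f} times")
```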
2001
Proceedings of the Royal Society B: Biological Sciences 268(1482):2261-2265, 2001
Words in human language interact in sentences in non-random ways, allowing humans to construct an astronomic variety of sentences from a limited number of discrete units. This construction process is extremely fast and robust. The co-occurrence of words in sentences reflects language organization in a subtle manner that can be described in terms of a graph of word interactions. Here, we show that such graphs display two important features recently found in a disparate number of complex systems. (i) The so-called small-world effect: in particular, the average distance between two words, d (i.e. the average minimum number of links to be crossed from an arbitrary word to another), is shown to be d ≈ 2-3, even though the human brain can store many thousands of words. (ii) A scale-free distribution of degrees. The known pronounced effects of disconnecting the most connected vertices in such networks can be identified in some language disorders. These observations indicate some unexpected features of language organization that might reflect the evolutionary and social history of lexicons and the origins of their flexibility and combinatorial nature.
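A minimal sketch of the construction analysed here: adjacent words in each sentence are linked, and the average distance d is read off the resulting graph. networkx is assumed, and the toy sentences stand in for a large corpus (where d ≈ 2-3 is reported).

```python
import networkx as nx

sentences = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat saw the dog",
]

# Co-occurrence graph: link each word to its neighbour in the sentence.
G = nx.Graph()
for s in sentences:
    words = s.split()
    G.add_edges_from(zip(words, words[1:]))

print("average distance d:", nx.average_shortest_path_length(G))
print("largest degrees:", sorted(dict(G.degree()).values(), reverse=True)[:3])
```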
Journal of Quantitative Linguistics 8(3):165-173, 2001
Zipf's law states that the frequency of a word is a power function of its rank. The exponent of the power is usually accepted to be close to −1. Large deviations between the predicted and real number of different words in a text, disagreements between the predicted and real exponent of the probability density function, and statistics on a big corpus make it evident that word frequency as a function of rank follows two different exponents: approximately −1 for the first regime and approximately −2 for the second. The implications of the change in exponents for the metrics of texts and for the origins of complex lexicons are analyzed.
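A minimal sketch of the two-regime picture: the slope of log f versus log r is estimated separately below and above a breakpoint r_c. The synthetic frequencies are built to follow the two reported exponents and are illustrative, not corpus data.

```python
import numpy as np

r_c = 100                                   # illustrative breakpoint rank
ranks = np.arange(1, 10_001)
# f(r) ~ r^-1 up to r_c, then ~ r^-2 (continuous at r_c).
freqs = np.where(ranks <= r_c, 1.0 / ranks, r_c / ranks.astype(float) ** 2)

log_r, log_f = np.log(ranks), np.log(freqs)
head = ranks <= r_c
slope1, _ = np.polyfit(log_r[head], log_f[head], 1)
slope2, _ = np.polyfit(log_r[~head], log_f[~head], 1)
print(f"first regime: {slope1:.2f}, second regime: {slope2:.2f}")  # -1.00, -2.00
```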