Language Evolution and Computation Bibliography

Eric W. Holman
2011
Automated dating of the world's language families based on lexical similarity
Current Anthropology 52(6):841-875, 2011
This paper describes a computerized alternative to glottochronology for estimating elapsed time since parent languages diverged into daughter languages. The method, developed by the Automated Similarity Judgment Program (ASJP) consortium, is different from ...
2010
Entropy 12(4):844-858, 2010
The relationship between meanings of words and their sound shapes is to a large extent arbitrary, but it is well known that languages exhibit sound symbolism effects violating arbitrariness. Evidence for sound symbolism is typically anecdotal, however. Here we present a systematic approach. Using a selection of basic vocabulary in nearly one half of the world's languages we find commonalities among sound shapes for words referring to same concepts. These are interpreted as due to sound symbolism. Studying the effects of sound symbolism cross-linguistically is of key importance for the understanding of language evolution.
2009
Human Biology 81(2-3):259-274, 2009
Previous empirical studies of population size and language change have produced equivocal results. We therefore address the question with a new set of lexical data from nearly one half of the world's languages. We first show that relative population sizes of modern languages can be extrapolated to ancestral languages, albeit with diminishing accuracy, up to several thousand years into the past. We then test for an effect of population against the null hypothesis that the ultrametric inequality is satisfied by lexical distances among triples of related languages. The test shows mainly negligible effects of population, the exception being an apparently faster rate of change in the larger of two very closely related variants. A possible explanation for the exception may be the influence on emerging standard (or cross-regional) variants from speakers that shift from different dialects to the standard. Our results strongly indicate that the sizes of speaker populations do not in and of themselves determine rates of language change. Comparison of this empirical finding with previously published computer simulations suggests that the most plausible model for language change is one in which changes propagate at a local level in a type of network where the individuals have different degrees of connectivity.
2008
Advances in Complex Systems 11(3):357-369, 2008
An earlier study [24] concluded, based on computer simulations and some inferences from empirical data, that languages will change the more slowly the larger the population gets. We replicate this study using a more complete language model for simulations (the Schulze model combined with a Barabási-Albert network) and a richer empirical dataset [12]. Our simulations show either a negligible or a strong dependence of language change on population sizes, depending on the parameter settings; while empirical data, like some of the simulations, show a negligible dependence.
2007
Linguistic Typology 11(2):395-423, 2007
Modern linguistic typology is increasingly less concerned with what is possible in human languages (universals) and increasingly more with the question "what's where why?" (Bickel 2007). Moreover, as several recent papers in this journal show, typologists increasingly turn to quantitative approaches as a means to understanding typological distributions. In order to provide the quantitative study of typological distributions with a firm methodological foundation it is preferable to gain a grasp of simple facts before starting to ask the more complicated questions. In this article the only assumptions we make about languages are that (i) they may be partly described by a set of typological characteristics, each of which may either be found or not found in any given language; that (ii) languages may be genealogically related or not; and that (iii) languages are spoken in certain places. Given these minimal assumptions we can begin to ask how to express the differences and similarities among languages as functions of the geographical distances among them, whether different functions apply to genealogically related and unrelated languages, and whether it is possible to distinguish in some quantitative way between languages that are related and languages that are not, even when the languages in question are spoken at great distances from one another. Moreover, we may investigate the effects that factors such as ecology, migration, and rates of linguistic change or diffusion have on the degree of similarities among languages in cases where they are either related or unrelated. We will approach these questions from two perspectives. The first perspective is an empirical one, where observations primarily derive from analyses of the data of Haspelmath et al. (eds.) (2005). The second perspective is a computational one, where simulations are drawn upon to test the effects of different parameters on the development of structural linguistic diversity.
1996
Journal of Classification 13(1):27-56, 1996
Statistical analyses of a published phylogenetic classification of languages show some properties attributable to taxonomic methods and others that reflect the nature of linguistic evolution. The inferred phylogenetic tree is less well resolved and more asymmetric at the highest taxonomic ranks, where the tree is constructed mainly by phenetic methods. At lower ranks, where cladistic methods are more prevalent, the asymmetry of well resolved parts of the tree is consistent with a stochastic birth and death process in which languages originate and become extinct at constant rates, although poorly resolved parts of the tree are still more asymmetric than predicted. Other tests applied to a sample of historically recorded languages reveal substantial fluctuations in the rates of origination and extinction, with both rates temporarily reduced when languages enter the historical record. For languages in general, the average origination rate is estimated to be only slightly higher than the average extinction rate, which in turn corresponds to an average lifetime of about 500 years or less.