Language Evolution and Computation Bibliography

Our site (www.isrl.uiuc.edu/amag/langev) has been retired; please use https://langev.com instead.
Daniel J. Hruschka
2018
Journal of Language Evolution 3(2):94-129, 2018
The increasing availability of large digital corpora of cross-linguistic data is revolutionizing many branches of linguistics. Overall, it has triggered a shift of attention from detailed questions about individual features to more global patterns amenable to rigorous, but statistical, analyses. This engenders an approach based on successive approximations where models with simplified assumptions result in frameworks that can then be systematically refined, always keeping explicit the methodological commitments and the assumed prior knowledge. Therefore, they can resolve disputes between competing frameworks quantitatively by separating the support provided by the data from the underlying assumptions. These methods, though, often appear as a ‘black box’ to traditional practitioners. In fact, the switch to a statistical view complicates comparison of the results from these newer methods with traditional understanding, sometimes leading to misinterpretation and overly broad claims. We describe here this evolving methodological shift, attributed to the advent of big, but often incomplete and poorly curated data, emphasizing the underlying similarity of the newer quantitative to the traditional comparative methods and discussing when and to what extent the former have advantages over the latter. In this review, we cover briefly both randomization tests for detecting patterns in a largely model-independent fashion and phylolinguistic methods for a more model-based analysis of these patterns. We foresee a fruitful division of labor between the ability to computationally process large volumes of data and the trained linguistic insight identifying worthy prior commitments and interesting hypotheses in need of comparison.
2015
Current Biology 25:1-9, 2015
BACKGROUND Concerted evolution is normally used to describe parallel changes at different sites in a genome, but it is also observed in languages where a specific phoneme changes to the same other phoneme in many words in the lexicon—a phenomenon known as regular sound change. We develop a general statistical model that can detect concerted changes in aligned sequence data and apply it to study regular sound changes in the Turkic language family. RESULTS Linguistic evolution, unlike the genetic substitutional process, is dominated by events of concerted evolutionary change. Our model identified more than 70 historical events of regular sound change that occurred throughout the evolution of the Turkic language family, while simultaneously inferring a dated phylogenetic tree. Including regular sound changes yielded an approximately 4-fold improvement in the characterization of linguistic change over a simpler model of sporadic change, improved phylogenetic inference, and returned more reliable and plausible dates for events on the phylogenies. The historical timings of the concerted changes closely follow a Poisson process model, and the sound transition networks derived from our model mirror linguistic expectations. CONCLUSIONS We demonstrate that a model with no prior knowledge of complex concerted or regular changes can nevertheless infer the historical timings and genealogical placements of events of concerted change from the signals left in contemporary data. Our model can be applied wherever discrete elements—such as genes, words, cultural trends, technologies, or morphological traits—can change in parallel within an organism or other evolving group.
2009
Trends in Cognitive Sciences 13(11):464-469, 2009
Studies of language change have begun to contribute to answering several pressing questions in cognitive sciences, including the origins of human language capacity, the social construction of cognition and the mechanisms underlying culture change in general. Here, we describe recent advances within a new emerging framework for the study of language change, one that models such change as an evolutionary process among competing linguistic variants. We argue that a crucial and unifying element of this framework is the use of probabilistic, data-driven models both to infer change and to compare competing claims about social and cognitive influences on language change.