William Croft
2018
Journal of Language Evolution 3(2):94-129, 2018
The increasing availability of large digital corpora of cross-linguistic data is revolutionizing many branches of linguistics. Overall, it has triggered a shift of attention from detailed questions about individual features to more global patterns amenable to rigorous, but statistical, analyses. This engenders an approach based on successive approximations where models with simplified assumptions result in frameworks that can then be systematically refined, always keeping explicit the methodological commitments and the assumed prior knowledge. Therefore, they can resolve disputes between competing frameworks quantitatively by separating the support provided by the data from the underlying assumptions. These methods, though, often appear as a ‘black box’ to traditional practitioners. In fact, the switch to a statistical view complicates comparison of the results from these newer methods with traditional understanding, sometimes leading to misinterpretation and overly broad claims. We describe here this evolving methodological shift, attributed to the advent of big, but often incomplete and poorly curated data, emphasizing the underlying similarity of the newer quantitative to the traditional comparative methods and discussing when and to what extent the former have advantages over the latter. In this review, we cover briefly both randomization tests for detecting patterns in a largely model-independent fashion and phylolinguistic methods for a more model-based analysis of these patterns. We foresee a fruitful division of labor between the ability to computationally process large volumes of data and the trained linguistic insight identifying worthy prior commitments and interesting hypotheses in need of comparison.
2016
PNAS 113(7):1766-71, 2016
How universal is human conceptual structure? The way concepts are organized in the human brain may reflect distinct features of cultural, historical, and environmental background in addition to properties universal to human cognition. Semantics, or meaning expressed through language, provides indirect access to the underlying conceptual structure, but meaning is notoriously difficult to measure, let alone parameterize. Here, we provide an empirical measure of semantic proximity between concepts using cross-linguistic dictionaries to translate words to and from languages carefully selected to be representative of worldwide diversity. These translations reveal cases where a particular language uses a single "polysemous" word to express multiple concepts that another language represents using distinct words. We use the frequency of such polysemies linking two concepts as a measure of their semantic proximity and represent the pattern of these linkages by a weighted network. This network is highly structured: Certain concepts are far more prone to polysemy than others, and naturally interpretable clusters of closely related concepts emerge. Statistical analysis of the polysemies observed in a subset of the basic vocabulary shows that these structural properties are consistent across different language groups, and largely independent of geography, environment, and the presence or absence of a literary tradition. The methods developed here can be applied to any semantic domain to reveal the extent to which its conceptual structure is, similarly, a universal attribute of human cognition and language use.
2012
S-curves and the mechanisms of propagation in language change
Language 88(2):269-304, 2012
A variety of mechanisms have been proposed in sociolinguistics for the propagation of an innovation through the speech community. The complexity of social systems makes it difficult to evaluate the different mechanisms empirically. We use the four-way typology of ...
2011
Greenbergian universals, diachrony, and statistical analyses
Linguistic Typology 15(2):433-453, 2011
In their article “Evolved structure of language shows lineage-specific trends in word order universals”, Dunn, Greenhill, Levinson, & Gray present evidence purporting to demonstrate that both Chomskyan and Greenbergian language universals are invalid. In particular, ...
2010
Adaptive Behavior 18(1):12-20, 2010
Using our interdisciplinary research collaboration as a case study, we discuss the question of whether formal modeling and empirical approaches can be successfully integrated into a single line of research. We argue that to avoid an undesirable disconnect between the two, one needs considerable time and patience for a science-humanities collaboration to bear fruit. In our collaboration and, we believe, in science-humanities collaborations in general, certain shared goals are required for success, including: starting with simple models before moving to more complex models; the importance of continually comparing models with empirical data where possible; and a focus on explaining statistical patterns rather than accounting for single data points individually.
2009
Language Variation and Change 21(2):257-296, 2009
Trudgill (2004) proposed that the emergence of New Zealand English, and of isolated new dialects generally, is purely deterministic. It can be explained solely in terms of the frequency of occurrence of particular variants and the frequency of interactions between different speakers in the society. Trudgill's theory is closely related to usage-based models of language, in which frequency plays a role in the representation of linguistic knowledge and in language change. Trudgill's theory also corresponds to a neutral evolution model of language change. We use a mathematical model based on Croft's usage-based evolutionary framework for language change (Baxter, Blythe, Croft, & McKane, 2006), and investigate whether Trudgill's theory is a plausible model of the emergence of new dialects. The results of our modeling indicate that determinism cannot be a sufficient mechanism for the emergence of a new dialect. Our approach illustrates the utility of mathematical modeling of theories and of empirical data for the study of language change.
Language Learning 59(s1):47-63, 2009
Language is a complex adaptive system: Speakers are agents who interact with each other, and their past and current interactions feed into speakers' future behavior in complex ways. In this article, we describe the social cognitive linguistic basis for this analysis of language and a mathematical model developed in collaboration between researchers in linguistics and statistical physics. The model has led us to posit two mechanisms of selection (neutral interactor selection and weighted interactor selection) in addition to neutral evolution and replicator selection (fitness). We describe current results in modeling language change in terms of neutral interactor selection and weighted interactor selection.
Language Learning 59(s1):1-26, 2009
Language has a fundamentally social function. Processes of human interaction along with domain-general cognitive processes shape the structure and knowledge of language. Recent research in the cognitive sciences has demonstrated that patterns of use strongly affect how language is acquired, is used, and changes. These processes are not independent of one another but are facets of the same complex adaptive system (CAS). Language as a CAS involves the following key features: The system consists of multiple agents (the speakers in the speech community) interacting with one another. The system is adaptive; that is, speakers' behavior is based on their past interactions, and current and past interactions together feed forward into future behavior. A speaker's behavior is the consequence of competing factors ranging from perceptual constraints to social motivations. The structures of language emerge from interrelated patterns of experience, social interaction, and cognitive mechanisms. The CAS approach reveals commonalities in many areas of language research, including first and second language acquisition, historical linguistics, psycholinguistics, language evolution, and computational modeling.
Trends in Cognitive Sciences 13(11):464-469, 2009
Studies of language change have begun to contribute to answering several pressing questions in cognitive sciences, including the origins of human language capacity, the social construction of cognition and the mechanisms underlying culture change in general. Here, we describe recent advances within a new emerging framework for the study of language change, one that models such change as an evolutionary process among competing linguistic variants. We argue that a crucial and unifying element of this framework is the use of probabilistic, data-driven models both to infer change and to compare competing claims about social and cognitive influences on language change.
2008
Evolutionary linguistics
Annual Review of Anthropology 37(1):219, 2008
Both qualitative concepts and quantitative methods from evolutionary biology have been applied to linguistics. Many linguists have noted the similarity between biological evolution and language change, but usually have employed only selective analogies or metaphors. ...
2006
Physical Review E 73:046118, 2006
We present a mathematical formulation of a theory of language change. The theory is evolutionary in nature and has close analogies with theories of population genetics. The mathematical structure we construct similarly has correspondences with the Fisher-Wright model of population genetics, but there are significant differences. The continuous time formulation of the model is expressed in terms of a Fokker-Planck equation. This equation is exactly soluble in the case of a single speaker and can be investigated analytically in the case of multiple speakers who communicate equally with all other speakers and give their utterances equal weight. Whilst the stationary properties of this system have much in common with the single-speaker case, time-dependent properties are richer. In the particular case where linguistic forms can become extinct, we find that the presence of many speakers causes a two-stage relaxation, the first being a common marginal distribution that persists for a long time as a consequence of ultimate extinction being due to rare fluctuations.
2002
Selection 3(1):75-91, 2002
Linguistics and evolutionary biology have substantially diverged until recently. The chief reason for this divergence was the dominance of essentialist thinking in linguistics during the twentieth century. Croft (2000) describes a thoroughgoing application of Hull's (1988) generalized theory of selection to language change. In this model, tokens of linguistic structure in utterances ('linguemes') are replicators and speakers are interactors. Current debates in the philosophy of evolutionary biology (e.g. Sterelny and Griffiths, 1999) are then applied to language change. Hull's generalized theory is post-synthesis: it recognizes a distinction between replicator and interactor and is independent of levels of biological organization. Biological issues such as mechanisms of inheritance (e.g. Lamarckism) and of selection (e.g. intentional behavior) are simply irrelevant to the generalized theory of selection outside biology. However, there are many striking parallels between biological evolution and language change that are likely to be consequences of the generalized theory of selection, including flexibility of adaptation to the environment, emergent structure, evolutionary conservatism, vestigial traits, exaptation, and the absence of "progress". The evolutionary theory of language change is not evolutionary psychology, but it is memetics; this approach is defended against Sterelny and Griffiths' criticisms.
2000