L. Nakhleh

2006

Proceedings of the 12th Annual International Computing and Combinatorics Conference (COCOON 2006), pages 299-308, 2006

In a recent article, Nakhleh, Ringe and Warnow introduced perfect phylogenetic networks --a model of language evolution where languages do not evolve via clean speciation-- and formulated a set of problems for their accurate reconstruction. Their new methodology assumes networks, ...MORE ⇓

In a recent article, Nakhleh, Ringe and Warnow introduced perfect phylogenetic networks --a model of language evolution where languages do not evolve via clean speciation-- and formulated a set of problems for their accurate reconstruction. Their new methodology assumes networks, rather than trees, as the correct model to capture the evolutionary history of natural languages. They proved the NP-hardness of the problem of testing whether a network is a perfect phy- logenetic one for characters exhibiting at least three states, leaving open the case of binary characters, and gave a straightforward brute-force parameterized algorithm for the problem of running time O(3k n), where k is the number of bidirectional edges in the network and n is its size. In this paper, we first establish the NP-hardness of the binary case of the problem. Then we provide a more efficient parameterized algorithm for this case running in time O(2k n 2). The presented algorithm is very simple, and utilizes some structural results and elegant operations developed in this paper that can be useful on their own in the design of heuristic algorithms for the problem. The analysis phase of the algorithm is very elegant using amortized techniques to show that the upper bound on the running time of the algorithm is much tighter than the upper bound obtained under a conservative worst-case scenario assumption. Our results bear significant impact on reconstructing evolutionary histories of languages --particularly from phonological and morphological character data, most of which exhibit at most two states (i.e., are binary), as well as on the design and analysis of parameterized algorithms.

A Stochastic model of language evolution that incorporates homoplasy and borrowingPDF

Phylogenetic Methods and the Prehistory of Languages 7.0:75-, 2006

The inference of evolutionary history, whether in biology or in linguistics, is aided by a carefully considered model of the evolutionary process and a reconstruction method which is expected to produce a reasonably accurate estimation of the true evolutionary history when the ...MORE ⇓

The inference of evolutionary history, whether in biology or in linguistics, is aided by a carefully considered model of the evolutionary process and a reconstruction method which is expected to produce a reasonably accurate estimation of the true evolutionary history when the real data match the model assumptions and are of sufficient quantity. In molecular systematics (i.e., the inference of evolutionary histories from molecular data), much of the research effort has focused in two areas: first, the development of increasingly parameter rich models of molecular sequence evolution, and second, the development of increasingly sophisticated software tools and algorithms for reconstructing phylogenies under these models. The plethora of software for reconstructing phylogenies from molecular data is staggering. By comparison, much less has been done in historical linguistics in terms of developing statistical models of character evolution or reconstruction methods, suggesting that there is perhaps much to be gained by doing so. ...

2005

Transactions of the Philological Society 3(2):171-192, 2005

Researchers interested in the history of the Indo-European family of languages have used a variety of methods to estimate the phylogeny of the family, and have obtained widely differing results. In this paper we explore the reconstructions of the Indo- European phylogeny obtained ...MORE ⇓

Researchers interested in the history of the Indo-European family of languages have used a variety of methods to estimate the phylogeny of the family, and have obtained widely differing results. In this paper we explore the reconstructions of the Indo- European phylogeny obtained by using the major phylogeny estimation procedures on an existing database of 336 characters (including lexical, phonological, and morpho- logical characters) for 24 Indo-European languages. Our study finds that the different methods agree in part, but that there are also several striking differences. We dis- cuss the reasons for these differences, and make proposals with respect to phylogenetic reconstruction in historical linguistics.

2003

Reconstructing the evolutionary history of Indo-European languages using answer set programmingdoi.orgPDF

Proceedings of the 5th International Symposium on Practical Aspects of Declarative Languages (PADL 03), 2003

The evolutionary history of languages can be modeled as a tree, called a phylogeny, where the leaves represent the extant lan- guages, the internal vertices represent the ancestral languages, and the edges represent the genetic relations between the languages. Languages not only ...MORE ⇓

The evolutionary history of languages can be modeled as a tree, called a phylogeny, where the leaves represent the extant lan- guages, the internal vertices represent the ancestral languages, and the edges represent the genetic relations between the languages. Languages not only inherit characteristics from their ancestors but also sometimes borrow them from other languages. Such borrowings can be represented by additional non-tree edges. This paper addresses the problem of com- puting a small number of additional edges that turn a phylogeny into a 'perfect phylogenetic network'. To solve this problem, we use answer set programming, which represents a given computational problem as a logic program whose answer sets correspond to solutions. Using the answer set solver smodels, with some heuristics and optimization tech- niques, we have generated a few conjectures regarding the evolution of Indo-European languages.