Language Evolution and Computation Bibliography

Our former site (www.isrl.uiuc.edu/amag/langev) has been retired; please use https://langev.com instead.
Journal :: Glottometrics
2007
Probability distribution of dependency distance
Glottometrics 15:1-12, 2007
This paper investigates probability distributions of dependency distances in six texts extracted from a Chinese dependency treebank. The fitting results reveal that the investigated distribution is well captured by the right-truncated Zeta distribution. In order to restrict the model to natural language only, two samples with randomly generated governors are investigated: one can be described, for example, by the Hyperpoisson distribution, while the other satisfies the Zeta distribution. The paper also examines sequential plots and mean dependency distances of the six texts under three analyses (one syntactic, two random). Of these three, the syntactic analysis yields the minimum mean dependency distance.
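To make the quantity being fitted concrete (this is an illustration of the standard definition, not the authors' code): the dependency distance of a word is the absolute difference between its linear position and that of its governor. The toy parse below is invented for illustration.

```python
from collections import Counter

def dependency_distances(governors):
    """governors[i] is the 1-based position of the governor of word i+1;
    0 marks the root, which has no dependency distance."""
    return [abs((i + 1) - g) for i, g in enumerate(governors) if g != 0]

# Toy six-word parse: word 2 is the root; the rest attach at varying distances.
governors = [2, 0, 2, 5, 3, 5]
distances = dependency_distances(governors)
print(distances)          # per-word distances
print(Counter(distances)) # empirical distance distribution
```

Fitting a right-truncated Zeta distribution to such empirical counts is then a separate estimation step.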
2005
Can simple models explain Zipf's law in all cases?
Glottometrics 11:1-8, 2005
H. Simon proposed a simple stochastic process for explaining Zipf's law for word frequencies. Here we introduce two similar generalizations of Simon's model that cover the same range of exponents as the standard Simon model. The mathematical approach followed minimizes the amount of mathematical background needed for deriving the exponent, compared to previous approaches to the standard Simon model. Reviewing what is known from other simple explanations of Zipf's law, we conclude that there is no single radically simple explanation covering the whole range of variation of the exponent of Zipf's law in humans. The meaningfulness of Zipf's law for word frequencies remains an open question.
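Simon's process itself is easy to simulate: at each step a new word type is introduced with probability alpha; otherwise an existing token is copied, so a type is repeated with probability proportional to its current frequency. A minimal sketch (the parameter values are illustrative, not taken from the paper):

```python
import random
from collections import Counter

def simon_model(n_tokens, alpha, rng):
    """Generate a text of n_tokens tokens by Simon's stochastic process."""
    text = [0]                  # start with a single token of the first type
    n_types = 1
    for _ in range(n_tokens - 1):
        if rng.random() < alpha:
            text.append(n_types)           # introduce a new word type
            n_types += 1
        else:
            text.append(rng.choice(text))  # copy an existing token
    return text

rng = random.Random(42)
text = simon_model(10_000, alpha=0.1, rng=rng)
freqs = sorted(Counter(text).values(), reverse=True)
# The resulting rank-frequency distribution is heavy-tailed; in the
# large-sample limit the frequency-spectrum exponent is 1 + 1/(1 - alpha).
print(freqs[:5])
```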
Hidden communication aspects in the exponent of Zipf's law
Glottometrics 11:98-119, 2005
Here we focus on communication systems following Zipf's law. We study the relationship between the properties of those communication systems and the exponent of the law. We describe the properties of communication systems using quantitative measures of the semantic vagueness and the cost of word use. We try to reduce the precision and the economy of a communication system to a function of the exponent of Zipf's law and the size of the communication system. Taking the exponent of the frequency spectrum, we show that semantic precision grows with the exponent whereas the cost of word use reaches a global minimum between 1.5 and 2 if the size of the communication system remains constant. We show that the exponent of Zipf's law is a key aspect for knowing about the number of stimuli handled by a communication system and determining which of two systems is less vague or less expensive. We argue that the ideal exponent of Zipf's law should be very slightly above 2.
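For readers who want to check an exponent empirically, a common (if crude) estimate is a least-squares fit on the log-log rank-frequency plot. The sketch below recovers the rank exponent from synthetic frequencies generated with a known exponent of 1 (the data are synthetic, not from the paper; more careful estimators, such as maximum likelihood, are preferred in practice):

```python
import math

def zipf_rank_exponent(freqs):
    """Negated least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(freqs, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
            / sum((x - mx) ** 2 for x in xs)
    return -slope

# Ideal Zipfian frequencies f(r) = C * r^(-1) for ranks 1..1000
freqs = [1000.0 / r for r in range(1, 1001)]
print(round(zipf_rank_exponent(freqs), 3))  # → 1.0
```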