Language Evolution and Computation Bibliography

Our site (www.isrl.uiuc.edu/amag/langev) has been retired; please use https://langev.com instead.
Chunyu Kit
2005
Unsupervised Lexical Learning As Inductive Inference via Compression
Language Acquisition, Change and Emergence: Essays in Evolutionary Linguistics, 2005
This paper presents a learning-via-compression approach to unsupervised acquisition of word forms with no a priori knowledge. Following the basic ideas in Solomonoff's theory of inductive inference and Rissanen's MDL framework, the learning is formulated as a process of inferring regularities, in the form of string patterns (i.e., words), from a given set of data. A segmentation algorithm is designed to segment each input utterance into a sequence of word candidates that gives an optimal sum of description length gain (DLG). The learning model has a lexical refinement module that uses this algorithm to derive finer-grained word candidates recursively until no further compression effect is available. Experimental results on an infant-directed speech corpus show that this approach reaches state-of-the-art performance in terms of precision and recall of both words and word boundaries.
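
The segmentation step described in the abstract can be illustrated with a small dynamic-programming sketch. This is not the paper's implementation: the function name `segment`, the `max_len` cap, and the `gain` table are all hypothetical, and the scores in the usage example are made-up numbers standing in for DLG values, whose actual formula is not given here. The sketch only shows how one might pick the split of an utterance that maximizes a sum of per-candidate scores.

```python
from typing import Dict, List, Tuple


def segment(utterance: str, gain: Dict[str, float],
            max_len: int = 8) -> Tuple[float, List[str]]:
    """Split `utterance` into chunks that maximize the summed score.

    `gain` maps candidate strings to a score (a stand-in for DLG; the
    values are assumed to be supplied by the caller). A single character
    missing from `gain` scores 0.0, so every input can be segmented.
    """
    n = len(utterance)
    best = [float("-inf")] * (n + 1)  # best[i]: top score for utterance[:i]
    back = [0] * (n + 1)              # back[i]: start of the last chunk
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            chunk = utterance[start:end]
            if chunk in gain:
                score = gain[chunk]
            elif end - start == 1:
                score = 0.0           # fallback: unscored single character
            else:
                continue              # unscored multi-character chunk
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start

    # Walk the back-pointers from the end to recover the chunks.
    words: List[str] = []
    pos = n
    while pos > 0:
        words.append(utterance[back[pos]:pos])
        pos = back[pos]
    words.reverse()
    return best[n], words


# Illustrative scores only, not values from the paper.
toy_gain = {"the": 3.0, "dog": 2.5, "thed": 1.0, "og": 0.5}
print(segment("thedog", toy_gain))
```

With these toy scores the split "the" + "dog" (total 5.5) beats "thed" + "og" (total 1.5). The recursive lexical refinement mentioned in the abstract is not covered by this sketch.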