Shuigeng Zhou
2008
Physica A: Statistical Mechanics and its Applications 387(12):3039-3047, 2008
Chinese is spoken by the largest number of people in the world, and it is regarded as one of the most important languages. In this paper, we explore the statistical properties of Chinese language networks (CLNs) within the framework of complex network theory. Based on one of the largest Chinese corpora, the People's Daily Corpus, we construct two networks (CLN1 and CLN2) from two different perspectives, with Chinese words as nodes. In CLN1, a link exists between two nodes if they appear next to each other in at least one sentence; in CLN2, a link indicates that two nodes appear together in the same sentence. We show that both networks exhibit the small-world effect, a scale-free structure, hierarchical organization, and disassortative mixing. These results indicate that, in many topological respects, the Chinese language forms complex networks with organizing principles similar to those of previously studied language systems, which suggests that different languages may share some common characteristics in their evolution. We believe that our research may shed new light on the Chinese language and may have potentially significant implications.
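The two link rules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the sentences have already been segmented into words, treats links as undirected and unweighted, and uses hypothetical function names (`build_cln1`, `build_cln2`, `degrees`) and a two-sentence toy corpus rather than the People's Daily Corpus.

```python
from itertools import combinations


def build_cln1(sentences):
    """Adjacency rule: link two words that appear next to each other."""
    edges = set()
    for words in sentences:
        for a, b in zip(words, words[1:]):
            if a != b:  # skip self-loops from repeated words
                edges.add(frozenset((a, b)))
    return edges


def build_cln2(sentences):
    """Co-occurrence rule: link two words that share a sentence."""
    edges = set()
    for words in sentences:
        for a, b in combinations(set(words), 2):
            edges.add(frozenset((a, b)))
    return edges


def degrees(edges):
    """Count the number of links attached to each word."""
    deg = {}
    for edge in edges:
        for node in edge:
            deg[node] = deg.get(node, 0) + 1
    return deg


# Toy corpus (assumed pre-segmented), for illustration only.
sentences = [
    ["我", "爱", "北京"],
    ["我", "爱", "读", "书"],
]

cln1 = build_cln1(sentences)
cln2 = build_cln2(sentences)

print(len(cln1), len(cln2))  # number of links under each rule
print(degrees(cln1))         # per-word degree in the adjacency network
```

Under these assumptions every CLN1 link is also a CLN2 link, since adjacent words necessarily share a sentence, so CLN2 is the denser of the two networks.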