Abstract
We estimate the frequency distribution of word usage in word sequences generated by capping from the number of "hits" returned when retrieving web pages, in order to evaluate a semantic structure that belongs not to a particular text but to a language as a whole. In particular, we compare English sequences with Japanese ones and find that, for English and for Japanese phonograms, the frequency of word usage against rank follows a power-law function with exponent 1, whereas for Japanese ideograms it follows a stretched exponential (Weibull distribution) function. We also argue that this difference can result from the difference between a phonogram-based language (English) and an ideogram-based language (Japanese).
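The two rank-frequency forms named in the abstract can be contrasted with a minimal sketch. The abstract does not give the paper's parameterization, so the function names and the values of `scale`, `r0`, and `beta` below are illustrative assumptions only; the one figure taken from the abstract is the power-law exponent of 1.

```python
import math

def zipf_frequency(rank, scale=1.0, exponent=1.0):
    """Power-law rank-frequency form: f(r) = scale * r**(-exponent)."""
    return scale * rank ** (-exponent)

def stretched_exp_frequency(rank, scale=1.0, r0=10.0, beta=0.5):
    """Stretched-exponential (Weibull-type) form: f(r) = scale * exp(-(r / r0)**beta).

    r0 and beta are hypothetical values chosen for illustration, not fitted ones.
    """
    return scale * math.exp(-((rank / r0) ** beta))

def loglog_slope(f, r1, r2):
    """Slope of log f(r) versus log r between ranks r1 and r2."""
    return (math.log(f(r2)) - math.log(f(r1))) / (math.log(r2) - math.log(r1))

# A power law has a constant log-log slope equal to -exponent.
zipf_slopes = [loglog_slope(zipf_frequency, r, 10 * r) for r in (1, 10, 100)]

# A stretched exponential has a log-log slope that steepens as rank grows.
weibull_slopes = [loglog_slope(stretched_exp_frequency, r, 10 * r) for r in (1, 10, 100)]

print("power-law slopes:", zipf_slopes)
print("stretched-exponential slopes:", weibull_slopes)
```

On a log-log plot the first form is a straight line, while the second bends downward, which is one simple way to tell the two behaviours apart in rank-frequency data.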
Original language | English |
---|---|
Pages (from-to) | 131-139 |
Number of pages | 9 |
Journal | BioSystems |
Volume | 73 |
Issue number | 2 |
Publication status | Published - 2004 Feb |
Externally published | Yes |
Keywords
- Ideogram
- Phonogram
- Weibull distribution
ASJC Scopus subject areas
- Statistics and Probability
- Modelling and Simulation
- Biochemistry, Genetics and Molecular Biology (all)
- Applied Mathematics