Zipf's law in phonograms and Weibull distribution in ideograms

Comparison of English with Japanese

Terutaka Nabeshima, Yukio Gunji

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

The frequency distribution of word usage in a word sequence generated by capping is estimated from the number of "hits" returned when retrieving web pages, in order to evaluate the structure of semantics proper not to a particular text but to a language. In particular, we compare the distributions of English sequences with those of Japanese ones and find that, for English and Japanese phonograms, the frequency of word usage against rank follows a power-law function with exponent 1, whereas for Japanese ideograms it follows a stretched exponential (Weibull distribution) function. We also discuss how such a difference can result from the difference between a phonogram-based language (English) and an ideogram-based language (Japanese).
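
As a rough illustration of the two rank-frequency forms named in the abstract, the sketch below evaluates a power law with exponent 1, f(r) = c / r, and one common parameterization of a stretched exponential, f(r) = c * exp(-(r / r0)^beta), and compares their slopes on a log-log scale. The function names and the parameters c, r0 and beta (including their values) are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def zipf(rank, c=1.0, exponent=1.0):
    """Power-law rank-frequency form: f(r) = c * r^(-exponent)."""
    return c * rank ** (-exponent)


def stretched_exponential(rank, c=1.0, r0=10.0, beta=0.5):
    """Stretched-exponential form: f(r) = c * exp(-(r / r0)^beta).

    r0 and beta are illustrative values, not fitted to any data.
    """
    return c * math.exp(-((rank / r0) ** beta))


def loglog_slope(f, r1, r2):
    """Slope of log f(r) against log r between ranks r1 and r2."""
    return (math.log(f(r2)) - math.log(f(r1))) / (math.log(r2) - math.log(r1))


if __name__ == "__main__":
    # A power law has the same log-log slope in every rank window;
    # a stretched exponential gets steeper as the rank grows.
    for r1, r2 in [(1, 10), (10, 100), (100, 1000)]:
        print(
            f"ranks {r1}-{r2}: "
            f"power-law slope = {loglog_slope(zipf, r1, r2):.3f}, "
            f"stretched-exponential slope = "
            f"{loglog_slope(stretched_exponential, r1, r2):.3f}"
        )
```

On a log-log plot the first form is a straight line with slope -1, while the second bends downward, which is the qualitative difference the abstract reports between phonogram and ideogram sequences.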

Original language: English
Pages (from-to): 131-139
Number of pages: 9
Journal: BioSystems
Volume: 73
Issue number: 2
DOI: 10.1016/j.biosystems.2003.11.002
Publication status: Published - 2004 Feb
Externally published: Yes

Fingerprint

Zipf's law
Weibull distribution
Weibull statistics
Distribution functions
Websites
Language
Semantics
Exponential distribution
Hits
Power Law
Retrieval
capping
Exponent
Evaluate
comparison
distribution
Text

Keywords

  • Ideogram
  • Phonogram
  • Weibull distribution

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Biotechnology
  • Drug Discovery

Cite this

Zipf's law in phonograms and Weibull distribution in ideograms: Comparison of English with Japanese. / Nabeshima, Terutaka; Gunji, Yukio.

In: BioSystems, Vol. 73, No. 2, 02.2004, p. 131-139.

@article{fb1d25fdfb3f44358ad5687c276774f7,
title = "Zipf's law in phonograms and Weibull distribution in ideograms: Comparison of English with Japanese",
abstract = "The frequency distribution of word usage in a word sequence generated by capping is estimated from the number of {"}hits{"} returned when retrieving web pages, in order to evaluate the structure of semantics proper not to a particular text but to a language. In particular, we compare the distributions of English sequences with those of Japanese ones and find that, for English and Japanese phonograms, the frequency of word usage against rank follows a power-law function with exponent 1, whereas for Japanese ideograms it follows a stretched exponential (Weibull distribution) function. We also discuss how such a difference can result from the difference between a phonogram-based language (English) and an ideogram-based language (Japanese).",
keywords = "Ideogram, Phonogram, Weibull distribution",
author = "Terutaka Nabeshima and Yukio Gunji",
year = "2004",
month = "2",
doi = "10.1016/j.biosystems.2003.11.002",
language = "English",
volume = "73",
pages = "131--139",
journal = "BioSystems",
issn = "0303-2647",
publisher = "Elsevier Ireland Ltd",
number = "2",
}

TY  - JOUR
T1  - Zipf's law in phonograms and Weibull distribution in ideograms
T2  - Comparison of English with Japanese
AU  - Nabeshima, Terutaka
AU  - Gunji, Yukio
PY  - 2004/2
Y1  - 2004/2
N2  - The frequency distribution of word usage in a word sequence generated by capping is estimated from the number of "hits" returned when retrieving web pages, in order to evaluate the structure of semantics proper not to a particular text but to a language. In particular, we compare the distributions of English sequences with those of Japanese ones and find that, for English and Japanese phonograms, the frequency of word usage against rank follows a power-law function with exponent 1, whereas for Japanese ideograms it follows a stretched exponential (Weibull distribution) function. We also discuss how such a difference can result from the difference between a phonogram-based language (English) and an ideogram-based language (Japanese).
KW  - Ideogram
KW  - Phonogram
KW  - Weibull distribution
UR  - http://www.scopus.com/inward/record.url?scp=0842328616&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0842328616&partnerID=8YFLogxK
U2  - 10.1016/j.biosystems.2003.11.002
DO  - 10.1016/j.biosystems.2003.11.002
M3  - Article
VL  - 73
SP  - 131
EP  - 139
JO  - BioSystems
JF  - BioSystems
SN  - 0303-2647
IS  - 2
ER  -