Chinese word segmentation algorithm based on pair coding

Bingyi Zhang, Bo Wei, Jiancheng Chen, Jie Wei, Guozheng Rao

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

To improve the speed and storage efficiency of Chinese word segmentation, this paper proposes a characteristic matching algorithm based on pair coding. The characteristic value is extracted from Chinese character positions. Because the method supports fuzzy matching and does not need to match multi-character Chinese words, the characteristic value is extracted from the positions of adjacent Chinese characters. In addition, data compression reduces storage space and improves the performance of Chinese word segmentation.
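The abstract does not give the encoding or the matching procedure, so the following is only a minimal sketch of one way pair-based matching could work, not the authors' implementation. It assumes that each pair of adjacent characters inside a lexicon word is packed into a single integer and stored in a hash set, and that a word boundary is placed wherever an adjacent pair in the input is absent from that set. The names `pair_code`, `build_pair_table`, and `segment`, and the 21-bit packing, are hypothetical choices made for illustration.

```python
from typing import Iterable, List, Set


def pair_code(a: str, b: str) -> int:
    """Pack two adjacent characters into one integer (hypothetical encoding).

    Unicode code points fit in 21 bits, so shifting the first one by 21
    keeps the two halves from overlapping.
    """
    return (ord(a) << 21) | ord(b)


def build_pair_table(lexicon: Iterable[str]) -> Set[int]:
    """Collect the codes of all adjacent character pairs found in lexicon words."""
    table: Set[int] = set()
    for word in lexicon:
        for a, b in zip(word, word[1:]):
            table.add(pair_code(a, b))
    return table


def segment(text: str, table: Set[int]) -> List[str]:
    """Split text wherever an adjacent pair is missing from the pair table."""
    if not text:
        return []
    words: List[str] = []
    current = text[0]
    for a, b in zip(text, text[1:]):
        if pair_code(a, b) in table:
            current += b           # pair is known: extend the current word
        else:
            words.append(current)  # pair is unknown: close the word here
            current = b
    words.append(current)
    return words


if __name__ == "__main__":
    table = build_pair_table(["中文", "分词", "算法"])
    print(segment("中文分词算法", table))
```

Under these assumptions only pairs are stored, so a word of any length is recognized without being looked up as a whole. The trade-off is that chained pairs can join into a string that is not itself in the lexicon, which is one loose reading of the fuzzy matching mentioned in the abstract.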

Original language: English
Pages (from-to): 526-530
Number of pages: 5
Journal: Nanjing Li Gong Daxue Xuebao/Journal of Nanjing University of Science and Technology
Volume: 38
Issue number: 4
Publication status: Published - 2014 Aug 30
Externally published: Yes

Keywords

  • Characteristic matching
  • Characteristic value
  • Chinese word segmentation
  • Data compression
  • Fuzzy matching
  • Hash
  • Pair coding

ASJC Scopus subject areas

  • Engineering(all)
