Transformation-based Khmer Part-of-Speech tagger

Chenda Nou, Wataru Kameyama

Research output: Conference contribution

2 Citations (Scopus)

Abstract

This paper presents initial research on a Khmer Part-of-Speech (POS) tagger built on the transformation-based approach. Because little research exists on natural language processing for Khmer, many pre-processing tasks are needed before automatic tagging can take place. The first Khmer annotated corpus is tagged with 27 tags based on traditional and modern grammar theories. The learner, based on the learning algorithm introduced by Brill [2], is built with 32 transformation templates. After applying the transformation rules with our ranking algorithm, the tagging error rate on trained and untrained data is reduced by around 41% and 18%, respectively, relative to the baseline. The experimental results are encouraging; however, future work is identified to further improve the accuracy and performance of the tagger.
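The transformation-based learning loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single hypothetical template ("change tag A to B when the previous tag is Z") instead of the paper's 32, it selects rules purely by error reduction rather than by the paper's ranking algorithm, and it uses a toy English sample (`TRAIN`) because the Khmer corpus and its 27-tag set are not reproduced here. All function names are illustrative.

```python
from collections import Counter, defaultdict

# Toy training data (hypothetical, English): lists of (word, gold_tag) pairs.
TRAIN = [
    [("the", "DT"), ("can", "NN")],
    [("I", "PRP"), ("can", "MD"), ("run", "VB")],
    [("you", "PRP"), ("can", "MD"), ("go", "VB")],
    [("a", "DT"), ("can", "NN"), ("fell", "VBD")],
    [("they", "PRP"), ("can", "MD"), ("see", "VB")],
]

def baseline_tagger(train):
    """Baseline: map each word to its most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for sent in train:
        for word, tag in sent:
            counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def apply_rule(tags, rule):
    """Apply one rule (from_tag, to_tag, prev_tag) left to right.

    Changes take effect immediately, so a later position sees earlier rewrites.
    """
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(out)):
        if out[i] == from_tag and out[i - 1] == prev_tag:
            out[i] = to_tag
    return out

def count_errors(current, gold):
    """Count positions where the current tags disagree with the gold tags."""
    return sum(c != g for cs, gs in zip(current, gold) for c, g in zip(cs, gs))

def learn_rules(train, max_rules=10):
    """Greedily learn rules: at each step keep the rule that removes the most errors."""
    lexicon = baseline_tagger(train)
    gold = [[t for _, t in s] for s in train]
    current = [[lexicon[w] for w, _ in s] for s in train]
    rules = []
    for _ in range(max_rules):
        # Instantiate the single template at every position that is still wrong.
        candidates = {
            (cs[i], gs[i], cs[i - 1])
            for cs, gs in zip(current, gold)
            for i in range(1, len(cs))
            if cs[i] != gs[i]
        }
        best_rule, best_errors = None, count_errors(current, gold)
        for rule in sorted(candidates):
            errors = count_errors([apply_rule(cs, rule) for cs in current], gold)
            if errors < best_errors:
                best_rule, best_errors = rule, errors
        if best_rule is None:  # no candidate reduces the error count
            break
        rules.append(best_rule)
        current = [apply_rule(cs, best_rule) for cs in current]
    return lexicon, rules

def tag(words, lexicon, rules, default="NN"):
    """Tag new text: baseline lookup, then the learned rules in learned order."""
    tags = [lexicon.get(w, default) for w in words]
    for rule in rules:
        tags = apply_rule(tags, rule)
    return tags

lexicon, rules = learn_rules(TRAIN)
print(rules)  # [('MD', 'NN', 'DT')]
print(tag(["the", "can"], lexicon, rules))  # ['DT', 'NN']
```

On this toy sample the baseline tags every "can" as MD, and the single learned rule corrects the two noun uses that follow a determiner. A fuller system would add more templates and a fallback strategy for unknown words.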

Original language: English
Title of host publication: Proceedings of the 2007 International Conference on Artificial Intelligence, ICAI 2007
Pages: 581-587
Number of pages: 7
Publication status: Published - Dec 1, 2007
Event: 2007 International Conference on Artificial Intelligence, ICAI 2007 - Las Vegas, NV, United States
Duration: Jun 25, 2007 to Jun 28, 2007

Publication series

Name: Proceedings of the 2007 International Conference on Artificial Intelligence, ICAI 2007
Volume: 2

Conference

Conference: 2007 International Conference on Artificial Intelligence, ICAI 2007
United States
Las Vegas, NV
Period: Jun 25, 2007 to Jun 28, 2007

ASJC Scopus subject areas

  • Artificial Intelligence

  • Cite this

    Nou, C., & Kameyama, W. (2007). Transformation-based Khmer Part-of-Speech tagger. In Proceedings of the 2007 International Conference on Artificial Intelligence, ICAI 2007 (pp. 581-587). (Proceedings of the 2007 International Conference on Artificial Intelligence, ICAI 2007; Vol. 2).