Constraint-based bilingual lexicon induction for closely related languages

Arbi Haza Nasution, Yohei Murakami, Toru Ishida

Research output: Conference contribution

11 citations (Scopus)

Abstract

The lack or absence of parallel and comparable corpora makes bilingual lexicon extraction a difficult task for low-resource languages. Pivot-language and cognate-recognition approaches have proven useful for inducing bilingual lexicons for such languages. We analyze the features of closely related languages and define a semantic constraint assumption. Based on this assumption, we propose a constraint-based bilingual lexicon induction method for closely related languages that extends the constraints and translation pair candidates of a recent pivot-language approach. We further define three constraint sets based on language characteristics. We conduct two controlled experiments. The former involves four closely related language pairs with different degrees of language-pair similarity, and the latter focuses on sense connectivity between non-pivot words and pivot words. We evaluate our results using the F-measure. The results indicate that our method works better with voluminous input dictionaries and highly similar languages. Finally, we introduce a strategy for choosing suitable constraint sets for different goals and language characteristics.

Original language: English
Title of host publication: Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016
Editors: Nicoletta Calzolari, Khalid Choukri, Helene Mazo, Asuncion Moreno, Thierry Declerck, Sara Goggi, Marko Grobelnik, Jan Odijk, Stelios Piperidis, Bente Maegaard, Joseph Mariani
Publisher: European Language Resources Association (ELRA)
Pages: 3291-3298
Number of pages: 8
ISBN (electronic): 9782951740891
Publication status: Published - 2016
Externally published: Yes
Event: 10th International Conference on Language Resources and Evaluation, LREC 2016 - Portoroz, Slovenia
Duration: 23 May 2016 to 28 May 2016

Publication series

Name: Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016

Other

Other: 10th International Conference on Language Resources and Evaluation, LREC 2016
Country/Territory: Slovenia
City: Portoroz
Period: 23 May 2016 to 28 May 2016

ASJC Scopus subject areas

  • Linguistics and Language
  • Library and Information Sciences
  • Language and Linguistics
  • Education
