Automatic detection and semi-automatic revision of non-machine-translatable parts of a sentence

Kiyotaka Uchimoto, Naoko Hayashida, Toru Ishida, Hitoshi Isahara

Research output: Contribution to conference (Paper)

3 Citations (Scopus)

Abstract

We developed a method for automatically distinguishing the machine-translatable and non-machine-translatable parts of a given sentence for a particular machine translation (MT) system. The two are distinguished by calculating, for each part of the sentence, the similarity between the source-language sentence and its back translation. Parts with low similarity are highly likely to be non-machine-translatable. We showed that the parts of a sentence automatically identified as non-machine-translatable provide useful information for paraphrasing or revising the sentence in the source language to improve the quality of the translation by the MT system. We also developed a method of providing knowledge useful for effectively paraphrasing or revising the detected non-machine-translatable parts. Two types of knowledge were extracted from the EDR dictionary: one for transforming a lexical entry into an expression used in its definition, and the other for the reverse paraphrasing, which transforms an expression found in a definition into the lexical entry. We found that the information provided by the methods helped improve the machine translatability of the originally input sentences.
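The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how a sentence is split into parts, which similarity measure is used, or what cutoff counts as "low", so the sketch assumes the caller supplies pre-split parts with their back translations, uses a word-overlap ratio from the standard library's difflib as a stand-in measure, and applies an arbitrary threshold of 0.5. The function names and the dictionary format are hypothetical, and no real MT system or EDR data is involved.

```python
from difflib import SequenceMatcher


def part_similarities(source_parts, back_translated_parts):
    """Score each source part against its back translation (0.0 to 1.0)."""
    if len(source_parts) != len(back_translated_parts):
        raise ValueError("each source part needs exactly one back translation")
    scores = []
    for src, back in zip(source_parts, back_translated_parts):
        # Word-level overlap ratio: an assumed stand-in for the paper's measure.
        matcher = SequenceMatcher(None, src.lower().split(), back.lower().split())
        scores.append(matcher.ratio())
    return scores


def flag_non_translatable(source_parts, back_translated_parts, threshold=0.5):
    """Return the parts whose similarity falls below the assumed threshold."""
    scores = part_similarities(source_parts, back_translated_parts)
    return [part for part, score in zip(source_parts, scores) if score < threshold]


def build_paraphrase_tables(definitions):
    """Build both paraphrase directions from a {lexical entry: definition} dict.

    The dict is a hypothetical simplification of dictionary-derived knowledge.
    """
    entry_to_definition = dict(definitions)
    definition_to_entry = {text: entry for entry, text in definitions.items()}
    return entry_to_definition, definition_to_entry
```

Given the parts "the committee", "kicked the bucket", "last week" and back translations in which only the middle part comes back changed, flag_non_translatable returns just that middle part. A person revising the sentence could then consult the two tables to replace a flagged word with its definition, or a long flagged expression with a single lexical entry, and run the check again.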

Original language: English
Pages: 703-708
Number of pages: 6
Publication status: Published - 2006 Jan 1
Event: 5th International Conference on Language Resources and Evaluation, LREC 2006 - Genoa, Italy
Duration: 2006 May 22 to 2006 May 28

Other

Other: 5th International Conference on Language Resources and Evaluation, LREC 2006
Country: Italy
City: Genoa
Period: 06/5/22 to 06/5/28

ASJC Scopus subject areas

  • Education
  • Library and Information Sciences
  • Linguistics and Language
  • Language and Linguistics

Cite this

Uchimoto, K., Hayashida, N., Ishida, T., & Isahara, H. (2006). Automatic detection and semi-automatic revision of non-machine-translatable parts of a sentence. 703-708. Paper presented at 5th International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy.