Generating transliteration rules for cross-language information retrieval from machine translation dictionaries

Tetsuya Sakai, Akira Kumano, Toshihiko Manabe

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

This paper describes a method for automatically converting existing English-Japanese and Japanese-English machine translation dictionaries into English-Japanese transliteration rules and Japanese-English back-transliteration rules for cross-language information retrieval. An existing English-katakana word alignment module, which is part of our own machine translation system, is exploited to generate probabilistic rewriting rules. When allowed to output 15 candidate spellings, our system successfully transliterates more than 75% of a set of out-of-vocabulary English words into katakana, and successfully back-transliterates more than 55% of a set of out-of-vocabulary katakana words into English. Moreover, our preliminary cross-language information retrieval experiments, which treat the candidate spellings as a group of synonyms, suggest that our methods can indeed compensate for the failure of machine translation in some cases.
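As a rough illustration of how probabilistic rewriting rules can yield a ranked list of candidate spellings, the following minimal Python sketch applies a small rule table to an English word and keeps the k most probable katakana outputs. It is not the authors' implementation: the rule table `RULES`, its probabilities, the function names `top_k` and `transliterate`, and the `max_src_len` parameter are all hypothetical, and the beam-pruned search is only one plausible way to enumerate candidates.

```python
import heapq

# Hypothetical toy rules: English substring -> [(katakana fragment, probability)].
# Real rules would be derived from aligned dictionary entries; these are made up.
RULES = {
    "da": [("デー", 0.7), ("ダ", 0.3)],
    "ta": [("タ", 0.9), ("ター", 0.1)],
}


def top_k(cands, k):
    """Keep only the k highest-scoring (output -> score) entries."""
    return dict(heapq.nlargest(k, cands.items(), key=lambda kv: kv[1]))


def transliterate(word, rules, k=15, max_src_len=3):
    """Return up to k (candidate, score) pairs, best first.

    beams[i] maps each partial output covering word[:i] to its score.
    Scores of different segmentations that give the same output are summed.
    """
    n = len(word)
    beams = [dict() for _ in range(n + 1)]
    beams[0][""] = 1.0
    for i in range(n):
        # All contributions to position i are complete here, so prune now.
        beams[i] = top_k(beams[i], k)
        for j in range(i + 1, min(n, i + max_src_len) + 1):
            for target, p in rules.get(word[i:j], []):
                for out, score in beams[i].items():
                    key = out + target
                    beams[j][key] = beams[j].get(key, 0.0) + score * p
    return heapq.nlargest(k, beams[n].items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    for spelling, score in transliterate("data", RULES, k=15):
        print(spelling, round(score, 3))
```

In a retrieval setting, the returned spellings could then be combined into a single synonym group in the query, in the spirit of the experiments described above; swapping the rule table for katakana-to-English rules would give the back-transliteration direction.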

Original language: English
Pages (from-to): 290-295
Number of pages: 6
Journal: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics
Volume: 6
Publication status: Published - 2002 Jan 1
Externally published: Yes

Keywords

  • Cross-language information retrieval
  • Katakana
  • Machine translation
  • Transliteration

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Hardware and Architecture
