Generating transliteration rules for cross-language information retrieval from machine translation dictionaries

Tetsuya Sakai, Akira Kumano, Toshihiko Manabe

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

This paper describes a method for automatically converting existing English-Japanese and Japanese-English machine translation dictionaries into English-Japanese transliteration rules and Japanese-English back-transliteration rules for cross-language information retrieval. An existing English-katakana word alignment module, which is part of our own machine translation system, is exploited in generating probabilistic rewriting rules. If our system is allowed to output 15 candidate spellings, it successfully transliterates more than 75% of a set of out-of-vocabulary English words into katakana, and successfully back-transliterates more than 55% of a set of out-of-vocabulary katakana words into English. Moreover, our preliminary cross-language information retrieval experiments, which treat the candidate spellings as a group of synonyms, suggest that our methods can indeed compensate for the failure of machine translation in some cases.
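The candidate-generation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the rule table, its probabilities, and the names `TOY_RULES`, `candidate_spellings`, and `synonym_group` are all hypothetical, and the segmentation is a simple left-to-right search that scores each spelling by the product of the probabilities of the rules used. It assumes only what the abstract states, namely that rewriting rules are probabilistic, that up to 15 candidate spellings are output, and that the candidates are treated as a group of synonyms at retrieval time.

```python
from typing import Dict, List, Tuple

Rules = Dict[str, List[Tuple[str, float]]]

# Hypothetical toy rules (not taken from the paper): each English substring
# maps to katakana rewrites with an assumed probability.
TOY_RULES: Rules = {
    "ca": [("カ", 0.7), ("キャ", 0.3)],
    "me": [("メ", 1.0)],
    "ra": [("ラ", 0.8), ("ラー", 0.2)],
}


def candidate_spellings(word: str, rules: Rules, n: int = 15) -> List[Tuple[str, float]]:
    """Return up to n spellings, ranked by the product of rule probabilities."""
    # partial[i] maps each spelling of word[:i] to its best score so far.
    partial: List[Dict[str, float]] = [{} for _ in range(len(word) + 1)]
    partial[0][""] = 1.0
    for i in range(len(word)):
        # Keep only the n best partial spellings before extending them.
        beam = sorted(partial[i].items(), key=lambda kv: -kv[1])[:n]
        for src, rewrites in rules.items():
            if not word.startswith(src, i):
                continue
            j = i + len(src)
            for prefix, score in beam:
                for tgt, prob in rewrites:
                    spelling, new_score = prefix + tgt, score * prob
                    if new_score > partial[j].get(spelling, 0.0):
                        partial[j][spelling] = new_score
    # Only spellings that cover the whole word are returned.
    return sorted(partial[-1].items(), key=lambda kv: -kv[1])[:n]


def synonym_group(word: str, rules: Rules, n: int = 15) -> List[str]:
    """Candidate spellings without scores, to be searched as one synonym set."""
    return [spelling for spelling, _ in candidate_spellings(word, rules, n)]


if __name__ == "__main__":
    for spelling, score in candidate_spellings("camera", TOY_RULES):
        print(f"{spelling}\t{score:.2f}")
```

With these toy rules, `candidate_spellings("camera", TOY_RULES)` ranks カメラ first and キャメラ second, and `synonym_group` returns the same spellings without scores so that a retrieval system can match any of them. A word that the rules cannot fully cover yields an empty list.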

Original language: English
Title of host publication: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics
Editors: A. El Kamel, K. Mellouli, P. Borne
Pages: 290-295
Number of pages: 6
Volume: 6
Publication status: Published - 2002
Externally published: Yes
Event: 2002 IEEE International Conference on Systems, Man and Cybernetics - Yasmine Hammamet, Tunisia
Duration: 2002 Oct 6 to 2002 Oct 9

Other

Other: 2002 IEEE International Conference on Systems, Man and Cybernetics
Country: Tunisia
City: Yasmine Hammamet
Period: 02/10/6 to 02/10/9

Fingerprint

Query languages
Glossaries
Experiments

Keywords

  • Cross-language information retrieval
  • Katakana
  • Machine translation
  • Transliteration

ASJC Scopus subject areas

  • Hardware and Architecture
  • Control and Systems Engineering

Cite this

Sakai, T., Kumano, A., & Manabe, T. (2002). Generating transliteration rules for cross-language information retrieval from machine translation dictionaries. In A. El Kamel, K. Mellouli, & P. Borne (Eds.), Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (Vol. 6, pp. 290-295)

Generating transliteration rules for cross-language information retrieval from machine translation dictionaries. / Sakai, Tetsuya; Kumano, Akira; Manabe, Toshihiko.

Proceedings of the IEEE International Conference on Systems, Man and Cybernetics. ed. / A. El Kamel; K. Mellouli; P. Borne. Vol. 6 2002. p. 290-295.

Sakai, T, Kumano, A & Manabe, T 2002, Generating transliteration rules for cross-language information retrieval from machine translation dictionaries. in A El Kamel, K Mellouli & P Borne (eds), Proceedings of the IEEE International Conference on Systems, Man and Cybernetics. vol. 6, pp. 290-295, 2002 IEEE International Conference on Systems, Man and Cybernetics, Yasmine Hammamet, Tunisia, 02/10/6.
Sakai T, Kumano A, Manabe T. Generating transliteration rules for cross-language information retrieval from machine translation dictionaries. In El Kamel A, Mellouli K, Borne P, editors, Proceedings of the IEEE International Conference on Systems, Man and Cybernetics. Vol. 6. 2002. p. 290-295
Sakai, Tetsuya ; Kumano, Akira ; Manabe, Toshihiko. / Generating transliteration rules for cross-language information retrieval from machine translation dictionaries. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics. editor / A. El Kamel ; K. Mellouli ; P. Borne. Vol. 6 2002. pp. 290-295
@inproceedings{6b69ccae6d5a492d98dee6b128b703e8,
title = "Generating transliteration rules for cross-language information retrieval from machine translation dictionaries",
abstract = "This paper describes a method for automatically converting existing English-Japanese and Japanese-English machine translation dictionaries into English-Japanese transliteration rules and Japanese-English back-transliteration rules for cross-language information retrieval. An existing English-katakana word alignment module, which is part of our own machine translation system, is exploited in generating probabilistic rewriting rules. If our system is allowed to output 15 candidate spellings, it successfully transliterates more than 75{\%} of a set of out-of-vocabulary English words into katakana, and successfully back-transliterates more than 55{\%} of a set of out-of-vocabulary katakana words into English. Moreover, our preliminary cross-language information retrieval experiments, which treat the candidate spellings as a group of synonyms, suggest that our methods can indeed compensate for the failure of machine translation in some cases.",
keywords = "Cross-language information retrieval, Katakana, Machine translation, Transliteration",
author = "Tetsuya Sakai and Akira Kumano and Toshihiko Manabe",
year = "2002",
language = "English",
volume = "6",
pages = "290--295",
editor = "{El Kamel}, A. and K. Mellouli and P. Borne",
booktitle = "Proceedings of the IEEE International Conference on Systems, Man and Cybernetics",

}

TY - GEN

T1 - Generating transliteration rules for cross-language information retrieval from machine translation dictionaries

AU - Sakai, Tetsuya

AU - Kumano, Akira

AU - Manabe, Toshihiko

PY - 2002

Y1 - 2002

N2 - This paper describes a method for automatically converting existing English-Japanese and Japanese-English machine translation dictionaries into English-Japanese transliteration rules and Japanese-English back-transliteration rules for cross-language information retrieval. An existing English-katakana word alignment module, which is part of our own machine translation system, is exploited in generating probabilistic rewriting rules. If our system is allowed to output 15 candidate spellings, it successfully transliterates more than 75% of a set of out-of-vocabulary English words into katakana, and successfully back-transliterates more than 55% of a set of out-of-vocabulary katakana words into English. Moreover, our preliminary cross-language information retrieval experiments, which treat the candidate spellings as a group of synonyms, suggest that our methods can indeed compensate for the failure of machine translation in some cases.

KW - Cross-language information retrieval

KW - Katakana

KW - Machine translation

KW - Transliteration

UR - http://www.scopus.com/inward/record.url?scp=0036969353&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0036969353&partnerID=8YFLogxK

M3 - Conference contribution

VL - 6

SP - 290

EP - 295

BT - Proceedings of the IEEE International Conference on Systems, Man and Cybernetics

A2 - El Kamel, A.

A2 - Mellouli, K.

A2 - Borne, P.

ER -