The use of external text data in cross-language information retrieval based on machine translation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

This paper explores the use of an external (i.e. non-target) document collection in cross-language information retrieval (CLIR) based on machine translation (MT). In our CLIR and monolingual IR experiments using an external target language collection, we show that parallel pseudo-relevance feedback is comparable to collection enrichment. In our CLIR experiments using an external source language collection, we show that context-sensitive translation of pre-translation expansion terms outperforms word-by-word (or context-free) translation on average. Moreover, we show that the combination of context-sensitive translation with pseudo-relevance feedback significantly outperforms the corresponding context-free combination and the pseudo-relevance feedback component. Thus, context-sensitive translation for pre-translation expansion is probably superior to context-free translation.
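For readers unfamiliar with pseudo-relevance feedback, the following is a minimal sketch of the general idea only, not the method evaluated in the paper. The function names, the toy term-count scoring, and the sample documents are all assumptions made for illustration. The `docs` list stands in for whichever collection supplies the feedback documents, for example the target collection or an external one.

```python
from collections import Counter


def score(query_terms, doc_terms):
    """Toy relevance score: how often the query terms occur in the document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)


def expand_query(query_terms, docs, top_k=2, n_terms=2):
    """Pseudo-relevance feedback: assume the top_k ranked documents are
    relevant and add their most frequent non-query terms to the query."""
    ranked = sorted(docs, key=lambda d: score(query_terms, d), reverse=True)
    feedback = Counter()
    for doc in ranked[:top_k]:
        feedback.update(t for t in doc if t not in query_terms)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    candidates = sorted(feedback.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [t for t, _ in candidates[:n_terms]]


# Hypothetical, already-tokenized documents.
docs = [
    "machine translation retrieval query".split(),
    "machine translation retrieval feedback".split(),
    "pasta recipes".split(),
]
expanded = expand_query(["machine", "translation"], docs)
print(expanded)
```

In this toy run the two top-ranked documents contribute "retrieval" (seen twice) and then "feedback" (first alphabetically among the remaining ties) as expansion terms. A real system would use a proper ranking function and term-selection criterion in place of raw counts.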

Original language: English
Title of host publication: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics
Editors: A. El Kamel, K. Mellouli, P. Borne
Pages: 284-289
Number of pages: 6
Volume: 6
Publication status: Published - 2002
Externally published: Yes
Event: 2002 IEEE International Conference on Systems, Man and Cybernetics - Yasmine Hammamet, Tunisia
Duration: 2002 Oct 6 to 2002 Oct 9

Other

Other: 2002 IEEE International Conference on Systems, Man and Cybernetics
Country: Tunisia
City: Yasmine Hammamet
Period: 02/10/6 to 02/10/9

Fingerprint

Query languages
Feedback
Experiments

Keywords

  • Cross-language information retrieval
  • External document collections
  • Machine translation
  • Pseudo-relevance feedback

ASJC Scopus subject areas

  • Hardware and Architecture
  • Control and Systems Engineering

Cite this

Sakai, T. (2002). The use of external text data in cross-language information retrieval based on machine translation. In A. El Kamel, K. Mellouli, & P. Borne (Eds.), Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (Vol. 6, pp. 284-289)

The use of external text data in cross-language information retrieval based on machine translation. / Sakai, Tetsuya.

Proceedings of the IEEE International Conference on Systems, Man and Cybernetics. ed. / A. El Kamel; K. Mellouli; P. Borne. Vol. 6 2002. p. 284-289.


Sakai, T 2002, The use of external text data in cross-language information retrieval based on machine translation. in A El Kamel, K Mellouli & P Borne (eds), Proceedings of the IEEE International Conference on Systems, Man and Cybernetics. vol. 6, pp. 284-289, 2002 IEEE International Conference on Systems, Man and Cybernetics, Yasmine Hammamet, Tunisia, 02/10/6.
Sakai T. The use of external text data in cross-language information retrieval based on machine translation. In El Kamel A, Mellouli K, Borne P, editors, Proceedings of the IEEE International Conference on Systems, Man and Cybernetics. Vol. 6. 2002. p. 284-289
Sakai, Tetsuya. / The use of external text data in cross-language information retrieval based on machine translation. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics. editor / A. El Kamel ; K. Mellouli ; P. Borne. Vol. 6 2002. pp. 284-289
@inproceedings{59a9093d5e1f44d4bac5c3a25ff437b6,
title = "The use of external text data in cross-language information retrieval based on machine translation",
abstract = "This paper explores the use of an external (i.e. non-target) document collection in cross-language information retrieval (CLIR) based on machine translation (MT). In our CLIR and monolingual IR experiments using an external target language collection, we show that parallel pseudo-relevance feedback is comparable to collection enrichment. In our CLIR experiments using an external source language collection, we show that context-sensitive translation of pre-translation expansion terms outperforms word-by-word (or context-free) translation on average. Moreover, we show that the combination of context-sensitive translation with pseudo-relevance feedback significantly outperforms the corresponding context-free combination and the pseudo-relevance feedback component. Thus, context-sensitive translation for pre-translation expansion is probably superior to context-free translation.",
keywords = "Cross-language information retrieval, External document collections, Machine translation, Pseudo-relevance feedback",
author = "Tetsuya Sakai",
year = "2002",
language = "English",
volume = "6",
pages = "284--289",
editor = "{El Kamel}, A. and K. Mellouli and P. Borne",
booktitle = "Proceedings of the IEEE International Conference on Systems, Man and Cybernetics",

}

TY - GEN

T1 - The use of external text data in cross-language information retrieval based on machine translation

AU - Sakai, Tetsuya

PY - 2002

Y1 - 2002

N2 - This paper explores the use of an external (i.e. non-target) document collection in cross-language information retrieval (CLIR) based on machine translation (MT). In our CLIR and monolingual IR experiments using an external target language collection, we show that parallel pseudo-relevance feedback is comparable to collection enrichment. In our CLIR experiments using an external source language collection, we show that context-sensitive translation of pre-translation expansion terms outperforms word-by-word (or context-free) translation on average. Moreover, we show that the combination of context-sensitive translation with pseudo-relevance feedback significantly outperforms the corresponding context-free combination and the pseudo-relevance feedback component. Thus, context-sensitive translation for pre-translation expansion is probably superior to context-free translation.


KW - Cross-language information retrieval

KW - External document collections

KW - Machine translation

KW - Pseudo-relevance feedback

UR - http://www.scopus.com/inward/record.url?scp=0036968666&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0036968666&partnerID=8YFLogxK

M3 - Conference contribution

VL - 6

SP - 284

EP - 289

BT - Proceedings of the IEEE International Conference on Systems, Man and Cybernetics

A2 - El Kamel, A.

A2 - Mellouli, K.

A2 - Borne, P.

ER -