A framework for cross-language information access: Application to English and Japanese

Gareth Jones, Nigel Collier, Tetsuya Sakai, Kazuo Sumita, Hideki Hirakawa

Research output: Contribution to journal › Article

Abstract

Internet search engines allow access to online information from all over the world. However, there is currently a general assumption that users are fluent in the languages of all documents that they might search for. For historical reasons, this has usually meant a choice between English and the locally supported language. Given the rapidly growing size of the Internet, it is likely that future users will need to access information in languages in which they are not fluent or of which they have no knowledge at all. This paper shows how information retrieval and machine translation can be combined in a cross-language information access framework to help overcome the language barrier. We present encouraging preliminary experimental results using English queries to retrieve documents from the standard Japanese-language BMIR-J2 retrieval test collection. We outline the scope and purpose of cross-language information access and provide an example application to suggest that technology already exists to provide effective and potentially useful applications.

Original language: English
Pages (from-to): 371-388
Number of pages: 18
Journal: Language Resources and Evaluation
Volume: 35
Issue number: 4
Publication status: Published - 2001 Dec 1
Externally published: Yes

Keywords

  • Cross-language information retrieval
  • Information access
  • Japanese-English
  • Machine translation
  • Probabilistic retrieval

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Linguistics and Language
  • Library and Information Sciences
