A framework for cross-language information access: Application to English and Japanese

Gareth Jones, Nigel Collier, Tetsuya Sakai, Kazuo Sumita, Hideki Hirakawa

Research output: Contribution to journal › Article › peer-review

Abstract

Internet search engines allow access to online information from all over the world. However, there is currently a general assumption that users are fluent in the languages of all documents that they might search for. This has for historical reasons usually been a choice between English and the locally supported language. Given the rapidly growing size of the Internet, it is likely that future users will need to access information in languages in which they are not fluent or have no knowledge of at all. This paper shows how information retrieval and machine translation can be combined in a cross-language information access framework to help overcome the language barrier. We present encouraging preliminary experimental results using English queries to retrieve documents from the standard Japanese language BMIR-J2 retrieval test collection. We outline the scope and purpose of cross-language information access and provide an example application to suggest that technology already exists to provide effective and potentially useful applications.

Original language: English
Pages (from-to): 371-388
Number of pages: 18
Journal: Computers and the Humanities
Volume: 35
Issue number: 4
Publication status: Published - 2001 Nov
Externally published: Yes

Keywords

  • Cross-language information retrieval
  • Information access
  • Japanese-English
  • Machine translation
  • Probabilistic retrieval

ASJC Scopus subject areas

  • Social Sciences (all)
