TY - JOUR
T1 - A framework for cross-language information access
T2 - Application to English and Japanese
AU - Jones, Gareth
AU - Collier, Nigel
AU - Sakai, Tetsuya
AU - Sumita, Kazuo
AU - Hirakawa, Hideki
PY - 2001/11
Y1 - 2001/11
AB - Internet search engines allow access to online information from all over the world. However, there is currently a general assumption that users are fluent in the languages of all documents that they might search for. This has for historical reasons usually been a choice between English and the locally supported language. Given the rapidly growing size of the Internet, it is likely that future users will need to access information in languages in which they are not fluent or have no knowledge of at all. This paper shows how information retrieval and machine translation can be combined in a cross-language information access framework to help overcome the language barrier. We present encouraging preliminary experimental results using English queries to retrieve documents from the standard Japanese language BMIR-J2 retrieval test collection. We outline the scope and purpose of cross-language information access and provide an example application to suggest that technology already exists to provide effective and potentially useful applications.
KW - Cross-language information retrieval
KW - Information access
KW - Japanese-English
KW - Machine translation
KW - Probabilistic retrieval
UR - http://www.scopus.com/inward/record.url?scp=33750642500&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750642500&partnerID=8YFLogxK
U2 - 10.1023/A:1011851209975
DO - 10.1023/A:1011851209975
M3 - Article
AN - SCOPUS:33750642500
VL - 35
SP - 371
EP - 388
JO - Language Resources and Evaluation
JF - Language Resources and Evaluation
SN - 1574-020X
IS - 4
ER -