TY - GEN
T1 - Cross-lingual knowledge projection using machine translation and target-side knowledge base completion
AU - Otani, Naoki
AU - Kiyomaru, Hirokazu
AU - Kawahara, Daisuke
AU - Kurohashi, Sadao
N1 - Funding Information:
This work was partially supported by
Publisher Copyright:
© 2018 COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings. All rights reserved.
PY - 2018
Y1 - 2018
N2 - Considerable effort has been devoted to building commonsense knowledge bases. However, they are not available in many languages because the construction of KBs is expensive. To bridge the gap between languages, this paper addresses the problem of projecting the knowledge in English, a resource-rich language, into other languages, where the main challenge lies in projection ambiguity. This ambiguity is partially solved by machine translation and target-side knowledge base completion, but neither of them is adequately reliable by itself. We show their combination can project English commonsense knowledge into Japanese and Chinese with high precision. Our method also achieves a top-10 accuracy of 90% on the crowdsourced English–Japanese benchmark. Furthermore, we use our method to obtain 18,747 facts of accurate Japanese commonsense within a very short period.
UR - http://www.scopus.com/inward/record.url?scp=85070861358&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85070861358&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85070861358
T3 - COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings
SP - 1508
EP - 1520
BT - COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings
A2 - Bender, Emily M.
A2 - Derczynski, Leon
A2 - Isabelle, Pierre
PB - Association for Computational Linguistics (ACL)
T2 - 27th International Conference on Computational Linguistics, COLING 2018
Y2 - 20 August 2018 through 26 August 2018
ER -