A framework for compiling high quality knowledge resources from raw corpora

Gongye Jin, Daisuke Kawahara, Sadao Kurohashi

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Citation (Scopus)

Abstract

The identification of various types of relations is a necessary step to allow computers to understand natural language text. In particular, the clarification of relations between predicates and their arguments is essential because predicate - argument structures convey most of the information in natural languages. To precisely capture these relations, wide-coverage knowledge resources are indispensable. Such knowledge resources can be derived from automatic parses of raw corpora, but unfortunately parsing still has not achieved a high enough performance for precise knowledge acquisition. We present a framework for compiling high quality knowledge resources from raw corpora. Our proposed framework selects high quality dependency relations from automatic parses and makes use of them for not only the calculation of fundamental distributional similarity but also the acquisition of knowledge such as case frames.

Original languageEnglish
Title of host publicationProceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014
EditorsNicoletta Calzolari, Khalid Choukri, Sara Goggi, Thierry Declerck, Joseph Mariani, Bente Maegaard, Asuncion Moreno, Jan Odijk, Helene Mazo, Stelios Piperidis, Hrafn Loftsson
PublisherEuropean Language Resources Association (ELRA)
Pages109-114
Number of pages6
ISBN (Electronic)9782951740884
Publication statusPublished - 2014
Externally publishedYes
Event9th International Conference on Language Resources and Evaluation, LREC 2014 - Reykjavik, Iceland
Duration: 2014 May 262014 May 31

Publication series

NameProceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014

Other

Other9th International Conference on Language Resources and Evaluation, LREC 2014
CountryIceland
CityReykjavik
Period14/5/2614/5/31

Keywords

  • Case frames
  • Dependency selection
  • Knowledge acquisition

ASJC Scopus subject areas

  • Linguistics and Language
  • Library and Information Sciences
  • Education
  • Language and Linguistics

Fingerprint Dive into the research topics of 'A framework for compiling high quality knowledge resources from raw corpora'. Together they form a unique fingerprint.

  • Cite this

    Jin, G., Kawahara, D., & Kurohashi, S. (2014). A framework for compiling high quality knowledge resources from raw corpora. In N. Calzolari, K. Choukri, S. Goggi, T. Declerck, J. Mariani, B. Maegaard, A. Moreno, J. Odijk, H. Mazo, S. Piperidis, & H. Loftsson (Eds.), Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014 (pp. 109-114). (Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014). European Language Resources Association (ELRA).