Rapid development of a corpus with discourse annotations using two-stage crowdsourcing

Daisuke Kawahara, Yuichiro Machida, Tomohide Shibata, Sadao Kurohashi, Hayato Kobayashi, Manabu Sassano

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

11 Citations (Scopus)

Abstract

We present a novel approach for rapidly developing a corpus with discourse annotations using crowdsourcing. Although discourse annotations typically require much time and cost owing to their complex nature, we realize discourse annotations in an extremely short time while retaining good quality of the annotations by crowdsourcing two annotation subtasks. In fact, our experiment to create a corpus comprising 30,000 Japanese sentences took less than eight hours to run. Based on this corpus, we also develop a supervised discourse parser and evaluate its performance to verify the usefulness of the acquired corpus.

Original language: English
Title of host publication: COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of COLING 2014
Subtitle of host publication: Technical Papers
Publisher: Association for Computational Linguistics, ACL Anthology
Pages: 269-278
Number of pages: 10
ISBN (Electronic): 9781941643266
Publication status: Published - 2014
Externally published: Yes
Event: 25th International Conference on Computational Linguistics, COLING 2014 - Dublin, Ireland
Duration: 2014 Aug 23 to 2014 Aug 29

Publication series

Name: COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of COLING 2014: Technical Papers

Conference

Conference: 25th International Conference on Computational Linguistics, COLING 2014
Country: Ireland
City: Dublin
Period: 14/8/23 to 14/8/29

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

  • Cite this

    Kawahara, D., Machida, Y., Shibata, T., Kurohashi, S., Kobayashi, H., & Sassano, M. (2014). Rapid development of a corpus with discourse annotations using two-stage crowdsourcing. In COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of COLING 2014: Technical Papers (pp. 269-278). (COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of COLING 2014: Technical Papers). Association for Computational Linguistics, ACL Anthology.