Building a diverse document leads corpus annotated with semantic relations

Masatsugu Hangyo, Daisuke Kawahara, Sadao Kurohashi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

19 Citations (Scopus)

Abstract

Semantic analysis has been actively studied in natural language processing in recent years, and corpora with semantic annotations are essential to this research. Although such corpora exist for newspaper articles, documents come in many other genres and styles, which contain linguistic expressions not found in newspaper articles. In this paper, we build a diverse document leads corpus annotated with semantic relations. To reduce the workload of annotators and to annotate as wide a variety of documents as possible, we restrict the annotation target of each document to its first three sentences. We have completed a corpus of 1,000 documents and report its statistics.

Original language: English
Title of host publication: Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012
Pages: 535-544
Number of pages: 10
Publication status: Published - 2012
Externally published: Yes
Event: 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012 - Bali, Indonesia
Duration: 2012 Nov 7 → 2012 Nov 7

Conference

Conference: 26th Pacific Asia Conference on Language, Information and Computation, PACLIC 2012
Country/Territory: Indonesia
City: Bali
Period: 12/11/7 → 12/11/7

ASJC Scopus subject areas

  • Information Systems
  • Software
