Large-scale AMR Corpus with Re-generated Sentences: Domain Adaptive Pre-training on ACL Anthology Corpus

Ming Zhao, Yaling Wang, Yves Lepage

Research output

Abstract

Abstract Meaning Representation (AMR) is a broad-coverage formalism for capturing the semantics of a given sentence. However, domain adaptation of AMR is limited by the shortage of annotated AMR graphs. In this paper, we explore and build a new large-scale dataset with 2.3 million AMRs in the domain of academic writing. Additionally, we prove that 30% of them are of similar quality as the annotated data in the downstream AMR-to-text task. Our results outperform previous graph-based approaches by over 11 BLEU points. We provide a pipeline that integrates automated generation and evaluation. This can help explore other AMR benchmarks.

Original language: English
Title of host publication: Proceedings - ICACSIS 2022
Subtitle of host publication: 14th International Conference on Advanced Computer Science and Information Systems
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 19-24
Number of pages: 6
ISBN (electronic): 9781665489362
DOI
Publication status: Published - 2022
Event: 14th International Conference on Advanced Computer Science and Information Systems, ICACSIS 2022 - Virtual, Online, Indonesia
Duration: 1 Oct 2022 to 3 Oct 2022

Publication series

Name: Proceedings - ICACSIS 2022: 14th International Conference on Advanced Computer Science and Information Systems

Conference

Conference: 14th International Conference on Advanced Computer Science and Information Systems, ICACSIS 2022
Country/Territory: Indonesia
City: Virtual, Online
Period: 22/10/1 to 22/10/3

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
