Organising lexica into analogical grids: a study of a holistic approach for morphological generation under various sizes of data in various languages

Rashel Fam*, Yves Lepage

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Morphological generation is the task of producing a target word form given a lemma and a morphosyntactic description of that form. Because the syntactic and semantic relations between a word form and other forms are reflected in the form itself, we show how to exploit these relations holistically, that is, by treating word forms as wholes, to derive the target form without breaking words into morphemes. Experimental results show that organising the lexica into analogical grids improves the accuracy of morphological generation by up to 8% in low-data scenarios. Our holistic approach always performs better than a morpheme-based baseline. We also enquire into possible improvements from data augmentation for neural approaches, especially in low-data scenarios. However, our system does not seem to gain any advantage from additional data beyond a certain point.

Original language: English
Journal: Journal of Experimental and Theoretical Artificial Intelligence
Publication status: Accepted/In press - 2022

Keywords

  • Analogical grids
  • language productivity
  • morphological complexity
  • morphological generation
  • organisation of lexicon

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Theoretical Computer Science
