Sampling-based multilingual alignment

Adrien Lardilleux, Yves Lepage

Research output: Contribution to journal › Conference article

29 Citations (Scopus)

Abstract

We present a sub-sentential alignment method that extracts high-quality multi-word alignments from sentence-aligned multilingual parallel corpora. Unlike other methods, it exploits low-frequency terms, which makes it highly scalable. As it relies on alingual concepts, it can process any number of languages at once. Experiments have shown that it is competitive with state-of-the-art methods.

Original language: English
Pages (from-to): 214-218
Number of pages: 5
Journal: International Conference Recent Advances in Natural Language Processing, RANLP
Publication status: Published - 2009 Dec 1
Event: International Conference on Recent Advances in Natural Language Processing, RANLP-2009 - Borovets, Bulgaria
Duration: 2009 Sep 14 → 2009 Sep 16

Keywords

  • Hapax
  • Low frequency term
  • Sampling
  • Sub-sentential alignment

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Artificial Intelligence
  • Electrical and Electronic Engineering
