Hierarchical sub-sentential alignment with anymalign

Adrien Lardilleux, François Yvon, Yves Lepage

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

We present a sub-sentential alignment algorithm that relies on association scores between words or phrases. This algorithm is inspired by previous work on alignment by recursive binary segmentation and on document clustering. We evaluate the resulting alignments on machine translation tasks and show that we can obtain state-of-the-art results, with gains up to more than 4 BLEU points compared to previous work, with a method that is simple, independent of the size of the corpus to be aligned, and directly computes symmetric alignments. This work also provides new insights regarding the use of "heuristic" alignment scores in statistical machine translation.
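The abstract names the ingredients (a matrix of association scores between words, and recursive binary segmentation) without giving details. The following is only a rough sketch of what such a segmentation could look like, not the algorithm of the paper: the function names (`block_sum`, `segment`, `align`) are hypothetical, the score matrix is assumed to be given, and the split criterion is assumed to be a plain sum of the scores that a split discards, which may differ from the criterion the authors use.

```python
from typing import List, Tuple

Span = Tuple[int, int]      # half-open [start, end) range of word positions
Block = Tuple[Span, Span]   # (source span, target span)
Matrix = List[List[float]]  # m[i][j]: score of source word i with target word j


def block_sum(m: Matrix, r0: int, r1: int, c0: int, c1: int) -> float:
    """Sum of the scores in rows [r0, r1) and columns [c0, c1)."""
    return sum(m[i][j] for i in range(r0, r1) for j in range(c0, c1))


def segment(m: Matrix, r0: int, r1: int, c0: int, c1: int,
            out: List[Block]) -> None:
    """Recursively split the block in two and append the leaf blocks to out."""
    # A block with a single source or target word cannot be split further.
    if r1 - r0 <= 1 or c1 - c0 <= 1:
        out.append(((r0, r1), (c0, c1)))
        return
    best = None
    for i in range(r0 + 1, r1):
        for j in range(c0 + 1, c1):
            a = block_sum(m, r0, i, c0, j)  # top-left quadrant
            b = block_sum(m, r0, i, j, c1)  # top-right quadrant
            c = block_sum(m, i, r1, c0, j)  # bottom-left quadrant
            d = block_sum(m, i, r1, j, c1)  # bottom-right quadrant
            # Straight split keeps a and d (discards b + c);
            # inverted split keeps b and c (discards a + d).
            # Assumption: the best split discards the least total score.
            for cut, straight in ((b + c, True), (a + d, False)):
                if best is None or cut < best[0]:
                    best = (cut, i, j, straight)
    _, i, j, straight = best
    if straight:
        segment(m, r0, i, c0, j, out)
        segment(m, i, r1, j, c1, out)
    else:
        segment(m, r0, i, j, c1, out)
        segment(m, i, r1, c0, j, out)


def align(m: Matrix) -> List[Block]:
    """Return the leaf blocks for a non-empty score matrix."""
    out: List[Block] = []
    segment(m, 0, len(m), 0, len(m[0]), out)
    return out
```

For a 2x2 matrix whose high scores lie on the anti-diagonal, `align([[0, 9], [9, 0]])` returns the two inverted blocks `[((0, 1), (1, 2)), ((1, 2), (0, 1))]`. Because each split pairs a source span with a target span, the result is symmetric by construction, in the spirit of the "directly computes symmetric alignments" claim above.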

Original language: English
Title of host publication: Proceedings of the 16th Annual Conference of the European Association for Machine Translation, EAMT 2012
Publisher: European Association for Machine Translation
Pages: 279-286
Number of pages: 8
Publication status: Published - 2012
Event: 16th Annual Conference of the European Association for Machine Translation, EAMT 2012 - Trento, Italy
Duration: 2012 May 28 to 2012 May 30

Other

Other: 16th Annual Conference of the European Association for Machine Translation, EAMT 2012
Country: Italy
City: Trento
Period: 12/5/28 to 12/5/30

Fingerprint

Alignment
Heuristics
Statistical Machine Translation
Machine Translation
Segmentation

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Software

Cite this

Lardilleux, A., Yvon, F., & Lepage, Y. (2012). Hierarchical sub-sentential alignment with anymalign. In Proceedings of the 16th Annual Conference of the European Association for Machine Translation, EAMT 2012 (pp. 279-286). European Association for Machine Translation.

@inproceedings{60a79c805a964ddca8b8bf2561c64040,
title = "Hierarchical sub-sentential alignment with anymalign",
abstract = "We present a sub-sentential alignment algorithm that relies on association scores between words or phrases. This algorithm is inspired by previous work on alignment by recursive binary segmentation and on document clustering. We evaluate the resulting alignments on machine translation tasks and show that we can obtain state-of-the-art results, with gains up to more than 4 BLEU points compared to previous work, with a method that is simple, independent of the size of the corpus to be aligned, and directly computes symmetric alignments. This work also provides new insights regarding the use of {"}heuristic{"} alignment scores in statistical machine translation.",
author = "Adrien Lardilleux and Fran{\c{c}}ois Yvon and Yves Lepage",
year = "2012",
language = "English",
pages = "279--286",
booktitle = "Proceedings of the 16th Annual Conference of the European Association for Machine Translation, EAMT 2012",
publisher = "European Association for Machine Translation",
}

TY  - GEN
T1  - Hierarchical sub-sentential alignment with anymalign
AU  - Lardilleux, Adrien
AU  - Yvon, François
AU  - Lepage, Yves
PY  - 2012
Y1  - 2012
N2  - We present a sub-sentential alignment algorithm that relies on association scores between words or phrases. This algorithm is inspired by previous work on alignment by recursive binary segmentation and on document clustering. We evaluate the resulting alignments on machine translation tasks and show that we can obtain state-of-the-art results, with gains up to more than 4 BLEU points compared to previous work, with a method that is simple, independent of the size of the corpus to be aligned, and directly computes symmetric alignments. This work also provides new insights regarding the use of "heuristic" alignment scores in statistical machine translation.
AB  - We present a sub-sentential alignment algorithm that relies on association scores between words or phrases. This algorithm is inspired by previous work on alignment by recursive binary segmentation and on document clustering. We evaluate the resulting alignments on machine translation tasks and show that we can obtain state-of-the-art results, with gains up to more than 4 BLEU points compared to previous work, with a method that is simple, independent of the size of the corpus to be aligned, and directly computes symmetric alignments. This work also provides new insights regarding the use of "heuristic" alignment scores in statistical machine translation.
UR  - http://www.scopus.com/inward/record.url?scp=85001104699&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85001104699&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:85001104699
SP  - 279
EP  - 286
BT  - Proceedings of the 16th Annual Conference of the European Association for Machine Translation, EAMT 2012
PB  - European Association for Machine Translation
ER  - 