Improving sampling-based alignment by investigating the distribution of N-grams in phrase translation tables

Juan Luo, Adrien Lardilleux, Yves Lepage

Research output: Chapter in Book/Report/Conference proceeding › Chapter

1 Citation (Scopus)

Abstract

This paper describes an approach to improve the performance of sampling-based multilingual alignment on translation tasks by investigating the distribution of n-grams in the translation tables. This approach consists in enforcing the alignment of n-grams. The quality of phrase translation tables output by this approach and that of MGIZA++ is compared in statistical machine translation tasks. Significant improvements for this approach are reported. In addition, merging translation tables is shown to outperform state-of-the-art techniques.
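
As a rough illustration of what examining the distribution of n-grams in a translation table can look like, the sketch below counts phrase pairs by source and target length in words. The function name `ngram_distribution` and the toy entries are hypothetical and are not taken from the paper. The code only tallies lengths; it does not implement the sampling-based aligner, the enforcement of n-gram alignment, or the merging of tables.

```python
from collections import Counter


def ngram_distribution(phrase_pairs):
    """Count phrase pairs by (source length, target length), measured in words.

    `phrase_pairs` is an iterable of (source_phrase, target_phrase) strings.
    This is an illustrative helper, not the method described in the paper.
    """
    counts = Counter()
    for source, target in phrase_pairs:
        counts[(len(source.split()), len(target.split()))] += 1
    return counts


# Hypothetical toy entries standing in for a phrase translation table.
toy_table = [
    ("the house", "la maison"),
    ("house", "maison"),
    ("the", "la"),
    ("the small house", "la petite maison"),
    ("small house", "petite maison"),
]

distribution = ngram_distribution(toy_table)
for (src_len, tgt_len), count in sorted(distribution.items()):
    print(f"{src_len}-gram x {tgt_len}-gram: {count}")
```

A table like this makes it easy to see which n-gram lengths are under-represented in a phrase table, which is the kind of observation the abstract says motivates enforcing the alignment of n-grams.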

Original language: English
Title of host publication: PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation
Pages: 150-159
Number of pages: 10
Publication status: Published - 2011
Event: 25th Pacific Asia Conference on Language, Information and Computation, PACLIC 25
Duration: 2011 Dec 16 to 2011 Dec 18

Fingerprint

Sampling
Merging
N-gram
Alignment
Statistical Machine Translation

Keywords

  • Alignment
  • Phrase translation table
  • Statistical machine translation

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science (miscellaneous)

Cite this

Luo, J., Lardilleux, A., & Lepage, Y. (2011). Improving sampling-based alignment by investigating the distribution of N-grams in phrase translation tables. In PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation (pp. 150-159).

Improving sampling-based alignment by investigating the distribution of N-grams in phrase translation tables. / Luo, Juan; Lardilleux, Adrien; Lepage, Yves.

PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation. 2011. p. 150-159.

Luo, J, Lardilleux, A & Lepage, Y 2011, Improving sampling-based alignment by investigating the distribution of N-grams in phrase translation tables. in PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation. pp. 150-159, 25th Pacific Asia Conference on Language, Information and Computation, PACLIC 25, 11/12/16.
Luo J, Lardilleux A, Lepage Y. Improving sampling-based alignment by investigating the distribution of N-grams in phrase translation tables. In PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation. 2011. p. 150-159
Luo, Juan ; Lardilleux, Adrien ; Lepage, Yves. / Improving sampling-based alignment by investigating the distribution of N-grams in phrase translation tables. PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation. 2011. pp. 150-159
@inbook{5b0ebab6a179496ea5c19cf0bd05b65c,
title = "Improving sampling-based alignment by investigating the distribution of N-grams in phrase translation tables",
abstract = "This paper describes an approach to improve the performance of sampling-based multilingual alignment on translation tasks by investigating the distribution of n-grams in the translation tables. This approach consists in enforcing the alignment of n-grams. The quality of phrase translation tables output by this approach and that of MGIZA++ is compared in statistical machine translation tasks. Significant improvements for this approach are reported. In addition, merging translation tables is shown to outperform state-of-the-art techniques.",
keywords = "Alignment, Phrase translation table, Statistical machine translation",
author = "Juan Luo and Adrien Lardilleux and Yves Lepage",
year = "2011",
language = "English",
isbn = "9784905166023",
pages = "150--159",
booktitle = "PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation",

}

TY - CHAP
T1 - Improving sampling-based alignment by investigating the distribution of N-grams in phrase translation tables
AU - Luo, Juan
AU - Lardilleux, Adrien
AU - Lepage, Yves
PY - 2011
Y1 - 2011
N2 - This paper describes an approach to improve the performance of sampling-based multilingual alignment on translation tasks by investigating the distribution of n-grams in the translation tables. This approach consists in enforcing the alignment of n-grams. The quality of phrase translation tables output by this approach and that of MGIZA++ is compared in statistical machine translation tasks. Significant improvements for this approach are reported. In addition, merging translation tables is shown to outperform state-of-the-art techniques.
AB - This paper describes an approach to improve the performance of sampling-based multilingual alignment on translation tasks by investigating the distribution of n-grams in the translation tables. This approach consists in enforcing the alignment of n-grams. The quality of phrase translation tables output by this approach and that of MGIZA++ is compared in statistical machine translation tasks. Significant improvements for this approach are reported. In addition, merging translation tables is shown to outperform state-of-the-art techniques.
KW - Alignment
KW - Phrase translation table
KW - Statistical machine translation
UR - http://www.scopus.com/inward/record.url?scp=84863876792&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863876792&partnerID=8YFLogxK
M3 - Chapter
AN - SCOPUS:84863876792
SN - 9784905166023
SP - 150
EP - 159
BT - PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation
ER -