Leveraging the advantages of associative alignment methods for PB-SMT systems

Baosong Yang, Yves Lepage

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Training statistical machine translation systems used to require long computation times. It has been shown that approximations in the probabilistic approach can lead to impressive improvements (Fast align). We show that, by leveraging the advantages of the associative approach, we achieve similar or even faster training times while keeping comparable BLEU scores. Our contributions are of two types: engineering, by introducing multi-processing in both sampling-based alignment and hierarchical sub-sentential alignment; and modeling, by introducing approximations in hierarchical sub-sentential alignment that lead to substantial reductions in time without affecting the alignments produced. We test and compare our improvements on six typical language pairs of the Europarl corpus.

Original language: English
Title of host publication: Human Language Technology. Challenges for Computer Science and Linguistics - 7th Language and Technology Conference, LTC 2015, Revised Selected Papers
Publisher: Springer-Verlag
Pages: 214-228
Number of pages: 15
ISBN (Print): 9783319937816
DOIs: https://doi.org/10.1007/978-3-319-93782-3_16
Publication status: Published - 2018 Jan 1
Event: 7th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, LTC 2015 - Poznan, Poland
Duration: 2015 Nov 27 → 2015 Nov 29

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10930 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 7th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, LTC 2015
Country: Poland
City: Poznan
Period: 15/11/27 → 15/11/29

Fingerprint

Surface mount technology
Alignment
Statistical Machine Translation
Multiprocessing
Probabilistic Approach
Approximation
Sampling
Engineering
Processing
Modeling
Training

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Yang, B., & Lepage, Y. (2018). Leveraging the advantages of associative alignment methods for PB-SMT systems. In Human Language Technology. Challenges for Computer Science and Linguistics - 7th Language and Technology Conference, LTC 2015, Revised Selected Papers (pp. 214-228). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10930 LNAI). Springer-Verlag. https://doi.org/10.1007/978-3-319-93782-3_16

@inproceedings{9e28ed52e38e488689b6ba9ece4e4c5a,
title = "Leveraging the advantages of associative alignment methods for PB-SMT systems",
abstract = "Training statistical machine translation systems used to require long computation times. It has been shown that approximations in the probabilistic approach can lead to impressive improvements (Fast align). We show that, by leveraging the advantages of the associative approach, we achieve similar or even faster training times while keeping comparable BLEU scores. Our contributions are of two types: engineering, by introducing multi-processing in both sampling-based alignment and hierarchical sub-sentential alignment; and modeling, by introducing approximations in hierarchical sub-sentential alignment that lead to substantial reductions in time without affecting the alignments produced. We test and compare our improvements on six typical language pairs of the Europarl corpus.",
author = "Baosong Yang and Yves Lepage",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-319-93782-3_16",
language = "English",
isbn = "9783319937816",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag",
pages = "214--228",
booktitle = "Human Language Technology. Challenges for Computer Science and Linguistics - 7th Language and Technology Conference, LTC 2015, Revised Selected Papers",
}

TY - GEN

T1 - Leveraging the advantages of associative alignment methods for PB-SMT systems

AU - Yang, Baosong

AU - Lepage, Yves

PY - 2018/1/1

Y1 - 2018/1/1

N2 - Training statistical machine translation systems used to require long computation times. It has been shown that approximations in the probabilistic approach can lead to impressive improvements (Fast align). We show that, by leveraging the advantages of the associative approach, we achieve similar or even faster training times while keeping comparable BLEU scores. Our contributions are of two types: engineering, by introducing multi-processing in both sampling-based alignment and hierarchical sub-sentential alignment; and modeling, by introducing approximations in hierarchical sub-sentential alignment that lead to substantial reductions in time without affecting the alignments produced. We test and compare our improvements on six typical language pairs of the Europarl corpus.

UR - http://www.scopus.com/inward/record.url?scp=85049108485&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85049108485&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-93782-3_16

DO - 10.1007/978-3-319-93782-3_16

M3 - Conference contribution

SN - 9783319937816

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 214

EP - 228

BT - Human Language Technology. Challenges for Computer Science and Linguistics - 7th Language and Technology Conference, LTC 2015, Revised Selected Papers

PB - Springer-Verlag

ER -