Tools for the production of analogical grids and a resource of N-gram analogical grids in 11 languages

Rashel Fam, Yves Lepage

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

We release a Python module containing several tools to build analogical grids from words contained in a corpus. The module implements several previously presented algorithms. The tools are language-independent. This permits their use with any language and any writing system. We hope that the tools will ease research in morphology by allowing researchers to automatically obtain structured representations of the vocabulary contained in corpora or linguistic data. We also release analogical grids built on the vocabularies contained in 1,000 corresponding lines of the 11 different language versions of the Europarl corpus v.3. The grids were built on N-grams of different lengths, from words to 6-grams. We hope that the use of structured parallel data will foster research in comparative linguistics.
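As a rough illustration of the kind of structure described above, the sketch below treats an analogical grid as a matrix of words in which any four filled cells taken from two rows and two columns should form a proportional analogy A : B :: C : D. It is not the released module's API: the function names `char_count_diff`, `may_be_analogy`, `grid_is_consistent` and `word_ngrams` are hypothetical, and the character-count test used here is only a necessary condition for an analogy, chosen because it works on any writing system. The example words are made up for illustration and do not come from the Europarl resource.

```python
from collections import Counter
from itertools import combinations


def char_count_diff(a, b):
    """Signed per-character count difference between strings a and b."""
    diff = Counter(a)
    diff.subtract(Counter(b))
    # Keep only characters whose counts actually differ.
    return {ch: n for ch, n in diff.items() if n != 0}


def may_be_analogy(a, b, c, d):
    """Necessary (not sufficient) condition for a : b :: c : d.

    Assumption: we only compare character counts; a full analogy test
    would need further constraints (for example on edit distances).
    """
    return char_count_diff(a, b) == char_count_diff(c, d)


def grid_is_consistent(grid):
    """Check every rectangle of filled cells; None marks an empty cell."""
    n_rows, n_cols = len(grid), len(grid[0])
    for r1, r2 in combinations(range(n_rows), 2):
        for c1, c2 in combinations(range(n_cols), 2):
            cells = (grid[r1][c1], grid[r1][c2], grid[r2][c1], grid[r2][c2])
            if None in cells:
                continue  # empty cells impose no constraint
            if not may_be_analogy(*cells):
                return False
    return True


def word_ngrams(tokens, n):
    """All contiguous word N-grams of length n in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical toy grid: rows share a stem, columns share an ending.
grid = [
    ["walk", "walks", "walked"],
    ["talk", "talks", "talked"],
    ["jump", None, "jumped"],
]
consistent = grid_is_consistent(grid)
bigrams = word_ngrams("we release a module".split(), 2)
```

The same check applies unchanged to N-grams if each cell holds a sequence of words joined into one string, which is one simple way to extend the idea from words to longer units.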

Original language: English
Title of host publication: LREC 2018 - 11th International Conference on Language Resources and Evaluation
Editors: Hitoshi Isahara, Bente Maegaard, Stelios Piperidis, Christopher Cieri, Thierry Declerck, Koiti Hasida, Helene Mazo, Khalid Choukri, Sara Goggi, Joseph Mariani, Asuncion Moreno, Nicoletta Calzolari, Jan Odijk, Takenobu Tokunaga
Publisher: European Language Resources Association (ELRA)
Pages: 1060-1066
Number of pages: 7
ISBN (Electronic): 9791095546009
Publication status: Published - 2019 Jan 1
Event: 11th International Conference on Language Resources and Evaluation, LREC 2018 - Miyazaki, Japan
Duration: 2018 May 7 to 2018 May 12

Other

Other: 11th International Conference on Language Resources and Evaluation, LREC 2018
Country: Japan
City: Miyazaki
Period: 18/5/7 to 18/5/12

Fingerprint

Vocabulary
Language
Resources
Linguistics
N-gram
Grid
Module
Writing Systems
Length
Comparative Linguistics

Keywords

  • Analogical grids
  • Analogy
  • Morphology

ASJC Scopus subject areas

  • Linguistics and Language
  • Education
  • Library and Information Sciences
  • Language and Linguistics

Cite this

Fam, R., & Lepage, Y. (2019). Tools for the production of analogical grids and a resource of N-gram analogical grids in 11 languages. In H. Isahara, B. Maegaard, S. Piperidis, C. Cieri, T. Declerck, K. Hasida, H. Mazo, K. Choukri, S. Goggi, J. Mariani, A. Moreno, N. Calzolari, J. Odijk, ... T. Tokunaga (Eds.), LREC 2018 - 11th International Conference on Language Resources and Evaluation (pp. 1060-1066). European Language Resources Association (ELRA).

@inproceedings{9d74133b999c4c13b92c8b16610864b6,
title = "Tools for the production of analogical grids and a resource of N-gram analogical grids in 11 languages",
abstract = "We release a Python module containing several tools to build analogical grids from words contained in a corpus. The module implements several previously presented algorithms. The tools are language-independent. This permits their use with any language and any writing system. We hope that the tools will ease research in morphology by allowing researchers to automatically obtain structured representations of the vocabulary contained in corpora or linguistic data. We also release analogical grids built on the vocabularies contained in 1,000 corresponding lines of the 11 different language versions of the Europarl corpus v.3. The grids were built on N-grams of different lengths, from words to 6-grams. We hope that the use of structured parallel data will foster research in comparative linguistics.",
keywords = "Analogical grids, Analogy, Morphology",
author = "Rashel Fam and Yves Lepage",
year = "2019",
month = "1",
day = "1",
language = "English",
pages = "1060--1066",
editor = "Hitoshi Isahara and Bente Maegaard and Stelios Piperidis and Christopher Cieri and Thierry Declerck and Koiti Hasida and Helene Mazo and Khalid Choukri and Sara Goggi and Joseph Mariani and Asuncion Moreno and Nicoletta Calzolari and Jan Odijk and Takenobu Tokunaga",
booktitle = "LREC 2018 - 11th International Conference on Language Resources and Evaluation",
publisher = "European Language Resources Association (ELRA)",

}

TY - GEN

T1 - Tools for the production of analogical grids and a resource of N-gram analogical grids in 11 languages

AU - Fam, Rashel

AU - Lepage, Yves

PY - 2019/1/1

Y1 - 2019/1/1

N2 - We release a Python module containing several tools to build analogical grids from words contained in a corpus. The module implements several previously presented algorithms. The tools are language-independent. This permits their use with any language and any writing system. We hope that the tools will ease research in morphology by allowing researchers to automatically obtain structured representations of the vocabulary contained in corpora or linguistic data. We also release analogical grids built on the vocabularies contained in 1,000 corresponding lines of the 11 different language versions of the Europarl corpus v.3. The grids were built on N-grams of different lengths, from words to 6-grams. We hope that the use of structured parallel data will foster research in comparative linguistics.

KW - Analogical grids

KW - Analogy

KW - Morphology

UR - http://www.scopus.com/inward/record.url?scp=85059882760&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059882760&partnerID=8YFLogxK

M3 - Conference contribution

SP - 1060

EP - 1066

BT - LREC 2018 - 11th International Conference on Language Resources and Evaluation

A2 - Isahara, Hitoshi

A2 - Maegaard, Bente

A2 - Piperidis, Stelios

A2 - Cieri, Christopher

A2 - Declerck, Thierry

A2 - Hasida, Koiti

A2 - Mazo, Helene

A2 - Choukri, Khalid

A2 - Goggi, Sara

A2 - Mariani, Joseph

A2 - Moreno, Asuncion

A2 - Calzolari, Nicoletta

A2 - Odijk, Jan

A2 - Tokunaga, Takenobu

PB - European Language Resources Association (ELRA)

ER -