Cellular-phone based speech-to-speech translation system ATR-MATRIX

Rainer Gruhn, Harald Singer, Hajime Tsukada, Masaki Naito, Atsushi Nishino, Atsushi Nakamura, Yoshinori Sagisaka, Satoshi Nakamura

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

We describe the implementation of a cellular-phone based speech translation system built without a telephone-quality speech database or special CT hardware. The purpose is to quickly build a prototype service system that can be used for data collection with real users. To train the acoustic model for the speech recognition system, available high-quality databases were made usable in two ways: (1) by appropriate downsampling and filtering, and (2) by piping, similar to the NTIMIT and CTIMIT paradigms. An evaluation of acoustic models with filtered, piped, and real cellular-phone data is given. Recognition rates are at the same level as for wideband speech.
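The downsampling-and-filtering step named in the abstract can be illustrated with a minimal sketch. The abstract gives no sampling rates or filter specifications, so everything concrete here is an assumption: a 16 kHz wideband source, an 8 kHz target rate, a 300 to 3400 Hz passband (a typical telephone band), and a 101-tap Hamming-windowed FIR filter. The function names are hypothetical and do not come from the paper.

```python
import math

def lowpass_taps(cutoff_hz, fs, num_taps):
    """Hamming-windowed sinc low-pass FIR taps (num_taps should be odd)."""
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / fs  # normalized cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal low-pass impulse response; the k == 0 term is its limit value.
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def bandpass_taps(low_hz, high_hz, fs, num_taps=101):
    """Band-pass taps built as the difference of two low-pass filters."""
    upper = lowpass_taps(high_hz, fs, num_taps)
    lower = lowpass_taps(low_hz, fs, num_taps)
    return [u - l for u, l in zip(upper, lower)]

def fir_filter(samples, taps):
    """Direct convolution, same output length, zero padding at the edges."""
    mid = (len(taps) - 1) // 2
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = i + mid - j  # centered so the symmetric filter adds no delay
            if 0 <= idx < len(samples):
                acc += t * samples[idx]
        out.append(acc)
    return out

def simulate_telephone_band(samples, fs_in=16000, decimation=2,
                            low_hz=300.0, high_hz=3400.0):
    """Assumed recipe: band-limit the wideband signal, then keep every
    `decimation`-th sample. Returns (samples, new_sampling_rate)."""
    filtered = fir_filter(samples, bandpass_taps(low_hz, high_hz, fs_in))
    return filtered[::decimation], fs_in // decimation
```

Because the assumed upper band edge (3400 Hz) lies below the 4 kHz Nyquist frequency of the 8 kHz output, the same band-pass filter also serves as the anti-aliasing filter before decimation. This sketch covers only the filtering route; the piping route requires playing the data through a real telephone channel and cannot be reproduced in code alone.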

Original language: English
Title of host publication: 6th International Conference on Spoken Language Processing, ICSLP 2000
Publisher: International Speech Communication Association
ISBN (Electronic): 7801501144, 9787801501141
Publication status: Published - 2000
Externally published: Yes
Event: 6th International Conference on Spoken Language Processing, ICSLP 2000 - Beijing, China
Duration: 2000 Oct 16 to 2000 Oct 20

Other

Other: 6th International Conference on Spoken Language Processing, ICSLP 2000
Country: China
City: Beijing
Period: 00/10/16 to 00/10/20

Fingerprint

acoustics
telephone system
hardware
paradigm
Speech-to-speech Translation
Phone
Data Base
Translation System
evaluation
Speech Recognition
Data Collection
Telephone
Prototype
Train

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics

@inproceedings{59262f446aef48c1a1302725a11a190f,
title = "Cellular-phone based speech-to-speech translation system {ATR-MATRIX}",
abstract = "We describe the implementation of a cellular-phone based speech translation system built without a telephone-quality speech database or special CT hardware. The purpose is to quickly build a prototype service system that can be used for data collection with real users. To train the acoustic model for the speech recognition system, available high-quality databases were made usable in two ways: (1) by appropriate downsampling and filtering, and (2) by piping, similar to the NTIMIT and CTIMIT paradigms. An evaluation of acoustic models with filtered, piped, and real cellular-phone data is given. Recognition rates are at the same level as for wideband speech.",
author = "Rainer Gruhn and Harald Singer and Hajime Tsukada and Masaki Naito and Atsushi Nishino and Atsushi Nakamura and Yoshinori Sagisaka and Satoshi Nakamura",
year = "2000",
language = "English",
booktitle = "6th International Conference on Spoken Language Processing, ICSLP 2000",
publisher = "International Speech Communication Association",

}

TY - GEN

T1 - Cellular-phone based speech-to-speech translation system ATR-MATRIX

AU - Gruhn, Rainer

AU - Singer, Harald

AU - Tsukada, Hajime

AU - Naito, Masaki

AU - Nishino, Atsushi

AU - Nakamura, Atsushi

AU - Sagisaka, Yoshinori

AU - Nakamura, Satoshi

PY - 2000

Y1 - 2000

N2 - We describe the implementation of a cellular-phone based speech translation system built without a telephone-quality speech database or special CT hardware. The purpose is to quickly build a prototype service system that can be used for data collection with real users. To train the acoustic model for the speech recognition system, available high-quality databases were made usable in two ways: (1) by appropriate downsampling and filtering, and (2) by piping, similar to the NTIMIT and CTIMIT paradigms. An evaluation of acoustic models with filtered, piped, and real cellular-phone data is given. Recognition rates are at the same level as for wideband speech.

UR - http://www.scopus.com/inward/record.url?scp=85009074734&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85009074734&partnerID=8YFLogxK

M3 - Conference contribution

BT - 6th International Conference on Spoken Language Processing, ICSLP 2000

PB - International Speech Communication Association

ER -