Speaker adaptive Phoneme recognition based on feature mapping from spectral domain to probabilistic domain

Tetsunori Kobayashi, Y. Uchiyama, J. Osada, K. Shirai

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    A new feature parameter space for speech recognition, called PRPG (Probability Ratios between Phoneme Group pairs), is proposed, and speaker-adaptive phoneme recognition is performed in it. In the proposed coordinate system, the area carrying the same information for speech recognition is compressed into a single point. The mapping function from the spectral coordinate system to the proposed one is realized with a neural network. Code-vectors designed in this coordinate system are assured to be information-theoretically more efficient than those designed in the spectral coordinate system. Moreover, by the definition of the coordinate system, the meaning of each axis is the same across speakers, so speaker adaptation can be performed easily without trajectory mapping. Experimental results show that the coordinate conversion reduces errors by 40% in the speaker-dependent tasks. The scores of the speaker-adaptive tasks in the proposed feature domain are always superior to those of the speaker-dependent tasks in the spectral domain.
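The abstract does not give the exact definition of the PRPG coordinates, so the following is only a minimal sketch of one plausible reading: a network produces posterior probabilities over phoneme groups, and each feature axis is the log ratio of the probabilities of one group pair. The function names (`softmax`, `pairwise_log_ratios`), the use of a softmax output, the logarithm, the smoothing constant `eps`, and the three example scores are all assumptions made for illustration, not details taken from the paper.

```python
import itertools
import math


def softmax(scores):
    """Convert raw network scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_log_ratios(group_probs, eps=1e-12):
    """Return log(p_i / p_j) for every pair i < j of phoneme groups.

    This is an assumed form of a probability-ratio feature, not the
    paper's definition. eps guards against division by zero.
    """
    return [
        math.log((group_probs[i] + eps) / (group_probs[j] + eps))
        for i, j in itertools.combinations(range(len(group_probs)), 2)
    ]


# Hypothetical network scores for one spectral frame and three phoneme groups.
scores = [2.0, 1.0, 0.0]
features = pairwise_log_ratios(softmax(scores))

# Adding a constant to every score leaves the probabilities, and therefore
# the features, unchanged: different inputs with the same class information
# land on the same point in the ratio space.
shifted_features = pairwise_log_ratios(softmax([s + 5.0 for s in scores]))
```

Under these assumptions each axis depends only on how likely one phoneme group is relative to another, which is consistent with the abstract's remark that the axes mean the same thing for every speaker.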

    Original language: English
    Title of host publication: ICASSP 1992 - 1992 International Conference on Acoustics, Speech, and Signal Processing
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 457-460
    Number of pages: 4
    Volume: 1
    ISBN (Electronic): 0780305329
    DOIs: https://doi.org/10.1109/ICASSP.1992.225873
    Publication status: Published - 1992
    Event: 1992 International Conference on Acoustics, Speech, and Signal Processing, ICASSP 1992 - San Francisco, United States
    Duration: 1992 Mar 23 to 1992 Mar 26

    Other

    Other: 1992 International Conference on Acoustics, Speech, and Signal Processing, ICASSP 1992
    Country: United States
    City: San Francisco
    Period: 92/3/23 to 92/3/26

    Fingerprint

    Speech recognition
    Trajectories
    Neural networks

    ASJC Scopus subject areas

    • Software
    • Signal Processing
    • Electrical and Electronic Engineering

    Cite this

    Kobayashi, T., Uchiyama, Y., Osada, J., & Shirai, K. (1992). Speaker adaptive Phoneme recognition based on feature mapping from spectral domain to probabilistic domain. In ICASSP 1992 - 1992 International Conference on Acoustics, Speech, and Signal Processing (Vol. 1, pp. 457-460). [225873] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.1992.225873

    @inproceedings{8e9225766b52469ab2a9fa3369433272,
    title = "Speaker adaptive Phoneme recognition based on feature mapping from spectral domain to probabilistic domain",
    abstract = "A new feature parameter space for speech recognition, called PRPG (Probability Ratios between Phoneme Group pairs), is proposed, and speaker-adaptive phoneme recognition is performed in it. In the proposed coordinate system, the area carrying the same information for speech recognition is compressed into a single point. The mapping function from the spectral coordinate system to the proposed one is realized with a neural network. Code-vectors designed in this coordinate system are assured to be information-theoretically more efficient than those designed in the spectral coordinate system. Moreover, by the definition of the coordinate system, the meaning of each axis is the same across speakers, so speaker adaptation can be performed easily without trajectory mapping. Experimental results show that the coordinate conversion reduces errors by 40{\%} in the speaker-dependent tasks. The scores of the speaker-adaptive tasks in the proposed feature domain are always superior to those of the speaker-dependent tasks in the spectral domain.",
    author = "Tetsunori Kobayashi and Y. Uchiyama and J. Osada and K. Shirai",
    year = "1992",
    doi = "10.1109/ICASSP.1992.225873",
    language = "English",
    volume = "1",
    pages = "457--460",
    booktitle = "ICASSP 1992 - 1992 International Conference on Acoustics, Speech, and Signal Processing",
    publisher = "Institute of Electrical and Electronics Engineers Inc.",
    address = "United States",

    }

    TY - GEN
    T1 - Speaker adaptive Phoneme recognition based on feature mapping from spectral domain to probabilistic domain
    AU - Kobayashi, Tetsunori
    AU - Uchiyama, Y.
    AU - Osada, J.
    AU - Shirai, K.
    PY - 1992
    Y1 - 1992
    AB - A new feature parameter space for speech recognition, called PRPG (Probability Ratios between Phoneme Group pairs), is proposed, and speaker-adaptive phoneme recognition is performed in it. In the proposed coordinate system, the area carrying the same information for speech recognition is compressed into a single point. The mapping function from the spectral coordinate system to the proposed one is realized with a neural network. Code-vectors designed in this coordinate system are assured to be information-theoretically more efficient than those designed in the spectral coordinate system. Moreover, by the definition of the coordinate system, the meaning of each axis is the same across speakers, so speaker adaptation can be performed easily without trajectory mapping. Experimental results show that the coordinate conversion reduces errors by 40% in the speaker-dependent tasks. The scores of the speaker-adaptive tasks in the proposed feature domain are always superior to those of the speaker-dependent tasks in the spectral domain.
    UR - http://www.scopus.com/inward/record.url?scp=85019651702&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=85019651702&partnerID=8YFLogxK
    U2 - 10.1109/ICASSP.1992.225873
    DO - 10.1109/ICASSP.1992.225873
    M3 - Conference contribution
    AN - SCOPUS:85019651702
    VL - 1
    SP - 457
    EP - 460
    BT - ICASSP 1992 - 1992 International Conference on Acoustics, Speech, and Signal Processing
    PB - Institute of Electrical and Electronics Engineers Inc.
    ER -