Hybrid modeling of PHMM and HMM for speech recognition

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    2 Citations (Scopus)

    Abstract

    A hybrid acoustic model combining the Partly Hidden Markov Model (PHMM) and the HMM is proposed. PHMM was proposed in our previous work to handle complicated temporal changes in acoustic features. It can model observation-dependent behavior in both observations and state transitions. It achieved good performance, but some errors with a different trend from those of HMM remained. In this paper, we design a new acoustic model based on PHMM, in which the observation and state transition probabilities are defined as the geometric means of the PHMM-based and HMM-based probabilities. In this framework, a word hypothesis that is given a low score by either PHMM or HMM is unlikely to remain a probable candidate. Since many errors are due to high scores for incorrect categories rather than a low score for the correct category, this property helps reduce errors. Moreover, the proposed model is more stable than PHMM because the higher-order statistics of PHMM, which are generally accurate but sometimes less reliable, are smoothed by the lower-order statistics of HMM, which are less accurate but robust. Experimental results showed the effectiveness of the proposed model: it reduced word errors by 25% compared with HMM.
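
The geometric-mean combination described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the paper applies the geometric mean to observation and state transition probabilities, whereas this example applies it to whole-hypothesis log scores for simplicity. The function names, the equal weighting, and the log-likelihood values are all hypothetical. In the log domain, the geometric mean of two probabilities is the average of their logs, so one low score pulls the combined score down sharply.

```python
import math


def hybrid_log_score(log_p_phmm: float, log_p_hmm: float, weight: float = 0.5) -> float:
    """Weighted geometric mean of two probabilities, computed in the log domain.

    With weight = 0.5 this equals log(sqrt(p_phmm * p_hmm)).
    The equal weighting is an assumption made for this sketch.
    """
    return weight * log_p_phmm + (1.0 - weight) * log_p_hmm


def rank_hypotheses(scores: dict) -> list:
    """Sort word hypotheses best-first by their hybrid log score.

    scores maps each word to a (log_p_phmm, log_p_hmm) pair.
    """
    return sorted(scores, key=lambda w: hybrid_log_score(*scores[w]), reverse=True)


# Hypothetical log-likelihoods, chosen only to illustrate the effect.
scores = {
    "correct": (-10.0, -11.0),  # moderately good under both models
    "rival_a": (-8.0, -20.0),   # best under PHMM alone, poor under HMM
    "rival_b": (-19.5, -9.0),   # best under HMM alone, poor under PHMM
}

ranking = rank_hypotheses(scores)
best_phmm_only = max(scores, key=lambda w: scores[w][0])
best_hmm_only = max(scores, key=lambda w: scores[w][1])

print("PHMM alone picks:", best_phmm_only)
print("HMM alone picks: ", best_hmm_only)
print("Hybrid ranking:  ", ranking)
```

In this toy setting each single model prefers a different incorrect rival, while the hybrid score ranks the hypothesis that both models find plausible first, which mirrors the error-reduction argument in the abstract.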

    Original language: English
    Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
    Pages: 140-143
    Number of pages: 4
    Volume: 1
    Publication status: Published - 2003
    Event: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing - Hong Kong, Hong Kong
    Duration: 2003 Apr 6 to 2003 Apr 10

    Fingerprint

    speech recognition
    Hidden Markov models
    Acoustics
    Higher order statistics
    statistics
    transition probabilities
    trends

    ASJC Scopus subject areas

    • Electrical and Electronic Engineering
    • Signal Processing
    • Acoustics and Ultrasonics

    Cite this

    Ogawa, T., & Kobayashi, T. (2003). Hybrid modeling of PHMM and HMM for speech recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 1, pp. 140-143)

    @inproceedings{254ad368af8a4a2690433aef164a1495,
    title = "Hybrid modeling of PHMM and HMM for speech recognition",
    author = "Tetsuji Ogawa and Tetsunori Kobayashi",
    year = "2003",
    language = "English",
    volume = "1",
    pages = "140--143",
    booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

    }

    TY - GEN

    T1 - Hybrid modeling of PHMM and HMM for speech recognition

    AU - Ogawa, Tetsuji

    AU - Kobayashi, Tetsunori

    PY - 2003

    Y1 - 2003

    UR - http://www.scopus.com/inward/record.url?scp=0141480070&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=0141480070&partnerID=8YFLogxK

    M3 - Conference contribution

    AN - SCOPUS:0141480070

    VL - 1

    SP - 140

    EP - 143

    BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

    ER -