Phrase recognition in conversational speech using prosodic and phonemic information

Shigeki Okawa, Takashi Endo, Tetsunori Kobayashi, Katsuhiko Shirai

    Research output: Contribution to journal, Article

    2 Citations (Scopus)

    Abstract

    In this paper, a new scheme for phrase recognition in conversational speech is proposed, in which prosodic and phonemic information processing are combined. This approach is employed both to produce candidate phrase boundaries and to discriminate phonemes. The fundamental frequency patterns of continuous utterances are statistically analyzed, and the likelihood of the occurrence of a phrase boundary is calculated for every frame. At the same time, the likelihood of the phonemic characteristics of each frame is obtained using a hierarchical clustering method. These two scores, along with lexical and grammatical constraints, can be used to build candidate word sequences or word lattices that correspond to the continuous speech utterances. Our preliminary experiment shows the feasibility of applying prosody to continuous speech recognition, especially for conversational-style utterances.
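The abstract does not give the scoring formulas, so the following is only a minimal sketch of how a per-frame phrase-boundary score and a phonemic score might be combined to pick phrase boundaries. It is not the authors' algorithm: the dynamic-programming search, the function `segment_phrases`, the `segment_score` callback, the `weight` parameter, and all numeric values are hypothetical and chosen for illustration.

```python
import math

def segment_phrases(boundary_loglik, segment_score, max_len, weight=1.0):
    """Find the best split of frames 0..T into phrases by dynamic programming.

    boundary_loglik[t]: assumed log-likelihood that a phrase boundary falls
        right after frame t (stands in for the prosodic, F0-based score).
    segment_score(s, e): assumed log-score that frames [s, e) form one phrase
        (stands in for the phonemic score plus lexical constraints).
    max_len: longest phrase, in frames, that the search considers.
    """
    T = len(boundary_loglik)
    best = [-math.inf] * (T + 1)   # best[e]: best score of frames [0, e)
    back = [0] * (T + 1)           # back[e]: start of the last phrase ending at e
    best[0] = 0.0
    for e in range(1, T + 1):
        for s in range(max(0, e - max_len), e):
            if best[s] == -math.inf:
                continue
            score = best[s] + segment_score(s, e) + weight * boundary_loglik[e - 1]
            if score > best[e]:
                best[e] = score
                back[e] = s
    # Trace the chosen boundaries back from the last frame.
    phrases = []
    e = T
    while e > 0:
        phrases.append((back[e], e))
        e = back[e]
    phrases.reverse()
    return best[T], phrases

# Toy input: boundaries are likely after frames 2 and 5 only.
boundary_loglik = [math.log(p) for p in (0.1, 0.1, 0.9, 0.1, 0.1, 0.9)]

# Toy phonemic/lexical score: phrases of 2 to 4 frames are equally plausible.
def segment_score(s, e):
    return 0.0 if 2 <= e - s <= 4 else -5.0

score, phrases = segment_phrases(boundary_loglik, segment_score, max_len=6)
print(phrases)  # [(0, 3), (3, 6)]
```

In this toy input the phonemic score alone cannot choose between several splits, so the prosodic boundary score decides the result, which is the kind of complementary role the abstract attributes to prosody.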

    Original language: English
    Pages (from-to): 44-50
    Number of pages: 7
    Journal: IEICE Transactions on Information and Systems
    Volume: E76-D
    Issue number: 1
    Publication status: Published - 1993 Jan

    Fingerprint

    Continuous speech recognition
    Experiments

    ASJC Scopus subject areas

    • Computer Graphics and Computer-Aided Design
    • Information Systems
    • Software

    Cite this

    Phrase recognition in conversational speech using prosodic and phonemic information. / Okawa, Shigeki; Endo, Takashi; Kobayashi, Tetsunori; Shirai, Katsuhiko.

    In: IEICE Transactions on Information and Systems, Vol. E76-D, No. 1, 01.1993, p. 44-50.

    @article{02f98d739bb24ad8ab231ec697c3160d,
    title = "Phrase recognition in conversational speech using prosodic and phonemic information",
    abstract = "In this paper, a new scheme for phrase recognition in conversational speech is proposed, in which prosodic and phonemic information processing are combined. This approach is employed both to produce candidate phrase boundaries and to discriminate phonemes. The fundamental frequency patterns of continuous utterances are statistically analyzed, and the likelihood of the occurrence of a phrase boundary is calculated for every frame. At the same time, the likelihood of the phonemic characteristics of each frame is obtained using a hierarchical clustering method. These two scores, along with lexical and grammatical constraints, can be used to build candidate word sequences or word lattices that correspond to the continuous speech utterances. Our preliminary experiment shows the feasibility of applying prosody to continuous speech recognition, especially for conversational-style utterances.",
    author = "Shigeki Okawa and Takashi Endo and Tetsunori Kobayashi and Katsuhiko Shirai",
    year = "1993",
    month = "1",
    language = "English",
    volume = "E76-D",
    pages = "44--50",
    journal = "IEICE Transactions on Information and Systems",
    issn = "0916-8532",
    publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
    number = "1",

    }

    TY - JOUR

    T1 - Phrase recognition in conversational speech using prosodic and phonemic information

    AU - Okawa, Shigeki

    AU - Endo, Takashi

    AU - Kobayashi, Tetsunori

    AU - Shirai, Katsuhiko

    PY - 1993/1

    Y1 - 1993/1

    N2 - In this paper, a new scheme for phrase recognition in conversational speech is proposed, in which prosodic and phonemic information processing are combined. This approach is employed both to produce candidate phrase boundaries and to discriminate phonemes. The fundamental frequency patterns of continuous utterances are statistically analyzed, and the likelihood of the occurrence of a phrase boundary is calculated for every frame. At the same time, the likelihood of the phonemic characteristics of each frame is obtained using a hierarchical clustering method. These two scores, along with lexical and grammatical constraints, can be used to build candidate word sequences or word lattices that correspond to the continuous speech utterances. Our preliminary experiment shows the feasibility of applying prosody to continuous speech recognition, especially for conversational-style utterances.

    UR - http://www.scopus.com/inward/record.url?scp=0027347691&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=0027347691&partnerID=8YFLogxK

    M3 - Article

    AN - SCOPUS:0027347691

    VL - E76-D

    SP - 44

    EP - 50

    JO - IEICE Transactions on Information and Systems

    JF - IEICE Transactions on Information and Systems

    SN - 0916-8532

    IS - 1

    ER -