Application of auditory image model for speech event detection

Minoru Tsuzaki, Satomi Tanaka, Hiroaki Kato, Yoshinori Sagisaka

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    To provide an appropriate model for the perception of temporal structures in speech, we applied a comprehensive computational model of the human auditory periphery to detect changes in speech signals that potentially indicate the arrival of new events. In each tonotopic sub-band, an increase in the activation level was counted toward the plausibility of a new event, while a decrease was ignored. The total contour obtained by integrating the sub-band information exhibited sharper peaks and dips than the loudness contour. A quantitative evaluation on estimating the speaking rate of natural speech also demonstrated that the event-plausibility model performs better than the loudness model.
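The sub-band rule described in the abstract (count increases, ignore decreases, then integrate across bands) can be illustrated with a minimal sketch. This is not the authors' implementation: it does not include an auditory image model, and the function name `event_plausibility`, the input layout (bands by frames), and the use of a plain sum for the integration step are all assumptions made for illustration.

```python
import numpy as np


def event_plausibility(activation):
    """Hypothetical helper: activation is a 2-D array, shape (n_bands, n_frames).

    Returns a 1-D contour of length n_frames - 1.
    """
    activation = np.asarray(activation, dtype=float)
    # Frame-to-frame change in activation within each sub-band.
    delta = np.diff(activation, axis=1)
    # Keep increases and ignore decreases (half-wave rectification).
    onset = np.maximum(delta, 0.0)
    # Integrate across sub-bands; a plain sum is an assumption here.
    return onset.sum(axis=0)


# Toy input: two sub-bands, four frames (made-up numbers, not real data).
bands = np.array([[0.0, 1.0, 0.5, 0.5],
                  [0.0, 0.0, 2.0, 1.0]])
contour = event_plausibility(bands)
```

In this toy input the contour is nonzero only at frame transitions where some band's activation rises; transitions where every band falls or stays flat contribute nothing.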

    Original language: English
    Title of host publication: 9th European Conference on Speech Communication and Technology
    Pages: 677-680
    Number of pages: 4
    Publication status: Published - 2005
    Event: 9th European Conference on Speech Communication and Technology - Lisbon
    Duration: 2005 Sep 4 to 2005 Sep 8

    ASJC Scopus subject areas

    • Engineering(all)

    Cite this

    Tsuzaki, M., Tanaka, S., Kato, H., & Sagisaka, Y. (2005). Application of auditory image model for speech event detection. In 9th European Conference on Speech Communication and Technology (pp. 677-680).
