F0 control characterization by perceptual impressions on speaking attitudes using Multiple Dimensional Scaling analysis

Yoko Kokenawa, Minoru Tsuzaki, Hiroaki Kato, Yoshinori Sagisaka

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    9 Citations (Scopus)

    Abstract

    Aiming at prosody control for speech synthesis that expresses speaking attitudes, F0 shapes were characterized by their perceptual impressions. To correlate F0 shapes directly with perceptual impressions, single-word utterances of "n" extracted from daily conversations were employed. The analysis showed that speaking attitudes were manifested in the global F0 control of "n" as differences in average height (high vs. low) and in dynamic pattern (rise, flat, fall, and rise&fall). Next, controlled utterances of "n" were perceptually examined through Multiple Dimensional Scaling analysis to confirm the degrees of freedom in F0 control found in the analysis. The results showed a three-dimensional perceptual impression space and factor-dependent F0 control characteristics. The positive-negative attitude can be controlled by average F0 height, while the confident-doubtful and allowable-unacceptable attitudes are manifested through dynamic F0 patterns. These findings open new possibilities for systematic F0 control in conversational speech synthesis with speaking attitudes using a corpus-based approach.
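
As a rough illustration of the scaling step, the sketch below runs classical (Torgerson) multidimensional scaling on a dissimilarity matrix and returns three-dimensional coordinates. The abstract does not say which MDS variant or software the authors used, so the choice of classical MDS, the function name `classical_mds`, and the randomly generated stimulus positions are all assumptions made for illustration; no data from the paper is reproduced here.

```python
import numpy as np


def classical_mds(dissim, n_dims=3):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix.

    Hypothetical helper: classical (Torgerson) MDS, not necessarily the
    variant used in the paper.
    """
    d = np.asarray(dissim, dtype=float)
    n = d.shape[0]
    # Double-center the squared dissimilarities: B = -1/2 * J * D^2 * J
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j
    # eigh returns eigenvalues in ascending order, so pick the largest ones
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip tiny negative eigenvalues caused by rounding or non-Euclidean input
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Made-up example: 8 stimuli placed at random in a 3-D "impression space".
# In a listening test the dissimilarities would come from listener judgments.
rng = np.random.default_rng(0)
true_points = rng.normal(size=(8, 3))
dissim = np.linalg.norm(true_points[:, None, :] - true_points[None, :, :], axis=-1)

coords = classical_mds(dissim, n_dims=3)
# Distances between the recovered coordinates
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

The recovered coordinates are determined only up to rotation and reflection, so relating each axis to an attitude pair (for example positive-negative) would still require inspecting how stimuli with known F0 height and pattern fall along it.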

    Original language: English
    Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Volume: I
    ISBN (Print): 0780388747, 9780780388741
    DOIs: https://doi.org/10.1109/ICASSP.2005.1415103
    Publication status: Published - 2005
    Event: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Philadelphia, PA
    Duration: 2005 Mar 18 to 2005 Mar 23

    Other

    Conference: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05
    City: Philadelphia, PA
    Period: 05/3/18 to 05/3/23

    Fingerprint

    scaling
    Speech synthesis
    flat patterns
    conversation
    synthesis

    ASJC Scopus subject areas

    • Electrical and Electronic Engineering
    • Signal Processing
    • Acoustics and Ultrasonics

    Cite this

    Kokenawa, Y., Tsuzaki, M., Kato, H., & Sagisaka, Y. (2005). F0 control characterization by perceptual impressions on speaking attitudes using Multiple Dimensional Scaling analysis. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. I). [1415103] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICASSP.2005.1415103
