Cross-modal analysis between phonation differences and texture images based on sentiment correlations

Win Thuzar Kyaw, Yoshinori Sagisaka

Research output: Conference article, peer-reviewed

1 citation (Scopus)

Abstract

Motivated by the success of representing speech characteristics with color attributes, we analyzed cross-modal sentiment correlations between voice source characteristics and textural image characteristics. For the analysis, we used vowel sounds with three representative phonation types (modal, creaky and breathy) and 36 texture images, each annotated with one of 36 semantic attributes (e.g., banded, cracked and scaly). We asked 40 subjects to listen to 30 speech samples with different phonations and to select the best-fitting textures from the 36 texture images. We then measured the correlations between acoustic parameters reflecting voice source variations and parameters of the selected texture images: coarseness, contrast, directionality, busyness, complexity and strength. The texture classifications show that voice characteristics can be roughly characterized by textural differences: modal by gauzy, banded and smeared; creaky by porous, crystalline, cracked and scaly; breathy by smeared, freckled and stained. We also found significant correlations between voice source acoustic parameters and textural parameters. These correlations suggest the possibility of cross-modal mapping between voice source characteristics and textural parameters, which would enable visualization of speech information with source variations reflecting human sentiment perception.
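
The correlation step described in the abstract can be illustrated with a minimal sketch. It assumes a plain Pearson correlation between one acoustic parameter per speech sample and the mean value of one texture parameter (here coarseness) over the textures chosen for that sample. The abstract does not state which correlation measure the authors used, and the variable names and numbers below are hypothetical placeholders, not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / (sx * sy)


# Placeholder values only (NOT data from the study): one acoustic value per
# speech sample, and the mean coarseness of the textures chosen for it.
acoustic_param = [0.2, 0.5, 0.9, 1.4, 1.8]
mean_coarseness = [0.30, 0.35, 0.55, 0.70, 0.85]

r = pearson(acoustic_param, mean_coarseness)
print(f"r = {r:.3f}")
```

Repeating the same computation for each pair of acoustic and texture parameters would give a correlation table of the kind the abstract refers to.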

Original language: English
Pages (from-to): 679-683
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2017-August
Publication status: Published - 2017
Externally published: Yes
Event: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017 - Stockholm, Sweden
Duration: 2017-08-20 to 2017-08-24

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation
