TY - GEN
T1 - Predicting speech fluency in children using automatic acoustic features
AU - Fontan, Lionel
AU - Kim, Shinyoung
AU - De Fino, Verdiana
AU - Detey, Sylvain
N1 - Funding Information:
The authors are indebted to Prof. Mariko Kondo, who helped with data collection, as well as to Dr. Julien Eychenne for his assistance with rater recruitment and his useful insights on Korean phonology. The authors also thank Dr. Saïd Jmel for his helpful statistical advice.
Publisher Copyright:
© 2022 Asia-Pacific Signal and Information Processing Association (APSIPA).
PY - 2022
Y1 - 2022
N2 - The present study aims to predict the speech fluency of children using automatic acoustic measures derived from forward-backward divergence segmentation (FBDS). Thirteen Korean children were recorded while reading a set of sentences aloud. Three native Korean speakers evaluated the fluency of each sentence on a five-point scale. An FBDS algorithm was used to segment speech recordings into sub-phonemic units and silent segments. In addition to the low-level acoustic features directly derived from FBDS segments, higher-level acoustic features were computed by clustering FBDS segments into pseudo-syllables and silent breaks. Both low- and higher-level features were used to predict average ratings of speech fluency, using a leave-one-speaker-out cross-validation scheme and three regression models: a multiple linear regression, a support vector regression, and a random-forest regressor. Highly accurate predictions were achieved, with average root-mean-square errors (RMSEs) as low as 0.3. Prediction accuracy did not change significantly as a function of regression model. Using higher-level features yielded lower RMSEs than using raw FBDS features. The results of a multiple linear regression using higher-level features (R² = 0.94) suggest that speech/silence ratio and pseudo-syllable rate are the two most important predictors of speech fluency.
UR - http://www.scopus.com/inward/record.url?scp=85146264131&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85146264131&partnerID=8YFLogxK
U2 - 10.23919/APSIPAASC55919.2022.9979884
DO - 10.23919/APSIPAASC55919.2022.9979884
M3 - Conference contribution
AN - SCOPUS:85146264131
T3 - Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
SP - 1085
EP - 1090
BT - Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
Y2 - 7 November 2022 through 10 November 2022
ER -