TY - GEN
T1 - Automatic scoring method for open answer task in the SJ-CAT speaking test considering utterance difficulty level
AU - Lu, Hao
AU - Yamada, Takeshi
AU - Imai, Shingo
AU - Shinozaki, Takahiro
AU - Nisimura, Ryuichi
AU - Ishizuka, Kenkichi
AU - Makino, Shoji
AU - Kitawaki, Nobuhiko
N1 - Publisher Copyright: © 2014 Asia-Pacific Signal and Information Processing Ass.
PY - 2014/2/12
Y1 - 2014/2/12
N2 - In this paper, we propose an automatic scoring method for the open answer task of the Japanese speaking test SJ-CAT. The proposed method first extracts a set of features from an input answer utterance and then uses support vector regression (SVR) to estimate the vocabulary richness score assigned by human raters, which ranges from 0 to 4. We devised a novel set of features: text statistics weighted by word reliability, which assess the abundance of vocabulary and expression, and the degree of word relevance based on hierarchical distance in a thesaurus, which evaluates the suitability of the vocabulary. We confirmed experimentally that the proposed method provides good estimates of the human richness score, with a correlation coefficient of 0.92 and a root mean square error (RMSE) of 0.56. We also showed that the proposed method is relatively robust to differences among examinees and among questions used for training and testing.
AB - In this paper, we propose an automatic scoring method for the open answer task of the Japanese speaking test SJ-CAT. The proposed method first extracts a set of features from an input answer utterance and then uses support vector regression (SVR) to estimate the vocabulary richness score assigned by human raters, which ranges from 0 to 4. We devised a novel set of features: text statistics weighted by word reliability, which assess the abundance of vocabulary and expression, and the degree of word relevance based on hierarchical distance in a thesaurus, which evaluates the suitability of the vocabulary. We confirmed experimentally that the proposed method provides good estimates of the human richness score, with a correlation coefficient of 0.92 and a root mean square error (RMSE) of 0.56. We also showed that the proposed method is relatively robust to differences among examinees and among questions used for training and testing.
UR - http://www.scopus.com/inward/record.url?scp=84949924127&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84949924127&partnerID=8YFLogxK
U2 - 10.1109/APSIPA.2014.7041583
DO - 10.1109/APSIPA.2014.7041583
M3 - Conference contribution
AN - SCOPUS:84949924127
T3 - 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014
BT - 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014
Y2 - 9 December 2014 through 12 December 2014
ER -