TY - GEN
T1 - An improved symbolic aggregate approximation distance measure based on its statistical features
AU - Thet Zan, Chaw
AU - Yamana, Hayato
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/11/28
Y1 - 2016/11/28
N2 - The challenges in efficient data representation and similarity measures on massive amounts of time series have enormous impact on many applications. This paper addresses an improvement on Symbolic Aggregate approXimation (SAX), which is one of the efficient representations for time series mining. Because SAX represents its symbols by the average (mean) value of a segment with the assumption of Gaussian distribution, it is insufficient to serve the entire deterministic information and sometimes causes incorrect results in time series classification. In this work, the SAX representation and distance measure are improved with the addition of another moment of the prior distribution, standard deviation; SAX SD is proposed. We provide comprehensive analysis for the proposed SAX SD and confirm both the highest classification accuracy and the highest dimensionality reduction ratio on University of California, Riverside (UCR) datasets in comparison to state-of-the-art methods such as SAX, Extended SAX (ESAX) and SAX Trend Distance (SAX TD).
AB - The challenges in efficient data representation and similarity measures on massive amounts of time series have enormous impact on many applications. This paper addresses an improvement on Symbolic Aggregate approXimation (SAX), which is one of the efficient representations for time series mining. Because SAX represents its symbols by the average (mean) value of a segment with the assumption of Gaussian distribution, it is insufficient to serve the entire deterministic information and sometimes causes incorrect results in time series classification. In this work, the SAX representation and distance measure are improved with the addition of another moment of the prior distribution, standard deviation; SAX SD is proposed. We provide comprehensive analysis for the proposed SAX SD and confirm both the highest classification accuracy and the highest dimensionality reduction ratio on University of California, Riverside (UCR) datasets in comparison to state-of-the-art methods such as SAX, Extended SAX (ESAX) and SAX Trend Distance (SAX TD).
KW - Classification
KW - Dimension reduction
KW - Statistical features
KW - Symbolic representation
KW - Time series
UR - http://www.scopus.com/inward/record.url?scp=85014891332&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85014891332&partnerID=8YFLogxK
U2 - 10.1145/3011141.3011146
DO - 10.1145/3011141.3011146
M3 - Conference contribution
AN - SCOPUS:85014891332
T3 - ACM International Conference Proceeding Series
SP - 72
EP - 80
BT - 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings
A2 - Indrawan-Santiago, Maria
A2 - Anderst-Kotsis, Gabriele
A2 - Steinbauer, Matthias
A2 - Khalil, Ismail
PB - Association for Computing Machinery
T2 - 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016
Y2 - 28 November 2016 through 30 November 2016
ER -