An improved symbolic aggregate approximation distance measure based on its statistical features

Chaw Thet Zan, Hayato Yamana

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    5 Citations (Scopus)

    Abstract

    The challenges of efficient data representation and similarity measurement for massive amounts of time series have an enormous impact on many applications. This paper addresses an improvement of Symbolic Aggregate approXimation (SAX), one of the efficient representations for time series mining. Because SAX represents its symbols by the average (mean) value of a segment under the assumption of a Gaussian distribution, it is insufficient to capture the entire deterministic information and sometimes causes incorrect results in time series classification. In this work, the SAX representation and distance measure are improved by adding another moment of the prior distribution, the standard deviation; SAX SD is proposed. We provide a comprehensive analysis of the proposed SAX SD and confirm both the highest classification accuracy and the highest dimensionality reduction ratio on University of California, Riverside (UCR) datasets in comparison to state-of-the-art methods such as SAX, Extended SAX (ESAX) and SAX Trend Distance (SAX TD).
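To make the abstract's point concrete, the following is a minimal Python sketch of classic SAX (z-normalization, per-segment means, Gaussian breakpoints, and the MINDIST lower bound), extended so that each segment also carries its standard deviation. The function names and the way the standard deviation enters the distance (a squared SD difference added per segment) are assumptions made for illustration; the abstract does not give the formula, so this should not be read as the authors' SAX SD definition.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def split_segments(series, w):
    """Split a series into w equal-length segments (simplification of this sketch)."""
    if len(series) % w != 0:
        raise ValueError("series length must be a multiple of w in this sketch")
    size = len(series) // w
    return [series[k * size:(k + 1) * size] for k in range(w)]


def breakpoints(a):
    """Return the a-1 cut points dividing N(0, 1) into a equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / a) for k in range(1, a)]


def sax_with_sd(series, w, a):
    """Return (symbols, sds): one symbol index (0..a-1) and one SD per segment."""
    cuts = breakpoints(a)
    segs = split_segments(znormalize(series), w)
    seg_means = [mean(seg) for seg in segs]
    symbols = [sum(m > c for c in cuts) for m in seg_means]
    sds = [pstdev(seg) for seg in segs]
    return symbols, sds


def symbol_gap(i, j, cuts):
    """Classic SAX lookup: 0 for equal or adjacent symbols, else the breakpoint gap."""
    if abs(i - j) <= 1:
        return 0.0
    return cuts[max(i, j) - 1] - cuts[min(i, j)]


def mindist(sym_q, sym_c, n, a):
    """Classic SAX MINDIST between two symbol sequences of original length n."""
    cuts = breakpoints(a)
    w = len(sym_q)
    total = sum(symbol_gap(i, j, cuts) ** 2 for i, j in zip(sym_q, sym_c))
    return sqrt(n / w) * sqrt(total)


def sd_augmented_dist(word_q, word_c, n, a):
    """ASSUMPTION (illustrative only): add the squared per-segment SD difference
    to the squared symbol gap. This is not taken from the paper."""
    (sym_q, sd_q), (sym_c, sd_c) = word_q, word_c
    cuts = breakpoints(a)
    w = len(sym_q)
    total = sum(
        symbol_gap(i, j, cuts) ** 2 + (s - t) ** 2
        for i, j, s, t in zip(sym_q, sym_c, sd_q, sd_c)
    )
    return sqrt(n / w) * sqrt(total)


# Two series whose segment means coincide but whose within-segment spread differs.
x = [1, -1, 1, -1, 2, -2, 2, -2]
y = [2, -2, 2, -2, 1, -1, 1, -1]
word_x = sax_with_sd(x, w=2, a=4)
word_y = sax_with_sd(y, w=2, a=4)
print("mean-only MINDIST:", mindist(word_x[0], word_y[0], n=len(x), a=4))
print("SD-augmented distance:", sd_augmented_dist(word_x, word_y, n=len(x), a=4))
```

In this toy case the two series map to the same SAX word, so a mean-only distance cannot separate them, while the per-segment standard deviations still differ. That is the kind of information loss the abstract attributes to mean-only symbols.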

    Original language: English
    Title of host publication: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings
    Publisher: Association for Computing Machinery
    Pages: 72-80
    Number of pages: 9
    Volume: Part F126325
    ISBN (Electronic): 9781450348072
    DOIs: https://doi.org/10.1145/3011141.3011146
    Publication status: Published - 2016 Nov 28
    Event: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Singapore, Singapore
    Duration: 2016 Nov 28 → 2016 Nov 30

    Other

    Other: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016
    Country: Singapore
    City: Singapore
    Period: 16/11/28 → 16/11/30

    Fingerprint

    Time series
    Gaussian distribution

    Keywords

    • Classification
    • Dimension reduction
    • Statistical features
    • Symbolic representation
    • Time series

    ASJC Scopus subject areas

    • Human-Computer Interaction
    • Computer Networks and Communications
    • Computer Vision and Pattern Recognition
    • Software

    Cite this

    Thet Zan, C., & Yamana, H. (2016). An improved symbolic aggregate approximation distance measure based on its statistical features. In 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings (Vol. Part F126325, pp. 72-80). Association for Computing Machinery. https://doi.org/10.1145/3011141.3011146

    An improved symbolic aggregate approximation distance measure based on its statistical features. / Thet Zan, Chaw; Yamana, Hayato.

    18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings. Vol. Part F126325 Association for Computing Machinery, 2016. p. 72-80.

    Thet Zan, C & Yamana, H 2016, An improved symbolic aggregate approximation distance measure based on its statistical features. in 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings. vol. Part F126325, Association for Computing Machinery, pp. 72-80, 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016, Singapore, Singapore, 16/11/28. https://doi.org/10.1145/3011141.3011146
    Thet Zan C, Yamana H. An improved symbolic aggregate approximation distance measure based on its statistical features. In 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings. Vol. Part F126325. Association for Computing Machinery. 2016. p. 72-80 https://doi.org/10.1145/3011141.3011146
    Thet Zan, Chaw ; Yamana, Hayato. / An improved symbolic aggregate approximation distance measure based on its statistical features. 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings. Vol. Part F126325 Association for Computing Machinery, 2016. pp. 72-80
    @inproceedings{e463480e5a894a55b02b5a5cd5af1fbb,
    title = "An improved symbolic aggregate approximation distance measure based on its statistical features",
    abstract = "The challenges of efficient data representation and similarity measurement for massive amounts of time series have an enormous impact on many applications. This paper addresses an improvement of Symbolic Aggregate approXimation (SAX), one of the efficient representations for time series mining. Because SAX represents its symbols by the average (mean) value of a segment under the assumption of a Gaussian distribution, it is insufficient to capture the entire deterministic information and sometimes causes incorrect results in time series classification. In this work, the SAX representation and distance measure are improved by adding another moment of the prior distribution, the standard deviation; SAX SD is proposed. We provide a comprehensive analysis of the proposed SAX SD and confirm both the highest classification accuracy and the highest dimensionality reduction ratio on University of California, Riverside (UCR) datasets in comparison to state-of-the-art methods such as SAX, Extended SAX (ESAX) and SAX Trend Distance (SAX TD).",
    keywords = "Classification, Dimension reduction, Statistical features, Symbolic representation, Time series",
    author = "{Thet Zan}, Chaw and Hayato Yamana",
    year = "2016",
    month = "11",
    day = "28",
    doi = "10.1145/3011141.3011146",
    language = "English",
    volume = "Part F126325",
    pages = "72--80",
    booktitle = "18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings",
    publisher = "Association for Computing Machinery",

    }

    TY - GEN

    T1 - An improved symbolic aggregate approximation distance measure based on its statistical features

    AU - Thet Zan, Chaw

    AU - Yamana, Hayato

    PY - 2016/11/28

    Y1 - 2016/11/28

    N2 - The challenges of efficient data representation and similarity measurement for massive amounts of time series have an enormous impact on many applications. This paper addresses an improvement of Symbolic Aggregate approXimation (SAX), one of the efficient representations for time series mining. Because SAX represents its symbols by the average (mean) value of a segment under the assumption of a Gaussian distribution, it is insufficient to capture the entire deterministic information and sometimes causes incorrect results in time series classification. In this work, the SAX representation and distance measure are improved by adding another moment of the prior distribution, the standard deviation; SAX SD is proposed. We provide a comprehensive analysis of the proposed SAX SD and confirm both the highest classification accuracy and the highest dimensionality reduction ratio on University of California, Riverside (UCR) datasets in comparison to state-of-the-art methods such as SAX, Extended SAX (ESAX) and SAX Trend Distance (SAX TD).

    AB - The challenges of efficient data representation and similarity measurement for massive amounts of time series have an enormous impact on many applications. This paper addresses an improvement of Symbolic Aggregate approXimation (SAX), one of the efficient representations for time series mining. Because SAX represents its symbols by the average (mean) value of a segment under the assumption of a Gaussian distribution, it is insufficient to capture the entire deterministic information and sometimes causes incorrect results in time series classification. In this work, the SAX representation and distance measure are improved by adding another moment of the prior distribution, the standard deviation; SAX SD is proposed. We provide a comprehensive analysis of the proposed SAX SD and confirm both the highest classification accuracy and the highest dimensionality reduction ratio on University of California, Riverside (UCR) datasets in comparison to state-of-the-art methods such as SAX, Extended SAX (ESAX) and SAX Trend Distance (SAX TD).

    KW - Classification

    KW - Dimension reduction

    KW - Statistical features

    KW - Symbolic representation

    KW - Time series

    UR - http://www.scopus.com/inward/record.url?scp=85014891332&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85014891332&partnerID=8YFLogxK

    U2 - 10.1145/3011141.3011146

    DO - 10.1145/3011141.3011146

    M3 - Conference contribution

    VL - Part F126325

    SP - 72

    EP - 80

    BT - 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings

    PB - Association for Computing Machinery

    ER -