A PCA analysis of daily unwanted traffic

Kensuke Fukuda, Toshio Hirotsu, Osamu Akashi, Toshiharu Sugawara

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    8 Citations (Scopus)

    Abstract

    This paper investigates the macroscopic behavior of unwanted traffic (e.g., viruses, worms, backscatter from (D)DoS or misconfiguration) passing through the Internet. The data set consists of unwanted packets measured at a /18 darknet in Japan from Oct. 2006 to Apr. 2009, a period that included the recent Conficker outbreak. Traffic behavior is quantified by the entropy of ten packet features (e.g., the 5-tuple). We then apply PCA (principal component analysis) to the ten-dimensional entropy time series matrix to obtain a suitable representation of unwanted traffic. PCA is a well-studied method for separating normal and anomalous behavior in Internet backbone traffic; however, few studies have applied it to darknet traffic. We first demonstrate the high variability of the entropy time series for the ten packet features. Next, we show that the top four principal components are sufficient to describe the original traffic behavior. In particular, the first component can be interpreted as the type of unwanted traffic (i.e., worm/virus or scanning), and the second as the difference in communication patterns (e.g., one-to-many or many-to-one). These two components account for 63.8% of the total variance of the original data set. On the other hand, outliers in the higher components indicate the presence of specific anomalies, although most of the data mapped to those components show less variability. Furthermore, we show that the scatter plot of the first and second principal component scores provides a better view of the macroscopic behavior of unwanted traffic.
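
The pipeline the abstract describes (per-day entropy of ten packet features, then PCA on the resulting matrix) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the packet data are synthetic random integers, the helper names (`shannon_entropy`, `daily_entropy_matrix`, `pca`) are hypothetical, and centering the matrix without rescaling is an assumption, since the abstract does not say how the entropy series were normalized.

```python
import numpy as np

N_FEATURES = 10  # the abstract uses ten packet features; which ten is not listed there


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    _, counts = np.unique(np.asarray(values), return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * np.log2(1.0 / p)))


def daily_entropy_matrix(days):
    """Map a list of (n_packets, n_features) arrays to an (n_days, n_features) entropy matrix."""
    return np.array(
        [[shannon_entropy(day[:, j]) for j in range(day.shape[1])] for day in days]
    )


def pca(X):
    """PCA via SVD of the column-centered matrix (assumption: no variance scaling)."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # fraction of total variance per component
    scores = Xc @ Vt.T               # each day projected onto the components
    return Vt, scores, explained


# Synthetic stand-in for darknet packets: 60 days, 500 packets per day.
# Each day draws feature values from an alphabet of random size so entropy varies.
rng = np.random.default_rng(0)
days = [
    rng.integers(0, rng.integers(2, 50), size=(500, N_FEATURES))
    for _ in range(60)
]

H = daily_entropy_matrix(days)
components, scores, explained = pca(H)

# Share of total variance captured by the first two components
# (the paper reports 63.8% for its real data; synthetic data will differ).
print("PC1+PC2 explained variance:", explained[:2].sum())
```

Plotting `scores[:, 0]` against `scores[:, 1]` would give the kind of first-versus-second component scatter plot the abstract refers to.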

    Original language: English
    Title of host publication: Proceedings - International Conference on Advanced Information Networking and Applications, AINA
    Pages: 377-384
    Number of pages: 8
    DOIs: https://doi.org/10.1109/AINA.2010.79
    Publication status: Published - 2010
    Event: 24th IEEE International Conference on Advanced Information Networking and Applications, AINA2010 - Perth, WA
    Duration: 2010 Apr 20 to 2010 Apr 23

    Other

    Other: 24th IEEE International Conference on Advanced Information Networking and Applications, AINA2010
    City: Perth, WA
    Period: 10/4/20 to 10/4/23

    Fingerprint

    Principal component analysis
    Entropy
    Viruses
    Time series
    Internet
    Scanning
    Communication

    ASJC Scopus subject areas

    • Engineering(all)

    Cite this

    Fukuda, K., Hirotsu, T., Akashi, O., & Sugawara, T. (2010). A PCA analysis of daily unwanted traffic. In Proceedings - International Conference on Advanced Information Networking and Applications, AINA (pp. 377-384). [5474726] https://doi.org/10.1109/AINA.2010.79

    @inproceedings{e6ba1b63ff0c49329203ddf2bac11271,
    title = "A PCA analysis of daily unwanted traffic",
    author = "Kensuke Fukuda and Toshio Hirotsu and Osamu Akashi and Toshiharu Sugawara",
    year = "2010",
    doi = "10.1109/AINA.2010.79",
    language = "English",
    isbn = "9780769540184",
    pages = "377--384",
    booktitle = "Proceedings - International Conference on Advanced Information Networking and Applications, AINA",

    }

    TY - GEN

    T1 - A PCA analysis of daily unwanted traffic

    AU - Fukuda, Kensuke

    AU - Hirotsu, Toshio

    AU - Akashi, Osamu

    AU - Sugawara, Toshiharu

    PY - 2010

    Y1 - 2010

    UR - http://www.scopus.com/inward/record.url?scp=77954323502&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=77954323502&partnerID=8YFLogxK

    U2 - 10.1109/AINA.2010.79

    DO - 10.1109/AINA.2010.79

    M3 - Conference contribution

    AN - SCOPUS:77954323502

    SN - 9780769540184

    SP - 377

    EP - 384

    BT - Proceedings - International Conference on Advanced Information Networking and Applications, AINA

    ER -