Inferring popularity of domain names with DNS traffic: exploiting cache timeout heuristics

Akihiro Shimoda, Keisuke Ishibashi, Kazumichi Sato, Masayuki Tsujino, Takeru Inoue, Masaki Shimura, Takanori Takebe, Kazuki Takahashi, Tatsuya Mori, Shigeki Goto

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    Popularity ranking of Internet services is an important metric for network operators because it enables mid- to long-term planning of their network facilities and root cause analysis of unexpected traffic. Service-oriented traffic monitoring is very helpful for inferring popularity and has therefore attracted considerable attention from both researchers and practitioners. Lately, identifying the service behind a given flow has become very difficult due to the rapid growth of CDNs and/or encrypted traffic, so some research works have used the preceding DNS traffic as a hint. However, because of the DNS cache mechanism, the DNS message count deviates from the actual number of flows, which can greatly degrade ranking reliability. We propose a theoretical model for inferring the number of user accesses per domain name by exploiting the characteristics of the DNS message count. To the best of our knowledge, this paper is the first attempt to formulate the effect of users' stub resolvers; previous studies focused on analyzing the effect of cache servers. We evaluated the precision of our model with a real dataset containing the traffic of thousands of users. By analyzing the top 50 domain names by number of users, we can infer the number of flows within a 24% error rate on average for 42 out of 50 FQDNs.
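
    The gap between DNS message count and actual accesses that the abstract describes can be illustrated with a small sketch. This is not the model proposed in the paper, which is not reproduced on this page. It is a textbook simplification that assumes a single client-side cache with a fixed TTL and Poisson-distributed accesses, in which case the query rate is λ / (1 + λ·TTL). All function names and parameter values below are hypothetical.

```python
import random


def expected_query_rate(access_rate, ttl):
    """Queries/sec leaving one TTL cache, assuming Poisson accesses (assumption)."""
    return access_rate / (1.0 + access_rate * ttl)


def infer_access_rate(query_rate, ttl):
    """Invert the relation above; only valid while query_rate * ttl < 1."""
    if query_rate * ttl >= 1.0:
        raise ValueError("query rate is at or above the 1/TTL ceiling")
    return query_rate / (1.0 - query_rate * ttl)


def simulate_queries(access_rate, ttl, duration, seed=0):
    """Count accesses and the DNS queries they trigger for one client cache."""
    rng = random.Random(seed)
    t = 0.0
    expires_at = 0.0  # cache starts empty
    accesses = 0
    queries = 0
    while True:
        t += rng.expovariate(access_rate)  # next access time
        if t > duration:
            break
        accesses += 1
        if t >= expires_at:  # cache miss: a DNS query is sent
            queries += 1
            expires_at = t + ttl
    return accesses, queries


if __name__ == "__main__":
    # Hypothetical values: one access every 20 s on average, 60 s TTL.
    accesses, queries = simulate_queries(0.05, 60.0, 200_000.0)
    observed_rate = queries / 200_000.0
    print("accesses:", accesses, "queries:", queries)
    print("inferred access rate:", infer_access_rate(observed_rate, 60.0))
```

    Under these assumptions the query rate saturates near 1/TTL as accesses become frequent, so the inversion grows increasingly sensitive to small counting errors for popular names.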

    Original language: English
    Title of host publication: 2015 IEEE Global Communications Conference, GLOBECOM 2015
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    ISBN (Print): 9781479959525
    DOI: https://doi.org/10.1109/GLOCOM.2014.7417638
    Publication status: Published - 2016 Feb 23
    Event: 58th IEEE Global Communications Conference, GLOBECOM 2015 - San Diego, United States
    Duration: 2015 Dec 6 to 2015 Dec 10

    Other

    Other: 58th IEEE Global Communications Conference, GLOBECOM 2015
    Country: United States
    City: San Diego
    Period: 15/12/6 to 15/12/10

    Fingerprint

    popularity
    heuristics
    traffic
    Servers
    Internet
    ranking
    Planning
    Monitoring
    cause

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Electrical and Electronic Engineering
    • Communication

    Cite this

    Shimoda, A., Ishibashi, K., Sato, K., Tsujino, M., Inoue, T., Shimura, M., ... Goto, S. (2016). Inferring popularity of domain names with DNS traffic: exploiting cache timeout heuristics. In 2015 IEEE Global Communications Conference, GLOBECOM 2015 [7417638] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/GLOCOM.2014.7417638

    @inproceedings{f3e7ee534ad4427192a789b4f30fd28e,
    title = "Inferring popularity of domain names with DNS traffic: exploiting cache timeout heuristics",
    abstract = "Popularity ranking of Internet services is an important metric for network operators because it enables mid- to long-term planning of their network facilities and root cause analysis of unexpected traffic. Service-oriented traffic monitoring is very helpful for inferring popularity and has therefore attracted considerable attention from both researchers and practitioners. Lately, identifying the service behind a given flow has become very difficult due to the rapid growth of CDNs and/or encrypted traffic, so some research works have used the preceding DNS traffic as a hint. However, because of the DNS cache mechanism, the DNS message count deviates from the actual number of flows, which can greatly degrade ranking reliability. We propose a theoretical model for inferring the number of user accesses per domain name by exploiting the characteristics of the DNS message count. To the best of our knowledge, this paper is the first attempt to formulate the effect of users' stub resolvers; previous studies focused on analyzing the effect of cache servers. We evaluated the precision of our model with a real dataset containing the traffic of thousands of users. By analyzing the top 50 domain names by number of users, we can infer the number of flows within a 24{\%} error rate on average for 42 out of 50 FQDNs.",
    author = "Akihiro Shimoda and Keisuke Ishibashi and Kazumichi Sato and Masayuki Tsujino and Takeru Inoue and Masaki Shimura and Takanori Takebe and Kazuki Takahashi and Tatsuya Mori and Shigeki Goto",
    year = "2016",
    month = "2",
    day = "23",
    doi = "10.1109/GLOCOM.2014.7417638",
    language = "English",
    isbn = "9781479959525",
    booktitle = "2015 IEEE Global Communications Conference, GLOBECOM 2015",
    publisher = "Institute of Electrical and Electronics Engineers Inc.",

    }

    TY - GEN

    T1 - Inferring popularity of domain names with DNS traffic

    T2 - exploiting cache timeout heuristics

    AU - Shimoda, Akihiro

    AU - Ishibashi, Keisuke

    AU - Sato, Kazumichi

    AU - Tsujino, Masayuki

    AU - Inoue, Takeru

    AU - Shimura, Masaki

    AU - Takebe, Takanori

    AU - Takahashi, Kazuki

    AU - Mori, Tatsuya

    AU - Goto, Shigeki

    PY - 2016/2/23

    Y1 - 2016/2/23

    N2 - Popularity ranking of Internet services is an important metric for network operators because it enables mid- to long-term planning of their network facilities and root cause analysis of unexpected traffic. Service-oriented traffic monitoring is very helpful for inferring popularity and has therefore attracted considerable attention from both researchers and practitioners. Lately, identifying the service behind a given flow has become very difficult due to the rapid growth of CDNs and/or encrypted traffic, so some research works have used the preceding DNS traffic as a hint. However, because of the DNS cache mechanism, the DNS message count deviates from the actual number of flows, which can greatly degrade ranking reliability. We propose a theoretical model for inferring the number of user accesses per domain name by exploiting the characteristics of the DNS message count. To the best of our knowledge, this paper is the first attempt to formulate the effect of users' stub resolvers; previous studies focused on analyzing the effect of cache servers. We evaluated the precision of our model with a real dataset containing the traffic of thousands of users. By analyzing the top 50 domain names by number of users, we can infer the number of flows within a 24% error rate on average for 42 out of 50 FQDNs.

    UR - http://www.scopus.com/inward/record.url?scp=84964896480&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84964896480&partnerID=8YFLogxK

    U2 - 10.1109/GLOCOM.2014.7417638

    DO - 10.1109/GLOCOM.2014.7417638

    M3 - Conference contribution

    AN - SCOPUS:84964896480

    SN - 9781479959525

    BT - 2015 IEEE Global Communications Conference, GLOBECOM 2015

    PB - Institute of Electrical and Electronics Engineers Inc.

    ER -