Statistical estimation of the names of HTTPS servers with domain name graphs

Tatsuya Mori, Takeru Inoue, Akihiro Shimoda, Kazumichi Sato, Shigeaki Harada, Keisuke Ishibashi, Shigeki Goto

    Research output: Contribution to journal, Article

    6 Citations (Scopus)

    Abstract

    The adoption of SSL/TLS to protect the privacy of web users has become increasingly common. As of September 2015, more than 68% of the top-1M websites deployed SSL/TLS to encrypt their traffic. The transition from HTTP to HTTPS poses a new challenge for network operators, who need to know the hostnames behind encrypted web traffic for various reasons. To meet this challenge, this work develops a novel framework called SFMap, which estimates the names of HTTPS servers by statistically analyzing the preceding DNS queries and responses. The SFMap framework introduces the domain name graph, which characterizes the highly dynamic and diverse nature of DNS mechanisms. This complexity arises from how DNS ecosystems are now deployed and implemented, namely canonical name tricks used by CDNs, dynamic and diverse DNS TTL settings, and incomplete and unpredictable measurements caused by the various DNS caching instances. First, we demonstrate that SFMap achieves good estimation accuracy and outperforms a state-of-the-art approach. We also aim to identify the optimal settings of the SFMap framework. Next, based on this preliminary analysis, we introduce techniques that make the SFMap framework scalable to large-scale traffic data. We validate the effectiveness of the approach using large-scale Internet traffic.
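
The idea of estimating a server name from earlier DNS answers can be illustrated with a small sketch. This is not the SFMap algorithm itself, whose details the abstract does not give. It is a simplified illustration that assumes each observed DNS response is available as a queried name, an optional CNAME chain, and a list of IP addresses, and it ignores TTLs, client identity, and timing. The class name `DomainNameGraph` and its methods are hypothetical.

```python
from collections import Counter, defaultdict


class DomainNameGraph:
    """Toy graph: each edge points from a DNS answer back to the name that led to it."""

    def __init__(self):
        # parents[node][name] = how often `name` resolved (directly) to `node`
        self.parents = defaultdict(Counter)

    def add_response(self, qname, cnames, ips):
        """Record one DNS response: qname -> CNAME chain -> IP addresses."""
        chain = [qname, *cnames]
        for parent, child in zip(chain, chain[1:]):
            self.parents[child][parent] += 1
        for ip in ips:
            self.parents[ip][chain[-1]] += 1

    def estimate(self, server_ip):
        """Walk back from an IP to the queried names, weighting by edge frequency."""
        scores = Counter()
        stack = [(server_ip, 1.0, {server_ip})]
        while stack:
            node, weight, seen = stack.pop()
            preds = self.parents.get(node)
            if not preds:
                # A node with no predecessors is a name that a client queried.
                if node != server_ip:
                    scores[node] += weight
                continue
            total = sum(preds.values())
            for parent, count in preds.items():
                if parent in seen:  # guard against CNAME loops
                    continue
                stack.append((parent, weight * count / total, seen | {parent}))
        return scores.most_common()


graph = DomainNameGraph()
graph.add_response("www.example.com", ["cdn.example.net"], ["192.0.2.10"])
graph.add_response("www.example.com", ["cdn.example.net"], ["192.0.2.10"])
graph.add_response("img.example.org", [], ["192.0.2.10"])
candidates = graph.estimate("192.0.2.10")
```

Here `candidates` ranks the hostnames that could lie behind an HTTPS flow to `192.0.2.10`, with the more frequently observed name first. Frequency weighting is only one possible scoring choice made for this sketch.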

    Original language: English
    Journal: Computer Communications
    DOIs: 10.1016/j.comcom.2016.01.013
    Publication status: Accepted/In press - 2015 Sep 30

    Fingerprint

    Servers
    Transistor transistor logic circuits
    HTTP
    Ecosystems
    Websites
    Internet

    Keywords

    • DNS
    • Graph
    • SSL/TLS
    • Traffic analysis

    ASJC Scopus subject areas

    • Computer Networks and Communications

    Cite this

    Statistical estimation of the names of HTTPS servers with domain name graphs. / Mori, Tatsuya; Inoue, Takeru; Shimoda, Akihiro; Sato, Kazumichi; Harada, Shigeaki; Ishibashi, Keisuke; Goto, Shigeki.

    In: Computer Communications, 30.09.2015.

    @article{e408b654e0a04f53ba9f8ed7deaea73f,
    title = "Statistical estimation of the names of HTTPS servers with domain name graphs",
    keywords = "DNS, Graph, SSL/TLS, Traffic analysis",
    author = "Tatsuya Mori and Takeru Inoue and Akihiro Shimoda and Kazumichi Sato and Shigeaki Harada and Keisuke Ishibashi and Shigeki Goto",
    year = "2015",
    month = "9",
    day = "30",
    doi = "10.1016/j.comcom.2016.01.013",
    language = "English",
    journal = "Computer Communications",
    issn = "0140-3664",
    publisher = "Elsevier",

    }

    TY - JOUR

    T1 - Statistical estimation of the names of HTTPS servers with domain name graphs

    AU - Mori, Tatsuya

    AU - Inoue, Takeru

    AU - Shimoda, Akihiro

    AU - Sato, Kazumichi

    AU - Harada, Shigeaki

    AU - Ishibashi, Keisuke

    AU - Goto, Shigeki

    PY - 2015/9/30

    Y1 - 2015/9/30

    KW - DNS

    KW - Graph

    KW - SSL/TLS

    KW - Traffic analysis

    UR - http://www.scopus.com/inward/record.url?scp=84959520200&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84959520200&partnerID=8YFLogxK

    U2 - 10.1016/j.comcom.2016.01.013

    DO - 10.1016/j.comcom.2016.01.013

    M3 - Article

    AN - SCOPUS:84959520200

    JO - Computer Communications

    JF - Computer Communications

    SN - 0140-3664

    ER -