Identifying heavy-hitter flows from sampled flow statistics

Tatsuya Mori, Tetsuya Takine, Jianping Pan, Ryoichi Kawahara, Masato Uchida, Shigeki Goto

    Research output: Contribution to journal (Article)

    35 Citations (Scopus)

    Abstract

    With the rapid increase of link speed in recent years, packet sampling has become a very attractive and scalable means of collecting flow statistics; however, it also makes inferring original flow characteristics much more difficult. In this paper, we develop techniques and schemes to identify flows with a very large number of packets (also known as heavy-hitter flows) from sampled flow statistics. Our approach follows a two-stage strategy: we first parametrically estimate the original flow length distribution from sampled flows. We then identify heavy-hitter flows with Bayes' theorem, where the flow length distribution estimated at the first stage is used as an a priori distribution. Our approach is validated and evaluated with publicly available packet traces. We show that our approach provides a very flexible framework for striking an appropriate balance between false positives and false negatives for a given sampling frequency.
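
    The second stage described in the abstract can be illustrated with a minimal sketch. The abstract does not state the likelihood model or the parametric family of the prior, so everything below is an assumption: each packet is taken to be sampled independently with probability f (a binomial likelihood), the prior is a truncated discrete power law with a hypothetical shape parameter, and all function names are invented for illustration. The sketch computes the posterior probability that a flow's original length is at least a heavy-hitter threshold given its sampled packet count, then finds the smallest sampled count that reaches a chosen confidence.

```python
import math


def binom_pmf(y, x, f):
    """P(Y = y | X = x): y of x packets sampled, each independently with probability f.

    Assumes 0 < f < 1. Computed in log space to avoid overflow in the binomial coefficient.
    """
    if y < 0 or y > x:
        return 0.0
    log_p = (math.lgamma(x + 1) - math.lgamma(y + 1) - math.lgamma(x - y + 1)
             + y * math.log(f) + (x - y) * math.log1p(-f))
    return math.exp(log_p)


def power_law_prior(x_max, shape):
    """Hypothetical prior: P(X = x) proportional to x^-(shape + 1) for x = 1..x_max.

    Stands in for the flow length distribution estimated in the first stage.
    Element i of the returned list holds P(X = i + 1).
    """
    weights = [x ** -(shape + 1.0) for x in range(1, x_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_heavy(y, f, prior, threshold):
    """P(X >= threshold | Y = y) by Bayes' theorem over the discrete prior."""
    num = 0.0
    den = 0.0
    for i, p in enumerate(prior):
        x = i + 1
        joint = binom_pmf(y, x, f) * p  # P(Y = y | X = x) * P(X = x)
        den += joint
        if x >= threshold:
            num += joint
    return num / den if den > 0.0 else 0.0


def min_sampled_count(f, prior, threshold, confidence):
    """Smallest sampled count y with P(X >= threshold | Y = y) >= confidence, else None."""
    for y in range(1, len(prior) + 1):
        if posterior_heavy(y, f, prior, threshold) >= confidence:
            return y
    return None


if __name__ == "__main__":
    prior = power_law_prior(x_max=5000, shape=1.0)  # illustrative parameters only
    y_hat = min_sampled_count(f=0.01, prior=prior, threshold=1000, confidence=0.9)
    print("flag flows with at least", y_hat, "sampled packets")
```

    Under these assumptions, raising the confidence level raises the required sampled count, which reduces false positives at the cost of more false negatives; lowering it does the opposite. This is one way to read the trade-off the abstract mentions.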

    Original language: English
    Pages (from-to): 3061-3072
    Number of pages: 12
    Journal: IEICE Transactions on Communications
    Volume: E90-B
    Issue number: 11
    DOI: 10.1093/ietcom/e90-b.11.3061
    Publication status: Published - 2007

    Fingerprint

    Statistics
    Sampling

    Keywords

    • A priori distribution
    • Bayes' theorem
    • Flow statistics
    • Network measurement
    • Packet sampling

    ASJC Scopus subject areas

    • Electrical and Electronic Engineering
    • Computer Networks and Communications
    • Software

    Cite this

    Identifying heavy-hitter flows from sampled flow statistics. / Mori, Tatsuya; Takine, Tetsuya; Pan, Jianping; Kawahara, Ryoichi; Uchida, Masato; Goto, Shigeki.

    In: IEICE Transactions on Communications, Vol. E90-B, No. 11, 2007, p. 3061-3072.

    @article{3f58c75d6aed4204a1bc19abca48b145,
    title = "Identifying heavy-hitter flows from sampled flow statistics",
    keywords = "A priori distribution, Bayes' theorem, Flow statistics, Network measurement, Packet sampling",
    author = "Tatsuya Mori and Tetsuya Takine and Jianping Pan and Ryoichi Kawahara and Masato Uchida and Shigeki Goto",
    year = "2007",
    doi = "10.1093/ietcom/e90-b.11.3061",
    language = "English",
    volume = "E90-B",
    pages = "3061--3072",
    journal = "IEICE Transactions on Communications",
    issn = "0916-8516",
    publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
    number = "11"
    }

    TY - JOUR
    T1 - Identifying heavy-hitter flows from sampled flow statistics
    AU - Mori, Tatsuya
    AU - Takine, Tetsuya
    AU - Pan, Jianping
    AU - Kawahara, Ryoichi
    AU - Uchida, Masato
    AU - Goto, Shigeki
    PY - 2007
    Y1 - 2007
    KW - A priori distribution
    KW - Bayes' theorem
    KW - Flow statistics
    KW - Network measurement
    KW - Packet sampling
    UR - http://www.scopus.com/inward/record.url?scp=51849152825&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=51849152825&partnerID=8YFLogxK
    U2 - 10.1093/ietcom/e90-b.11.3061
    DO - 10.1093/ietcom/e90-b.11.3061
    M3 - Article
    AN - SCOPUS:51849152825
    VL - E90-B
    SP - 3061
    EP - 3072
    JO - IEICE Transactions on Communications
    JF - IEICE Transactions on Communications
    SN - 0916-8516
    IS - 11
    ER -