Detecting malicious websites by learning IP address features

Daiki Chiba, Kazuhiro Tobe, Tatsuya Mori, Shigeki Goto

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    11 Citations (Scopus)

    Abstract

    Web-based malware attacks have become one of the most serious threats that need to be addressed urgently. Several approaches to detecting such malware have attracted attention as promising, including the use of various blacklists. However, these conventional approaches often fail to detect new attacks owing to the versatility of malicious websites. Thus, it is difficult to maintain up-to-date blacklists with information regarding new malicious websites. To tackle this problem, we propose a new method for detecting malicious websites using the characteristics of IP addresses. Our approach leverages the empirical observation that IP addresses are more stable than other metrics such as URLs and DNS names. While the strings that form URLs or domain names are highly variable, IP addresses are less variable, i.e., the IPv4 address space is mapped onto 4-byte strings. We develop a lightweight and scalable detection scheme based on machine learning. The aim of this study is not to provide a single solution that effectively detects web-based malware but to develop a technique that compensates for the drawbacks of existing approaches. We validate the effectiveness of our approach by using real IP address data from existing blacklists and real traffic data on a campus network. The results demonstrate that our method can expand the coverage and accuracy of existing blacklists and also detect unknown malicious websites that are not covered by conventional approaches.
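
The abstract notes that an IPv4 address is a 4-byte string and that it serves as input to a machine learning scheme, but it does not name the features or the learning algorithm. The sketch below is therefore only an illustration under stated assumptions: the octet and bit features, the longest-shared-prefix 1-nearest-neighbour rule, and all function names are hypothetical choices, not the authors' method. The addresses come from the documentation ranges reserved by RFC 5737, and the labels are invented for the example.

```python
import ipaddress


def octet_features(ip: str) -> list:
    """Return the four octets of an IPv4 address as integers in 0..255."""
    return list(ipaddress.IPv4Address(ip).packed)


def bit_features(ip: str) -> list:
    """Return the 32 bits of an IPv4 address, most significant bit first."""
    value = int(ipaddress.IPv4Address(ip))
    return [(value >> shift) & 1 for shift in range(31, -1, -1)]


def shared_prefix_len(a: str, b: str) -> int:
    """Count how many leading bits two IPv4 addresses have in common."""
    count = 0
    for x, y in zip(bit_features(a), bit_features(b)):
        if x != y:
            break
        count += 1
    return count


def classify(ip: str, labelled: list) -> str:
    """Assumed toy learner: 1-nearest-neighbour by longest shared bit prefix.

    `labelled` is a list of (address, label) pairs. Ties go to the
    earliest entry in the list.
    """
    _, label = max(labelled, key=lambda item: shared_prefix_len(ip, item[0]))
    return label


# Invented training data on RFC 5737 documentation addresses.
TRAINING = [
    ("192.0.2.10", "malicious"),
    ("192.0.2.77", "malicious"),
    ("198.51.100.5", "benign"),
    ("203.0.113.9", "benign"),
]

if __name__ == "__main__":
    print(octet_features("192.0.2.10"))
    print(classify("192.0.2.200", TRAINING))
    print(classify("198.51.100.200", TRAINING))
```

The point of the sketch is only that a fixed-length representation makes nearby addresses comparable: an unseen address in the same /24 block as known malicious addresses inherits their label, which is one way a learned model could extend a blacklist beyond its exact entries.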

    Original language: English
    Title of host publication: Proceedings - 2012 IEEE/IPSJ 12th International Symposium on Applications and the Internet, SAINT 2012
    Pages: 29-39
    Number of pages: 11
    DOIs: https://doi.org/10.1109/SAINT.2012.14
    Publication status: Published - 2012
    Event: 2012 IEEE/IPSJ 12th International Symposium on Applications and the Internet, SAINT 2012 - Izmir
    Duration: 2012 Jul 16 to 2012 Jul 20

    Other

    Other: 2012 IEEE/IPSJ 12th International Symposium on Applications and the Internet, SAINT 2012
    City: Izmir
    Period: 12/7/16 to 12/7/20

    Fingerprint

    Websites
    World Wide Web
    Learning systems
    Malware

    Keywords

    • Blacklist
    • Drive-by-download
    • IP address
    • Machine learning
    • Web-based malware

    ASJC Scopus subject areas

    • Computer Networks and Communications

    Cite this

    Chiba, D., Tobe, K., Mori, T., & Goto, S. (2012). Detecting malicious websites by learning IP address features. In Proceedings - 2012 IEEE/IPSJ 12th International Symposium on Applications and the Internet, SAINT 2012 (pp. 29-39). [6305258] https://doi.org/10.1109/SAINT.2012.14

    @inproceedings{40b376c43ff840af84673dc85dfd949c,
    title = "Detecting malicious websites by learning IP address features",
    keywords = "Blacklist, Drive-by-download, IP address, Machine learning, Web-based malware",
    author = "Daiki Chiba and Kazuhiro Tobe and Tatsuya Mori and Shigeki Goto",
    year = "2012",
    doi = "10.1109/SAINT.2012.14",
    language = "English",
    isbn = "9780769547374",
    pages = "29--39",
    booktitle = "Proceedings - 2012 IEEE/IPSJ 12th International Symposium on Applications and the Internet, SAINT 2012",

    }

    TY - GEN

    T1 - Detecting malicious websites by learning IP address features

    AU - Chiba, Daiki

    AU - Tobe, Kazuhiro

    AU - Mori, Tatsuya

    AU - Goto, Shigeki

    PY - 2012

    Y1 - 2012

    KW - Blacklist

    KW - Drive-by-download

    KW - IP address

    KW - Machine learning

    KW - Web-based malware

    UR - http://www.scopus.com/inward/record.url?scp=84867977527&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84867977527&partnerID=8YFLogxK

    U2 - 10.1109/SAINT.2012.14

    DO - 10.1109/SAINT.2012.14

    M3 - Conference contribution

    AN - SCOPUS:84867977527

    SN - 9780769547374

    SP - 29

    EP - 39

    BT - Proceedings - 2012 IEEE/IPSJ 12th International Symposium on Applications and the Internet, SAINT 2012

    ER -