Abstract
To automate malware analysis, dynamic malware analysis systems have attracted increasing attention from both industry and the research community. Of the various logs collected by such systems, API call logs are a very promising source of information for characterizing malware behavior. This work aims to extract similar malware samples automatically using the concept of 'API call topics,' each of which represents a set of API calls that is intrinsic to a specific group of malware samples. We first convert Win32 API calls into 'API words.' We then apply non-negative matrix factorization (NMF) clustering analysis to the corpus of extracted API words. NMF automatically generates the API call topics from the API words. The contributions of this work can be summarized as follows. We present an unsupervised approach to extract API call topics from a large corpus of API calls. Through analysis of the API call logs collected from thousands of malware samples, we demonstrate that the extracted API call topics can detect similar malware samples. The proposed approach is expected to be useful for automating the analysis of the huge volume of logs collected from dynamic malware analysis systems.
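The pipeline described in the abstract (API calls → API words → NMF → API call topics) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how API words are encoded, so `api_words` here simply lowercases the call name, the four toy logs are invented for illustration, and the factorization uses the standard multiplicative update rules for the Frobenius-norm objective in pure Python. All function names are hypothetical.

```python
import random

def api_words(log):
    # Assumption: an "API word" is just the lowercased call name.
    return [call.strip().lower() for call in log]

def count_matrix(docs):
    # Build a samples x vocabulary matrix of API word counts.
    vocab = sorted({w for d in docs for w in d})
    index = {w: j for j, w in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in docs]
    for i, d in enumerate(docs):
        for w in d:
            V[i][index[w]] += 1.0
    return V, vocab

def transpose(A):
    return [list(r) for r in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def nmf(V, k, iters=500, seed=0, eps=1e-9):
    # Factor V (n x m) into non-negative W (n x k) and H (k x m).
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

# Invented toy logs: two file/registry-heavy samples, two network-heavy ones.
logs = [
    ["CreateFileW", "WriteFile", "WriteFile", "RegSetValueExW"],
    ["CreateFileW", "WriteFile", "RegSetValueExW", "RegSetValueExW"],
    ["socket", "connect", "send", "send", "recv"],
    ["socket", "connect", "send", "recv", "recv"],
]
docs = [api_words(log) for log in logs]
V, vocab = count_matrix(docs)
W, H = nmf(V, k=2)

# Each row of H is a topic (weights over API words); each row of W says
# how strongly a sample expresses each topic. Cluster by dominant topic.
labels = [max(range(len(row)), key=row.__getitem__) for row in W]
for a, topic in enumerate(H):
    top = sorted(zip(topic, vocab), reverse=True)[:3]
    print("topic", a, [w for _, w in top])
print("sample -> topic:", labels)
```

Samples that share a dominant topic in `W` are treated as similar. On real logs one would normally use a tested NMF implementation and sparse matrices rather than these list-based helpers.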
Original language | English |
---|---|
Title of host publication | 2015 12th Annual IEEE Consumer Communications and Networking Conference, CCNC 2015 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 140-147 |
Number of pages | 8 |
ISBN (Print) | 9781479963904 |
Publication status | Published - 2015 Jul 14 |
Event | 2015 12th Annual IEEE Consumer Communications and Networking Conference, CCNC 2015 - Las Vegas, United States. Duration: 2015 Jan 9 → 2015 Jan 12 |
Other
Other | 2015 12th Annual IEEE Consumer Communications and Networking Conference, CCNC 2015 |
---|---|
Country | United States |
City | Las Vegas |
Period | 2015 Jan 9 → 2015 Jan 12 |
ASJC Scopus subject areas
- Computer Networks and Communications