TY - GEN
T1 - A Review of Data Representation Methods for Vulnerability Mining Using Deep Learning
AU - Li, Ying
AU - Gu, Mianxue
AU - Sun, Hongyu
AU - Lin, Yuhao
AU - Yue, Qiuling
AU - Guo, Zhen
AU - Hu, Jinglu
AU - Wang, He
AU - Zhang, Yuqing
N1 - Funding Information:
This work was supported by the Key Research and Development Science and Technology of Hainan Province (ZDYF202012), the National Key Research and Development Program of China (2018YFB0804701), and the National Natural Science Foundation of China (U1836210).
Publisher Copyright:
© 2022, Springer Nature Singapore Pte Ltd.
PY - 2022
Y1 - 2022
AB - The rapid development of software has brought unprecedented challenges to software security. Traditional vulnerability mining methods are difficult to apply to large-scale software systems because of drawbacks such as reliance on manual inspection, low efficiency, and high false positive and false negative rates. Recent research has applied deep learning models to vulnerability mining and has made good progress in the field. In this paper, we analyze the deep learning model framework applied to vulnerability mining and summarize its overall workflow and technology. We then give a detailed analysis of five feature extraction methods for vulnerability mining: the sequence characterization-based method, the abstract syntax tree-based method, the graph-based method, the text-based method, and the mixed characterization-based method. In addition, we summarize their advantages and disadvantages from the perspectives of single and mixed feature extraction methods. Finally, we point out future research trends and prospects.
KW - Data representation
KW - Deep learning
KW - Vulnerability mining
UR - http://www.scopus.com/inward/record.url?scp=85126253976&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126253976&partnerID=8YFLogxK
U2 - 10.1007/978-981-19-0523-0_22
DO - 10.1007/978-981-19-0523-0_22
M3 - Conference contribution
AN - SCOPUS:85126253976
SN - 9789811905223
T3 - Communications in Computer and Information Science
SP - 342
EP - 351
BT - Frontiers in Cyber Security - 4th International Conference, FCS 2021, Revised Selected Papers
A2 - Cao, Chunjie
A2 - Zhang, Yuqing
A2 - Hong, Yuan
A2 - Wang, Ding
PB - Springer Science and Business Media Deutschland GmbH
T2 - 4th International Conference on Frontiers in Cyber Security, FCS 2021
Y2 - 17 December 2021 through 19 December 2021
ER -