A Comparative Study of Deep Learning and Machine Learning Models for Malware Detection

Location

Hager-Lubbers Exhibition Hall

Description

PURPOSE: Artificial intelligence has become an area of interest for many professionals, with an emphasis on how it can help cybersecurity practitioners defuse malware threats. Machine learning and deep learning are subsets of artificial intelligence that have been leveraged for this task. This project examines both to provide a comparative analysis of these methods. METHODS AND MATERIALS: In this study, we build data mining pipelines that demonstrate the efficacy of binary classifiers in detecting malware using traditional machine learning and deep learning. ANALYSES: A one-way ANOVA test at the 0.05 significance level gave a p-value close to zero, indicating significant differences among the model accuracies. We then compared the models' performance using additional metrics such as F1 score, precision, and recall. RESULTS: In our traditional machine learning experiments, all algorithms exhibited promising detection results of nearly 99%. The best-performing model was the Gradient Boosting classifier, with an accuracy of 99.73%. In cross-validation experiments, XGBoost was the best-performing algorithm, with an accuracy of 99.05%. In our deep learning experiments, the CNN ensemble gave the best performance, with an accuracy of 99.13%. Cross-validation marginally improved the CNN accuracy to 99.15%. CONCLUSIONS: These results demonstrate that the proposed binary classification scheme is effective in malware detection.
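
The abstract does not give the computations behind the reported metrics or the ANOVA test, so the following is only a minimal sketch of how such quantities are commonly calculated. It assumes labels are encoded as 1 for malware and 0 for benign, and that each model contributes a list of per-fold accuracies; the function names `classification_metrics` and `one_way_anova_f` are hypothetical and are not taken from the study.

```python
from statistics import mean


def classification_metrics(y_true, y_pred):
    """Return (precision, recall, F1) for the positive class.

    Assumption: 1 means malware and 0 means benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic.

    `groups` is a list of lists, one list of accuracies per model.
    Assumes at least two groups and non-zero within-group variance.
    """
    k = len(groups)                       # number of models
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

On toy inputs that are unrelated to the study's data, `one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])` evaluates to 3.0. Turning the F statistic into a p-value requires the F distribution with (k - 1, n - k) degrees of freedom; `scipy.stats.f_oneway` returns both the statistic and the p-value if SciPy is available.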

Apr 23rd, 3:00 PM