TY - JOUR
T1 - Enhanced Adversarial Defense Model with Vector Compression and Ensemble Learning
AU - Baek, Seungyeon
AU - Jeong, Byeonghui
AU - Jeon, Jueun
AU - Jeong, Young Sik
N1 - Publisher Copyright: © This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
PY - 2025
Y1 - 2025
N2 - Deep learning (DL)-based classifiers in malware detection systems effectively analyze complex and diverse malicious behavior patterns to detect the growing number of cyber threats with high accuracy. However, due to their sensitivity to small changes in input data, DL-based classifiers are unable to detect adversarial malware that injects tiny perturbations into portable executable files to evade detection by the classifier. Furthermore, traditional adversarial defense techniques rely on adversarial training and are unable to respond to new perturbations. Therefore, in this study, we propose a vector compression and ensemble learning (VeCoEL) scheme that preserves sequential semantics while mitigating the impact of perturbations to detect adversarial malware, normal malware, and benign files with high accuracy. First, VeCoEL converts six high-dimensional features extracted by hybrid analysis into embedding vectors. Then, the vector elements for each feature symbol are compressed by an arithmetic coding algorithm to reduce the influence of perturbation. Finally, the stacking ensemble model analyzes the characteristics of the compressed sequential patterns for each feature and detects malicious behavior with high accuracy. We evaluate the performance of VeCoEL on two malware datasets and find that the average detection accuracy and average evasion rate are 97.14% and 2.53%, respectively.
KW - Adversarial Defense
KW - Malware Detection
KW - Stacking Ensemble Learning
KW - Vector Compression
UR - https://www.scopus.com/pages/publications/105016684088
U2 - 10.22967/HCIS.2025.15.056
DO - 10.22967/HCIS.2025.15.056
M3 - Article
AN - SCOPUS:105016684088
SN - 2192-1962
VL - 15
JO - Human-centric Computing and Information Sciences
JF - Human-centric Computing and Information Sciences
M1 - 56
ER -