TY - JOUR
T1 - Malware classification algorithm using advanced Word2vec-based Bi-LSTM for ground control stations
AU - Sung, Yunsick
AU - Jang, Sejun
AU - Jeong, Young Sik
AU - Park, Jong Hyuk (James J.).
N1 - Publisher Copyright:
© 2020
PY - 2020/3/1
Y1 - 2020/3/1
N2 - Recently, the Internet of Drones (IoD) has drawn attention as a way to use diverse kinds of drones for leisure, education, and other purposes. Researchers are studying how to prevent cyber-attackers from disabling drones by embedding malware in the drones and their Ground Control Stations (GCSs). Protection against malware therefore has to take into account the diverse features of drones and GCSs. Signature-based detection approaches have traditionally been used. However, because those approaches scan files only partially, some malware goes undetected. This paper proposes a novel method for finding malware in GCSs that uses a fastText model to create vectors of lower dimension than those produced by one-hot encoding, and a bidirectional LSTM model to analyze the correlation of sequential opcodes. In addition, API function names are used to increase the accuracy of classification based on the sequential opcodes. In the experiments, the Microsoft Malware Classification Challenge dataset was used, and the malware samples in the dataset were classified by family. The proposed method improved performance by 1.87% compared with a one-hot encoding-based approach. Compared with a similar decision tree-based malware detection approach, it improved performance by 0.76%.
KW - FastText
KW - Ground control station
KW - Internet of Drone
KW - Long short-term memory
KW - Malware
UR - http://www.scopus.com/inward/record.url?scp=85079174505&partnerID=8YFLogxK
U2 - 10.1016/j.comcom.2020.02.005
DO - 10.1016/j.comcom.2020.02.005
M3 - Article
AN - SCOPUS:85079174505
SN - 0140-3664
VL - 153
SP - 342
EP - 348
JO - Computer Communications
JF - Computer Communications
ER -