Extracting essential features from a sequence of security data is not easy; it requires skilled security experts to dig valuable information out of enormous volumes of data. We propose a novel neural network structure, named filterNN, to extract essential features from text-based sequential data obtained from the dynamic analysis of malware. The filterNN contains a structure that explicitly points out which parts of the sequence (i.e., subsequences) are more important than the others for the subsequent classification task. Thus, security experts can quickly identify the characteristics of the malware samples in a malware family and further identify the behavior shared across that family. The proposed filterNN is a framework that can be adapted to different NN classifiers (e.g., SLFN, CNN, and RNN). We evaluate filterNN with real-world malware data. The experiment shows that filterNN can remove useless features from the sequential data while still maintaining high classification accuracy.
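The idea of marking which positions of a sequence matter before classification can be illustrated with a minimal sketch. This is not the filterNN architecture, which the abstract does not specify; it assumes a simple per-position sigmoid gate with a fixed threshold. The function `filter_sequence`, the example trace, and the score values are all hypothetical, and in a trained model the scores would be learned together with the downstream classifier.

```python
import math


def sigmoid(x):
    """Map a real-valued score to a gate value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def filter_sequence(tokens, scores, threshold=0.5):
    """Keep the tokens whose gate value exceeds the threshold.

    tokens: list of strings, e.g. calls recorded during dynamic analysis.
    scores: one real-valued importance score per token (assumed here;
            a trained model would learn these).
    Returns (kept, gates), where kept is a list of (position, token).
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    gates = [sigmoid(s) for s in scores]
    kept = [(i, t) for i, (t, g) in enumerate(zip(tokens, gates))
            if g > threshold]
    return kept, gates


# Made-up trace and scores, for illustration only.
trace = ["NtOpenFile", "NtReadFile", "NtClose", "NtCreateKey", "NtSetValueKey"]
scores = [-2.0, -1.5, -3.0, 2.5, 3.0]

kept, gates = filter_sequence(trace, scores)
print(kept)  # positions and tokens that survive the filter
```

The surviving `(position, token)` pairs are what an analyst would inspect, and the same filtered sequence could be passed to any classifier, which is the sense in which such a filter is independent of the classifier type.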
Hsiao, Shun-Wen and Lee, Yi-Jen, "NN-Based Feature Selection for Text-Based Sequential Data" (2020). PACIS 2020 Proceedings. 238.