Abstract
As software is now used in a wide range of areas, software security has become a crucial issue. Third-party libraries, which play a major role in software development, make it difficult to analyze and test software security. Identifying the major weaknesses in software requires knowing the variables it uses and the data type of each variable. However, because third-party libraries are generally distributed as binary code, the variables, their data types, and the syntactic and semantic information of the source code have been removed. Reconstructing the variables and their data types from binary code is therefore the most important step in weakness analysis. Traditionally, this reconstruction step is based on pattern matching, which is limited in its ability to infer data types. We propose a method that uses deep learning to infer the data types of variables identified by pattern matching in binary code, and we analyze its performance. The proposed method improves the feature generation procedure to resolve inconsistencies in the features generated in previous studies. As a result, prediction accuracy for float and double improves by an average of 7.2% compared to the previous study, and overall accuracy increases by 5.1%.
| Original language | English |
|---|---|
| Pages (from-to) | 1044-1052 |
| Number of pages | 9 |
| Journal | Future Generation Computer Systems |
| Volume | 100 |
| State | Published - Nov 2019 |
Keywords
- Binary code
- Data type inference
- Long short-term memory
- Reconstruction data information
- Software weakness
Title
A data type inference method based on long short-term memory by improved feature for weakness analysis in binary code