
To encode malware behavior reports in a form accessible to automatic analysis methods such as data mining and machine learning, we propose a lightweight malware behavior representation named BBIS (Bytes-Based Instruction Set), which uses a minimal number of single-byte characters to represent the items in dynamic behavior reports. BBIS can build flexible mapping tables for different application scenarios. Experiments show that, compared with existing methods, BBIS significantly reduces computation and storage costs while maintaining clustering performance. Moreover, a method called CHRL (Compression of High Repetitions in Logarithmic level) is introduced to compress frequently seen repetitions in unexpected API call sequences. In combination with BBIS, CHRL significantly reduces the size of behavior reports further, and consequently the computation time, while maintaining or improving the performance of subsequent malware analysis such as clustering.
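The two ideas described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_mapping`, `encode`, and `compress_runs` are hypothetical, the mapping table is simply assigned in first-seen order, and the run-shortening rule (a run of n identical bytes is kept as floor(log2(n)) + 1 copies) is only one assumed reading of "logarithmic level".

```python
from itertools import groupby


def build_mapping(reports):
    """Assign each distinct report item (e.g. an API name) one byte value.

    Assumption: values are handed out in first-seen order; the actual
    BBIS mapping tables may be built differently per scenario.
    """
    mapping = {}
    for report in reports:
        for item in report:
            if item not in mapping:
                if len(mapping) >= 256:
                    raise ValueError("more than 256 distinct items; one byte is not enough")
                mapping[item] = len(mapping)
    return mapping


def encode(report, mapping):
    """Encode a report (a sequence of items) as a bytes object, one byte per item."""
    return bytes(mapping[item] for item in report)


def compress_runs(encoded):
    """Shorten each run of n identical bytes to floor(log2(n)) + 1 copies.

    Assumption: this rule is a stand-in for CHRL, chosen only to show
    logarithmic growth of the kept run length.
    """
    out = bytearray()
    for value, run in groupby(encoded):
        n = sum(1 for _ in run)
        # int.bit_length() equals floor(log2(n)) + 1 for n >= 1.
        out.extend([value] * n.bit_length())
    return bytes(out)


# Hypothetical report: one open, eight repeated reads, one close.
report = ["NtOpenFile"] + ["NtReadFile"] * 8 + ["NtClose"]
mapping = build_mapping([report])
encoded = encode(report, mapping)
compressed = compress_runs(encoded)
```

Under these assumptions, the ten-item report becomes ten bytes after encoding and six bytes after run shortening, since the run of eight reads is kept as four copies. Short runs are left almost untouched, while very long runs shrink sharply, which is the kind of effect the abstract attributes to CHRL.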
