Disclaimer: Nothing in this blog is related to the author’s day-to-day work. The content is neither affiliated with nor sponsored by any company.

Link to the Chinese version https://toooold.com/2021/09/05/hide_malware_ann.html

Deep neural networks keep growing in size, and many of their parameters are redundant, so a clever attacker can change a few of them to hide malware in a stealthy way. In this blog, we introduce a paper on hiding arbitrary binaries inside deep neural networks, explain why the technique works, and discuss detection and countermeasures.

A brilliant idea

This blog presents an interesting paper, Wang et al., “EvilModel: Hiding Malware Inside of Neural Network Models”. It combines deep learning and malware knowledge, leveraging the float-precision trick in neural networks to hide arbitrary malware binaries (payloads) while keeping accuracy stable. The method is broadly applicable, simple, practical, and stealthy enough to offer new ideas for malware and APT detection as well as adversarial attack. The paper’s topic differs from other adversarial work on neural networks: it treats the deep neural network as a channel to hide and deliver malware, rather than using adversarial training to trick a model into misclassifying or missing detections, or burying specific patterns in the model to trigger misclassification in particular situations.

Steganography is an evergreen topic in the security field. Over the years it has produced many practical methods for hiding binary files: through file sections in image files; through punctuation and character combinations in text files; and in audio, video, and other spatially or frequency-rich media that are even better suited to steganography. With major vendors fueling a deep-model arms race, and models with tens of billions of parameters and hundreds of gigabytes on disk emerging, smart security researchers naturally wondered whether they could hide some malware in these models. That is exactly this paper’s starting point. Most companies now deploy AI models without code audits, and the big AI platforms offer high-performance, effectively unlimited GPUs; even an attacker uninterested in APT campaigns could simply mine some XMR for profit.


The EvilModel paper combines several common, seemingly unremarkable facts to conceal arbitrary binaries in a simple and effective way. The 32-bit floating-point weights of a neural network’s fully-connected (FC) layers carry a great deal of redundancy: keeping each weight’s sign and exponent byte and replacing its three low-order (mostly mantissa) bytes has marginal impact on accuracy. Even in a network with relatively few parameters, such as AlexNet, whose FC layers have only 4096 neurons each, the first FC layer’s 6400 connections per neuron can store 18.75 KB of binary data per neuron (6400 × 3 bytes). In other words, changing the weights of just one neuron is more than enough for a malware dropper, and changing the weights of more neurons can hide a full arsenal for a malware attack.
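To make the trick concrete, here is a minimal sketch (my own illustration, not code from the paper) of embedding 3 payload bytes into one float32 weight with Python’s `struct`. Note that the second byte also contains the exponent’s lowest bit, so the perturbed value can shift within roughly an order of magnitude, but it stays small:

```python
import struct

def embed3(weight: float, chunk: bytes) -> float:
    """Keep the float32's first byte (sign + most of the exponent) and
    overwrite the remaining 3 bytes with payload data."""
    assert len(chunk) == 3
    packed = struct.pack(">f", weight)          # 4 bytes, big-endian
    return struct.unpack(">f", packed[:1] + chunk)[0]

def extract3(weight: float) -> bytes:
    """Recover the 3 embedded bytes from a modified weight."""
    return struct.pack(">f", weight)[1:]

w = 0.3456789
w2 = embed3(w, b"ABC")
print(w, "->", w2)      # the perturbed weight stays on a similar scale
assert extract3(w2) == b"ABC"
```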

Similar work appears in Liu et al., “StegoNet: Turn Deep Neural Network into a Stegomalware”, but this paper goes further toward a more general and practical approach: Is the hiding effect the same for each FC layer? Can Batch Normalization (BN) help hide the payload? How can bytes be replaced effectively to maximize the hiding ratio? How can additional retraining compensate for the lost accuracy? The paper answers these questions with a “Fast Substitution” algorithm and practical results, and the method is simple: following the attacker’s predefined sampling rules, replace 3 bytes of each weight in selected FC-layer neurons, validating accuracy along the way, and stop once the loss becomes too large, e.g., when accuracy drops from 94 percent to below 90 percent after replacing more than half of the neurons. The replaced neurons’ weights are then frozen, and the remaining neurons are retrained to recover as much accuracy as possible, e.g., keeping the accuracy loss under 1 percent.
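The substitution loop can be sketched as follows. This is a simplified illustration under my own naming (`weights` as a list of neuron rows, `evaluate` as an accuracy callback); the paper’s actual Fast Substitution also covers byte-sampling rules and BN layers:

```python
import struct

def substitute(weights, payload, evaluate, acc_floor):
    """Embed `payload` 3 bytes per weight, one neuron (row) at a time;
    roll back and stop once accuracy falls below `acc_floor`.
    Returns the indices of neurons to freeze during retraining."""
    frozen, offset = [], 0
    for i, row in enumerate(weights):
        if offset >= len(payload):
            break                          # everything embedded
        backup = list(row)
        chunk = payload[offset:offset + 3 * len(row)]
        for j in range(len(chunk) // 3):
            head = struct.pack(">f", row[j])[:1]   # keep sign + exponent byte
            row[j] = struct.unpack(">f", head + chunk[3*j:3*j + 3])[0]
        if evaluate(weights) < acc_floor:
            weights[i] = backup            # too much accuracy loss: undo, stop
            break
        frozen.append(i)
        offset += len(chunk)
    return frozen
```

Retraining would then update only the neurons not in `frozen`, recovering accuracy without disturbing the embedded bytes.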

Experimental results in the paper show that small networks with few parameters and shallow layers, such as AlexNet, can deliver and fully recover 38.7 MB of arbitrary malware binaries, nearly enough to provide the entire payload for an advanced intrusion, whereas deeper networks with more parameters, such as VGG and ResNet, can embed even larger payloads. The approach works for any network with a fully-connected (FC) layer, whether trained in-house or provided by a third party.

Why such unreasonable effectiveness

Neural networks, like other carriers used in steganography, have a high level of information redundancy. The deeper and more complex the network structure, the more redundant it becomes, as the search space it represents expands exponentially. In search-space terms, each parameter in a neural network is equivalent to one dimension of a “yes or no” decision, so N parameters form an N-dimensional cube with 2^N vertices. A network like ResNet50, considered small and shallow by today’s standards, still has 25 million parameters, i.e., roughly 2^25,000,000 vertices, not to mention hyperscale networks with tens of billions of parameters. A small fluctuation in any one coordinate of this hyperspace, such as a 3-byte change in a 32-bit weight, has a negligible impact on the search space as a whole.

Weight pruning is worth introducing here: it determines which parts of a network are redundant and eliminates them. The idea traces back to LeCun et al.’s late-1980s paper “Optimal Brain Damage.” In that paper, the approach is to compute the saliency of each weight after training, defined as the effect a perturbation of that weight has on the loss function, then set the lowest-saliency weights to zero and retrain. Weight pruning has since attracted a great deal of research and has some excellent open-source implementations; see the references, and have fun with the weight-pruning tutorial in PyTorch.
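For intuition, here is a minimal magnitude-based pruning sketch of my own, using |w| as a cheap stand-in for OBD’s loss-based saliency (OBD proper estimates saliency from second-order loss terms; PyTorch ships the library version of this idea as `torch.nn.utils.prune`):

```python
def magnitude_prune(weights, fraction):
    """Zero out the given fraction of weights with the smallest |w|.
    `weights` is a list of rows; magnitude approximates saliency here."""
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(round(len(flat) * fraction))
    if k == 0:
        return [list(row) for row in weights]
    threshold = flat[k - 1]                 # k-th smallest magnitude
    return [[0.0 if abs(w) <= threshold else w for w in row]
            for row in weights]
```

A pruned-then-retrained model makes the redundancy argument tangible: a large share of weights can be zeroed with little accuracy loss, which is the same slack EvilModel fills with payload bytes.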

Detection and adversarial attack

From a security perspective, the paper’s approach implements the “Delivery” stage of the kill-chain model through the deployment channel of deep neural network models: it simply hides the payload in the deployed model file without any additional network callback, so general detection approaches for payload delivery still apply. Furthermore, the malicious binary must be extracted from the neural network and executed, and that extraction and execution code can be caught by code auditing.
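As one illustrative scanning heuristic (my own speculation, not a method from the paper): the low mantissa bytes of benign trained weights tend to look close to uniform, while structured binaries such as PE files often contain long zero runs, so a defender might flag FC layers whose low-byte entropy is unusually far from 8 bits per byte:

```python
import math
import struct

def low_byte_entropy(weights) -> float:
    """Shannon entropy (bits per byte) over the 3 low bytes of each
    float32 weight; values far below 8 may hint at structured content."""
    counts = [0] * 256
    for w in weights:
        for b in struct.pack(">f", float(w))[1:]:   # skip sign/exponent byte
            counts[b] += 1
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)
```

This is only a heuristic: an attacker can defeat it by compressing or encrypting the payload before embedding, which pushes its bytes back toward uniform.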

From a neural network perspective, the paper suggests two defenses: fine-tune the weights and continue training so that the embedded binary can no longer be recovered, or verify the model’s integrity and obtain it only from reliable sources. However, counter-countermeasures are not hard to find. The paper’s experiments did not use redundant encoding of the binary, so a fine-tune that perturbs even a few bytes makes the source file unrecoverable; but a neural network can host a file large enough that an attacker willing to sacrifice some space can add encoding redundancy to the binary and withstand fine-tuning. Fine-tuning also usually touches only the few layers closest to softmax, so the attacker can deliberately avoid those layers and hide the payload in earlier FC layers. The attacker can even train against the defense, choosing suitable adversarial (GAN-style) objectives so that the model tolerates perturbation of the weights at the attacker’s chosen locations. As for model integrity, the various deep learning frameworks and the ONNX format currently provide only basic file-integrity checks, the network’s structure and weights are stored in plaintext, and deployment pipelines frequently ignore DevSecOps practices such as code auditing, whether the model is built in-house, is a poisoned third-party model, or is a pre-trained network with malicious code published on GitHub. Through any of these channels, an attacker can easily deliver the payload.
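To illustrate the redundancy countermeasure, here is a toy repetition code with majority-vote decoding; a real attacker would more plausibly use a proper error-correcting code such as Reed–Solomon, but the principle is the same:

```python
from collections import Counter

def encode_rep(data: bytes, n: int = 3) -> bytes:
    """Repeat every byte n times so isolated corruptions can be outvoted."""
    return bytes(b for byte in data for b in (byte,) * n)

def decode_rep(data: bytes, n: int = 3) -> bytes:
    """Majority-vote each group of n copies back into one byte."""
    return bytes(Counter(data[i:i + n]).most_common(1)[0][0]
                 for i in range(0, len(data), n))

enc = bytearray(encode_rep(b"payload"))
enc[4] ^= 0xFF                  # simulate one byte damaged by fine-tuning
assert decode_rep(bytes(enc)) == b"payload"
```

The cost is an n-fold blow-up of the payload, which the embedding capacities reported in the paper can easily absorb.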


The parameter redundancy exploited in the paper is fundamental neural network knowledge, and we encourage all readers to become better acquainted with the theory behind the algorithms. Many security researchers claim to be experts in deep learning models, few data scientists and engineers who work on deep learning understand much about security, and even fewer people master both at once. It would be ideal if security researchers could learn more of the theoretical underpinnings of algorithms and machine learning, enabling more multidisciplinary results.