According to Microsoft and Intel, STAMINA is less complicated than it first appears. In the first step, the researchers converted an input malware file from binary form into raw pixel data. Next, the team reshaped the one-dimensional pixel stream into a 2D image compatible with image analysis algorithms. The width and height of each image were chosen according to file size (see table below). Once the pixel stream was converted to 2D, the image was resized to smaller dimensions. The resized images were then fed into a pre-trained deep neural network (DNN), which scans each image and classifies it as either infected or clean. Microsoft says it tested STAMINA on 2.2 million infected portable executable (PE) file hashes. In those tests, STAMINA achieved 99.07% accuracy in identifying and classifying malware samples, with a false positive rate of 2.58%. “The results certainly encourage the use of deep transfer learning for the purpose of malware classification,” said Jugal Parikh and Marc Marino, the two Microsoft researchers who participated in the research on behalf of the Microsoft Threat Protection Intelligence Team.
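To make the preprocessing steps above concrete, here is a minimal Python sketch of the byte-to-image idea. It is not the researchers' code. The function names, the width thresholds in `choose_width`, the 224-pixel output size, and the nearest-neighbour resizing are all assumptions chosen for illustration; the actual size table and resizing method may differ. The classification step by the pre-trained DNN is left out.

```python
import math

def bytes_to_pixels(data):
    """Treat every byte of the file as one grayscale pixel (0-255)."""
    return list(data)

def choose_width(num_pixels):
    """Pick an image width from the pixel count.

    Hypothetical thresholds for illustration only; they are NOT the
    table used in the STAMINA research.
    """
    for limit, width in ((10_000, 32), (100_000, 256), (1_000_000, 512)):
        if num_pixels < limit:
            return width
    return 1024

def reshape_to_2d(pixels, width):
    """Cut the 1-D pixel stream into rows of `width`, zero-padding the last row."""
    height = math.ceil(len(pixels) / width)
    padded = pixels + [0] * (height * width - len(pixels))
    return [padded[r * width:(r + 1) * width] for r in range(height)]

def resize_nearest(image, out_h, out_w):
    """Resize a 2-D list of pixels with nearest-neighbour sampling (a simplification)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def file_bytes_to_image(data, out_size=224):
    """Run the whole sketch: bytes -> pixels -> 2-D image -> fixed-size image."""
    if not data:
        raise ValueError("cannot build an image from an empty file")
    pixels = bytes_to_pixels(data)
    image = reshape_to_2d(pixels, choose_width(len(pixels)))
    return resize_nearest(image, out_size, out_size)

# Usage: any bytes object works, for example the contents of a file
# read with open(path, "rb").read().
sample = bytes(range(256)) * 8
image = file_bytes_to_image(sample)
```

The resulting fixed-size grid is the kind of input an image classifier expects. A real pipeline would more likely use an imaging library for the resize step and a smoother interpolation method.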
Moving Forward
Deep learning, a branch of machine learning (ML) within artificial intelligence (AI), underpins the STAMINA method; it lets a neural network learn distinguishing features directly from data rather than relying on hand-written rules. However, Microsoft admits the tool struggles with larger files. “For bigger size applications, STAMINA becomes less effective due to limitations in converting billions of pixels into JPEG images and then resizing them,” Microsoft said in a blog post last week. While it’s clearly early days, if its success continues, STAMINA could one day be implemented across Microsoft’s business to help detect malware. Microsoft says its access to vast amounts of data from Windows (Microsoft) Defender puts it in a good position to train the service. “Anybody can build a model, but the labeled data and the quantity of it and the quality of it, really helps train the machine learning models appropriately and hence defines how effective they are going to be,” Ganacharya said. “And we, at Microsoft, have that as an advantage because we do have sensors that are bringing us lots of interesting signals through email, through identity, through the endpoint, and being able to combine them.”