Microsoft Office Doc Malware Analysis - Nasty Macros

Perhaps the most prevalent malware delivery method in the wild right now is the weaponized Office document.  This sample is a Word doc that appears to download an Emotet variant.  The download happens through malicious macros and PowerShell.  Let's take it apart!

I started by identifying which streams within the document contain macros, so I know what to extract.  oledump identified streams 8 and 14 as having embedded macros.


Next, I carved out the macros to get a look at the actual code.


I then copied the macro code into Notepad++ for a friendlier look.  The Autoopen() function is always interesting: it runs automatically when the document is opened, provided macros are enabled.


Taking a peek around the code, I noticed a ton of variables that were defined but never used, pointless arithmetic, and generally a lot of noise/garbage code.  This strongly suggests that an automated obfuscator was run against the original source code of this malware.  Notice how the variable below appears only once in the entire code.  Defined, but never used.
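
A rough way to automate that "defined but never used" check is to count how often each declared name shows up.  The Python sketch below is only an illustration: the macro fragment, the variable names, and the helper functions are all made up, and it ignores VBA comments and other corner cases.

```python
import re
from collections import Counter

# Hypothetical macro fragment for illustration; not taken from the sample.
MACRO = '''
Sub Autoopen()
    Dim junkA As Integer
    Dim junkB As String
    Dim payload As String
    junkB = "noise"
    payload = "abc" + "def"
    Shell payload
End Sub
'''

# A name counts as "declared" if a line starts with `Dim name As` or `name =`.
DECLARATION = re.compile(
    r'^\s*(?:Dim\s+)?([A-Za-z_]\w*)\b\s*(?:=|As\b)',
    flags=re.MULTILINE | re.IGNORECASE,
)

def strip_strings(source):
    """Blank out string literals so words inside them are not counted."""
    return re.sub(r'"[^"\n]*"', '""', source)

def declared_names(source):
    """Return declared or assigned names, de-duplicated, in order of appearance."""
    names = []
    for name in DECLARATION.findall(strip_strings(source)):
        if name not in names:
            names.append(name)
    return names

def single_use_names(source):
    """Return declared names that occur exactly once in the whole macro."""
    tokens = re.findall(r'[A-Za-z_]\w*', strip_strings(source))
    counts = Counter(token.lower() for token in tokens)  # VBA is case-insensitive
    return [name for name in declared_names(source) if counts[name.lower()] == 1]

print(single_use_names(MACRO))
```

Names flagged this way are good candidates for deletion, but it is worth eyeballing each one before removing it.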


I cleaned up the script by deleting all the nonsense code, leaving behind what appears to be a very long base64 string.  All the remaining code does is concatenate the strings below into one giant string, which appears to be passed to PowerShell.  Notice the 'vbKeyP' ("P") being concatenated with "owers" + "Hell -e...".  The strings are broken up like this to evade signature-based detection and to slow down analysis.


This code is almost valid Python as is, so I pretty much just added a print() call at the end and let Python put this long string together for me.
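
For anyone who would rather not hand macro text to an interpreter, the same concatenation can be resolved by a small parser that never executes anything.  This is a sketch of a different route than the print() approach; the fragments are placeholders rather than the sample's real strings, and it assumes the macro turns vbKeyP (key code 80) into a character with Chr().

```python
import re

# vbKeyP is VBA's key-code constant 80, which is also the ASCII code for "P".
VBA_CONSTANTS = {"vbKeyP": 80}

# One token: a string literal, a Chr(...) call, a + or & operator, or whitespace.
TOKEN = re.compile(r'"((?:[^"]|"")*)"|Chr\(\s*(\w+)\s*\)|([+&])|(\s+)', re.IGNORECASE)

def join_vba_strings(expr):
    """Resolve a VBA string concatenation expression without executing it."""
    parts = []
    pos = 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unsupported token at offset {pos}: {expr[pos:pos + 20]!r}")
        literal, chr_arg, _operator, _space = match.groups()
        if literal is not None:
            # VBA escapes a double quote inside a literal by doubling it.
            parts.append(literal.replace('""', '"'))
        elif chr_arg is not None:
            code = VBA_CONSTANTS[chr_arg] if chr_arg in VBA_CONSTANTS else int(chr_arg)
            parts.append(chr(code))
        pos = match.end()
    return "".join(parts)

# Placeholder fragments, shaped like the ones described in the macro.
demo = 'Chr(vbKeyP) + "owers" + "Hell -e " + "QQBC" & "AEMA"'
print(join_vba_strings(demo))
```

Anything the parser does not recognize raises an error instead of being silently evaluated, which is the point of doing it this way.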


Here is the fully concatenated string.

I then piped the output of the script to the base64 --decode command to decode it.
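
One detail worth knowing at this step: PowerShell's -EncodedCommand parameter (which -e abbreviates) expects base64 of a UTF-16LE string, so raw base64 --decode output has a null byte after every ASCII character.  A minimal Python sketch that handles both steps, assuming the payload is a standard encoded command; the sample string here is a harmless made-up one.

```python
import base64

def decode_powershell_encoded_command(b64_text):
    """Decode a PowerShell -EncodedCommand argument into readable text."""
    raw = base64.b64decode(b64_text)
    # -EncodedCommand payloads are UTF-16LE, not ASCII or UTF-8.
    return raw.decode("utf-16-le")

# Round trip with a harmless, made-up command (not from the sample).
sample = base64.b64encode("Write-Host 'hello'".encode("utf-16-le")).decode("ascii")
print(decode_powershell_encoded_command(sample))
```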


As you can see, this malware contains yet another layer of obfuscation.  The decoded script holds another base64 string, except this one was compressed before being encoded.  If we only decode the base64, we get illegible output because the underlying data is still compressed.


To decompress this without running the macro, I copied the PowerShell script into PowerShell ISE and assigned the result of the decompression function to a variable.  I then printed out the decompressed script with the Write-Host cmdlet.  Here is a better look at that long decoded base64 string.


Here is how I set the result of the decompression function to a variable, then printed those results with Write-Host.
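
The same layer can also be unpacked offline without PowerShell.  The sketch below assumes the script uses raw Deflate (what .NET's DeflateStream produces); this write-up does not say which compression class the sample used, so if it turns out to be gzip, wbits would need to be 31 rather than -15.  The demo payload is made up.

```python
import base64
import zlib

def inflate_base64(b64_text):
    """Base64-decode, then inflate a raw Deflate stream into text."""
    compressed = base64.b64decode(b64_text)
    # wbits=-15 means a raw Deflate stream with no zlib or gzip header (assumption).
    data = zlib.decompress(compressed, -15)
    return data.decode("utf-8", errors="replace")

def raw_deflate(data):
    """Helper for the demo only: compress bytes as a raw Deflate stream."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()

# Made-up inner script standing in for the real second layer.
demo = base64.b64encode(raw_deflate(b"Write-Host 'inner script'")).decode("ascii")
print(inflate_base64(demo))
```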


We can now take the decoded contents and clean them up in Notepad++.
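
Once the script is readable, pulling out the download locations can be scripted too.  A small sketch, assuming the URLs appear as plain http(s) strings in the decoded text; the demo script and the example.com/example.org addresses are placeholders, not indicators from this sample.

```python
import re

# Stop a URL at whitespace, quotes, or common separators used in script text.
URL_PATTERN = re.compile(r"""https?://[^\s'"<>,;)]+""", re.IGNORECASE)

def extract_urls(script_text):
    """Return unique URLs found in the text, in order of first appearance."""
    urls = []
    for url in URL_PATTERN.findall(script_text):
        if url not in urls:
            urls.append(url)
    return urls

# Placeholder script text, loosely shaped like a downloader's URL list.
demo_script = "$u='http://example.com/a.exe,http://example.org/b.exe'.Split(',');"
for url in extract_urls(demo_script):
    print(url)
```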


So the malware appears to call out over port 80 to download an executable, drop it in the temp directory, and then run it.  Other methods we could've used to analyze this malware include opening the malicious Office doc and enabling macros while running system monitoring/network sniffing tools, or debugging the malicious document in developer mode in Microsoft Word.  Static deobfuscation/reverse engineering generally provides the most visibility and the best look under the hood.  Thanks for reading!
