Analysis of a 3-stage malware sample resulting in a DCRat infection. The initial sample contains two payloads hidden by obfuscation. This analysis demonstrates methods for manually uncovering both payloads and extracting the final obfuscated C2.
Tooling
- Detect-it-easy - Quick initial analysis of PE files
- Dnspy - Analysis, decompilation and debugging of .NET files
- Cyberchef - Interactive tool for prototyping decoders
Samples
The malware file and a copy of the decoding scripts are linked at the end of this post.
Initial Analysis
The initial file can be downloaded via Malware Bazaar and unzipped using the password "infected".
Detect-it-easy is a great tool for the initial analysis of the file.
Pe-studio is also a great option, but we personally prefer the speed and simplicity of detect-it-easy.
Detect-it-easy revealed that the sample is a 32-bit .NET-based file. It also recognized the protector Confuser(1.X).
Before proceeding, we checked the entropy graph for signs of embedded files.
We used this to determine whether the file was really DCRat, or a loader for an additional payload containing DCRat.
In our experience, large, high-entropy sections often indicate an embedded payload, which suggests that the file being analyzed is a loader.
The entropy graph revealed that a significant portion of the file has an entropy of 7.98897. (This is very high; the maximum value is 8.)
This was a strong indicator that the file was a loader and not the final DCRat payload.
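For readers who want to reproduce this kind of check outside of detect-it-easy, the figure can be approximated in a few lines of Python. This is a minimal sketch of Shannon entropy over bytes, not detect-it-easy's actual implementation; the function name shannon_entropy is our own, and the inputs below are synthetic stand-ins rather than the sample itself.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Synthetic inputs (assumptions for illustration only):
# constant data has no uncertainty, random data scores near the maximum.
print(shannon_entropy(b"A" * 4096))
print(shannon_entropy(os.urandom(65536)))
```

Encrypted or compressed payloads behave much like the random input here, which is why a large region close to 8 is a useful hint that something is embedded.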
To analyze the suspected loader, we moved on to Dnspy
Dnspy Analysis
Utilizing Dnspy, we saw that the file had been recognized as rewrwr.exe and contained references to ConfuserEx. This likely means the file is obfuscated using ConfuserEx and might be a pain to analyze.
To peek at the code being executed, we right-clicked on the rewrwr.exe name and selected "go to entry point".
This would give us a rough idea of what the actual executed code might look like.
The file immediately creates an extremely large array of unsigned integers. This could be an encrypted array of integers containing bytecodes for the next stage (further suggested by a post-array reference to a Decrypt function).
The initial array of uints was so huge that it was too large to display in Dnspy.
Given the size, we suspected this array was the reason for the extremely high entropy previously observed with detect-it-easy.
After the array, there is more code suggesting that the array's contents are decrypted, then loaded into memory with the name koi.
Given the relative simplicity of the code so far, we suspected the encryption was not complex, but we decided not to analyze it this time.
Instead, we considered two other approaches:
- Set a breakpoint after the Decrypt call and dump the result from memory.
- Set a module breakpoint to break when the new module is decrypted and loaded, then dump the result into a file.
We took the second approach, as it is reliable and useful for occasions where the location of decryption and loading isn't as easy to find. (Typically the decryption function is harder to locate, but luckily in this case it was rather simple.)
Extracting Stage 2 using Module Breakpoints
To extract stage 2, we first created a module breakpoint which would break on all module loads.
To do this, we first opened the module breakpoints window.
Debug -> Windows -> Module Breakpoints
We then created a module breakpoint with two wildcards. This will break on all new modules loaded by the malware.
We then executed the malware using the start button, accepting the default options.
Immediately, a breakpoint was hit as mscorlib.dll was being loaded into memory. This is a default library, and we ignored it by selecting Continue.
The next module loaded was the original file being analyzed, which in this case can be safely ignored.
After that, a suspicious-looking koi module was loaded into memory. (If you don't have a modules window, go to Debug -> Windows -> Modules.)
Here we could see the koi module had been loaded.
At this point, we saved the koi module to a new file using Right-Click -> Save Module.
We can then exit the debugger and move on to the koi.exe file.
Analysis of koi.exe
We can observe that koi.exe is another 32-bit .NET file containing references to the ConfuserEx obfuscator.
This time it does not seem to contain any large encrypted payloads.
Although the overall entropy is low, large portions of the graph are still suspiciously flat. This can sometimes be an indication of text-based obfuscation.
We can now go ahead and open koi.exe in Dnspy.
This time there was another rewrwr.exe name and references again to ConfuserEx.
Koi.exe does not have a defined Entry Point. Instead, we can begin analysis with the rewrwr namespace (located in the side panel). This namespace contains one class named Form1.
The Form1 class immediately called Form1_Load, which itself immediately referenced a large string that appeared to be base64 encoded.
Despite appearing to be base64, the text does not successfully decode using base64. This was an indicator that some additional tricks or obfuscation had been used.
We jumped to the end of the base64-looking data, noting that there were about 50 large strings in total, named Str1, Str2, ... all the way to Str49.
It was very likely these strings were the cause of the flat entropy graph we viewed earlier. Text-based obfuscation tends to produce lower entropy than "proper" encryption.
At the end of the data was the decoding logic, which appeared to be taking the first character from each string and adding it to a buffer.
After the buffer is filled, it is base64 decoded and loaded into memory as an additional module.
In order to confirm the theory on how the strings are decoded, we can take the first character from the first 5 strings and base64 decode the result.
This confirmed the theory of how the malware was decoding the next stage.
In order to extract the next module, we can copy out the strings and place them into a Python script.
Running this script creates a third file, which for simplicity's sake is named output.bin.
The file is recognized as a 32-bit .NET file. So the decoding was successful.
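The string-based decoding described above can be sketched roughly as follows. This is not the script from the repository linked at the end of this post. It is a self-contained illustration that assumes the loop repeats the "take one character from each string" step for every character position (an interleaving), which is our reading of the decompiled logic. The helper names, the five-string split, and the fake payload are all hypothetical.

```python
import base64


def interleave_split(text: str, n: int) -> list[str]:
    """Hypothetical encoder side: spread text across n strings, round-robin."""
    return [text[i::n] for i in range(n)]


def interleave_join(parts: list[str]) -> str:
    """Take character 0 of every string, then character 1, and so on."""
    out = []
    longest = max(len(p) for p in parts)
    for i in range(longest):
        for p in parts:
            if i < len(p):  # later strings may be one character shorter
                out.append(p[i])
    return "".join(out)


# Synthetic stand-in for the next stage: a buffer starting with the "MZ" magic.
fake_payload = b"MZ" + bytes(62)
strings = interleave_split(base64.b64encode(fake_payload).decode(), 5)

# Decoder side: rebuild the base64 text from the strings and decode it.
decoded = base64.b64decode(interleave_join(strings))
print(decoded[:2])
```

With the real sample, the list of strings would be pasted in from Dnspy instead of being generated, and the decoded bytes written to a file such as output.bin.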
Stage 3 - Analysis
We have now obtained a stage 3 file, which again is a 32-bit .NET executable.
Luckily this time, there are no references to ConfuserEx or other obfuscators.
The entropy is reasonably normal and does not contain any large flat sections that can indicate a hidden payload.
The lack of ConfuserEx and the relatively normal entropy are an indication that this may be the final payload.
Moving on to Dnspy, the file is recognized as IvTdur2zx.
Despite the lack of ConfuserEx, the namespaces and class names look obfuscated in some way.
We can jump to the Entry Point for further analysis.
The first few functions are mostly junk - but there are some interesting strings referenced throughout the code.
For example - references to a .bat script being written to disk
Since the strings were largely plaintext and not obfuscated, at this point we can use detect-it-easy to look for more interesting strings contained within the file.
This reveals a reference to DCRat, as well as some potential targeted applications (Discord, Steam, etc.).
At that point, you could probably assume the file was DCRat and an info stealer, but we wanted to continue our analysis until we had found the C2.
In the above screenshot, we noticed some interesting strings that looked like base64 encoding + gzip (the H4sIAA* is a base64-encoded gzip header).
So we attempted to analyze these using CyberChef.
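The reason H4sI is such a reliable marker can be shown directly: a gzip stream starts with the magic bytes 1f 8b followed by 08 for the deflate method, and those three bytes always base64-encode to the same four characters. The snippet below is a small sketch of that check and of the equivalent of the CyberChef "From Base64" + "Gunzip" recipe. The function names and the sample text are our own assumptions.

```python
import base64
import gzip

GZIP_MAGIC = b"\x1f\x8b\x08"  # two magic bytes + "deflate" compression method


def looks_like_b64_gzip(s: str) -> bool:
    """Cheap triage check for base64-encoded gzip data."""
    return s.startswith("H4sI")


def decode_b64_gzip(s: str) -> bytes:
    """Base64-decode, then gunzip."""
    return gzip.decompress(base64.b64decode(s))


prefix = base64.b64encode(GZIP_MAGIC).decode()
print(prefix)

# Synthetic blob for illustration (not taken from the sample).
sample = base64.b64encode(gzip.compress(b"hello config")).decode()
print(looks_like_b64_gzip(sample), decode_b64_gzip(sample))
```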
The first resulted in what appeared to be a base64 encoded + reversed string.
This was strongly hinted by the presence of == at the start.
After applying a character reverse + base64 decode, we were able to obtain a strange dictionary as well as a mutex of Wzjn9oCrsWNteRRGsQXn, plus some basic config.
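The == hint works because base64 padding can only ever appear at the end of a valid string, so padding at the start suggests the text has been reversed. A minimal sketch of the check and the decode is shown here. The function names and the demo value are hypothetical and unrelated to the real config.

```python
import base64


def looks_reversed_b64(s: str) -> bool:
    """Padding characters at the start suggest a reversed base64 string."""
    return s.startswith("=")


def reverse_b64_decode(s: str) -> bytes:
    """Reverse the characters, then base64-decode."""
    return base64.b64decode(s[::-1])


# Synthetic example: encode a short value, then reverse the encoded text.
obfuscated = base64.b64encode(b"demo").decode()[::-1]
print(obfuscated, looks_reversed_b64(obfuscated), reverse_b64_decode(obfuscated))
```

Note that the check only fires when the original encoding needed padding. A reversed string whose length happened to require none would not start with =.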
This was cool but still no C2.
We then tried to decode the second base64 blob shown by detect-it-easy, but the result was largely junk.
Attempting to reverse + base64 decode returned no results.
At this point, we decided to search for the base64 encoded string to see where it was referenced in the .NET code.
This revealed an interesting function showing multiple additional functions acting on the base64 encoded data.
In total, there are 4 functions (M2r.957, M2r.i6B, M2r.1vX, M2r.i59) which are acting on the encoded data.
The first function, M2r.957, is a wrapper around another function, M2r.276, which performed the base64 and Gzip decoding.
The next function, M2r.i6B, took the previously obtained string and then performed a Replace operation based on a Dictionary.
Interestingly, the Value is replaced with the Key, and not the other way around as you might expect.
Based on the previous code, the input dictionary had something to do with a value of SCRT.
Suspiciously, there was an SCRT that looked like a dictionary in the first base64 string that was decoded.
So we obtained that dictionary and prettied it up using Cyberchef to remove all of the \ escapes.
We then created a partial Python script based on the information we had so far. (A link is at the end of this post.)
Executing this script and printing the result, we were able to obtain a cleaner-looking string than before.
Here's a before and after
It was probably safe to assume this string was reversed + base64 encoded, but we decided to check the remaining two decoding functions just to make sure.
M2r.1vX was indeed responsible for reversing the string.
M2r.i59 was indeed responsible for base64 decoding the result.
So we then added these steps to our Python script.
And executed to reveal the results - successful C2!
http://battletw[.]beget[.]tech/
(The URLs contained some base64 reversed/encoded strings and were not very interesting)
This C2 domain had only 2/85 hits on VirusTotal.
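Putting the four observed steps together (base64 + gzip, dictionary replacement of values with keys, string reversal, base64 decode), the overall chain can be sketched as below. This is an illustration under stated assumptions, not the script from the linked repository. The substitution table, the encoder used to build test input, and the c2.example URL are all hypothetical, and the real SCRT dictionary from the sample is not reproduced here.

```python
import base64
import gzip


def b64_gunzip(blob: str) -> str:
    """Step 1 (M2r.957 / M2r.276 in the sample): base64-decode, then gunzip."""
    return gzip.decompress(base64.b64decode(blob)).decode("utf-8")


def replace_values_with_keys(text: str, table: dict[str, str]) -> str:
    """Step 2 (M2r.i6B): every dictionary value is replaced by its key."""
    for key, value in table.items():
        text = text.replace(value, key)
    return text


def decode_config(blob: str, table: dict[str, str]) -> str:
    text = replace_values_with_keys(b64_gunzip(blob), table)
    text = text[::-1]                               # step 3 (M2r.1vX): reverse
    return base64.b64decode(text).decode("utf-8")   # step 4 (M2r.i59)


def encode_config(plain: str, table: dict[str, str]) -> str:
    """Hypothetical inverse, used only to build a synthetic test input."""
    text = base64.b64encode(plain.encode()).decode()[::-1]
    for key, value in table.items():
        text = text.replace(key, value)
    return base64.b64encode(gzip.compress(text.encode())).decode()


# Hypothetical table: keys are real base64 characters, values are stand-ins.
DEMO_TABLE = {"A": "!", "=": "#"}
blob = encode_config("http://c2.example/", DEMO_TABLE)
print(decode_config(blob, DEMO_TABLE))
```

The replacement step explains why the plain CyberChef recipe produced junk for the second blob: until the substituted characters are restored, the intermediate string is not valid base64.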
At this point, we had obtained the C2 and decided to stop our analysis.
In a real environment, it would be best to block this domain immediately in your security solutions. Additionally, you could review the previous string dumps for process-based indicators that could be used to hunt signs of successful execution.
You could also try to derive some Sigma rules from the string dumps, or potentially use the C2 URL structure to hunt through proxy logs.
Links
- Copies of the decoding scripts - https://github.com/embee-research/Decoders/tree/main/2023-April-dcrat
- Link to the original malware - https://bazaar.abuse.ch/sample/fd687a05b13c4f87f139d043c4d9d936b73762d616204bfb090124fd163c316e/