Deobfuscating xor’ed strings

A few days ago a customer sent us a sample file. Its code used a very simple technique to obfuscate string constants: it built them on the fly and used xor to hide the string contents from static disassembly.
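The customer's code is not reproduced here. As a rough sketch of the general technique, assuming a single-byte key (the key value, the string and the function names below are made up for illustration), the idea boils down to this:

```python
# Illustrative only: a hypothetical key and string, not the customer's code.
KEY = 0x5A

def obfuscate(text: str, key: int = KEY) -> bytes:
    """What is stored in the binary: every byte xor'ed with the key."""
    return bytes(b ^ key for b in text.encode("ascii"))

def deobfuscate(blob: bytes, key: int = KEY) -> str:
    """What the program does at run time: xor every byte back."""
    return bytes(b ^ key for b in blob).decode("ascii")

hidden = obfuscate("kernel32.dll")
print(hidden)               # no readable string in the stored bytes
print(deobfuscate(hidden))  # the original string is rebuilt on the fly
```

Since xor is its own inverse, the same operation hides and recovers the string, so a static tool only needs to fold the constants to see the plain text.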


The decompiler recovered most of the xor’ed values but some of them were left obfuscated:

After some investigation it turned out to be a shortcoming of our decompiler: the value propagation (or constant folding) cannot handle the situation where an unusual part of a value is used in another expression. For example, if an instruction defines a four-byte value, the second byte of that value cannot be propagated to other expressions. More standard cases, like the low or high two bytes, or even just the low byte, are handled well.
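To make the "unusual part of a value" concrete, here is a minimal sketch of what extracting one byte of a constant means. It assumes that byte offsets are counted from the least significant byte (so offset 1 is the "second byte"); the helper name `byte_of` and the sample constant are ours, not part of any decompiler API:

```python
def byte_of(value: int, offset: int) -> int:
    """Return the byte at `offset` of `value` (0 = least significant byte)."""
    return (value >> (8 * offset)) & 0xFF

N = 0x11223344  # a sample four-byte constant

for offset in range(4):
    print(offset, hex(byte_of(N, offset)))
```

The problematic case described above corresponds to `byte_of(N, 1)`: a one-byte slice that starts in the middle of the four-byte definition.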

It seems that compilers never leave such constants unpropagated, which is why we had not encountered this case before.

Let us write a short decompiler plugin that handles this situation and propagates a part of a constant into another expression. The idea is simple: as soon as we find a constant used in a binary operation like xor, we try to find the definition of the second operand, and if it is a constant too, we propagate it. Graphically it will look like this:

mov #N, var.4           ; put a 4 byte constant into var
...
xor var@1.1, #M, var2.1 ; xor the second byte of var

is converted into

mov #N, var.4
...
xor #N>>8, #M, var2.1

The resulting xor will then automatically get optimized by the decompiler. However, to speed things up (to avoid another loop of optimization rules), we will call the optimize_flat() function ourselves.

Please note that we do not rely on the instruction opcode: the xor opcode can be replaced by any other binary operation and our logic will still work correctly.

Also we do not rely on the operand sizes (well, to speed things up we do not handle operands wider than 1 byte because they are handled fine by the decompiler).

Also we can handle not only the second byte, but any byte of the variable.
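The transformation described above can be modelled outside the decompiler. The following is only a toy sketch in Python, not the plugin's code: the `Const`, `Part` and `Insn` classes are a made-up miniature of microcode, the search for the definition is a simple linear scan of one block, and the final folding step merely stands in for what the decompiler's own optimizer would do.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

@dataclass(frozen=True)
class Const:
    value: int
    size: int                        # size in bytes

@dataclass(frozen=True)
class Part:
    var: str                         # variable name
    off: int                         # byte offset inside the variable
    size: int                        # size in bytes

Operand = Union[Const, Part]

@dataclass
class Insn:
    op: str                          # "mov" or the name of a binary operation
    src: Operand
    dst: Part
    src2: Optional[Operand] = None   # second source, unused by "mov"

# Any binary operation works; xor is just one entry in the table.
BINOPS = {"xor": operator.xor, "and": operator.and_,
          "or": operator.or_, "add": operator.add}

def propagate(block: List[Insn]) -> List[Insn]:
    consts: Dict[str, Const] = {}    # variables known to hold a constant

    def subst(o: Operand) -> Operand:
        # Replace a 1-byte part of a known constant by that byte's value.
        if isinstance(o, Part) and o.size == 1 and o.var in consts:
            c = consts[o.var]
            if o.off < c.size:
                return Const((c.value >> (8 * o.off)) & 0xFF, 1)
        return o

    out = []
    for insn in block:
        if insn.op in BINOPS:
            a, b = subst(insn.src), subst(insn.src2)
            if isinstance(a, Const) and isinstance(b, Const):
                # Both operands are constants now: fold into a plain mov.
                mask = (1 << (8 * insn.dst.size)) - 1
                folded = BINOPS[insn.op](a.value, b.value) & mask
                insn = Insn("mov", Const(folded, insn.dst.size), insn.dst)
            else:
                insn = Insn(insn.op, a, insn.dst, b)
        # Remember whole-variable constant definitions, forget on other writes.
        if (insn.op == "mov" and isinstance(insn.src, Const)
                and insn.dst.off == 0 and insn.dst.size == insn.src.size):
            consts[insn.dst.var] = insn.src
        else:
            consts.pop(insn.dst.var, None)
        out.append(insn)
    return out

block = [
    Insn("mov", Const(0x11223344, 4), Part("var", 0, 4)),
    Insn("xor", Part("var", 1, 1), Part("var2", 0, 1), Const(0x55, 1)),
]
result = propagate(block)
print(result[1])
```

In this model the second instruction ends up as a plain one-byte mov of a constant into var2. Note that the byte offset and the opcode are both parameters of the rule rather than special cases, which mirrors the three remarks above.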

The final version of the plugin can be downloaded here. It is fully automatic: you just need to drop it into the plugins/ directory.

And the decompiler output looks nice now:

We could further improve the output and convert these assignments into a call to the strcpy() function, but this is left as an exercise for our dear readers 😉

P.S. Naturally, we will improve the decompiler to handle this case. The next version will include this improvement.