IDAPython: wrappers are only wrappers

Intended audience

IDAPython developers who enjoy the occasional headache, leaky abstraction enthusiasts, or simply the curious.

TL;DR

IDAPython wraps C++ types, and the lifecycle of C++ objects (and in particular members of larger objects) is not necessarily the same as that of the Python wrapper object that is wrapping it.

The problem

One of our users reported IDA crashes when running an IDAPython script of theirs. The user came up with a very simple way to reproduce the issue (thank you!), showing that this had to do with accessing the parents member of an ida_hexrays.ctree_visitor_t instance.

Here is (an even more simplified version of) the script the user sent us:

from ida_hexrays import *

my_parents = None

class my_visitor_t(ctree_visitor_t):
    def __init__(self, func):
        ctree_visitor_t.__init__(self, CV_PARENTS)

    def visit_expr(self, i):
        global my_parents
        if self.parents is not None:
            my_parents = self.parents
        return 0


def my_cb(event, *args):
    if event == hxe_print_func:
        f = args[0]
        my_visitor_t(f).apply_to(f.body, None)
        import gc
        gc.collect()
        my_parents.front() # will crash
    return 0

install_hexrays_callback(my_cb)

Note: I threw a gc.collect() in there, to make crashes more likely.

The script above is provided in its entirety for the sake of completeness, but really the important lines are only the following:

    def visit_expr(self, i):
        global my_parents
        if self.parents is not None:
            my_parents = self.parents

    (...)

        my_visitor_t(f).apply_to(f.body, None)
        my_parents.front() # will crash

Details, details, details

Since this issue is non-trivial, I’ll try and provide a step-by-step explanation, hopefully as clear as can be, by annotating the important lines of code mentioned above:

        my_visitor_t(f)

Create a my_visitor_t instance. That is a subclass of the ctree_visitor_t type, which means it eventually extends a C++ object of type ctree_visitor_t.

When the underlying C++ ctree_visitor_t object is created, its member named parents (a ctree_items_t vector) is initialized. For the sake of the example, let’s say the C++ ctree_visitor_t instance is located at memory 0x1000 and the parents member is placed at memory 0x100C.

                       .apply_to(f.body, None)

Call ctree_visitor_t::apply_to. Thanks to SWIG “magic”, C++ virtual method calls will be properly redirected and our my_visitor_t.visit_expr method will be called for each cexpr_t in the tree, as expected.

        if self.parents is not None:

Access self.parents. This will create a Python wrapper object. The key here is to understand that it’s a wrapper object which is backed by the real, C++ ctree_items_t instance.

For example, any access to the object returned by self.parents, will in fact translate to an access into the C++ ctree_items_t vector, so if one were to write, e.g., self.parents.size() (or even len(self.parents)), it’s actually the real underlying C++ ctree_items_t instance’s size() method that will end up being called.

            my_parents = self.parents

Another access to self.parents, and another Python wrapper will be created (once again backed by the actual ctree_items_t vector).

[Note: the fact that another wrapper is created is not a problem (in fact since it went out of scope, the previous wrapper might already have been garbage collected!)]

Once again, for the sake of the example, let’s say the wrapping PyObject instance is placed in memory, at 0xB000.
That wrapper is then bound to the global variable my_parents, causing its Python refcount to increase to 2. Past that line, the temporary reference produced by evaluating self.parents goes away and the refcount drops back to 1 (again, because of scope logic), which means that the Python wrapper object will remain alive.
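
To make the wrapper behaviour described above concrete, here is a pure-Python toy model. It does not use IDAPython or SWIG at all: ItemsProxy and ToyVisitor are hypothetical stand-ins, and a plain list plays the role of the C++ vector. It only illustrates two points: every access to parents produces a new wrapper, and all wrappers forward to the same backing data.

```python
class ItemsProxy:
    """Hypothetical stand-in for the Python wrapper around ctree_items_t."""

    def __init__(self, backing):
        # Toy model: we keep a Python reference to the data.
        # The real wrapper only holds a raw C++ pointer, which keeps nothing alive.
        self._backing = backing

    def size(self):
        # Forwarded to the backing "vector"; the wrapper stores no items itself.
        return len(self._backing)

    def __len__(self):
        return self.size()


class ToyVisitor:
    """Hypothetical stand-in for the visitor that owns the vector."""

    def __init__(self):
        self._parents = []  # plays the role of the C++ 'parents' member

    @property
    def parents(self):
        # A brand new wrapper object is created on every access.
        return ItemsProxy(self._parents)


v = ToyVisitor()
a = v.parents
b = v.parents
v._parents.append("item")  # modify the backing data after wrapping it

# Two distinct wrappers, both reflecting the same underlying data.
results = (a is b, a.size(), len(b))
print(results)
```

Unlike this toy, the real wrapper does not keep the C++ vector alive, which is exactly where the trouble starts.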

[...apply_to() returns, and we are now back to the `my_cb` function...]

At this point, it’s likely my_visitor_t(f) has just been garbage collected since nobody keeps a reference to it.

That means:

  • the my_visitor_t instance has been destroyed, which means
  • the underlying ctree_visitor_t C++ object located at memory 0x1000 has been deleted, which in turn means
  • its parents object, which was located at memory 0x100C, is now invalid

                my_parents.front()
    

We are now calling front() on the my_parents Python object. If you recall, that my_parents object is a Python wrapper object located in memory at 0xB000. That wrapper object still has a refcount of (at least) 1, and is thus alive.

What is not quite alive anymore, however, is the actual C++ ctree_items_t vector, which was deleted as part of deleting the C++ ctree_visitor_t it belonged to.

In other words, we have a perfectly valid Python wrapper object, that has a dangling pointer to a member of a freshly-deleted C++ object.
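
The sequence of events can be replayed in a small pure-Python simulation. This is a sketch under loud assumptions: a dict named heap plays the role of C++ memory, integers play the role of addresses (the same 0x1000 and 0x100C as in the example), and the Visitor and VectorWrapper classes are hypothetical. Real C++ performs no validity check, so where the toy raises an exception, IDA reads freed memory instead.

```python
# Toy model only: a dict stands in for the C++ heap, keys stand in for addresses.
heap = {}


class Visitor:
    """Stand-in for the C++ ctree_visitor_t; it owns its 'parents' vector."""

    def __init__(self, addr):
        self.addr = addr
        self.parents_addr = addr + 0xC           # the member lives inside the object
        heap[self.parents_addr] = ["root_item"]  # the "C++ vector"

    def destroy(self):
        # Deleting the visitor destroys its members along with it.
        del heap[self.parents_addr]


class VectorWrapper:
    """Stand-in for the Python wrapper: it stores an address, not the data."""

    def __init__(self, addr):
        self.addr = addr

    def front(self):
        if self.addr not in heap:
            # Real C++ would not notice; it would simply dereference the pointer.
            raise RuntimeError("dangling pointer to 0x%X" % self.addr)
        return heap[self.addr][0]


visitor = Visitor(0x1000)
my_parents = VectorWrapper(visitor.parents_addr)

first = my_parents.front()  # fine while the visitor is alive
visitor.destroy()           # what happens once nobody references the visitor

try:
    my_parents.front()
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)

print(first, "/", outcome)
```

The wrapper object itself is never invalid here; only the address it carries is.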

The solution

The solution is, in terms of effort, rather simple: make a copy of the vector:

-            my_parents = self.parents
+            my_parents = ctree_items_t(self.parents)

Since the copy doesn’t belong to the C++ ctree_visitor_t object, it won’t be trashed when that object is deleted.
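
Why the copy survives can be shown with the same kind of toy model, where a dict stands in for C++ memory. The addresses and the copy_vector helper are hypothetical; the only assumption carried over from the real API is that ctree_items_t(self.parents) makes an independent copy, as described above.

```python
# Toy model only: keys are "addresses", values are "C++ vectors".
heap = {0x100C: ["root_item"]}  # the vector owned by the visitor at 0x1000


def copy_vector(src_addr, dst_addr):
    """Hypothetical copy constructor: new storage, same contents."""
    heap[dst_addr] = list(heap[src_addr])
    return dst_addr


copy_addr = copy_vector(0x100C, 0x2000)  # the copy lives outside the visitor

del heap[0x100C]                 # the visitor and its member are deleted
survivor = heap[copy_addr][0]    # the copy is untouched

print(survivor)
```

The price is one vector copy per assignment, which is usually negligible next to a crash.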

Deobfuscating xor’ed strings

A few days ago a customer sent us a sample file. The code he sent us was using a very simple technique to obfuscate string constants by building them on the fly and using ‘xor’ to hide the string contents from static disassembly:


The decompiler recovered most of the xor’ed values but some of them were left obfuscated:

After some investigation it turned out to be a shortcoming of the decompiler: the value propagation (or constant folding) cannot handle the situation where an unusual part of a value is used in another expression. For example, if an instruction defines a four-byte value, the second byte of that value cannot be propagated to other expressions. More standard cases, like the low or high two bytes, or even just one byte, are handled well.

It seems that compilers never leave such constants unpropagated, which is why we did not encounter this case before.

Let us write a short decompiler plugin that would handle this situation and propagate a part of a constant into another expression. The idea is simple: as soon as we find a situation when a constant is used in a binary operation like xor, we will try to find the definition of the second operand, and if it is a constant, then we will propagate it. Graphically it will look like this:

mov #N, var.4           ; put a 4 byte constant into var
...
xor var @1.1, #M, var2.1 ; xor the second byte of var

is converted into

mov #N, var.4
...
xor #N>>8, #M, var2.1

The resulting xor will then automatically get optimized by the decompiler. However, to speed up things (to avoid another loop of optimization rules), we will call the optimize_flat() function ourselves.

Please note that we do not rely on the instruction opcode: the xor opcode can be replaced by any other binary operation, our logic will still work correctly.

Also we do not rely on the operand sizes (well, to speed things up we do not handle operands wider than 1 byte because they are handled fine by the decompiler).

Also we can handle not only the second byte, but any byte of the variable.
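
The arithmetic behind this propagation can be sketched in plain Python. This is not the plugin's code and does not touch the decompiler API: extract_byte and fold_byte_op are hypothetical helper names, the constants are made up, and byte 1 is assumed to mean bits 8 to 15, consistent with the N>>8 in the listing above.

```python
import operator


def extract_byte(value, index):
    """Return byte number `index` of `value` (0 = least significant byte)."""
    return (value >> (8 * index)) & 0xFF


def fold_byte_op(op, n, byte_index, m):
    """Fold `op(selected byte of n, m)` into a single 1-byte constant."""
    return op(extract_byte(n, byte_index), m & 0xFF) & 0xFF


N = 0x11223344  # made-up 4-byte constant stored by the mov
M = 0x55        # made-up 1-byte second operand

# xor on the second byte, i.e. (N >> 8) truncated to one byte
folded_xor = fold_byte_op(operator.xor, N, 1, M)

# the same logic with another binary operation and another byte
folded_add = fold_byte_op(operator.add, N, 0, M)

print(hex(folded_xor), hex(folded_add))
```

Passing the operation in as a parameter mirrors the point made above: nothing in the folding depends on the opcode being xor.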

The final version of the plugin can be downloaded here. It is fully automatic, you just need to drop it into the plugins/ directory.

And the decompiler output looks nice now:

We could further improve the output and convert these assignments into a call to the strcpy() function, but this is left as an exercise for our dear readers 😉

P.S. Naturally, we will improve the decompiler to handle this case. The next version will include this improvement.