Hack of the day #2: Command-Line Interface helpers

The problem

The “command-line input” (CLI), situated at the bottom of IDA’s window, is a very powerful tool to quickly execute commands in the language that is currently selected.

Typically, that language will be Python, and one can use helpers such as idc.here() to retrieve the address of the cursor location.

However, when some debuggers such as GDB or WinDbg are used, the CLI can be switched to one specific to the debugger being used, thereby providing a way to input commands that will be sent to the debugger backend.

Alas, when one is debugging using GDB (for example), Python-specific helpers such as idc.here() are not available in that CLI anymore.

That means users will typically have to copy information from the listings and then paste it into the CLI, which is both tedious and error-prone.

A first approach

An experienced IDA user recently came to us with this issue, and suggested that we implement some “variable substitution” before the text is sent to the backend (be it a debugger, or Python).

For example, the markers:

  • $! would be replaced with the current address,
  • $[ with the address of the beginning of the current selection,
  • $] with the address of the end of the current selection

Where the first approach falls short

We were very enthusiastic about this idea at first, but we quickly realized that this would open a can of worms, which we didn’t feel comfortable opening.

Here are some of the reasons:

  • It’s unclear how things such as an address should be represented. Should it be 0xXXXXXXXX, #XXXXXXXX, or even decimal? Depending on who will receive the text to execute, this matters.
  • Whatever markers (such as $!) we support, they will never meet all the needs of all our users. It’s probably better if whatever solution we bring doesn’t rely on a hard-coded set of substitutions.
  • Should expansion take place in string literals?

All in all, we decided that it might get very messy, very quickly, and that this first approach of implementing expansion in IDA itself is probably not the strongest idea.
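To make two of these concerns concrete, here is a minimal sketch of a naive, hard-coded substitution. Note that naive_expand is a hypothetical helper written for this illustration (it is not something IDA provides), and the sample commands and address are made up.

```python
def naive_expand(text, ea, fmt):
    """Replace every "$!" in `text` with `ea`, formatted with `fmt` (hypothetical helper)."""
    return text.replace("$!", fmt % ea)

ea = 0x401000  # made-up address, for illustration only

# The same marker needs a different spelling depending on who receives the text.
as_hex = naive_expand("x/4i $!", ea, "0x%X")
as_decimal = naive_expand("x/4i $!", ea, "%d")

# A blind substitution also rewrites markers that sit inside string literals.
in_literal = naive_expand('print("price: $!")', ea, "0x%X")
```

Here as_hex and as_decimal differ only in the format that was chosen, and nothing in the text itself says which one the receiving backend expects. The last call shows the string-literal question: the marker inside the quotes is expanded too, which may or may not be what the user wanted.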

However, the idea is just too good to give up on entirely, and perhaps we can come up with something “lighter” that could be implemented in IDA 7.2 already (and even before, in fact), and would be helpful most of the time.

A second approach

IDA ships with PyQt5, a set of Python Qt bindings which lets us take advantage of pretty much all the features offered by Qt.

For example, it’s possible to place a “filter” on top of the CLI’s input field, that will perform the expansion, in-place.

The benefits of this approach are:

  • it will already work in existing IDA releases
  • users can easily extend the set of markers that are recognized
  • it’s written in Python, thus won’t require recompilation when improved
  • since the expansion is performed in-place, it’s clear what is going to be sent to the backend

What follows, is a draft of how this could be done. It currently:

  • only expands $! into the current address, and
  • formats addresses as 0xXXXXXXXX

Perhaps someone will find this useful, and improve on it… (don’t hesitate to contact us at [email protected]hex-rays.com for suggestions!)


import re

from PyQt5 import QtCore, QtGui, QtWidgets

import ida_kernwin

dock = ida_kernwin.find_widget("Output window")
if dock:
    py_dock = ida_kernwin.PluginForm.FormToPyQtWidget(dock)
    line_edit = py_dock.findChild(QtWidgets.QLineEdit)
    if line_edit:
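        # Remove the filter left over from a previous run of this script, if
        # any (on the first run, 'kpf' is not defined yet).
        try:
            line_edit.removeEventFilter(kpf)
        except Exception:
            pass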

        class filter_t(QtCore.QObject):

            def eventFilter(self, obj, event):
                if event.type() == QtCore.QEvent.KeyRelease:
                    self.expand_markers(obj)
                return QtCore.QObject.eventFilter(self, obj, event)

            def expand_markers(self, obj):
                text = obj.text()
                ea = ida_kernwin.get_screen_ea()
                exp_text = re.sub(r"\$!", "0x%x" % ea, text)
                if exp_text != text:
                    obj.setText(exp_text)

        kpf = filter_t()
        line_edit.installEventFilter(kpf)
        print("All set")
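As a possible next step, the expansion itself could be made table-driven, so that adding a marker means adding one entry. What follows is only a sketch under assumptions: make_expander is a hypothetical helper, and the lambdas return fixed, made-up addresses so that the snippet runs outside IDA. In IDA, the callable for $! would presumably be ida_kernwin.get_screen_ea (as in the draft above), and $[ and $] would need callables that query the current selection.

```python
import re

def make_expander(markers, fmt="0x%x"):
    """Build an expand(text) function from a {marker: callable returning an int} table."""
    # Try longer markers first, so a marker that is a prefix of another cannot shadow it.
    ordered = sorted(markers, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(m) for m in ordered))

    def expand(text):
        # A function replacement avoids backslash processing in the substituted text.
        return pattern.sub(lambda m: fmt % markers[m.group(0)](), text)

    return expand

# Fixed values stand in for the IDA queries (assumption: inside IDA, "$!"
# would map to ida_kernwin.get_screen_ea).
expand = make_expander({
    "$!": lambda: 0x401000,
    "$[": lambda: 0x401010,
    "$]": lambda: 0x401020,
})
```

With such a helper, the body of expand_markers would reduce to comparing expand(text) with text. The address format remains a single parameter, which keeps the question of “how should an address look” in one place.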

Update (April 25th, 2019)

Elias Bachaalany has a follow-up blog post about this topic: http://0xeb.net/2019/04/climacros-ida-productivity-tool/

Hack of the day #1: Decompiling selected functions

Intended audience

IDA 7.2 users, who have experience with IDAPython and/or the decompiler.

The problem

As you may already know, the decompilers allow not only decompiling the current function (shortcut F5) but also all the functions in the database (shortcut Ctrl+F5).

A somewhat lesser-known feature of the “multiple” decompilation is that if a range is selected (for example in the IDA View-A), only functions within that range will be decompiled.

Alas, this is not good enough for the use-case of one of our users, who would like to be able to select entries in the list provided by the Functions window, and decompile those (the biggest difference from the “IDA View-A range” approach is that there can be gaps in the selection: functions that the user doesn’t want to spend time decompiling).

The solution

Although IDA doesn’t provide a built-in solution for this particular use-case (it cannot cover them all), we can use IDA’s scriptability to come up with the following IDAPython script, which should offer a very satisfying implementation of the idea described above:

import ida_kernwin
import ida_funcs
import ida_hexrays

class decompile_selected_t(ida_kernwin.action_handler_t):
    def activate(self, ctx):
        out_path = ida_kernwin.ask_file(
            True,
            None,
            "Please specify the output file name")
        if out_path:
            eas = []
            for pfn_idx in ctx.chooser_selection:
                pfn = ida_funcs.getn_func(pfn_idx)
                if pfn:
                    eas.append(pfn.start_ea)
            ida_hexrays.decompile_many(out_path, eas, 0)
        return 1

    def update(self, ctx):
        if ctx.widget_type == ida_kernwin.BWN_FUNCS:
            return ida_kernwin.AST_ENABLE_FOR_WIDGET
        else:
            return ida_kernwin.AST_DISABLE_FOR_WIDGET

ACTION_NAME = "decompile-selected"

ida_kernwin.register_action(
    ida_kernwin.action_desc_t(
        ACTION_NAME,
        "Decompile selected",
        decompile_selected_t(),
        "Ctrl+F5"))

class popup_hooks_t(ida_kernwin.UI_Hooks):
    def finish_populating_widget_popup(self, w, popup):
        if ida_kernwin.get_widget_type(w) == ida_kernwin.BWN_FUNCS:
            ida_kernwin.attach_action_to_popup(
                w,
                popup,
                ACTION_NAME,
                None)

hooks = popup_hooks_t()
hooks.hook()

IDA 7.2 – The Mac Rundown

We posted an addendum to the release notes for IDA 7.2: The Mac Rundown.

It dives much deeper into the Mac-specific features introduced in 7.2, and should be great reference material for users interested in reversing the latest Apple binaries. It’s packed full of hints, tricks, and workarounds.

We hope you will find it quite useful!

IDA 7.2: Qt 5.6.3 configure options & patch

A handful of our users have already requested information regarding the Qt 5.6.3 build that ships with IDA 7.2.

Configure options

Here are the options that were used to build the libraries on:

  • Windows: ...\5.6.3\configure.bat "-nomake" "tests" "-qtnamespace" "QT" "-confirm-license" "-accessibility" "-opensource" "-force-debug-info" "-platform" "win32-msvc2015" "-opengl" "desktop" "-prefix" "C:/Qt/5.6.3-x64"
    • Note that you will have to build with Visual Studio 2015 or newer, to obtain compatible libs
  • Linux: .../5.6.3/configure "-nomake" "tests" "-qtnamespace" "QT" "-confirm-license" "-accessibility" "-opensource" "-force-debug-info" "-platform" "linux-g++-64" "-developer-build" "-fontconfig" "-qt-freetype" "-qt-libpng" "-glib" "-qt-xcb" "-dbus" "-qt-sql-sqlite" "-gtkstyle" "-prefix" "/usr/local/Qt/5.6.3-x64"
  • Mac OSX: .../5.6.3/configure "-nomake" "tests" "-qtnamespace" "QT" "-confirm-license" "-accessibility" "-opensource" "-force-debug-info" "-platform" "macx-clang" "-debug-and-release" "-fontconfig" "-qt-freetype" "-qt-libpng" "-qt-sql-sqlite" "-prefix" "/Users/Shared/Qt/5.6.3-x64"

Patch

In addition to the specific configure options, the Qt build that ships with IDA includes the following patch. You should therefore apply it to your own Qt 5.6.3 sources before compiling, in order to obtain similar binaries (patch -p 1 < path/to/qt-5_6_3_full-IDA72.patch)

Note that this patch should work without any modification, against the 5.6.3 release as found there. You may have to fiddle with it, if your Qt 5.6.3 sources come from somewhere else.

Hex-Rays Microcode API vs. Obfuscating Compiler

This is a guest entry written by Rolf Rolles from Mobius Strip Reverse Engineering. His views and opinions are his own, and not those of Hex-Rays. Any technical or maintenance issues regarding the code herein should be directed to him.

In this entry, we’ll investigate an in-the-wild malware sample that was compiled by an obfuscating compiler to hinder analysis. We begin by examining its obfuscation techniques and formulating strategies for removing them. Following a brief detour into the Hex-Rays CTREE API, we find that the newly-released microcode API is more powerful and flexible for our task. We give an overview of the microcode API, and then we write a Hex-Rays plugin to automatically remove the obfuscation and present the user with a clean decompilation.


Microcode in pictures

Since a picture is worth a thousand words, below are a few drawings for your perusal. Let us start at the top level, with the mbl_array_t class, which represents the entire microcode object:

The above picture does not show the control flow graph. For that we use predecessor and successor lists:

Pay attention to the block types here: they tell us how many outgoing edges must be present. Then, each basic block (mblock_t) contains a list of instructions:

Instructions (minsn_t) can be nested, and the next drawing shows what that looks like:

As you see, conceptually things are quite simple. But the devil is in the details, as usual 🙂
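For readers who prefer code to drawings, the containment described above can be mimicked with a toy model. This is emphatically not the real API: Insn, Block and link are hypothetical stand-ins for minsn_t, mblock_t and the edge bookkeeping, the plain list stands in for mbl_array_t, and the opcodes are only illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Insn:
    """Toy stand-in for minsn_t: an opcode and operands, some of which may be instructions."""
    opcode: str
    operands: List[Union["Insn", str]] = field(default_factory=list)

    def depth(self):
        # 1 for this instruction, plus the deepest nested sub-instruction.
        nested = [op.depth() for op in self.operands if isinstance(op, Insn)]
        return 1 + max(nested, default=0)

@dataclass
class Block:
    """Toy stand-in for mblock_t: instructions plus predecessor/successor lists."""
    serial: int
    insns: List[Insn] = field(default_factory=list)
    preds: List[int] = field(default_factory=list)
    succs: List[int] = field(default_factory=list)

def link(blocks, src, dst):
    """Record a control-flow edge on both of its ends."""
    blocks[src].succs.append(dst)
    blocks[dst].preds.append(src)

# Toy stand-in for mbl_array_t: a list of blocks indexed by serial number.
blocks = [Block(0), Block(1), Block(2)]
link(blocks, 0, 1)
link(blocks, 0, 2)
link(blocks, 1, 2)

# "mov (add x, y), z": the first operand is itself an instruction.
blocks[1].insns.append(Insn("mov", [Insn("add", ["x", "y"]), "z"]))
```

In this model, block 0 has two outgoing edges and block 1 has one, which is the kind of property the block type encodes in the real microcode.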

 

IDAPython: wrappers are only wrappers

Intended audience

IDAPython developers who enjoy the occasional headache, leaky abstraction enthusiasts, or simply the curious.

TL;DR

IDAPython wraps C++ types, and the lifecycle of C++ objects (and in particular members of larger objects) is not necessarily the same as that of the Python wrapper object that is wrapping it.

The problem

One of our users reported that IDA crashes when running an IDAPython script of theirs. The user came up with a very simple way to reproduce the issue (thank you!), showing that this had to do with accessing the parents member of an ida_hexrays.ctree_visitor_t instance.

Here is (an even more simplified version of) the script the user sent us:

from ida_hexrays import *

my_parents = None

class my_visitor_t(ctree_visitor_t):
    def __init__(self, func):
        ctree_visitor_t.__init__(self, CV_PARENTS)

    def visit_expr(self, i):
        global my_parents
        if self.parents is not None:
            my_parents = self.parents
        return 0


def my_cb(event, *args):
    if event == hxe_print_func:
        f = args[0]
        my_visitor_t(f).apply_to(f.body, None)
        import gc
        gc.collect()
        my_parents.front() # will crash
    return 0

install_hexrays_callback(my_cb)

Note: I threw a gc.collect() in there, to make crashes more likely.

The script above is provided in its entirety for the sake of completeness, but really the important lines are only the following:

    def visit_expr(self, i):
        global my_parents
        if self.parents is not None:
            my_parents = self.parents

    (...)

        my_visitor_t(f).apply_to(f.body, None)
        my_parents.front() # will crash

Details, details, details

Since this issue is non-trivial, I’ll try and provide a step-by-step explanation, hopefully as clear as can be, by annotating the important lines of code mentioned above:

        my_visitor_t(f)

Create a my_visitor_t instance. That is a subclass of the ctree_visitor_t type, which means it eventually extends a C++ object of type ctree_visitor_t.

When the underlying C++ ctree_visitor_t object is created, its member named parents (a ctree_items_t vector) is initialized. For the sake of the example, let’s say the C++ ctree_visitor_t instance is located at memory 0x1000 and the parents member is placed at memory 0x100C.

                       .apply_to(f.body, None)

Call ctree_visitor_t::apply_to. Thanks to SWIG “magic”, C++ virtual method calls will be properly redirected and our my_visitor_t.visit_expr method will be called for each cexpr_t in the tree, as expected.

        if self.parents is not None:

Access self.parents. This will create a Python wrapper object. The key here is to understand that it’s a wrapper object which is backed by the real, C++ ctree_items_t instance.

For example, any access to the object returned by self.parents, will in fact translate to an access into the C++ ctree_items_t vector, so if one were to write, e.g., self.parents.size() (or even len(self.parents)), it’s actually the real underlying C++ ctree_items_t instance’s size() method that will end up being called.

            my_parents = self.parents

Another access to self.parents, and another Python wrapper will be created (once again backed by the actual ctree_items_t vector).

[Note: the fact that another wrapper is created is not a problem (in fact since it went out of scope, the previous wrapper might already have been garbage collected!)]

Once again, for the sake of the example, let’s say the wrapping PyObject instance is placed in memory, at 0xB000.
That wrapper is then bound to the global variable my_parents, causing its Python refcount to increase to 2. Past that line, the refcount will drop back to 1 (again, because of scope logic), which means that the Python wrapper object will remain alive.

[...apply_to() returns, and we are now back to the `my_cb` function...]

At this point, it’s likely my_visitor_t(f) has just been garbage collected since nobody keeps a reference to it.

That means:

  • the my_visitor_t instance has been destroyed, which means
  • the underlying ctree_visitor_t C++ object located at memory 0x1000 has been deleted, which in turn means
  • its parents object, which was located at memory 0x100C, is now invalid

                my_parents.front()
    

We are now calling front() on the my_parents Python object. If you recall, that my_parents object is a Python wrapper object located in memory at 0xB000. That wrapper object still has a refcount of (at least) 1, and is thus alive.

What is not quite alive anymore, however, is the actual C++ ctree_items_t vector, which was deleted as part of deleting the C++ ctree_visitor_t it belonged to.

In other words, we have a perfectly valid Python wrapper object, that has a dangling pointer to a member of a freshly-deleted C++ object.

The solution

The solution is, in terms of effort, rather simple: make a copy of the vector:

-            my_parents = self.parents
+            my_parents = ctree_items_t(self.parents)

Since the copy doesn’t belong to the C++ ctree_visitor_t object, it won’t be trashed when that object is deleted.
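The whole scenario can be replayed outside IDA with a small pure-Python model. Everything in it is an assumption made for illustration: FakeHeap and VectorWrapper are hypothetical classes, 0xC000 is a made-up address for the copy, and a RuntimeError plays the role of the crash.

```python
class FakeHeap:
    """Stand-in for native memory: maps an address to the object stored there."""
    def __init__(self):
        self.cells = {}

class VectorWrapper:
    """Models a Python wrapper: it stores an address, never the vector itself."""
    def __init__(self, heap, addr):
        self.heap = heap
        self.addr = addr

    def front(self):
        # Real native code would read freed memory here; the model raises instead.
        if self.addr not in self.heap.cells:
            raise RuntimeError("dangling pointer to 0x%X" % self.addr)
        return self.heap.cells[self.addr][0]

    def copy_to(self, new_addr):
        # Models ctree_items_t(self.parents): a new vector that the caller owns.
        self.heap.cells[new_addr] = list(self.heap.cells[self.addr])
        return VectorWrapper(self.heap, new_addr)

heap = FakeHeap()
heap.cells[0x100C] = ["item0", "item1"]   # the visitor's 'parents' member

borrowed = VectorWrapper(heap, 0x100C)    # my_parents = self.parents
owned = borrowed.copy_to(0xC000)          # my_parents = ctree_items_t(self.parents)

del heap.cells[0x100C]                    # the C++ visitor is deleted

try:
    borrowed.front()
    borrowed_ok = True
except RuntimeError:
    borrowed_ok = False
```

After the deletion, the borrowed wrapper is still a perfectly valid Python object, but the address it holds no longer refers to anything. The copy, on the other hand, keeps working because its storage was never part of the visitor.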

Deobfuscating xor’ed strings

A few days ago a customer sent us a sample file. The code he sent us was using a very simple technique to obfuscate string constants by building them on the fly and using ‘xor’ to hide the string contents from static disassembly:


The decompiler recovered most of the xor’ed values but some of them were left obfuscated:

After some investigation it turned out that this is a shortcoming of the decompiler: the value propagation (or constant folding) cannot handle the situation where an unusual part of a value is used in another expression. For example, if an instruction defines a four-byte value, the second byte of the value cannot be propagated to other expressions. More standard cases, like the low or high two bytes, or even just one byte, are handled well.

It seems that compilers never leave such constants unpropagated, which is why we did not encounter this case before.

Let us write a short decompiler plugin that would handle this situation and propagate a part of a constant into another expression. The idea is simple: as soon as we find a situation when a constant is used in a binary operation like xor, we will try to find the definition of the second operand, and if it is a constant, then we will propagate it. Graphically it will look like this:

mov #N, var.4           ; put a 4 byte constant into var
...
xor var @1.1, #M, var2.1 ; xor the second byte of var

is converted into

mov #N, var.4
...
xor #N>>8, #M, var2.1

The resulting xor will then automatically get optimized by the decompiler. However, to speed up things (to avoid another loop of optimization rules), we will call the optimize_flat() function ourselves.

Please note that we do not rely on the instruction opcode: the xor opcode can be replaced by any other binary operation, our logic will still work correctly.

Also we do not rely on the operand sizes (well, to speed up things we do not handle operands wider than 1 byte because they are handled fine by the decompiler).

Also we can handle not only the second byte, but any byte of the variable.
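The arithmetic behind the transformation is small enough to spell out. The sketch below is plain Python, not decompiler code: extract_part and fold_xor are hypothetical helper names, the constants are made up, and bytes are assumed to be numbered from the least significant one.

```python
def extract_part(value, byte_offset, size):
    """Return `size` bytes of `value`, starting at `byte_offset` (0 = lowest byte)."""
    mask = (1 << (8 * size)) - 1
    return (value >> (8 * byte_offset)) & mask

def fold_xor(n, byte_offset, m):
    """Fold an xor between one byte of the known constant `n` and the constant `m`."""
    return extract_part(n, byte_offset, 1) ^ m

N = 0x11223344   # made-up 4-byte constant stored into var
M = 0x55         # made-up xor key

second_byte = extract_part(N, 1, 1)   # the '#N>>8' of the listing, truncated to one byte
folded = fold_xor(N, 1, M)            # the value the xor is ultimately optimized to
```

Here second_byte is 0x33 and folded is 0x66. Changing byte_offset selects any other byte of the variable, and replacing ^ with another binary operator gives the same propagation for other opcodes.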

The final version of the plugin can be downloaded here. It is fully automatic, you just need to drop it into the plugins/ directory.

And the decompiler output looks nice now:

We could further improve the output and convert these assignments into a call to the strcpy() function, but this is left as an exercise for our dear readers 😉

P.S. Naturally, we will improve the decompiler to handle this case. The next version will include this improvement.

IDA on non-OS X/Retina Hi-DPI displays

The problem

Some users running IDA on Windows & Linux X11 platforms with Hi-DPI displays have reported that IDA looks rather odd: the navigator bar is too narrow, the text under it gets truncated, and there is an overall feeling of packing & clumsiness:

  • Windows:
  • Linux X11:

Looking closely, one can notice the following issues (probably not an exhaustive list)

Note: this should not affect OS X users running IDA on Retina displays. Nor should it happen (but we didn’t get a chance to test this) on non-X11 Linux display managers, such as Wayland.

Fix / mitigation:

On Linux X11 & Windows, if you are using Hi-DPI monitors and IDA looks somewhat like it does in the above screenshots, please try setting the environment variable QT_AUTO_SCREEN_SCALE_FACTOR to 1:

E.g., on Linux/X11:

~# export QT_AUTO_SCREEN_SCALE_FACTOR=1
~# path/to/ida my.idb

IDA should now look more pleasant:

  • Windows:
  • Linux X11:

Some things are still not perfect (e.g., checkboxes might remain small), but IDA definitely looks better.

Please give it a try!
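For those who would rather not set the variable globally (on Windows in particular), a small launcher script is one option. This is a sketch only: hidpi_env and launch_ida are hypothetical helpers, and the paths passed to launch_ida are placeholders to be replaced with your own.

```python
import os
import subprocess

def hidpi_env(base=None):
    """Return a copy of the environment with Qt's automatic scaling opted in."""
    env = dict(os.environ if base is None else base)
    env["QT_AUTO_SCREEN_SCALE_FACTOR"] = "1"
    return env

def launch_ida(ida_path, idb_path):
    """Start IDA with the scaling variable set (both paths are placeholders)."""
    return subprocess.Popen([ida_path, idb_path], env=hidpi_env())
```

Calling launch_ida("path/to/ida", "my.idb") is then equivalent to the two shell lines above, without touching the environment of the calling process.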

Gory details

When one applies scaling/zooming, in either Windows or Linux X11, that causes the OS to return scaled values for font metrics when queried using point sizes (which is almost always the case).

For example, when the font metrics for a font of size 12pt are requested by a Qt application, instead of returning 14 pixels as it would on a non-scaled system, the operating system will instead return 28 pixels on a 200% scaled one (in other words, this is essentially a font database-related feature).

That will, in turn, have the net effect of causing Qt to compute layout of the surrounding widgets according to those scaled font metrics, which explains why the text is (for the most part) not truncated.

However, what applying scaling does not do, is tell Qt that it should scale all other pixel measurements by that scale factor.

Consequently, paddings, margins, scrollbars, etc… all have uncomfortably small dimensions, especially when compared to text.

The QT_AUTO_SCREEN_SCALE_FACTOR environment variable is an opt-in that program users can define, in order to control how the program should look. It will in essence instruct Qt to perform automatic scaling of (non-font-related) graphical operations according to the pixel density of the screen(s).
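The mismatch can be illustrated with a deliberately simplified model. The 14-pixel text height comes from the example above; the 4-pixel padding and the layout helper are assumptions made for the illustration, and the model ignores how Qt really implements scaling internally.

```python
def layout(scale, qt_auto_scale, font_px_at_100=14, padding_px_at_100=4):
    """Return (text_height, padding) in physical pixels for a toy widget."""
    # The OS font database already returns scaled metrics for point-sized fonts.
    text_height = round(font_px_at_100 * scale)
    # Other pixel measurements are only scaled when Qt is told to do so.
    padding = round(padding_px_at_100 * scale) if qt_auto_scale else padding_px_at_100
    return text_height, padding

unscaled = layout(1.0, qt_auto_scale=False)      # regular display
hidpi_default = layout(2.0, qt_auto_scale=False)  # 200% scaling, no opt-in
hidpi_opt_in = layout(2.0, qt_auto_scale=True)    # 200% scaling, variable set
```

On the 200% display without the opt-in, the text doubles in height while the padding stays at its original size, so the padding-to-text ratio is halved. With the opt-in, the proportions of the unscaled layout are restored.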

More information can be found on Qt’s website.

Why is this not needed under OS X + Retina?

This is not needed under OS X + Retina, because Qt does not need to perform any kind of scaling there: the scaling is performed by the drawing primitives of the OS itself, and is entirely transparent to the application.

(In fact, an OSX application doesn’t even work with the real screen geometry, but rather with an “abstract” coordinate system, normalizing pixel sizes across screen densities.)

IDA 7.1: Qt 5.6.3 configure options & patch

A handful of our users have already requested information regarding the Qt 5.6.3 build that ships with IDA 7.1.

Configure options

Here are the options that were used to build the libraries on:

  • Windows: ...\5.6.3\configure.bat "-nomake" "tests" "-qtnamespace" "QT" "-confirm-license" "-accessibility" "-opensource" "-force-debug-info" "-platform" "win32-msvc2015" "-opengl" "desktop" "-prefix" "C:/Qt/5.6.3-x64"
    • Note that you will have to build with Visual Studio 2015, to obtain compatible libs
  • Linux: .../5.6.3/configure "-nomake" "tests" "-qtnamespace" "QT" "-confirm-license" "-accessibility" "-opensource" "-force-debug-info" "-platform" "linux-g++-64" "-developer-build" "-fontconfig" "-qt-freetype" "-qt-libpng" "-glib" "-qt-xcb" "-dbus" "-qt-sql-sqlite" "-gtkstyle" "-prefix" "/usr/local/Qt/5.6.3-x64"
  • Mac OSX: .../5.6.3/configure "-nomake" "tests" "-qtnamespace" "QT" "-confirm-license" "-accessibility" "-opensource" "-force-debug-info" "-platform" "macx-g++" "-debug-and-release" "-fontconfig" "-qt-freetype" "-qt-libpng" "-qt-sql-sqlite" "-prefix" "/Users/Shared/Qt/5.6.3-x64"

Patch

In addition to the specific configure options, the Qt build that ships with IDA includes the following patch. You should therefore apply it to your own Qt 5.6.3 sources before compiling, in order to obtain similar binaries (patch -p 1 < path/to/qt-5_6_3_full-IDA71.patch)

Note that this patch should work without any modification, against the 5.6.3 release as found there. You may have to fiddle with it, if your Qt 5.6.3 sources come from somewhere else.