New features in Hex-Rays Decompiler 1.6

Last week we released IDA 6.2 and Hex-Rays Decompiler 1.6. Many of the new IDA features have been described in previous posts, but there have been notable additions in the decompiler as well. They let you make the decompilation cleaner and closer to the original source. However, it may not be obvious how to use some of them, so we describe them in more detail here.

1. Variable mapping

This is probably the simplest new feature and can be used without any extra preparation.

Sometimes the compiler stores the same variable in several places (e.g. a register and a stack slot). While the decompiler often manages to combine such locations, sometimes it cannot prove that they always contain the same value (especially in the presence of calls that take the address of stack variables). In such cases the user can help by performing the merge, or mapping, manually.

Consider the following very common case:

int __stdcall SciFreeFilterInstance(_FILTER_INSTANCE *pFilterInstance)
{
  _FILTER_INSTANCE *v1; // esi@1

  v1 = pFilterInstance;
  if ( pFilterInstance->Signature != 'FrtS' )
    RtlAssert(
      "(pFilterInstance)->Signature==SIGN_FILTER_INSTANCE",
      "d:\\xpsprtm\\drivers\\wdm\\dvd\\class\\codinit.c",
      0x17A2u,
      0);
  StreamClassDebugPrint(2, "Freeing filterinstance %p still open streams\n", v1);

The compiler copied an incoming argument (pFilterInstance) into a register (v1, held in esi). To get rid of the extra name, right-click the left-hand variable and choose “Map to another variable”, or place the cursor on it and press ‘=’.


Choose the right-hand variable from the list.


Once the decompilation is refreshed, both the left-hand variable (v1) and the assignment are gone. Only one variable remains: the incoming argument.

int __stdcall SciFreeFilterInstance(_FILTER_INSTANCE *pFilterInstance)
{
  if ( pFilterInstance->Signature != 'FrtS' )
    RtlAssert(
      "(pFilterInstance)->Signature==SIGN_FILTER_INSTANCE",
      "d:\\xpsprtm\\drivers\\wdm\\dvd\\class\\codinit.c",
      0x17A2u,
      0);
  StreamClassDebugPrint(2, "Freeing filterinstance %p still open streams\n",
    pFilterInstance);

You can map several variables to the same name, if necessary.

Made a mistake or mapped too much? It’s simple to fix. Right-click the wrongly mapped name and choose “Unmap variables”. Then choose the variable you want to see again.
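To illustrate when a mapping is safe, here is a minimal C sketch. The functions are hypothetical and are not decompiler output. In with_copy the local v1 is a plain copy of the argument and is never modified, so replacing v1 with arg (as in after_mapping) does not change behavior. In diverging_copy the copy is modified on its own, and wrong_mapping shows what the listing would claim if v1 were mapped to arg anyway.

```c
#include <assert.h>

/* v1 is a plain copy of the argument and is never modified,
   so both names always hold the same value. */
int with_copy(int arg)
{
    int v1 = arg;          /* compiler-introduced copy, e.g. argument moved to a register */
    if (arg < 0)
        return -1;
    return v1 * 2;
}

/* The same function after mapping v1 to arg: the copy and the assignment are gone. */
int after_mapping(int arg)
{
    if (arg < 0)
        return -1;
    return arg * 2;
}

/* Here the copy is changed independently, so v1 and arg diverge. */
int diverging_copy(int arg)
{
    int v1 = arg;
    v1 += 1;
    return arg + v1;
}

/* What the listing would show if v1 were mapped to arg regardless. */
int wrong_mapping(int arg)
{
    arg += 1;
    return arg + arg;
}

/* Small self-check of the claims above. */
void check_mapping_examples(void)
{
    assert(with_copy(5) == after_mapping(5));
    assert(with_copy(-3) == after_mapping(-3));
    assert(diverging_copy(3) != wrong_mapping(3));
}
```

If a mapped listing stops making sense in this way, unmapping the variable restores the separate names.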

2. Union selection

This feature, naturally, only applies to unions. That means that you need to have union types in your database and assign the types to some variables or fields.

Normally the decompiler tries to choose a union field which matches the expression best, but sometimes there are several equally valid matches, and sometimes other types in the expression are wrong. In such cases, you can override the decompiler’s decision. For example, this code is common in Windows drivers:

NTSTATUS __stdcall DispatchDeviceControl(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
  PIO_STACK_LOCATION stacklocation; // ebx@1

  stacklocation = Irp->Tail.Overlay.CurrentStackLocation;
  if ( *&stacklocation->Parameters.Create.FileAttributes == 0x224010 )
  {
    v8 = stacklocation->Parameters.Create.Options == 20;
    if ( !v8 )
      goto LABEL_18;
    if ( stacklocation->Parameters.Create.SecurityContext < 1 )
      goto LABEL_87;
    v23 = Irp->AssociatedIrp.MasterIrp;

Since we know we’re in a DeviceControl handler, it’s likely the code is inspecting the Parameters.DeviceIoControl substructure and not Parameters.Create.

Right-click the field and choose “Select union field”, or place the cursor on it and press Alt-Y.


Choose the Parameters.DeviceIoControl.IoControlCode field.


Other references to Parameters.Create can be fixed the same way. The updated decompilation makes more sense:

NTSTATUS __stdcall DispatchDeviceControl(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
  PIO_STACK_LOCATION stacklocation; // ebx@1

  stacklocation = Irp->Tail.Overlay.CurrentStackLocation;
  if ( stacklocation->Parameters.DeviceIoControl.IoControlCode == 0x224010 )
  {
    v8 = stacklocation->Parameters.DeviceIoControl.InputBufferLength == 20;
    if ( !v8 )
      goto LABEL_18;
    if ( stacklocation->Parameters.DeviceIoControl.OutputBufferLength < 1 )
      goto LABEL_87;
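Several fields can match equally well because all members of a union start at the same address. The sketch below uses a simplified, hypothetical model of the two members from the example (32-bit layout, pointers stored as uint32_t, trailing fields omitted); it is not the WDK definition of IO_STACK_LOCATION. It checks that the Create fields chosen by the decompiler sit at the same offsets as the DeviceIoControl fields selected manually.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit model of Parameters.Create (trailing fields omitted). */
struct create_params {
    uint32_t SecurityContext;   /* a pointer in the real structure */
    uint32_t Options;
    uint16_t FileAttributes;
    uint16_t ShareAccess;
};

/* Hypothetical 32-bit model of Parameters.DeviceIoControl (trailing fields omitted). */
struct ioctl_params {
    uint32_t OutputBufferLength;
    uint32_t InputBufferLength;
    uint32_t IoControlCode;
};

union parameters {
    struct create_params Create;
    struct ioctl_params  DeviceIoControl;
};

/* Store a value through one member and read the same bytes through the other. */
uint32_t read_as_create_options(uint32_t input_buffer_length)
{
    union parameters p = { .DeviceIoControl = { 0, input_buffer_length, 0 } };
    return p.Create.Options;
}

/* Each pair of fields from the example shares one offset. */
void check_union_overlap(void)
{
    assert(offsetof(struct create_params, SecurityContext) ==
           offsetof(struct ioctl_params, OutputBufferLength));
    assert(offsetof(struct create_params, Options) ==
           offsetof(struct ioctl_params, InputBufferLength));
    assert(offsetof(struct create_params, FileAttributes) ==
           offsetof(struct ioctl_params, IoControlCode));
    assert(read_as_create_options(20) == 20);
}
```

In this model FileAttributes is only 16 bits wide while IoControlCode is 32 bits, which is presumably why the first listing needed a cast to compare a full dword at that offset.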

3. CONTAINING_RECORD macro

This macro is commonly used in Windows drivers to get a pointer to the parent structure from a pointer to one of its fields.

For example, consider these two structures, used in a driver:

typedef struct _HW_STREAM_OBJECT {
  ULONG  SizeOfThisPacket;
  ULONG  StreamNumber;
  PVOID  HwStreamExtension;
  ...
} HW_STREAM_OBJECT, *PHW_STREAM_OBJECT;

struct _STREAM_OBJECT
{
  _COMMON_OBJECT ComObj;
  _FILE_OBJECT *FilterFileObject;
  _FILE_OBJECT *FileObject;
  _FILTER_INSTANCE *FilterInstance;
  _HW_STREAM_OBJECT HwStreamObject;
  ...
};

The following function accepts a pointer to _HW_STREAM_OBJECT:

void __cdecl StreamClassStreamNotification(
  int NotificationType,
  _HW_STREAM_OBJECT *StreamObject,
  _HW_STREAM_REQUEST_BLOCK *pSrb,
  _KSEVENT_ENTRY *EventEntry,
  GUID *EventSet,
  ULONG EventId);

However, it immediately converts that pointer into a pointer to the containing _STREAM_OBJECT:

mov     eax, [ebp+StreamObject]
test    eax, eax
push    ebx
push    esi
lea     esi, [eax-_STREAM_OBJECT.HwStreamObject]

The default decompilation doesn’t look great:

  char *v6; // esi@1
  v6 = (char *)&StreamObject[-2] - 36;

There are two ways to make it nicer:

  1. Change the type of v6 to _STREAM_OBJECT*. The decompiler will detect that the expression “lines up” and convert it to use the macro.
  2. Right-click the delta being subtracted (-36), select “Structure offset” and choose _STREAM_OBJECT from the list.

In both cases you should get a nice expression:

  v6 = CONTAINING_RECORD(StreamObject, _STREAM_OBJECT, HwStreamObject);

N.B.: currently you need to refresh the decompilation (press F5) to see the changes. We plan to make this happen automatically in the future.
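For readers unfamiliar with the macro, here is a minimal sketch of what it computes. The definition below is a portable equivalent built on offsetof, not a copy of the WDK header, and the two structures are small hypothetical stand-ins for _HW_STREAM_OBJECT and _STREAM_OBJECT. Their sizes are chosen so that the embedded field sits at two element sizes plus 4 bytes, mirroring the shape of the raw expression above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable equivalent of the macro: step back from the field to the start of the parent. */
#ifndef CONTAINING_RECORD
#define CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))
#endif

/* Hypothetical stand-in for _HW_STREAM_OBJECT (8 bytes). */
struct hw_object {
    int32_t size;
    int32_t number;
};

/* Hypothetical stand-in for _STREAM_OBJECT: 20 bytes precede the embedded object. */
struct stream_object {
    int32_t          header[5];
    struct hw_object hw;
    int32_t          tail;
};

/* The readable form, as produced once the types line up. */
struct stream_object *parent_via_macro(struct hw_object *p)
{
    return CONTAINING_RECORD(p, struct stream_object, hw);
}

/* The same address written as raw byte arithmetic: 2 * sizeof(element) + 4 == 20. */
struct stream_object *parent_via_raw_offset(struct hw_object *p)
{
    return (struct stream_object *)((char *)p - 2 * sizeof(struct hw_object) - 4);
}

/* Both forms recover the parent from a pointer to its field. */
void check_containing_record(void)
{
    struct stream_object obj = { {0}, {0, 0}, 0 };
    assert(offsetof(struct stream_object, hw) == 20);
    assert(parent_via_macro(&obj.hw) == &obj);
    assert(parent_via_raw_offset(&obj.hw) == &obj);
}
```

Both functions return the same pointer; the macro form simply names the structure and field instead of spelling out the byte distance.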

4. Kernel and user-mode macros involving fs segment access

On Windows, the fs segment is used to store various thread-specific (in user mode) or processor-specific (in kernel mode) data. Hex-Rays Decompiler 1.6 detects the most common ways of accessing this data and converts them to the corresponding macros. However, this functionality requires specific types to be present in the database: the _TEB structure for user mode and the KPCR structure for kernel mode.

For example, consider the following code:

mov     eax, large fs:18h
mov     eax, [eax+30h]
push    24h
push    8
push    dword ptr [eax+18h]
call    ds:__imp__RtlAllocateHeap@12 ; RtlAllocateHeap(x,x,x)
mov     esi, eax

If the _TEB structure is not among the types in your database, this is decompiled to:

  v5 = RtlAllocateHeap(*(_DWORD *)(*(_DWORD *)(__readfsdword(24) + 48) + 24), 8, 36);

However, if you do add the type, it will look much nicer:

  v5 = RtlAllocateHeap(NtCurrentTeb()->ProcessEnvironmentBlock->ProcessHeap, 8, 36);
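The constants in the raw output are structure offsets. On 32-bit Windows, fs:[18h] holds the TEB's pointer to itself, offset 30h in the TEB is the PEB pointer, and offset 18h in the PEB is the process heap. The sketch below models only those three fields with hypothetical padded structures (pointers stored as uint32_t, all other fields replaced by padding) and checks that the offsets equal the constants 24, 48 and 24 from the listing. The names teb32 and peb32 are made up for this illustration; the real definitions come from the PDB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the 32-bit PEB: only ProcessHeap is named. */
struct peb32 {
    uint8_t  omitted[0x18];
    uint32_t ProcessHeap;               /* offset 0x18 */
};

/* Hypothetical model of the 32-bit TEB: only the two fields used in the example are named. */
struct teb32 {
    uint8_t  omitted0[0x18];
    uint32_t Self;                      /* offset 0x18, read via fs:[18h] */
    uint8_t  omitted1[0x14];
    uint32_t ProcessEnvironmentBlock;   /* offset 0x30 */
};

size_t teb_self_offset(void) { return offsetof(struct teb32, Self); }
size_t teb_peb_offset(void)  { return offsetof(struct teb32, ProcessEnvironmentBlock); }
size_t peb_heap_offset(void) { return offsetof(struct peb32, ProcessHeap); }

/* The three offsets match __readfsdword(24), + 48 and + 24 in the raw listing. */
void check_fs_chain_offsets(void)
{
    assert(teb_self_offset() == 24);
    assert(teb_peb_offset() == 48);
    assert(peb_heap_offset() == 24);
}
```

With the real _TEB type available, the decompiler can replace each of these numeric steps with the corresponding field name.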

Currently we support the following macros:

Macro                        Required types
NtCurrentTeb                 _TEB
KeGetPcr                     KPCR
KeGetCurrentPrcb             KPCR, KPRCB
KeGetCurrentProcessorNumber  KPCR
KeGetCurrentThread           KPCR, _KTHREAD

Hint: the easiest way to get the _TEB or KPCR types into your database is to use the PDB plugin. Invoke it from File|Load file|PDB file…, enter the path to kernel32.dll (for user-mode code) or ntoskrnl.exe (for kernel-mode code), and check the “Types only” checkbox.


PDBs for those two files usually contain the necessary OS structures.

We hope you will like these new additions. Note that version 1.6 includes even more improvements and fixes; see the full list of new features and the comparison page.


9 Responses to New features in Hex-Rays Decompiler 1.6

  1. JayZ says:

    nice but Hex-Rays Decompiler is way too expensive to purchase since its useless on obfuscated code…

  2. None says:

    Looks great. You do realize there is no link to the product or the company website on your blog right?

  3. Jakob says:

    The PDB’s for _TEB and the likes are actually located in ntdll.pdb, so load ntdll.dll. Also remember that if you are on a 64-bit system, the 32-bit version of the dll is inside C:\Windows\SysWOW64 folder.

  4. hiber says:

    The decompiler does help a lot while reversing.
    I’m waiting for the decompiler to support PowerPC and MIPS (PowerPC is preferred).
    As PowerPC is also a RISC processor and doesn’t have 32/16-bit mixed code, I think it should be easier to implement the decompiler for PowerPC than for ARM.
    However, there are fewer users of PowerPC than ARM :-(

  5. Alexandre says:

    I wish there will be 2 things in the next update:
    1) at least simple c++ inheritance for structs and appropriate pointer handling
    2) hexrays will have a way to detect iterators. e.g. when decompiled code looks like:
    for (void *i = &struct_array[0].some_center_field; i < &struct_array[99].some_center_field; i += sizeof(struct_array[0]))
    {
    *(i + 4) = ….
    int a = (int *)i[-4];
    …etc

    it is really annoying when loops are complex and there are many fields

  6. Avi Cohen Stuart says:

    It is nice to see how HexRays has been growing up since the first release (of which I proudly was one of the beta testers)
    The potential is ever growing with each release including the ease of use and the highly extendable architecture using plugins.
    I think that users who either complain about the high price or the ‘annoyingness’ about some corners which for sure will be taken seriously by the HexRay development team.

    Guys, again Great Job!

    (Nice to see that the stream.sys I once send in maintains an interesting example :-)