x64 decompiler not far away

Just a short post to show you the current state of the x64 decompiler. It already mostly works, but we still have some minor problems to solve. Consider this source code:

struct color_t
{
  short red;
  short green;
  short blue;
  short alpha;
};

extern color_t lighten(color_t c);

color_t func(int red, int green, int blue, int alpha)
{
  color_t c;
  c.red = red;
  c.green = green;
  c.blue = blue;
  c.alpha = alpha;
  return lighten(c);
}

After compilation we get the following binary code:

.text:0000000000000000 ?func@@YA?AUcolor_t@@HHHH@Z proc near
.text:0000000000000000
.text:0000000000000000 c = color_t ptr -18h
.text:0000000000000000 var_10 = qword ptr -10h
.text:0000000000000000 arg_0 = dword ptr 8
.text:0000000000000000 arg_8 = dword ptr 10h
.text:0000000000000000 arg_10 = dword ptr 18h
.text:0000000000000000 arg_18 = dword ptr 20h
.text:0000000000000000
.text:0000000000000000 mov [rsp+arg_18], r9d ; $LN3
.text:0000000000000005 mov [rsp+arg_10], r8d
.text:000000000000000A mov [rsp+arg_8], edx
.text:000000000000000E mov [rsp+arg_0], ecx
.text:0000000000000012 sub rsp, 38h
.text:0000000000000016 movzx eax, word ptr [rsp+38h+arg_0]
.text:000000000000001B mov [rsp+38h+c.red], ax
.text:0000000000000020 movzx eax, word ptr [rsp+38h+arg_8]
.text:0000000000000025 mov [rsp+38h+c.green], ax
.text:000000000000002A movzx eax, word ptr [rsp+38h+arg_10]
.text:000000000000002F mov [rsp+38h+c.blue], ax
.text:0000000000000034 movzx eax, word ptr [rsp+38h+arg_18]
.text:0000000000000039 mov [rsp+38h+c.alpha], ax
.text:000000000000003E mov rcx, qword ptr [rsp+38h+c.red] ; c
.text:0000000000000043 call ?lighten@@YA?AUcolor_t@@U1@@Z ; lighten(color_t)
.text:0000000000000048 mov [rsp+38h+var_10], rax
.text:000000000000004D mov rax, [rsp+38h+var_10]
.text:0000000000000052 add rsp, 38h
.text:0000000000000056 retn
.text:0000000000000056 ?func@@YA?AUcolor_t@@HHHH@Z endp

Please note that c, which is a structure, is passed by value in 2 registers: rcx and rdx. We had to rework quite a few things in the decompiler to support such variables (we call them scattered variables). However, the output was worth it:

color_t __fastcall func(__int16 cx0, __int16 dx0, __int16 r8_0, __int16 r9_0)
{
  color_t c;

  c.red = cx0;
  c.green = dx0;
  c.blue = r8_0;
  c.alpha = r9_0;
  return lighten(c);
}

There is still some work to be done, but it seems we have solved the most problematic issues. Stay tuned, there will be more decompiler news soon!

 

8 Responses to x64 decompiler not far away

  1. Jonathan says:

    This looks great. Will it be available as a free upgrade to existing Hex-Rays customers, or will it be sold as a separate 64-bit purchase?

  2. Carsten says:

    Nice work!

    Small correction for your text: the struct is passed in RCX only (4 * 16 == 64); DX is only involved as a source.

    I wonder how you can determine that (R/E)DX is not a parameter of lighten()? Was lighten’s signature given?

    • Ilfak Guilfanov says:

      You are completely right, Arnaud already pointed out the discrepancy to me but I had no time to fix it.

      The prototype of lighten() was manually specified as in the source code. Without it, the decompiler would assume a simple __int64 value (there is no way the decompiler could come up with a complex type definition such as color_t on its own). The rest was done automatically: based on the prototype, the decompiler found the input register (RCX) and produced a nice output.
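The size arithmetic behind the single input register (4 * 16 == 64) can be checked directly. The following is only a minimal sketch, assuming short is 16 bits as on the usual x64 ABIs; as_register and from_register are hypothetical helper names used for illustration and are not part of the decompiler or of the original source.

```cpp
#include <cstdint>
#include <cstring>

// Same layout as the structure in the post.
struct color_t
{
  short red;
  short green;
  short blue;
  short alpha;
};

// Four 16-bit fields and no padding: the whole structure is 64 bits wide.
// (Assumption: short is 16 bits, which holds on the common x64 ABIs.)
static_assert(sizeof(color_t) == sizeof(std::uint64_t),
              "color_t is expected to fit in one 64-bit register");

// Hypothetical helper: the 64-bit image of a color_t, i.e. the kind of
// value a single register such as RCX can carry.
std::uint64_t as_register(const color_t &c)
{
  std::uint64_t r;
  std::memcpy(&r, &c, sizeof r);
  return r;
}

// Hypothetical helper: rebuild the structure from its 64-bit image.
color_t from_register(std::uint64_t r)
{
  color_t c;
  std::memcpy(&c, &r, sizeof c);
  return c;
}
```

Converting a color_t with as_register and back with from_register returns the same four fields, which is why one 64-bit register is enough to pass the structure by value.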

      By the way, the same source code compiled by g++ gives a very different binary:

      _Z4funciiii proc near
      sub rsp, 8
      mov eax, 0
      mov ax, di
      movzx esi, si
      shl rsi, 10h
      mov rdi, 0FFFFFFFF0000FFFFh
      and rax, rdi
      or rax, rsi
      movzx edx, dx
      shl rdx, 20h
      mov rsi, 0FFFF0000FFFFFFFFh
      and rax, rsi
      shl rcx, 30h
      mov eax, eax
      or rax, rdx
      or rax, rcx
      mov rdi, rax ; c
      call _Z7lighten7color_t ; lighten(color_t)
      add rsp, 8

      locret_4F: ; DATA XREF: .eh_frame:000000000000007Co
      retn

      and the decompiler output is less beautiful:

      color_t __fastcall func(color_t a1, __int16 a2, __int16 a3, __int64 a4)
      {
        a1.green = a2;
        a1.alpha = a4;
        a1.blue = a3;
        return lighten(a1);
      }

      This happens because the low part of RDI represents two things at the same time: the input argument a1 and the argument to lighten() call. I guess we will have to introduce a fake assignment to separate the different roles of RDI.2
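For readers following the g++ listing, the shifts and masks appear to do nothing more than assemble the four 16-bit fields into one 64-bit value before the call. The sketch below restates that arithmetic in C++; pack_color is a hypothetical name chosen for illustration, and the field order assumes the little-endian layout used on x64.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the arithmetic of the g++ listing:
// build the 64-bit image of color_t from the four int arguments.
std::uint64_t pack_color(int red, int green, int blue, int alpha)
{
  // Keep only the low 16 bits of each argument, as the listing does.
  const std::uint64_t r = static_cast<std::uint16_t>(red);    // mov ax, di
  const std::uint64_t g = static_cast<std::uint16_t>(green);  // movzx esi, si
  const std::uint64_t b = static_cast<std::uint16_t>(blue);   // movzx edx, dx
  const std::uint64_t a = static_cast<std::uint16_t>(alpha);  // low word of rcx

  // red in bits 0..15, green in 16..31, blue in 32..47, alpha in 48..63.
  return r | (g << 16) | (b << 32) | (a << 48);
}
```

With this picture in mind, the first argument and the structure being built share the same register from the first instruction on, which matches the explanation of why a1 shows up with the type color_t in the output.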

  3. April says:

    Is this an April Fools’ joke?

  4. Avi Cohen Stuart says:

    This is really great!
    I cannot wait to get a taste of the new x64 Decompiler!

  5. Joxean says:

    It looks very good! Keep up the good work!