Calculating API hashes with IDA Pro

When debugging malware, you will often find that the sample imports no functions at all. Instead, it replaces API names with hashes and resolves each address at run time by searching for the exported name whose hash matches the desired value.

In this blog post we are going to demonstrate how to use IDA Pro to solve this problem and uncover all API hashes.

[Figure: hash_calc]

Background

To illustrate the problem, imagine this piece of code:

[Figure: example1]

After tracing this code, we discovered that the function called at 0x405DA4 should be renamed to ‘call_by_hash’: its first argument is the hash of the API to call, followed by the actual parameters for that API.

call_by_hash() starts by reading the PEB pointer from the TIB, locates the LDR_MODULE structure of kernel32.dll, parses its PE structure, and finally retrieves the IMAGE_EXPORT_DIRECTORY to get both the AddressOfFunctions and AddressOfNames fields:

[Figure: resolve_by_hash]
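To make the fields mentioned above concrete, here is a minimal sketch that unpacks an IMAGE_EXPORT_DIRECTORY from raw bytes. The field layout follows the documented PE format; the helper name parse_export_directory and the sample values are our own illustrations and are not taken from the malware or from the script.

```python
import struct
from collections import namedtuple

# Field layout of IMAGE_EXPORT_DIRECTORY as documented in the PE format.
ExportDirectory = namedtuple("ExportDirectory", [
    "Characteristics", "TimeDateStamp", "MajorVersion", "MinorVersion",
    "Name", "Base", "NumberOfFunctions", "NumberOfNames",
    "AddressOfFunctions", "AddressOfNames", "AddressOfNameOrdinals",
])

# Two DWORDs, two WORDs, seven DWORDs, little-endian: 40 bytes in total.
EXPORT_DIR_FORMAT = "<IIHHIIIIIII"
EXPORT_DIR_SIZE = struct.calcsize(EXPORT_DIR_FORMAT)


def parse_export_directory(raw):
    """Unpack the first 40 bytes of `raw` into an ExportDirectory tuple."""
    return ExportDirectory(*struct.unpack(EXPORT_DIR_FORMAT, raw[:EXPORT_DIR_SIZE]))


# Hypothetical directory: every value below is made up for the example.
sample = struct.pack(EXPORT_DIR_FORMAT,
                     0, 0, 0, 0, 0x2000, 1, 3, 3, 0x1000, 0x1100, 0x1200)
exports = parse_export_directory(sample)
print(hex(exports.AddressOfFunctions), hex(exports.AddressOfNames))
```

The address fields hold RVAs, so a real resolver adds the module base to them before reading the name and function arrays.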

With this information, the malware iterates through all the exported names, using the calc_hash() function to compare the hash of each name against the hash value it was given:

[Figure: calc_hash]
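The lookup loop can be sketched in Python as follows. Note that standin_hash is NOT the sample's calc_hash(): it is a rotate-right-by-13 hash that we assume here purely as a placeholder, and the export names are made up for the example.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def standin_hash(name):
    """Hypothetical stand-in for calc_hash(): rotate right by 13, add each byte."""
    h = 0
    for byte in name.encode("ascii"):
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


def resolve_by_hash(export_names, wanted_hash, hash_func=standin_hash):
    """Return (index, name) of the first export whose hash matches, else None."""
    for index, name in enumerate(export_names):
        if hash_func(name) == wanted_hash:
            return index, name
    return None


# Made-up export list; a real one would come from AddressOfNames.
names = ["CreateFileA", "LoadLibraryA", "VirtualAlloc"]
wanted = standin_hash("LoadLibraryA")
print(resolve_by_hash(names, wanted))
```

In the real code, the matching index would then be used to fetch the function address from the export tables.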

This function is self-contained, so it would be easy to extract from the code, reassemble, and use in an external program to calculate the hashes. Instead of going through this hassle, we will write an IDAPython script that uses this function to calculate the hashes of all the exported names in this debugging session.

Writing the script

Before writing the script, let us break the task down into smaller steps:

  1. Extract the calc_hash function body and make it available for use from our script
  2. Find all exported names
  3. Pass each name to calc_hash() and remember the result
  4. Display all the hashes in a nice chooser window

Using calc_hash() from the IDAPython script

Since calc_hash() is self-contained, we can directly extract its body from the database and use ctypes to call it:

[Figure: calc_by_hash]

(Remember to free the memory when done)
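For readers who want to experiment with the idea outside of IDA, here is a rough sketch of the same technique: copy raw machine code into executable memory and wrap it with ctypes. It makes several assumptions that differ from our setup. It uses POSIX mmap on x86-64 rather than the Windows memory APIs, and the five code bytes are a tiny hand-assembled add function standing in for the extracted calc_hash() body. The helper names are hypothetical.

```python
import ctypes
import mmap
import os
import platform

# x86-64 System V code for: uint32_t add(uint32_t a, uint32_t b) { return a + b; }
ADD_CODE = bytes([0x89, 0xF8,   # mov eax, edi
                  0x01, 0xF0,   # add eax, esi
                  0xC3])        # ret


def native_call_supported():
    """This sketch only targets POSIX systems running on x86-64."""
    return os.name == "posix" and platform.machine() in ("x86_64", "AMD64")


def load_native_function(code, restype, *argtypes):
    """Copy machine code into an executable buffer and wrap it with ctypes.

    Returns (function, buffer). The buffer must stay open while the function
    is in use; close it afterwards to free the memory.
    """
    buf = mmap.mmap(-1, len(code), flags=mmap.MAP_PRIVATE,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    buf.write(code)
    address = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    return ctypes.CFUNCTYPE(restype, *argtypes)(address), buf


def demo_add(a, b):
    """Run the sample code; return None where the sketch cannot run."""
    if not native_call_supported():
        return None
    try:
        add, buf = load_native_function(
            ADD_CODE, ctypes.c_uint32, ctypes.c_uint32, ctypes.c_uint32)
    except OSError:
        return None  # the system may refuse executable mappings
    try:
        return add(a, b)
    finally:
        del add
        buf.close()  # free the memory when done


print(demo_add(2, 3))
```

Calling the wrapped function after the buffer has been closed would jump into unmapped memory, which is why the cleanup happens only once the result is in hand.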

Or we can use the Appcall mechanism to do just the same thing:

[Figure: appcall]

Because we will be calling calc_hash() at least 9000 times, we opt for the first solution, which is much faster. Fortunately, Appcall remains available as a fallback: we do not always have the luxury of a self-contained function that can be mapped into the Python host and executed as is.

Finding exported names and calculating their hashes

Every time a debugging session starts, the IDA Pro debugger asks the debugger module to provide a set of debug names. The debug names are essentially the exported names of all loaded modules. To retrieve this list programmatically, we use idaapi.get_debug_names():

[Figure: get_dn]

Each returned debug name has the format ‘Modulename_ApiName’, which is why we split the returned debug name at the first underscore character.
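The splitting step on its own can be sketched like this. The helper name split_debug_name and the sample names are ours, chosen only to illustrate the ‘Modulename_ApiName’ format.

```python
def split_debug_name(debug_name):
    """Split 'Modulename_ApiName' at the first underscore only."""
    module, separator, api = debug_name.partition("_")
    if not separator:
        # No underscore at all: treat the whole string as the API name.
        return "", debug_name
    return module, api


# Made-up debug names for illustration.
for name in ("kernel32_CreateFileA", "msvcrt__lock"):
    print(split_debug_name(name))
```

Splitting at the first underscore keeps API names that themselves start with or contain underscores intact, as the second sample shows. It would, however, mis-split a module whose own name contains an underscore.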

Now, to calculate the hashes of all the names, it is sufficient to loop through the list like this:

[Figure: get_hashes]
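Outside of IDA, the same loop can be approximated with a plain dictionary. In this sketch we assume the debug names arrive as a mapping from address to ‘Modulename_ApiName’ strings. The addresses are made up, and placeholder_hash (a CRC32) merely stands in for whatever hash function the sample really uses.

```python
import zlib


def placeholder_hash(name):
    """Stand-in hash for the example; a real run would call calc_hash()."""
    return zlib.crc32(name.encode("ascii")) & 0xFFFFFFFF


def build_hash_table(debug_names, hash_func):
    """Return a list of (module, hash, api) rows, ordered by address."""
    rows = []
    for _address, debug_name in sorted(debug_names.items()):
        module, _, api = debug_name.partition("_")
        rows.append((module, hash_func(api), api))
    return rows


# Hypothetical debug names keyed by made-up addresses.
debug_names = {
    0x10001000: "kernel32_CreateFileA",
    0x10002000: "kernel32_LoadLibraryA",
    0x20001000: "ntdll_RtlInitUnicodeString",
}
rows = build_hash_table(debug_names, placeholder_hash)

# Reverse index: from a hash found in the malware back to the API name.
by_hash = {hash_value: (module, api) for module, hash_value, api in rows}
print(by_hash[placeholder_hash("LoadLibraryA")])
```

Passing the hash function in as a parameter means the ctypes-based and Appcall-based callers can share the same loop.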

Putting it all together

Now that we have solved all the issues, we will present the results in a nice chooser and provide two facilities: one to import the hashes into IDA Pro’s enumeration window and the other to export the API hash list to a text file for external processing.

Writing a chooser window is trivial (please refer to the source code or the ex_choose2 sample in the IDAPython package).

This is what the end result looks like:

[Figure: hashcalc_menu]

The chooser window displays the module name, the hash value and the API name. A popup menu is created so that one can either export the list to a text file or import the values as enums:

[Figure: hashcalc_enums]
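As a rough idea of what the two facilities boil down to, here is a sketch that works on (module, hash, api) rows. The tab-separated file layout and the ‘module_api’ enum member naming are assumptions made for this example and may differ from what the downloadable script does. The actual enum creation inside IDA is left out.

```python
import io


def export_hashes(rows, stream):
    """Write one 'module<TAB>hash<TAB>api' line per row to a text stream."""
    for module, hash_value, api in rows:
        stream.write("%s\t%08X\t%s\n" % (module, hash_value, api))


def enum_members(rows):
    """Yield (member_name, value) pairs suitable for feeding an enum."""
    for module, hash_value, api in rows:
        yield "%s_%s" % (module, api), hash_value


# Made-up rows; the hash values are placeholders, not real API hashes.
rows = [("kernel32", 0x00001234, "CreateFileA"),
        ("ntdll", 0xDEADBEEF, "RtlInitUnicodeString")]

buffer = io.StringIO()
export_hashes(rows, buffer)
print(buffer.getvalue(), end="")
print(list(enum_members(rows)))
```

Prefixing the member name with the module keeps names unique when two modules export the same API name.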

Using the script

To use the script, launch the debugger, trace until you locate the function that calculates the hash, and name it calc_func.

If the function is not self-contained, you need to prototype the function properly (by pressing ‘y’ in IDA Pro) and use the Appcall method instead of the ctypes method.

If you want to enhance the disassembly listing, select “Import as Enum” after running the script.
Later, when you encounter something like this:

[Figure: ex-b4]

You may simply press ‘m’ and select the appropriate hash constant, for example:

[Figure: apply_enum]

And after that you will get something like this:

[Figure: ex-a4]

Hope you found this blog post useful.

This script has been tested with IDA Pro 6.0 and may be downloaded from here.

Your comments and suggestions are welcome.

11 Responses to Calculating API hashes with IDA Pro

  1. Pingback: Tweets that mention Calculating API hashes with IDA Pro | Hex Blog -- Topsy.com

  2. Dinesh Venkatesan says:

    Nice Article. It will be more helpful if the MD5 value of the sample that was used for the demonstration is given.

  3. Elias Bachaalany says:

    Hello Dinesh,

    This is the MD5: 8a7c0d76e0e8c4d447d88c606f81b6b8

  4. Dinesh Venkatesan says:

    Hi, Please clarify this quick question.
    The figure http://www.hexblog.com/wp-content/uploads/2010/10/example1.gif shows that the target API is named as “call_by_hash” and the figure http://www.hexblog.com/wp-content/uploads/2010/10/resolve_by_hash.gif shows it as “resolve_by_hash”

  5. Dinesh Venkatesan says:

    Got it. I assume it might be some other function which is being called from call_by_hash().

  6. Elias Bachaalany says:

    Yes Dinesh, you are right.

    Mainly this is the logic:

    Malware calling an API -> DWORD call_by_hash(DWORD api_hash, …) -> FARPROC resolve_by_hash(DWORD hash) -> DWORD calc_hash(char *name, int x)

  7. Tomer Teller says:

    Very nice article.

  8. Oliver says:

    Nice write-up. Thanks for the effort, Elias.

  9. binoopang says:

    awesome article :)
    I have a question. what binary do you reverse?
    I found hashed api in the bredolab.

    • Elias Bachaalany says:

      Thank you for your comments!

      The binary MD5 is in a reply above. You can download it from offensivecomputing.net