Casts are bad

Halvar and Dennis Elser recently blogged about a serious vulnerability in the ATL libraries. A few days ago, Microsoft released an emergency “out-of-band” patch. Yes, the bug was that nasty, and since it is in a library, many MS Windows components were affected. Everyone who used the library should review their code and recompile with the corrected version.


Microsoft posted more details about the vulnerability. As it turns out, it was quite difficult to catch the bug. Automatic tools, like source code scanners and fuzzers, could not detect it. The very bug consists of a small typo, and it is understandable that it could be missed during code reviews. Humans tend to skip punctuation marks and focus on “important” stuff. However, everything is important in the source code, even spacing.

Source code analysis tools and the compiler could not catch the bug because of a cast in the source code.
Casts are a way to tell the compiler that an object of one kind should be treated as something different. For example, we have a 4 byte object and we can tell the compiler to treat it as a 2 byte object. The opposite is also possible: a 2 byte object can be treated as a 4 byte object. Naturally, when we use a cast, all responsibility is on us. A cast turns off all compiler checks and leaves us face-to-face with the cruel real world of bits and bytes. If we tell the compiler that the object is bigger than it is, all kinds of odd things can happen, including security breaches.

Since casts are that powerful, they should be used very sparingly. My attitude is to use them only there is no other choice. I know that my position differs from the mainstream. Many textbooks teach to use casts to “document” the code. Like, if a function expects a LONG parameter, they advise cast an uint variable to LONG. This is wrong. Let us check the vulnerable code, it is literally ridden with such casts:

__int64 cbSize;
hr = pStream->Read((void*) &cbSize, sizeof(cbSize), NULL);
BYTE *pbArray;
HRESULT hr = SafeArrayAccessData(psa, reinterpret_cast<LPVOID *>(&pbArray));
hr = pStream->Read((void*)&pbArray, (ULONG)cbSize, NULL);

This is a really bad code snippet. It illustrates how casts are overused nowadays, which is a bad coding practice. 100% of statements above have casts; while the ratio of statements with casts to statements without casts should be 1:100 or less. Here’s the same code, with all casts removed (and the erroneous & removed):

__int64 cbSize;
hr = pStream->Read(&cbSize, sizeof(cbSize), NULL);
BYTE *pbArray;
HRESULT hr = SafeArrayAccessData(psa, &pbArray);
hr = pStream->Read(pbArray, cbSize, NULL);

I did not try to compile it but it should pass all compiler checks and behave the same way. For me, the version without casts is shorter, simpler, easier to read. Why use any casts at all?

It is true that removing some casts leads to compiler warnings. Assignments that may truncate values, or comparisons of signed/unsigned integers are marked by many compilers as problematic. I would personally just turn off these warnings instead of damaging the source code with casts.

Casts to void* have no raison d’etre at all. I always wonder why anyone would cast an address expression to void*.

It even gets funny. Some programmers go as far as marking function calls with void:

(void)func();

This is their way of telling the reader that the return value of the function is not used. Well, isn’t the following simpler?

func();

C/C++ are not verbose languages after all.

Well, to keep it short, I think the current practices of using casts should be revised. I would opt for using casts only when it is absolutely necessary. By the way, our decompiler generates only necessary casts, with the exception of floating point-integer casts, which are always generated. In some cases even necessary casts are undesired (they obscure the program logic), so there is a command to temporarily hide all casts.

Keeping the source code simpler and shorter will allow us to avoid bugs like this. Less casts also mean that source code analysis tools will perform better.

Less casts – better code.

This entry was posted in Programming, Security. Bookmark the permalink.

5 Responses to Casts are bad

  1. David Thomas says:

    Casts such as
    (void)func();
    may exist to suppress warnings from static source code checkers.

  2. Torkell says:

    Your rss feed is currently broken – the angle brackets for the reinterpret_cast parameter aren’t escaped (they don’t appear on this page either).

  3. Ilfak Guilfanov says:

    Torkell, thank you. The problem was with the source code, not the rss feed. I fixed it.

  4. Johan says:

    I did not try to compile it but it should pass all compiler checks and behave the same way. For me, the version without casts is shorter, simpler, easier to read. Why use any casts at all?

    Because it does not compile without all of them. In particular

    HRESULT hr = SafeArrayAccessData(psa, reinterpret_cast(&pbArray));

    is not a cast to void*, it’s a cast to void**. &pbArray is a BYTE** and your compiler will typically tell you something similar to

    argument of type "BYTE **" is incompatible with parameter of type "LPVOID *"

    The casts that are to void* makes less sense though. You could imagine a scenario with overloaded functions where __int64* would call a function that does the wrong thing, but there is only one Read defined in this case as far as I can tell.
    I actually prefer the cast to ULONG over disabling the warnings. It shows that you understand that the compiler is handing the variable off to someone who will treat the value differently. I guess that part is debatable, but you can’t get away from the cast to LPVOID*. The problem there imo is that the called function takes a LPVOID*, but that’s COM for you.
    All this said, I fully agree with your basic aversion to casts and they are one reason I found COM code in C++ so deeply repulsive.
    Speaking of casts I am yet to write anything that navigates the structure of a PE-file without it turning into Cast-mania 3000. Since I assume you’ve done that more times than you care to recall: have you found a way?

  5. Ilfak Guilfanov says:

    I agree, “BYTE**” can not be automatically converted to “void **”. By the way, why not declare pbArray as “void*”?
    I do not remember how the PE loader works but I prefer to hide casts in inline functions. Such a function is usually written once and called many times. The compiler can check all call sites, the only thing is to implement the function correctly.