Thursday, 11 October 2018

Warnings Series - Sometimes Uninitialized

In this series of posts I'm discussing how compiling with alternative compilers can help to catch errors at development time. In the last post I discussed how to detect hidden overloads. In this post, I'm going to look at some recent errors I came across when getting a legacy library to compile under clang.

Background

The library is an old library that was written with C++98 and was in the style of "C with classes" rather than C++. When written the author followed a single return policy from functions, which was achieved by using a goto cleanup pattern. For a simple example see:

1
2
3
4
5
6
7
8
9
10
11
12
int func(int num)
{
    int ret = 0;
 
    if(num < 50)
        goto cleanup;
 
    ret = num;
 
cleanup:
    return ret;
}

The error handling was normally activated by the use of macros that would check a condition and if false, log a message, set the return code, and then goto cleanup. This programming style lead to a lot of functions that have the following pattern:

  • Declare all variables at the start of the function.
  • Check inputs.
  • Function logic (with checks).
  • Cleanup.

Uninitialized Variables

All compilers have some form of uninitialized checking in place. For example GCC, Clang and MSVC will all warn about the uninitialized use of mid in the following function:

1
2
3
4
5
6
7
8
int test(int num)
{
    int mid;
    if(num < mid)
        return 100;
 
    return 0;
}

Maybe / Sometimes Uninitialized

However, with more complex use cases some compilers can miss the use of an uninitialized variable. Below is a simplified (and nonsensical) example of triggering such a use case:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// error handling macro
#define check(A, E)               \
    if ((A)) {                    \
        ret = E;                  \
        goto cleanup;             \
    }
 
int* get_mid()
{
    return new int(50);
}
 
int round_100(int num)
{
    int* mid;
    int ret = 0;
 
    check(num < 0, -1) // make sure input was valid
 
    mid = get_mid(); // get mid point
    check(mid, -1) // check mid was allocated
 
    check(num < *mid, 0)
    ret = 100;
cleanup:
    delete mid;
    return ret;
}

As mentioned above, this uses a macro for error checking and the goto-cleanup pattern of error handling. If a number is entered and is less that 0, we return -1. If it is greater than a median value (which we declare as the pointer mid), we return the value of mid. Otherwise we return 100.

Some basic unit tests for this might be created as:

1
2
3
4
5
6
7
void test_round()
{
    assert(round_100(0) == 0);
    assert(round_100(45) == 0);
    assert(round_100(55) == 100);
    assert(round_100(1000) == 100);
}

Using the above code on GCC (with -Wall), everything compiles without warning and all tests pass.

Problem with the code

There are a lot of problems with this code including lack of RAII, use of goto, etc. but for this post I'm only going to look at the code as it currently is without a full refactor.

On the happy path, the code above works fine and does what it is supposed to do. However, if the user enters a number of less than 0 then we go to the cleanup path and attempt to delete the mid pointer. This results in us calling delete on uninitialized memory and leads to undefined behaviour.

Catching the error

Clang

If you compile the library code with clang (again with -Wall) it gets the following warning:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
<source>:24:5: warning: variable 'mid' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
    check(num < 0, -1);
    ^~~~~~~~~~~~~~~~~~
<source>:14:9: note: expanded from macro 'check'
    if ((A)) {                    \
        ^~~
<source>:30:12: note: uninitialized use occurs here
    delete mid;
           ^~~
<source>:24:5: note: remove the 'if' if its condition is always false
    check(num < 0, -1);
    ^~~~~~~~~~~~~~~~~~
<source>:14:5: note: expanded from macro 'check'
    if ((A)) {                    \
    ^
<source>:21:13: note: initialize the variable 'mid' to silence this warning
    int* mid;
            ^
             = nullptr

This warning is long but tells us where the error happens, and even gives a helpful tip on how to fix it by initializing mid to nullptr.

GCC

As mentioned, on GCC with the compiler flags -Wall, there are no errors in the code. Enabling other warnings with -Wextra, -pedantic, and even -Wmaybe-uninitialized all fail to trigger the warnings.

MSVC

On MSVC compiling with the /W4 flag, the warning appears as below:

1
2
z:\tmp\example.cpp(30) : warning C4701: potentially uninitialized local variable 'mid' used
z:\tmp\example.cpp(30) : warning C4703: potentially uninitialized local pointer variable 'mid' used

While this isn't as helpful as the clang warning, it does tell us that there is a problem with the code.

Comparing the errors

To view the errors from each compiler you can use godbolt to compare the output

No comments:

Post a Comment