Type punning with unions

Posted on 2014-05-25

TLDR: don’t use unions for type punning; always use memcpy.

Sometimes you might want to reinterpret a value of one type as a value of another type. For example, you might have an integer parameter that you know actually holds the bits of a float.

Let’s assume the integer is large enough to store the float, and now you want to get the value back. If you follow Wikipedia you might do something like this:

float get_float(int i) {
    union {
        int i;
        float f;
    } x;
    x.i = i;
    return x.f;
}

Usually this works fine; as long as you access the values directly through the union members, probably no compiler will screw this up. (Whether this is actually supposed to work depends on the language: C11 explicitly allows reading a member other than the one last written, with the bytes reinterpreted in the new type, while C++ treats it as undefined behavior.)

But if you start using pointers (or references in C++) this can fail very fast:

int return0(int *i, float *f) {
    *i = 0;
    *f = 1;
    return *i;
}

int test_union_type_punning() {
    union {
        int i;
        float f;
    } x;
    return return0(&x.i, &x.f);
}

Compiled with gcc 4.9.1 using gcc -Wall -O2 -S type-punning.c, I get this:

return0:
    .cfi_startproc
    movl    $0, (%rdi)
    xorl    %eax, %eax
    movl    $0x3f800000, (%rsi)
    ret
    .cfi_endproc

test_union_type_punning:
    .cfi_startproc
    xorl    %eax, %eax
    ret
    .cfi_endproc

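Although in return0 i and f refer to the same memory location, and the write to *f comes after the write to *i, the compiler does not reload *i after the write to *f. The store to *f is still emitted, but return0 returns a hard-coded 0 (the xorl %eax, %eax), and test_union_type_punning is folded to a constant 0 entirely. This follows from “strict aliasing”: the compiler may assume that a write through a float pointer will not modify any int object, and therefore that the integer value it just wrote is still present.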

Even inlining doesn’t help the compiler to see it, and it also doesn’t print a warning.
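For comparison, here is a rough sketch of the same experiment with every access going through memcpy. The names return_last_write and test_memcpy_type_punning are made up for this illustration, and the sketch assumes that int and float have the same size. Because memcpy copies raw bytes, the compiler has to assume the two pointers may overlap and must reload the value after the second store.

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>   /* memcpy */

/* Assumption of this sketch: int and float occupy the same number of bytes. */
static_assert(sizeof(int) == sizeof(float),
              "int and float must have the same size");

/* Hypothetical memcpy-based counterpart of return0. */
int return_last_write(void *i, void *f) {
    const int zero = 0;
    const float one = 1.0f;
    int result;

    memcpy(i, &zero, sizeof zero);      /* store the int 0 */
    memcpy(f, &one, sizeof one);        /* store the float 1, maybe over the same bytes */
    memcpy(&result, i, sizeof result);  /* read back whatever i points to now */
    return result;
}

/* Hypothetical caller: both pointers refer to the same storage. */
int test_memcpy_type_punning(void) {
    int storage;
    return return_last_write(&storage, &storage);
}
```

On a platform with 32-bit int and IEEE 754 floats (which the 0x3f800000 constant in the assembly above suggests), test_memcpy_type_punning should return the bit pattern of 1.0f rather than 0.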

That is why I always prefer punning with memcpy like this:

#include <string.h>

float get_float(int i) {
    float f;
    memcpy(&f, &i, sizeof(f));
    return f;
}

With gcc -Wall -O2 -mtune=ivybridge and clang (same options), both versions of get_float (the union one and the memcpy one) generate the same code:

get_float:
    .cfi_startproc
    movd    %edi, %xmm0
    ret
    .cfi_endproc

So if you have the option, avoid unions for type punning. memcpy is safer, makes the intention clearer, and shouldn’t have any performance drawbacks.
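If you pun in both directions, the memcpy approach could be wrapped in a pair of small helpers along these lines. This is only a sketch: float_to_bits and bits_to_float are names invented here, and using uint32_t instead of int is my own choice so that the "large enough" assumption can be checked at compile time.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

/* Fail the build instead of silently copying the wrong number of bytes. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "float must be 32 bits wide");

/* Hypothetical helper: return the object representation of f as an integer. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: build a float from a 32-bit pattern. */
float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Assuming IEEE 754 single precision, float_to_bits(1.0f) yields 0x3f800000, the same constant gcc emitted for the store of 1 in return0.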

Generated using nanoc and bootstrap - Last content change: 2015-01-25 13:47