[C-prog-lang-l] multicharacter initializer

Vladimír Kotal vlada at kotalovi.cz
Tue Feb 22 00:12:21 CET 2022


Hi all,

someone asked me about multicharacter initializers after the lecture today. Not being familiar with them, I had to look them up. In short, they are described in C99 section 6.4.4.4 (Character constants):

 10. An integer character constant has type *int*. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., *'ab'*), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined.


Implementation-defined means that each compiler chooses how to convert the character sequence to the integer value and has to document that choice, so the result can differ between compilers. The feature looks handy in a switch statement, where 4-byte values can be written as readable sequences of ASCII letters.

The problem is the 'implementation-defined' part. For example, when compiling this program:

#include <stdio.h>

int
main(void)
{
	char c = 'foobar';
	printf("%c\n", c);
}

with clang, the following warnings are emitted:

$ cc multichar.c
multichar.c:6:11: warning: multi-character character constant [-Wmultichar]
        char c = 'foobar';
                 ^
multichar.c:6:11: warning: character constant too long for its type
multichar.c:6:11: warning: implicit conversion from 'int' to 'char' changes value from 1868718450 to 114 [-Wconstant-conversion]
        char c = 'foobar';
             ~   ^~~~~~~~
3 warnings generated.

When the program is run, it will print "r\n", because only the lowest byte of the value (114, the letter 'r') survives the conversion to char.

If the multicharacter literal is shortened to just 'foo', the warnings will be reduced to:

multichar.c:6:11: warning: multi-character character constant [-Wmultichar]
        char c = 'foo';
                 ^
multichar.c:6:11: warning: implicit conversion from 'int' to 'char' changes value from 6713199 to 111 [-Wconstant-conversion]
        char c = 'foo';
             ~   ^~~~~
2 warnings generated.

and the output will be "o\n".

Try writing these big decimal numbers in hex and comparing the individual bytes with the ASCII table to see how this particular implementation works.

https://www.zipcon.net/~swhite/docs/computers/languages/c_multi-char_const.html describes the problem well. It concludes that this feature of the language should be avoided. Even Dennis Ritchie says so in the original C manual.

Best regards,


V. Kotal