What multiple characters inside single quotes mean, starting from 'avct'
I have recently been reading the AVChat source code, and its GlobalDefs.h file contains the following code:
// TCP pack types
const long PT_AudioMediaType = 10001;
const long PT_VideoMediaType = 10002;
const long PT_Payload = 10003;

// Messages
const long msg_FilterGraphError = 'avct' + 1;
const long msg_MediaTypeReceived = 'avct' + 2;
const long msg_TCPSocketAccepted = 'avct' + 3;
const long msg_UDPCommandReceived = 'avct' + 4;
const long msg_ModifyFilterGraph = 'avct' + 5;

// Let the main thread modify filter graph
#define WM_ModifyFilterGraph (WM_USER+123)

// UDP command defines
const long MAX_COMMAND_SIZE = 100;
const long cmd_ClientCalling = 'avct' + 100;
const long cmd_DeviceConfig = 'avct' + 101;
const long cmd_BuildFilterGraph = 'avct' + 102;
const long cmd_DisconnectRequest = 'avct' + 103;
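While debugging, VS told me that 'avct' = 1635148660, which puzzled me. I searched the whole solution and found no definition of it anywhere. So I went to StackOverflow; since I did not know which keywords to search for, I asked a question, and the answers cleared it up.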
C++ standard, §2.14.3/1 - Character literals
(...) An ordinary character literal that contains more than one c-char is a multicharacter literal. A multicharacter literal has type int and implementation-defined value.
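Because a multicharacter literal is treated as an int, VS performs the following conversion, putting the first character in the most significant byte:
'a'=0x61
'v'=0x76
'c'=0x63
't'=0x74
'avct'=0x61766374=1635148660
This explains where the value of 'avct' comes from. However, James Kanze added a correction: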
“Historically, the original C accepted multi-character character constants, and both C and C++ still do, on historical grounds. Unlike single character constants, the type is int, and the value is implementation defined (but will typically consist of some sort of combination of the characters involved).
Practically speaking, they should be avoided in new code, and cannot be used in portable code (because implementations do vary as to what they mean).”
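One way to follow that advice while keeping the same numeric IDs is to build the value with explicit shifts instead of relying on the compiler. The sketch below is only an illustration: `make_tag`, `kAvct` and `msg_FilterGraphError_portable` are hypothetical names that do not exist in AVChat, and the code assumes an ASCII-compatible character set and the "first character in the high byte" layout that VS showed in the debugger.

```cpp
#include <cstdint>

// Hypothetical helper (not part of AVChat): packs four characters with the
// first one in the most significant byte. Assumes an ASCII-compatible
// execution character set, so 'a' == 0x61 and so on.
constexpr std::uint32_t make_tag(char a, char b, char c, char d) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8) |
            static_cast<std::uint32_t>(static_cast<unsigned char>(d));
}

// Same numeric base that VS reported for 'avct', but computed explicitly,
// so it does not depend on how a compiler treats multicharacter literals.
constexpr std::uint32_t kAvct = make_tag('a', 'v', 'c', 't');
static_assert(kAvct == 0x61766374u, "expected the value seen in the debugger");

// Illustrative equivalent of: const long msg_FilterGraphError = 'avct' + 1;
constexpr long msg_FilterGraphError_portable = static_cast<long>(kAvct) + 1;
```

Because the shifts operate on values rather than on memory, the result does not change with byte order; only the character encoding assumption remains.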
EDIT:
For what it's worth: the most typical implementation would be more or less the equivalent of:
union { char c[sizeof(int)]; int i; };
placing the characters in order in c (and ignoring any which didn't fit, whether the first or the last depending on the implementation), and then use the value of i as the value. These results obviously depend on the encoding (but that's true of any character constant), but also on things like byte order and the size of an int. Thus, even assuming an ASCII based encoding, on systems I've used, the results could be 0x61766374, 0x74637661, 0x6374, 0x7463, 0x6176 or 0x7661. (And this doesn't consider "exotic" architectures with 9 bit bytes, or where the size of an int is 6.)
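The union model described in the answer can be approximated as follows. This is a rough sketch, not what any particular compiler does: `chars_as_stored` and `is_little_endian` are hypothetical helpers, `std::memcpy` stands in for the union (reading an inactive union member is not well defined in C++), a fixed 32-bit integer replaces `int` to keep the result predictable, and characters beyond the fourth are simply dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies up to four characters, in order, into the
// bytes of a 32-bit integer and returns that integer. Extra characters are
// ignored (this sketch drops the trailing ones); missing ones stay zero.
inline std::uint32_t chars_as_stored(const char* s, std::size_t n) {
    assert(s != nullptr);
    unsigned char bytes[sizeof(std::uint32_t)] = {0, 0, 0, 0};
    if (n > sizeof bytes) {
        n = sizeof bytes;
    }
    std::memcpy(bytes, s, n);
    std::uint32_t value = 0;
    std::memcpy(&value, bytes, sizeof value);  // reinterpret the bytes as an integer
    return value;
}

// Hypothetical helper: true when the lowest-addressed byte of an integer
// holds its least significant bits.
inline bool is_little_endian() {
    const std::uint32_t one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);
    return first == 1;
}
```

Assuming ASCII, `chars_as_stored("avct", 4)` yields 0x74637661 on a little-endian machine and 0x61766374 on a big-endian one, which are the first two values in the list above. Note that the little-endian result differs from the 0x61766374 that VS reported on the same kind of hardware, so this model is only one possible implementation; that is exactly why the value is implementation-defined.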