Problem description
int main()
{
    char c = 0xff;
    bool b = 0xff == c;
    // Under most C/C++ compilers' default options, b is FALSE!!!
}
Neither the C nor the C++ standard specifies whether char is signed or unsigned; it is implementation-defined.
Why doesn't the C/C++ standard explicitly define char as signed or unsigned, to avoid dangerous misuses like the code above?
Recommended answer
Historical reasons, mostly.
Expressions of type char are promoted to int in most contexts (because a lot of CPUs don't have 8-bit arithmetic operations). On some systems, sign extension is the most efficient way to do this, which argues for making plain char signed.
On the other hand, the EBCDIC character set has basic characters with the high-order bit set (i.e., characters with values of 128 or greater); on EBCDIC platforms, char pretty much has to be unsigned.
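The ANSI C Rationale (for the 1989 standard) doesn't have a lot to say on the subject; section 3.1.2.5 says:
"Three types of char are specified: signed, plain, and unsigned. A plain char may be represented as either signed or unsigned, depending upon the implementation, as in prior practice. The type signed char was introduced to make available a one-byte signed integer type on those systems which implement plain char as unsigned. For reasons of symmetry, the keyword signed is allowed as part of the type name of other integral types."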
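Going back even further, an early version of the C Reference Manual from 1975 says:
"A char object may be used anywhere an int may be. In all cases the char is converted to an int by propagating its sign through the upper 8 bits of the resultant integer. This is consistent with the two's complement representation used for both characters and integers. (However, the sign-propagation feature disappears in other implementations.)"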
This description is more implementation-specific than what we see in later documents, but it does acknowledge that char may be either signed or unsigned. On the "other implementations" on which "the sign-propagation disappears", the promotion of a char object to int would have zero-extended the 8-bit representation, essentially treating it as an 8-bit unsigned quantity. (The language didn't yet have the signed or unsigned keyword.)
C's immediate predecessor was a language called B. B was a typeless language, so the question of char being signed or unsigned did not apply. For more information about the early history of C, see the late Dennis Ritchie's home page.
As for what's happening in your code (applying modern C rules):
char c = 0xff;
bool b = 0xff == c;
If plain char is unsigned, then the initialization of c sets it to (char)0xff, which compares equal to 0xff in the second line. But if plain char is signed, then 0xff (an expression of type int) is converted to char, but since 0xff exceeds CHAR_MAX (assuming CHAR_BIT==8), the result is implementation-defined. In most implementations, the result is -1. In the comparison 0xff == c, both operands are converted to int, making it equivalent to 0xff == -1, or 255 == -1, which is of course false.
Another important thing to note is that unsigned char, signed char, and (plain) char are three distinct types. char has the same representation as either unsigned char or signed char; it's implementation-defined which one it is. (On the other hand, signed int and int are two names for the same type; unsigned int is a distinct type. (Except that, just to add to the frivolity, it's implementation-defined whether a bit field declared as plain int is signed or unsigned.))
Yes, it's all a bit of a mess, and I'm sure it would have been defined differently if C were being designed from scratch today. But each revision of the C language has had to avoid breaking (too much) existing code, and to a lesser extent existing implementations.