Problem description
Is the Java char type guaranteed to be stored in any particular encoding?
I phrased this question incorrectly. What I meant to ask is whether char literals are guaranteed to use any particular encoding.
Recommended answer
"Stored" where? All Strings in Java are represented in UTF-16. When a String is written to a file, sent across a network, or otherwise converted to bytes, it is encoded using whatever character encoding you specify.
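To make the distinction between the in-memory representation and the encoding chosen at output time concrete, here is a minimal sketch. The class name EncodingDemo and the helper encodedLength are made up for illustration; only String.getBytes(Charset) and StandardCharsets come from the standard library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
class EncodingDemo {
    // Number of bytes the string occupies once encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // 'h' followed by U+00E9; the \u escape avoids depending on the source file's encoding.
        String text = "h\u00e9";

        System.out.println(text.length());                                    // 2 UTF-16 code units in memory
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));      // 3 bytes
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1)); // 2 bytes
        System.out.println(encodedLength(text, StandardCharsets.UTF_16BE));   // 4 bytes
    }
}
```

The same two-char String produces a different byte sequence for each charset, which is why the encoding only becomes a question at the point where the text leaves the JVM as bytes.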
Specifically for the char type, see the Character docs: "The char data type ... are based on the original Unicode specification, which defined characters as fixed-width 16-bit entities." Casting a char to int therefore always gives you a UTF-16 code unit, provided the char actually holds a character from that charset. If you just poke some arbitrary value into the char, it won't necessarily be a valid UTF-16 character, and the same is true if you read the character in using the wrong encoding. The docs go on to explain that supplementary characters (code points above U+FFFF) do not fit in a single char: they are represented by an int code point, or by a pair of surrogate chars within a String or char array. If you're operating at this level, it is worth getting familiar with those semantics.
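The char-to-int cast and the handling of supplementary characters can be sketched as follows. The class name CodePointDemo and the helper codePoints are hypothetical; the calls used (Character.toChars, Character.isHighSurrogate, String.codePointCount, String.codePointAt) are standard java.lang methods. The example code point U+1D11E (MUSICAL SYMBOL G CLEF) is just one convenient supplementary character.

```java
// Hypothetical demo class, not part of the original answer.
class CodePointDemo {
    // Number of Unicode code points in the string; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        char a = 'A';
        System.out.println((int) a); // 65, i.e. U+0041

        int clef = 0x1D11E; // supplementary code point, too large for one char
        char[] units = Character.toChars(clef); // encoded as a surrogate pair

        System.out.println(units.length);                        // 2
        System.out.println(Character.isHighSurrogate(units[0])); // true
        System.out.println(Character.isLowSurrogate(units[1]));  // true

        String s = new String(units);
        System.out.println(s.length());               // 2 chars
        System.out.println(codePoints(s));            // 1 code point
        System.out.println(s.codePointAt(0) == clef); // true
    }
}
```

Casting either half of the pair to int yields a surrogate code unit, not a character in its own right, so code that needs whole characters should work with the int code point methods instead of individual chars.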