This article describes how to convert a hexadecimal value to a Unicode character. It should be a useful reference for anyone facing the same problem; follow along below!

Problem description

I'm trying to convert the hex value 1f600, which is the smiley emoji, to its character representation:

String.fromCharCode(parseInt("1f600", 16));

but this just generates a square symbol.

Solution

Most emojis require two code units, including that one. fromCharCode works in code units (JavaScript's "characters" are UTF-16 code units, except that invalid surrogate pairs are tolerated), not code points (actual Unicode characters).

In modern environments, you'd use String.fromCodePoint or just a Unicode code point escape sequence (\u{XXXXX} rather than \uXXXX, which is for code units).
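To make the failure mode concrete, here is a short sketch (mine, not from the original answer) showing why fromCharCode produces a box: it truncates each argument to a 16-bit code unit, so 0x1F600 becomes 0xF600, a Private Use Area character that most fonts render as a placeholder square, while fromCodePoint emits the full surrogate pair.

```javascript
// fromCharCode truncates each argument to 16 bits (ToUint16 in the spec):
// 0x1F600 & 0xFFFF === 0xF600, a Private Use Area character.
const truncated = String.fromCharCode(0x1f600);
console.log(truncated.codePointAt(0).toString(16)); // "f600", not "1f600"

// fromCodePoint accepts the full code point and produces the surrogate pair:
const emoji = String.fromCodePoint(0x1f600);
console.log(emoji.length);                     // 2 (two UTF-16 code units)
console.log(emoji.charCodeAt(0).toString(16)); // "d83d" (high surrogate)
console.log(emoji.charCodeAt(1).toString(16)); // "de00" (low surrogate)
```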
There's also no need for parseInt:

console.log(String.fromCodePoint(0x1f600));
console.log("\u{1f600}");

In older environments, you have to supply the surrogate pair, which in that case is 0xD83D 0xDE00:

console.log("\uD83D\uDE00");

...or use a polyfill for fromCodePoint.

If for some reason you don't want to use a polyfill in older environments, and your starting point is a code point, you have to figure out the code units. You can see how to do that in MDN's polyfill linked above, or here's how the Unicode UTF-16 FAQ says to do it:

Using the following type definitions

typedef unsigned int16 UTF16;
typedef unsigned int32 UTF32;

the first snippet calculates the high (or leading) surrogate from a character code C.

const UTF16 HI_SURROGATE_START = 0xD800;
UTF16 X = (UTF16) C;
UTF32 U = (C >> 16) & ((1 << 5) - 1);
UTF16 W = (UTF16) U - 1;
UTF16 HiSurrogate = HI_SURROGATE_START | (W << 6) | (X >> 10);

where X, U and W correspond to the labels used in Table 3-5, UTF-16 Bit Distribution. The next snippet does the same for the low surrogate.

const UTF16 LO_SURROGATE_START = 0xDC00;
UTF16 X = (UTF16) C;
UTF16 LoSurrogate = (UTF16) (LO_SURROGATE_START | (X & ((1 << 10) - 1)));

That's all for this article on converting a hexadecimal value to a Unicode character. We hope the answer above is helpful; thanks for your support!
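The C snippets above can be sketched in JavaScript for pre-fromCodePoint environments. This is my translation of the FAQ's computation, and the helper name codePointToSurrogatePair is mine, not part of any standard API:

```javascript
// Compute the UTF-16 surrogate pair for a supplementary code point c,
// following the bit layout from the Unicode UTF-16 FAQ.
function codePointToSurrogatePair(c) {
  const x = c & 0xffff;                 // low 16 bits (the (UTF16) cast)
  const u = (c >> 16) & ((1 << 5) - 1); // the 5 plane bits
  const w = u - 1;
  const hi = 0xd800 | (w << 6) | (x >> 10);        // high (leading) surrogate
  const lo = 0xdc00 | (x & ((1 << 10) - 1));       // low (trailing) surrogate
  return [hi, lo];
}

const [hi, lo] = codePointToSurrogatePair(0x1f600);
console.log(hi.toString(16), lo.toString(16)); // d83d de00
console.log(String.fromCharCode(hi, lo));      // 😀
```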