This article describes how to convert a hexadecimal value to a Unicode character. It should be a useful reference for anyone facing the same problem; follow along below!

Problem description

I'm trying to convert the hex value 1f600, which is the smiley emoji, to its character representation:

String.fromCharCode(parseInt("1f600", 16));

but this just generates a square symbol.

Solution

Most emojis require two code units, including that one. fromCharCode works in code units (JavaScript's "characters" are UTF-16 code units, except that invalid surrogate pairs are tolerated), not code points (actual Unicode characters).

In modern environments, you'd use String.fromCodePoint or just a Unicode code point escape sequence (\u{XXXXX} rather than \uXXXX, which is for code units).
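To make the failure mode concrete, here is a short sketch (mine, not from the original answer) showing why fromCharCode produces a box: it truncates each argument to a 16-bit code unit, so 0x1F600 becomes 0xF600, a Private Use Area character that most fonts render as a placeholder square, while fromCodePoint emits the full surrogate pair.

```javascript
// fromCharCode truncates each argument to 16 bits (ToUint16 in the spec):
// 0x1F600 & 0xFFFF === 0xF600, a Private Use Area character.
const truncated = String.fromCharCode(0x1f600);
console.log(truncated.codePointAt(0).toString(16)); // "f600", not "1f600"

// fromCodePoint accepts the full code point and produces the surrogate pair:
const emoji = String.fromCodePoint(0x1f600);
console.log(emoji.length);                     // 2 (two UTF-16 code units)
console.log(emoji.charCodeAt(0).toString(16)); // "d83d" (high surrogate)
console.log(emoji.charCodeAt(1).toString(16)); // "de00" (low surrogate)
```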
There's also no need for parseInt:

console.log(String.fromCodePoint(0x1f600));
console.log("\u{1f600}");

In older environments, you have to supply the surrogate pair, which in that case is 0xD83D 0xDE00:

console.log("\uD83D\uDE00");

...or use a polyfill for fromCodePoint.

If for some reason you don't want to use a polyfill in older environments, and your starting point is a code point, you have to figure out the code units. You can see how to do that in MDN's polyfill linked above, or here's how the Unicode UTF-16 FAQ says to do it:

Using the following type definitions

typedef unsigned int16 UTF16;
typedef unsigned int32 UTF32;

the first snippet calculates the high (or leading) surrogate from a character code C.

const UTF16 HI_SURROGATE_START = 0xD800;
UTF16 X = (UTF16) C;
UTF32 U = (C >> 16) & ((1 << 5) - 1);
UTF16 W = (UTF16) U - 1;
UTF16 HiSurrogate = HI_SURROGATE_START | (W << 6) | (X >> 10);

where X, U and W correspond to the labels used in Table 3-5, UTF-16 Bit Distribution. The next snippet does the same for the low surrogate.

const UTF16 LO_SURROGATE_START = 0xDC00;
UTF16 X = (UTF16) C;
UTF16 LoSurrogate = (UTF16) (LO_SURROGATE_START | (X & ((1 << 10) - 1)));

That's all for this article on converting a hexadecimal value to a Unicode character. We hope the answer above is helpful; thanks for your support!
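The C snippets above can be sketched in JavaScript for pre-fromCodePoint environments. This is my translation of the FAQ's computation, and the helper name codePointToSurrogatePair is mine, not part of any standard API:

```javascript
// Compute the UTF-16 surrogate pair for a supplementary code point c,
// following the bit layout from the Unicode UTF-16 FAQ.
function codePointToSurrogatePair(c) {
  const x = c & 0xffff;                 // low 16 bits (the (UTF16) cast)
  const u = (c >> 16) & ((1 << 5) - 1); // the 5 plane bits
  const w = u - 1;
  const hi = 0xd800 | (w << 6) | (x >> 10);        // high (leading) surrogate
  const lo = 0xdc00 | (x & ((1 << 10) - 1));       // low (trailing) surrogate
  return [hi, lo];
}

const [hi, lo] = codePointToSurrogatePair(0x1f600);
console.log(hi.toString(16), lo.toString(16)); // d83d de00
console.log(String.fromCharCode(hi, lo));      // 😀
```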