Problem Description
So I have a standard C string:
char* name = "Jakub";
And I want to convert it to UTF-16. I figured out that UTF-16 will be twice as long - one character takes two chars.
So I create another string:
char name_utf_16[10]; //"Jakub" is 5 characters
Now, I believe that with ASCII characters I will only use the lower bytes, so for all of them it will be like 74 00 for J, and so on. With that belief, I can write code like this:
void charToUtf16(char* input, char* output, int length) {
    /* Todo: how to check if output is long enough? */
    for (int i = 0; i < length; i += 2) // Step over 2 bytes
    {
        // Let's use little-endian - smallest bytes first
        output[i] = input[i];
        output[i+1] = 0; // We will never have any data for this field
    }
}
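For reference, here is the byte layout the code is aiming for - an illustration added for clarity, not part of the original question. It assumes plain ASCII input and little-endian UTF-16, with the values written in decimal to match the "74 00" above:

/* Expected little-endian UTF-16 bytes for "Jakub", assuming plain ASCII input.
 * 'J'=74, 'a'=97, 'k'=107, 'u'=117, 'b'=98, each followed by a 0 high byte. */
char expected_utf16le[10] = { 74, 0, 97, 0, 107, 0, 117, 0, 98, 0 };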
But with this process, I ended up with "Jkb". I know of no way to test this properly - I've just sent the string to a Minecraft Bukkit server. And this is what it said upon disconnecting:
Note: I'm aware that Minecraft uses big-endian. The code above is just an example; in fact, I have my conversion implemented in a class.
Recommended Answer
output[i] = input[i];
This assigns only every other byte of the input, because you increment i by 2. So it's no wonder that you end up with "Jkb". You probably wanted to write:
output[i] = input[i / 2];
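For completeness, here is a minimal sketch of the whole corrected function. It assumes plain ASCII input; the extra outputSize parameter is an assumption added to address the TODO about checking whether the output buffer is long enough, and is not part of the original code. A comment shows how the big-endian layout mentioned in the question would differ.

#include <stddef.h>
#include <string.h>

/* Sketch of the corrected conversion, assuming plain ASCII input.
 * outputSize is an assumed extra parameter used to check the buffer. */
int charToUtf16(const char* input, char* output, size_t outputSize) {
    size_t needed = 2 * strlen(input);   /* each ASCII char becomes 2 bytes */
    if (outputSize < needed)
        return -1;                       /* output is not long enough */

    for (size_t i = 0; i < needed; i += 2) {
        /* Little-endian: low byte first, high byte is always 0 for ASCII */
        output[i] = input[i / 2];
        output[i + 1] = 0;
        /* For the big-endian order Minecraft expects, swap the assignments:
         * output[i] = 0; output[i + 1] = input[i / 2]; */
    }
    return 0;
}

With char name_utf_16[10] from the question, calling charToUtf16(name, name_utf_16, sizeof name_utf_16) fits exactly, since "Jakub" needs 2 * 5 = 10 bytes.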