Upgrading from Windows-1252 to UCS-2

Problem description

I'm trying to find out what the steps look like to upgrade a program (which is used on Windows and Unix) from Windows-1252 (the Windows "ANSI" code page) to UCS-2. Currently the program reads and writes files encoded in Windows-1252, but it should be able to read files encoded in UCS-2, too. As I don't want to deal with two character representations in the program, I plan to use UCS-2 internally. I should be able to simply use std::wstring then?

When Windows-1252 encoded files are read, I have to convert the data to UCS-2, though. My understanding is that it now depends on the implementation of the C++ standard library whether conversions are supported, and which ones?
I might need to use a third-party library like the Dinkum Conversions Library, which converts data on the fly, or something like UTF-8 CPP, where I can call functions explicitly to convert between character sets?

After converting everything to UCS-2 and storing it in std::wstring, I suppose I can use the well-known string functions to search, replace, and compare strings (including < and >), etc. Is my understanding correct that I'm safe to use the member functions of std::wstring as long as the character set used is not multibyte?

Last but not least, the program needs to save files again. It might make sense to use UTF-8 here for backward compatibility (as other programs might be able to read the files more easily if they support only Windows-1252). Thus I would need another converter to make sure that the std::wstring is encoded in UTF-8 correctly, which means I need a third-party tool again?

Anything I might have missed?

Boris

Solution

Yes.

"When Windows-1252 encoded files are read I have to [...]"

AFAIK a third-party library (or writing your own code) is the only way to go. For Windows-1252 to UCS-2, why not write your own? It can't be that hard.

That's correct for UCS-2.

Some confusion here, I think: UTF-8 and Windows-1252 are not the same. The first is a character encoding, the second is a character set. But yes, converting UCS-2 to UTF-8 is another step, for which you could either get a third-party library or write your own code.

john

I want to take that back: Windows-1252 is an encoding too, but it's still the case that it's not the same as UTF-8.

john

Thanks, John! I should have clarified it better: the idea is that files containing only characters from the ASCII-compatible subset look like normal ASCII files when encoded in UTF-8 (so other programs can simply assume they are ASCII files).

Boris
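
As a rough illustration of the "write your own" suggestion in the answer above, the two conversions discussed in this thread could be sketched as follows. This is only a minimal sketch, not code from the thread: the function names `cp1252_to_ucs2` and `ucs2_to_utf8` and the table name `kCp1252High` are hypothetical. It assumes that the five bytes Windows-1252 leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D) are passed through as the C1 control codes of the same value, and that anything outside valid UCS-2 (surrogates, or values above 0xFFFF where wchar_t is 32 bits wide, as on most Unix systems) is replaced with U+FFFD.

```cpp
#include <cstdint>
#include <string>

// Code points for Windows-1252 bytes 0x80..0x9F. All other bytes map to
// the code point with the same numeric value (as in ISO 8859-1).
// Assumption: the unassigned bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D are
// passed through as C1 control codes.
static const std::uint16_t kCp1252High[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

// Hypothetical helper: Windows-1252 bytes -> one wchar_t per character.
std::wstring cp1252_to_ucs2(const std::string& in) {
    std::wstring out;
    out.reserve(in.size());
    for (unsigned char b : in) {
        if (b >= 0x80 && b <= 0x9F) {
            out.push_back(static_cast<wchar_t>(kCp1252High[b - 0x80]));
        } else {
            out.push_back(static_cast<wchar_t>(b));
        }
    }
    return out;
}

// Hypothetical helper: UCS-2 code units -> UTF-8 bytes (1 to 3 per unit).
std::string ucs2_to_utf8(const std::wstring& in) {
    std::string out;
    for (wchar_t wc : in) {
        std::uint32_t c = static_cast<std::uint32_t>(wc);
        // Assumption: values that are not valid UCS-2 become U+FFFD.
        if ((c >= 0xD800 && c <= 0xDFFF) || c > 0xFFFF) {
            c = 0xFFFD;
        }
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else if (c < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xE0 | (c >> 12)));
            out.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

With these helpers, `ucs2_to_utf8(cp1252_to_ucs2(data))` would turn a buffer read from a Windows-1252 file into UTF-8. Pure ASCII input comes out byte-for-byte unchanged, which is the compatibility property Boris describes; any byte at or above 0x80 becomes a two- or three-byte sequence that a Windows-1252-only program would misread.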