Reading Unicode characters from a file in C++

Question

I want to read a Unicode (UTF-8) file character by character, but I don't know how to read a file one character at a time.

Can anyone tell me how to do this?

Recommended answer

UTF-8 is ASCII compatible, so you can read a UTF-8 file the same way you would read an ASCII file. Each char you get back is one byte of the encoding, so a non-ASCII character spans several chars. The C++ way to read a whole file into a string is:

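#include <fstream>
#include <iostream>
#include <iterator>  // std::istreambuf_iterator
#include <string>

// Open the file and copy every byte into a string, unmodified.
std::ifstream fs("my_file.txt", std::ios::binary);
std::string content((std::istreambuf_iterator<char>(fs)), std::istreambuf_iterator<char>());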

The resulting string holds the raw UTF-8 bytes, one per char. You can loop through it like so:

for (std::string::iterator i = content.begin(); i != content.end(); ++i) {
    char nextChar = *i;
    // do stuff here.
}
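
The loop above visits bytes, not characters. If the goal is one Unicode character per step, a rough sketch is to group the bytes by looking for continuation bytes, which in UTF-8 always have the bit pattern 10xxxxxx. The helper name `split_utf8` is made up for illustration, and the code assumes the input is already valid UTF-8 (it does no validation). Note that it splits by code point, so combining sequences are still returned as separate pieces.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original answer): splits UTF-8 text
// into one std::string per code point. Assumes valid UTF-8 input.
std::vector<std::string> split_utf8(const std::string& text) {
    std::vector<std::string> chars;
    for (std::size_t i = 0; i < text.size(); ++i) {
        unsigned char byte = static_cast<unsigned char>(text[i]);
        // A continuation byte looks like 10xxxxxx; any other byte starts a
        // new character. The empty() check guards against input that begins
        // with a stray continuation byte.
        if ((byte & 0xC0) != 0x80 || chars.empty()) {
            chars.emplace_back();
        }
        chars.back().push_back(text[i]);
    }
    return chars;
}
```

Calling `split_utf8(content)` on the string read earlier would give a vector whose elements can be printed or compared as whole characters.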

Alternatively, you could open the file in binary mode and step through it one byte at a time:

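std::ifstream fs("my_file.txt", std::ifstream::binary);
if (fs.is_open()) {
    char nextChar;
    // get() returns every byte, including whitespace (operator>> would skip it),
    // and the loop ends cleanly as soon as a read fails.
    while (fs.get(nextChar)) {
        // do stuff here.
    }
}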

If you want to do more complicated things, I suggest you take a look at Qt. I've found it rather useful for this sort of thing, and at least less painful than ICU for largely practical tasks.

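#include <QFile>
#include <QTextStream>

QFile file("my_file.txt");
if (file.open(QIODevice::ReadOnly | QIODevice::Text)) {
    QTextStream in(&file);
    in.setCodec("UTF-8");  // Qt 5 API; in Qt 6 use in.setEncoding(QStringConverter::Utf8)
    QString contents = in.readAll();  // decoded text, ready to iterate as QChar

    return;
}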

08-15 09:36