Problem description
How do I loop through the letters of a string when it contains non-ASCII characters?
This works on Windows:
for (int i = 0; i < text.length(); i++)
{
    std::cout << text[i];
}
But on Linux, if I do this:
std::string text = "á";
std::cout << text.length() << std::endl;
it tells me the string "á" has a length of 2, while on Windows it is only 1.
With ASCII letters it works fine!
In your Windows system's code page, á is a single-byte character, i.e. every char in the string is indeed a character. So you can just loop over them and print them.
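On Linux, á is represented as the multibyte (2 bytes, to be exact) UTF-8 character C3 A1. This means that in your string, the á actually consists of two chars, and printing those (or handling them in any way) separately yields nonsense. This never happens with ASCII characters, because the UTF-8 representation of every ASCII character fits in a single byte.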
Unfortunately, UTF-8 is not really supported by the C++ standard facilities. As long as you only handle the whole string, and neither access individual chars from it nor assume that the length of the string equals the number of actual characters in it, std::string will most likely do fine.
If you need more UTF-8 support, look for a good library that implements what you need.
You might also want to read this for a more detailed discussion of different character sets on different systems, and for advice regarding string vs. wstring.
Also have a look at this for information on how to handle different character encodings portably.