Problem description
I use iconv() to convert CSV data from UTF-8 to Windows-1252.
$converted = iconv("UTF-8", "Windows-1252", $csvData);
In some cases, iconv() fails silently and returns false.
I also tried //TRANSLIT, but iconv() returns false in that case as well.
When I append //IGNORE to the target charset, the conversion succeeds, but that means one or more characters were lost.
I can stick with //IGNORE, but I would like to find out which characters are causing the problem.
What should I do?
Recommended answer
It was a bad idea to treat the string as a char array (see the question comments), because a PHP string is a sequence of bytes, so indexing it splits multibyte UTF-8 characters.
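To illustrate the difference between bytes and characters, here is a minimal sketch. It assumes the mbstring extension is loaded and the source file is saved as UTF-8; the sample string and variable names are made up for illustration.

```php
<?php
// Illustrative only: "☺" (U+263A) takes three bytes in UTF-8.
$text = "bad ☺ char";

$byteLength = strlen($text);             // counts bytes
$charLength = mb_strlen($text, 'UTF-8'); // counts characters

// Byte indexing yields a single byte of the three-byte sequence.
$byteAt4 = $text[4];
// mb_substr() yields the whole character at character offset 4.
$charAt4 = mb_substr($text, 4, 1, 'UTF-8');

var_dump($byteLength, $charLength, $charAt4 === "☺");
// The lone byte is not valid UTF-8 on its own.
var_dump(mb_check_encoding($byteAt4, 'UTF-8'));
```

Passing a fragment such as $byteAt4 to iconv() would fail for the wrong reason (invalid input rather than an unmappable character), which is why the per-character loop needs the mb_* functions.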
So we can use mb_substr for UTF-8 and work with characters rather than bytes.
// Report everything except notices; iconv() raises E_NOTICE on an unconvertible character.
error_reporting(E_ALL & ~E_NOTICE);
$yourString = "test bad ☺ string";
$convertString = '';
$badChars = [];
$converted = iconv("UTF-8", "Windows-1252", $yourString);
if ($converted === false) {
    // Walk the string character by character (not byte by byte).
    for ($i = 0, $stringLength = mb_strlen($yourString, 'UTF-8'); $i < $stringLength; $i++) {
        $char = mb_substr($yourString, $i, 1, 'UTF-8');
        $convertChar = iconv("UTF-8", "Windows-1252", $char);
        if ($convertChar === false) {
            // Remember the offending character, keyed by its character offset.
            $badChars[$i] = $char;
        } else {
            $convertString .= $convertChar;
        }
    }
} else {
    $convertString = $converted;
}
var_dump($badChars, $convertString);
Result: array(1) { [9] => string(3) "☺" } string(16) "test bad  string"
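The same idea can be wrapped in a reusable function that also prints the Unicode code point of each offending character, which makes invisible characters easier to identify. This is only a sketch: findUnconvertibleChars is a hypothetical helper name, it assumes PHP 7.4+ with the mbstring and iconv extensions, and the outcome depends on the iconv implementation PHP was built against (some implementations substitute a placeholder instead of returning false).

```php
<?php
/**
 * Hypothetical helper (not from the original answer): returns the characters
 * of a UTF-8 string that iconv() cannot convert to $targetCharset, keyed by
 * character offset.
 */
function findUnconvertibleChars(string $utf8, string $targetCharset = 'Windows-1252'): array
{
    $bad = [];
    // mb_str_split() (PHP 7.4+) splits by character, not by byte.
    foreach (mb_str_split($utf8, 1, 'UTF-8') as $offset => $char) {
        // @ suppresses the E_NOTICE that iconv() raises on failure.
        if (@iconv('UTF-8', $targetCharset, $char) === false) {
            $bad[$offset] = $char;
        }
    }
    return $bad;
}

foreach (findUnconvertibleChars("test bad ☺ string") as $offset => $char) {
    // mb_ord() (PHP 7.2+) returns the Unicode code point of the character.
    printf("offset %d: U+%04X\n", $offset, mb_ord($char, 'UTF-8'));
}
```

For CSV data, calling the helper once per line or per field keeps the reported offsets small enough to locate in the source file.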
P.S. Next time I will give a more detailed answer with code. My mistake.