Problem description
I have the following test code:
setlocale(LC_ALL, 'en_US.UTF8');
function t($text)
{
    echo "$text\n";
    echo "encoding: ", mb_detect_encoding($text), "\n";
    // transliterate
    $text = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $text);
    echo "iconv: ", $text, "\n";
}
// Latvian alphabet
t('AĀBCČDEĒFGĢHIĪJKĶLĻMNŅOPRSŠTUŪVZŽ aābcčdeēfgģhiījkķlļmnņoprsštuūvzž');
// Greek alphabet
t('ΑαΒβΓγΔδΕεΖζΗηΘθΙιΚκΜμΝνΞξΟοΠπΡρΣσςΤτΥυΦφΧχΨψΩω');
// Cyrillic alphabet + some rarer versions
t('АБВГДЕЖЅЗИІКЛМНОПҀРСТѸФХѠЦЧШЩЪꙐЬѢꙖѤЮѦѪѨѬѮѰѲѴ абвгдеёжзийклмнопрстуфхцчшщъыьэюя');
and its output:
AĀBCČDEĒFGĢHIĪJKĶLĻMNŅOPRSŠTUŪVZŽ aābcčdeēfgģhiījkķlļmnņoprsštuūvzž
encoding: UTF-8
iconv: AABCCDEEFGGHIIJKKLLMNNOPRSSTUUVZZ aabccdeefgghiijkkllmnnoprsstuuvzz
ΑαΒβΓγΔδΕεΖζΗηΘθΙιΚκΜμΝνΞξΟοΠπΡρΣσςΤτΥυΦφΧχΨψΩω
encoding: UTF-8
iconv:
АБВГДЕЖЅЗИІКЛМНОПҀРСТѸФХѠЦЧШЩЪꙐЬѢꙖѤЮѦѪѨѬѮѰѲѴ абвгдеёжзийклмнопрстуфхцчшщъыьэюя
encoding: UTF-8
iconv:
It essentially ignores all Greek and Cyrillic characters. Why?
I have tested in two environments, where php -i | egrep "iconv (implementation|library)" outputs either:
iconv implementation => libiconv
iconv library version => 1.11
or:
iconv implementation => libiconv
iconv library version => 1.13
I have also tried setting the iconv internal encoding to UTF-8 and adding/removing the setlocale() call, but to no avail. iconv seems to recognise only Latin and Latin-derived characters.
UPDATE: It must be a problem with iconv itself, as the terminal command echo 'ΑαΒβΓγΔδ' | iconv -f utf-8 -t ASCII//TRANSLIT produces the error iconv: (stdin):1:0: cannot convert, while echo 'āēī' | iconv -f utf-8 -t ASCII//TRANSLIT works and outputs aei, as expected.
iconv --version outputs iconv (GNU libiconv 1.14) (besides the copyright information).
Recommended answer
Use ASCII//IGNORE//TRANSLIT instead.
iconv() stopped at the first illegal character and cut the string off right there, which is the default behaviour of iconv(); in other words, it did not respect the //IGNORE switch placed after //TRANSLIT.
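To check how a particular installation treats the two suffix orders, a small comparison along the following lines may help. This is only a sketch: tryTargets() is a hypothetical helper name, not something from the original post, and what comes back for Greek or Cyrillic input depends on the iconv implementation and version PHP was built against.

```php
<?php
// Hypothetical helper: convert the same UTF-8 text with both suffix orders
// so the results can be compared side by side.
function tryTargets(string $text): array
{
    $results = [];
    foreach (['ASCII//TRANSLIT//IGNORE', 'ASCII//IGNORE//TRANSLIT'] as $target) {
        // @ suppresses the notice iconv() raises on an illegal character;
        // in that case iconv() returns false rather than a string.
        $results[$target] = @iconv('UTF-8', $target, $text);
    }
    return $results;
}

// Samples taken from the question: Latvian, Greek, Cyrillic.
foreach (['āēī', 'ΑαΒβΓγΔδ', 'абвгд'] as $sample) {
    echo $sample, "\n";
    var_dump(tryTargets($sample));
}
```

A false value means the conversion failed outright, while an empty or shortened string means characters were dropped or the input was cut off.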