Problem description
I am trying to parse a string, splitting it on anything that is not a letter or a number
$parse_query_arguments = preg_split("/[^a-z0-9]+/i", 'København');
and construct a MySQL query from the result. Even if I skip the preg_split and enter the string directly, it gets broken into two separate strings, 'K' and 'benhavn'.
How can I deal with these issues?
Recommended answer
If you're using literal character ranges like a-z, they won't match accented characters. You might want to use the various character classes available to do more generic matching:
/[[:alpha:][:digit:]]/
The [:alpha:] set is much broader in scope than a-z. Remember that character matching is done by character code, and a-z literally means the characters whose codes lie between a and z. Characters like ø lie outside this range even though they would fall inside it alphabetically.
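To make the character-code point concrete, here is a minimal sketch. It assumes the script is saved as UTF-8 and that the mbstring extension is available (mb_ord needs PHP 7.2 or later); the variable names are only illustrative.

```php
<?php
// Assumption: UTF-8 source file and the mbstring extension are available.
$codes = [];
foreach (['a', 'z', 'ø'] as $char) {
    // mb_ord returns the Unicode code point of the character.
    $codes[$char] = mb_ord($char, 'UTF-8');
    printf("%s => %d\n", $char, $codes[$char]);
}
// a => 97, z => 122, ø => 248: ø lies well above the a-z range.

// Without the u modifier, PCRE works byte by byte. In UTF-8, ø is two bytes,
// neither of which is in a-z or 0-9, so both are treated as a separator.
$parts = preg_split('/[^a-z0-9]+/i', 'København');
print_r($parts); // ['K', 'benhavn']
```

This reproduces the 'K' and 'benhavn' split from the question.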
Computers work in ASCII-abetical (UNICODEical?) order.
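As one possible way to apply this to the original split, the sketch below uses the Unicode property escapes \p{L} (any letter) and \p{N} (any number) together with the u modifier, rather than the POSIX classes shown above. It assumes the input is valid UTF-8 and that PCRE was built with Unicode property support, which is the usual case for PHP. The sample string and variable names are made up for illustration.

```php
<?php
// Assumption: the source file and the input string are UTF-8 encoded.
$input = 'København, Århus 2024';

// \p{L} matches any Unicode letter, \p{N} any Unicode number.
// The u modifier makes PCRE treat both pattern and subject as UTF-8.
// PREG_SPLIT_NO_EMPTY drops empty pieces at the start or end.
$words = preg_split('/[^\p{L}\p{N}]+/u', $input, -1, PREG_SPLIT_NO_EMPTY);

print_r($words); // ['København', 'Århus', '2024']
```

If the pieces are then placed in a MySQL query, the connection character set would also need to be UTF-8 (for example utf8mb4) for the accented characters to survive the round trip.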