问题描述
在有关stackoverflow的另一篇文章中,我读到INSTR
可用于根据相关性对结果进行排序.
In another post on stackoverflow, I read that INSTR
could be used to order results by relevance.
我对col LIKE '%str%' and
INSTR(col,'str')`的理解是,它们的行为都相同.排序规则的处理方式似乎有所不同.
My understanding of col LIKE '%str%' and
INSTR(col, 'str')` is that they both behave the same. There seems to be a difference in how collations are handled.
CREATE TABLE `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO users (name)
VALUES ('Joël'), ('René');
SELECT * FROM users WHERE name LIKE '%joel%'; -- 1 record returned
SELECT * FROM users WHERE name LIKE '%rene%'; -- 1 record returned
SELECT * FROM users WHERE INSTR(name, 'joel') > 0; -- 0 records returned
SELECT * FROM users WHERE INSTR(name, 'rene') > 0; -- 0 records returned
SELECT * FROM users WHERE INSTR(name, 'joël') > 0; -- 1 record returned
SELECT * FROM users WHERE INSTR(name, 'rené') > 0; -- 1 record returned
尽管INSTR
进行了一些转换,但它在é
中找到了ë
.
Although INSTR
does some conversion, it finds ë
in é
.
SELECT INSTR('é', 'ë'), INSTR('é', 'e'), INSTR('e', 'ë');
-- returns 1, 0, 0
我错过了什么吗?
http://sqlfiddle.com/#!2/9bf21/6 (使用mysql-version:5.5.22)
http://sqlfiddle.com/#!2/9bf21/6 (using mysql-version: 5.5.22)
推荐答案
这是由于已验证的LOCATE()
和INSTR()
上的错误70767.
This is due to bug 70767 on LOCATE()
and INSTR()
, which has been verified.
尽管 INSTR()
文档指出它可以用于多字节字符串,就像您注意到的那样,它不适用于像utf8_general_ci
这样的归类,而utf8_general_ci
这样的归类应该区分大小写和变音符号
Though the INSTR()
documentation states that it can be used for multi-byte strings, it doesn't seem to work, as you note, with collations like utf8_general_ci
, which should be case and accent insensitive
错误报告指出,尽管MySQL正确地做到了这一点,但只有在 bytes 的数目也相同时,才会这样做:
The bug report states that although MySQL does this correctly it only does so when the number of bytes is also identical:
如果要创建下表,要弄乱报告示例:
To pervert the reports example, if you create the following table:
create table t ( needle varchar(10), haystack varchar(10)
) COLLATE=utf8_general_ci;
insert into t values ("A", "a"), ("A", "XaX");
insert into t values ("A", "á"), ("A", "XáX");
insert into t values ("Á", "a"), ("Á", "XaX");
insert into t values ("Å", "á"), ("Å", "XáX");
然后运行此查询,您可以看到演示的相同行为:
then run this query, you can see the same behaviour demonstrated:
select needle
, haystack
, needle=haystack as `=`
, haystack LIKE CONCAT('%',needle,'%') as `like`
, instr(needle, haystack) as `instr`
from t;
这篇关于当str包含'é'或'ë'并且仅substr'e'时,INSTR(str,substr)不起作用的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!