Problem description
I'm building a tagging system and I need to retrieve similar tags, so when a user punches in "some thing" or "somé thing" or "söme thing" or "some¤thing" etc., they get all the matching rows in the table.
If I were using utf8_general_ci or utf8_unicode_ci on the field, it would be a piece o' cake. I could just
SELECT * FROM tags WHERE tag LIKE 'some thing'
but alas, I need to use utf8_bin in that table. So, what do I do? I'm not a very big MySQL expert. I think I should be using CAST() or CONVERT() but I'm not sure how.
The second part, matching some-thing, some*thing, some&thing etc., is another issue, but I think I can solve it on my own with regular expressions.
EDIT: THE SOLUTION
I thought that messing around with all this converting and regexping might not be the best way. Instead, I will use my framework's methods to generate a URL "name" for each tag and store it in the same db row.
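The framework isn't named here, so the following is only a rough Python sketch of what such a URL "name" generator might do. The function name tag_slug and the choice of a hyphen as separator are assumptions for illustration, not the framework's actual API. It strips accents via Unicode decomposition and collapses every other non-alphanumeric run into one hyphen, so all the variants from the question end up with the same stored value.

```python
import re
import unicodedata


def tag_slug(tag: str) -> str:
    """Hypothetical helper: reduce a tag to a lowercase ASCII URL "name"."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", tag)
    # Drop the combining marks and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace each run of anything that is not a-z or 0-9 with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", stripped.lower())
    # Trim hyphens left over from leading or trailing punctuation.
    return slug.strip("-")


# The variants from the question, plus a hyphenated one.
variants = ["some thing", "somé thing", "söme thing", "some¤thing", "some-thing"]
slugs = {tag_slug(v) for v in variants}
```

With the slug stored next to the tag, the lookup becomes a plain equality test on the slug column, and the utf8_bin collation on the tag column no longer matters. One caveat: letters without a Unicode decomposition (ß or ø, for example) are not transliterated by this sketch and simply turn into a hyphen.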
Yes, CONVERT() will do it:
mysql> select convert( "söme thing" using utf8) = convert( "some thing" using utf8);
+------------------------------------------------------------------------+
| convert( "söme thing" using utf8) = convert( "some thing" using utf8) |
+------------------------------------------------------------------------+
|                                                                      1 |
+------------------------------------------------------------------------+
But I don't think there is any benefit to using utf8_bin.
When handling tag searches, you could consider storing:
- a clean version of each tag (some)
- an additional table that maps söme and other variations to the clean version
- then, when a user searches for söme, you can look up söme = some
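To make the mapping-table idea concrete, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The table names tags and tag_variants, the column names, and the helper find_clean_tag are all made up for illustration. SQLite compares text byte-for-byte by default, which is roughly the behaviour utf8_bin gives you, so each variant has to be listed explicitly.

```python
import sqlite3

# In-memory database standing in for the real MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tags (
        id  INTEGER PRIMARY KEY,
        tag TEXT UNIQUE NOT NULL            -- the clean version
    );
    CREATE TABLE tag_variants (
        variant TEXT PRIMARY KEY,           -- what a user might type
        tag_id  INTEGER NOT NULL REFERENCES tags(id)
    );
    INSERT INTO tags (id, tag) VALUES (1, 'some');
    INSERT INTO tag_variants (variant, tag_id)
        VALUES ('some', 1), ('söme', 1), ('somé', 1);
""")


def find_clean_tag(conn, user_input):
    """Hypothetical lookup: return the clean tag for a typed variant, or None."""
    row = conn.execute(
        "SELECT t.tag FROM tag_variants v "
        "JOIN tags t ON t.id = v.tag_id "
        "WHERE v.variant = ?",
        (user_input,),
    ).fetchone()
    return row[0] if row else None
```

Calling find_clean_tag(conn, "söme") returns the clean tag, and an unknown variant returns None. The trade-off is that every variation has to be inserted into the mapping table up front, or generated when the tag is created.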