Problem description
We are revamping our existing system, which uses a MySQL database to handle the following types of data:
- Transaction- and order-related data
- Customer information
- Product information
We need to query this data to pull statistics, and also to filter, facet, and segment lists and KPIs.
We tried ClickHouse, Druid, and DGraph, and ran a few tests on sample data to benchmark them and check which database fits our needs.
A few things I liked about Druid are:
- Druid search queries, which list all matches along with the dimensions (column names) and the count/occurrences of each. Link: http://druid.io/docs/latest/querying/searchquery.html
- utf8mb4 support
- Full-text search
- Case-insensitive search
We found ClickHouse to be faster than both MySQL and Druid, but we ran into the following problems:
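- We cannot run a Druid-style search query (one that returns the dimensions and occurrence counts). Is there a workaround?
- Case-insensitive search. How should we handle it? ClickHouse is case sensitive, right?
- utf8mb4 support? How can we save/store special characters or the few emojis that utf8 does not support?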
We faced a similar problem in MySQL, which was solved by changing the collation to utf8mb4. What can we do in ClickHouse to achieve the same?
Your suggestions can help us overcome these challenges and make a better decision.
Thanks in advance.
Recommended answer
The Druid search query feature sounds like it works roughly like this:
SELECT interval, dim1, COUNT(*) FROM my_table WHERE condition GROUP BY interval, dim1
UNION ALL
SELECT interval, dim2, COUNT(*) FROM my_table WHERE condition GROUP BY interval, dim2
UNION ALL
...
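The `...` in the query above stands for one more branch per dimension you want searched. As a minimal sketch, assuming the query is assembled in application code, the branches could be generated like this in Python. The helper name `build_search_query`, the extra `dimension` label column, and the `toString()` wrapper are illustrative assumptions and not part of the original answer; `my_table`, `dim1`, `dim2`, and the condition are placeholders.

```python
def build_search_query(table, dimensions, condition, interval_col="interval"):
    """Return one SELECT per dimension, joined with UNION ALL.

    All names are placeholders. `condition` is inserted verbatim, so it
    must come from trusted code, not from user input.
    """
    if not dimensions:
        raise ValueError("at least one dimension is required")
    for name in [table, interval_col, *dimensions]:
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")

    branches = [
        # The string literal labels each row with its dimension name
        # (an assumption, added so rows from different branches can be
        # told apart); toString() gives every branch the same column
        # type, which UNION ALL needs.
        f"SELECT {interval_col}, '{dim}' AS dimension, "
        f"toString({dim}) AS value, COUNT(*) AS occurrences "
        f"FROM {table} WHERE {condition} "
        f"GROUP BY {interval_col}, {dim}"
        for dim in dimensions
    ]
    return "\nUNION ALL\n".join(branches)


query = build_search_query("my_table", ["dim1", "dim2"], "dim1 != ''")
```

Each branch scans the table once, so this approach is probably best kept to a handful of dimensions.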
For case-insensitive search there are multiple options, for example the positionCaseInsensitiveUTF8(haystack, needle) function, or match with regular expressions: https://clickhouse.yandex/docs/en/query_language/functions/string_search_functions/#match-haystack-pattern
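To make the semantics concrete: ClickHouse position functions return the 1-based position of the match, or 0 when nothing matches, so a filter would typically compare the result with `> 0`. The Python sketch below only imitates that behavior for illustration; `position_case_insensitive` is a hypothetical helper, not a ClickHouse API, and its case folding may differ from ClickHouse's in edge cases.

```python
def position_case_insensitive(haystack: str, needle: str) -> int:
    """Return the 1-based position of needle in haystack, ignoring case.

    Returns 0 when needle does not occur. str.lower() can change the
    length of a few characters, so positions may shift for such input.
    """
    return haystack.lower().find(needle.lower()) + 1


found = position_case_insensitive("Hello Wörld", "wÖR")   # 7
missing = position_case_insensitive("Hello Wörld", "xyz")  # 0
```

In a ClickHouse query the equivalent filter would look like `WHERE positionCaseInsensitiveUTF8(name, 'wör') > 0`, where `name` is a placeholder column.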
As for utf8mb4: strings in ClickHouse are arbitrary byte sequences, so you can store whatever you want in them, but you should check whether the available functions match your use case.
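A short Python sketch of what "arbitrary byte sequences" means for emoji: a 4-byte UTF-8 character is stored as four ordinary bytes, and the difference only shows up when a function counts bytes instead of code points. In ClickHouse this corresponds to `length` (bytes) versus `lengthUTF8` (code points). The sample text is an assumption chosen for illustration.

```python
# "café 😀" written with escapes so the source stays ASCII-only.
text = "caf\u00e9 \U0001F600"

encoded = text.encode("utf-8")           # the bytes a String column would hold
byte_length = len(encoded)               # 10: a byte-oriented length
char_length = len(text)                  # 6: a code-point-oriented length
emoji_bytes = len("\U0001F600".encode("utf-8"))  # 4: needs utf8mb4 in MySQL

# Storing and reading back the bytes loses nothing.
round_trip = encoded.decode("utf-8")
```

The practical consequence is to prefer the `...UTF8` variants of string functions whenever the data contains non-ASCII text.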