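We have a transactions table that contains the amount, user ID, partner ID, and description. When a user has 3 specific transactions, they receive a one-time bonus. We have been running the query below on smaller sets, but now that we have over 20,000,000 transaction rows it just isn't cutting it. There has to be a better way to approach this.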
SELECT t.str_spid, count(*) AS C
FROM transactions t
WHERE t.str_prid = 150
AND (t.str_desc = 'Annual Rewards' OR t.str_desc = 'Annual Rewards (PRO)')
AND t.str_amount = 1000
AND t.str_spid NOT IN (
SELECT x.str_spid
FROM transactions x
WHERE x.str_prid = 150
AND x.str_amount = 2500
GROUP BY x.str_spid
)
GROUP BY t.STR_SPID
HAVING C = 3
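My code limits the result to 50 rows at a time, but even that chokes. There are about 3 million people in this group, with about 12 million transactions in total. The subquery alone returns just under 1,000,000 rows.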
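One idea is to flag users who have already received the bonus so they can be skipped later. That would eliminate the need for the subquery, but I would have to join to the users table (t.str_serid = u.us_userid) to check whether the flag is set.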
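A minimal sketch of that flag idea, under stated assumptions: the table name `users` and the column `us_bonus_paid` are hypothetical, and the join condition is copied from the sentence above as written. The flag would also have to be backfilled for users who already have the 2500 row before the subquery could be dropped.

```sql
-- Hypothetical flag column; us_bonus_paid is an assumed name, not part of the real schema.
ALTER TABLE users
    ADD COLUMN us_bonus_paid BOOLEAN NOT NULL DEFAULT FALSE;

-- Same filters as the original query, but flagged users are skipped
-- through the join instead of through the NOT IN subquery.
SELECT t.str_spid, COUNT(*) AS C
FROM transactions t
JOIN users u
    ON t.str_serid = u.us_userid      -- join condition as given in the question
WHERE t.str_prid = 150
  AND t.str_desc IN ('Annual Rewards', 'Annual Rewards (PRO)')
  AND t.str_amount = 1000
  AND u.us_bonus_paid = FALSE
GROUP BY t.str_spid
HAVING C = 3;
```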
Any tips, pointers, ideas, etc. would be much appreciated.
Best answer
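If reviewing your indexes does not provide a solution, this is one of those cases where I would suggest trying a correlated subquery. Change this: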
AND t.str_spid NOT IN (
SELECT x.str_spid
FROM transactions x
WHERE x.str_prid = 150
AND x.str_amount = 2500
GROUP BY x.str_spid
)
to this:
AND NOT EXISTS (
SELECT x.str_spid
FROM transactions x
WHERE x.str_spid = t.str_spid
AND x.str_prid = 150
AND x.str_amount = 2500
)
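As a rough illustration of the index check mentioned above, the sketch below adds two composite indexes and inspects the plan for the NOT EXISTS form. The index names are hypothetical, the column order is one plausible choice rather than a measured optimum, and the second index assumes str_desc is a VARCHAR short enough to be indexed in full.

```sql
-- Hypothetical index: lets the NOT EXISTS probe look up a user's 2500 row
-- by equality on all three columns.
CREATE INDEX idx_tx_spid_prid_amount
    ON transactions (str_spid, str_prid, str_amount);

-- Hypothetical index: matches the outer filter (partner, amount, description)
-- and carries str_spid for the grouping.
CREATE INDEX idx_tx_prid_amount_desc_spid
    ON transactions (str_prid, str_amount, str_desc, str_spid);

-- Check whether the optimizer actually uses the indexes.
EXPLAIN
SELECT t.str_spid, COUNT(*) AS C
FROM transactions t
WHERE t.str_prid = 150
  AND (t.str_desc = 'Annual Rewards' OR t.str_desc = 'Annual Rewards (PRO)')
  AND t.str_amount = 1000
  AND NOT EXISTS (
      SELECT 1
      FROM transactions x
      WHERE x.str_spid = t.str_spid
        AND x.str_prid = 150
        AND x.str_amount = 2500
  )
GROUP BY t.str_spid
HAVING C = 3;
```

If the EXPLAIN output still shows a full scan of transactions for either alias, the indexes are not helping and a different column order may be worth trying.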
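Also, although I doubt it will make a difference, DISTINCT is a better fit for your original subquery than GROUP BY with no aggregate functions. Another option is a LEFT JOIN, which is actually what I would most likely try first: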
SELECT t.str_spid, count(*) AS C
FROM transactions AS t
LEFT JOIN transactions AS x
ON t.str_spid = x.str_spid
AND x.str_prid = 150
AND x.str_amount = 2500
WHERE t.str_prid = 150
AND (t.str_desc = 'Annual Rewards' OR t.str_desc = 'Annual Rewards (PRO)')
AND t.str_amount = 1000
AND x.str_spid IS NULL
GROUP BY t.STR_SPID
HAVING C = 3
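For reference, the DISTINCT remark above changes only the subquery of the original statement. A sketch of the full query in that form:

```sql
-- Original NOT IN query with the subquery's GROUP BY replaced by DISTINCT.
SELECT t.str_spid, COUNT(*) AS C
FROM transactions t
WHERE t.str_prid = 150
  AND (t.str_desc = 'Annual Rewards' OR t.str_desc = 'Annual Rewards (PRO)')
  AND t.str_amount = 1000
  AND t.str_spid NOT IN (
      SELECT DISTINCT x.str_spid
      FROM transactions x
      WHERE x.str_prid = 150
        AND x.str_amount = 2500
  )
GROUP BY t.str_spid
HAVING C = 3;
```

One difference between the forms is worth keeping in mind: if the subquery can return a NULL str_spid, NOT IN matches no rows at all, whereas the NOT EXISTS and LEFT JOIN ... IS NULL versions do not have that problem.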
Regarding mysql - MySQL query to find 3 identical transactions in a very large set of transactions, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55856145/