Problem description
When I run the following query, I get a 'resources exceeded' error. If I remove the last line (the ORDER BY clause), it works:
SELECT
  id,
  INTEGER(-position / (CASE WHEN fallback = 0 THEN 2 ELSE 1 END)) AS major_sort
FROM (
  SELECT
    id,
    fallback,
    ROW_NUMBER() OVER(PARTITION BY fallback) AS position
  FROM
    [table] AS r
  ORDER BY
    r.score DESC ) AS r
ORDER BY major_sort DESC
Actually, the entire last line would be:
ORDER BY major_sort DESC, r.score DESC
But that would probably only make things worse.
Any idea how I could change the query to circumvent this problem?
(If you wonder what this query does: the table contains a 'ranking' with multiple fallback strategies, and I want to create an ordering like 'AABAABAABAAB', with 'A' and 'B' being the fallback strategies. If you have a better idea for how to achieve this, please feel free to tell me :D)
Recommended answer
A top-level ORDER BY will always serialize execution of your query: it forces all computation onto a single node for the purpose of sorting. That is the cause of the resources exceeded error.
I'm not sure I fully understand your goal with the query, so it's hard to suggest alternatives, but you might consider putting an ORDER BY clause within the OVER(PARTITION BY ...) clause. Individual partitions can be sorted in parallel, and the result may be closer to what you want.
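To make the suggestion concrete, here is a minimal sketch of a window that carries its own ORDER BY. It runs against an in-memory SQLite database as a stand-in for BigQuery (an assumption made only so the example is runnable; it needs SQLite 3.25 or later for window functions). The table name ranking and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data standing in for [table]: (id, fallback, score).
ROWS = [
    (1, 0, 0.9), (2, 0, 0.8), (3, 0, 0.7), (4, 0, 0.6),
    (5, 1, 0.5), (6, 1, 0.4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ranking (id INTEGER, fallback INTEGER, score REAL)")
conn.executemany("INSERT INTO ranking VALUES (?, ?, ?)", ROWS)

# The ORDER BY lives inside the window, so each fallback partition is
# ranked by score on its own; no top-level ORDER BY is involved.
QUERY = """
SELECT
  id,
  fallback,
  ROW_NUMBER() OVER (PARTITION BY fallback ORDER BY score DESC) AS position
FROM ranking
"""

# Map each id to its position within its partition.
positions = {row[0]: row[2] for row in conn.execute(QUERY)}
print(positions)
```

With the ordering inside the window, position is well defined per partition, which the original subquery's trailing ORDER BY r.score DESC does not guarantee.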
More general advice on ordering:
Order is not preserved during BQ queries, so if there's an ordering that you want to preserve on the input rows, make sure it's encoded in your data as an extra field.
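As a rough illustration of encoding the order in the data, the sketch below attaches an explicit sequence number to each row before handing the rows to something that may reorder them. The field name seq and the sample values are hypothetical, and the shuffle merely simulates a system that returns rows in arbitrary order.

```python
import random

# Hypothetical input rows whose arrival order matters.
rows = ["first", "second", "third", "fourth"]

# Attach the position explicitly so the order is part of the data itself.
indexed = [{"seq": i, "value": v} for i, v in enumerate(rows)]

# Simulate processing that does not preserve row order.
shuffled = indexed[:]
random.Random(0).shuffle(shuffled)

# The original order is recoverable from the extra field alone.
restored = [r["value"] for r in sorted(shuffled, key=lambda r: r["seq"])]
print(restored)
```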
The use cases for large amounts of globally sorted data are somewhat limited. Often when users run into resource limitations with ORDER BY, we find that they're actually looking for something slightly different (locally ordered data, or "top N"), and that it's possible to get rid of the global ORDER BY completely.
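The "top N" case shows why a global sort is often unnecessary. The sketch below is plain Python rather than BigQuery, with hypothetical (id, score) rows: it keeps only the N best rows while scanning the input once, instead of sorting everything first.

```python
import heapq

# Hypothetical (id, score) rows; scores are distinct to keep the result unambiguous.
rows = [(1, 0.42), (2, 0.91), (3, 0.17), (4, 0.78), (5, 0.66), (6, 0.05)]

N = 3

# Keeps a heap of at most N rows while scanning, rather than sorting all rows.
top_n = heapq.nlargest(N, rows, key=lambda r: r[1])

# Same answer as a full sort followed by a slice, but without ordering every row.
full_sort = sorted(rows, key=lambda r: r[1], reverse=True)[:N]
print(top_n)
```

In SQL terms, this corresponds roughly to pairing ORDER BY with a small LIMIT rather than ordering the entire result.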