This article discusses how to handle the question "Why are my BigQuery streaming inserts being rate limited?", which may be a useful reference if you are running into the same problem.

Problem Description


I'm getting 403 rateLimitExceeded errors while doing streaming inserts into BigQuery. I'm doing many streaming inserts in parallel, so while I understand that this might be the cause of some rate limiting, I'm not sure which rate limit in particular is the issue.

Here's what I get:

{ "code" : 403, "errors" : [ { "domain" : "global", "message" : "Exceeded rate limits: Your table exceeded quota for rows. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors", "reason" : "rateLimitExceeded" } ], "message" : "Exceeded rate limits: Your table exceeded quota for rows. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors"}

Based on BigQuery's troubleshooting docs, 403 rateLimitExceeded is caused by either concurrent rate limiting or API request limits, but the docs make it sound like neither of those apply to streaming operations.

However, the message in the error mentions "table exceeded quota for rows", which sounds more like the 403 quotaExceeded error. The streaming quotas are:

  • Maximum row size: 1 MB - I'm under this - my average row size is in the KB and I specifically limit sizes to ensure they don't hit 1MB
  • HTTP request size limit: 10 MB - I'm under this - my average batch size is < 400KB and max is < 1MB
  • Maximum rows per second: 100,000 rows per second, per table. Exceeding this amount will cause quota_exceeded errors. - can't imagine I'd be over this - each batch is about 500 rows, and each batch takes about 500 milliseconds. I'm running in parallel but inserting across about 2,000 tables, so while it's possible (though unlikely) that I'm doing 100k rows/second, there's no way that's per table (more like 1,000 rows/sec per table max)
  • Maximum rows per request: 500 - I'm right at 500 (see the batching sketch just after this list)
  • Maximum bytes per second: 100 MB per second, per table. Exceeding this amount will cause quota_exceeded errors. - Again, my insert rates are not anywhere near this volume by table.
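
For reference on the 500-rows-per-request limit, a minimal batching sketch in plain Python (the helper name is illustrative, not from the original post):

def batches(rows, limit=500):
    # Yield successive chunks no larger than the per-request row limit.
    for i in range(0, len(rows), limit):
        yield rows[i:i + limit]

Each chunk can then be sent as one streaming-insert request, e.g. one insert_rows_json call per batch.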

Any thoughts/suggestions as to what this rate limiting is would be appreciated!

Solution

I suspect you are occasionally submitting more than 100,000 rows per second to a single table. Might your parallel insert processes occasionally all line up on the same table?

The reason this is reported as a rate limit error is to give a push-back signal to slow down: to handle sporadic spikes of operations on a single table, you can back off and try again to spread the load out.

This is different from a quota failure, which implies that retrying will still fail until the quota epoch rolls over (e.g., daily quota limits).
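
To make the back-off concrete, here is a minimal retry sketch assuming the google-cloud-bigquery Python client; the attempt count, sleep schedule, and error handling are illustrative assumptions, not part of the original answer:

import random
import time

from google.api_core.exceptions import Forbidden
from google.cloud import bigquery

client = bigquery.Client()

def insert_with_backoff(table_id, rows, max_attempts=6):
    # Retry streaming inserts on rateLimitExceeded with exponential backoff.
    for attempt in range(max_attempts):
        try:
            row_errors = client.insert_rows_json(table_id, rows)
            if not row_errors:
                return
            raise RuntimeError("row-level insert errors: %s" % row_errors)
        except Forbidden as exc:
            reasons = {e.get("reason") for e in (exc.errors or [])}
            if "rateLimitExceeded" not in reasons or attempt == max_attempts - 1:
                raise  # quota-style failures will not succeed on retry
            # Back off exponentially with jitter (1s, 2s, 4s, ... plus noise)
            # so parallel workers that piled onto one table spread back out.
            time.sleep((2 ** attempt) + random.random())

The jitter matters here: without it, parallel workers that hit the limit at the same moment would all retry at the same moment and collide again.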

That wraps up this article on "Why are my BigQuery streaming inserts being rate limited?"; hopefully the answer above is helpful.
