Problem description
As I understand it, BigQuery's caching mechanism works on a per-user basis. But we'd like to be able to share the cache at something like a project/dataset/table level.
For example, John & Mary both work on the same Google project, XYZ. They love using BigQuery, and both query the table Bar in dataset Foo, i.e. XYZ:Foo.Bar, to get beautiful insights from their data.
John logs in and writes a query against XYZ:Foo.Bar which takes 10 seconds to execute. A few minutes later Mary logs in and composes the exact same query on XYZ:Foo.Bar. It also takes 10 seconds, but she does not get a cache hit.
Is there anything that can be done to share the query cache across users, i.e. at a project/dataset/table level? Or have I missed something obvious?
Recommended answer
BigQuery doesn't share cache across users for privacy reasons - but it could be an interesting feature request to propose: https://code.google.com/p/google-bigquery/.
An alternative you could implement today is a proxy that would connect to BigQuery on behalf of your users with a service account. For example, you get the BigQuery native cache and an application level cache when using http://demo.redash.io. Same with Datalab - as it uses a service account by default, results are cached for users in the same project.
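To make the proxy idea a bit more concrete, here is a minimal sketch of the application-level half of it. Everything in it is an assumption for illustration: the class name `SharedQueryProxy`, the `run_query` callable, and the TTL are hypothetical and not part of BigQuery. In a real deployment `run_query` would wrap a BigQuery client authenticated as a single service account, so that BigQuery's native cache would also see every user's query as coming from the same identity.

```python
import hashlib
import time
from typing import Any, Callable, Dict, List, Tuple

Rows = List[Dict[str, Any]]


class SharedQueryProxy:
    """Hypothetical proxy: one query cache shared by all users who go through it."""

    def __init__(
        self,
        run_query: Callable[[str], Rows],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # run_query is assumed to execute SQL with the proxy's own credentials
        # (e.g. a service account) and return the result rows.
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, Rows]] = {}

    @staticmethod
    def _key(sql: str) -> str:
        # Only leading/trailing whitespace is ignored; any other difference
        # in the query text produces a different cache entry.
        return hashlib.sha256(sql.strip().encode("utf-8")).hexdigest()

    def query(self, sql: str) -> Tuple[Rows, bool]:
        """Return (rows, cache_hit) for the given SQL text."""
        key = self._key(sql)
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1], True  # served from the shared cache
        rows = self._run_query(sql)  # cache miss: run the query once
        self._cache[key] = (now, rows)
        return rows, False


if __name__ == "__main__":
    # Stand-in backend so the sketch runs without any cloud access.
    def fake_backend(sql: str) -> Rows:
        return [{"answer": 42}]

    proxy = SharedQueryProxy(fake_backend)
    print(proxy.query("SELECT 42 AS answer"))  # John: cache miss
    print(proxy.query("SELECT 42 AS answer"))  # Mary: cache hit
```

Note that this sketch does nothing about the privacy concern mentioned above: every user of the proxy sees results computed with the proxy's permissions, so it only makes sense where all of those users are allowed to read the same data. It also does not invalidate entries when the underlying table changes; the TTL is the only expiry mechanism here.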