Problem description
I would like to run a query that prunes the partitions of table A using a value from table B. For example:
#standardSQL
select A.user_id
from my_project.xxx A
inner join my_project.yyy B
on A._partitiontime = timestamp(B.date)
where B.date = '2018-01-01'
This query scans every partition in table A and does not take into account the date I specified in the WHERE clause (for partition pruning). I have tried running this query in several different ways, but all of them produced the same result: every partition in table A is scanned. Is there any way around this?
Thanks in advance.
Recommended answer
With BigQuery scripting (in Beta at the time of writing), there is a way to prune the partitions.
Basically, a scripting variable is declared to capture the dynamic part of the subquery. Then, in the subsequent query, that variable is used as a filter to prune the partitions to be scanned.
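-- _partitiontime is a TIMESTAMP, so the filter values are collected as TIMESTAMPs.
DECLARE date_filter ARRAY<TIMESTAMP>
DEFAULT (SELECT ARRAY_AGG(TIMESTAMP(date)) FROM my_project.yyy WHERE ...);
-- The variable is already evaluated when this query runs, so the filter can prune partitions.
select A.user_id
from my_project.xxx A
inner join my_project.yyy B
on A._partitiontime = timestamp(B.date)
where A._partitiontime IN UNNEST(date_filter);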
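
For the specific date in the question, the same idea might look like the sketch below. It assumes that `my_project.yyy.date` is a DATE column and that `my_project.xxx` is an ingestion-time partitioned table; neither is confirmed by the question, so adjust the types if they differ.

```sql
-- Assumption: yyy.date is a DATE column; xxx is ingestion-time partitioned.
-- Step 1: resolve the dynamic values into a script variable.
DECLARE date_filter ARRAY<TIMESTAMP> DEFAULT (
  SELECT ARRAY_AGG(DISTINCT TIMESTAMP(date))
  FROM my_project.yyy
  WHERE date = '2018-01-01'
);

-- Step 2: filter the partitioned table with the variable.
SELECT A.user_id
FROM my_project.xxx A
INNER JOIN my_project.yyy B
  ON A._PARTITIONTIME = TIMESTAMP(B.date)
WHERE A._PARTITIONTIME IN UNNEST(date_filter);
```

Comparing the bytes processed by the second statement with those of the original query is one way to check whether the partitions were actually pruned.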