Querying a partitioned table in BigQuery using a reference from a joined table

Problem description

I would like to run a query that prunes the partitions of table A using a value from table B. For example:

#standardSQL
select A.user_id
from my_project.xxx A
inner join my_project.yyy B
on A._partitiontime = timestamp(B.date)
where B.date = '2018-01-01'

This query scans all the partitions in table A and does not use the date I specified in the where clause for partition pruning. I have tried running the query in several different ways, but all produced the same result: every partition in table A is scanned. Is there any way around it?

Thanks in advance.

Recommended answer

With BigQuery scripting (in Beta at the time of writing), there is a way to prune the partitions.

Basically, a scripting variable is declared to capture the dynamic part, that is, the result of a subquery. Then, in the subsequent query, the scripting variable is used as a filter to prune the partitions to be scanned.

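-- Collect the partition timestamps to scan into a script variable.
-- TIMESTAMP(date) matches the TIMESTAMP type of _PARTITIONTIME.
DECLARE date_filter ARRAY<TIMESTAMP>
  DEFAULT (SELECT ARRAY_AGG(DISTINCT TIMESTAMP(date)) FROM my_project.yyy WHERE ...);

-- The variable is a constant for this query, so partitions can be pruned.
SELECT A.user_id
FROM my_project.xxx A
INNER JOIN my_project.yyy B
  ON A._PARTITIONTIME = TIMESTAMP(B.date)
WHERE A._PARTITIONTIME IN UNNEST(date_filter);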
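
If the date is already known when the query is written, as with the literal '2018-01-01' in the question, a simpler variant may be enough: repeat the filter as a constant on the partitioning pseudo-column of table A. The sketch below is not part of the original answer. It assumes that B.date is a DATE column and reuses the placeholder table names from the question.

```sql
-- Sketch only: assumes B.date is a DATE column; my_project.xxx and
-- my_project.yyy are the placeholder table names from the question.
SELECT A.user_id
FROM my_project.xxx A
INNER JOIN my_project.yyy B
  ON A._PARTITIONTIME = TIMESTAMP(B.date)
WHERE B.date = DATE '2018-01-01'
  -- A constant filter directly on the partition pseudo-column
  -- is what allows the partitions of A to be pruned.
  AND A._PARTITIONTIME = TIMESTAMP('2018-01-01');
```

A dry run of either query reports the estimated bytes processed, which is one way to check whether pruning took effect before running it for real.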
