How do I select the latest partition in a BigQuery table?
Problem description
I am trying to select data from the latest partition in a date-partitioned BigQuery table, but the query still reads data from the whole table.
I've tried the following (as far as I know, BigQuery does not support QUALIFY):
SELECT col FROM table WHERE _PARTITIONTIME = (
  SELECT pt FROM (
    SELECT pt, RANK() OVER (ORDER BY pt DESC) AS rnk FROM (
      SELECT _PARTITIONTIME AS pt FROM table GROUP BY 1
    )
  )
  WHERE rnk = 1
);
But this does not work and reads all rows.
By contrast, the following query does work, where 'YYYY-MM-DD' is a specific date:
SELECT col FROM table WHERE _PARTITIONTIME = TIMESTAMP('YYYY-MM-DD')
However, I need to run this script in the future, and the table update schedule (and therefore the _PARTITIONTIME) is irregular. Is there a way to pull data only from the latest partition in BigQuery?
Solution
Try one of the following (both use legacy SQL table references):
SELECT * FROM [dataset.partitioned_table]
WHERE _PARTITIONTIME IN (
SELECT MAX(TIMESTAMP(partition_id))
FROM [dataset.partitioned_table$__PARTITIONS_SUMMARY__]
)
or
SELECT * FROM [dataset.partitioned_table]
WHERE _PARTITIONTIME IN (
SELECT MAX(_PARTITIONTIME)
FROM [dataset.partitioned_table]
)
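Since the question notes that a filter with a literal date does restrict the scan, another option is to split the work into two steps: look up the newest partition ID first, then build the main query with that date as a constant. The Python sketch below covers only the second step, and it rests on a few assumptions: partition IDs are daily `YYYYMMDD` strings, the helper name `latest_partition_query` is hypothetical, and fetching the IDs and running the resulting SQL is left to whatever BigQuery client you already use.

```python
from datetime import datetime


def latest_partition_query(partition_ids, table="dataset.partitioned_table", column="col"):
    """Build a query that filters on the newest daily partition as a constant.

    partition_ids: iterable of partition ID strings such as "20240315".
    Non-date IDs such as "__NULL__" or "__UNPARTITIONED__" are skipped.
    This helper is illustrative only; it is not part of any BigQuery library.
    """
    daily = [p for p in partition_ids if len(p) == 8 and p.isdigit()]
    if not daily:
        raise ValueError("no daily partitions found")
    # YYYYMMDD strings sort chronologically, so max() picks the newest one.
    latest = datetime.strptime(max(daily), "%Y%m%d").date()
    return (
        f"SELECT {column} FROM {table} "
        f"WHERE _PARTITIONTIME = TIMESTAMP('{latest.isoformat()}')"
    )


print(latest_partition_query(["20240101", "__NULL__", "20240315"]))
```

Because the generated filter compares `_PARTITIONTIME` with a literal timestamp, it has the same shape as the query the question reports as working. Whether this reduces the bytes scanned for a particular table is worth confirming with a dry run.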