How to select the latest partition in a BigQuery table?

Problem description

I am trying to select data from the latest partition in a date-partitioned BigQuery table, but the query still reads data from the whole table.

I've tried (as far as I know, BigQuery does not support QUALIFY):

SELECT col FROM table WHERE _PARTITIONTIME = (
  SELECT pt FROM (
    SELECT pt, RANK() OVER (ORDER BY pt DESC) AS rnk FROM (
      SELECT _PARTITIONTIME AS pt FROM table GROUP BY 1
    )
  )
  WHERE rnk = 1
);

But this does not work and reads all rows.

Filtering on a constant date, by contrast, does work:

SELECT col FROM table WHERE _PARTITIONTIME = TIMESTAMP('YYYY-MM-DD')

where 'YYYY-MM-DD' is a specific date.

However, I need to run this script repeatedly in the future, and the table is updated irregularly, so I cannot know the latest _PARTITIONTIME in advance. Is there a way to pull data only from the latest partition in BigQuery?

Solution

Try (both queries use legacy SQL syntax)

SELECT * FROM [dataset.partitioned_table]
WHERE _PARTITIONTIME IN (
  SELECT MAX(TIMESTAMP(partition_id))
  FROM [dataset.partitioned_table$__PARTITIONS_SUMMARY__]
)

or

SELECT * FROM [dataset.partitioned_table]
WHERE _PARTITIONTIME IN (
  SELECT MAX(_PARTITIONTIME)
  FROM [dataset.partitioned_table]
)
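
Since the question reports that a filter on a constant date reads only one partition, another option is to resolve the latest partition first and then issue the main query with a literal date. The following is a minimal sketch of that two-step idea, not part of the original answer. It assumes daily partitions whose IDs have the form YYYYMMDD (as listed by a partition metadata query such as the `__PARTITIONS_SUMMARY__` one above), and it emits standard SQL with a backtick-quoted table name. The helper names `latest_partition_date` and `build_latest_partition_query` are hypothetical, and how the ID list is fetched is left to the caller.

```python
from datetime import datetime

# Partition IDs that do not correspond to a calendar date.
SPECIAL_PARTITIONS = {"__NULL__", "__UNPARTITIONED__"}


def latest_partition_date(partition_ids):
    """Return the newest daily partition as 'YYYY-MM-DD', or None if there is none.

    Assumes each dated ID looks like 'YYYYMMDD' (hypothetical input format).
    """
    dates = []
    for pid in partition_ids:
        if pid in SPECIAL_PARTITIONS:
            continue  # skip partitions that hold rows without a date
        dates.append(datetime.strptime(pid, "%Y%m%d").date())
    return max(dates).isoformat() if dates else None


def build_latest_partition_query(table, partition_ids, columns="*"):
    """Build a query that filters _PARTITIONTIME on a constant date literal."""
    latest = latest_partition_date(partition_ids)
    if latest is None:
        raise ValueError("no dated partitions found")
    return (
        f"SELECT {columns} FROM `{table}` "
        f"WHERE _PARTITIONTIME = TIMESTAMP('{latest}')"
    )


if __name__ == "__main__":
    # Example IDs; in practice these would come from a metadata query.
    ids = ["20160727", "__UNPARTITIONED__", "20160729", "20160728"]
    print(build_latest_partition_query("dataset.partitioned_table", ids, "col"))
```

The generated query has the same shape as the constant-date query in the question, so it should prune partitions in the same way; the cost is one extra round trip to look up the partition list.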


07-29 11:48