问题描述
我如何将 GENERATE_DATE_ARRAY(start_date, end_date[, INTERVAL INT64_expr date_part]) 应用于数据集中的每条记录.我了解如何应用它来获取从开始到结束的单个日期范围,但不知道如何将相同的日期数组应用于每个 id.
How can i apply GENERATE_DATE_ARRAY(start_date, end_date[, INTERVAL INT64_expr date_part]) to each record in a dataset. I understand how to apply it to get a single date range from start to end, but don't know how to apply the same date array to each id.
假设我有两个不同 ID 的 x 和 y,日期如下:
Say i have two distinct ID's x and y with the following dates:
|id|date
--------------
1 |x |2021-01-01
2 |x |2021-01-03
3 |y |2021-01-06
4 |y |2021-01-09
我想为每个 ID 填写日期间隔
and i want to fill in the date gap for each ID
我怎样才能实现以下输出?
How can i achieve the following output?
|id|date
--------------
1 |x |2021-01-01
2 |x |2021-01-02
3 |x |2021-01-03
4 |y |2021-01-06
5 |y |2021-01-07
6 |y |2021-01-08
7 |y |2021-01-09
推荐答案
以下是 BigQuery Standard SQL
Below is for BigQuery Standard SQL
select id, date from (
select id, date, lead(date) over(partition by id order by date) next_date
from `project.dataset.table`
), unnest(generate_date_array(date, next_date)) date
where not next_date is null
-- order by date
如果适用于您问题中的样本数据 - 输出为
if to apply to sample data from your question - output is
这篇关于Bigquery - 为每个 ID 添加完整的日期范围的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!