在上一个问题中,给出了执行查询的答案,该查询将在新列中获取上一年的收入。这非常有效,但是现在我有一个后续问题。 (please review this link to have a look at the previous question)
用于获取此数据的查询(感谢Mikhail):
#standardSQL
SELECT
a.date, a.location, a.revenue,
DATE_SUB(a.date, INTERVAL 1 YEAR) date_last_year,
IFNULL(b.revenue, 0) revenue_last_year
FROM `project.dataset.table` a
LEFT JOIN `project.dataset.table` b
ON a.location = b.location
AND DATE_SUB(a.date, INTERVAL 1 YEAR) = b.date
简化的结果如下所示(限于1个位置):
date revenue revenue_last_year
2019-01-31 1471,2577 2185,406
2019-01-30 1291,1111 4723,7439
2019-01-29 2178,6532 2263,5283
2019-01-28 1531,8021 0
2019-01-26 1578,1247 2446,6234
2019-01-25 1299,644 1522,4537
2019-01-24 788,2669 1979,104
2019-01-23 787,441 4117,7927
2019-01-22 2437,2951 1876,2479
2019-01-21 1071,0476 0
2019-01-19 2291,0456 2289,8657
后续问题与去年的工作日有关。如您所见,去年的收入值为“ 0”。那是因为位置A关闭的那天。但是,为了每天进行准确的比较,我们需要找到该营业日的营业额。
有关更多信息,请查看下表,以了解我们在今年一月的最后10天中的情况,并在两栏中添加手动找到的去年收入:
date revenue revenue_last_year date revenue
2019-01-31 1471,2577 2185,406 2018-01-31 2185,406
2019-01-30 1291,1111 4723,7439 2018-01-30 4723,7439
2019-01-29 2178,6532 2263,5283 2018-01-29 2263,5283
2019-01-28 1531,8021 0 2018-01-27 2843,6616
2019-01-26 1578,1247 2446,6234 2018-01-26 2446,6234
2019-01-25 1299,644 1522,4537 2018-01-25 1522,4537
2019-01-24 788,2669 1979,104 2018-01-24 1979,104
2019-01-23 787,441 4117,7927 2018-01-23 4117,7927
2019-01-22 2437,2951 1876,2479 2018-01-22 1876,2479
2019-01-21 1071,0476 0 2018-01-20 2561,4086
2019-01-19 2291,0456 2289,8657 2018-01-19 2289,8657
请注意日期的差异。
什么是解决此问题的好方法?是否有必要针对工作日的查询进行调整,您将如何处理?
最佳答案
以下是BigQuery标准SQL
#standardSQL
SELECT
a.date, a.location, ANY_VALUE(a.revenue) revenue,
ARRAY_AGG(
STRUCT(b.date AS date_last_year, b.revenue AS revenue_last_year)
ORDER BY b.date DESC LIMIT 1
)[OFFSET(0)].*
FROM `project.dataset.table` a
CROSS JOIN `project.dataset.table` b
WHERE a.location = b.location
AND b.date BETWEEN DATE_SUB(DATE_SUB(a.date, INTERVAL 1 YEAR), INTERVAL 7 DAY) AND DATE_SUB(a.date, INTERVAL 1 YEAR)
GROUP BY a.date, a.location
您可以使用虚拟/样本数据(与上一个问题的答案相同)来测试,玩游戏,如以下示例所示
#standardSQL
WITH `project.dataset.table` AS (
SELECT DATE '2018-02-20' `date`, 'A' location, 1 revenue UNION ALL
SELECT '2018-02-21', 'A', 3 UNION ALL
SELECT '2019-02-20', 'A', 5 UNION ALL
SELECT '2019-02-21', 'A', 7 UNION ALL
SELECT '2019-02-22', 'A', 9 UNION ALL
SELECT '2018-02-20', 'B', 2 UNION ALL
SELECT '2018-02-22', 'B', 4 UNION ALL
SELECT '2019-02-20', 'B', 6 UNION ALL
SELECT '2019-02-21', 'B', 8 UNION ALL
SELECT '2019-02-22', 'B', 10
)
SELECT
a.date, a.location, ANY_VALUE(a.revenue) revenue,
ARRAY_AGG(
STRUCT(b.date AS date_last_year, b.revenue AS revenue_last_year)
ORDER BY b.date DESC LIMIT 1
)[OFFSET(0)].*
FROM `project.dataset.table` a
CROSS JOIN `project.dataset.table` b
WHERE a.location = b.location
AND b.date BETWEEN DATE_SUB(DATE_SUB(a.date, INTERVAL 1 YEAR), INTERVAL 7 DAY) AND DATE_SUB(a.date, INTERVAL 1 YEAR)
GROUP BY a.date, a.location
-- ORDER BY a.date, a.location
结果
Row date location revenue date_last_year revenue_last_year
1 2019-02-20 A 5 2018-02-20 1
2 2019-02-20 B 6 2018-02-20 2
3 2019-02-21 A 7 2018-02-21 3
4 2019-02-21 B 8 2018-02-20 2
5 2019-02-22 A 9 2018-02-21 3
6 2019-02-22 B 10 2018-02-22 4
注意日期的差异:o)