我下面有个桌子

    Date | Customer | Count | Daily_Count | ITD_Count
    d1   |   A      |  3    |  3          |
    d2   |   B      |  4    |  4          |
    d3   |   A      |  7    |  16         |
    d3   |   B      |  9    |  16         |
    d4   |   A      |  8    |  9          |
    d4   |   B      |  1    |  9          |

字段描述:


SUM(count) OVER (partition BY date )as Daily_Count

题 :
如何计算ITD_Count中的“运行总计”或“滚动总计”?
输出应该看起来像
Date | Customer | Count | Daily_Count | ITD_Count
 d1   |   A      |  3    |  3          |  3
 d2   |   B      |  4    |  4          |  7
 d3   |   A      |  7    |  16         |  23
 d3   |   B      |  9    |  16         |  23
 d4   |   A      |  8    |  9          |  31
 d4   |   B      |  1    |  9          |  31

我已经尝试过使用Window功能的多种变体。但是,我的所有尝试都遇到了障碍。

尝试1;
  SUM(daily_COunt) OVER (partition BY date order by date rows between unbounded preceding and current row ) as ITD_account_linking

尝试2:
  SUM(daily_COunt) OVER (partition BY date, daily_count order by date rows between unbounded preceding and current row ) as ITD_account_linking

并在此之后进行了更多尝试。 :(

欢迎任何可能的指导我正确方向的建议。

如果您需要更多详细信息,请告诉我。

最佳答案

使用配置单元窗口和分析功能。

SELECT Date, Customer, Count, Daily_Count,
        SUM(Daily_Count) OVER (ORDER BY Date ROWS UNBOUNDED PRECEDING) AS ITD_Count
FROM table;

关于hadoop - 基于2个字段的Hive查询滚动总数,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/37688123/

10-11 07:51