Hive query with specific exclusion criteria

Problem description

I am trying to build a Hive query that picks out customers who have used only the features below, or a combination of them. The features are:
name = "summary"
name = "details"
name1 = "vehicle stats"
name1 = "accelerometer"
I have to count the number of customers who strictly meet the above conditions. For example, in the table below, customer "Joy" should not be counted because he has additionally done "expenses" in name, even though he has both "summary" and "details" in name and both "vehicle stats" and "accelerometer" in name1.
Similarly, customer "Lan" should not be counted, as he has additionally done "speeding" in name1, which is not among the above conditions.
customername  name      name1
Joy           summary   vehicle stats
Joy           details   accelerometer
Joy           expenses  speeding
Lan           summary   vehicle stats
Lan           details   accelerometer
Lan           details   speeding
Hana          details   accelerometer
Hana          summary   vehicle stats
The count for the table above has to be 1, as there is only one customer (Hana) who has done only "summary" and "details" in name and only "vehicle stats" and "accelerometer" in name1.
This is the query that I currently have:
select name, name1, count(distinct(customername))
from table1
where date_time between "2017-01-01 00:00:00" and "2017-01-10 00:00:00"
group by name, name1
having name in ('summary', 'details') or name1 in ('vehicle stats', 'accelerometer')
Any suggestions would be great!!
Solution
You can also use collect_set to check that those columns contain only the specified entries:
select customername
from table1
where date_time between "2017-01-01 00:00:00" and "2017-01-10 00:00:00"
group by customername
having concat_ws(',', sort_array(collect_set(name))) = 'details,summary'
and concat_ws(',', sort_array(collect_set(name1))) = 'accelerometer,vehicle stats'
collect_set does not guarantee the order of its elements, so the collected array has to be sorted (here with sort_array) before it is concatenated, and the comparison strings must list the values in that same sorted order.
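The question asks for a count, and for customers who have done only the listed features "or a combination" of them. One possible way to express that without comparing concatenated strings is conditional aggregation: count the rows that fall outside the allowed values and require that count to be zero. The query below is a minimal sketch under that reading, not part of the original answer. It reuses the table and column names from the question; the subquery alias allowed_only and the output column customer_count are made up for illustration.

```sql
-- Sketch only: counts customers whose rows in the date range contain
-- nothing outside the allowed values of name and name1.
-- Table and column names follow the question; allowed_only and
-- customer_count are hypothetical names chosen for this example.
select count(*) as customer_count
from (
  select customername
  from table1
  where date_time between "2017-01-01 00:00:00" and "2017-01-10 00:00:00"
  group by customername
  -- number of rows with a disallowed name must be zero
  having sum(case when name not in ('summary', 'details') then 1 else 0 end) = 0
  -- number of rows with a disallowed name1 must be zero
     and sum(case when name1 not in ('vehicle stats', 'accelerometer') then 1 else 0 end) = 0
) allowed_only
```

Unlike the collect_set comparison, this version also counts a customer who has done only a subset of the allowed features, for example only "summary" with "vehicle stats". If every allowed value must be present, an extra condition such as count(distinct name) = 2 and count(distinct name1) = 2 could be added to the having clause. Rows where name or name1 is NULL are not treated as violations here, which may or may not be the desired behavior.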
