hadoop - Hive always gives "Number of reduce tasks determined at compile time: 1", no matter what I do

create external table if not exists my_table
(customer_id STRING,ip_id STRING)
location 'ip_b_class';

Then:

hive> set mapred.reduce.tasks=50;
hive> select count(distinct customer_id) from my_table;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1

There is 160 GB of data in there, and with 1 reducer it takes a very long time...

[ihadanny@lvshdc2en0011 ~]$ hdu
Found 8 items
162808042208   hdfs://horton/ip_b_class

...

Best answer

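Logically, there can be at most one reducer here. Unless all the distinct customer IDs coming out of the individual map tasks are brought together in one place, distinctness cannot be established and a single count cannot be produced. In other words, until every customer ID has been gathered in one place, you cannot tell which IDs are duplicates, so you cannot produce the final count.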
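
A commonly used workaround, sketched here as an assumption and not part of the original answer, is to split the query into two stages. The inner query only deduplicates, and because every occurrence of a given customer_id is sent to the same reducer, that stage can be spread over many reducers. The outer query then counts rows that are already distinct, so the single reducer at the end has very little work left. The alias `t` is illustrative. The `is not null` filter is there because count(distinct ...) ignores NULLs, while `select distinct` would keep NULL as one row. How many reducers Hive actually launches for the first stage depends on the Hive version and configuration.

```sql
-- Ask for parallelism in the deduplication stage (same setting as in the question).
set mapred.reduce.tasks=50;

-- Inner query: deduplicate customer_id; this stage can use multiple reducers.
-- Outer query: count the already-distinct rows; this stage still ends in a single count.
select count(*)
from (
  select distinct customer_id
  from my_table
  where customer_id is not null  -- match count(distinct), which skips NULLs
) t;
```

With this form Hive typically reports more than one MapReduce job, and the "determined at compile time: 1" line should apply only to the final counting job.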

Regarding hadoop - Hive always gives "Number of reduce tasks determined at compile time: 1", no matter what I do, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/16218350/