Otherwise clause does not work as expected: what is wrong here?

Problem description

I am using spark-sql 2.4.1. How do I perform different joins depending on the value of a column? I need to get the lookup values of the map_val column for the given value columns, as shown below.

Sample data:

val data = List(
  ("20", "score", "school", "2018-03-31", 14 , 12),
  ("21", "score", "school", "2018-03-31", 13 , 13),
  ("22", "rate", "school", "2018-03-31", 11 , 14),
  ("21", "rate", "school", "2018-03-31", 13 , 12)
 )
val df = data.toDF("id", "code", "entity", "date", "value1", "value2")

df.show

+---+-----+------+----------+------+------+
| id| code|entity|      date|value1|value2|
+---+-----+------+----------+------+------+
| 20|score|school|2018-03-31|    14|    12|
| 21|score|school|2018-03-31|    13|    13|
| 22| rate|school|2018-03-31|    11|    14|
| 21| rate|school|2018-03-31|    13|    12|
+---+-----+------+----------+------+------+




 val resultDs = df
                 .withColumn("value1",
                        when(col("code").isin("rate") , functions.callUDF("udfFunc",col("value1")))
                         .otherwise(col("value1").cast(DoubleType))
                      )

udfFunc maps values as follows:

11->a
12->b
13->c
14->d

Expected output:

+---+-----+------+----------+------+------+
| id| code|entity|      date|value1|value2|
+---+-----+------+----------+------+------+
| 20|score|school|2018-03-31|    14|    12|
| 21|score|school|2018-03-31|    13|    13|
| 22| rate|school|2018-03-31|    a |    14|
| 21| rate|school|2018-03-31|    c |    12|
+---+-----+------+----------+------+------+

But the actual output is:

+---+-----+------+----------+------+------+
| id| code|entity|      date|value1|value2|
+---+-----+------+----------+------+------+
| 20|score|school|2018-03-31|  null|    12|
| 21|score|school|2018-03-31|  null|    13|
| 22| rate|school|2018-03-31|    a |    14|
| 21| rate|school|2018-03-31|    c |    12|
+---+-----+------+----------+------+------+

Why is the "otherwise" condition not working as expected? Any idea what is wrong here?

Recommended answer

A column must contain a single data type, so both the "when" branch and the "otherwise" branch should produce the same type.

Note: DoubleType cannot store StringType data. Because udfFunc returns strings, the "otherwise" branch needs to be cast to StringType instead of DoubleType.

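import org.apache.spark.sql.functions
import org.apache.spark.sql.functions.{col, lit, when}
import org.apache.spark.sql.types.StringType

val resultDs = df
  .withColumn("value1",
    when(col("code") === lit("rate"), functions.callUDF("udfFunc", col("value1")))
      .otherwise(col("value1").cast(StringType)) // Should be StringType, not DoubleType
  )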

// The same fix, keeping the isin condition from the question
val resultDs = df
  .withColumn("value1",
    when(col("code").isin("rate"), functions.callUDF("udfFunc", col("value1")))
      .otherwise(col("value1").cast(StringType)) // Modified to StringType
  )
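
To try the fix end to end, the following is a minimal sketch that assumes Spark 2.4.x is on the classpath. The question does not show how udfFunc is implemented, so the UDF registered here is a hypothetical stand-in built from the 11->a ... 14->d mapping listed above. The local session settings and the app name are also assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{callUDF, col, when}
import org.apache.spark.sql.types.StringType

// Assumption: a local session; getOrCreate() reuses an active session if one exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("otherwise-demo") // hypothetical app name
  .getOrCreate()
import spark.implicits._

// Sample data from the question.
val df = List(
  ("20", "score", "school", "2018-03-31", 14, 12),
  ("21", "score", "school", "2018-03-31", 13, 13),
  ("22", "rate", "school", "2018-03-31", 11, 14),
  ("21", "rate", "school", "2018-03-31", 13, 12)
).toDF("id", "code", "entity", "date", "value1", "value2")

// Hypothetical stand-in for the question's udfFunc: Int -> String lookup,
// returning null for values that are not in the mapping.
val mapping = Map(11 -> "a", 12 -> "b", 13 -> "c", 14 -> "d")
spark.udf.register("udfFunc", (v: Int) => mapping.get(v).orNull)

// Both branches now yield StringType, so value1 has one consistent type.
val resultDs = df.withColumn(
  "value1",
  when(col("code") === "rate", callUDF("udfFunc", col("value1")))
    .otherwise(col("value1").cast(StringType))
)

resultDs.printSchema() // value1 should be reported as string
resultDs.show()
```

With both branches cast to StringType, the "score" rows keep their original numbers as strings, and the "rate" rows carry the mapped letters.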

