Spark withColumn: adding a column from a non-Column type variable

Question

How can I add a column to a data frame from a variable value?

I know that I can create a data frame using .toDF(colName) and that .withColumn is the method to add the column. But, when I try the following, I get a type mismatch error:

val myList = List(1,2,3)
val myArray = Array(1,2,3)

myList.toDF("myList")
  .withColumn("myArray", myArray)

This compile error is on myArray within the .withColumn call. How can I convert it from an Array[Int] to a Column type?

Recommended answer

The error message tells you exactly what is wrong: withColumn() expects a Column as its second argument, so a plain Scala value has to be wrapped first (for example with lit() or typedLit()).

Try:

// toDF needs spark.implicits._ in scope (spark-shell imports it automatically)
import org.apache.spark.sql.functions.typedLit

val myList = List(1,2,3)
val myArray = Array(1,2,3)

myList.toDF("myList")
  .withColumn("myArray", typedLit(myArray))

:)
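
For anyone running this outside spark-shell, here is a minimal self-contained sketch of the same idea. The object name `AddLiteralArrayColumn`, the helper `withLiteralArray`, and the local-mode session settings are made up for illustration and are not part of the original answer. It assumes Spark 2.2 or later, where `typedLit` is available. It also shows an alternative that builds the array column from individual `lit()` values.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, lit, typedLit}

object AddLiteralArrayColumn {
  // Hypothetical helper: adds the same literal array to every row of df.
  def withLiteralArray(df: DataFrame, name: String, values: Array[Int]): DataFrame =
    df.withColumn(name, typedLit(values))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; adjust master/appName for a real job.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("AddLiteralArrayColumn")
      .getOrCreate()
    import spark.implicits._ // provides toDF on local collections

    val myList = List(1, 2, 3)
    val myArray = Array(1, 2, 3)

    // Variant 1: wrap the whole array in a single literal Column.
    val viaTypedLit = withLiteralArray(myList.toDF("myList"), "myArray", myArray)

    // Variant 2: turn each element into a literal and combine them with array().
    val viaArrayOfLits = myList.toDF("myList")
      .withColumn("myArray", array(myArray.map(lit(_)): _*))

    viaTypedLit.printSchema()
    viaTypedLit.show()
    viaArrayOfLits.show()

    spark.stop()
  }
}
```

In both variants every row receives the same array, because the value is a literal rather than data that lines up row by row with `myList`.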

10-23 10:23