Problem description
I have the following list of N elements:
val check = List ("a","b","c","d")
where N can be any number of elements.
I have a dataframe with only one column, called "value". Based on the contents of value, I need to create N columns, using the elements of the list as column names and substring(x,y) of value as the column contents.
I have tried every approach I could think of, such as withColumn and selectExpr, but nothing works. Please read substring(X,Y) with X and Y standing for numbers that come from some metadata.
Below are the different versions of the code I tried; none of them worked.
import org.apache.spark.sql.functions.udf

val df = sqlContext.read.text("xxxxx")
// X and Y are placeholders for offsets taken from metadata
val coder: (String => String) = (arg: String) => {
  val param = "NULL"
  if (arg.length() > Y)
    arg.substring(X, Y)
  else
    param
}
val sqlfunc = udf(coder)
val check = List("a", "b", "c", "d")
for (name <- check) { val testDF2 = df.withColumn(name, sqlfunc(df("value"))) }
testDF2 has only the last column, d; the other columns (a, b, c) are not added to the table.
var z: Array[String] = new Array[String](check.size)
var i = 0
for (x <- check) {
  if ((i + 1) == check.size) {
    z(i) = s""""substring(a.value,X,Y) as $x""""
    i = i + 1
  } else {
    z(i) = s""""substring(a.value,X,Y) as $x","""
    i = i + 1
  }
}
val zz = z.mkString(" ")
df.alias("a").selectExpr(s"$zz").show()
This throws an error.
Please help: how can I add columns to a DataFrame dynamically, with the column names taken from the elements of a List?
I am expecting a DataFrame like the one below:
-----------------------------------------
| value | a   | b   | c   | d   | ... N |
-----------------------------------------
| xxx   | xxx | xxx | xxx | xxx | xxx   |
| xxx   | xxx | xxx | xxx | xxx | xxx   |
| xxx   | xxx | xxx | xxx | xxx | xxx   |
-----------------------------------------
Recommended answer
You can dynamically add columns from your list using, for instance, this answer by user6910411 to a similar question (see their full answer for more possibilities):
// df is the DataFrame from the question; it is the starting value of the fold
val newDF = check.foldLeft(df)((df, name) => df.withColumn(name, $"value"))
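The for loop in the question keeps only d because every iteration calls withColumn on the original df, and testDF2 is a val local to the loop body, so nothing carries over from one iteration to the next. foldLeft instead passes the result of each step into the next one. The sketch below is one way this might look with a different substring per column. The offsets table, its values, the object and method names, and the use of SparkSession (Spark 2.x or later) are assumptions made for illustration and are not from the original answer. Note that Spark's built-in substring takes a 1-based start position and a length, unlike String.substring(begin, end) in the question's UDF.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, substring}

object DynamicColumns {
  // Hypothetical metadata: (column name, 1-based start position, length).
  val offsets: Seq[(String, Int, Int)] = Seq(
    ("a", 1, 2),
    ("b", 3, 2),
    ("c", 5, 2),
    ("d", 7, 2)
  )

  // Thread the DataFrame through the list: each step adds one column
  // to the result of the previous step.
  def addWithFold(df: DataFrame): DataFrame =
    offsets.foldLeft(df) { case (acc, (name, pos, len)) =>
      acc.withColumn(name, substring(col("value"), pos, len))
    }

  // Alternative: build every column expression first, then select once.
  def addWithSelect(df: DataFrame): DataFrame = {
    val extra = offsets.map { case (name, pos, len) =>
      substring(col("value"), pos, len).as(name)
    }
    df.select(col("value") +: extra: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("dynamic-columns")
      .getOrCreate()
    import spark.implicits._

    // Sample rows standing in for the text file read in the question.
    val df = Seq("abcdefgh", "12345678").toDF("value")
    addWithFold(df).show()
    addWithSelect(df).show()
    spark.stop()
  }
}
```

Both functions should produce the same columns. The select version builds a single projection, which may be preferable when the list is long, since each withColumn call adds another projection to the query plan.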