Question
I was wondering whether it is possible to change the position of a column in a DataFrame, that is, to change the schema.
Specifically, if I have a DataFrame like [field1, field2, field3], I would like to get [field1, field3, field2].
I can't share any code. Imagine we are working with a DataFrame with one hundred columns; after some joins and transformations, some of these columns end up out of place relative to the schema of the destination table.
How can I move one or several columns, i.e. how can I change the schema?
Recommended answer
You can get the column names, reorder them however you want, and then use select on the original DataFrame to get a new one with the columns in that order:
import org.apache.spark.sql.DataFrame

// Current column order of the existing DataFrame
val columns: Array[String] = dataFrame.columns
val reorderedColumnNames: Array[String] = ??? // do the reordering you want
val result: DataFrame = dataFrame.select(reorderedColumnNames.head, reorderedColumnNames.tail: _*)
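
The reordering step itself is left open above. As a minimal sketch of what it might look like, the helpers below work on plain column-name sequences, so they run without Spark. The names ColumnReorder, moveAfter and alignTo are hypothetical and not part of the Spark API, and the sketch assumes column names are unique, which is not guaranteed after a join.

```scala
object ColumnReorder {
  // Move `name` so that it sits immediately after `anchor`.
  // All other columns keep their relative order.
  // Assumption: column names are unique.
  def moveAfter(columns: Seq[String], name: String, anchor: String): Seq[String] = {
    require(columns.contains(name), s"unknown column: $name")
    require(columns.contains(anchor), s"unknown column: $anchor")
    require(name != anchor, "name and anchor must differ")
    val without = columns.filterNot(_ == name)
    val (before, after) = without.splitAt(without.indexOf(anchor) + 1)
    (before :+ name) ++ after
  }

  // Return the column names in the order of `target` (for example the
  // destination table's columns), failing fast if the two sets differ.
  def alignTo(columns: Seq[String], target: Seq[String]): Seq[String] = {
    val missing = target.filterNot(columns.contains)
    val extra = columns.filterNot(target.contains)
    require(missing.isEmpty, s"columns missing from the DataFrame: ${missing.mkString(", ")}")
    require(extra.isEmpty, s"columns not in the target: ${extra.mkString(", ")}")
    target
  }
}
```

With the question's example, ColumnReorder.moveAfter(dataFrame.columns.toSeq, "field2", "field3") gives the order field1, field3, field2, which can then be passed to select as in the snippet above. For the hundred-column case, alignTo with the destination table's column names is probably less error-prone than moving columns one at a time.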