How to add a new column to a Julia DataFrame

Question

Let's say I have a DataFrame and a vector such as:

using DataFrames
dataframe = DataFrame(Data1 = rand(10), Data2 = rand(10));
Data3 = rand(10)

I want to add Data3 to the dataframe so that it looks like this:

    Data1       Data2     Data3
    Float64     Float64   Float64
1   0.757345    0.903133  0.502133
2   0.294749    0.327502  0.323133
3   0.156397    0.427323  0.123133

In Python, I can just write df["Data3"] = Data3 to add a column, but with a Julia DataFrame, df[!, Data3] = Data3 returns:

  • MethodError: no method matching setindex!(::DataFrame, ::Vector{Float64}, ::typeof(!), ::Vector{Float64})

I've also checked this solution, but it gave me:

  • ArgumentError: syntax df[column] is not supported use df[!, column] instead.

How can I add a vector as a new column in a Julia DataFrame?

Recommended answer

You were almost there. You are looking for any one of the following:

dataframe[!, :Data3] = Data3

dataframe[!, "Data3"] = Data3

dataframe.Data3 = Data3

Note that I'm using a Symbol or a String here: [!, :Data3] is an indexing operation, so it needs identifiers for the row index (!) and the column index (:Data3) where you want the data to be stored, not the data itself.
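As a minimal sketch of the three forms above, assuming a recent DataFrames.jl release in which both Symbol and String column identifiers are accepted, the following builds the frame from the question and stores the vector under the name Data3 in each of the three ways. The names dataframe and Data3 are the ones from the question.

```julia
using DataFrames

dataframe = DataFrame(Data1 = rand(10), Data2 = rand(10))
Data3 = rand(10)

# Three ways to store the vector under the column name Data3.
# Each one targets the same column, so later lines overwrite earlier ones.
dataframe[!, :Data3] = Data3     # Symbol as the column identifier
dataframe[!, "Data3"] = Data3    # String as the column identifier
dataframe.Data3 = Data3          # property syntax

names(dataframe)   # column names, returned as strings
size(dataframe)    # (number of rows, number of columns)
```

Any one of the three assignments is enough on its own; they are repeated here only to show that they address the same column.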

You are binding the actual data (a 10-element vector of random numbers) to the variable Data3, so calling dataframe[!, Data3] with the variable Data3 (rather than a Symbol or String with the value "Data3") is equivalent to doing:

dataframe[!, rand(10)]

This means "I want to access all rows (!) of a DataFrame, and 10 columns identified by 10 random numbers". Indexing by a random floating-point number doesn't make much sense (what should dataframe[!, 0.532] return?), which is why you get the error you see: setindex! has no method that accepts a Vector{Float64} as the column index.
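The failure described above can be reproduced with a small sketch. It assumes the same setup as the question and only checks that an exception is raised; the question reports a MethodError, but the exact exception type may depend on the DataFrames.jl version installed. The variable err is introduced here purely for illustration.

```julia
using DataFrames

dataframe = DataFrame(Data1 = rand(10), Data2 = rand(10))
Data3 = rand(10)

# Passing the data itself as the column index: this is expected to throw.
err = try
    dataframe[!, Data3] = Data3
    nothing
catch e
    e
end

err isa Exception   # the question reports a MethodError at this point

# Passing an identifier (here a Symbol) as the column index works.
dataframe[!, :Data3] = Data3
```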

Regarding the Discourse thread you linked: it is very old, and the df["col"] syntax was deprecated a long time ago. The basic indexing concept in DataFrames is that a DataFrame is a two-dimensional data structure, and as such should be indexed as df[row_indices, col_indices].
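To make the df[row_indices, col_indices] pattern concrete, here is a short sketch of a few common index combinations. It is not an exhaustive list, and the frame df is a stand-in with the same shape as the one in the question.

```julia
using DataFrames

df = DataFrame(Data1 = rand(10), Data2 = rand(10))

df[1:3, :Data1]            # rows 1 to 3 of one column, as a Vector
df[:, [:Data1, :Data2]]    # all rows of two columns, as a new DataFrame (copied)
df[!, :Data1]              # all rows of one column, without copying
df[2, :Data2]              # a single cell
```

The difference between : and ! as the row index is that : copies the selected data, while ! refers to the stored column directly.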

DataFrames supports a variety of ways of specifying valid indices. They are too numerous to go into detail here, but they are listed in the indexing section of the DataFrames.jl documentation.


06-15 19:14