Problem description
What is an efficient equivalent of R's scale function in pandas? For example, how would
newdf <- scale(df)
be written in pandas? Is there an elegant way using transform?
Recommended answer
Scaling is very common in machine learning tasks, so it is implemented in scikit-learn's preprocessing module. You can pass a pandas DataFrame directly to its scale function.
The only "problem" is that the returned object is no longer a DataFrame but a NumPy array. That is usually not a real issue if you are going to pass it to a machine learning model anyway (e.g. SVM or logistic regression). If you want to keep the DataFrame, a small workaround is needed:
from sklearn.preprocessing import scale
from pandas import DataFrame
newdf = DataFrame(scale(df), index=df.index, columns=df.columns)
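Since the question asks for a pandas-only approach using transform, here is a minimal sketch of one, assuming every column of df is numeric. The sample data is hypothetical. Note that pandas' std() defaults to the sample standard deviation (ddof=1), which is the divisor R's scale uses, while scikit-learn's scale divides by the population standard deviation (ddof=0), so the two approaches differ slightly on small data sets.

```python
import pandas as pd

# Hypothetical example data; any all-numeric DataFrame works the same way.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                   "b": [10.0, 20.0, 40.0, 80.0]})

# Center each column and divide by its sample standard deviation.
# pandas' std() defaults to ddof=1, the same n - 1 divisor R's scale() uses.
newdf = (df - df.mean()) / df.std()

# The same computation expressed column by column with transform.
newdf_t = df.transform(lambda col: (col - col.mean()) / col.std())

# To reproduce sklearn.preprocessing.scale instead, divide by the
# population standard deviation (ddof=0).
newdf_pop = (df - df.mean()) / df.std(ddof=0)
```

Both variants return a DataFrame with the original index and column labels, so no reconstruction step is needed.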
See also here.