pandas: using diff with groupby
Question
I am using pandas (version 0.20.3) and I want to apply the diff method with groupby, but instead of a DataFrame, the result is an "underscore".
Here is the code:
import numpy as np
import pandas as pd
# creating the DataFrame
data = np.random.random(18).reshape(6,3)
indexes = ['B']*3 + ['A']*3
columns = ['x', 'y', 'z']
df = pd.DataFrame(data, index=indexes, columns=columns)
df.index.name = 'chain_id'
# Now I want to apply the diff method per chain_id group
df.groupby('chain_id').diff()
And the result is an underscore! Note that df.loc['A'].diff() and df.loc['B'].diff() return the expected results, so I don't understand why it wouldn't work with groupby.
Recommended answer
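IIUC, your error is: cannot reindex from a duplicate axis. The chain_id index has repeated labels, so move it into a column before grouping and put it back afterwards: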
df.reset_index().groupby('chain_id').diff().set_index(df.index)
Out[859]:
                 x         y         z
chain_id
B              NaN       NaN       NaN
B        -0.468771  0.192558 -0.443570
B         0.323697  0.288441  0.441060
A              NaN       NaN       NaN
A        -0.198785  0.056766  0.081513
A         0.138780  0.563841  0.635097
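To check that the workaround gives the same numbers as the per-label diffs mentioned in the question, here is a minimal self-contained sketch. The fixed seed and the names rng, result and expected are assumptions added for illustration; they are not part of the original post, and the behaviour of the plain groupby call may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Same layout as in the question; the seed is an assumption, used only
# so that the example is repeatable.
rng = np.random.RandomState(0)
df = pd.DataFrame(
    rng.random_sample((6, 3)),
    index=pd.Index(['B'] * 3 + ['A'] * 3, name='chain_id'),
    columns=['x', 'y', 'z'],
)

# Workaround from the answer: turn the duplicated index into a column,
# diff within each group, then restore the original index labels.
result = df.reset_index().groupby('chain_id').diff().set_index(df.index)

# Cross-check against the per-label diffs that the question says work.
expected = pd.concat([df.loc['B'].diff(), df.loc['A'].diff()])

print(result)
print(np.allclose(result.values, expected.values, equal_nan=True))
```

The first row of each group is NaN because diff has no previous row to subtract within that group.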