Problem description
In pandas I have a data frame consisting of two groups, with several samples in each group. Each group has an internal reference value that I want to subtract from all the sample values within that group.
import io

import pandas as pd

s = """Group sample value
group1 ref1 18.1
group1 smp1 NaN
group1 smp2 20.3
group1 smp3 30.0
group2 ref2 16.1
group2 smp4 29.2
group2 smp5 19.9
group2 smp6 28.9
"""
# Parse the whitespace-delimited text, then index by group and sample name
df = pd.read_csv(io.StringIO(s), sep=r'\s+')
df = df.set_index(['Group', 'sample'])
df
Out[82]:
               value
Group  sample
group1 ref1     18.1
       smp1      NaN
       smp2     20.3
       smp3     30.0
group2 ref2     16.1
       smp4     29.2
       smp5     19.9
       smp6     28.9
What I want to do is add a new column in which the reference (ref) has been subtracted from all samples (smp) within each respective group. Like this:
               value  deltaValue
Group  sample
group1 ref1     18.1         0.0
       smp1      NaN         NaN
       smp2     20.3         2.2
       smp3     30.0        11.9
group2 ref2     16.1         0.0
       smp4     29.2        13.1
       smp5     19.9         3.8
       smp6     28.9        12.8
Does anyone know how this can be done? Thanks!
Recommended answer
Here's one way to do it without loops.
First, create a function, func, that finds the sample whose name starts with ref and then computes the delta value.
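In [33]: def func(grp):
    ...:     # Reference value: the row whose sample name starts with 'ref'
    ...:     # (.loc replaces .ix, which was removed from pandas)
    ...:     ref = grp.loc[grp['sample'].str.startswith('ref'), 'value']
    ...:     grp = grp.copy()  # avoid modifying the group passed in
    ...:     grp['delta'] = grp['value'] - ref.values[0]
    ...:     return grp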
Then apply func to each group of dff.groupby('Group'):
In [34]: dff.groupby('Group', group_keys=False).apply(func)
Out[34]:
    Group sample  value  delta
0  group1   ref1   18.1    0.0
1  group1   smp1    NaN    NaN
2  group1   smp2   20.3    2.2
3  group1   smp3   30.0   11.9
4  group2   ref2   16.1    0.0
5  group2   smp4   29.2   13.1
6  group2   smp5   19.9    3.8
7  group2   smp6   28.9   12.8
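For comparison, the same column can also be computed without apply, by looking up each row's reference value through its group label. This is only a minimal sketch and not part of the original answer. It assumes that every group has exactly one row whose sample name starts with ref, and the names is_ref and ref_by_group are introduced here purely for illustration.

```python
import io

import pandas as pd

s = """Group sample value
group1 ref1 18.1
group1 smp1 NaN
group1 smp2 20.3
group1 smp3 30.0
group2 ref2 16.1
group2 smp4 29.2
group2 smp5 19.9
group2 smp6 28.9
"""
# Flat frame with Group, sample and value as ordinary columns
dff = pd.read_csv(io.StringIO(s), sep=r"\s+")

# Boolean mask marking the reference row of each group
# (assumption: reference sample names start with 'ref')
is_ref = dff["sample"].str.startswith("ref")

# One reference value per group, indexed by group name
# (assumption: exactly one reference row per group)
ref_by_group = dff.loc[is_ref].set_index("Group")["value"]

# Look up each row's reference via its group label and subtract it
dff["delta"] = dff["value"] - dff["Group"].map(ref_by_group)

print(dff)
```

With this approach a group that has no reference row ends up with NaN in delta rather than raising an error, whereas ref.values[0] in func would fail for such a group.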
Note that dff here is the data frame with a flat index, which can be created with dff = df.reset_index():
In [35]: dff
Out[35]:
    Group sample  value
0  group1   ref1   18.1
1  group1   smp1    NaN
2  group1   smp2   20.3
3  group1   smp3   30.0
4  group2   ref2   16.1
5  group2   smp4   29.2
6  group2   smp5   19.9
7  group2   smp6   28.9
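If the MultiIndex layout from the question should be kept, so that the result looks like the requested table, something along these lines may work. It is a sketch rather than the answer's own method. It assumes the index levels are named Group and sample, that each group has exactly one ref row, and it relies on the level argument of Series.sub to broadcast the per-group reference across the Group level.

```python
import io

import pandas as pd

s = """Group sample value
group1 ref1 18.1
group1 smp1 NaN
group1 smp2 20.3
group1 smp3 30.0
group2 ref2 16.1
group2 smp4 29.2
group2 smp5 19.9
group2 smp6 28.9
"""
df = pd.read_csv(io.StringIO(s), sep=r"\s+").set_index(["Group", "sample"])

# Mask of reference rows, taken from the 'sample' index level
# (assumption: reference sample names start with 'ref')
is_ref = df.index.get_level_values("sample").str.startswith("ref")

# Per-group reference values, indexed by Group only
ref = df.loc[is_ref, "value"].droplevel("sample")

# Subtract each group's reference, matching on the 'Group' index level
df["deltaValue"] = df["value"].sub(ref, level="Group")

print(df)
```

Because the subtraction matches on index labels, no reset_index() round trip is needed and the original row order and index are preserved.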