Problem description
I have a Pandas DataFrame with customer refund reasons. It contains these example data rows:
**case_type** **claim_type**
1 service service
2 service service
3 chargeback service
4 chargeback local_charges
5 service supplier_service
6 chargeback service
7 chargeback service
8 chargeback service
9 chargeback service
10 chargeback service
11 service service_not_used
12 service service_not_used
I would like to compare the customer's reason with a labeled reason. That part is no problem, but I would also like to see the total number of records in each group (customer reason).
case_claim_type = df[["case_type", "claim_type"]]
case_claim_type.groupby(by=["case_type", "claim_type"])["case_type"].count()
Which gives me this output, for example:
**case_type** **claim_type**
service service 2
supplier_service 1
service_not_used 2
chargeback service 6
local_charges 1
I would also like to have the sum of the output per case_type. Something like:
**case_type** **claim_type**
service service 2
supplier_service 1
service_not_used 2
total: 5
chargeback service 6
local_charges 1
total: 7
It doesn't necessarily have to be in this last output format; a column with the (aggregated) totals per case_type is also fine.
Recommended answer
Where:
import pandas as pd
df = pd.DataFrame({'case_type': ['Service']*20 + ['chargeback']*9,
                   'claim_type': ['service']*5 + ['local_charges']*5 + ['service_not_used']*5 + ['supplier_service']*5 + ['service']*8 + ['local_charges']})
df_out = df.groupby(by=["case_type", "claim_type"])["case_type"].count()
Let's use pd.concat, a level-wise sum, and assign. Note that sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent:
(pd.concat([df_out.to_frame(),
            df_out.groupby(level=0).sum().to_frame()
                  .assign(claim_type="total")
                  .set_index('claim_type', append=True)])
   .sort_index())
Output:
case_type
case_type claim_type
Service local_charges 5
service 5
service_not_used 5
supplier_service 5
total 20
chargeback local_charges 1
service 8
total 9
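Since the question says a column with the per-case_type totals would also be fine, here is a minimal sketch of that alternative. It is not part of the answer above: it rebuilds the twelve sample rows from the question, and the names `counts`, `count`, and `case_total` are chosen here purely for illustration.

```python
import pandas as pd

# Sample data from the question (12 rows).
df = pd.DataFrame({
    "case_type": ["service", "service", "chargeback", "chargeback", "service",
                  "chargeback", "chargeback", "chargeback", "chargeback",
                  "chargeback", "service", "service"],
    "claim_type": ["service", "service", "service", "local_charges",
                   "supplier_service", "service", "service", "service",
                   "service", "service", "service_not_used", "service_not_used"],
})

# Count rows per (case_type, claim_type) pair and flatten to a regular frame.
# "count" is an illustrative column name.
counts = (
    df.groupby(["case_type", "claim_type"])
      .size()
      .reset_index(name="count")
)

# Broadcast each case_type's total back onto every row of that group.
# "case_total" is an illustrative column name.
counts["case_total"] = counts.groupby("case_type")["count"].transform("sum")

print(counts)
```

With this approach the totals sit alongside each row instead of appearing as extra "total" rows, which may be easier to work with if the result is used for further calculations such as per-group percentages.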