This post covers how to add entries to rows so that every date has the same set of rows.
Problem description
I have a csv file with dates, repair_id, number of onsite repairs and number of offsite repairs, so my data looks like this:
data repair_id num_onsite num_offsite
2016-02-01 A 3 0
2016-02-01 B 2 1
2016-02-01 D 0 4
2016-02-02 A 1 3
2016-02-02 C 1 1
2016-02-02 E 0 6
...
2016-02-14 A 1 3
2016-02-14 B 0 4
2016-02-14 D 2 0
2016-02-14 E 3 0
There are 5 different repair_id values, namely A, B, C, D and E. If a repairman (repair_id) had no work on a given date, they do not appear in the csv file for that date. I would like to change that by including them with a value of 0 for num_onsite and num_offsite, so that my table would resemble:
data repair_id num_onsite num_offsite
2016-02-01 A 3 0
2016-02-01 B 2 1
2016-02-01 C 0 0 # added
2016-02-01 D 0 4
2016-02-01 E 0 0 # added
2016-02-02 A 1 3
2016-02-02 B 0 0 # added
2016-02-02 C 1 1
2016-02-02 D 0 0 # added
2016-02-02 E 0 6
...
2016-02-14 A 1 3
2016-02-14 B 0 4
2016-02-14 C 0 0 # added
2016-02-14 D 2 0
2016-02-14 E 3 0
I have looked into this, but I wasn't able to get it to output properly.
Recommended answer
df.set_index(["data","repair_id"]).unstack(fill_value=0).stack().reset_index()
data repair_id num_onsite num_offsite
0 2016-02-01 A 3.0 0.0
1 2016-02-01 B 2.0 1.0
2 2016-02-01 C 0.0 0.0
3 2016-02-01 D 0.0 4.0
4 2016-02-01 E 0.0 0.0
5 2016-02-02 A 1.0 3.0
6 2016-02-02 B 0.0 0.0
7 2016-02-02 C 1.0 1.0
8 2016-02-02 D 0.0 0.0
9 2016-02-02 E 0.0 6.0
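For anyone who wants to try this without the csv file, here is a minimal runnable sketch of the one-liner above. The sample frame and the function names fill_by_unstack and fill_by_reindex are made up for illustration and are not part of the original answer. The second function is an alternative that assumes the full list of repair_id values is known in advance.

```python
import pandas as pd

# Small sample mirroring the question's layout (the date column is named "data" there).
df = pd.DataFrame(
    {
        "data": ["2016-02-01", "2016-02-01", "2016-02-01",
                 "2016-02-02", "2016-02-02", "2016-02-02"],
        "repair_id": ["A", "B", "D", "A", "C", "E"],
        "num_onsite": [3, 2, 0, 1, 1, 0],
        "num_offsite": [0, 1, 4, 3, 1, 6],
    }
)


def fill_by_unstack(frame):
    """Pivot repair_id into columns (filling gaps with 0), then pivot back to rows."""
    return (
        frame.set_index(["data", "repair_id"])
        .unstack(fill_value=0)
        .stack()
        .reset_index()
    )


def fill_by_reindex(frame, repair_ids):
    """Reindex onto every (date, repair_id) pair, filling the new rows with 0.

    repair_ids is an assumption: the caller supplies the complete list of ids.
    """
    full_index = pd.MultiIndex.from_product(
        [sorted(frame["data"].unique()), repair_ids],
        names=["data", "repair_id"],
    )
    return (
        frame.set_index(["data", "repair_id"])
        .reindex(full_index, fill_value=0)
        .reset_index()
    )


filled = fill_by_unstack(df)
filled_alt = fill_by_reindex(df, ["A", "B", "C", "D", "E"])
print(filled)
```

The unstack/stack version can only add a repair_id that appears at least once somewhere in the data. If an id could be absent from the whole file, the reindex variant with an explicit id list is probably the safer choice.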