Problem description
I'm new to Python and struggling to manipulate data with the pandas library. I have a pandas DataFrame like this:
Year Value
0 91 1
1 93 4
2 94 7
3 95 10
4 98 13
I want to fill in the missing years by creating rows with empty values, like this:
Year Value
0 91 1
1 92 0
2 93 4
3 94 7
4 95 10
5 96 0
6 97 0
7 98 13
How do I do that in Python? (I want to do this so I can plot the values without skipping years.)
Recommended answer
I would create a new DataFrame that has Year as its index and covers the entire range of years you need. You can then assign the values from one DataFrame to the other, and the index ensures that the correct rows are matched. Years missing from the original come through as NaN by default, so I use fillna to set them to zero:
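import pandas as pd

# Original data, indexed by Year so that assignments align on it
df = pd.DataFrame({'Year':[91,93,94,95,98],'Value':[1,4,7,10,13]})
df.index = df.Year
# Frame covering every year from 91 to 98
df2 = pd.DataFrame({'Year':range(91,99), 'Value':0})
df2.index = df2.Year
# Assignment aligns on the index; years absent from df become NaN
df2.Value = df.Value
# Replace NaN with 0 and restore the integer dtype
df2 = df2.fillna(0).astype(int)
df2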
Value Year
Year
91 1 91
92 0 92
93 4 93
94 7 94
95 10 95
96 0 96
97 0 97
98 13 98
Finally, you can use reset_index if you don't want Year as your index:
df2.drop(columns='Year').reset_index()
Year Value
0 91 1
1 92 0
2 93 4
3 94 7
4 95 10
5 96 0
6 97 0
7 98 13
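
As a possible alternative to building a second DataFrame by hand, the same result can be sketched with reindex. This is not part of the original answer; it assumes the years are integers and that every year between the smallest and largest one should appear. The names all_years and filled are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Year': [91, 93, 94, 95, 98], 'Value': [1, 4, 7, 10, 13]})

# Every year from the first to the last one present in the data
# (assumption: Year holds integers).
all_years = range(df['Year'].min(), df['Year'].max() + 1)

filled = (
    df.set_index('Year')                # align rows on Year
      .reindex(all_years, fill_value=0) # add missing years with Value 0
      .rename_axis('Year')              # make sure the index is still named Year
      .reset_index()                    # turn Year back into a column
)
print(filled)
```

Passing fill_value=0 avoids the intermediate NaN values, so the Value column keeps its integer dtype. If you would rather leave gaps in the plot than draw zeros, drop fill_value and the missing years stay as NaN.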