Problem description
After all the processing, I was able to create the dataframe below. The only problem is that the year is incorrect. The dates are in decreasing order for each Location, so after 2015-01-15 the next date should be 2014-12-15, not 2015-12-15.
+----------+------------+-------+
| Location | Date       | Value |
+----------+------------+-------+
| India    | 2015-03-15 |  -200 |
| India    | 2015-02-15 |   140 |
| India    | 2015-01-15 |   155 |
| India    | 2015-12-15 |    85 |
| India    | 2015-11-15 |    45 |
| China    | 2015-03-15 |   199 |
| China    | 2015-02-15 |   164 |
| China    | 2015-01-15 |   209 |
| China    | 2015-12-15 |    24 |
| China    | 2015-11-15 |    11 |
| Russia   | 2015-03-15 |    48 |
| Russia   | 2015-02-15 |   104 |
| Russia   | 2015-01-15 |   106 |
| Russia   | 2015-12-15 |   -20 |
| Russia   | 2015-11-15 |    10 |
+----------+------------+-------+
Recommended answer
Making the strong assumption that these are monthly dates falling on the 15th of every month, and that the first date for a given Location is correct, we can step backwards one month at a time within each Location.
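Before relying on that assumption, it may be worth checking it against the data. The following is a minimal sketch, not part of the original answer: it uses a shortened two-Location frame for illustration, and the names `steps_ok`, `all_on_15th` and `assumption_holds` are hypothetical.

```python
import pandas as pd

# Illustrative frame with the same (incorrect-year) dates as in the question.
df = pd.DataFrame({
    'Location': ['India'] * 5 + ['China'] * 5,
    'Date': pd.to_datetime(['2015-03-15', '2015-02-15', '2015-01-15',
                            '2015-12-15', '2015-11-15'] * 2),
})

# Month number of each row and of the previous row within the same Location.
month = df['Date'].dt.month
prev_month = month.groupby(df['Location'], sort=False).shift()

# Stepping back one month means (previous - current) % 12 == 1.
# The first row of each Location has no predecessor and is accepted as-is.
steps_ok = ((prev_month - month) % 12 == 1) | prev_month.isna()

# Every date should also fall on the 15th.
all_on_15th = (df['Date'].dt.day == 15).all()

assumption_holds = bool(steps_ok.all() and all_on_15th)
print(assumption_holds)
```

If the check fails, the step-back approach would overwrite dates that do not follow the monthly pattern, so the offending rows should be inspected first.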
import pandas as pd

# Create original dataframe.
df = pd.DataFrame({'Location': ['India'] * 5 + ['China'] * 5 + ['Russia'] * 5,
                   'Date': ['2015-03-15', '2015-02-15', '2015-01-15', '2015-12-15', '2015-11-15'] * 3,
                   'Value': [-200, 140, 155, 85, 45, 199, 164, 209, 24, 11, 48, 104, 106, -20, 10]})[
    ['Location', 'Date', 'Value']
]
# Convert dates to pandas Timestamps.
df['Date'] = pd.DatetimeIndex(df['Date'])
# sort=False keeps the groups in order of first appearance, so they line up
# with the rows (this assumes each Location's rows are contiguous).
gb = df.groupby(['Location'], sort=False)['Date']
# For each Location, start from its first month and step back one month per row.
df['Date'] = [
    str(first_period - months) + '-15'
    for location_months, first_period in zip(
        gb.count(), gb.first().apply(lambda date: pd.Period(date, 'M')))
    for months in range(location_months)
]
>>> df
Location Date Value
0 India 2015-03-15 -200
1 India 2015-02-15 140
2 India 2015-01-15 155
3 India 2014-12-15 85
4 India 2014-11-15 45
5 China 2015-03-15 199
6 China 2015-02-15 164
7 China 2015-01-15 209
8 China 2014-12-15 24
9 China 2014-11-15 11
10 Russia 2015-03-15 48
11 Russia 2015-02-15 104
12 Russia 2015-01-15 106
13 Russia 2014-12-15 -20
14 Russia 2014-11-15 10
The final dates are strings, which you may wish to convert back to Timestamps via:
df['Date'] = pd.DatetimeIndex(df['Date'])
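If the round trip through strings is unwanted, the same step-back idea can be expressed with Timestamps throughout. This is a minimal sketch under the same monthly assumption, not the answer's original method: it combines `transform('first')`, `cumcount()` and `pd.DateOffset`, and uses a shortened two-Location frame for illustration.

```python
import pandas as pd

# Illustrative frame with the same (incorrect-year) dates as in the question.
df = pd.DataFrame({
    'Location': ['India'] * 5 + ['China'] * 5,
    'Date': pd.to_datetime(['2015-03-15', '2015-02-15', '2015-01-15',
                            '2015-12-15', '2015-11-15'] * 2),
})

# First (assumed correct) date of each Location, repeated on every row.
first = df.groupby('Location')['Date'].transform('first')

# Position of each row within its Location: 0, 1, 2, ...
steps = df.groupby('Location').cumcount()

# Step back one month per row from the first date; the result stays datetime64.
df['Date'] = [d - pd.DateOffset(months=int(n)) for d, n in zip(first, steps)]
print(df)
```

Because `transform` and `cumcount` return values aligned with the original rows, this variant does not depend on the order in which groups are iterated.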