我有一个熊猫数据框,其中包含年月数据列(yyyymm)。我正计划将数据插值为每日和每周值。这是我的df。

df:
    201301      201302      201303      ...    201709       201710
a   0.747711    0.793101    0.771819    ...    0.818161     0.812522
b   0.776537    0.759745    0.733673    ...    0.757496     0.765181
c   0.801699    0.847655    0.796586    ...    0.784537     0.763551
d   0.797942    0.687899    0.729911    ...    0.819887     0.772395
e   0.777472    0.799676    0.782947    ...    0.804533     0.791759
f   0.780933    0.750774    0.781056    ...    0.790846     0.773705
g   2.071699    2.261739    2.126915    ...    1.891780     2.098914


如您所见,我的df位于蒙特列数据中,我希望将其更改为每日值。我打算使用线性函数。这是例子。

# (201302 - 201301)/31 (since January 2013 has 31 days)
a = (0.793101-0.747711)/31
# now a is the daily increasing (or decresing depends on values) value for a day.
# 2013-01-01 value woud be
0.747711
# 2013-01-02 value woud be
0.747711 + a
# 2013-01-03 value woud be
0.747711 + (a*2)
# last day of January would be
0.747711 + (a*30)
# first day of Feb would be
0.747711 + (a*31) which is 0.793101 (201302 value)


所以我的df_daily从2013年到2017年10月(第一天)每天都有,其值与上面一样。我非常忙于处理时间戳,因此,如果有任何方法可以将我的值从月值转换为日值,那将是很棒的。谢谢!

哦,请让我知道我的问题是否令人困惑...

最佳答案

首先将datetimes的列转换为to_datetime,然后将reindexNaN转换为缺少的日期,最后interpolate

df.columns = pd.to_datetime(df.columns, format='%Y%m')
#by first and last values of columns
rng = pd.date_range(df.columns[0], df.columns[-1])
#alternatively min and max of columns
#rng = pd.date_range(df.columns.min(), df.columns.max())

df = df.reindex(rng, axis=1).interpolate(axis=1)


验证解决方案:

a = (0.793101-0.747711)/31
print (0.747711 + a)
print (0.747711 + a*2)
print (0.747711 + a*3)
0.7491751935483871
0.7506393870967742
0.7521035806451613




print (df)
  2013-01-01  2013-01-02  2013-01-03  2013-01-04  2013-01-05  2013-01-06  \
a    0.747711    0.749175    0.750639    0.752104    0.753568    0.755032
b    0.776537    0.775995    0.775454    0.774912    0.774370    0.773829
c    0.801699    0.803181    0.804664    0.806146    0.807629    0.809111
d    0.797942    0.794392    0.790842    0.787293    0.783743    0.780193
e    0.777472    0.778188    0.778905    0.779621    0.780337    0.781053
f    0.780933    0.779960    0.778987    0.778014    0.777042    0.776069
g    2.071699    2.077829    2.083960    2.090090    2.096220    2.102351

   2013-01-07  2013-01-08  2013-01-09  2013-01-10     ...      2017-09-22  \
a    0.756496    0.757960    0.759425    0.760889     ...        0.814214
b    0.773287    0.772745    0.772204    0.771662     ...        0.762876
c    0.810594    0.812076    0.813559    0.815041     ...        0.769847
d    0.776643    0.773094    0.769544    0.765994     ...        0.786643
e    0.781770    0.782486    0.783202    0.783918     ...        0.795591
f    0.775096    0.774123    0.773150    0.772177     ...        0.778847
g    2.108481    2.114611    2.120742    2.126872     ...        2.036774

   2017-09-23  2017-09-24  2017-09-25  2017-09-26  2017-09-27  2017-09-28  \
a    0.814026    0.813838    0.813650    0.813462    0.813274    0.813086
b    0.763132    0.763388    0.763644    0.763900    0.764156    0.764413
c    0.769147    0.768448    0.767748    0.767049    0.766349    0.765650
d    0.785060    0.783476    0.781893    0.780310    0.778727    0.777144
e    0.795165    0.794740    0.794314    0.793888    0.793462    0.793036
f    0.778276    0.777705    0.777133    0.776562    0.775990    0.775419
g    2.043678    2.050583    2.057487    2.064392    2.071296    2.078201

   2017-09-29  2017-09-30  2017-10-01
a    0.812898    0.812710    0.812522
b    0.764669    0.764925    0.765181
c    0.764950    0.764251    0.763551
d    0.775561    0.773978    0.772395
e    0.792611    0.792185    0.791759
f    0.774848    0.774276    0.773705
g    2.085105    2.092010    2.098914

[7 rows x 1735 columns]

09-11 19:16