I am currently having trouble adding up rows in the following DataFrame, which I built for the returns of six companies' stocks:
import pandas as pd

def importdata(data):
    returns = pd.read_excel(data)  # Imports the data from Excel
    returns_with_dates = returns.set_index('Dates')  # Sets the Dates as the df index
    return returns_with_dates
Output:
Out[345]:
Company 1 Company 2 Company 3 Company 4 Company 5 Company 6
Dates
1997-01-02 31.087620 3.094705 24.058686 31.694404 37.162890 13.462241
1997-01-03 31.896592 3.109631 22.423629 32.064378 37.537013 13.511706
1997-01-06 31.723241 3.184358 18.803148 32.681000 37.038183 13.684925
1997-01-07 31.781024 3.199380 19.503886 33.544272 37.038183 13.660193
1997-01-08 31.607673 3.169431 19.387096 32.927650 37.537013 13.585995
1997-01-09 31.492106 3.199380 19.737465 33.420948 37.038183 13.759214
1997-01-10 32.589996 3.184358 19.270307 34.284219 37.661721 13.858235
1997-01-13 32.416645 3.199380 19.153517 35.147491 38.035844 13.660193
1997-01-14 32.301077 3.184358 19.503886 35.517465 39.407629 13.783946
1997-01-15 32.127726 3.199380 19.387096 35.887438 38.409967 13.759214
1997-01-16 32.532212 3.229232 19.737465 36.257412 39.282921 13.635460
1997-01-17 33.167833 3.259180 20.087835 37.490657 39.033505 13.858235
1997-01-20 33.456751 3.229232 20.438204 35.640789 39.657044 14.377892
1997-01-21 33.225616 3.244158 20.671783 36.010763 40.779413 14.179940
1997-01-22 33.110049 3.289033 21.489312 36.010763 40.654705 14.254138
1997-01-23 32.705563 3.199380 20.905363 35.394140 40.904121 14.229405
1997-01-24 32.127726 3.139579 20.204624 35.764114 40.405290 13.957165
1997-01-27 32.127726 3.094705 20.204624 35.270816 40.779413 13.882968
1997-01-28 31.781024 3.079778 20.788573 34.407544 41.153536 13.684925
1997-01-29 32.185510 3.094705 21.138942 34.654193 41.278244 13.858235
1997-01-30 32.647779 3.094705 21.022153 34.407544 41.652367 13.981898
1997-01-31 32.532212 3.064757 20.204624 34.037570 42.275905 13.858235
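After hours of trying, I am still attempting to aggregate them so that I sum the rows 1997-01-02 to 1997-01-08, 1997-01-09 to 1997-01-15, and so on, i.e. the first five rows, then the next five rows. In addition, I am trying to keep the date of the fifth element as the index, so when summing the elements from 1997-01-02 to 1997-01-08, I want to keep 1997-01-08 as the index of the resulting summed row. It is worth mentioning that I have used five-row sums as an example, but ideally I am trying to sum every n rows, then the following n rows, while maintaining the dates in the same way as described above. I have figured out a way to do this in array form (shown in the code below), but in that case I cannot keep the dates.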
def nday_sums(data, n):  # hypothetical name; the original post omitted the def line
    returns = pd.read_excel(data)  # Imports the data from Excel
    returns_with_dates = returns.set_index('Dates')  # Sets the Dates as the df index
    returns_mat = returns_with_dates.to_numpy()  # as_matrix() was removed in pandas 1.0
    ndays = len(returns_mat) // n  # Number of complete n-day blocks in our time period
    asset_number = returns_mat.shape[1]  # Assumed: one column per company
    nday_returns = np.empty((ndays, asset_number))  # Creates an empty array to fill
    # and accommodate the n-day log-returns
    for i in range(asset_number):
        for j in range(ndays):
            nday_returns[j, i] = np.sum(returns_mat[n * j:n * (j + 1), i])
    return nday_returns
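Is there a way to do this within the DataFrame context while also keeping the dates in the way I described? I have been struggling with this for a long time without any success, and it is really stressing me out! For some reason everyone finds pandas very useful and easy to use, but I find it quite the opposite. Any help would be greatly appreciated. Thanks in advance.
Best answer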
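If the same number of dates is missing in each period, you can resample by the number of days you need. Using resample keeps the dates in the index. You can also use the loffset parameter to shift the dates.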
df.resample('7D', loffset='6D').sum()
Company 1 Company 2 Company 3 Company 4 Company 5 \
Dates
1997-01-08 158.096150 15.757505 104.176445 162.911704 186.313282
1997-01-15 160.927550 15.966856 97.052271 174.257561 190.553344
1997-01-22 165.492461 16.250835 102.424599 181.410384 199.407588
1997-01-29 160.927549 15.608147 103.242126 175.490807 204.520604
1997-02-05 65.179991 6.159462 41.226777 68.445114 83.928272
Company 6
Dates
1997-01-08 67.905060
1997-01-15 68.820802
1997-01-22 70.305665
1997-01-29 69.612698
1997-02-05 27.840133
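The question asks for sums over every n rows rather than over calendar windows, and the two only coincide when each window contains the same number of trading days. The following is a minimal sketch of both variants, not the answerer's code: the function names `sum_every_n_rows` and `resample_sum_shifted` and the single-column demo frame are made up for illustration. The second function also avoids `loffset`, which, as far as I know, was deprecated in pandas 1.1 and is no longer accepted from pandas 2.0 onward; it shifts the resulting index by hand instead.

```python
import numpy as np
import pandas as pd


def sum_every_n_rows(df, n):
    """Sum consecutive blocks of n rows; label each block with its last index value."""
    block_id = np.arange(len(df)) // n  # 0,0,...,1,1,...: one id per block of n rows
    sums = df.groupby(block_id).sum()
    # The last date inside each block becomes that block's label.
    last_labels = df.index.to_series().groupby(block_id).last()
    sums.index = pd.Index(last_labels.values, name=df.index.name)
    return sums


def resample_sum_shifted(df, rule="7D", shift="6D"):
    """Calendar-based sums, with the bin labels moved forward by `shift`."""
    out = df.resample(rule).sum()
    out.index = out.index + pd.to_timedelta(shift)  # manual stand-in for loffset
    return out


# Hypothetical demo data: ten business days starting 1997-01-02, one made-up column.
dates = pd.bdate_range("1997-01-02", periods=10, name="Dates")
demo = pd.DataFrame({"A": np.arange(10, dtype=float)}, index=dates)

by_rows = sum_every_n_rows(demo, 5)
by_calendar = resample_sum_shifted(demo)
print(by_rows)
print(by_calendar)
```

On this demo frame both approaches give rows labelled 1997-01-08 and 1997-01-15. On real data with holidays the row-based version keeps exactly n rows per block, while the calendar-based version may sum fewer rows in some weeks, and a trailing block with fewer than n rows is still summed and labelled with its last date.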
Regarding "python - Difficulty adding elements in a pandas DataFrame", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/42914040/