I am currently having trouble adding up rows in the following DataFrame, which I built from the stock returns of six companies:

import pandas as pd

def importdata(data):
    returns = pd.read_excel(data)  # Imports the data from Excel
    returns_with_dates = returns.set_index('Dates')  # Sets the Dates as the df index
    return returns_with_dates


Output:

Out[345]:
        Company 1  Company 2   Company 3  Company 4  Company 5  Company 6
Dates
1997-01-02  31.087620   3.094705   24.058686  31.694404  37.162890  13.462241
1997-01-03  31.896592   3.109631   22.423629  32.064378  37.537013  13.511706
1997-01-06  31.723241   3.184358   18.803148  32.681000  37.038183  13.684925
1997-01-07  31.781024   3.199380   19.503886  33.544272  37.038183  13.660193
1997-01-08  31.607673   3.169431   19.387096  32.927650  37.537013  13.585995
1997-01-09  31.492106   3.199380   19.737465  33.420948  37.038183  13.759214
1997-01-10  32.589996   3.184358   19.270307  34.284219  37.661721  13.858235
1997-01-13  32.416645   3.199380   19.153517  35.147491  38.035844  13.660193
1997-01-14  32.301077   3.184358   19.503886  35.517465  39.407629  13.783946
1997-01-15  32.127726   3.199380   19.387096  35.887438  38.409967  13.759214
1997-01-16  32.532212   3.229232   19.737465  36.257412  39.282921  13.635460
1997-01-17  33.167833   3.259180   20.087835  37.490657  39.033505  13.858235
1997-01-20  33.456751   3.229232   20.438204  35.640789  39.657044  14.377892
1997-01-21  33.225616   3.244158   20.671783  36.010763  40.779413  14.179940
1997-01-22  33.110049   3.289033   21.489312  36.010763  40.654705  14.254138
1997-01-23  32.705563   3.199380   20.905363  35.394140  40.904121  14.229405
1997-01-24  32.127726   3.139579   20.204624  35.764114  40.405290  13.957165
1997-01-27  32.127726   3.094705   20.204624  35.270816  40.779413  13.882968
1997-01-28  31.781024   3.079778   20.788573  34.407544  41.153536  13.684925
1997-01-29  32.185510   3.094705   21.138942  34.654193  41.278244  13.858235
1997-01-30  32.647779   3.094705   21.022153  34.407544  41.652367  13.981898
1997-01-31  32.532212   3.064757   20.204624  34.037570  42.275905  13.858235


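After hours of trying, I am still attempting to aggregate the rows so that I sum 1997-01-02 through 1997-01-08, then 1997-01-09 through 1997-01-15, and so on; that is, the first five rows, then the next five. I also want to keep the date of the fifth row as the index, so when summing 1997-01-02 through 1997-01-08 the result should be indexed by 1997-01-08. I have used five rows as the example, but ideally I want to sum every n rows, then the following n rows, while keeping the dates as described. I found a way to do this with arrays (code below), but then I lose the dates.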

import numpy as np
import pandas as pd

def nday_sums(data, n):
    returns = pd.read_excel(data)  # Imports the data from Excel
    returns_with_dates = returns.set_index('Dates')  # Sets the Dates as the df index

    returns_mat = returns_with_dates.to_numpy()  # as_matrix() was removed in pandas 1.0
    ndays = len(returns_mat) // n  # Number of complete n-day blocks in the time period
    asset_number = returns_mat.shape[1]  # Number of companies (columns)

    # Creates an empty array to fill with the n-day log-returns
    nday_returns = np.empty((ndays, asset_number))

    for i in range(asset_number):
        for j in range(ndays):
            nday_returns[j, i] = np.sum(returns_mat[n * j:n * (j + 1), i])

    return nday_returns
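
For reference, the nested loops above can be sketched more compactly with a reshape. This is only an illustration on made-up numbers, the helper name block_sums is hypothetical, and like the loop version it drops any incomplete final block and loses the dates.

```python
import numpy as np

def block_sums(mat, n):
    """Sum every n consecutive rows of a 2-D array, dropping an incomplete tail."""
    nblocks = mat.shape[0] // n              # number of complete n-row blocks
    trimmed = mat[:nblocks * n]              # discard leftover rows at the end
    # View the data as (block, row within block, column) and sum within each block
    return trimmed.reshape(nblocks, n, mat.shape[1]).sum(axis=1)

# Made-up data: 6 rows, 2 columns
demo = np.arange(12, dtype=float).reshape(6, 2)
print(block_sums(demo, 3))
```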


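Is there a way to do this within a DataFrame while also keeping the dates as I described? I have been trying for a long time without any success, and it is really stressing me out! For some reason everyone finds pandas very useful and easy to use, but my experience has been the opposite. Any help would be appreciated. Thanks in advance.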

Best answer

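If the gaps in your dates always span the same number of days, you can resample by the number of calendar days you need. Using resample keeps the dates in the index. You can also shift the dates so that each bin is labelled with its last day. Older pandas versions did this with the loffset parameter, as in df.resample('7D', loffset='6D').sum(), but loffset was deprecated in pandas 1.1 and removed in 2.0, so the shift is applied explicitly here:

df.resample('7D').sum().shift(6, freq='D')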

                 Company 1  Company 2   Company 3   Company 4   Company 5  \
Dates
1997-01-08  158.096150  15.757505  104.176445  162.911704  186.313282
1997-01-15  160.927550  15.966856   97.052271  174.257561  190.553344
1997-01-22  165.492461  16.250835  102.424599  181.410384  199.407588
1997-01-29  160.927549  15.608147  103.242126  175.490807  204.520604
1997-02-05   65.179991   6.159462   41.226777   68.445114   83.928272

            Company 6
Dates
1997-01-08  67.905060
1997-01-15  68.820802
1997-01-22  70.305665
1997-01-29  69.612698
1997-02-05  27.840133
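
The resample approach groups by calendar days, so it only matches "every five rows" when weekends are the only gaps. If holidays or other missing dates break that pattern, one possible alternative is to group by row position instead. The following is a minimal sketch on made-up data; the helper name sum_every_n_rows is hypothetical, and it assumes the index is already a DatetimeIndex sorted by date.

```python
import numpy as np
import pandas as pd

def sum_every_n_rows(df, n):
    """Sum each run of n consecutive rows, labelled with the run's last date."""
    groups = np.arange(len(df)) // n                 # block id for every row: 0,0,...,1,1,...
    sums = df.groupby(groups).sum()                  # column-wise sum within each block
    last_dates = df.index.to_series().groupby(groups).last()
    sums.index = pd.DatetimeIndex(last_dates.values, name=df.index.name)
    return sums

# Made-up data: seven trading days for one company
dates = pd.to_datetime(['1997-01-02', '1997-01-03', '1997-01-06', '1997-01-07',
                        '1997-01-08', '1997-01-09', '1997-01-10'])
demo = pd.DataFrame({'Company 1': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]},
                    index=pd.Index(dates, name='Dates'))
result = sum_every_n_rows(demo, 5)
print(result)
```

Note that a trailing block with fewer than n rows is kept and labelled with its last available date; drop the final row of the result if only complete blocks are wanted.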

Regarding python - trouble adding up elements in a pandas DataFrame, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42914040/
