我有两张纸的Excel文件。

(“工作表1”)包含以下数据:

DATE       TMAX TMIN
20110706    317 211
20110707    322 211
20110708    317 211
20110709    322 211
20110710    328 222
20110711    333 244
20110712    356 250
20110713    356 222


另一个(“工作表2”)包括:

Start Date  End Date    Rep Month    Cost    kWh     kW
7/6/2011    8/3/2011    July     5,065.17    76,640      205
8/3/2011    9/7/2011    August   5,572.38    86,640      195


我的目标是在('工作表2')的特定日期范围内的kwh值上写另一列('工作表1')

例如:

DATE        TMAX    TMIN    kWh
20110706    317   211   76640
20110707    322   211   76640
20110708    317   211   76640
20110709    322   211   76640
20110710    328   222   76640
20110711    333   244   76640
20110712    356   250   76640
20110713    356   222   76640
20110801    344   228   76640
20110802    356   200   76640
20110803    367   200   86640
20110804    361   228   86640


我不知道为什么我的代码导致df [“ kWh”]为空('NaN'),从而导致写入('sheet1')的空白kWh列

这是我的代码如下:

import pandas as pd
from pandas import ExcelWriter

df = pd.read_excel("thecddhddtest.xlsx",'Sheet1')
df2 = pd.read_excel("thecddhddtest.xlsx",'Sheet2')
df.head()


df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")

df3 = df2.set_index("Start Date")

df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
df2["Start Date"] = pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")

df3["kWh"].reindex(df["DATE"], method="ffill")
df["kWh"] = df3["kWh"].reindex(df["DATE"], method="ffill")
print(df["kWh"])


writer = ExcelWriter('thecddhddtestkWh.xlsx')
df.to_excel(writer,'Sheet1',index=False)
df2.to_excel(writer,'Sheet2',index=False)
writer.save()


结果是:

DATE       TMAX TMIN kWh
20110706    317 211
20110707    322 211
20110708    317 211
20110709    322 211
20110710    328 222
20110711    333 244
20110712    356 250
20110713    356 222

最佳答案

尝试我的解决方案-仅将DATE中的df列设置为index,然后将其设置为reindex

df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
#set column DATE to index
df = df.set_index("DATE")
df2["Start Date"] = pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")
df3 = df2.set_index("Start Date")
#reindex by index of df
df["kWh"] = df3["kWh"].reindex(df.index, method="ffill")
print(df["kWh"])
#DATE
#2011-07-06    76,640
#2011-07-07    76,640
#2011-07-08    76,640
#2011-07-09    76,640
#2011-07-10    76,640
#2011-07-11    76,640
#2011-07-12    76,640
#2011-07-13    76,640
#2011-08-01    76,640
#2011-08-02    76,640
#2011-08-03    86,640
#2011-08-04    86,640
#Name: kWh, dtype: object

10-08 12:50