我有两张纸的Excel文件。
(“工作表1”)包含以下数据:
DATE TMAX TMIN
20110706 317 211
20110707 322 211
20110708 317 211
20110709 322 211
20110710 328 222
20110711 333 244
20110712 356 250
20110713 356 222
另一个(“工作表2”)包括:
Start Date End Date Rep Month Cost kWh kW
7/6/2011 8/3/2011 July 5,065.17 76,640 205
8/3/2011 9/7/2011 August 5,572.38 86,640 195
我的目标是在('工作表2')的特定日期范围内的kwh值上写另一列('工作表1')
例如:
DATE TMAX TMIN kWh
20110706 317 211 76640
20110707 322 211 76640
20110708 317 211 76640
20110709 322 211 76640
20110710 328 222 76640
20110711 333 244 76640
20110712 356 250 76640
20110713 356 222 76640
20110801 344 228 76640
20110802 356 200 76640
20110803 367 200 86640
20110804 361 228 86640
我不知道为什么我的代码导致df [“ kWh”]为空('NaN'),从而导致写入('sheet1')的空白kWh列
这是我的代码如下:
import pandas as pd
from pandas import ExcelWriter
df = pd.read_excel("thecddhddtest.xlsx",'Sheet1')
df2 = pd.read_excel("thecddhddtest.xlsx",'Sheet2')
df.head()
df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")
df3 = df2.set_index("Start Date")
df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
df2["Start Date"] = pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")
df3["kWh"].reindex(df["DATE"], method="ffill")
df["kWh"] = df3["kWh"].reindex(df["DATE"], method="ffill")
print(df["kWh"])
writer = ExcelWriter('thecddhddtestkWh.xlsx')
df.to_excel(writer,'Sheet1',index=False)
df2.to_excel(writer,'Sheet2',index=False)
writer.save()
结果是:
DATE TMAX TMIN kWh
20110706 317 211
20110707 322 211
20110708 317 211
20110709 322 211
20110710 328 222
20110711 333 244
20110712 356 250
20110713 356 222
最佳答案
尝试我的解决方案-仅将DATE
中的df
列设置为index
,然后将其设置为reindex
。
df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
#set column DATE to index
df = df.set_index("DATE")
df2["Start Date"] = pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")
df3 = df2.set_index("Start Date")
#reindex by index of df
df["kWh"] = df3["kWh"].reindex(df.index, method="ffill")
print(df["kWh"])
#DATE
#2011-07-06 76,640
#2011-07-07 76,640
#2011-07-08 76,640
#2011-07-09 76,640
#2011-07-10 76,640
#2011-07-11 76,640
#2011-07-12 76,640
#2011-07-13 76,640
#2011-08-01 76,640
#2011-08-02 76,640
#2011-08-03 86,640
#2011-08-04 86,640
#Name: kWh, dtype: object