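Working in Python 3.6, I want to log repeated operations without calling the logger too many times from inside the function code. Ideally there would be an automatic option that emits a log message after each line of code in a function is evaluated. Is that possible?
Here is a minimal reproducible example that filters some data: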
# Import libraries
import pandas as pd
import numpy as np
import logging
# Set up the logger and a dummy data frame
logger = logging.getLogger()
logger.setLevel(logging.INFO)
dummy_df = pd.DataFrame({
    'col_A': np.arange(1, 1000, 1),
    'col_B': np.arange(1001, 2000, 1)
})
# Subset rows of the dataframe
logging.info("There are {} rows remaining".format(dummy_df.shape[0]))
# Remove rows where col_A is 15 or below
dummy_df = dummy_df.loc[dummy_df['col_A'] > 15]
logging.info("There are {} rows remaining".format(dummy_df.shape[0]))
# Keep only rows where col_B is strictly between 1500 and 1600
dummy_df = dummy_df.loc[(dummy_df['col_B'] > 1500) & (dummy_df['col_B'] < 1600)]
logging.info("There are {} rows remaining".format(dummy_df.shape[0]))
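The problem is that I have to write the same logger call over and over for what is essentially the same operation.
Best answer
Assuming the conditions differ so much that you cannot get by with a single filter function whose condition you change, you can write one function per filter and apply them in a loop, logging after each one: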
# Import libraries
import pandas as pd
import numpy as np
import logging
# Set up the logger and a dummy data frame
logger = logging.getLogger()
logger.setLevel(logging.INFO)
dummy_df = pd.DataFrame({
    'col_A': np.arange(1, 1000, 1),
    'col_B': np.arange(1001, 2000, 1)
})
def filter1(df):
    return df.loc[df['col_A'] > 15]
def filter2(df):
    return df.loc[(df['col_B'] > 1500) & (df['col_B'] < 1600)]
filters = (filter1, filter2)
logging.info("There are {} rows remaining".format(dummy_df.shape[0]))
for my_filter in filters:
    dummy_df = my_filter(dummy_df)
    logging.info("There are {} rows remaining".format(dummy_df.shape[0]))
You can add as many filters as you need; each new function only has to be appended to the filters tuple.
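If you would rather not keep a loop around, a decorator is another way to move the logging call out of the filter code. The following is only a sketch under a few assumptions: log_rows is a hypothetical helper name (not part of pandas or logging), every decorated function takes a DataFrame as its first argument and returns a DataFrame, and logging is configured with logging.basicConfig instead of setting the root logger level by hand.

```python
import functools
import logging

import numpy as np
import pandas as pd

# Assumption: a basic root configuration is enough for this sketch.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def log_rows(func):
    """Hypothetical helper: log row counts before and after a filter runs."""
    @functools.wraps(func)
    def wrapper(df, *args, **kwargs):
        result = func(df, *args, **kwargs)
        logger.info("%s: %d -> %d rows",
                    func.__name__, df.shape[0], result.shape[0])
        return result
    return wrapper


@log_rows
def filter1(df):
    # Keep rows where col_A is above 15
    return df.loc[df['col_A'] > 15]


@log_rows
def filter2(df):
    # Keep rows where col_B is strictly between 1500 and 1600
    return df.loc[(df['col_B'] > 1500) & (df['col_B'] < 1600)]


dummy_df = pd.DataFrame({
    'col_A': np.arange(1, 1000, 1),
    'col_B': np.arange(1001, 2000, 1)
})

# DataFrame.pipe passes the frame as the first argument of each filter,
# so every step is logged by the decorator without an explicit logger call.
filtered_df = dummy_df.pipe(filter1).pipe(filter2)
```

This logs once per decorated function call, not once per line of code, so each operation you want logged still has to live in its own function.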