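Working in Python 3.6, I would like to log repetitive operations without calling the logger over and over from inside the function code. Ideally there would be an automatic option that emits a log message after each line of code in a function is evaluated. Is that possible?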

Here is a minimal reproducible example that filters some data:

# Import libraries
import pandas as pd
import numpy as np
import logging

# Set up the logger and a dummy data frame
logger = logging.getLogger()
logger.setLevel(logging.INFO)

dummy_df = pd.DataFrame({
    'col_A': np.arange(1, 1000, 1),
    'col_B': np.arange(1001, 2000, 1)
})

# Subset rows of the dataframe
logging.info("There are {} rows remaining".format(dummy_df.shape[0]))

# Keep only rows where col_A is greater than 15
dummy_df = dummy_df.loc[dummy_df['col_A'] > 15]
logging.info("There are {} rows remaining".format(dummy_df.shape[0]))

# Keep only rows where col_B is strictly between 1500 and 1600
dummy_df = dummy_df.loc[(dummy_df['col_B'] > 1500) & (dummy_df['col_B'] < 1600)]
logging.info("There are {} rows remaining".format(dummy_df.shape[0]))


The problem is that I have to write the same logger call repeatedly for what is essentially the same operation.

Best answer

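This assumes the conditions differ so much that you cannot write a single filter function and simply change its condition. In that case, put each filter in its own function and loop over them, so the logging call is written only once: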

# Import libraries
import pandas as pd
import numpy as np
import logging

# Set up the logger and a dummy data frame
logger = logging.getLogger()
logger.setLevel(logging.INFO)

dummy_df = pd.DataFrame({
    'col_A': np.arange(1, 1000, 1),
    'col_B': np.arange(1001, 2000, 1)
})

def filter1(df):
    return df.loc[df['col_A'] > 15]

def filter2(df):
    return df.loc[(df['col_B'] > 1500) & (df['col_B'] < 1600)]

filters = (filter1, filter2)

logging.info("There are {} rows remaining".format(dummy_df.shape[0]))
for my_filter in filters:
    dummy_df = my_filter(dummy_df)
    logging.info("There are {} rows remaining".format(dummy_df.shape[0]))


You can add as many filters as you need.
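If you would rather not keep a tuple of filters and a loop, a decorator is another way to write the logging call only once. The following is a minimal sketch, not part of the original answer: `log_rows` is a hypothetical helper name, and it assumes every filter takes a DataFrame as its first argument and returns something that supports `len()`.

```python
import functools
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def log_rows(func):
    """Hypothetical helper: log how many rows remain after func returns."""
    @functools.wraps(func)
    def wrapper(df, *args, **kwargs):
        result = func(df, *args, **kwargs)
        # The function name is included so each log line identifies its filter.
        logger.info("%s: there are %d rows remaining", func.__name__, len(result))
        return result
    return wrapper


@log_rows
def filter1(df):
    return df.loc[df['col_A'] > 15]


@log_rows
def filter2(df):
    return df.loc[(df['col_B'] > 1500) & (df['col_B'] < 1600)]


# Same dummy data as above, built with range() instead of np.arange()
dummy_df = pd.DataFrame({
    'col_A': range(1, 1000),
    'col_B': range(1001, 2000),
})

# Each decorated call logs its own row count
filtered_df = filter2(filter1(dummy_df))
```

This logs once per function call rather than after every line, so it only approximates the "automatic" behaviour asked for in the question, but the filter bodies stay free of logger calls.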

09-28 07:48