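My practical experience with multithreading in Python is pretty poor, so right now I am looking into how to collect log messages from multiple threads. I have seen many different approaches, but I wanted to start with a simple one. The task is to create several threads and log data from each of them. To identify where a log entry came from, I want to put a custom tag in the log output. I know the logging library has rich LogRecord attributes (thread, threadName, etc.), and that works fine. So I took an example (logging-from-multiple-threads) and made some modifications. Here is the complete code: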

import logging
import threading
import time

logger = logging.getLogger()
syslog = logging.StreamHandler()
formatter = logging.Formatter('%(project)s : %(thread)x '
                              '%(levelname)-8s '
                              '%(message)s')
syslog.setFormatter(formatter)
logger.setLevel(logging.DEBUG)
logger.addHandler(syslog)


class ContextFilter(logging.Filter):

    def __init__(self, project):
        super(ContextFilter, self).__init__()
        self.project = project

    def filter(self, record):
        record.project = self.project
        return True


def worker(args):
    while not args['stop']:
        logging.debug('Hi from {}'.format(args['project']))
        time.sleep(0.5)


def main():
    projects = ['project_1', 'project_2']
    info = {'stop': False}
    threads = []
    for project in projects:
        info['project'] = project
        logger.addFilter(ContextFilter(project))
        thread = threading.Thread(target=worker, args=(info,))
        thread.start()
        threads.append(thread)
    while True:
        try:
            logging.debug('Hello from main')
            time.sleep(1.75)
        except KeyboardInterrupt:
            info['stop'] = True
            break
    for t in threads:
        t.join()

if __name__ == '__main__':
    main()

Here is the output:
project_2 : 7fa627e77700 DEBUG    Hi from project_2
project_2 : 7fa6293d0700 DEBUG    Hello from main
project_2 : 7fa627676700 DEBUG    Hi from project_2
project_2 : 7fa627e77700 DEBUG    Hi from project_2
project_2 : 7fa627676700 DEBUG    Hi from project_2
project_2 : 7fa627e77700 DEBUG    Hi from project_2
project_2 : 7fa627676700 DEBUG    Hi from project_2
project_2 : 7fa627e77700 DEBUG    Hi from project_2
project_2 : 7fa627676700 DEBUG    Hi from project_2
project_2 : 7fa6293d0700 DEBUG    Hello from main
project_2 : 7fa627e77700 DEBUG    Hi from project_2

Actually, this is not what I expected. Can you tell me what I am doing wrong?

Best answer

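Part of the problem comes from how the object variable is passed. When you pass args=(info,), you are passing a reference to an object (which you later modify and pass to the next thread), not a copy of the object. Passing the same object to multiple threads can be dangerous and may lead to race conditions.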
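The effect of sharing one dict can be reproduced in isolation. The following is only a minimal sketch: the helper names show and run are made up for illustration, and the threads are started after the loop (unlike in the question) so that the result is deterministic.

```python
import threading


def show(args, seen):
    # Each thread records the project name found in the dict it was given.
    seen.append(args['project'])


def run(make_copy):
    info = {}
    seen = []
    threads = []
    for project in ['project_1', 'project_2']:
        info['project'] = project
        # Passing info itself hands every thread the same dict;
        # dict(info) hands each thread its own snapshot.
        payload = dict(info) if make_copy else info
        threads.append(threading.Thread(target=show, args=(payload, seen)))
    # Start only after the loop, so the shared dict holds the last value.
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(seen)


shared_result = run(make_copy=False)
copied_result = run(make_copy=True)
print(shared_result)  # ['project_2', 'project_2']
print(copied_result)  # ['project_1', 'project_2']
```

With the shared dict, both threads see whatever value was written last, which matches the "Hi from project_2" lines in the question's output.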
To start with, we can remove the ContextFilter. The filters were all being added to the global logger, so nothing was being tracked per thread.

import logging
import threading
import time

logger = logging.getLogger()
syslog = logging.StreamHandler()
formatter = logging.Formatter('%(project)s : %(thread)x '
                              '%(levelname)-8s '
                              '%(message)s')
syslog.setFormatter(formatter)
logger.setLevel(logging.DEBUG)
logger.addHandler(syslog)
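To see why filters attached to a single logger could not tell the threads apart, here is a small sketch without any threads. The logger name filter_demo and the StringIO stream are assumptions chosen only to keep the demo isolated and its output capturable.

```python
import io
import logging


class ContextFilter(logging.Filter):
    def __init__(self, project):
        super().__init__()
        self.project = project

    def filter(self, record):
        record.project = self.project
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(project)s : %(message)s'))

demo_logger = logging.getLogger('filter_demo')  # hypothetical name
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False
demo_logger.addHandler(handler)

# Same pattern as in the question: one filter per project, all on one logger.
for project in ['project_1', 'project_2']:
    demo_logger.addFilter(ContextFilter(project))

# Every record passes through both filters in the order they were added,
# so the filter added last overwrites record.project.
demo_logger.debug('Hi')
lines = stream.getvalue().splitlines()
print(lines)  # ['project_2 : Hi']
```

Every record, regardless of the thread that produced it, ends up tagged by the filter that was added last.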

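I find that, in general, classes are more useful for all but the simplest tasks.
This class maintains its own running state and builds its own logging adapter with the correct extra data.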
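class Worker(threading.Thread):
    def __init__(self, info):
        super(Worker, self).__init__()
        self.running = False
        self.info = info
        # The adapter injects self.info (e.g. {'project': ...}) into every record
        self.logger = logging.LoggerAdapter(logger, self.info)

    def start(self):
        # Set the flag before the thread starts so run() enters its loop
        self.running = True
        super(Worker, self).start()

    def stop(self):
        # Ask the loop in run() to exit; the caller still has to join()
        self.running = False

    def run(self):
        while self.running:
            self.logger.debug('Hi from {}'.format(self.info['project']))
            time.sleep(0.5)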
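As a rough illustration of what the adapter does, the sketch below creates two LoggerAdapter objects on top of one logger and captures the output in memory. The logger name adapter_demo and the StringIO stream are assumptions made only for this demo.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(project)s : %(message)s'))

base = logging.getLogger('adapter_demo')  # hypothetical name
base.setLevel(logging.DEBUG)
base.propagate = False
base.addHandler(handler)

# Each adapter carries its own dict and supplies it as the record's extra
# data, so %(project)s is resolved per adapter rather than per logger.
adapter_1 = logging.LoggerAdapter(base, {'project': 'project_1'})
adapter_2 = logging.LoggerAdapter(base, {'project': 'project_2'})

adapter_1.debug('Hi from project_1')
adapter_2.debug('Hi from project_2')
# Without an adapter, the same field can be supplied per call via extra=.
base.debug('Hello from main', extra={'project': 'main'})

lines = stream.getvalue().splitlines()
print(lines)
```

Because the tag lives in the adapter rather than on the shared logger, each Worker can own one without affecting the others.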

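Now we need to change a few things. We need to use our own Worker class.
We don't need to do anything to the logger; the class manages its own LoggerAdapter.
We want to make sure a new info object is created every time. It is simple enough that we can pass it directly in the call ({'project': project}) without assigning it to a variable.
We need to make sure we pass the project value when logging from the main thread. Another LoggerAdapter might be a better fit for this.
Once the loop is interrupted, we can ask each thread to stop and then wait for each one (join() could possibly be moved into the stop method of the Worker class).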
def main():
    projects = ['project_1', 'project_2']
    threads = []
    for project in projects:
        thread = Worker({'project': project})
        thread.start()
        threads.append(thread)
    while True:
        try:
            logging.debug('Hello from main', extra={'project':'main'})
            time.sleep(1.75)
        except KeyboardInterrupt:
            break
    for t in threads:
        t.stop()
    for t in threads:
        t.join()

if __name__ == '__main__':
    main()

This code produces output like the following:
project_1 : 7f4b44180700 DEBUG    Hi from project_1
project_2 : 7f4b4397f700 DEBUG    Hi from project_2
main : 7f4b45c8d700 DEBUG    Hello from main
project_1 : 7f4b44180700 DEBUG    Hi from project_1
project_2 : 7f4b4397f700 DEBUG    Hi from project_2
project_1 : 7f4b44180700 DEBUG    Hi from project_1
project_2 : 7f4b4397f700 DEBUG    Hi from project_2
project_1 : 7f4b44180700 DEBUG    Hi from project_1
project_2 : 7f4b4397f700 DEBUG    Hi from project_2
main : 7f4b45c8d700 DEBUG    Hello from main
project_1 : 7f4b44180700 DEBUG    Hi from project_1
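Two variations were mentioned in passing: a LoggerAdapter for the main thread, and moving join() into stop(). One possible way to combine them is sketched below. It is not the code that produced the output above: the logger name worker_demo, the in-memory stream, and the shortened sleep times are assumptions so that the example finishes quickly.

```python
import io
import logging
import threading
import time

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(project)s : %(message)s'))
base_logger = logging.getLogger('worker_demo')  # hypothetical name
base_logger.setLevel(logging.DEBUG)
base_logger.propagate = False
base_logger.addHandler(handler)


class Worker(threading.Thread):
    def __init__(self, info):
        super().__init__()
        self.running = False
        self.info = info
        self.logger = logging.LoggerAdapter(base_logger, self.info)

    def start(self):
        self.running = True
        super().start()

    def stop(self):
        # Variation: stop() also waits for the thread to finish.
        self.running = False
        self.join()

    def run(self):
        while self.running:
            self.logger.debug('Hi from %s', self.info['project'])
            time.sleep(0.05)


# Variation: the main thread gets its own adapter instead of extra=...
main_log = logging.LoggerAdapter(base_logger, {'project': 'main'})

workers = [Worker({'project': p}) for p in ['project_1', 'project_2']]
for w in workers:
    w.start()
main_log.debug('Hello from main')
time.sleep(0.2)
for w in workers:
    w.stop()

lines = stream.getvalue().splitlines()
```

One trade-off: with join() inside stop(), the workers are shut down one after another, so the total shutdown time grows with the number of threads. Signalling all threads first and joining afterwards, as in main() above, lets them wind down in parallel.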

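There are many ways to tidy up the code and make it more readable, but this should at least give you a starting point for learning and experimenting. As you learn more about threads, you should also read up on thread synchronization mechanisms. I recently started using Queues for communication between threads, which leads to code that is easier to debug.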
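As one example of the standard synchronization tools, the sketch below replaces the boolean flag with a threading.Event and hands messages back to the main thread through a queue.Queue. The function and variable names are made up for illustration, and the timings are shortened so it finishes quickly.

```python
import queue
import threading


def worker(project, stop_event, messages):
    # Event.wait() acts as an interruptible sleep: it returns True as soon
    # as the event is set, so the thread reacts to a stop request promptly.
    while not stop_event.wait(timeout=0.01):
        messages.put((project, 'Hi from {}'.format(project)))
    messages.put((project, 'stopped'))


stop_event = threading.Event()
messages = queue.Queue()
threads = [
    threading.Thread(target=worker, args=(p, stop_event, messages))
    for p in ['project_1', 'project_2']
]
for t in threads:
    t.start()

# Block until at least one worker has produced a message, then shut down.
first = messages.get(timeout=5)
stop_event.set()
for t in threads:
    t.join()

# All producers have finished, so draining until empty is safe here.
received = [first]
while not messages.empty():
    received.append(messages.get())
print(received[0])
```

A single Event can stop any number of workers at once, and the Queue means only the main thread needs to touch the collected data.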

07-24 09:36