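My practical experience with Python multithreading is pretty poor. So right now I am looking into how to get log information from multiple threads. I have seen many different approaches, but I want to start with a simple one. The task is to create several threads and log data from each of them. To identify where each log line comes from, I want to put a custom tag in the log output. I know the logging library has a rich set of LogRecord attributes (thread, threadName, etc.), and they work well. So I took an example (logging-from-multiple-threads) and made some modifications. Here is the complete code: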
import logging
import threading
import time

logger = logging.getLogger()
syslog = logging.StreamHandler()
formatter = logging.Formatter('%(project)s : %(thread)x '
                              '%(levelname)-8s '
                              '%(message)s')
syslog.setFormatter(formatter)
logger.setLevel(logging.DEBUG)
logger.addHandler(syslog)


class ContextFilter(logging.Filter):
    def __init__(self, project):
        super(ContextFilter, self).__init__()
        self.project = project

    def filter(self, record):
        record.project = self.project
        return True


def worker(args):
    while not args['stop']:
        logging.debug('Hi from {}'.format(args['project']))
        time.sleep(0.5)


def main():
    projects = ['project_1', 'project_2']
    info = {'stop': False}
    threads = []
    for project in projects:
        info['project'] = project
        logger.addFilter(ContextFilter(project))
        thread = threading.Thread(target=worker, args=(info,))
        thread.start()
        threads.append(thread)
    while True:
        try:
            logging.debug('Hello from main')
            time.sleep(1.75)
        except KeyboardInterrupt:
            info['stop'] = True
            break
    for t in threads:
        t.join()


if __name__ == '__main__':
    main()
Here is the output:
project_2 : 7fa627e77700 DEBUG Hi from project_2
project_2 : 7fa6293d0700 DEBUG Hello from main
project_2 : 7fa627676700 DEBUG Hi from project_2
project_2 : 7fa627e77700 DEBUG Hi from project_2
project_2 : 7fa627676700 DEBUG Hi from project_2
project_2 : 7fa627e77700 DEBUG Hi from project_2
project_2 : 7fa627676700 DEBUG Hi from project_2
project_2 : 7fa627e77700 DEBUG Hi from project_2
project_2 : 7fa627676700 DEBUG Hi from project_2
project_2 : 7fa6293d0700 DEBUG Hello from main
project_2 : 7fa627e77700 DEBUG Hi from project_2
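Actually, this is not what I expected. Can you tell me what I am doing wrong?
Best answer
Part of the problem comes from how the object variable is passed. When you pass args=(info,), you are passing a reference to an object (which is later modified and passed to the next thread), not a copy of the object. Passing the same object to multiple threads can be dangerous and may lead to race conditions.
First, we can remove the ContextFilter. The filters were being added to the global logger, so they do not track anything per thread: every filter runs for every record, and the last one added overwrites record.project.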
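A minimal sketch of the aliasing problem described above. The helpers `collect` and `run` are hypothetical names made up for this illustration, and the threads are started only after the loop finishes so that the result is deterministic (in the question the threads start inside the loop, which makes the outcome depend on timing).

```python
import threading


def collect(args, seen):
    # The thread reads the dict when it runs, not when it was created.
    seen.append(args['project'])


def run(copy_per_thread):
    info = {'stop': False}
    seen = []
    threads = []
    for project in ['project_1', 'project_2']:
        info['project'] = project
        # Either hand over the shared dict or a private shallow copy.
        payload = dict(info) if copy_per_thread else info
        threads.append(threading.Thread(target=collect, args=(payload, seen)))
    # Start after the loop, so the shared dict already holds 'project_2'.
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(seen)


shared = run(copy_per_thread=False)
copied = run(copy_per_thread=True)
print(shared)  # ['project_2', 'project_2']
print(copied)  # ['project_1', 'project_2']
```

With the shared dict both threads see the last value written; with a copy per thread each one keeps its own project name.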
import logging
import threading
import time

logger = logging.getLogger()
syslog = logging.StreamHandler()
formatter = logging.Formatter('%(project)s : %(thread)x '
                              '%(levelname)-8s '
                              '%(message)s')
syslog.setFormatter(formatter)
logger.setLevel(logging.DEBUG)
logger.addHandler(syslog)
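To see why the ContextFilter instances from the question could not tell threads apart, here is a small sketch. It attaches two filters to one logger, as the question's loop does, and writes to an in-memory stream so the result can be inspected. The logger name `filter_demo` and the shortened format string are assumptions made for this illustration only.

```python
import io
import logging


class ContextFilter(logging.Filter):
    def __init__(self, project):
        super().__init__()
        self.project = project

    def filter(self, record):
        # Every attached filter runs for every record, in the order added.
        record.project = self.project
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(project)s : %(message)s'))

demo_logger = logging.getLogger('filter_demo')  # hypothetical name
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False
demo_logger.addHandler(handler)

for project in ['project_1', 'project_2']:
    demo_logger.addFilter(ContextFilter(project))

demo_logger.debug('hello')
output = stream.getvalue()
print(output)  # project_2 : hello
```

Both filters run on the single record, so the one added last wins regardless of which thread made the call. That matches the question's output, where every line is tagged project_2.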
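I find that, in general, building threading.Thread classes is more useful for all but the simplest tasks.
This class maintains its own running state and builds its own logging adapter with the correct extra data.

class Worker(threading.Thread):
    def __init__(self, info):
        self.running = False
        self.info = info
        # The adapter attaches self.info to every record as "extra" data,
        # which is what fills in %(project)s in the format string.
        self.logger = logging.LoggerAdapter(logger, self.info)
        super(Worker, self).__init__()

    def start(self):
        self.running = True
        super(Worker, self).start()

    def stop(self):
        self.running = False

    def run(self):
        while self.running:
            self.logger.debug('Hi from {}'.format(self.info['project']))
            time.sleep(0.5)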
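As a standalone illustration of what logging.LoggerAdapter does with its second argument, the sketch below creates two adapters over the same underlying logger and captures the output in memory. The logger name `adapter_demo` and the shortened format string are assumptions for this example.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(project)s : %(levelname)s %(message)s'))

base = logging.getLogger('adapter_demo')  # hypothetical name
base.setLevel(logging.DEBUG)
base.propagate = False
base.addHandler(handler)

# Each adapter carries its own "extra" dict; the shared logger is untouched.
adapters = [logging.LoggerAdapter(base, {'project': name})
            for name in ('project_1', 'project_2')]
for adapter in adapters:
    adapter.debug('Hi')

lines = stream.getvalue().splitlines()
print(lines)  # ['project_1 : DEBUG Hi', 'project_2 : DEBUG Hi']
```

Because the context lives in the adapter rather than on the logger, each thread can own an adapter without affecting the others.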
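Now we need to change a few things. We need to use our own Worker class. We do not need to do anything to the logger; the class manages its own LoggerAdapter.
We want to make sure a new info object is created each time. It is simple enough that we can pass it directly in the call ({'project': project}) without assigning a variable. We also need to make sure we pass the project value (through extra) when logging from the main thread, because the format string expects %(project)s on every record. Another LoggerAdapter might be a better fit for that. Once the loop is interrupted, we can ask every thread to stop and then wait for each one (join() could probably be moved into the stop method of the Worker class).

def main():
    projects = ['project_1', 'project_2']
    threads = []
    for project in projects:
        # A fresh dict per worker, so no state is shared between threads.
        thread = Worker({'project': project})
        thread.start()
        threads.append(thread)
    while True:
        try:
            logging.debug('Hello from main', extra={'project': 'main'})
            time.sleep(1.75)
        except KeyboardInterrupt:
            break
    # Ask every worker to stop first, then wait for all of them.
    for t in threads:
        t.stop()
    for t in threads:
        t.join()


if __name__ == '__main__':
    main()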
This code produces the following results:
project_1 : 7f4b44180700 DEBUG Hi from project_1
project_2 : 7f4b4397f700 DEBUG Hi from project_2
main : 7f4b45c8d700 DEBUG Hello from main
project_1 : 7f4b44180700 DEBUG Hi from project_1
project_2 : 7f4b4397f700 DEBUG Hi from project_2
project_1 : 7f4b44180700 DEBUG Hi from project_1
project_2 : 7f4b4397f700 DEBUG Hi from project_2
project_1 : 7f4b44180700 DEBUG Hi from project_1
project_2 : 7f4b4397f700 DEBUG Hi from project_2
main : 7f4b45c8d700 DEBUG Hello from main
project_1 : 7f4b44180700 DEBUG Hi from project_1
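The two refinements mentioned in passing above (a LoggerAdapter for the main thread, and join() moved into stop()) could look roughly like the sketch below. This is one possible variation, not the answer's code. The `threading.Event`, the `interval` parameter, the attribute name `_stop_requested`, the logger name `worker_demo`, and the in-memory stream are all assumptions added so that the example finishes quickly and can be checked.

```python
import io
import logging
import threading
import time

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter('%(project)s : %(levelname)-8s %(message)s'))

base = logging.getLogger('worker_demo')  # hypothetical name
base.setLevel(logging.DEBUG)
base.propagate = False
base.addHandler(handler)


class Worker(threading.Thread):
    def __init__(self, project, interval=0.01):
        super().__init__()
        self.project = project
        self.interval = interval
        # An Event replaces the plain boolean flag (an assumption here).
        self._stop_requested = threading.Event()
        self.logger = logging.LoggerAdapter(base, {'project': project})

    def run(self):
        while True:
            self.logger.debug('Hi from %s', self.project)
            # wait() returns True as soon as stop() sets the event.
            if self._stop_requested.wait(self.interval):
                break

    def stop(self):
        # Signal the loop to end, then block until the thread has exited.
        self._stop_requested.set()
        self.join()


main_logger = logging.LoggerAdapter(base, {'project': 'main'})
workers = [Worker(name) for name in ('project_1', 'project_2')]
for w in workers:
    w.start()
main_logger.debug('Hello from main')
time.sleep(0.05)
for w in workers:
    w.stop()

lines = stream.getvalue().splitlines()
projects_seen = {line.split(' : ')[0] for line in lines}
print(sorted(projects_seen))  # ['main', 'project_1', 'project_2']
```

One trade-off of joining inside stop(): the workers are stopped one after another rather than all being signalled first, which only matters if a worker takes a long time to finish.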
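There are plenty of ways to tidy up the code and make it more readable, but this should at least give you a starting point for learning and experimenting. As you learn more about threads, you should also read up on thread synchronization mechanisms. I recently started using Queues for communication between threads, which makes the main code easier to debug.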
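As a rough sketch of queue-based communication between threads, the example below has worker threads put tagged items on a shared queue.Queue while the main thread consumes them. The helper names `producer` and `gather` and the use of None as an end-of-stream marker are assumptions made for this illustration.

```python
import queue
import threading


def producer(name, count, out):
    for i in range(count):
        out.put((name, i))
    # None marks the end of this producer's stream (a convention chosen here).
    out.put((name, None))


def gather(names, count):
    out = queue.Queue()
    threads = [threading.Thread(target=producer, args=(n, count, out))
               for n in names]
    for t in threads:
        t.start()
    received = {n: [] for n in names}
    finished = 0
    # Only the main thread touches `received`, so no extra locking is needed.
    while finished < len(names):
        name, item = out.get()
        if item is None:
            finished += 1
        else:
            received[name].append(item)
    for t in threads:
        t.join()
    return received


result = gather(['project_1', 'project_2'], 3)
print(result)  # {'project_1': [0, 1, 2], 'project_2': [0, 1, 2]}
```

Items from different producers may interleave on the queue, but each producer's own items arrive in the order it put them, so the per-name lists are stable.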