python - Efficient way to parse a large journalctl file for keyword matches in Python

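When analyzing a journalctl file, the keywords to look for are: error, boot, warning, traceback.
Whenever a keyword is encountered, I need to increment a counter for that keyword and print the matching line.
So I tried the following: read the file, then use re.findall together with a Counter object from the collections module to track the counts: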

import re
from collections import Counter

keywords = [" error ", " boot ", " warning ", " traceback "]

def journal_parser():
    for keyword in keywords:
        print(keyword)  # just for debugging
        word = re.findall(keyword, open("/tmp/journal_slice.log").read().lower())
        count = dict(Counter(word))
        print(count)

The solution above solves my problem, but I would like a more efficient approach, if there is one.
Please advise.

Best answer

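Here is a more efficient approach. It reads the file once and matches all keywords in a single regex pass instead of rescanning the file for each keyword: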

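def journal_parser():
    # Read the file once; the with block closes it when done.
    with open("/tmp/journal_slice.log") as f:
        data = f.read()
    # Join the keywords into one alternation so the text is scanned once;
    # re.I makes the match case-insensitive.
    words = re.findall("|".join(keywords), data, re.I)
    # Lowercase each hit so " Error " and " error " share one counter entry.
    count = dict(Counter(word.lower() for word in words))
    print(count)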
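
The question also asks to print each matching line, and for a genuinely large file it may be preferable not to load everything with read(). A minimal sketch of a line-by-line variant follows. The names scan_journal, on_match, KEYWORDS and PATTERN are hypothetical and not part of the original answer. It also swaps the space-padded keywords for \b word boundaries, which is an assumption about what should count as a match: "ERROR:" or a keyword at the start of a line is matched, while "rebooting" is not.

```python
import re
from collections import Counter

# Hypothetical names for illustration; not part of the original answer.
KEYWORDS = ("error", "boot", "warning", "traceback")

# Assumption: \b word boundaries replace the space-padded keywords, so
# "ERROR:" or a keyword at the start of a line is also counted.
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")\b",
    re.IGNORECASE,
)


def scan_journal(lines, on_match=print):
    """Count keyword hits per keyword and report every matching line."""
    counts = Counter()
    for line in lines:
        hits = PATTERN.findall(line)
        if hits:
            # Normalize case so "Error" and "error" share one counter entry.
            counts.update(hit.lower() for hit in hits)
            on_match(line.rstrip("\n"))
    return counts


def journal_parser(path="/tmp/journal_slice.log"):
    # Iterating over the file object yields one line at a time, so the
    # whole journal is never held in memory at once.
    with open(path, errors="replace") as f:
        counts = scan_journal(f)
    print(dict(counts))
    return counts
```

Calling journal_parser() prints each matching line as it is found and the per-keyword totals at the end. Whether this is faster than a single findall over the whole file depends on the file size and available memory, so it is worth timing both on a real journal slice.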

Regarding python - Efficient way to parse a large journalctl file for keyword matches in Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49772156/