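So, I recently started learning Python. At work, we wanted a way to simplify searching log files for specific keywords, to make it easier to determine which IPs to add to our block list.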

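I decided to write a Python script that takes in a log file and a file containing a list of key terms, looks for those key terms in the log file, and then writes every line whose session ID matches a line where a key term was found to a new file.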

import sys
import time
import linecache
from datetime import datetime

def timeStamped(fname, fmt='%Y-%m-%d-%H-%M-%S_{fname}'):
    return datetime.now().strftime(fmt).format(fname=fname)

importFile = open('rawLog.txt', 'r') #pulling in log file
importFile2 = open('keyWords.txt', 'r') #pulling in keywords
exportFile = open(timeStamped('ParsedLog.txt'), 'w') #writing the parsed log

FILE = importFile.readlines()
keyFILE = importFile2.readlines()

logLine = 1  #for debugging purposes when testing
parseString = ''
holderString = ''
sessionID = []
keyWords= []
j = 0

for line in keyFILE: #go through each line in the keyFile
        keyWords = line.split(',') #add each word to the array

print(keyWords)#for debugging purposes when testing, this DOES give all the correct results


for line in FILE:
        if keyWords[j] in line:
                parseString = line[29:35] #pulling in session ID
                sessionID.append(parseString) #saving session IDs to a list
        elif importFile == '' and j < len(keyWords):  #if importFile is at end of file and we are not at the end of the array
                importFile.seek(0) #goes back to the start of the file
                j+=1        #advance the keyWords array

        logLine +=1 #for debugging purposes when testing
importFile2.close()
print(sessionID) #for debugging purposes when testing



importFile.seek(0) #goes back to the start of the file


i = 0
for line in FILE:
        if sessionID[i] in line[29:35]: #checking if the sessionID matches (doing it this way since I ran into issues where some sessionIDs matched parts of the log file that were not sessionIDs
                holderString = line #pulling the line of log file
                exportFile.write(holderString)#writing the log file line to a new text file
                print(holderString) #for debugging purposes when testing
                if i < len(sessionID):
                    i+=1

importFile.close()
exportFile.close()


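It is not iterating over my keyWords list. I have probably made some stupid rookie mistake, but I don't have enough experience to see where I messed up. When I check the output, it only searches rawLog.txt for the first item in the keyWords list.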

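The third loop does return results based on the session IDs pulled by the second loop, and it does attempt to iterate (this raises an out-of-range exception because i is never less than the length of the sessionID list, since the sessionID list only contains 1 value).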

The program does successfully write and name the new log file, using the date and time followed by ParsedLog.txt.

Best answer

It looks to me like your second loop needs an inner loop instead of an inner if statement. For example:

for line in FILE:
    for word in keyWords:
        if word in line:
            parseString = line[29:35] #pulling in session ID
            sessionID.append(parseString) #saving session IDs to a list
            break # Assuming there will only be one keyword per line, else remove this
    logLine +=1 #for debugging purposes when testing
importFile2.close()
print(sessionID) #for debugging purposes when testing


That is, assuming I understood correctly.
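
Two other spots in the posted script look like they would cause trouble too. The first loop reassigns keyWords on every line of keyWords.txt, so only the last line survives (and its final entry keeps a trailing newline). The third loop only ever compares against sessionID[i], and i can reach len(sessionID), which would explain the out-of-range exception. Below is a minimal sketch of the same three steps. The function names and SESSION_SLICE are hypothetical, made up for illustration, and the sketch assumes the session ID really does sit at columns 29 to 35 as in the question's line[29:35].

```python
# Hypothetical helper names; only the 29:35 slice comes from the question.
SESSION_SLICE = slice(29, 35)  # session ID columns, as in line[29:35]


def load_keywords(lines):
    """Gather comma-separated keywords from every line, not just the last."""
    keywords = []
    for line in lines:
        for word in line.split(','):
            word = word.strip()  # drops spaces and the trailing newline
            if word:
                keywords.append(word)
    return keywords


def collect_session_ids(log_lines, keywords):
    """Return the set of session IDs found on lines containing any keyword."""
    session_ids = set()
    for line in log_lines:
        if any(word in line for word in keywords):
            session_id = line[SESSION_SLICE]
            if session_id:  # skip lines too short to hold a session ID
                session_ids.add(session_id)
    return session_ids


def filter_by_session(log_lines, session_ids):
    """Return every log line whose session ID column is in session_ids."""
    return [line for line in log_lines if line[SESSION_SLICE] in session_ids]
```

With the lines of rawLog.txt and keyWords.txt read in as before, exportFile.writelines(filter_by_session(FILE, collect_session_ids(FILE, load_keywords(keyFILE)))) would replace the second and third loops. Using a set means each line is checked against all session IDs at once, so no index counter is needed.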

Regarding "python - Python script not iterating through array", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28843870/
