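I'm new to Python programming and am doing my best to learn file I/O.
I'm currently writing a simple program that reads from a text document and prints out the results. So far I've been able to build it with the help of many resources and questions on this site.
However, I'm curious how I can read multiple separate strings from a text document and save the resulting strings to a text document.
The program below is one I made; it lets me search a text document for keywords and write whatever lies between those keywords to another text file. However, I can only set one pair of start and end keywords per search: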
import re  # needed for re.search and re.escape below
from Tkinter import *
import tkSimpleDialog
import tkMessageBox
from tkFileDialog import askopenfilename

root = Tk()
w = Label(root, text="Configuration Inspector")
w.pack()
tkMessageBox.showinfo("Welcome", "This is version 1.00 of Configuration Inspector Text")
filename = askopenfilename()  # Data Search Text File
outputfilename = askopenfilename()  # Output Text File
with open(filename, "rb") as f_input:
    start_token = tkSimpleDialog.askstring("Serial Number", "What is the device serial number?")
    end_token = tkSimpleDialog.askstring("End Keyword", "What is the end keyword")
    reText = re.search("%s(.*?)%s" % (re.escape(start_token + ",SHOWALL"), re.escape(end_token)), f_input.read(), re.S)
if reText:
    output = reText.group(1)
    fo = open(outputfilename, "wb")
    fo.write(output)
    fo.close()
    print output
else:
    tkMessageBox.showinfo("Output", "Sorry that input was not found in the file")
    print "not found"
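So what the program does is let the user select a text document, search it for a start keyword and an end keyword, and write everything between those two keywords to a new text document.
What I want to achieve is to let the user select a text document, search it for multiple sets of keywords, and write the results to the same output text file.
In other words, say I have the following text document: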
something something something something
something something something something STARTkeyword1 something
data1
data2
data3
data4
data5
ENDkeyword1
something something something something
something something something something STARTkeyword2 something
data1
data2
data3
data4
data5
Data6
ENDkeyword2
something something something something
something something something something STARTkeyword3 something
data1
data2
data3
data4
data5
data6
data7
data8
ENDkeyword3
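I want to be able to search this text document with 3 different start keywords and 3 different end keywords, and write whatever is in between to the same output text file.
So, for example, my output text document would look like this: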
something
data1
data2
data3
data4
data5
ENDkeyword1
something
data1
data2
data3
data4
data5
Data6
ENDkeyword2
something
data1
data2
data3
data4
data5
data6
data7
data8
ENDkeyword3
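One brute-force approach I tried was a loop that has the user enter one new keyword at a time. However, whenever I try to write to the same output file, it overwrites the previous entries despite using append. Is there any way to let the user search a text document for multiple strings and write out multiple results, with or without a loop?
----------------- EDIT:
Thank you all very much; with your tips I'm getting closer to a polished final version. Here is my current code: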
def process(infile, outfile, keywords):
    keys = [ [k[0], k[1], 0] for k in keywords ]
    endk = None
    with open(infile, "rb") as fdin:
        with open(outfile, "wb") as fdout:
            for line in fdin:
                if endk is not None:
                    fdout.write(line)
                    if line.find(endk) >= 0:
                        fdout.write("\n")
                        endk = None
                else:
                    for k in keys:
                        index = line.find(k[0])
                        if index >= 0:
                            fdout.write(line[index + len(k[0]):].lstrip())
                            endk = k[1]
                            k[2] += 1
    if endk is not None:
        raise Exception(endk + " not found before end of file")
    return keys

from Tkinter import *
import tkSimpleDialog
import tkMessageBox
from tkFileDialog import askopenfilename

root = Tk()
w = Label(root, text="Configuration Inspector")
w.pack()
tkMessageBox.showinfo("Welcome", "This is version 1.00 of Configuration Inspector ")
infile = askopenfilename() #
outfile = askopenfilename() #
start_token = tkSimpleDialog.askstring("Serial Number", "What is the device serial number?")
end_token = tkSimpleDialog.askstring("End Keyword", "What is the end keyword")
process(infile, outfile, ((start_token + ",SHOWALL", end_token),))
So far it works, but now comes the part where I get lost: multiple-string input separated by a delimiter. So if I entered
STARTKeyword1,STARTKeyword2,STARTKeyword3,STARTKeyword4
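at the program prompt, I want to be able to split those keywords apart and pass them to the
process(infile, outfile, keywords)
function, so that the user is prompted only once and multiple strings can be searched for in the file. I was thinking of using a loop, or of splitting the input into an array.
If this question has drifted too far from the original one, I suppose I will close this one and open another, so that I can give credit where credit is due.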
Best answer
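I would use a separate function that takes:
the path of the input file
the path of the output file
an iterable containing (startkeyword, endkeyword) pairs
I would then copy the file line by line whenever the current line lies between a start and an end keyword, and count how many times each pair is found. That way, the caller knows which pairs were found and how many times each one occurred.
Here is a possible implementation: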
def process(infile, outfile, keywords):
    '''Search through infile for whatever is between a pair startkeyword (excluded)
    and endkeyword (included). Each chunk is copied to outfile and followed by
    an empty line.
    infile and outfile are strings representing file paths
    keywords is an iterable containing pairs (startkeyword, endkeyword)
    Raises an exception if an endkeyword is not found before end of file
    Returns a list of lists [startkeyword, endkeyword, nb of occurrences]'''
    keys = [ [k[0], k[1], 0] for k in keywords ]
    endk = None  # end keyword currently being waited for, or None
    with open(infile, "r") as fdin:
        with open(outfile, "w") as fdout:
            for line in fdin:
                if endk is not None:
                    # inside a chunk: copy the line, stop at the end keyword
                    fdout.write(line)
                    if line.find(endk) >= 0:
                        fdout.write("\n")
                        endk = None
                else:
                    # outside a chunk: look for any start keyword
                    for k in keys:
                        index = line.find(k[0])
                        if index >= 0:
                            fdout.write(line[index + len(k[0]):].lstrip())
                            endk = k[1]
                            k[2] += 1
    if endk is not None:
        raise Exception(endk + " not found before end of file")
    return keys
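
To feed several keyword sets into process from a single prompt, the delimited input can be split into pairs first. The following is only a minimal sketch and not part of the implementation above: the helper name parse_keyword_pairs, the comma separator, and the rule that a single end keyword is shared by every start keyword are all assumptions made for illustration.

```python
def parse_keyword_pairs(starts_text, ends_text, sep=","):
    """Build (startkeyword, endkeyword) pairs from two delimited strings.

    Hypothetical helper: assumes no keyword contains the separator.
    If only one end keyword is given, it is paired with every start keyword.
    """
    # Split on the separator, trim surrounding whitespace, drop empty entries.
    starts = [s.strip() for s in starts_text.split(sep) if s.strip()]
    ends = [e.strip() for e in ends_text.split(sep) if e.strip()]
    if len(ends) == 1:
        ends = ends * len(starts)
    if len(starts) != len(ends):
        raise ValueError("expected %d end keywords, got %d"
                         % (len(starts), len(ends)))
    return list(zip(starts, ends))


# Example input as it might come back from a single askstring prompt each.
pairs = parse_keyword_pairs(
    "STARTkeyword1, STARTkeyword2,STARTkeyword3",
    "ENDkeyword1,ENDkeyword2,ENDkeyword3",
)
```

The resulting list can be passed directly as the keywords argument, for example process(infile, outfile, pairs), so the user is prompted only once. Because process opens the output file a single time and writes every chunk through the same handle, results from earlier keyword pairs are not overwritten by later ones.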