I have a large number of zip files (700+), each containing a file with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<Values version="2.0">
<record name="trigger">
    <value name="uniqueId">6xjUCpDlrTVHRsEVmxx0Ews6ni8=</value>
    <value name="processingSuspended">false</value>
    <value name="retrievalSuspended">false</value>
</record>
<record name="trigger">
    <value name="uniqueId">6xjUCpDlrTVHRsEVmxx0Ews6ni8=</value>
    <value name="processingSuspended">false</value>
    <value name="retrievalSuspended">false</value>
</record>
</Values>


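What I want to do is replace the value of the first occurrence of the fields processingSuspended and retrievalSuspended with false, whether it is currently true or false. Only the first occurrence of each field should change.

Edit:

As requested, here is what I have so far. With it I can read the fields I want, but I am sure there is a simpler way to do this: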

import glob
import re
import zipfile

# Captures the text between the opening and closing tag on one line
VALUE_RE = re.compile(r'>(.*?)<')

def main():
    rList = []
    for z in glob.glob("*.zip"):
        with zipfile.ZipFile(z) as root:
            for filename in root.namelist():
                if "node.ndf" not in filename:
                    continue
                # read() returns bytes; the files declare UTF-8
                text = root.read(filename).decode("utf-8")
                # Only files that mention broker-trigger are of interest
                if "broker-trigger" not in text:
                    continue
                values = {}
                for line in text.splitlines():
                    match = VALUE_RE.search(line)
                    if match is None:
                        continue
                    # match Processing state
                    if "processingSuspended" in line:
                        values["processingSuspended"] = match.group(1)
                    # match Retrieval state; it completes one pair of values
                    elif "retrievalSuspended" in line:
                        values["retrievalSuspended"] = match.group(1)
                        rList.append(values)
                        values = {}
    print(rList)

if __name__ == "__main__":
    main()


Thanks in advance.

Best answer

Try using lxml:

>>> # bytes literal: on Python 3, lxml rejects str input that carries an encoding declaration
>>> xml = b'''\
<?xml version="1.0" encoding="UTF-8"?>
<Values version="2.0">
<record name="trigger">
    <value name="uniqueId">6xjUCpDlrTVHRsEVmxx0Ews6ni8=</value>
    <value name="processingSuspended">true</value>
    <value name="retrievalSuspended">true</value>
</record>
<record name="trigger">
    <value name="uniqueId">6xjUCpDlrTVHRsEVmxx0Ews6ni8=</value>
    <value name="processingSuspended">true</value>
    <value name="retrievalSuspended">true</value>
</record>
</Values>\
'''

>>> from lxml import etree
>>> tree = etree.fromstring(xml)
>>> tree.xpath('//value[@name="processingSuspended"]')[0].text = 'false'
>>> tree.xpath('//value[@name="retrievalSuspended"]')[0].text = 'false'


The XPath expression '//value[@name="processingSuspended"]' finds every value tag whose name attribute equals "processingSuspended". We then take the first match with [0] and change that tag's text to 'false'.
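Indexing with [0] raises an IndexError when a file has no matching field. As a minimal sketch of a guarded variant, the snippet below swaps in the standard library's xml.etree.ElementTree instead of lxml, whose find() returns the first match in document order or None. The helper name set_first and the shortened sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened sample, assumed to mirror the structure shown in the question
SAMPLE = """<Values version="2.0">
<record name="trigger">
    <value name="processingSuspended">true</value>
</record>
<record name="trigger">
    <value name="processingSuspended">true</value>
</record>
</Values>"""


def set_first(root, field, new_text="false"):
    """Set the text of the first <value name=field> element, if any.

    Returns True when an element was changed, False when nothing matched.
    """
    # find() returns the first match in document order, or None
    node = root.find(".//value[@name='%s']" % field)
    if node is None:
        return False
    node.text = new_text
    return True


root = ET.fromstring(SAMPLE)
changed = set_first(root, "processingSuspended")  # a match exists
missing = set_first(root, "noSuchField")          # no match, nothing changes
texts = [v.text for v in root.iter("value")]
```

In the lxml session above both fields are present, so the plain [0] indexing is enough there.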

Output:

>>> print(etree.tostring(tree, pretty_print=True).decode())
<Values version="2.0">
<record name="trigger">
    <value name="uniqueId">6xjUCpDlrTVHRsEVmxx0Ews6ni8=</value>
    <value name="processingSuspended">false</value>
    <value name="retrievalSuspended">false</value>
</record>
<record name="trigger">
    <value name="uniqueId">6xjUCpDlrTVHRsEVmxx0Ews6ni8=</value>
    <value name="processingSuspended">true</value>
    <value name="retrievalSuspended">true</value>
</record>
</Values>

>>>
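The session above works on a string, while the question is about files inside zip archives. The zipfile module cannot modify a member in place, so one possible approach is to copy every member into a new archive and patch the XML on the way. The following is only a sketch under assumptions: the member name "node.ndf" is taken from the question's attempt, the XML shown is assumed to be that member's content, the standard library's xml.etree.ElementTree stands in for lxml, and patch_xml and patch_zip are hypothetical helper names.

```python
import os
import tempfile
import zipfile
import xml.etree.ElementTree as ET

FIELDS = ("processingSuspended", "retrievalSuspended")


def patch_xml(data):
    """Return XML bytes with the first occurrence of each field set to 'false'."""
    root = ET.fromstring(data)
    for field in FIELDS:
        # find() yields the first match in document order, or None
        node = root.find(".//value[@name='%s']" % field)
        if node is not None:
            node.text = "false"
    # Serializing with an explicit encoding returns bytes with an XML declaration
    return ET.tostring(root, encoding="utf-8")


def patch_zip(path, member_name="node.ndf"):
    """Rewrite the archive at path, patching members whose name contains member_name."""
    folder = os.path.dirname(os.path.abspath(path))
    # Write next to the original so the final rename stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(suffix=".zip", dir=folder)
    os.close(fd)
    try:
        with zipfile.ZipFile(path) as src, \
                zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                data = src.read(info.filename)
                if member_name in info.filename:
                    data = patch_xml(data)
                # Passing the ZipInfo keeps the member's name and timestamp
                dst.writestr(info, data)
        # Swap the patched archive in only after it was written completely
        os.replace(tmp_path, path)
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling patch_zip(z) for each z in glob.glob("*.zip") would cover all the archives. ElementTree may serialize the document slightly differently from the original (for example the quoting in the XML declaration, and comments are dropped), so trying it on a copy of one archive first seems prudent.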

This question, "python - replace only the first occurrence of a field/word in a file", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/17577885/
