iterparse throws "no element found: line 1, column 0" and I'm not sure why

Problem description

I have a network application (using Twisted) that receives chunks of XML over the internet (the entire XML message may not arrive in a single packet). My plan is to build up the XML message gradually as it is received. I've "settled" on iterparse from xml.etree.ElementTree. I've been experimenting with some code, and the following (non-Twisted) code works fine:

import xml.etree.ElementTree as etree
from io import StringIO

buff = StringIO(u'<notorious><burger/></notorious>')  # io.StringIO needs text; a u'' literal works on Python 2 and 3

for event, elem in etree.iterparse(buff, events=('end',)):
    if elem.tag == 'notorious':
        print(etree.tostring(elem))

Then I built the following code to simulate how the data may eventually be received on my end:

import xml.etree.ElementTree as etree
from io import StringIO

chunks = [u'<notorious>', u'<burger/>', u'</notorious>']
buff = StringIO()

for ch in chunks:
    buff.write(ch)
    if buff.getvalue() == '<notorious><burger/></notorious>':
        print("it should work now")
    try:
        for event, elem in etree.iterparse(buff, events=('end',)):
            if elem.tag == 'notorious':
                print(etree.tostring(elem))
    except Exception as e:
        print(e)

But the code spits out:

no element found: line 1, column 0

I can't wrap my head around it. Why does that error occur when the StringIO in the second sample has the same contents as the StringIO in the first code sample?

ps:

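  1. I know I'm not the first to ask this question, but no other thread answered it for me. If I'm wrong, please point me to the appropriate thread.
  2. If you have suggestions for using other modules, please don't put them in an answer. Add a comment instead.

Thanks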

Recommended answer

File objects and file-like objects have a file position. Every read or write advances that position. After buff.write(..), the position sits at the end of the buffer, so etree.iterparse finds nothing left to read and reports that no element was found. You need to reset the file position (using <file_object>.seek(..)) before passing the file object to etree.iterparse so that it reads from the beginning of the file.
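The file-position behaviour can be seen without any XML parsing at all. The following is a minimal sketch using only io.StringIO; the variable names are illustrative and not part of the original answer.

```python
from io import StringIO

buff = StringIO()
buff.write(u'<notorious>')

# After a write, the position sits just past the written text.
pos_after_write = buff.tell()

# Reading from that position yields nothing, which is what a parser would see.
read_at_end = buff.read()

# Rewinding to the start makes the written text readable again.
buff.seek(0)
read_from_start = buff.read()

print(pos_after_write, repr(read_at_end), repr(read_from_start))
```

The empty read in the middle is the situation iterparse runs into in the question: the buffer has content, but the reader starts at the end of it.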

...
buff.seek(0) # <-----
for event, elem in etree.iterparse(buff, events=('end',)):
    if elem.tag == 'notorious':
        print(etree.tostring(elem))
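Putting the fix back into the chunk simulation from the question might look like the sketch below. It assumes Python 3; the `complete` and `errors` lists and the extra seek to the end before each write are additions made here for illustration, not part of the original answer.

```python
import xml.etree.ElementTree as etree
from io import StringIO

chunks = [u'<notorious>', u'<burger/>', u'</notorious>']
buff = StringIO()
complete = []  # serialized <notorious> elements (illustrative)
errors = []    # parse errors seen while the message is still incomplete (illustrative)

for ch in chunks:
    buff.seek(0, 2)   # move to the end so the new chunk is appended
    buff.write(ch)
    buff.seek(0)      # rewind so iterparse reads from the beginning
    try:
        for event, elem in etree.iterparse(buff, events=('end',)):
            if elem.tag == 'notorious':
                complete.append(etree.tostring(elem))
    except etree.ParseError as e:
        # Expected until the closing tag has arrived.
        errors.append(str(e))

print(complete)
print(errors)
```

With this change a ParseError is still raised for the first two chunks, because the document is not yet complete at those points, and the full element is only produced once the closing tag arrives. Note that this approach re-parses the whole buffer after every chunk, which is probably acceptable for small messages but gets more expensive as messages grow.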

