This article covers how to parse XML into a table using Python.

Question

I'm trying to parse XML into a table-like structure in Python. Imagine XML like this:

<?xml version="1.0" encoding="UTF-8"?>
<base>
  <element1>element 1</element1>
  <element2>element 2</element2>
  <element3>
    <subElement3>subElement 3</subElement3>
  </element3>
</base>

I would like a result like this:

KEY                       | VALUE
base.element1             | "element 1"
base.element2             | "element 2"
base.element3.subElement3 | "subElement 3"

I've tried using xml.etree.cElementTree, and then the functions described in "How to convert an xml string to a dictionary in Python?"

Is there any function that can do this? All the answers I found are written for one particular XML schema and would need to be edited for each new schema. For reference, in R this is easy with the XML and xml2 packages and the xmlToList function.

Recommended answer

I got the desired result using the following script.

XML file:

<?xml version="1.0" encoding="UTF-8"?>
<base>
  <element1>element 1</element1>
  <element2>element 2</element2>
  <element3>
    <subElement3>subElement 3</subElement3>
  </element3>
</base>

Python code:

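import pandas as pd
from lxml import etree

# Path to the XML file shown above
data = "C:/Path/test.xml"

tree = etree.parse(data)

lstKey = []
lstValue = []
for p in tree.iter():
    # Skip container elements such as <base> and <element3>: their text is
    # only indentation whitespace, and the desired table lists leaves only
    if len(p) > 0:
        continue
    # getpath() returns e.g. "/base/element3/subElement3"; convert it to a
    # dotted key (repeated sibling tags get an index, e.g. "item[2]")
    lstKey.append(tree.getpath(p).replace("/", ".")[1:])
    lstValue.append(p.text)

df = pd.DataFrame({'key': lstKey, 'value': lstValue})
# sort_values() returns a new DataFrame, so the result must be assigned
df = df.sort_values('key')
print(df)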
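
For cases where lxml and pandas are not available, the same idea can be sketched with the standard library alone. This is a minimal illustration, not the answer's own method: it uses xml.etree.ElementTree (the cElementTree alias mentioned in the question was removed in Python 3.9), and the names `flatten` and `SAMPLE_XML` are made up for this example. It walks the tree recursively, builds the dotted key from the tag names, and keeps only leaf elements. The XML declaration is left out of the inline string for simplicity.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline copy of the sample document from the question
SAMPLE_XML = """<base>
  <element1>element 1</element1>
  <element2>element 2</element2>
  <element3>
    <subElement3>subElement 3</subElement3>
  </element3>
</base>"""


def flatten(element, prefix=""):
    """Yield (dotted_key, text) pairs for every leaf element under `element`."""
    key = f"{prefix}.{element.tag}" if prefix else element.tag
    children = list(element)
    if not children:
        # Leaf element: emit its text (an empty string if it has none)
        yield key, (element.text or "").strip()
        return
    # Container element: descend, extending the dotted key
    for child in children:
        yield from flatten(child, key)


rows = list(flatten(ET.fromstring(SAMPLE_XML)))
for key, value in rows:
    print(f"{key} | {value!r}")
```

Because the key is built from tag names only, repeated sibling tags would produce duplicate keys; a real schema with lists of elements would need an index added to the key, much as `getpath()` does in the lxml version. Attributes are also ignored in this sketch.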


07-31 14:01