I'm very new to Python. I've searched a lot but couldn't find a solution. I want to parse the following XML file into a CSV file.
<List>
  <item>
    <id>5939c5e20d82880efce93933</id>
    <sensorEvents>
      <sensorEvents>
        <avgSped>48.55647532226298</avgSped>
        <completed>true</completed>
      </sensorEvents>
      <sensorEvents>
        <avgSped>39.53368357145088</avgSped>
        <completed>true</completed>
      </sensorEvents>
      <sensorEvents>
        <avgSped>41.41160105233052</avgSped>
        <completed>true</completed>
      </sensorEvents>
    </sensorEvents>
  </item>
  .
  .
  .
  .
</List>
The code I wrote is:
import xml.etree.ElementTree as ET
import csv

tree = ET.parse("my_xml_file.xml")
root = tree.getroot()

f = open('my_csv_file.csv', 'w')
csvwriter = csv.writer(f)

head = ['ID','avgSped','completed']
csvwriter.writerow(head)

for Item in root.findall('item'):
    for Sensorevents in Item.findall('sensorEvents'):
        row = []
        id_ = Item.find('id').text
        row.append(id_)
        avgSped_ = Sensorevents.find('sensorEvents').find('avgSped').text
        row.append(avgSped_)
        completed_ = Sensorevents.find('sensorEvents').find('completed').text
        row.append(completed_)
        csvwriter.writerow(row)
f.close()
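The result is this: there are 3 inner sensorEvents elements, but my code captures only the first one. How can I modify the code to read all of them?
Any help is much appreciated.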
Best answer
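Because your <sensorEvents> tag contains three child tags that are also named <sensorEvents>, the outer <sensorEvents> hides the inner ones: findall('sensorEvents') matches only direct children of the element it is called on.
This means that
for Sensorevents in Item.findall('sensorEvents'):
loops only once, over the single outer element: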
<sensorEvents>
  <sensorEvents>
    <avgSped>48.55647532226298</avgSped>
    <completed>true</completed>
  </sensorEvents>
  <sensorEvents>
    <avgSped>39.53368357145088</avgSped>
    <completed>true</completed>
  </sensorEvents>
  <sensorEvents>
    <avgSped>41.41160105233052</avgSped>
    <completed>true</completed>
  </sensorEvents>
</sensorEvents>
Then
avgSped_ = Sensorevents.find('sensorEvents').find('avgSped').text
row.append(avgSped_)
completed_ = Sensorevents.find('sensorEvents').find('completed').text
only fetches the data of the first inner tag, because find() returns just the first match.
You should try:
for Item in root.findall('item'):
    for root_Sensorevents in Item.findall('sensorEvents'):
        for Sensorevents in root_Sensorevents.findall('sensorEvents'):
            ...
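
For completeness, here is a minimal sketch of how the nested loops might be filled in. It is only an illustration under a few assumptions: the XML is inlined as a string (SAMPLE_XML, a hypothetical name) with a single item, the helper collect_rows is made up for this example, and the CSV is written to an in-memory buffer instead of my_csv_file.csv so the snippet runs on its own.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Inline copy of the sample from the question (one <item> only, an assumption
# made so the example is self-contained).
SAMPLE_XML = """<List>
  <item>
    <id>5939c5e20d82880efce93933</id>
    <sensorEvents>
      <sensorEvents>
        <avgSped>48.55647532226298</avgSped>
        <completed>true</completed>
      </sensorEvents>
      <sensorEvents>
        <avgSped>39.53368357145088</avgSped>
        <completed>true</completed>
      </sensorEvents>
      <sensorEvents>
        <avgSped>41.41160105233052</avgSped>
        <completed>true</completed>
      </sensorEvents>
    </sensorEvents>
  </item>
</List>"""


def collect_rows(root):
    """Return one [id, avgSped, completed] row per inner <sensorEvents>."""
    rows = []
    for item in root.findall('item'):
        item_id = item.findtext('id')
        # Outer <sensorEvents>: the wrapper directly under <item>.
        for outer in item.findall('sensorEvents'):
            # Inner <sensorEvents>: the actual events inside the wrapper.
            for event in outer.findall('sensorEvents'):
                rows.append([
                    item_id,
                    event.findtext('avgSped'),
                    event.findtext('completed'),
                ])
    return rows


root = ET.fromstring(SAMPLE_XML)
rows = collect_rows(root)

# Write to an in-memory buffer; a real script would open a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['ID', 'avgSped', 'completed'])
writer.writerows(rows)
print(buffer.getvalue())
```

To write to a real file, the buffer could be swapped for something like open('my_csv_file.csv', 'w', newline=''), ideally inside a with block so the file is closed automatically. The two inner loops could probably also be collapsed into a single item.findall('sensorEvents/sensorEvents'), since ElementTree accepts simple paths.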