The XML file (test.xml):
<?xml version='1.0' encoding='utf-8'?>
<!--this is a test about xml-->
<booklist type='scicence and enginerring'>
<book category='math'>
<title>learing math</title>
<title>learing math1</title>
<author>zhagn san</author>
<pageNumber>562</pageNumber>
</book>
<book category='python'>
<title>learing Python</title>
<author>li si</author>
<pageNumber>544</pageNumber>
</book>
</booklist>
The following code shows the nodes at each level of the tree and what they contain:
#coding=utf-8
from xml.dom.minidom import parse
DOMTree=parse(r"d:\test.xml")
booklist=DOMTree.documentElement
print booklist
print "*"*30
books=booklist.getElementsByTagName('book')
print "books:",books
print "books[0].childNodes:",books[0].childNodes
print "books[0].childNodes[1]:",books[0].childNodes[1]
print "books[0].childNodes[1].childNodes:",books[0].childNodes[1].childNodes
print "books[0].childNodes[1].childNodes[0]:",books[0].childNodes[1].childNodes[0]
print "books[0].childNodes[1].childNodes[0].data:",books[0].childNodes[1].childNodes[0].data
# Uncommenting the next line raises IndexError, because the title element has only one child node
#print "books[0].childNodes[1].childNodes[1]:",books[0].childNodes[1].childNodes[1]
Explanation:
# books is the list of all book elements found under the booklist element
books: [<DOM Element: book at 0x28855d0>, <DOM Element: book at 0x2885990>]
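# books[0].childNodes: all child nodes of the first book element, as a list. The whitespace between tags (newline plus indentation) shows up as Text nodes, which is why the elements sit at the odd indexes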
books[0].childNodes: [<DOM Text node "u'\n '">, <DOM Element: title at 0x28856e8>, <DOM Text node "u'\n '">, <DOM Element: title at 0x2885788>, <DOM Text node "u'\n '">, <DOM Element: author at 0x2885828>, <DOM Text node "u'\n '">, <DOM Element: pageNumber at 0x28858c8>, <DOM Text node "u'\n '">]
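The whitespace Text nodes in the list above are easy to trip over when indexing. A minimal sketch of filtering them out, written for Python 3 (the notes themselves use Python 2) and using an inline copy of the first book so it runs without d:\test.xml; the helper name element_children is an assumption for illustration, not part of minidom:

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Inline copy of the first <book> from test.xml (assumed indentation),
# so the sketch does not depend on the file on disk.
XML = """<booklist>
  <book category='math'>
    <title>learing math</title>
    <title>learing math1</title>
    <author>zhagn san</author>
    <pageNumber>562</pageNumber>
  </book>
</booklist>"""


def element_children(node):
    """Return only the element children of node, skipping Text nodes."""
    return [child for child in node.childNodes
            if child.nodeType == Node.ELEMENT_NODE]


book = parseString(XML).documentElement.getElementsByTagName("book")[0]
print(len(book.childNodes))                                 # 9
print([child.tagName for child in element_children(book)])  # ['title', 'title', 'author', 'pageNumber']
```

With the filter in place, element_children(book)[0] is the first title element, instead of having to remember that it sits at childNodes[1].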
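# books[0].childNodes[1]: the second child node of the first book element (index 0 is a whitespace Text node). It is the first title element, <title>learing math</title>, which comprises the tag and its child node, a text node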
books[0].childNodes[1]: <DOM Element: title at 0x28856e8>
# books[0].childNodes[1].childNodes: the child nodes of that title element (a single text node), as a list
books[0].childNodes[1].childNodes: [<DOM Text node "u'learing ma'...">]
# books[0].childNodes[1].childNodes[0]: the first item in that list, i.e. the text node itself
books[0].childNodes[1].childNodes[0]: <DOM Text node "u'learing ma'...">
# books[0].childNodes[1].childNodes[0].data: the string value held by that text node
books[0].childNodes[1].childNodes[0].data: learing math
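Reading .data from childNodes[0] works here because the title element has exactly one text child. A small sketch of a more defensive reader, in Python 3; text_of is a hypothetical helper name, and the empty author element is added only to show the no-text case:

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def text_of(element):
    """Join the data of all direct Text and CDATA children of element."""
    return "".join(
        child.data for child in element.childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )


doc = parseString("<book><title>learing math</title><author/></book>")
title = doc.getElementsByTagName("title")[0]
author = doc.getElementsByTagName("author")[0]

print(text_of(title))         # learing math
print(repr(text_of(author)))  # ''
```

Unlike childNodes[0].data, this returns an empty string for an element without text instead of raising IndexError.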
# books[0].childNodes[1].childNodes[1]: tries to fetch a second item from the title element's child list, but raises an error, which shows that the list holds only one item
books[0].childNodes[1].childNodes[1]:
Traceback (most recent call last):
File "task_test.py", line 17, in <module>
print "books[0].childNodes[1].childNodes[1]:",books[0].childNodes[1].childNodes[1]
IndexError: list index out of range
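The IndexError above can be avoided by checking the length of the child list before indexing. A minimal Python 3 sketch; child_at is an assumed helper name, not a minidom function:

```python
from xml.dom.minidom import parseString


def child_at(node, index):
    """Return node.childNodes[index], or None when index is out of range."""
    children = node.childNodes
    return children[index] if 0 <= index < len(children) else None


title = parseString("<title>learing math</title>").documentElement

print(len(title.childNodes))    # 1
print(child_at(title, 0).data)  # learing math
print(child_at(title, 1))       # None
```

Returning None keeps the caller in control: it can test the result instead of wrapping every lookup in try/except.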
From this we can see:
Any expression ending in .childNodes evaluates to a list.
Any expression ending in .childNodes[i] evaluates to a single node.
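The rule can be checked directly with isinstance. A short Python 3 sketch on a one-line document without whitespace between tags (an assumption that keeps the element at index 0):

```python
from xml.dom.minidom import parseString, Element, Text

doc = parseString("<book><title>learing math</title></book>")
book = doc.documentElement

children = book.childNodes    # ends in .childNodes: a list-like NodeList
first = book.childNodes[0]    # ends in .childNodes[i]: a single node
text = first.childNodes[0]    # the text node inside <title>

print(isinstance(children, list))  # True
print(isinstance(first, Element))  # True
print(isinstance(text, Text))      # True
```

Note that the single node is not always an element: depending on the position it can be an Element or a Text node, as the last two checks show.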
Full output of the run:
c:\Python27\Scripts>python task_test.py
<DOM Element: booklist at 0x28854b8>
******************************
books: [<DOM Element: book at 0x28855d0>, <DOM Element: book at 0x2885990>]
books[0].childNodes: [<DOM Text node "u'\n '">, <DOM Element: title at 0x28856e8>, <DOM Text node "u'\n '">, <DOM Element: title at 0x2885788>, <DOM Text node "u'\n '">, <DOM Element: author at 0x2885828>, <DOM Text node "u'\n '">, <DOM Element: pageNumber at 0x28858c8>, <DOM Text node "u'\n '">]
books[0].childNodes[1]: <DOM Element: title at 0x28856e8>
books[0].childNodes[1].childNodes: [<DOM Text node "u'learing ma'...">]
books[0].childNodes[1].childNodes[0]: <DOM Text node "u'learing ma'...">
books[0].childNodes[1].childNodes[0].data: learing math
books[0].childNodes[1].childNodes[1]:
Traceback (most recent call last):
File "task_test.py", line 17, in <module>
print "books[0].childNodes[1].childNodes[1]:",books[0].childNodes[1].childNodes[1]
IndexError: list index out of range