问题描述
我需要使用 VBA 从 XML 中获取不同的属性名称.
I need to get the distinct attributes names from the XML using VBA.
这是我的代码.
sub test()
Dim XMLFile As Object
Dim XMLFileName As String
Set XMLFile = CreateObject("Microsoft.XMLDOM")
XMLFileName = "C:UsersInput.xml"
XMLFile.async = False
XMLFile.Load (XMLFileName)
XMLFile.validateOnParse = False
Dim mainnode As Object
Dim node As Object
Set mainnode = XMLFile.SelectNodes("//Elements")
For Each node In mainnode
For Each child In node.ChildNodes
Debug.Print child.BaseName
Dim kiddo As Object
For Each kiddo In child.ChildNodes
Debug.Print kiddo.BaseName
Next kiddo
Next child
Next node
End sub
这是示例 XML.我需要从 XML 中获取属性名称 num
.
Here is the sample XML. I need to get the attribute name num
from the XML.
<Elements>
<Details>
<Name>ABC</Name>
<Address>123ABC</Address>
<College>
<collname>safasf</collname>
<collnumber/>
</College>
</Details>
<Dept num="123">
<Deptname>IT</Deptname>
<ID>A123</ID>
</Dept>
</Elements>
预期结果:
Elements
Details
Name
Address
College
collname
collnumber
Dept
num
Deptname
ID
以上代码的实际结果:
Elements
Details
Name
Address
College
collname
Dept
Deptname
ID
我的代码未获取num"属性和 <collnumber/>
标记.有人可以让我知道如何使用 VBA 从 XML 中获取属性名称和标签名称
The "num" attribute and <collnumber/>
tag is not fetched by my code. Could someone let me know how to fetch the attribute names along with the tag names from XML using VBA
推荐答案
通过递归函数调用显示包含属性的 XML 结构
我的示例代码演示了一种方法
My example code demonstrates a way to
[1]
使用 XMLDOM 方法和 将整个 XML 结构分配给一个二维数组[2]
可选择将其写回工作表.
[1]
assign the entire XML structure to a 2-dim array using XMLDOM methods and[2]
optionally write it back to a sheet.
放大提示:
我添加了这些 ► 结构化提示以提供更多帮助,而不是仅显示代码,正如我所说,其中许多要点也会导致其他用户重复提问:
I added these ► structured hints to offer more help than by displaying code only, as I remarked that many of these points lead to repeated questions by other users, too:
- 试图列出
XML
结构,随着节点元素层次深度的增加,你失去了良好的视野(类型常量 1NODE_ELEMENT
),所以我紧急推荐使用 ► 递归调用. - 此外,您可能没有考虑过节点文本(类型常量 3
NODE_TEXT
)的特殊结构是命名的第一个子父元素 - cf主函数listChildNodes
中的 A. 和 B. 部分.您通过子节点的循环不会区分提到的类型.只需研究引用函数中的注释即可了解详细信息. - 我想您的 XML 文件以所需的处理指令开头,例如
<?xml version="1.0" encoding="utf-8"?>
,这样才能真正识别为XML文件. - 调用过程
DisplayXML()
使用 后期绑定 而不是像您的帖子那样对 MS XML 的早期绑定引用,但使用推荐的 MSXML2 6.0 版强>.它通过它的DocumentElement
(BTW 单个节点元素)和一个引用预定义的 2-暗淡数组
v
. - 版本控制:如果您使用
Set XDoc = CreateObject("MSXML2.DOMDocument")
将XMLFILE
对象设置为内存,通常您是获取较旧的默认版本 (3.0),因此在大多数情况下,最好使用显式Set XDoc = CreateObject("MSXML2.DOMDocument.6.0")
代替(自动包括 XPath). - 如果您不使用
Load
函数来获取True
(文件加载成功)或False
(加载错误)返回,不需要将文件名设置在方括号()中. - 搜索字符串中的 XPath 运算符
//
将返回 任何 级别的任何出现(参见XMLFile.SelectNodes("//Elements")
在你的 OP 中). - 还要考虑使用 XSLT,这是一种专门用于将 XML 文件转换为各种最终使用格式的语言.
- Trying to list
XML
structures you lose good view with increasing hierarchy depth of your node elements (type constant 1NODE_ELEMENT
), so I urgently recommend the use of ► recursive calls as used in this example code. - Furthermore you might have not considered the special construction of node text (type constant 3
NODE_TEXT
) being the first child of a name giving parent element - c.f. sections A. and B. in main functionlistChildNodes
. Your loops through child nodes would not distinguish between the mentioned types. Just study the comments in the cited function for details. - I suppose your XML file starts with a needed processing instruction like e.g.
<?xml version="1.0" encoding="utf-8"?>
, so that it can be actually identified as XML file. - The calling procedure
DisplayXML()
uses late binding instead of early bound reference to MS XML similar to your post, but uses the recommended MSXML2 version 6.0. It calls the main function via itsDocumentElement
<Elements>
(BTW a single node element) and a second argument referring to a predefined 2-dim arrayv
. - Versioning: If you would set your
XMLFILE
object to memory withSet XDoc = CreateObject("MSXML2.DOMDocument")
generally you are getting the older default Version (3.0), so in most cases it's preferrable to use explicitlySet XDoc = CreateObject("MSXML2.DOMDocument.6.0")
instead (including XPath automatically). - If you don't use the
Load
function to get aTrue
(file loaded successfully) orFalse
(load error) back, it is not necessary to set the file name into brackets (). - The XPath operator
//
in search strings would return any occurences at any level (c.f.XMLFile.SelectNodes("//Elements")
in your OP). - Consider also the use of XSLT, a special-purpose language designed to tranform XML files into all kind of end-use formats.
调用过程DisplayXML
提示: 仅使用调用过程中的估计项数(例如 1000)来确定数组的行数就足够了,因为主函数执行 ReDim
(包括双换位)在需要时自动进行.尽管如此,我从一开始就通过 XPath/XMLDOM 表达式 XMLFile.SelectNodes("//*").Length
计算整个文件中的任何项目,在这里添加了确切的项目计数.
Hint: It would sufficient to dimension the array 's row count only with an estimated number of items in the calling procedure (e.g. 1000), as the main function executes a ReDim
(including a double transposition) automatically if needed. Nevertheless I added the exact items count here from the start via XPath/XMLDOM expression XMLFile.SelectNodes("//*").Length
counting any item in the entire file.
Option Explicit ' declaration head of your code module
Sub DisplayXML()
Dim XMLFile As Object
Dim XMLFileName As String
'Set XMLFile = CreateObject("Microsoft.XMLDOM") ' former style not recommended
Set XMLFile = CreateObject("MSXML2.DOMDocument.6.0")
XMLFileName = "C:UsersInput.xml" ' << change to your xml file name
XMLFile.Async = False
XMLFile.ValidateOnParse = False
Debug.Print XMLFile.XML
If XMLFile.Load(XMLFileName) Then
' [1] write xml info to array with exact or assumed items count
Dim v As Variant: ReDim v(1 To XMLFile.SelectNodes("//*").Length, 1 To 2)
listChildNodes XMLFile.DocumentElement, v ' call helper function
' [2] write results to sheet "Dump" ' change to your sheet name
With ThisWorkbook.Worksheets("Dump")
.Range("A:B") = "" ' clear result range
.Range("A1:B1") = Split("XML Tag,Node Value", ",") ' titles
.Range("A2").Resize(UBound(v), UBound(v, 2)) = v ' get 2-dim info array
End With
Else
MsgBox "Load Error " & XMLFileName
End If
Set XMLFile = Nothing
End Sub
在工作表中显示的结构化结果
提示:如果不想要层次缩进或枚举层次层次结构,可以很容易地适配下面的主函数listChildNodes()
.
Hint: If you don't want the level indentation or enumerated Level hierarchy, you can easily adapt the main function listChildNodes()
below.
+----+---------------------+-----------------+
| | A | B |
+----+---------------------+-----------------+
|1 | XML Tag | Node Value |
+----+---------------------+-----------------+
|2 | 0 Elements | |
+----+---------------------+-----------------+
|3 | 1 Details | |
+----+---------------------+-----------------+
|4 | 2 Name | ABC |
+----+---------------------+-----------------+
|5 | 2 Address | 123ABC |
+----+---------------------+-----------------+
|6 | 2 College | |
+----+---------------------+-----------------+
|7 | 3 collname | safasf |
+----+---------------------+-----------------+
|8 | 3 collnumber | |
+----+---------------------+-----------------+
|9 | 1 Dept[@num="123"]| |
+----+---------------------+-----------------+
|10 | 2 Deptname | IT |
+----+---------------------+-----------------+
|11 | 2 ID | A123 |
+----+---------------------+-----------------+
也可以引用一个精确的节点元素,例如通过
It is also possible to refer to a precise node element, e.g. via
listChildNodes XMLFile.DocumentElement.SelectSingleNode("Dept[@num=""123""]"),v, 1, 1 ' starting from item no 1 and Level no 1
这将单独列出指定的节点集:
This would list the indicated node set alone:
+----+---------------------+-----------------+
| | A | B |
+----+---------------------+-----------------+
|1 | XML Tag | Node Value |
+----+---------------------+-----------------+
|2 | 1 Dept[@num="123"]| |
+----+---------------------+-----------------+
|3 | 2 Deptname | IT |
+----+---------------------+-----------------+
|4 | 2 ID | A123 |
+----+---------------------+-----------------+
递归主函数listChildNodes()
循环遍历子节点集合,此函数重复(递归")调用自身(即当前节点对象)并将整个 XML 结构分配给给定的二维数组(第二个参数).此外,它允许缩进并指示层次结构级别.注意这个例子中的数组必须是从 1 开始的.
Looping through childnode collections this function calls itself (i.e. the current node object) repeatedly ("recursively") and assigns the entire XML structure to a given 2-dim array (2nd argument). Furthermore it allows indendation and indicates the hierarchy levels. Note that the array in this example has to be 1-based.
Edit 20/8 2018
包括 数组大小自动增加,如果项目计数器 i
超过当前数组的上限(UBound(v)
,即在它的第一个维度 = 项目数).技术说明:由于这样的 ReDim
在次要(这里是第一个)维度中是不可能的,将行"(dim 1)更改为列"(dim 2)的中间换位是必要的.
Edit 20/8 2018
includes an automatic increase of array size if the items counter i
exceeds the current array's upper boundary (UBound(v)
, i.e. in its first dimension = items count). Technical note: As such a ReDim
isn't possible in a minor (here 1st) dimension, an intermediate transposition changing 'rows' (dim 1) to 'columns' (dim 2) is necessary.
Function listChildNodes(oCurrNode As Object, _
ByRef v As Variant, _
Optional ByRef i As Long = 1, _
Optional iLvl As Integer = 0 _
) As Boolean
' Purpose: assign the complete node structure with contents to a 1-based 2-dim array
' Author: T.M.
' Note: Late binding XML doesn't allow the use of IXMLDOMNodeType enumeration constants
' (1 ... NODE_ELEMENT, 2 ... NODE_ATTRIBUTE, 3 ... NODE_TEXT etc.)
' Escape
If oCurrNode Is Nothing Then Exit Function
If i < 1 Then i = 1 ' one based items Counter
' Edit 20/8 2018 - Automatic increase of array size if needed
If i >= UBound(v) Then ' change array size if needed
Dim tmp As Variant
tmp = Application.Transpose(v) ' change rows to columns
ReDim Preserve tmp(1 To 2, 1 To UBound(v) + 1000) ' increase row numbers
v = Application.Transpose(tmp) ' transpose back
Erase tmp
End If
Const NAMEColumn& = 1, VALUEColumn& = 2 ' constants for column 1 and 2
' Declare variables
Dim oChildNode As Object ' late bound node object
Dim bDisplay As Boolean
' ---------------------------------------------------------------------
' A. It's nothing but a TextNode (i.e. a parent node's firstChild!)
' ---------------------------------------------------------------------
If (oCurrNode.NodeType = 3) Then ' 3 ... NODE_TEXT
' display pure text content (NODE_TEXT) of parent elements
v(i, VALUEColumn) = oCurrNode.Text ' nodeValue of text node
' return
listChildNodes = True
ElseIf oCurrNode.NodeType = 1 Then ' 1 ... NODE_ELEMENT
' --------------------------------------------------------------
' B.1 NODE_ELEMENT WITHOUT text node immediately below,
' a) e.g. <Details> followed by node element <NAME>,
' (i.e. FirstChild.NodeType must not be of type NODE_TEXT = 3)
' b) e.g. <College> node element without any child node
' Note: a text content (NODE_TEXT) actually is a child node(!) to an element node
' (see section A. getting the FirstChild of a NODE_ELEMENT)
' --------------------------------------------------------------
' a) display parent elements of other element nodes
If oCurrNode.HasChildNodes Then
If Not oCurrNode.FirstChild.NodeType = 3 Then ' <>3 ... not a NODE_TEXT
bDisplay = True
End If
' b) always display empty node elements
Else ' empty NODE_ELEMENT
bDisplay = True
End If
If bDisplay Then
v(i, NAMEColumn) = String(iLvl * 2, " ") & _
iLvl & " " & _
oCurrNode.nodename & getAtts(oCurrNode)
i = i + 1
End If
' --------------------------------------------------------------
' B.2 check child nodes
' --------------------------------------------------------------
For Each oChildNode In oCurrNode.ChildNodes
' ~~~~~~~~~~~~~~~~~
' recursive call <<
' ~~~~~~~~~~~~~~~~~
bDisplay = listChildNodes(oChildNode, v, i, iLvl + 1)
If bDisplay Then
v(i, NAMEColumn) = String(iLvl * 2, " ") & _
iLvl & " " & _
oCurrNode.nodename & getAtts(oCurrNode)
i = i + 1
End If
Next oChildNode
' return
listChildNodes = False
Else ' just to demonstrate the use of other xml types as e.g. <!-- comments -->
If oCurrNode.NodeType = 8 Then ' 8 ... NODE_COMMENT
v(i, VALUEColumn) = "<!-- " & oCurrNode.NodeValue & "-->"
i = i + 1
End If
' return
listChildNodes = False
End If
End Function
'辅助函数getAtts()
由上述函数调用的这个辅助函数返回一个字符串,该字符串枚举给定节点的所有属性名称和值,类似于XPath
符号;代码可以很容易地适应您的需要.
This helper function called by the above function returns a string enumerating all attribute names and values of a given node similar to XPath
notation; the code can be easily adapted to your needs.
Function getAtts(ByRef node As Object) As String
' Purpose: return attribute(s) string in brackets, e.g. '[@num="123"]'
' Note: called by above function listChildNodes()
' Author: T.M.
Dim sAtts$, ii&
If node.Attributes.Length > 0 Then
ii = 0: sAtts = ""
For ii = 0 To node.Attributes.Length - 1
sAtts = sAtts & "[@" & node.Attributes.Item(ii).nodename & "=""" & node.Attributes.Item(ii).NodeValue & """]"
Next ii
End If
' return
getAtts = sAtts
End Function
这篇关于使用 VBA 从 XML 获取属性名称的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!