问题描述
我需要使用VBA从XML中获得不同的属性名称.
I need to get the distinct attributes names from the XML using VBA.
这是我的代码.
sub test()
Dim XMLFile As Object
Dim XMLFileName As String
Set XMLFile = CreateObject("Microsoft.XMLDOM")
XMLFileName = "C:\Users\Input.xml"
XMLFile.async = False
XMLFile.Load (XMLFileName)
XMLFile.validateOnParse = False
Dim mainnode As Object
Dim node As Object
Set mainnode = XMLFile.SelectNodes("//Elements")
For Each node In mainnode
For Each child In node.ChildNodes
Debug.Print child.BaseName
Dim kiddo As Object
For Each kiddo In child.ChildNodes
Debug.Print kiddo.BaseName
Next kiddo
Next child
Next node
End sub
这是示例XML.我需要从XML中获取属性名称num
.
Here is the sample XML. I need to get the attribute name num
from the XML.
<Elements>
<Details>
<Name>ABC</Name>
<Address>123ABC</Address>
<College>
<collname>safasf</collname>
<collnumber/>
</College>
</Details>
<Dept num="123">
<Deptname>IT</Deptname>
<ID>A123</ID>
</Dept>
</Elements>
预期结果:
Elements
Details
Name
Address
College
collname
collnumber
Dept
num
Deptname
ID
以上代码的实际结果:
Elements
Details
Name
Address
College
collname
Dept
Deptname
ID
我的代码未获取"num"属性和<collnumber/>
标记.有人可以让我知道如何使用VBA从XML中获取属性名称和标记名称
The "num" attribute and <collnumber/>
tag is not fetched by my code. Could someone let me know how to fetch the attribute names along with the tag names from XML using VBA
推荐答案
通过递归函数调用显示包含属性的XML结构
我的示例代码演示了一种方法
My example code demonstrates a way to
-
[1]
使用XMLDOM方法和 将整个XML结构分配给2维数组 -
[2]
(可选)将其写回到工作表中.
[1]
assign the entire XML structure to a 2-dim array using XMLDOM methods and[2]
optionally write it back to a sheet.
放大提示:
我添加了这些►结构化提示,以提供比仅显示代码更多的帮助,因为我说过,这些要点中的很多也导致其他用户重复提问:
I added these ► structured hints to offer more help than by displaying code only, as I remarked that many of these points lead to repeated questions by other users, too:
- 尝试列出
XML
个结构,但随着节点元素层次结构深度的增加(类型常数1NODE_ELEMENT
),您会失去良好的视野,因此,我紧急建议使用►递归调用在此示例代码中. - 此外,您可能还没有考虑到节点文本(类型常量3
NODE_TEXT
)的特殊构造是给定父元素名称的第一个孩子 -cf主功能listChildNodes
中A.和B.部分. 您遍历子节点的循环不会区分所提到的类型.只需研究引用函数中的注释以了解详细信息即可. - 我想您的XML文件以所需的处理指令开头,例如
<?xml version="1.0" encoding="utf-8"?>
,以便可以将其实际识别为XML文件. - 调用过程
DisplayXML()
使用 late绑定,而不是类似于您的帖子的早期绑定对MS XML的引用,但是使用推荐的 MSXML2版本6.0 .它通过DocumentElement
<Elements>
( BTW是单个节点元素)和引用预定义2维数组v
的第二个参数来调用main函数. - 版本::如果您要使用
Set XDoc = CreateObject("MSXML2.DOMDocument")
将XMLFILE
对象设置为内存,则通常会获得较旧的默认版本(3.0),因此在大多数情况下,最好显式使用代替(自动包含XPath). - 如果不使用
Load
函数取回True
(文件成功加载)或False
(加载错误),则无需设置文件放在方括号()中. - 搜索字符串中的XPath运算符
//
将返回 any 级别的所有匹配项(例如,OP中的XMLFile.SelectNodes("//Elements")
). - 还考虑使用 XSLT ,这是一种专用语言,旨在将XML文件转换为各种最终用途格式.
- Trying to list
XML
structures you lose good view with increasing hierarchy depth of your node elements (type constant 1NODE_ELEMENT
), so I urgently recommend the use of ► recursive calls as used in this example code. - Furthermore you might have not considered the special construction of node text (type constant 3
NODE_TEXT
) being the first child of a name giving parent element - c.f. sections A. and B. in main functionlistChildNodes
. Your loops through child nodes would not distinguish between the mentioned types. Just study the comments in the cited function for details. - I suppose your XML file starts with a needed processing instruction like e.g.
<?xml version="1.0" encoding="utf-8"?>
, so that it can be actually identified as XML file. - The calling procedure
DisplayXML()
uses late binding instead of early bound reference to MS XML similar to your post, but uses the recommended MSXML2 version 6.0. It calls the main function via itsDocumentElement
<Elements>
(BTW a single node element) and a second argument referring to a predefined 2-dim arrayv
. - Versioning: If you would set your
XMLFILE
object to memory withSet XDoc = CreateObject("MSXML2.DOMDocument")
generally you are getting the older default Version (3.0), so in most cases it's preferrable to use explicitlySet XDoc = CreateObject("MSXML2.DOMDocument.6.0")
instead (including XPath automatically). - If you don't use the
Load
function to get aTrue
(file loaded successfully) orFalse
(load error) back, it is not necessary to set the file name into brackets (). - The XPath operator
//
in search strings would return any occurences at any level (c.f.XMLFile.SelectNodes("//Elements")
in your OP). - Consider also the use of XSLT, a special-purpose language designed to tranform XML files into all kind of end-use formats.
调用过程DisplayXML
Calling procedure DisplayXML
提示:只需在调用过程中用估计的项数(例如1000)来估计数组的行数就足够了,因为主函数执行ReDim
(包括双重换位)(如果需要).尽管如此,我还是从一开始就通过XPath/XMLDOM表达式XMLFile.SelectNodes("//*").Length
在整个文件中对任何项目进行计数来添加确切的项目计数.
Hint: It would sufficient to dimension the array 's row count only with an estimated number of items in the calling procedure (e.g. 1000), as the main function executes a ReDim
(including a double transposition) automatically if needed. Nevertheless I added the exact items count here from the start via XPath/XMLDOM expression XMLFile.SelectNodes("//*").Length
counting any item in the entire file.
Option Explicit ' declaration head of your code module
Sub DisplayXML()
Dim XMLFile As Object
Dim XMLFileName As String
'Set XMLFile = CreateObject("Microsoft.XMLDOM") ' former style not recommended
Set XMLFile = CreateObject("MSXML2.DOMDocument.6.0")
XMLFileName = "C:\Users\Input.xml" ' << change to your xml file name
XMLFile.Async = False
XMLFile.ValidateOnParse = False
Debug.Print XMLFile.XML
If XMLFile.Load(XMLFileName) Then
' [1] write xml info to array with exact or assumed items count
Dim v As Variant: ReDim v(1 To XMLFile.SelectNodes("//*").Length, 1 To 2)
listChildNodes XMLFile.DocumentElement, v ' call helper function
' [2] write results to sheet "Dump" ' change to your sheet name
With ThisWorkbook.Worksheets("Dump")
.Range("A:B") = "" ' clear result range
.Range("A1:B1") = Split("XML Tag,Node Value", ",") ' titles
.Range("A2").Resize(UBound(v), UBound(v, 2)) = v ' get 2-dim info array
End With
Else
MsgBox "Load Error " & XMLFileName
End If
Set XMLFile = Nothing
End Sub
结构化结果显示在工作表中
提示:如果您不希望级别缩进或枚举级别层次结构,则可以轻松地调整下面的主要功能listChildNodes()
.
Hint: If you don't want the level indentation or enumerated Level hierarchy, you can easily adapt the main function listChildNodes()
below.
+----+---------------------+-----------------+
| | A | B |
+----+---------------------+-----------------+
|1 | XML Tag | Node Value |
+----+---------------------+-----------------+
|2 | 0 Elements | |
+----+---------------------+-----------------+
|3 | 1 Details | |
+----+---------------------+-----------------+
|4 | 2 Name | ABC |
+----+---------------------+-----------------+
|5 | 2 Address | 123ABC |
+----+---------------------+-----------------+
|6 | 2 College | |
+----+---------------------+-----------------+
|7 | 3 collname | safasf |
+----+---------------------+-----------------+
|8 | 3 collnumber | |
+----+---------------------+-----------------+
|9 | 1 Dept[@num="123"]| |
+----+---------------------+-----------------+
|10 | 2 Deptname | IT |
+----+---------------------+-----------------+
|11 | 2 ID | A123 |
+----+---------------------+-----------------+
还可以引用精确的节点元素,例如通过
It is also possible to refer to a precise node element, e.g. via
listChildNodes XMLFile.DocumentElement.SelectSingleNode("Dept[@num=""123""]"),v, 1, 1 ' starting from item no 1 and Level no 1
这将单独列出指示的节点集:
This would list the indicated node set alone:
+----+---------------------+-----------------+
| | A | B |
+----+---------------------+-----------------+
|1 | XML Tag | Node Value |
+----+---------------------+-----------------+
|2 | 1 Dept[@num="123"]| |
+----+---------------------+-----------------+
|3 | 2 Deptname | IT |
+----+---------------------+-----------------+
|4 | 2 ID | A123 |
+----+---------------------+-----------------+
递归主要功能listChildNodes()
Recursive main function listChildNodes()
遍历子节点集合,此函数反复(递归")调用自身(即当前节点对象),并将整个XML结构分配给给定的2-dim数组(第二个参数).此外,它还允许使用并指示层次结构级别. 请注意,此示例中的数组必须基于1.
Looping through childnode collections this function calls itself (i.e. the current node object) repeatedly ("recursively") and assigns the entire XML structure to a given 2-dim array (2nd argument). Furthermore it allows indendation and indicates the hierarchy levels. Note that the array in this example has to be 1-based.
Edit 20/8 2018
包括自动增加阵列大小. 技术说明:由于这样的ReDim
在较小(此处为第1个)维度上是不可能的,因此需要将行"(第1维)更改为列"(第2维)的中间换位.
Edit 20/8 2018
includes an automatic increase of array size if the items counter i
exceeds the current array's upper boundary (UBound(v)
, i.e. in its first dimension = items count). Technical note: As such a ReDim
isn't possible in a minor (here 1st) dimension, an intermediate transposition changing 'rows' (dim 1) to 'columns' (dim 2) is necessary.
Function listChildNodes(oCurrNode As Object, _
ByRef v As Variant, _
Optional ByRef i As Long = 1, _
Optional iLvl As Integer = 0 _
) As Boolean
' Purpose: assign the complete node structure with contents to a 1-based 2-dim array
' Author: T.M.
' Note: Late binding XML doesn't allow the use of IXMLDOMNodeType enumeration constants
' (1 ... NODE_ELEMENT, 2 ... NODE_ATTRIBUTE, 3 ... NODE_TEXT etc.)
' Escape
If oCurrNode Is Nothing Then Exit Function
If i < 1 Then i = 1 ' one based items Counter
' Edit 20/8 2018 - Automatic increase of array size if needed
If i >= UBound(v) Then ' change array size if needed
Dim tmp As Variant
tmp = Application.Transpose(v) ' change rows to columns
ReDim Preserve tmp(1 To 2, 1 To UBound(v) + 1000) ' increase row numbers
v = Application.Transpose(tmp) ' transpose back
Erase tmp
End If
Const NAMEColumn& = 1, VALUEColumn& = 2 ' constants for column 1 and 2
' Declare variables
Dim oChildNode As Object ' late bound node object
Dim bDisplay As Boolean
' ---------------------------------------------------------------------
' A. It's nothing but a TextNode (i.e. a parent node's firstChild!)
' ---------------------------------------------------------------------
If (oCurrNode.NodeType = 3) Then ' 3 ... NODE_TEXT
' display pure text content (NODE_TEXT) of parent elements
v(i, VALUEColumn) = oCurrNode.Text ' nodeValue of text node
' return
listChildNodes = True
ElseIf oCurrNode.NodeType = 1 Then ' 1 ... NODE_ELEMENT
' --------------------------------------------------------------
' B.1 NODE_ELEMENT WITHOUT text node immediately below,
' a) e.g. <Details> followed by node element <NAME>,
' (i.e. FirstChild.NodeType must not be of type NODE_TEXT = 3)
' b) e.g. <College> node element without any child node
' Note: a text content (NODE_TEXT) actually is a child node(!) to an element node
' (see section A. getting the FirstChild of a NODE_ELEMENT)
' --------------------------------------------------------------
' a) display parent elements of other element nodes
If oCurrNode.HasChildNodes Then
If Not oCurrNode.FirstChild.NodeType = 3 Then ' <>3 ... not a NODE_TEXT
bDisplay = True
End If
' b) always display empty node elements
Else ' empty NODE_ELEMENT
bDisplay = True
End If
If bDisplay Then
v(i, NAMEColumn) = String(iLvl * 2, " ") & _
iLvl & " " & _
oCurrNode.nodename & getAtts(oCurrNode)
i = i + 1
End If
' --------------------------------------------------------------
' B.2 check child nodes
' --------------------------------------------------------------
For Each oChildNode In oCurrNode.ChildNodes
' ~~~~~~~~~~~~~~~~~
' recursive call <<
' ~~~~~~~~~~~~~~~~~
bDisplay = listChildNodes(oChildNode, v, i, iLvl + 1)
If bDisplay Then
v(i, NAMEColumn) = String(iLvl * 2, " ") & _
iLvl & " " & _
oCurrNode.nodename & getAtts(oCurrNode)
i = i + 1
End If
Next oChildNode
' return
listChildNodes = False
Else ' just to demonstrate the use of other xml types as e.g. <!-- comments -->
If oCurrNode.NodeType = 8 Then ' 8 ... NODE_COMMENT
v(i, VALUEColumn) = "<!-- " & oCurrNode.NodeValue & "-->"
i = i + 1
End If
' return
listChildNodes = False
End If
End Function
'助手功能getAtts()
'Helper function getAtts()
由上述函数调用的此辅助函数返回一个字符串,该字符串枚举给定节点的所有属性名称和值,类似于XPath
表示法;该代码可以轻松地适应您的需求.
This helper function called by the above function returns a string enumerating all attribute names and values of a given node similar to XPath
notation; the code can be easily adapted to your needs.
Function getAtts(ByRef node As Object) As String
' Purpose: return attribute(s) string in brackets, e.g. '[@num="123"]'
' Note: called by above function listChildNodes()
' Author: T.M.
Dim sAtts$, ii&
If node.Attributes.Length > 0 Then
ii = 0: sAtts = ""
For ii = 0 To node.Attributes.Length - 1
sAtts = sAtts & "[@" & node.Attributes.Item(ii).nodename & "=""" & node.Attributes.Item(ii).NodeValue & """]"
Next ii
End If
' return
getAtts = sAtts
End Function
这篇关于使用VBA从XML获取属性名称的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!