Problem description
A simple question: I am trying to write a procedure that parses the HTML of this site.
A part of the source code (lines 154 to 174) that is sufficient as an example is:
<p>(British Aircraft Company)</p>
<ul>
<li><a href="/wiki/B.A.C._I" title="B.A.C. I" class="mw-redirect">B.A.C. I</a></li>
<li><a href="/wiki/B.A.C._II" title="B.A.C. II" class="mw-redirect">B.A.C. II</a></li>
<li><a href="/wiki/B.A.C._III" title="B.A.C. III" class="mw-redirect">B.A.C. III</a></li>
<li><a href="/wiki/B.A.C._IV" title="B.A.C. IV" class="mw-redirect">B.A.C. IV</a></li>
<li><a href="/wiki/B.A.C._V" title="B.A.C. V" class="mw-redirect">B.A.C. V</a></li>
<li><a href="/wiki/B.A.C._VI" title="B.A.C. VI" class="mw-redirect">B.A.C. VI</a></li>
<li><a href="/wiki/B.A.C._VII" title="B.A.C. VII" class="mw-redirect">B.A.C. VII</a></li>
<li><a href="/wiki/B.A.C._VII_Mk.2" title="B.A.C. VII Mk.2" class="mw-redirect">B.A.C. VII Mk.2</a></li>
<li><a href="/wiki/B.A.C._VII_Planette" title="B.A.C. VII Planette" class="mw-redirect">B.A.C. VII Planette</a></li>
<li><a href="/wiki/B.A.C._VIII" title="B.A.C. VIII" class="mw-redirect">B.A.C. VIII</a></li>
<li><a href="/wiki/B.A.C._VIII_Bat-Boat" title="B.A.C. VIII Bat-Boat" class="mw-redirect">B.A.C. VIII Bat-Boat</a></li>
<li><a href="/wiki/B.A.C._IX" title="B.A.C. IX" class="mw-redirect">B.A.C. IX</a></li>
<li><a href="/wiki/B.A.C._Cupid" title="B.A.C. Cupid" class="mw-redirect">B.A.C. Cupid</a></li>
<li><a href="/wiki/B.A.C._Drone" title="B.A.C. Drone" class="mw-redirect">B.A.C. Drone</a></li>
<li><a href="/wiki/B.A.C._Super_Drone" title="B.A.C. Super Drone" class="mw-redirect">B.A.C. Super Drone</a></li>
<li><a href="/wiki/B.A._Swallow_2" title="B.A. Swallow 2" class="mw-redirect">B.A. Swallow 2</a></li>
<li><a href="/wiki/B.A._Eagle_2" title="B.A. Eagle 2" class="mw-redirect">B.A. Eagle 2</a></li>
<li><a href="/wiki/B.A._Double_Eagle" title="B.A. Double Eagle" class="mw-redirect">B.A. Double Eagle</a></li>
</ul>
I am trying to work something out. I can get to the <p> HTML tag, but I cannot reach the list items to loop through what I want, because they are further enclosed between the <ul></ul> tags. What would be your next steps?
Sub ICE()
    Set Results = IE.document.getElementsByTagName("p")
    For Each itm In Results
        If itm.innerHTML = "(British Aircraft Company)" Then
        End If
    Next itm
End Sub
For a clearer picture: this stage of my study is based on the answer to "VBA parsing of href" provided by ron.
Recommendation by user Doug Glancy:
--> It might be helpful to mention the desired results.
What I want is to be able to make VBA 'click', at run time, the href of my choice, since it is an actual link. I am studying ron's code for that, which is (and can be seen in the previous example):
If itm.outerhtml = "B.A.C. VII" Then
    itm.Click
    Do Until Not IE.Busy And IE.readyState = 4
        DoEvents
    Loop
    Exit For
End If
...here outerHTML is being used; however, the core of my effort is the loop and the logical operator.
I wrote this piece of code, but it does not work:
Set Results = IE.document.getElementsByTagName("p")
For Each itm In Results
    If itm.innerHTML = "(British Aircraft Company)" Then
        Set Results2 = IE.document.getElementsByTagName("ul")
        For Each itm2 In Results2
            If itm2.innerHTML = "B.A.C. V" Then
                MsgBox itm2.innerHTML
            End If
        Next itm2
    End If
Next itm
This will list the aircraft in the <ul> that directly follows the <p> tag containing "(British Aircraft Company)":
Sub GetAircraft()
    'Early binding: needs references to a Microsoft XML library (MSXML2)
    'and to the Microsoft HTML Object Library (MSHTML)
    Dim xHttp As MSXML2.XMLHTTP
    Dim hDoc As MSHTML.HTMLDocument
    Dim hUls As MSHTML.IHTMLElementCollection
    Dim hUl As MSHTML.HTMLListElement
    Dim hLi As MSHTML.HTMLLIElement

    'Request the page and wait until the response is complete
    Set xHttp = New MSXML2.XMLHTTP
    xHttp.Open "GET", "http://en.wikipedia.org/wiki/List_of_aircraft_%28B%29"
    xHttp.send
    Do
        DoEvents
    Loop Until xHttp.readyState = 4

    'Load the response into an HTML document so the DOM can be walked
    Set hDoc = New HTMLDocument
    hDoc.body.innerHTML = xHttp.responseText
    Set hUls = hDoc.getElementsByTagName("ul")

    'Go through all the <ul> tags
    For Each hUl In hUls
        'Only if previous tag is something
        If Not hUl.PreviousSibling Is Nothing Then
            'Only if previous tag is <p>
            If TypeName(hUl.PreviousSibling) = "HTMLParaElement" Then
                'Only if previous paragraph is specified text
                If hUl.PreviousSibling.innerText = "(British Aircraft Company)" Then
                    'loop through the <li> and print them out
                    For Each hLi In hUl.Children
                        Debug.Print hLi.innerText
                    Next hLi
                End If
            End If
        End If
    Next hUl
End Sub
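For the 'click' part of the question, one likely reason the outerHTML comparison never matches is that outerHTML includes the tag markup itself (the <a href=...> wrapper), so it cannot equal the bare text "B.A.C. VII". Below is a minimal sketch, not a definitive implementation, that compares the visible text of each link instead. It assumes Internet Explorer automation through late binding (so no library references are needed) and assumes the link text is unique on the page; the procedure name ClickAircraftLink and the constant TARGET_TEXT are made up for illustration.

```vba
Sub ClickAircraftLink()
    ' Hypothetical names: ClickAircraftLink and TARGET_TEXT are illustrative only
    Const TARGET_TEXT As String = "B.A.C. VII"
    Dim IE As Object    ' late-bound InternetExplorer.Application
    Dim lnk As Object   ' each <a> element on the page

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.navigate "http://en.wikipedia.org/wiki/List_of_aircraft_%28B%29"

    ' Wait until the page has finished loading
    Do While IE.Busy Or IE.readyState <> 4
        DoEvents
    Loop

    ' Compare the visible text of each link, not its outerHTML
    For Each lnk In IE.document.getElementsByTagName("a")
        If Trim$(lnk.innerText) = TARGET_TEXT Then
            lnk.Click
            ' Wait for the navigation triggered by the click
            Do While IE.Busy Or IE.readyState <> 4
                DoEvents
            Loop
            Exit For
        End If
    Next lnk
End Sub
```

The first link whose text matches is clicked, so if the same text could appear elsewhere on the page, the search would need to be narrowed to the <ul> found by the sibling check in GetAircraft.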