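Okay,
I've spent hours researching how to scrape text from a table, but for the life of me I haven't found a method that works in my case.
Below is a sample of the HTML I'm trying to get information from: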
<table class="empDetailCard foldable unfolded">
<tr>
<td colspan="4" class="title">
<span class="fold-control">full name</span>
</td>
</tr>
<tr class="fold-row">
<td>
<div class="badgePhoto reg">
<img class="photo " src="removed" />
</div>
</td>
<td>
<span class="line">
<span class="section-title">Employee Info</span>
</span>
<div class="employeeInfo">
<div>
<span class="line">
<span class="row-label">Login</span>
mylogin</span>
<span class="line">
<span class="row-label">Empl ID</span>
1234567</span>
<span class="line">
<span class="row-label">Badge</span>
1234567</span>
<span class="line">
<span class="row-label">Dept ID</span>
1234567</span>
<span class="line">
<span class="row-label">Location</span>
1234567
</span>
<span class="line">
<span class="row-label">Manager</span>
<a href="removed"
title="">John, Smith</a>
</span>
</td>
</tr>
</table>
I've tried using GetElementByID, GetElementByName, and even regex to get "mylogin" from the Login row, but I haven't had any success.
Function IdtoLogin(empID As String)
    Dim H As Object, html As Object, objResult As Object
    Set H = CreateObject("WinHttp.WinHttpRequest.5.1")
    H.Open "GET", "myurl" & empID
    H.setRequestHeader "Content-Type", "text/xml"
    H.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
    H.SetAutoLogonPolicy 0
    H.send
    Set html = New HTMLDocument
    html.Body.innerHTML = H.ResponseText
    Set objResult = html.GetElementById("Login")
    IdtoLogin = objResult.innerHTML
End Function
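The response returns the correct HTML containing the login information, but the code fails to get the element by ID and throws "Run-time error 91". If anyone can point out the obvious for me that would be great, because I'm going crazy.
Best answer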
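None of the elements in the HTML shown has an id of "Login", so GetElementById returns Nothing, and reading innerHTML from Nothing raises run-time error 91. Try a CSS selector instead: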
html.querySelector("div.employeeInfo span")
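For example, the information you want may be part of the matched node's outerHTML. The singular querySelector returns only the first matching node, which in the HTML shown is the span with class "line" that holds the Login label and the value mylogin.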
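To make this concrete, here is a minimal sketch of how the selector could be used to pull out just the login value. It assumes a reference to the Microsoft HTML Object Library is set (Tools > References) and that the page always has the shape of the sample HTML. The names ExtractLogin and CleanText are hypothetical helpers introduced for illustration, not part of the original code.

```vba
Option Explicit

' Hypothetical helper: collapses line breaks and tabs so the text is easy to compare.
Private Function CleanText(ByVal s As String) As String
    s = Replace(s, vbCr, " ")
    s = Replace(s, vbLf, " ")
    s = Replace(s, vbTab, " ")
    CleanText = Trim$(s)
End Function

' Hypothetical helper: returns the value that follows the "Login" label,
' or an empty string if the expected node is not found.
Public Function ExtractLogin(ByVal responseText As String) As String
    Const LABEL As String = "Login"   ' label text taken from the sample HTML
    Dim html As MSHTML.HTMLDocument
    Dim lineNode As Object
    Dim lineText As String

    Set html = New MSHTML.HTMLDocument
    html.body.innerHTML = responseText

    ' First span inside div.employeeInfo; in the sample HTML this is the Login line.
    Set lineNode = html.querySelector("div.employeeInfo span")

    ' querySelector returns Nothing when nothing matches, so guard before
    ' touching any property (this is what avoids run-time error 91).
    If lineNode Is Nothing Then Exit Function

    ' innerText contains both the label and the value; strip the leading label.
    lineText = CleanText(lineNode.innerText)
    If Left$(lineText, Len(LABEL)) = LABEL Then
        lineText = Trim$(Mid$(lineText, Len(LABEL) + 1))
    End If

    ExtractLogin = lineText
End Function
```

In the question's function, the last two lines could then become IdtoLogin = ExtractLogin(H.ResponseText). For the sample HTML this should leave mylogin, though the exact whitespace in innerText can vary with the document mode, which is why the text is normalised before the label is removed.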