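OK,
I've spent hours researching how to scrape text from a table, but for the life of me I haven't come across a method that works in my situation.
Below is a sample of the HTML I'm trying to get information from: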

<table class="empDetailCard foldable unfolded">
    <tr>
        <td colspan="4" class="title">
            <span class="fold-control">full name</span>
        </td>
    </tr>
    <tr class="fold-row">
        <td>
            <div class="badgePhoto reg">
                <img class="photo " src="removed" />
            </div>
        </td>
        <td>
           <span class="line">
               <span class="section-title">Employee Info</span>
           </span>
           <div class="employeeInfo">
               <div>
                   <span class="line">
                       <span class="row-label">Login</span>
                       mylogin</span>
                   <span class="line">
                       <span class="row-label">Empl ID</span>
                       1234567</span>
                   <span class="line">
                       <span class="row-label">Badge</span>
                       1234567</span>
                   <span class="line">
                       <span class="row-label">Dept ID</span>
                       1234567</span>
                   <span class="line">
                       <span class="row-label">Location</span>
                       1234567
                       </span>
                   <span class="line">
                       <span class="row-label">Manager</span>
                       <a href="removed"
                    title="">John, Smith</a>
                    </span>
        </td>
    </tr>
</table>

I've tried using GetElementByID, GetElementByName, and even regex to get "mylogin" from the Login row, but I haven't had any success.
Function IdtoLogin(empID As String)
     Dim H As Object, html As Object, objResult As Object
     Set H = CreateObject("WinHttp.WinHttpRequest.5.1")
     H.Open "GET", "myurl" & empID
     H.setRequestHeader "Content-Type", "text/xml"
     H.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
     H.SetAutoLogonPolicy 0
     H.send

     Set html = New HTMLDocument
     html.Body.innerHTML = H.ResponseText
     Set objResult = html.GetElementById("Login")
     IdtoLogin = objResult.innerHTML

End Function

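The response comes back with the correct HTML, including the login information, but the code fails to get the element by ID and throws "Run-time error 91". If anyone could point out the obvious for me that would be great, because I'm going crazy.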

Best answer

Try a CSS selector:

html.querySelector("div.employeeInfo span")

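for example. The information you want may then be part of that node's outerHTML. The singular querySelector returns only the first matching node, which in the HTML shown is the span for mylogin.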
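
To make that concrete, here is a rough sketch of how the function from the question might look with CSS selectors. Note that the HTML shown has no element with id="Login" ("Login" is only the text of a row-label span), so GetElementById("Login") returns Nothing, and reading .innerHTML from Nothing is what raises error 91. The sketch swaps in querySelectorAll plus a match on the label text instead of the single querySelector above, so that any row can be looked up. CleanText and FieldByLabel are hypothetical helper names I made up, "myurl" is still the question's placeholder, and I am assuming the real page has the same structure as the sample.

```vba
' Needs a reference to "Microsoft HTML Object Library"
' (VBA editor: Tools > References), which New HTMLDocument already implies.

' Hypothetical helper: turns line breaks and tabs into spaces, then trims.
Private Function CleanText(ByVal s As String) As String
    s = Replace(s, vbCr, " ")
    s = Replace(s, vbLf, " ")
    s = Replace(s, vbTab, " ")
    CleanText = Trim$(s)
End Function

' Hypothetical helper: returns the text that follows a given row label
' (e.g. "Login") inside div.employeeInfo, or "" if the label is not found.
Private Function FieldByLabel(ByVal html As MSHTML.HTMLDocument, _
                              ByVal label As String) As String
    Dim labels As Object
    Dim lineText As String
    Dim i As Long

    Set labels = html.querySelectorAll("div.employeeInfo span.row-label")
    For i = 0 To labels.Length - 1
        If StrComp(CleanText(labels.Item(i).innerText), label, vbTextCompare) = 0 Then
            ' The parent span.line holds the label followed by the value,
            ' so drop the label from the front of its text.
            lineText = CleanText(labels.Item(i).parentNode.innerText)
            FieldByLabel = CleanText(Mid$(lineText, Len(label) + 1))
            Exit Function
        End If
    Next i
End Function

Public Function IdtoLogin(ByVal empID As String) As String
    Dim H As Object
    Dim html As MSHTML.HTMLDocument

    Set H = CreateObject("WinHttp.WinHttpRequest.5.1")
    H.Open "GET", "myurl" & empID, False   ' "myurl" is the placeholder from the question
    H.SetAutoLogonPolicy 0
    H.send

    Set html = New MSHTML.HTMLDocument
    html.body.innerHTML = H.responseText

    ' Returns "" instead of raising error 91 when nothing matches.
    IdtoLogin = FieldByLabel(html, "Login")
End Function
```

Calling Debug.Print IdtoLogin("1234567") from the Immediate window is a quick way to try it. Against markup like the sample, the intent is that it gives back the text next to the Login label, and the same helper should work for "Empl ID", "Badge" or "Manager" if the rows are laid out the same way.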
