Problem description
I have the following code, which navigates to a website, enters two names (used here as an example; the real run will pull a list of 10 names from a spreadsheet), and then searches for their records. I'm trying to pull the resulting table into a spreadsheet. I've tried it a few ways but can't get it to work. I'm looking for code to go under the comment "Scrape Table Here". I know this involves accessing the site's HTML, which I can do, but I'm not familiar enough with HTML to figure this one out on my own. Bonus question: I'd also like to add each person's ID# to the spreadsheet. In the HTML, it's listed after "MP_Details?". For example, for "Robert Jones" it's "36481" that I'm looking to grab. Basically, I'd like to pull everything highlighted in red in the screenshot from the table and write it to a spreadsheet:
Sub Input_And_Return()
    'Create new instance of Internet Explorer
    Dim ieApp As Object: Set ieApp = New InternetExplorer
    Dim ieDoc As Object
    Dim html As HTMLDocument
    ieApp.Visible = True
    ieApp.navigate "https://hdmaster.net/MP/MP_Public"
    Do While ieApp.Busy: DoEvents: Loop
    Do Until ieApp.readyState = READYSTATE_COMPLETE: DoEvents: Loop
    Set ieDoc = ieApp.document
    Set html = ieApp.document
    'Enter names into search box and click search
    With ieDoc.forms(0)
        .SearchFor.Value = "Anderson, Kelly" & Chr(10) & "Jones, Robert"
        .submit
    End With
    'Scrape Table Here
    'Close down IE and reset status bar
    Set ieApp = Nothing
    Application.StatusBar = ""
End Sub
You could copy the table's outerHTML to the clipboard and paste it into Excel. It is easy and quick.
Option Explicit
Public Sub GetInfo()
    Dim IE As New InternetExplorer
    Dim nameList As String
    Dim clipboard As Object
    'One name per line, as the search box expects
    nameList = "Anderson, Kelly" & Chr$(10) & "Jones, Robert"
    With IE
        .Visible = True
        .navigate "https://hdmaster.net/MP/MP_Public"
        While .Busy Or .readyState < 4: DoEvents: Wend
        With .document
            .querySelector("[name=SearchFor]").Value = nameList
            .querySelector("#search").Click
        End With
        'Wait for the results page before reading the table
        While .Busy Or .readyState < 4: DoEvents: Wend
        'Late-bound MSForms DataObject, used to reach the clipboard
        Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
        clipboard.SetText .document.querySelector(".newTable").outerHTML
        clipboard.PutInClipboard
        ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).PasteSpecial
        .Quit
    End With
End Sub
References (VBE > Tools > References):
- Microsoft HTML Object Library
- Microsoft Internet Controls
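If setting those references is not an option, a late-bound variant is one possibility. The following is only a minimal sketch: the procedure name OpenSearchPageLateBound is made up for illustration, and it just loads the page and prints its title rather than running the search.

```vba
Option Explicit

' Hypothetical late-bound sketch: needs no project references.
Public Sub OpenSearchPageLateBound()
    Dim ie As Object
    ' Create IE through its ProgID instead of "New InternetExplorer"
    Set ie = CreateObject("InternetExplorer.Application")
    With ie
        .Visible = True
        .navigate "https://hdmaster.net/MP/MP_Public"
        ' 4 is READYSTATE_COMPLETE; the named constant is not
        ' available without the Microsoft Internet Controls reference
        While .Busy Or .readyState < 4: DoEvents: Wend
        Debug.Print .document.Title
        .Quit
    End With
End Sub
```

The trade-off is losing IntelliSense and the named constants, so variables have to be declared As Object and numeric values such as 4 used directly.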
The same clipboard approach applied to your code:
Public Sub Input_And_Return()
    Dim ieApp As Object: Set ieApp = New InternetExplorer
    Dim clipboard As Object
    With ieApp
        .Visible = True
        .navigate "https://hdmaster.net/MP/MP_Public"
        While .Busy Or .readyState < 4: DoEvents: Wend
        With .document.forms(0)
            .SearchFor.Value = "Anderson, Kelly" & Chr$(10) & "Jones, Robert"
            .submit
        End With
        'Wait for the results page before reading the table
        While .Busy Or .readyState < 4: DoEvents: Wend
        'Late-bound MSForms DataObject, used to reach the clipboard
        Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
        clipboard.SetText .document.getElementsByClassName("newTable")(0).outerHTML
        clipboard.PutInClipboard
        ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).PasteSpecial
        .Quit
    End With
End Sub
Or by looping rows and columns of the table:
Public Sub Input_And_Return()
    Dim ieApp As Object
    Dim ws As Worksheet
    Dim r As Long, c As Long, tr As Object, td As Object
    Set ieApp = New InternetExplorer
    'Same target sheet as the clipboard versions
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    With ieApp
        .Visible = True
        .navigate "https://hdmaster.net/MP/MP_Public"
        While .Busy Or .readyState < 4: DoEvents: Wend
        With .document.forms(0)
            .SearchFor.Value = "Anderson, Kelly" & Chr$(10) & "Jones, Robert"
            .submit
        End With
        'Wait for the results page before reading the table
        While .Busy Or .readyState < 4: DoEvents: Wend
        With .document.getElementsByClassName("newTable")(0)
            'Write each table cell to the matching worksheet cell
            For Each tr In .getElementsByTagName("tr")
                r = r + 1: c = 1
                For Each td In tr.getElementsByTagName("td")
                    ws.Cells(r, c).Value = td.innerText
                    c = c + 1
                Next td
            Next tr
        End With
        .Quit
    End With
End Sub
Output:
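EDIT:
Some ugly code to get the short IDs. It reads each clickable row's onclick text and keeps the value between "id=" and the closing single quote: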
Option Explicit
Public Sub Input_And_Return()
    Dim ieApp As Object, ws As Worksheet
    Dim r As Long, c As Long, tr As Object, td As Object, hTable As Object, aNodeList As Object
    Dim idDict As Object, i As Long, tempVal As String
    Set ieApp = New InternetExplorer
    'Same target sheet as the clipboard versions
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    With ieApp
        .Visible = True
        .navigate "https://hdmaster.net/MP/MP_Public"
        While .Busy Or .readyState < 4: DoEvents: Wend
        With .document.forms(0)
            .SearchFor.Value = "Anderson, Kelly" & Chr$(10) & "Jones, Robert"
            .submit
        End With
        'Wait for the results page before reading the table
        While .Busy Or .readyState < 4: DoEvents: Wend
        Set hTable = .document.getElementsByClassName("newTable")(0)
        Set aNodeList = hTable.querySelectorAll("[align=center][onclick*='javascript:rowClick']")
        'Collect each distinct ID once, in order of first appearance
        Set idDict = CreateObject("Scripting.Dictionary")
        For i = 0 To aNodeList.Length - 1
            'Text between "id=" and the next single quote (Chr$(39))
            tempVal = Split(Split(aNodeList.Item(i).onclick, "id=")(1), Chr$(39))(0)
            If Not idDict.exists(tempVal) Then idDict.Add tempVal, vbNullString
        Next i
        With hTable
            For Each tr In .getElementsByTagName("tr")
                r = r + 1: c = 1
                For Each td In tr.getElementsByTagName("td")
                    ws.Cells(r, c).Value = td.innerText
                    c = c + 1
                Next td
            Next tr
            'Write the IDs next to the data only if there is one per data row
            If idDict.Count = r - 1 Then ws.Cells(2, c).Resize(idDict.Count, 1) = Application.WorksheetFunction.Transpose(idDict.keys)
        End With
        .Quit
    End With
End Sub
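The nested Split call raises a "Subscript out of range" error when an onclick value contains no "id=". As a rough sketch, the parsing could be moved into a small helper that returns an empty string in that case. The name ExtractShortId is hypothetical, and the sample string in the usage note is only an assumption about what the attribute looks like, based on the "MP_Details?" and "36481" values mentioned in the question.

```vba
Option Explicit

' Hypothetical helper: returns the text between "id=" and the next
' single quote, or vbNullString when "id=" is not present.
Public Function ExtractShortId(ByVal onclickText As String) As String
    Dim startPos As Long, endPos As Long
    startPos = InStr(1, onclickText, "id=", vbTextCompare)
    If startPos = 0 Then Exit Function          'no ID in this text
    startPos = startPos + Len("id=")            'first character of the ID
    endPos = InStr(startPos, onclickText, Chr$(39))
    If endPos = 0 Then endPos = Len(onclickText) + 1  'no closing quote: take the rest
    ExtractShortId = Mid$(onclickText, startPos, endPos - startPos)
End Function
```

With an assumed input such as "javascript:rowClick('MP_Details?id=36481')", the helper would be called as tempVal = ExtractShortId(aNodeList.Item(i).onclick), and the row could be skipped whenever the result is vbNullString.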