Problem description
I want to retrieve the href attribute of the link inside the <h3> tags of an HTML page. I can get the innerText, but I don't know how to access the href attribute. There are several <h3> tags in the document, but for the time being I only need the first one; I will deal with the rest later.
This is the code I have so far:
Sub Scrap()
    ' Requires references to Microsoft Internet Controls and Microsoft HTML Object Library
    Dim IE As New InternetExplorer
    Dim sDD As String
    Dim Doc As HTMLDocument

    IE.Visible = True
    IE.navigate "https://www.oneoiljobsearch.com/senior-reservoir-engineer-jobs/?page=1"

    ' Wait until the page has finished loading
    Do
        DoEvents
    Loop Until IE.readyState = READYSTATE_COMPLETE

    Set Doc = IE.document
    sDD = Trim(Doc.getElementsByTagName("h3")(0).innerText)
    ' sDD contains the string "Senior Reservoir Engineer"
End Sub
Below is the portion of the HTML document to extract data from:
<div class="front_job_details">
<h3>
<a href="/jobs/senior-reservoir-engineer-oslo-norway-7?cmp=js&from=job-search-form-2" target="_blank">
Senior Reservoir Engineer
</a>
</h3>
The text I need to retrieve is: "/jobs/senior-reservoir-engineer-oslo-norway-7?cmp=js&from=job-search-form-2"
Thanks in advance for your help.
Recommended answer
Try:
Dim hr As String
hr = Doc.getElementsByTagName("h3")(0).getElementsByTagName("a")(0).href
Debug.Print hr
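One thing to watch: as far as I know, the .href property in Internet Explorer returns the fully resolved URL (scheme and host included) rather than the relative text written in the markup. If you only want the part starting at /jobs/..., a small string helper can strip the origin. This is only a sketch under that assumption; PathAndQuery is a hypothetical helper name, not part of any library.

```vba
' Hypothetical helper: returns the path-and-query part of an absolute URL.
' A string with no "://" is assumed to be relative already and is returned unchanged.
Function PathAndQuery(ByVal absoluteUrl As String) As String
    Dim schemeEnd As Long
    Dim pathStart As Long

    schemeEnd = InStr(1, absoluteUrl, "://")
    If schemeEnd = 0 Then
        PathAndQuery = absoluteUrl
        Exit Function
    End If

    ' First "/" after the "://" marks the start of the path
    pathStart = InStr(schemeEnd + 3, absoluteUrl, "/")
    If pathStart = 0 Then
        PathAndQuery = "/"          ' URL had a host but no path
    Else
        PathAndQuery = Mid$(absoluteUrl, pathStart)
    End If
End Function
```

With that in place, Debug.Print PathAndQuery(hr) should print only the path and query string of the link.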
The collection returned by getElementsByTagName is zero-based, whereas .Length (the number of H3 elements; other collections call this Count) is a one-based count, so the valid indexes run from 0 to .Length - 1.
Dim i As Long
For i = 0 To Doc.getElementsByTagName("h3").Length - 1
    Debug.Print Doc.getElementsByTagName("h3")(i).getElementsByTagName("a")(0).href
Next i
This gets the first <a> tag from each H3. You could apply the same approach with an inner loop to get multiple <a> tags from each H3.
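A rough sketch of that nested version follows. It assumes Doc is the loaded document from the question's code (IE.document); ListAllH3Links is a hypothetical name, and the variables are declared As Object so the snippet does not depend on a particular library reference. Because the inner loop runs zero times for an H3 that contains no link, it also avoids the error the (0) index would raise in that case.

```vba
' Hypothetical procedure: prints every link found inside every H3 of the document
Sub ListAllH3Links(ByVal Doc As Object)
    Dim h3s As Object      ' collection of <h3> elements
    Dim links As Object    ' collection of <a> elements inside one <h3>
    Dim i As Long
    Dim j As Long

    Set h3s = Doc.getElementsByTagName("h3")
    For i = 0 To h3s.Length - 1
        Set links = h3s(i).getElementsByTagName("a")
        For j = 0 To links.Length - 1
            ' i = index of the H3, j = index of the link within that H3
            Debug.Print i, j, links(j).href
        Next j
    Next i
End Sub
```

You would call it as ListAllH3Links IE.document once the page has finished loading.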