Web scraping using VBA Excel

Problem description

Which code should I use to extract the price from this HTML? I would like to get the element by id, tag name, or class name.

Website URL: https://www.tatacliq.com/vivo-v5-32gb-gold-4-gb-ram-dual-sim-4g/p-mp000000000734559

Let's say I want to fetch the "price" value from the HTML below (i.e. get "14999"):

    <h3 class="company author"> </h3>
    <div itemprop="offers" itemscope="" itemtype="http://schema.org/Offer" class="price">
        <p class="old" id="mrpPriceId" style="display:none"> </p>
        <p class="sale" id="mopPriceId" style="display:none"> </p>
        <p class="sale" id="spPriceId" style="display:none">
            <!-- For TPR-4358 Start -->
            <span itemprop="price">14999.00</span>
            <span itemprop="priceCurrency">INR</span>
            <meta itemprop="itemCondition" content="http://schema.org/NewCondition" />
            <meta itemprop="availability" content="http://schema.org/InStock"/>Available online</meta>
            <!-- For TPR-4358 End -->
        </p>
        <p class="savings pdp-savings" id="savingsOnProductId" style="display:none">
            <span></span>
        </p>
        <br>

Here is my code:

    Function Scraptatacliq(tatacliq_url As String, the_display_price As String, the_display_seller As String)
        ''MsgBox "Inside the function"
        'Dim objIE As New InternetExplorerMedium
        Scraptatacliq = ""
        If tatacliq_url = "" Then Exit Function
        the_display_price = ""
        the_display_seller = ""
        'MsgBox tatacliq_url
    the_start:
        Dim objIE As Object
        Set objIE = CreateObject("InternetExplorer.Application")
        'Set objIE = New InternetExplorerMedium
        objIE.Top = 0
        objIE.Left = 0
        objIE.Height = 800
        'objIE.Visible = True
        objIE.Navigate (tatacliq_url)
        'MsgBox tatacliq_url
        Do
            DoEvents
            If Err.Number <> 0 Then
                objIE.Quit
                Set objIE = Nothing
                GoTo the_start:
            End If
        Loop Until objIE.ReadyState = 4
        'Loop
        Count = 500000
        Do
            If Count <> 0 Then
                Count = Count - 1
            End If
        Loop Until Count = 0
        'EXTRACT DISPLAY PRICE
        'On Error Resume Next
        'MsgBox "I am about to extract display price"
        Set the_input_element = objIE.Document.getElementById("spPriceId").getElementsbyTagName("price")
        product_display_price = the_input_element.innertext
        'MsgBox product_display_price
        'If product_display_price = "" Then
        '   Set the_input_element = objIE.Document.getElementById("product_list_price")
        '   product_display_price = the_input_element.innertext
        'End If

[ Placeholder to explain what is currently going wrong with my code ]

Solution

Something is happening with JavaScript on this page that I cannot fully explain, and it may change the way you can scrape it.

Here, once you have objIE.Document.getElementById("spPriceId"), you can scrape the price through its outerText property:

    Function Scraptatacliq(tatacliq_url As String)
        Dim the_input_element As Object
        Dim product_display_price As String

        Scraptatacliq = ""
        If tatacliq_url = "" Then Exit Function
    the_start:
        Dim objIE As Object
        Set objIE = CreateObject("InternetExplorer.Application")
        objIE.Top = 0
        objIE.Left = 0
        objIE.Height = 800
        objIE.Navigate (tatacliq_url)
        ' Wait until the page has finished loading (READYSTATE_COMPLETE = 4)
        Do
            DoEvents
            If Err.Number <> 0 Then
                objIE.Quit
                Set objIE = Nothing
                GoTo the_start:
            End If
        Loop Until objIE.ReadyState = 4
        objIE.Visible = True
        ' Read the text of the whole <p id="spPriceId"> element
        Set the_input_element = objIE.Document.getElementById("spPriceId")
        product_display_price = the_input_element.outertext
        MsgBox product_display_price
        ' Also return the text to the caller
        Scraptatacliq = product_display_price
    End Function

Another solution might be cleaner if you can see what happens to the page when JavaScript is deactivated.
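
As a side note on the question's code: "price" is the value of the itemprop attribute, not a tag name, and getElementsByTagName returns a collection rather than a single element, so calling .innertext on its result cannot work as written. Below is a minimal sketch of two helpers that build on the outerText approach above. Both ParsePrice and FindPriceText are hypothetical names that do not appear in the original answer, and the sketch assumes the markup looks like the HTML in the question. It has not been checked against the live page, where script may fill or hide the element after loading, so the outerText route may still be the more reliable one.

```vba
Option Explicit

' Hypothetical helper (not from the original answer): pulls the leading number
' out of text such as "14999.00 INR Available online".
Function ParsePrice(ByVal rawText As String) As Double
    Dim cleaned As String
    ' Replace line breaks with spaces, then drop surrounding spaces.
    cleaned = Replace(Replace(rawText, vbCr, " "), vbLf, " ")
    cleaned = Trim$(cleaned)
    ' Val reads digits and a decimal point and stops at the first
    ' character it cannot treat as part of a number.
    ParsePrice = Val(cleaned)
End Function

' Hypothetical helper: looks inside <p id="spPriceId"> for the <span> whose
' itemprop attribute is "price" and returns its text, or "" if none is found.
' Assumes doc is an HTML document object such as objIE.Document.
Function FindPriceText(ByVal doc As Object) As String
    Dim container As Object
    Dim span As Object

    Set container = doc.getElementById("spPriceId")
    If container Is Nothing Then Exit Function

    ' "span" is the tag name; "price" is only an attribute value.
    For Each span In container.getElementsByTagName("span")
        ' Appending "" turns a missing (Null) attribute into an empty string.
        If (span.getAttribute("itemprop") & "") = "price" Then
            FindPriceText = span.innerText
            Exit Function
        End If
    Next span
End Function
```

With these in a standard module, ParsePrice(the_input_element.outertext) could replace the MsgBox call when a number is needed instead of text; for the sample text "14999.00 INR Available online" it evaluates to 14999. If FindPriceText returns an empty string on the real page, falling back to the outerText of the whole spPriceId element is the safer choice.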