Problem description

I need to download the PDF files for the first (top) five dates from the link below and save them, for instance, on the Desktop. I have no clue how to start, and I couldn't find anything explicit on Google.

Do you think you can help me?

http://cetatenie.just.ro/ordine/articol-11/

Solution

I would use Internet Explorer, and automate it using an SHDocVw.InternetExplorer object (VBA reference 'Microsoft Internet Controls', ieframe.dll).

You can either (a) create a new Internet Explorer window using Set x = New SHDocVw.InternetExplorer or (b) acquire an existing Internet Explorer window using Set owins = CreateObject("Shell.Application").Windows (owins is a collection of open shell windows, which includes File Explorer windows as well; loop through it until you find an item where Mid(TypeName(owins(i).Document), 1, 12) = "HTMLDocument").
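Option (b) can be sketched roughly as follows. This is a minimal late-bound sketch, not a definitive implementation: the function name FindOpenIE is hypothetical, and falling back to a new window when none is found is an assumption added for illustration.

```vba
' Returns the first open Internet Explorer window, or a new one if none is open.
' Late-bound, so no "Microsoft Internet Controls" reference is required.
Function FindOpenIE() As Object
    Dim owins As Object
    Dim w As Object
    Dim ie As Object

    Set owins = CreateObject("Shell.Application").Windows
    For Each w In owins
        ' The collection can contain empty slots and File Explorer windows.
        If Not w Is Nothing Then
            If Left$(TypeName(w.Document), 12) = "HTMLDocument" Then
                Set ie = w
                Exit For
            End If
        End If
    Next w

    ' Assumption: if no browser window is open, start a new one.
    If ie Is Nothing Then
        Set ie = CreateObject("InternetExplorer.Application")
        ie.Visible = True
    End If

    Set FindOpenIE = ie
End Function
```

A caller would then write Set ie = FindOpenIE() before navigating.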

Once you have an Internet Explorer object (call it ie), you can call ie.Navigate(url) to go to a web page.

To wait for Internet Explorer to finish navigating before you interrogate it, you can run something like:

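' Poll once per second until the page has finished loading.
' Application.Wait is Excel-specific; 4 = READYSTATE_COMPLETE.
Do While ie.Busy Or ie.ReadyState <> 4
    Application.Wait DateAdd("s", 1, Now)
    DoEvents
Loop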

To get the URLs for the first five PDFs on that page, you'd need to examine the HTML of the page. There are two ways, depending on how well-formed the HTML is. If the HTML is well-written, then you can navigate the Document Object Model (the tags, like XML) with ie.Document.all(). But if the HTML is not well-formed, you may have to resort to reading the HTML from ie.Document.all(0).innerHTML.
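For the DOM route, one possible sketch is to walk the <a> elements and keep those whose href ends in ".pdf". The function name FirstPdfLinks and the maxCount parameter are hypothetical. It also assumes the page lists the newest orders first with one PDF per entry, so the first five PDF links correspond to the first five dates; if a date has several PDFs, the count would need adjusting.

```vba
' Returns up to maxCount PDF URLs from the page currently loaded in ie.
' Assumes ie has finished loading before this is called.
Function FirstPdfLinks(ie As Object, maxCount As Long) As Collection
    Dim result As New Collection
    Dim a As Object

    For Each a In ie.Document.getElementsByTagName("a")
        ' The href property is resolved by the browser, so relative links
        ' such as /wp-content/uploads/... come back as full URLs.
        If LCase$(Right$(a.href, 4)) = ".pdf" Then
            result.Add a.href
            If result.Count >= maxCount Then Exit For
        End If
    Next a

    Set FirstPdfLinks = result
End Function
```

For the question as asked, the call would be Set links = FirstPdfLinks(ie, 5).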

By the looks of the link you gave, you will be looking for things like:

<li>Data de <strong>22.03.2013</strong>, numarul: <a href="/wp-content/uploads/Ordin-149P-din-22.03.2013.pdf">149P</a></li>
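If the DOM turns out to be unusable, the same hrefs can be pulled out of the raw HTML string. The sketch below is only an illustration: the function name PdfHrefsFromHtml is hypothetical, and it assumes every href value is wrapped in double quotes as in the line above. It returns the values as they appear in the markup, so relative paths still need the site address in front.

```vba
' Scans html for href="..." values ending in .pdf and returns up to maxCount of them.
Function PdfHrefsFromHtml(html As String, maxCount As Long) As Collection
    Dim result As New Collection
    Dim pos As Long      ' start of the current href value
    Dim q As Long        ' position of the closing quote
    Dim href As String

    pos = InStr(1, html, "href=""", vbTextCompare)
    Do While pos > 0 And result.Count < maxCount
        pos = pos + Len("href=""")
        q = InStr(pos, html, """")
        If q = 0 Then Exit Do          ' unterminated attribute: stop scanning

        href = Mid$(html, pos, q - pos)
        If LCase$(Right$(href, 4)) = ".pdf" Then result.Add href

        pos = InStr(q, html, "href=""", vbTextCompare)
    Loop

    Set PdfHrefsFromHtml = result
End Function
```

It could be fed with ie.Document.all(0).innerHTML, for example PdfHrefsFromHtml(ie.Document.all(0).innerHTML, 5).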

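Once you have isolated each PDF URL (using either the href attribute of the <a> tag in the DOM or lots of Mid() calls on the HTML), and prefixed the site address http://cetatenie.just.ro wherever the href is relative, as it is in the example above, you can download it using: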

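' Place this declaration at the top of the module, before any procedures.
' On 64-bit Office (VBA7), write "Declare PtrSafe Function" and use LongPtr for pCaller and lpfnCB.
Private Declare Function URLDownloadToFile _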
Lib "urlmon" _
Alias "URLDownloadToFileA" _
( _
    ByVal pCaller As Long, _
    ByVal szURL As String, _
    ByVal szFileName As String, _
    ByVal dwReserved As Long, _
    ByVal lpfnCB As Long _
) As Long

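Dim ss As String
Dim ts As String
ss = "http://blah/blah/blah.pdf"   ' source URL (placeholder)
ts = "c:\meh\blah.pdf"             ' target path (placeholder); the folder must already exist
' URLDownloadToFile returns 0 on success.
If URLDownloadToFile(0, ss, ts, 0, 0) <> 0 Then Debug.Print "Download failed: " & ss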
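To save several files to the Desktop, as the question asks, the call can be wrapped in a loop. This is a sketch under stated assumptions: the function name DownloadToDesktop is hypothetical, urls is assumed to be a Collection of absolute URLs, and the Desktop path is looked up through WScript.Shell rather than hard-coded. The conditional declaration covers both 32-bit and 64-bit Office and belongs at the top of the module.

```vba
#If VBA7 Then
    Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" (ByVal pCaller As LongPtr, _
        ByVal szURL As String, ByVal szFileName As String, _
        ByVal dwReserved As Long, ByVal lpfnCB As LongPtr) As Long
#Else
    Private Declare Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
        ByVal szURL As String, ByVal szFileName As String, _
        ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long
#End If

' Downloads each URL in urls to the current user's Desktop.
' Returns the number of files that were downloaded successfully.
Function DownloadToDesktop(urls As Collection) As Long
    Dim desktop As String
    Dim u As Variant
    Dim target As String
    Dim okCount As Long

    desktop = CreateObject("WScript.Shell").SpecialFolders("Desktop")

    For Each u In urls
        ' Use the part after the last "/" as the local file name.
        target = desktop & "\" & Mid$(u, InStrRev(u, "/") + 1)
        ' URLDownloadToFile returns 0 (S_OK) on success.
        If URLDownloadToFile(0, CStr(u), target, 0, 0) = 0 Then
            okCount = okCount + 1
        End If
    Next u

    DownloadToDesktop = okCount
End Function
```

Returning the success count lets the caller check whether all five files actually arrived, since URLDownloadToFile does not raise a VBA error when a download fails.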


08-13 15:07