Question
I'm trying to save a whole web page on my system as an .html file and then parse that file to find some tags and use them. I can save and parse http:// URLs, but not https:// URLs. I'm using Perl. The following code saves http pages and works fine, but it doesn't work for https. Is it possible to parse an https page?
use strict;
use warnings;
use LWP::Simple qw($ua get);
use LWP::UserAgent;
use LWP::Protocol::https;
use HTTP::Cookies;

sub main
{
    my $ua = LWP::UserAgent->new();
    my $cookies = HTTP::Cookies->new(
        file => "cookies.txt",
        autosave => 1,
    );
    $ua->cookie_jar($cookies);
    $ua->agent("Google Chrome/30");
    #$ua->ssl_opts( SSL_ca_file => 'cert.pfx' );
    $ua->proxy('http','http://proxy.com');
    my $response = $ua->get('http://google.com');
    #$ua->credentials($response, "", "usrname", "password");
    unless($response->is_success) {
        print "Error: " . $response->status_line;
    }

    # Let's save the output.
    my $save = "save.html";
    unless(open SAVE, '>' . $save) {
        die "\nCannot create save file '$save'\n";
    }

    # Without this line, we may get a
    # 'wide characters in print' warning.
    binmode(SAVE, ":utf8");
    print SAVE $response->decoded_content;
    close SAVE;
    print "Saved ",
        length($response->decoded_content),
        " bytes of data to '$save'.";
}

main();
Recommended answer
It's always worth checking the documentation for the modules that you're using.
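You're using modules from libwww-perl. That includes a cookbook (lwpcook). And in that cookbook, there is a section about HTTPS, which says that URLs with the https scheme are accessed in exactly the same way as http URLs, provided that an SSL interface module for LWP has been properly installed. Without one, requests to such URLs fail with a "501 Protocol scheme 'https' is not supported" error.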
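The README.SSL file in the libwww-perl distribution says that, as of libwww-perl 6.02, https support for LWP::UserAgent lives in the LWP::Protocol::https module, which ships as its own separate distribution.
So you just need to install LWP::Protocol::https.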
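Once the module is installed, a quick way to confirm that LWP can actually speak https is to ask the user agent before making the request. The following is only a minimal sketch, not the asker's final script: the helper names https_supported and fetch_and_save, the command-line handling, and the output file name save.html are assumptions made for illustration, and proxy and cookie setup are left out.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Returns 1 if this LWP installation can handle https:// URLs
# (i.e. LWP::Protocol::https and its SSL dependencies load), else 0.
sub https_supported {
    my ($ua) = @_;
    return $ua->is_protocol_supported('https') ? 1 : 0;
}

# Fetches $url and writes the decoded body to $file.
# Dies with the HTTP status line if the request fails.
sub fetch_and_save {
    my ($ua, $url, $file) = @_;

    my $response = $ua->get($url);
    die "Error fetching $url: " . $response->status_line . "\n"
        unless $response->is_success;

    my $content = $response->decoded_content;

    # The encoding layer avoids 'wide character in print' warnings.
    open my $fh, '>:encoding(UTF-8)', $file
        or die "Cannot create save file '$file': $!\n";
    print {$fh} $content;
    close $fh or die "Cannot close '$file': $!\n";

    return length $content;
}

# Only runs when a URL is given on the command line, e.g.
#   perl fetch.pl https://example.com/
if (@ARGV) {
    my $ua = LWP::UserAgent->new;
    die "https is not supported; install LWP::Protocol::https\n"
        unless https_supported($ua);
    my $chars = fetch_and_save($ua, $ARGV[0], 'save.html');
    print "Saved $chars characters to 'save.html'.\n";
}
```

If https_supported returns 0, the SSL interface module is still missing or failed to load, which matches the 501 error described above. Unlike the script in the question, this sketch stops on a failed request instead of saving an empty or error page.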