When writing a simple crawler in PHP, we often need to download remote images, so here is a simple way to do it.
1: Using curl

Say we have the following two images:

$images = [
    'https://img.alicdn.com/tps/TB1jjaYOFXXXXa2aXXXXXXXXXXX-276-402.jpg_150x10000q90.jpg',
    'https://img.alicdn.com/tfs/TB15QQ5cgMPMeJjy1XbXXcwxVXa-520-280.jpg_q90_.webp'
];

As a first step, we can start with the simplest possible code:

function download($url, $path = 'images/')
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // trust any certificate (skips verification)
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);     // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
    $file = curl_exec($ch);
    curl_close($ch);
    $filename = pathinfo($url, PATHINFO_BASENAME);
    // 'wb' overwrites an existing file; 'a' would append and corrupt the image on a rerun
    $resource = fopen($path . $filename, 'wb');
    fwrite($resource, $file);
    fclose($resource);
}

Downloading the remote images then looks like this:

foreach ($images as $url) {
    download($url);
}
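The simple download() above holds the whole response in memory and writes whatever curl_exec() returns, even when the request fails. As a rough sketch of one alternative, the hypothetical helper downloadToFile() below (its name, parameters, and timeout values are assumptions, not part of the original code) lets curl stream the body straight into the file via CURLOPT_FILE and reports failure instead of leaving a broken file behind.

```php
<?php
// Hypothetical helper: stream a remote file to disk and report success.
function downloadToFile(string $url, string $dest, int $timeout = 30): bool
{
    $fp = fopen($dest, 'wb');                       // truncate any previous file
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // write the body directly to $fp
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP status >= 400 as a failure
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);   // assumed value
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);    // cap the whole transfer
    $ok = curl_exec($ch);                           // true or false when CURLOPT_FILE is set
    curl_close($ch);
    fclose($fp);

    if ($ok === false) {
        unlink($dest);                              // do not keep an empty or partial file
        return false;
    }
    return true;
}
```

A call such as downloadToFile($url, 'images/' . basename($url)) could then be checked in the loop, so failed URLs can be logged or retried. Certificate verification is left at curl's default here, which is usually the safer choice.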

2: Wrapping it in a class
Now that the idea is clear, we can wrap this basic functionality in a class:

class Spider {

    public function downloadImage($url, $path = 'images/')
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // trust any certificate (skips verification)
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
        $file = curl_exec($ch);
        curl_close($ch);
        $filename = pathinfo($url, PATHINFO_BASENAME);
        $resource = fopen($path . $filename, 'wb'); // overwrite rather than append
        fwrite($resource, $file);
        fclose($resource);
    }
}

Or we can tidy it up a little by splitting out the save step:

class Spider {

    public function downloadImage($url, $path = 'images/')
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // trust any certificate (skips verification)
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
        $file = curl_exec($ch);
        curl_close($ch);

        $this->saveAsImage($url, $file, $path);
    }

    private function saveAsImage($url, $file, $path)
    {
        $filename = pathinfo($url, PATHINFO_BASENAME);
        $resource = fopen($path . $filename, 'wb'); // overwrite rather than append
        fwrite($resource, $file);
        fclose($resource);
    }
}

With the class in place, we can download the images like this:

$spider = new Spider();

foreach ($images as $url) {
    $spider->downloadImage($url);
}

With this, downloading ordinary, publicly accessible remote images works fine.
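One thing to watch: pathinfo($url, PATHINFO_BASENAME) works on the raw URL string, so a query string or fragment ends up in the saved file name. A minimal sketch of one way around that, assuming a hypothetical helper fileNameFromUrl() and a fallback name of our own choosing, is to take the name from the URL's path component only:

```php
<?php
// Hypothetical helper, not part of the original Spider class.
function fileNameFromUrl(string $url, string $fallback = 'download.bin'): string
{
    // Look only at the path component, so ?query and #fragment are dropped.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return $fallback;
    }

    // Decode first, then let basename() strip every directory part, so an
    // encoded "../" in the URL cannot point outside the target directory.
    $name = basename(rawurldecode($path));

    return ($name === '' || $name === '.' || $name === '..') ? $fallback : $name;
}
```

Inside saveAsImage(), the result could replace the pathinfo() call; the two sample URLs in $images have no query string, so they keep their original names.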

Alternatively, the download function can be fleshed out further like this:

/**
 * Download a remote image and save it locally
 * @access public
 * @author lxhui<772932587@qq.com>
 * @since 1.0
 * @param string $url      Remote image URL
 * @param string $save_dir Directory to save the image in
 * @param string $filename File name to save as (generated when empty)
 * @return array file_name, save_path and error (0 on success)
 */
function download($url, $save_dir = './public/upload/loan/', $filename = '')
{
    if (trim($save_dir) == '') {
        $save_dir = './';
    }
    if (trim($filename) == '') { // no file name given: build one from the URL extension
        $allowExt = array('gif', 'jpg', 'jpeg', 'png', 'bmp');
        // strrchr() returns the extension with its leading dot, so strip the dot
        $ext = strtolower(ltrim((string) strrchr($url, '.'), '.'));
        if (!in_array($ext, $allowExt)) {
            return array('file_name' => '', 'save_path' => '', 'error' => 3);
        }
        $filename = time() . '.' . $ext;
    }
    if (substr($save_dir, -1) !== '/') { // make sure the directory ends with a slash
        $save_dir .= '/';
    }
    // Create the save directory
    if (!file_exists($save_dir) && !mkdir($save_dir, 0777, true)) {
        return array('file_name' => '', 'save_path' => '', 'error' => 5);
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // trust any certificate (skips verification)
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
    $file = curl_exec($ch);
    curl_close($ch);
    // Save under the file name chosen above
    $resource = fopen($save_dir . $filename, 'wb');
    fwrite($resource, $file);
    fclose($resource);
    unset($file, $url);
    return array('file_name' => $filename, 'save_path' => $save_dir . $filename, 'error' => 0);
}
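The function above judges the image type from the text of the URL alone, so a server that answers with an HTML error page under a .jpg URL would still be saved as an image. As a hedged sketch of a complementary check, the hypothetical helper imageExtensionFromBytes() below (the name and the MIME-to-extension map are assumptions) inspects the downloaded bytes with getimagesizefromstring() instead:

```php
<?php
// Hypothetical helper: derive a file extension from the downloaded bytes.
function imageExtensionFromBytes(string $bytes): ?string
{
    // Assumed mapping; extend it if other formats should be accepted.
    $map = [
        'image/gif'      => 'gif',
        'image/jpeg'     => 'jpg',
        'image/png'      => 'png',
        'image/bmp'      => 'bmp',
        'image/x-ms-bmp' => 'bmp', // reported for BMP by older PHP versions
    ];

    // getimagesizefromstring() reads the image header from the data itself and
    // returns false when the bytes are not a recognised image format.
    $info = @getimagesizefromstring($bytes);
    if ($info === false) {
        return null;
    }

    return $map[$info['mime']] ?? null;
}
```

It could be called on $file right after curl_exec(), returning an error code of your own choosing when the result is null.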