Problem Description
I am trying to obtain clinical images of psoriasis patients from these two websites for research purposes:
http://www.dermis.net/dermisroot/zh/31346/diagnostic.htm
http://dermatlas.med.jhmi.edu/derm/
For the first site, I tried just saving the page with Firefox, but it only saved the thumbnails and not the full-sized images. I was able to access the full-sized images using a Firefox add-on called "DownThemAll", but it saved each image as part of a new HTML page and I do not know of any way to extract just the images.
I also tried getting on one of my university's Linux machines and using wget to mirror the websites, but I was not able to get it to work and am still unsure why.
Consequently, I am wondering whether it would be easy to write a short script (or whatever method is easiest) to (a) obtain the full-sized images linked to on the first website, and (b) obtain all full-sized images on the second site with "psoriasis" in the filename.
I have been programming for a couple of years, but have zero experience with web development and would appreciate any advice on how to go about doing this.
Recommended Answer
Why not use wget to recursively download images from the domain? Here is an example:
wget -r -P /save/location -A jpeg,jpg,bmp,gif,png http://www.domain.com
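On the first site the thumbnails link to separate pages that hold the full-sized images, so the crawl needs enough recursion depth to follow those intermediate pages. A rough sketch for part (a), assuming the full-sized images sit within two levels of links from the diagnostic page and are hosted on dermis.net itself:

wget -r -l 2 -nd -P /save/location -A jpeg,jpg,png http://www.dermis.net/dermisroot/zh/31346/diagnostic.htm

With -r and an -A list, wget still fetches the intermediate HTML pages to discover links and then deletes anything that does not match the accepted extensions; -nd drops the site's directory structure so the images land directly in /save/location. If the page source shows the images served from a different hostname, add -H -D that-hostname so wget is allowed to follow links across hosts.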
Here is the man page: http://www.gnu.org/software/wget/manual/wget.html
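For part (b), wget can also restrict the crawl to filenames containing "psoriasis", since entries in -A that contain wildcards are treated as glob patterns rather than plain suffixes (newer builds additionally offer --accept-regex). A minimal sketch along the same lines, assuming the relevant images are reachable by recursively crawling the second site:

wget -r -nd -P /save/location -A "*psoriasis*.jpg,*psoriasis*.jpeg,*psoriasis*.png,*psoriasis*.gif" http://dermatlas.med.jhmi.edu/derm/

If the filenames use a different capitalization or naming scheme, adjust the patterns after checking a few of the image URLs in the browser.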