问题描述
我在python中使用google image api下载了第20张图片结果,代码如下:
导入操作系统导入系统导入时间从 urllib 导入 FancyURLopener导入 urllib2导入 simplejsonsearchTerm = "猫"# 替换 '%20' 的搜索词中的空格 ' ' 以符合请求searchTerm = searchTerm.replace(' ','%20')# 使用定义的版本启动 FancyURLopener类 MyOpener(FancyURLopener):version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'myopener = MyOpener()# 设置计数为0计数=0对于范围(0,4)中的我:# 请注意,每次迭代的开始都会发生变化,以便为每个循环请求一组新的图像url = ('https://ajax.googleapis.com/ajax/services/search/images?'+'v=1.0&q='+searchTerm7+'&start='+str(i*4)+'&userip=MyIP&imgsz=xlarge|xxlarge|huge')打印网址request = urllib2.Request(url, None, {'Referer': 'testing'})响应 = urllib2.urlopen(请求)# 使用 JSON 获取结果结果 = simplejson.load(响应)数据 = 结果['responseData']数据信息 = 数据['结果']# 迭代每个结果并获得未转义的 url对于 dataInfo 中的 myUrl:计数 = 计数 + 1打印 myUrl['unescapedUrl']os.chdir(新路径)myopener.retrieve(myUrl['unescapedUrl'],str(num)+'-'+str(count))# 休眠一秒以防止 Google 阻止 IP时间.sleep(3)
但现在我想使用谷歌自定义搜索来做到这一点,以获得更好的结果.我知道我应该注册以获得 APIKey,但我没有找到任何简单的例子作为我发布的代码.有人可以帮忙吗,我真的迷失在谷歌文档中.
显然对免费 api 有限制,每天 100 个请求,对吗?
我现在在这里,但仍然无法工作
导入操作系统导入系统导入时间从 urllib 导入 FancyURLopener导入 urllib2导入 simplejson导入 cStringIO导入打印searchTerm="猫"# 使用定义的版本启动 FancyURLopener类 MyOpener(FancyURLopener):version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'myopener = MyOpener()url='https://www.googleapis.com/customsearch/v1?key=API_KEY&cx=017576662512468239146:omuauf_lfve'+'&q='+searchTerm+'&searchType=image'+'&start=0'+'&imgSize=xlarge|xxlarge|巨大'打印网址request = urllib2.Request(url, None, {'Referer': 'testing'})响应 = urllib2.urlopen(请求)# 使用 JSON 获取结果数据 = json.load(响应)pprint.PrettyPrinter(indent=4).pprint(data['items'][0])
你可以使用这个 Google API 客户端 Python 库.
演示:
此处 是一个示例(我将其更改为):
from apiclient.discovery 导入构建service = build("customsearch", "v1",developerKey="** 您的开发人员密钥 **")res = service.cse().list(q='蝴蝶',cx=' ** 你的 cx **',搜索类型='图像',数量=3,imgType='剪贴画',文件类型='png',安全 = '关闭').执行()如果不是 res 中的项目":打印 '没有结果 !!
res 是:{}'.format(res)别的:对于 res['items'] 中的项目:打印('{}:
{}'.format(item['title'], item['link']))
输出:
剪贴画 - 蝴蝶:http://openclipart.org/image/800px/svg_to_png/3965/jonata_Butterfly.png动物、蝴蝶、昆虫、自然 - 免费图片 - 158831:http://pixabay.com/static/uploads/photo/2013/07/13/11/51/animal-158831_640.png剪贴画 - 帝王蝶:http://openclipart.org/image/800px/svg_to_png/110023/Monarch_Butterfly_by_Merlin2525.png
是的,免费
版本有限制,您可以,然后选择您的自定义搜索引擎,然后在 基础 标签,将 Image search
选项设置为 ON
,对于 Sites to search
部分,选择 Search the entire web but强调包含的网站
选项.
链接:
- https://google-api-client-libraries.appspot.com/documentation/customsearch/v1/python/latest/customsearch_v1.cse.html
- https://developers.google.com/custom-search/json-api/v1/reference/cse/list
- https://www.google.com/cse/all
- https://developers.google.com/api-客户端库/python/apis/customsearch/v1
- https://console.developers.google.com/project
- https://developers.google.com/api-client-图书馆/python/start/get_started
- https://developers.google.com/api-client-library/python/guide/aaa_apikeys
I have used google image api in python to download 20 first image result with the following code:
import os
import sys
import time
from urllib import FancyURLopener
import urllib2
import simplejson
searchTerm = "Cat"
# Replace spaces ' ' in search term for '%20' in order to comply with request
searchTerm = searchTerm.replace(' ','%20')
# Start FancyURLopener with defined version
class MyOpener(FancyURLopener):
version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'
myopener = MyOpener()
# Set count to 0
count=0
for i in range(0,4):
# Notice that the start changes for each iteration in order to request a new set of images for each loop
url = ('https://ajax.googleapis.com/ajax/services/search/images?'+'v=1.0&q='+searchTerm7+'&start='+str(i*4)+'&userip=MyIP&imgsz=xlarge|xxlarge|huge')
print url
request = urllib2.Request(url, None, {'Referer': 'testing'})
response = urllib2.urlopen(request)
# Get results using JSON
results = simplejson.load(response)
data = results['responseData']
dataInfo = data['results']
# Iterate for each result and get unescaped url
for myUrl in dataInfo:
count = count + 1
print myUrl['unescapedUrl']
os.chdir(newpath)
myopener.retrieve(myUrl['unescapedUrl'],str(num)+'-'+str(count))
# Sleep for one second to prevent IP blocking from Google
time.sleep(3)
But now i would like to use google custom search to do that, in order to get better result. I have understand that i should register to get a APIKey but i did'nt find any simple example as the code i post. Does some one can help, i am really lost in the google documentation.
Visibly there is restriction for the free api, 100 request a day, is that correct?
Edit: I am here rightnow, but still not work
import os
import sys
import time
from urllib import FancyURLopener
import urllib2
import simplejson
import cStringIO
import pprint
searchTerm="Cat"
# Start FancyURLopener with defined version
class MyOpener(FancyURLopener):
version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'
myopener = MyOpener()
url='https://www.googleapis.com/customsearch/v1?key=API_KEY&cx=017576662512468239146:omuauf_lfve'+'&q='+searchTerm+'&searchType=image'+'&start=0'+'&imgSize=xlarge|xxlarge|huge'
print url
request = urllib2.Request(url, None, {'Referer': 'testing'})
response = urllib2.urlopen(request)
# Get results using JSON
data = json.load(response)
pprint.PrettyPrinter(indent=4).pprint(data['items'][0])
You can use this Google APIs Client Library for Python.
Demo:
Here is a sample (i change it to):
from apiclient.discovery import build
service = build("customsearch", "v1",
developerKey="** your developer key **")
res = service.cse().list(
q='butterfly',
cx=' ** your cx **',
searchType='image',
num=3,
imgType='clipart',
fileType='png',
safe= 'off'
).execute()
if not 'items' in res:
print 'No result !!
res is: {}'.format(res)
else:
for item in res['items']:
print('{}:
{}'.format(item['title'], item['link']))
Output:
Clipart - Butterfly:
http://openclipart.org/image/800px/svg_to_png/3965/jonata_Butterfly.png
Animal, Butterfly, Insect, Nature - Free image - 158831:
http://pixabay.com/static/uploads/photo/2013/07/13/11/51/animal-158831_640.png
Clipart - Monarch Butterfly:
http://openclipart.org/image/800px/svg_to_png/110023/Monarch_Butterfly_by_Merlin2525.png
Yes, there is a limitation for Free
edition and you can monitor it from Google developer console:
Note:
Go to your Custom Search Engine, then select your custom search engine, then in Basics tab,set Image search
option to ON
, and for Sites to search
section, select Search the entire web but emphasize included site
option.
Links:
- https://google-api-client-libraries.appspot.com/documentation/customsearch/v1/python/latest/customsearch_v1.cse.html
- https://developers.google.com/custom-search/json-api/v1/reference/cse/list
- https://www.google.com/cse/all
- https://developers.google.com/api-client-library/python/apis/customsearch/v1
- https://console.developers.google.com/project
- https://developers.google.com/api-client-library/python/start/get_started
- https://developers.google.com/api-client-library/python/guide/aaa_apikeys
这篇关于使用谷歌自定义搜索 api 下载图像的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!