I made an API call with the following query, which returns JSON:
import requests
import json
import pandas as pd
url = ("https://api.meetup.com/2/groups?zip=b1+1aa&offset=0&format=json&lon=-1.89999997616&category_id=34&photo-host=public&page=500&radius=200.0&fields=&lat=52.4799995422&order=id&desc=false&sig_id=243750775&sig=ed49065d620a34c10e1f0f91dd58da2e36547af1")
data = requests.get(url).json()
df = pd.json_normalize(data['results'])
This gives me a DataFrame. However, I have 5 more URL pages to query, like this:
url2 = ("https://api.meetup.com/2/groups?zip=b1+1aa&offset=1&format=json&lon=-1.89999997616&category_id=34&photo-host=public&page=500&radius=200.0&fields=&lat=52.4799995422&order=id&desc=false&sig_id=243750775&sig=ed49065d620a34c10e1f0f91dd58da2e36547af1")
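and url3 is similar, only with offset=2, and so on. The offset value is what changes the page. I would like to know whether I can use a for loop to go through all of these pages.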
Best answer
First, don't hardcode the query string in the URL. Pass the query data as a dict to requests instead, i.e.:
url = "https://api.meetup.com/2/groups"
querydict = {
    "zip": "b1 1aa",  # a literal space; requests encodes it as "+" in the URL
    "offset": 0,
    "format": "json",
    "lon": -1.89999997616,
    "category_id": 34,
    "photo-host": "public",
    # etc
}
response = requests.get(url, params=querydict)
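To see what URL requests builds from such a dict without sending anything over the network, you can prepare the request and inspect it. This is only a sketch: the dict is a shortened copy of the one above, and the name prepared_url is made up for illustration.

```python
import requests

url = "https://api.meetup.com/2/groups"
# Shortened query dict, for illustration only.
querydict = {"zip": "b1 1aa", "offset": 0, "format": "json"}

# Build the request without sending it, then look at the final URL.
prepared_url = requests.Request("GET", url, params=querydict).prepare().url
print(prepared_url)
```

The space in the zip value comes out as "+" in the query string, which is why the dict should hold the decoded value rather than the already-encoded "b1+1aa".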
Then all you have to do is loop until you have everything you want, updating querydict["offset"] on each iteration:
url = "https://api.meetup.com/2/groups"
querydict = {
    "zip": "b1 1aa",  # a literal space; requests encodes it as "+" in the URL
    "offset": 0,
    "format": "json",
    "lon": -1.89999997616,
    "category_id": 34,
    "photo-host": "public",
    # etc
}
while True:
    response = requests.get(url, params=querydict)
    # check your response status, check the json data
    # etc
    if we_have_enough(response):  # placeholder: your own stop condition
        break
    # ok let's fetch next page
    querydict["offset"] += 1
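Putting the pieces together, here is one possible way to collect every page into a single DataFrame. Treat it as a sketch under assumptions rather than a definitive implementation: the function name fetch_all_pages, the max_pages cap of 6 (the first page plus the 5 more mentioned in the question), and the rule "stop when a page has no results" are my own choices, not something the API guarantees. The get parameter is also hypothetical; it exists only so the function can be exercised without hitting the network.

```python
import pandas as pd
import requests

API_URL = "https://api.meetup.com/2/groups"


def fetch_all_pages(params, max_pages=6, get=requests.get):
    """Fetch pages offset=0..max_pages-1 and return their results as one DataFrame.

    Assumption: an empty 'results' list means there are no more pages.
    """
    query = dict(params)  # copy, so the caller's dict is left untouched
    rows = []
    for offset in range(max_pages):
        query["offset"] = offset
        response = get(API_URL, params=query)
        response.raise_for_status()  # fail loudly on HTTP errors
        results = response.json().get("results", [])
        if not results:
            break  # assumed end-of-data signal
        rows.extend(results)
    # Flatten the nested JSON records, as in the question.
    return pd.json_normalize(rows)
```

You would call it as df = fetch_all_pages(querydict), passing the full query dict, including the sig_id and sig values from your original URL.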