How do I get to the second page of this data set? No matter what I try, it only returns page 1.
import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
myURL = 'https://jobs.collinsaerospace.com/search-jobs/'
uClient = uReq(myURL)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")
container = page_soup.findAll("section", {"id":"search-results"}, {"data-current-page":"4"})
for child in container:
    for heading in child.find_all('h2'):
        print(heading.text)
Accepted answer
Try the script below to fetch results from whichever pages you are interested in; all you need to change is the range. A while loop could be used to exhaust the entire listing, but that is not what you asked (a sketch of that approach follows the script).
import requests
from bs4 import BeautifulSoup
link = 'https://jobs.collinsaerospace.com/search-jobs/results?'
params = {
    'CurrentPage': '',
    'RecordsPerPage': 15,
    'Distance': 50,
    'SearchResultsModuleName': 'Search Results',
    'SearchFiltersModuleName': 'Search Filters',
    'SearchType': 5
}
for page in range(1, 5):  # change this range to fetch whichever pages you want
    params['CurrentPage'] = page
    res = requests.get(link, params=params)
    # the endpoint returns JSON whose 'results' field holds an HTML fragment
    soup = BeautifulSoup(res.json()['results'], "lxml")
    for name in soup.select("h2"):
        print(name.text)
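
For completeness, here is a rough sketch of the while-loop variant mentioned above, which keeps requesting pages until a page yields no more h2 headings. The stopping condition is an assumption about how the endpoint behaves once the page number runs past the last page of results, so treat this as a starting point rather than a definitive implementation.

import requests
from bs4 import BeautifulSoup

link = 'https://jobs.collinsaerospace.com/search-jobs/results?'
params = {
    'CurrentPage': '',
    'RecordsPerPage': 15,
    'Distance': 50,
    'SearchResultsModuleName': 'Search Results',
    'SearchFiltersModuleName': 'Search Filters',
    'SearchType': 5
}

page = 1
while True:
    params['CurrentPage'] = page
    res = requests.get(link, params=params)
    # assumption: past the last page, the 'results' HTML fragment contains no <h2> headings
    soup = BeautifulSoup(res.json()['results'], "lxml")
    headings = soup.select("h2")
    if not headings:
        break
    for name in headings:
        print(name.text)
    page += 1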
Related Stack Overflow question (python - web scraping - getting to page 2): https://stackoverflow.com/questions/56819523/