I am trying to scrape some information from web pages. On one page it works fine, but on another it does not: I only get None as the return value.

This code / webpage works fine:

    # https://realpython.com/beautiful-soup-web-scraper-python/
    import requests
    from bs4 import BeautifulSoup

    URL = "https://www.monster.at/jobs/suche/?q=Software-Devel&where=Graz"
    page = requests.get(URL)
    soup = BeautifulSoup(page.content, "html.parser")
    name_box = soup.findAll("div", attrs={"class": "company"})
    print(name_box)

But with this code / webpage I only get None as the return value:

    # https://www.freecodecamp.org/news/how-to-scrape-websites-with-python-and-beautifulsoup-5946935d93fe/
    import requests
    from bs4 import BeautifulSoup

    URL = "https://www.bloomberg.com/quote/SPX:IND"
    page = requests.get(URL)
    soup = BeautifulSoup(page.content, "html.parser")
    name_box = soup.find("h1", attrs={"class": "companyName__99a4824b"})
    print(name_box)

Why is that? (At first I thought the number in the class name on the second page, "companyName__99a4824b", meant that the class name changes dynamically. But that is not the case: when I refresh the page, the class name is still the same.)

Solution

The reason you get None is that the Bloomberg page uses JavaScript to load its content while the user is on the page.

requests only downloads the HTML that the server sends for the initial request, and BeautifulSoup only parses that HTML. Neither of them runs JavaScript, so the document you search does not contain an element with the companyName__99a4824b class.

Only after the page has fully loaded in a browser does the HTML include the desired tag.

If you want to scrape that data, you'll need to use something like Selenium, which you can instruct to wait until the desired element of the page is ready.
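As a rough illustration of the Selenium approach mentioned in the answer, here is a minimal sketch. It assumes the `selenium` package is installed and that a Chrome browser is available locally. The function name `fetch_rendered_text`, the CSS selector built from the question's class name, and the 15-second timeout are illustrative choices, not part of the original answer. The site may also block automated browsers or change the class name, in which case the wait will time out.

```python
URL = "https://www.bloomberg.com/quote/SPX:IND"
# Selector built from the class name quoted in the question (may have changed).
SELECTOR = "h1.companyName__99a4824b"


def fetch_rendered_text(url, css_selector, timeout=15):
    """Open url in a browser, wait for css_selector to appear, return its text.

    Hypothetical helper for illustration. Raises selenium's TimeoutException
    if the element is not present within `timeout` seconds.
    """
    # Imported inside the function so this module still loads on machines
    # where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome is installed locally
    try:
        driver.get(url)
        # Poll the live DOM until the element exists, instead of reading
        # the HTML once as requests + BeautifulSoup do.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return element.text
    finally:
        # Close the browser even if the wait times out.
        driver.quit()
```

Calling `fetch_rendered_text(URL, SELECTOR)` would then return the heading text once the browser has rendered it. If you prefer to keep using BeautifulSoup for parsing, you could return `driver.page_source` after the wait instead and pass that string to `BeautifulSoup(..., "html.parser")`.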