Python: Beautifulsoup returning None or []

Question

Hello, I'm practicing my requests and web scraping skills, so I'm attempting to scrape the trending page on YouTube (https://www.youtube.com/feed/trending) and pull the titles of the videos that are trending.

This is the code I'm running:

import requests
from bs4 import BeautifulSoup

url = 'https://www.youtube.com/feed/trending'
html = requests.get(url)
soup = BeautifulSoup(html.content, "html.parser")
a = soup.find_all("a", {"id": "video-title"})
print(a)

It returns [], and I don't understand why, since the element is there in the page source.

Recommended answer

The web is devolving in that it's becoming increasingly inscrutable. "Modern" webpages, for the most part, are no longer generated by the server as the user will see them; rather, globs of script are sent to the user and inject whatever ¯\_(ツ)_/¯ into the DOM. requests only downloads that initial response and never runs the scripts, so the a elements with id video-title that show up in the browser's inspector are most likely not in the HTML that BeautifulSoup parses, and find_all returns [].
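To see the effect without touching the network, here is a minimal sketch. RAW_HTML is a made-up stand-in for a script-driven page, not YouTube's actual markup: the server sends an empty container plus a script that would add the link once a browser runs it.

```python
from bs4 import BeautifulSoup

# Hypothetical page: the link does not exist in the markup itself.
# A browser would create it by running the script; BeautifulSoup will not.
RAW_HTML = """
<html><body>
  <div id="contents"></div>
  <script>
    const link = document.createElement("a");
    link.id = "video-title";
    link.textContent = "Some video";
    document.getElementById("contents").appendChild(link);
  </script>
</body></html>
"""

soup = BeautifulSoup(RAW_HTML, "html.parser")

# Same query as in the question: nothing matches, because the parser
# treats the script body as plain text, not as elements.
links = soup.find_all("a", {"id": "video-title"})
print(links)  # []

# The id is present in the download, but only inside the script's text.
script_text = soup.find("script").string
print("video-title" in script_text)  # True
```

So a string search over the response can find "video-title" while find_all still comes back empty, which is one way this kind of confusion arises.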

That's why you'll need to use the Selenium bindings with a full-blown browser, as QHarr mentioned above.
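A rough sketch of what that could look like, assuming Selenium 4 with Chrome installed. The function names fetch_rendered_html and extract_titles are made up for illustration, and the a#video-title selector is taken from the question; YouTube's markup may differ today, and a consent or login page may get in the way.

```python
from bs4 import BeautifulSoup


def extract_titles(page_source):
    """Return the text of every <a id="video-title"> in the given HTML."""
    soup = BeautifulSoup(page_source, "html.parser")
    return [a.get_text(strip=True) for a in soup.find_all("a", {"id": "video-title"})]


def fetch_rendered_html(url, timeout=15):
    """Load url in headless Chrome and return the DOM after scripts have run.

    Assumes Selenium 4 and a local Chrome; the selector waited on is the
    one from the question and may not match the live page.
    """
    # Imported here so extract_titles stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until at least one title link has been injected into the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "a#video-title"))
        )
        return driver.page_source
    finally:
        driver.quit()
```

With that in place, extract_titles(fetch_rendered_html('https://www.youtube.com/feed/trending')) would be the rendered-page counterpart of the code in the question. The wait matters: reading page_source right after driver.get can still be too early for the scripts to have finished.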

My apologies for not making this a comment, but apparently I need 50 points to do that.

