Problem description
I am using the code below to build a list of the video titles in a public YouTube playlist. It works well for playlists with fewer than 100 videos. For playlists with more than 100 videos, only the titles of the first 100 videos are added to the list. I think this happens because, when the same page is loaded in a browser, only the first 100 videos are loaded initially; the remaining videos load as you scroll down the page. Is there any way to get the titles of all the videos in a playlist?
from bs4 import BeautifulSoup as bs
import requests

url = "https://www.youtube.com/playlist?list=PLRdD1c6QbAqJn0606RlOR6T3yUqFWKwmX"
r = requests.get(url)
soup = bs(r.text, 'html.parser')

# Each playlist row is a <tr> whose data-title attribute holds the video title
res = soup.find_all('tr', {'class': 'pl-video yt-uix-tile'})
titles = []
for video in res:
    titles.append(video.get('data-title'))
Recommended answer
I created the following script with the help of input from Abrogans. Additionally, a gist was helpful.
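from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time

driver = webdriver.Firefox()
url = "https://www.youtube.com/playlist?list=PLRdD1c6QbAqJn0606RlOR6T3yUqFWKwmX"
driver.get(url)

# Press END to scroll to the bottom so the next batch of videos loads.
# find_element(By.TAG_NAME, ...) replaces find_element_by_tag_name,
# which was removed in Selenium 4.3.
elem = driver.find_element(By.TAG_NAME, 'html')
elem.send_keys(Keys.END)
time.sleep(3)  # wait for the newly loaded rows to render
elem.send_keys(Keys.END)

# Parse the rendered DOM rather than the initial HTTP response
innerHTML = driver.execute_script("return document.body.innerHTML")
page_soup = bs(innerHTML, 'html.parser')
res = page_soup.find_all('span', {'class': 'style-scope ytd-playlist-video-renderer'})
titles = []
for video in res:
    # Only the spans that carry a title attribute hold video titles
    if video.get('title') is not None:
        titles.append(video.get('title'))
driver.quit()  # end the browser session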
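The script above presses END a fixed number of times, which may not reach the end of a very long playlist. One possible generalisation, shown here as a minimal sketch rather than as part of the original answer, is to keep scrolling until the page height stops growing. The helper name `scroll_until_stable` and its `pause` and `max_rounds` parameters are hypothetical, and the sketch assumes that `document.documentElement.scrollHeight` increases whenever YouTube appends more rows, which may not hold for every page layout.

```python
import time


def scroll_until_stable(driver, pause=2.0, max_rounds=50):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is assumed to be a Selenium WebDriver; any object exposing
    `execute_script(script, *args)` works. Returns the number of scrolls made.
    """
    get_height = "return document.documentElement.scrollHeight"
    last_height = driver.execute_script(get_height)
    rounds = 0
    while rounds < max_rounds:
        # Jump to the current bottom to trigger loading of the next batch.
        driver.execute_script("window.scrollTo(0, arguments[0]);", last_height)
        rounds += 1
        time.sleep(pause)  # give the page time to append more rows
        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            break  # height unchanged, so assume nothing more will load
        last_height = new_height
    return rounds
```

It would be called as `scroll_until_stable(driver)` right after `driver.get(url)` and before reading `document.body.innerHTML`. The `max_rounds` cap is there so the loop still terminates if the page keeps growing, and a `pause` that is too short can end the loop early on a slow connection.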