
Problem description

I am trying to scrape the video on the page at https://www.watchcartoononline.com/bobs-burgers-season-9-episode-3-tweentrepreneurs.

I cannot figure out how to extract the video URL from this website. Using the Chrome and Firefox developer tools, I determined that the video is in an iframe, but searching for iframes with BeautifulSoup and extracting their src URLs returns links that have nothing to do with the video. Where are the references to the mp4 or flv files? I can see them in the developer tools, even though clicking them is forbidden.

Any insight into how to scrape videos with BeautifulSoup and requests would be appreciated.

Here is some code if needed. A lot of tutorials say to use 'a' tags, but I didn't get any 'a' tags back.

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.watchcartoononline.com/bobs-burgers-season-9-episode-5-live-and-let-fly")
soup = BeautifulSoup(r.content,'html.parser')
links = soup.find_all('iframe')
for link in links:
    print(link['src'])

Recommended answer

import requests

# Direct link to the episode file, copied from the page's video player
url = "https://disk19.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e03.mp4?st=_EEVz36ktZOv7ZxlTaXZfg&e=1541637622"


def download_file(url, filename):
    # stream=True defers fetching the body so it can be written in chunks
    with requests.get(url, stream=True) as r:
        r.raise_for_status()  # stop on an HTTP error instead of saving an error page
        with open(filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024):
                if chunk:  # filter out keep-alive new chunks
                    f.write(chunk)
    return filename


download_file(url, "bobs.burgers.s09e03.mp4")

This code downloads this particular episode to your computer. The video URL is the src of the <source> tag nested inside the <video> tag.
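To find that URL programmatically rather than copying it from the developer tools, one option is to parse the HTML of the page that holds the player and collect the src attributes from the video element. The sketch below is only an illustration under some assumptions: it uses the standard library's html.parser instead of BeautifulSoup to stay dependency-free, the names VideoSourceParser and extract_video_sources are made up for this example, and it only works if the <video> markup is present in the HTML you fetched. If the player is built by JavaScript after the page loads, the tags will not be in the response from requests.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class VideoSourceParser(HTMLParser):
    """Collect src values from <video> tags and from <source> tags inside them.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.video_depth = 0  # > 0 while the parser is inside a <video> element
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "video":
            self.video_depth += 1
            self._add(attributes.get("src"))
        elif tag == "source" and self.video_depth > 0:
            self._add(attributes.get("src"))

    def handle_endtag(self, tag):
        if tag == "video" and self.video_depth > 0:
            self.video_depth -= 1

    def _add(self, src):
        if src:
            # Resolve relative paths against the URL the HTML was fetched from
            self.sources.append(urljoin(self.base_url, src))


def extract_video_sources(html, base_url=""):
    """Return the video URLs found in html, in document order."""
    parser = VideoSourceParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.sources


if __name__ == "__main__":
    # Made-up markup standing in for the HTML of the player page
    sample_html = """
    <video controls>
      <source src="https://media.example.com/videos/episode.mp4" type="video/mp4">
    </video>
    """
    for video_url in extract_video_sources(sample_html, "https://www.example.com/embed"):
        print(video_url)
```

In practice the html argument would be r.text from a requests.get call on the iframe's src URL, and each returned URL could then be passed to download_file. With BeautifulSoup, the equivalent lookup would be along the lines of soup.select("video source").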


08-22 20:46