Question
This is a YouTube channel URL that includes Cyrillic characters in the username:
https://www.youtube.com/c/%D0%9B%D1%83%D1%87%D1%88%D0%B8%D0%B5%D0%B4%D0%BE%D0%BA%D1%83%D0%BC%D0%B5%D0%BD%D1%82%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B5%D1%84%D0%B8%D0%BB%D1%8C%D0%BC%D1%8B/videos
I am trying to obtain the channel's ID from the URL by calling the YouTube Data API v3:
https://www.googleapis.com/youtube/v3/channels?key=[YouTubeAPIkey]&forUsername=%D0%9B%D1%83%D1%87%D1%88%D0%B8%D0%B5%D0%B4%D0%BE%D0%BA%D1%83%D0%BC%D0%B5%D0%BD%D1%82%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B5%D1%84%D0%B8%D0%BB%D1%8C%D0%BC%D1%8B&part=id
But the call returns no data.
For reference, the same call for "https://www.youtube.com/c/besogontv/videos" returns a valid result:
https://www.googleapis.com/youtube/v3/channels?key=[YouTubeAPIkey]&forUsername=besogontv
Just to see whether it would work, I tried decoding the URL encoding and then re-encoding to UTF-8, but it made no difference.
Is there some character encoding issue I'm missing?
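The round trip mentioned above can be reproduced with a short Python sketch. This is an illustration only, not the code actually used for the question; the choice of `urllib.parse` and the helper name `round_trip` are assumptions. It suggests that the percent-encoded string is already plain UTF-8, so decoding and re-encoding leaves it unchanged:

```python
from urllib.parse import quote, unquote

# Percent-encoded path segment copied from the channel URL in the question.
ENCODED = (
    "%D0%9B%D1%83%D1%87%D1%88%D0%B8%D0%B5"
    "%D0%B4%D0%BE%D0%BA%D1%83%D0%BC%D0%B5%D0%BD%D1%82%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B5"
    "%D1%84%D0%B8%D0%BB%D1%8C%D0%BC%D1%8B"
)


def round_trip(encoded):
    """Decode a percent-encoded string as UTF-8, then encode it again."""
    decoded = unquote(encoded, encoding="utf-8")
    re_encoded = quote(decoded, safe="", encoding="utf-8")
    return decoded, re_encoded


decoded, re_encoded = round_trip(ENCODED)
print(decoded)                 # the Cyrillic name in readable form
print(re_encoded == ENCODED)   # whether the round trip changed anything
```

If the comparison holds, the encoding itself is unlikely to be the cause of the empty response.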
Recommended answer
If you issue the following command (at any GNU/Linux bash prompt):
$ wget \
--quiet \
--output-document=- \
--content-on-error \
"https://www.googleapis.com/youtube/v3/channels?key=$APP_KEY&id=UCk8LWzqGcHz21FWysiXuCHw&part=brandingSettings,contentDetails,id,snippet,statistics,status,topicDetails&maxResults=1"
you'll see that лучшиедокументальныефильмы is not the channel's user name, but its customUrl!
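For readers who prefer Python to wget, the same lookup can be sketched as follows. This is a minimal sketch, assuming only the standard library; the helper names `build_channels_url`, `custom_url_of`, and `fetch_custom_url` are hypothetical, and the request asks for fewer parts than the wget command above. The customUrl value is read from the snippet part of the channel resource:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://www.googleapis.com/youtube/v3/channels"


def build_channels_url(app_key, channel_id):
    """Build a Channels.list request URL that looks a channel up by its ID."""
    params = {"key": app_key, "id": channel_id, "part": "id,snippet"}
    return API_URL + "?" + urlencode(params)


def custom_url_of(response):
    """Return snippet.customUrl of the first returned channel, or None."""
    items = response.get("items", [])
    if not items:
        return None
    return items[0].get("snippet", {}).get("customUrl")


def fetch_custom_url(app_key, channel_id):
    """Call the API (needs network access and a valid key) and extract customUrl."""
    with urlopen(build_channels_url(app_key, channel_id)) as resp:
        return custom_url_of(json.load(resp))
```

Calling fetch_custom_url with your own application key and the channel ID UCk8LWzqGcHz21FWysiXuCHw should show the Cyrillic name under customUrl, if the channel still has that custom URL.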
The forUsername property does not work with a channel's custom URL, since these URLs are not guaranteed to uniquely identify a channel.
You can convince yourself by searching Google's issue tracker for either of the phrases channels forusername or vanity URL; you'll see the terse official responses users got from Google's staff.
Indeed, at times the official docs and staff responses lack clear-cut, useful specifications. (I have experienced this myself.)
As a final note, you could scrape the channel ID you're interested in out of the HTML page obtained from https://www.youtube.com/c/лучшиедокументальныефильмы, but please bear in mind that Google forbids this activity, as per its DTOS docs:
Scraping
You and your API Clients must not, and must not encourage, enable, or require others to, directly or indirectly, scrape YouTube Applications or Google Applications, or obtain scraped YouTube data or content. Public search engines may scrape data only in accordance with YouTube's robots.txt file or with YouTube's prior written permission.
Instead of scraping, I'd recommend using the Search.list API endpoint, invoked with the q parameter set to лучшиедокументальныефильмы and the type parameter set to channel (if you're able to cope with the fuzziness this implies).
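One way to cope with that fuzziness is to treat the search results as candidates only and confirm each one against its customUrl. The sketch below is an assumption about how this could be done, not necessarily how the script mentioned in the update works; all helper names are hypothetical, and stripping a leading "@" before comparing is a guess about how customUrl values may be formatted:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://www.googleapis.com/youtube/v3"


def api_get(endpoint, **params):
    """Issue a GET request to a Data API endpoint and parse the JSON reply."""
    with urlopen(f"{API_BASE}/{endpoint}?{urlencode(params)}") as resp:
        return json.load(resp)


def candidate_ids(search_response):
    """Collect the channel IDs found in a Search.list response."""
    return [
        item["id"]["channelId"]
        for item in search_response.get("items", [])
        if item.get("id", {}).get("kind") == "youtube#channel"
    ]


def normalize(custom_url):
    """Compare custom URLs case-insensitively, ignoring a leading '@' (assumption)."""
    return custom_url.lstrip("@").casefold()


def match_custom_url(channels_response, wanted):
    """Return the ID of the channel whose customUrl equals `wanted`, else None."""
    for item in channels_response.get("items", []):
        found = item.get("snippet", {}).get("customUrl")
        if found and normalize(found) == normalize(wanted):
            return item["id"]
    return None


def find_channel_id(app_key, custom_url):
    """Search for channels by name, then keep only an exact customUrl match."""
    search = api_get("search", key=app_key, q=custom_url,
                     type="channel", part="id", maxResults=10)
    ids = candidate_ids(search)
    if not ids:
        return None
    channels = api_get("channels", key=app_key,
                       id=",".join(ids), part="id,snippet")
    return match_custom_url(channels, custom_url)
```

The second request costs extra quota, but it turns a fuzzy text search into an exact comparison, so an unrelated channel with a similar name is not returned by mistake.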
Update upon answering a related SO question
Here is a simple Python3 script implementing the functionality that you're looking for. Applying your custom URL to this script produces the expected result:
$ python3 youtube-search.py \
--custom-url Лучшиедокументальныефильмы \
--app-key ...
UCk8LWzqGcHz21FWysiXuCHw
$ python3 youtube-search.py \
--user-name Лучшиедокументальныефильмы \
--app-key ...
youtube-search.py: error: user name "Лучшиедокументальныефильмы": no associated channel found
Note that you have to pass your application key to this script as the argument of the command-line option --app-key (use --help for brief help info).