How to get a YouTube channel ID from a channel user name that contains Cyrillic characters

Question

This is a YouTube channel URL whose user name contains Cyrillic characters:
https://www.youtube.com/c/%D0%9B%D1%83%D1%87%D1%88%D0%B8%D0%B5%D0%B4%D0%BE%D0%BA%D1%83%D0%BC%D0%B5%D0%BD%D1%82%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B5%D1%84%D0%B8%D0%BB%D1%8C%D0%BC%D1%8B/videos

I am trying to obtain the channel's ID from the URL by calling the YouTube Data API v3:

https://www.googleapis.com/youtube/v3/channels?key=[YouTubeAPIkey]&forUsername=%D0%9B%D1%83%D1%87%D1%88%D0%B8%D0%B5%D0%B4%D0%BE%D0%BA%D1%83%D0%BC%D0%B5%D0%BD%D1%82%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B5%D1%84%D0%B8%D0%BB%D1%8C%D0%BC%D1%8B&part=id

But the call returns no data.

For reference, the equivalent call for "https://www.youtube.com/c/besogontv/videos" returns a valid result:

https://www.googleapis.com/youtube/v3/channels?key=[YouTubeAPIkey]&forUsername=besogontv

Just to see whether it would work, I tried decoding the URL encoding and then re-encoding as UTF-8, but it made no difference.
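A minimal Python sketch of that round trip, using only the standard library. The variable names are illustrative and were not part of the original attempt:

```python
from urllib.parse import quote, unquote

# The percent-encoded path segment taken from the channel URL above.
ENCODED = (
    "%D0%9B%D1%83%D1%87%D1%88%D0%B8%D0%B5"
    "%D0%B4%D0%BE%D0%BA%D1%83%D0%BC%D0%B5%D0%BD%D1%82%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B5"
    "%D1%84%D0%B8%D0%BB%D1%8C%D0%BC%D1%8B"
)

# Percent-decode; the resulting bytes are interpreted as UTF-8.
decoded = unquote(ENCODED, encoding="utf-8")

# Encode as UTF-8 again and percent-encode every non-ASCII byte.
reencoded = quote(decoded, safe="", encoding="utf-8")

print(decoded)
print(reencoded == ENCODED)
```

The re-encoded string is identical to the one already in the URL, so the string is valid percent-encoded UTF-8 to begin with and re-encoding it cannot change the request.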

Is there some character encoding issue I'm missing?

Recommended answer

If you issue the following command (at any GNU/Linux bash prompt):

$ wget \
--quiet \
--output-document=- \
--content-on-error \
"https://www.googleapis.com/youtube/v3/channels?key=$APP_KEY&id=UCk8LWzqGcHz21FWysiXuCHw&part=brandingSettings,contentDetails,id,snippet,statistics,status,topicDetails&maxResults=1"

you'll see that лучшиедокументальныефильмы is not the channel's user name, but its customUrl.
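For readers who prefer Python over wget, the same lookup could be sketched roughly as follows. The helper names are hypothetical, no request is actually sent, and the sample dictionary is hand-written and trimmed to the fields used here (it is not real API output):

```python
from urllib.parse import urlencode

API_BASE = "https://www.googleapis.com/youtube/v3"


def channels_list_url(api_key, channel_id, part="id,snippet"):
    """Build a Channels.list request URL that looks a channel up by its ID."""
    query = urlencode({"key": api_key, "id": channel_id, "part": part})
    return f"{API_BASE}/channels?{query}"


def custom_url_of(response):
    """Return snippet.customUrl of the first item, or None if it is absent."""
    items = response.get("items", [])
    if not items:
        return None
    return items[0].get("snippet", {}).get("customUrl")


# Hand-written sample shaped like a Channels.list response (illustrative only).
SAMPLE_RESPONSE = {
    "items": [
        {
            "id": "UCk8LWzqGcHz21FWysiXuCHw",
            "snippet": {"customUrl": "лучшиедокументальныефильмы"},
        }
    ]
}

print(channels_list_url("YOUR_APP_KEY", "UCk8LWzqGcHz21FWysiXuCHw"))
print(custom_url_of(SAMPLE_RESPONSE))
```

Fetching the printed URL with a real application key and passing the parsed JSON to custom_url_of would be the step that corresponds to reading the wget output by eye.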

The forUsername parameter does not work with a channel's custom URL, since custom URLs are not guaranteed to uniquely identify any given channel.

You can convince yourself of this by searching Google's issue tracker for either of the two phrases "channels forusername" or "vanity URL" and reading the terse, raw official responses that users got from Google's staff.

Indeed, the official docs and staff responses at times lack useful, clear-cut specifications and formulations. (I have experienced this myself, too!)

As a final note, you could scrape the channel ID you are interested in out of the HTML page obtained from https://www.youtube.com/c/лучшиедокументальныефильмы, but please bear in mind that this activity is forbidden by Google, as per its DTOS docs:

Scraping

You and your API Clients must not, and must not encourage, enable, or require others to, directly or indirectly, scrape YouTube Applications or Google Applications, or obtain scraped YouTube data or content. Public search engines may scrape data only in accordance with YouTube's robots.txt file or with YouTube's prior written permission.

Instead of scraping, I'd recommend using the Search.list API endpoint, invoked with the q parameter set to лучшиедокументальныефильмы and the type parameter set to channel (if you are able to cope with the fuzziness that a search implies).
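A rough Python sketch of such a Search.list call is given below. The function names are hypothetical, the code only builds the request URL and parses an already-decoded response, and the sample response is hand-written for illustration rather than captured from the API:

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def search_channels_url(api_key, term, max_results=5):
    """Build a Search.list request URL restricted to channel results."""
    query = urlencode({
        "key": api_key,
        "q": term,            # Cyrillic text is percent-encoded as UTF-8 here
        "type": "channel",
        "part": "snippet",
        "maxResults": max_results,
    })
    return f"{SEARCH_ENDPOINT}?{query}"


def candidate_channel_ids(response):
    """Collect channel IDs from a Search.list response, in result order."""
    ids = []
    for item in response.get("items", []):
        resource = item.get("id", {})
        if resource.get("kind") == "youtube#channel" and "channelId" in resource:
            ids.append(resource["channelId"])
    return ids


# Hand-written sample shaped like a Search.list response (illustrative only).
SAMPLE_RESPONSE = {
    "items": [
        {"id": {"kind": "youtube#channel", "channelId": "UCk8LWzqGcHz21FWysiXuCHw"}},
        {"id": {"kind": "youtube#video", "videoId": "hypothetical-video"}},
    ]
}

print(search_channels_url("YOUR_APP_KEY", "лучшиедокументальныефильмы"))
print(candidate_channel_ids(SAMPLE_RESPONSE))
```

Because a search may return several loosely matching channels, one way to cope with the fuzziness would be to look each candidate up with Channels.list and keep only the one whose customUrl matches the name you started from.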

Update, after answering a related SO question

Here is a simple Python 3 script that implements the functionality you are looking for. Applying it to your custom URL produces the expected result:

$ python3 youtube-search.py \
--custom-url Лучшиедокументальныефильмы \
--app-key ...
UCk8LWzqGcHz21FWysiXuCHw

$ python3 youtube-search.py \
--user-name Лучшиедокументальныефильмы \
--app-key ...
youtube-search.py: error: user name "Лучшиедокументальныефильмы": no associated channel found

Note that you have to pass your application key to this script as the argument of the command line option --app-key (use --help for brief help info).


10-23 22:45