Problem description
I'm trying to validate YouTube URLs for my application.
So far I have the following:
// Set the YouTube URL
$youtube_url = "www.youtube.com/watch?v=vpfzjcCzdtCk";
if (preg_match("/((http\:\/\/){0,}(www\.){0,}(youtube\.com){1} || (youtu\.be){1}(\/watch\?v\=[^\s]){1})/", $youtube_url) == 1)
{
    echo "Valid";
}
else
{
    echo "Invalid";
}
I wish to validate the following variations of YouTube URLs:
- With and without http://
- With and without www.
- With the URLs youtube.com and youtu.be
- Must have /watch?v=
- Must have the unique video string (in the example above, "vpfzjcCzdtCk")
However, I don't think I've got my logic right, because for some reason it returns true for: www.youtube.co/watch?v=vpfzjcCzdtCk (notice I've written it incorrectly with .co and not .com).
Recommended answer
There are a lot of redundancies in this regular expression of yours (and also, the leaning toothpick syndrome). This, though, should produce results:
$rx = '~
^(?:https?://)? # Optional protocol
(?:www[.])? # Optional sub-domain
(?:youtube[.]com/watch[?]v=|youtu[.]be/) # Mandatory domain name (w/ query string in .com)
([^&]{11}) # Video id of 11 characters as capture group 1
~x';
$has_match = preg_match($rx, $url, $matches);
// if matching succeeded, $matches[1] would contain the video ID
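As a rough usage sketch, the pattern above can be wrapped in a small helper and run against a few inputs. The function name youtube_video_id() and the 11-character placeholder ID abcdefghijk are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper wrapping the pattern from the answer.
function youtube_video_id(string $url): ?string
{
    $rx = '~
      ^(?:https?://)?                           # Optional protocol
       (?:www[.])?                              # Optional sub-domain
       (?:youtube[.]com/watch[?]v=|youtu[.]be/) # Mandatory domain name (w/ query string in .com)
       ([^&]{11})                               # Video id of 11 characters as capture group 1
    ~x';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($rx, $url, $matches) === 1 ? $matches[1] : null;
}

$samples = [
    'www.youtube.com/watch?v=abcdefghijk',
    'https://youtu.be/abcdefghijk',
    'www.youtube.co/watch?v=abcdefghijk', // .co instead of .com: rejected
];

foreach ($samples as $url) {
    echo $url, ' => ', youtube_video_id($url) ?? 'Invalid', PHP_EOL;
}
```

Note that the pattern is anchored only at the start, so a string with more than 11 characters after v= still matches and only the first 11 are captured. If stricter validation is needed, one option is to constrain what may follow the ID.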
Some notes:
- Use the tilde character ~ as the delimiter, to avoid LTS.
- Use [.] instead of \. to improve visual legibility and avoid LTS. (Most "special" characters, such as the dot ., lose their special meaning in character classes, i.e. within square brackets.)
- To make regular expressions more "readable" you can use the x modifier (which has further implications; see the docs on Pattern modifiers), which also allows for comments in regular expressions.
- Capturing can be suppressed using non-capturing groups: (?: <pattern> ). This makes the expression more efficient.
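Two of the notes above can be checked with a short sketch: an unescaped dot matches any character while [.] matches only a literal dot, and non-capturing groups keep extra entries out of the match array. The test strings and the variable names $capturing and $nonCapturing are made up for illustration.

```php
<?php
// An unescaped dot matches any character, so "youtubeXcom" slips through.
$looseDot = preg_match('~^youtube.com$~', 'youtubeXcom');

// [.] (like \.) matches only a literal dot.
$literalDot = preg_match('~^youtube[.]com$~', 'youtubeXcom');

var_dump($looseDot);   // int(1)
var_dump($literalDot); // int(0)

// Each capturing group adds an entry to the match array...
preg_match('~^(www[.])?(youtube[.]com)$~', 'www.youtube.com', $capturing);

// ...while non-capturing groups (?: ... ) leave only the full match.
preg_match('~^(?:www[.])?(?:youtube[.]com)$~', 'www.youtube.com', $nonCapturing);

echo count($capturing), PHP_EOL;    // 3: full match plus two groups
echo count($nonCapturing), PHP_EOL; // 1: full match only
```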
Optionally, to extract values from a (more or less complete) URL, you might want to make use of parse_url():
$url = 'http://youtube.com/watch?v=VIDEOID';
$parts = parse_url($url);
print_r($parts);
Output:
Array
(
    [scheme] => http
    [host] => youtube.com
    [path] => /watch
    [query] => v=VIDEOID
)
Validating the domain name and extracting the video ID is left as an exercise to the reader.
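For readers who want a starting point, here is one possible sketch of that exercise using parse_url() and parse_str(). The helper name youtube_id_from_url(), the accepted host list, and the assumed ID alphabet (letters, digits, underscore, hyphen) are illustrative assumptions, not something the original answer specifies.

```php
<?php
// Hypothetical helper: validates the host and extracts the video ID.
function youtube_id_from_url(string $url): ?string
{
    // parse_url() only reports a host when a scheme is present, so add one if missing.
    if (!preg_match('~^https?://~i', $url)) {
        $url = 'http://' . $url;
    }

    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }

    // Normalise the host: lower-case it and drop an optional leading "www.".
    $host = strtolower($parts['host']);
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }

    if ($host === 'youtube.com' && ($parts['path'] ?? '') === '/watch') {
        // Long form: the ID is the "v" query parameter.
        parse_str($parts['query'] ?? '', $query);
        $id = $query['v'] ?? '';
    } elseif ($host === 'youtu.be') {
        // Short form: the ID is the path itself.
        $id = ltrim($parts['path'] ?? '', '/');
    } else {
        return null;
    }

    // Assumption: IDs are exactly 11 characters from [A-Za-z0-9_-].
    return (is_string($id) && preg_match('~^[A-Za-z0-9_-]{11}$~', $id) === 1) ? $id : null;
}

var_dump(youtube_id_from_url('www.youtube.com/watch?v=abcdefghijk')); // string(11) "abcdefghijk"
var_dump(youtube_id_from_url('www.youtube.co/watch?v=abcdefghijk'));  // NULL
```

Unlike the regular expression, this version tolerates additional query parameters in any order, at the cost of a few more lines.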
I gave in to the comment war below; thanks to Toni Oriol, the regular expression now works on short (youtu.be) URLs as well.