Problem description
Can a single regex be used to validate URLs and match all of their parts? I have been working on one, and what I have come up with so far is:
(?:(?P<scheme>[a-z]*?)://)?(?:(?P<username>.*?):?(?P<password>.*?)?@)?(?P<hostname>.*?)/(?:(?:(?P<path>.*?)\?)?(?P<file>.*?\.[a-z]{1,6})?(?:(?:(?P<query>.*?)#?)?(?P<fragment>.*?)?)?)?
However, this does not work. It should match all of the following examples:
http://username:[email protected]/path?arg=value#anchor
http://www.domain.com/
http://www.doamin.co.uk/
http://www.yahoo.com/
http://www.google.au/
https://username:[email protected]/
ftp://user:[email protected]/path/
https://www.blah1.subdoamin.doamin.tld/
domain.tld/#anchor
doamin.tld/?query=123
domain.co.uk/
domain.tld
http://www.domain.tld/index.php?var1=blah
http://www.domain.tld/path/to/index.ext
mailto://[email protected]
and provide a named capture for all the components:
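scheme, e.g. http, https, ftp, ftps, callto, mailto and any other not listed
username
password
hostname, including subdomains, domain and TLD
path, e.g. /images/profile/
filename, e.g. file.ext
query string, e.g. ?foo=bar&bar=foo
fragment, e.g. #anchor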
The hostname is the only mandatory field.
We can assume that the input comes from a form specifically asking for a URL, and that the regex is not going to be used to find links in text.
Recommended answer
/^((?P<scheme>https?|ftp):\/)?\/?((?P<username>.*?)(:(?P<password>.*?)|)@)?(?P<hostname>[^:\/\s]+)(?P<port>:([^\/]*))?(?P<path>(\/\w+)*\/)(?P<filename>[-\w.]+[^#?\s]*)?(?P<query>\?([^#]*))?(?P<fragment>#(.*))?$/
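As a rough illustration of how the pattern above could be used, here is a minimal Python sketch. It assumes Python's `re` module, which supports the same `(?P<name>...)` syntax, so the surrounding `/` delimiters are dropped. The helper name `parse_url` and the host `host.example` are made up for this example and are not part of the original answer.

```python
import re

# The answer's pattern, split across lines for readability.
# The leading and trailing "/" regex delimiters are omitted in Python.
URL_PATTERN = re.compile(
    r"^((?P<scheme>https?|ftp):\/)?\/?"
    r"((?P<username>.*?)(:(?P<password>.*?)|)@)?"
    r"(?P<hostname>[^:\/\s]+)"
    r"(?P<port>:([^\/]*))?"
    r"(?P<path>(\/\w+)*\/)"
    r"(?P<filename>[-\w.]+[^#?\s]*)?"
    r"(?P<query>\?([^#]*))?"
    r"(?P<fragment>#(.*))?$"
)


def parse_url(url):
    """Hypothetical helper: return the named parts as a dict, or None if no match."""
    match = URL_PATTERN.match(url)
    return match.groupdict() if match else None


parts = parse_url("http://www.domain.tld/index.php?var1=blah")
print(parts["scheme"], parts["hostname"], parts["path"], parts["filename"], parts["query"])

creds = parse_url("ftp://user:pass@host.example/path/")
print(creds["username"], creds["password"], creds["hostname"], creds["path"])
```

Note that the `port`, `query` and `fragment` groups, as written, include their leading `:`, `?` and `#`. The pattern also requires a slash after the hostname, so an input such as `domain.tld` with no trailing slash is rejected, and only the `http`, `https` and `ftp` schemes are recognized as schemes.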