Problem description
I have the following:
Regex urlRx = new Regex(@"((https?|ftp|file)\://|www.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\#\&\=;\+!'\(\)\*\-\._~%]*)*", RegexOptions.IgnoreCase);
This matches all URLs, but I'd like to exclude those that are preceded by the characters " or '. I've been trying to achieve this using other solutions (Regex to exclude [ unless preceded by \) but haven't been able to get it to work.
If I have this:
The brown fox www.google.com
I should get a match. But if I have this:
The brown fox <a href="www.google.com">boo</a>
I should not get a match, because of the ". How can this be achieved?
Recommended answer
You need a negative lookbehind: prefix your regular expression with (?<!["']).
Explanation:
- (?<!...) means: the text directly preceding the current position must not match ... .
- ["'] is simply a character class containing the two characters you want to exclude.
Note: Inside @"..." strings, double quotes are escaped by doubling them, so your code will read:
Regex urlRx = new Regex(@"(?<![""'])((https?|ftp|file)...
In VB:
Dim urlRx As New Regex("(?<![""'])((https?|ftp|file)...
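To check the behaviour end to end, here is a minimal sketch that combines the full pattern from the question with the lookbehind prefix and runs it against the two sample strings. The names UrlFinder and FindUrls are made up for this illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the class and method names are assumptions.
public static class UrlFinder
{
    // The pattern from the question, prefixed with the negative lookbehind (?<!["']).
    // Inside a verbatim @"..." string, "" stands for one double-quote character.
    private static readonly Regex UrlRx = new Regex(
        @"(?<![""'])((https?|ftp|file)\://|www.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\#\&\=;\+!'\(\)\*\-\._~%]*)*",
        RegexOptions.IgnoreCase);

    // Returns the text of every match found in the input.
    public static List<string> FindUrls(string text)
    {
        var urls = new List<string>();
        foreach (Match m in UrlRx.Matches(text))
        {
            urls.Add(m.Value);
        }
        return urls;
    }
}

public static class Program
{
    public static void Main()
    {
        // Preceded by a space, so the lookbehind allows the match.
        Console.WriteLine(UrlFinder.FindUrls("The brown fox www.google.com").Count); // 1

        // Preceded by a double quote, so the lookbehind rejects the match.
        Console.WriteLine(UrlFinder.FindUrls("The brown fox <a href=\"www.google.com\">boo</a>").Count); // 0
    }
}
```

One caveat: the lookbehind only inspects the single character immediately before the position where a match would start. For a quoted value such as "http://www.example.com", the match starting at http is rejected, but the regex engine can still find www.example.com further in, because that position is preceded by / rather than a quote. If that matters for your input, the pattern would need additional tightening.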