Problem description
I am trying to write a regular expression to strip all HTML with the exception of links (the <a href and </a> tags, respectively). It does not have to be 100% secure: I am not worried about injection attacks or anything similar, as I am parsing content that has already been approved and published into a SWF movie.
The original "strip tags" regular expression I was using is <(.|\n)+?>, and I tried to modify it to <([^a]|\n)+?>, but that of course lets through any tag that has an a anywhere in it, rather than only tags that have it at the beginning, followed by a space.
Not that it should really matter, but in case anyone cares to know, I am writing this in ActionScript 3.0 for a Flash movie.
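The difference between the two patterns in the question can be checked with a short sketch. It is written in Python rather than ActionScript 3.0, on the assumption that the constructs involved (character classes, alternation, lazy quantifiers) behave the same way in both engines; the sample markup and the names ORIGINAL, MODIFIED and sample are made up for illustration.

```python
import re

# The two patterns from the question, unchanged.
ORIGINAL = re.compile(r"<(.|\n)+?>")      # strips every tag
MODIFIED = re.compile(r"<([^a]|\n)+?>")   # refuses any tag containing an "a"

# Hypothetical sample markup, not taken from the question.
sample = '<b>bold</b> <span>x</span> <a href="u">link</a>'

# The original pattern removes all tags, links included.
print(ORIGINAL.sub("", sample))   # bold x link

# The modified pattern keeps the link, but it also keeps <span> and
# </span>, because "span" contains the letter "a".
print(MODIFIED.sub("", sample))   # bold <span>x</span> <a href="u">link</a>
```

The second result shows the problem described above: excluding the character a everywhere in the tag protects far more than just the link tags.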
<(?!\/?a(?=>|\s.*>))\/?.*?>
Try this. I had something similar for p tags, and it worked for them, so I don't see why it wouldn't work here. It uses a negative lookahead to check that the tag name is not a (with an optional leading / character) where, using a nested positive lookahead, that a is followed either by > or by whitespace, some other content and then >. When that check passes, the rest of the pattern matches up to the next > character. Put this in a substitution such as:
s/<(?!\/?a(?=>|\s.*>))\/?.*?>//g;
This should leave only the opening and closing a tags.
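As a rough check of the pattern above, here is a minimal sketch in Python rather than Perl or ActionScript 3.0, assuming the lookahead syntax behaves the same in those engines. The helper name strip_non_link_tags and the sample markup are hypothetical.

```python
import re

# The answer's pattern, unchanged. The negative lookahead rejects a match
# when the tag name is exactly "a" (optionally preceded by "/"), i.e. when
# "a" is followed directly by ">" or by whitespace and a later ">".
NON_LINK_TAG = re.compile(r"<(?!\/?a(?=>|\s.*>))\/?.*?>")


def strip_non_link_tags(html: str) -> str:
    """Remove every tag except <a ...> and </a> (hypothetical helper)."""
    return NON_LINK_TAG.sub("", html)


# Hypothetical sample markup; note <abbr>, which starts with "a" but is
# not a link and therefore should still be stripped.
sample = (
    '<p>Hello <b>world</b>, see '
    '<a href="http://example.com">this link</a> or <abbr>etc</abbr>.</p>'
)

print(strip_non_link_tags(sample))
# Hello world, see <a href="http://example.com">this link</a> or etc.
```

Two limits of this sketch are worth keeping in mind. In Python, . does not match a newline by default, so a tag that spans several lines would not be stripped. The pattern is also case-sensitive, so an upper-case <A HREF=...> tag would be removed like any other tag unless a case-insensitive flag is added.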