Problem description
I am trying to write a regular expression to strip all HTML with the exception of links (the <a href and </a> tags respectively). It does not have to be 100% secure (I am not worried about injection attacks or anything, as I am parsing content that has already been approved and published into a SWF movie).
The original "strip tags" regular expression I was using was <(.|\n)+?>, and I tried to modify it to <([^a]|\n)+?>, but that of course lets through any tag that has an a anywhere in it, rather than only a tag that starts with a followed by a space.
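To make the failure concrete, here is a minimal Python sketch. It assumes the modified pattern reads <([^a]|\n)+?> (with the newline escape intact); the function name naive_strip and the sample string are made up for illustration. Python is used only for convenience, since the question itself targets ActionScript 3.0.

```python
import re

# Assumed reading of the modified pattern from the question.
NAIVE_PATTERN = re.compile(r"<([^a]|\n)+?>")

def naive_strip(html: str) -> str:
    """Hypothetical helper: remove every tag the naive pattern matches."""
    return NAIVE_PATTERN.sub("", html)

if __name__ == "__main__":
    sample = '<b>bold</b> <span>x</span> <a href="u">link</a>'
    # <b> and </b> are stripped, but <span> and </span> survive because
    # they contain the letter "a" before the closing ">".
    print(naive_strip(sample))
    # prints: bold <span>x</span> <a href="u">link</a>
```

The link tags are kept, but so is any other tag whose name or attributes contain an a, which is the problem described above.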
Not that it should really matter, but in case anyone cares to know I am writing this in ActionScript 3.0 for a Flash movie.
Recommended answer
<(?!\/?a(?=>|\s.*>))\/?.*?>
Try this. I had something similar for p tags and it worked for them, so I don't see why it wouldn't work here. It uses a negative lookahead to check that the tag is not an a (optionally prefixed with a / character) where, using a positive lookahead, that a is followed either by > or by whitespace, other stuff, and then >. It then matches up to the next > character. Put this in a substitution with
s/<(?!\/?a(?=>|\s.*>))\/?.*?>//g;
This should leave only the opening and closing a tags.
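As a rough check of the lookahead approach, the sketch below applies the same pattern in Python, assuming the whitespace class \s in the inner lookahead. The function name strip_tags_except_links and the sample inputs are hypothetical. In Python the forward slashes need no escaping, and . does not match newlines by default, so a tag that spans several lines would need the re.DOTALL flag.

```python
import re

# Negative lookahead: refuse to match when "<" starts "a" or "/a" that is
# immediately followed by ">" or by whitespace and, later, a ">".
LINK_SAFE_PATTERN = re.compile(r"<(?!/?a(?=>|\s.*>))/?.*?>")

def strip_tags_except_links(html: str) -> str:
    """Hypothetical helper: drop every tag except <a ...> and </a>."""
    return LINK_SAFE_PATTERN.sub("", html)

if __name__ == "__main__":
    sample = ('<p>Hello <b>world</b>, see '
              '<a href="http://example.com">this link</a>.</p>')
    print(strip_tags_except_links(sample))
    # prints: Hello world, see <a href="http://example.com">this link</a>.

    # Tags that merely start with "a" (such as abbr) are still removed.
    print(strip_tags_except_links('<abbr title="x">HTML</abbr>'))
    # prints: HTML
```

Because the inner lookahead requires > or whitespace right after the a, tag names such as abbr or article are not mistaken for links.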