Problem description
I'm after several really safe regex patterns for JavaScript's .replace method. The input is a serialized DOM string, and I want to remove all YUI3 classNames and YUI3-generated id attributes.
var resourceDOMStr = Y.DataType.XML.format( Y.Node.getDOMNode(this.getIframeDOMContainer()).innerHTML );
alert('unsanitized markup:\n\n'+resourceDOMStr );
// Remove YUI-added id's and classes
// regex to remove ' id="*"'
// regex to remove entire class attr: ' class="'yui3-*'"'
// regex to remove className + trailing space: class="'yui3-* 'safeClass"
// regex to remove className + leading space: class="safeClass' yui3-*'"
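resourceDOMStr = resourceDOMStr.replace('', ''); // placeholder pattern; .replace returns a new string, so the result has to be assigned back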
alert('sanitized markup:\n\n'+resourceDOMStr );
So yeah, I'd like to be clean and remove the entire id attribute, whose value will always begin with 'yui_3', e.g. id="yui_3_3_0_1_1296949124608175". Also, I want to remove an entire class attribute if the only class it has is a YUI3-generated className; otherwise I just want to remove the YUI3 className and any leading/trailing spaces. The generated classNames will always begin with 'yui3-', for example:
class="yui3-dd-shim"
class="safeClass yui3-dd-shim"
class="yui3-dd-shim safeClass"
...where I don't want 'safeClass' to be altered, and I don't want a build-up of leading/trailing spaces, as the resulting replaced String will be loaded, cleaned and saved many times over.
Many thanks to any headache-solvers.
EDIT:
<div id="wrap"><h1 id="yui_3_3_0_1_1296942015298202" class="yui3-dd-drop">Resource 1 Title</h1>
<p id="yui_3_3_0_1_1296942015298219" class="yui3-dd-drop">Lorem ipsum dolor sit amet, <a href="javacript:;" id="yui_3_3_0_1_1296942015298236" class="yui3-dd-drop">consectetur adipiscing</a> elit. Proin et sem leo, sed luctus nisi. Suspendisse pharetra iaculis laoreet. Pellentesque vulputate malesuada auctor. Integer laoreet ultricies nunc facilisis adipiscing.</p>
<div class="widget revealer">
<p>Revealer widget.</p>
<script type="text/javascript">
document.RevealerConfig = true;
</script>
</div>
<div class="widget quiz safeClass" id="safeId">
<p>Quiz widget.</p>
<script type="text/javascript">
document.QuizConfig = true;
</script>
</div>
<div class="snippet yui3-dd-drop" id="yui_3_3_0_1_1296942015298253">
Vestibulum fermentum, justo id porta suscipit, velit lorem hendrerit nisi, id tincidunt lectus ante quis lacus. Proin et erat sit amet turpis euismod dictum vitae a metus.
<div class="widget table">
<p>Table widget.</p>
<table width="80%" border="1">
<tbody><tr>
<td>1</td>
<td>2</td>
<td>3</td>
</tr>
<tr>
<td>4</td>
<td>5</td>
<td>6</td>
</tr>
<tr>
<td>7</td>
<td>8</td>
<td>9</td>
</tr>
</tbody></table>
</div></div>
<p id="yui_3_3_0_1_1296942015298270" class="yui3-dd-drop">Proin et sem leo, sed luctus nisi. Suspendisse pharetra iaculis laoreet. Pellentesque vulputate; laoreet ultricies nunc facilisis adipiscing ultricies nunc.</p>
<div class="widget table">
<p>Table widget.</p>
<table width="80%" border="1">
<tbody><tr>
<td>1</td>
<td>
<ul>
<li>1</li>
<li>2<ul><li id="yui_2_0_0_1">nested</li></ul></li>
</ul>
</td>
<td>3</td>
</tr>
<tr>
<td>4</td>
<td>5</td>
<td>6</td>
</tr>
<tr>
<td class="yui2-dd-drop yui3-dd-drop">7</td>
<td class="yui2-dd-drop yui3-dd-drop">8</td>
<td class="yui2-dd-drop yui3-dd-drop">9</td>
</tr>
</tbody></table>
</div>
</div>
Hopefully the above is all good. Don't pick it apart too readily: as stated in the comment below, it's sample HTML.
You could try this monstrosity:
var dirty = 'class="yui3-dd-shim" class="safeClass yui3-dd-shim" class="yui3-dd-shim safeClass"';
var clean = dirty.replace(/class="yui[0-9]-[^\s]+"|\s?yui[0-9]-[^\s"]+\s?|id="yui_[0-9][^"]+"/gi, '');
Tested it on your sample data, seemed to do the job.
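One thing to watch with the single pattern above: its middle alternative can consume the space on both sides of a YUI class, so class="a yui3-x b" would come out as class="ab", and removing an attribute leaves its leading space behind. A possible alternative, offered as a rough sketch rather than a vetted solution, is to match each class attribute and filter its names in a replace callback. The function name sanitizeYuiMarkup is made up for this example, and the sketch assumes double-quoted attribute values as in the sample markup. Like the pattern above, it matches any single digit after "yui", so yui2 names are stripped too.

```javascript
// Sketch only: sanitizeYuiMarkup is a hypothetical name, not a YUI API.
// Assumes attribute values are double-quoted, as in the sample markup.
// Being regex-based, it would also touch matching text inside text nodes
// or <script> bodies, so it is not a full HTML parser.
function sanitizeYuiMarkup(markup) {
  return markup
    // Remove whole id attributes whose value starts with "yui_" + digit,
    // together with the whitespace in front of the attribute.
    .replace(/\s+id="yui_\d[^"]*"/gi, '')
    // Rewrite every class attribute name by name.
    .replace(/\s+class="([^"]*)"/gi, function (match, value) {
      var kept = value.split(/\s+/).filter(function (name) {
        // Drop empty pieces and generated names such as yui3-dd-drop.
        return name !== '' && !/^yui\d-/i.test(name);
      });
      // Nothing left: drop the attribute; otherwise re-join with single
      // spaces so repeated load/clean/save cycles do not accumulate spaces.
      return kept.length ? ' class="' + kept.join(' ') + '"' : '';
    });
}

var dirty = '<p id="yui_3_3_0_1_1296942015298219" class="safeClass yui3-dd-shim">x</p>';
var clean = sanitizeYuiMarkup(dirty);
// clean is '<p class="safeClass">x</p>'
```

Running the function on its own output should return the same string, which matters if the markup is going to be cleaned and saved many times. To restrict it to YUI3 output only, the \d could be replaced with a literal 3 in both patterns.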