Problem description
Here's the deal, I'm making a project to help teach HTML to people. Naturally, I'm afraid of that Scumbag Steve (see figure 1).
So I wanted to block ALL HTML tags, except those approved on a very specific whitelist.
Out of those approved HTML tags, I want to remove harmful attributes as well, such as onload and onmouseover, also according to a whitelist.
I've thought of regex, but I'm pretty sure it's evil and not very helpful for the job.
Could anyone give me a nudge in the right direction?
Thanks in advance.
Fig 1.
Solution
require_once 'library/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// Needed because elements that HTML Purifier considers unsafe, such as input,
// would otherwise be removed automatically
$config->set('HTML.Trusted', true);

// Only input, p and div will be accepted
$config->set('HTML.AllowedElements', 'input,p,div');

// Allowed attributes for each tag
$config->set('HTML.AllowedAttributes', 'input.type,input.name,p.id,div.style');

// A more extensive way to manage attributes and elements; see the docs:
// http://htmlpurifier.org/live/configdoc/plain.html
$def = $config->getHTMLDefinition(true);
$def->addAttribute('input', 'type', 'Enum#text');
$def->addAttribute('input', 'name', 'Text');

// Create the purifier...
$purifier = new HTMLPurifier($config);

// ...and clean the untrusted input ($raw_html) before displaying it
$html = $purifier->purify($raw_html);
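
To make the whitelist idea from the question concrete, here is a rough sketch that uses PHP's built-in DOM extension instead of regular expressions. It is not how HTML Purifier works internally and it is not a replacement for it: the function name `whitelist_html` and the `$allowed` map are hypothetical, disallowed elements are dropped together with their content (a simplifying assumption), and attribute values such as the CSS inside `style` are not validated at all.

```php
<?php
/**
 * Hypothetical helper, for illustration only: keep whitelisted tags and
 * whitelisted attributes, drop everything else.
 *
 * @param string $html    untrusted HTML fragment
 * @param array  $allowed map of lowercase tag name => list of allowed attribute names
 */
function whitelist_html(string $html, array $allowed): string
{
    // Collect parser warnings internally instead of emitting them on sloppy markup.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    // The XML declaration prefix is a common workaround to make loadHTML read UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><body>' . $html . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Snapshot the element list first: it is live, so removing nodes
    // while iterating over it directly would skip elements.
    $elements = iterator_to_array($body->getElementsByTagName('*'));

    foreach ($elements as $el) {
        $tag = strtolower($el->nodeName);

        // Tag not on the whitelist: remove the element and its content.
        if (!array_key_exists($tag, $allowed)) {
            $el->parentNode->removeChild($el);
            continue;
        }

        // Collect attribute names first, because the attribute map is live too.
        $names = [];
        foreach ($el->attributes as $attr) {
            $names[] = $attr->nodeName;
        }
        foreach ($names as $name) {
            if (!in_array(strtolower($name), $allowed[$tag], true)) {
                $el->removeAttribute($name);
            }
        }
    }

    // Serialize only the children of <body>, not the wrapper itself.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Example call with the same whitelist as the HTML Purifier configuration above.
$allowed = ['input' => ['type', 'name'], 'p' => ['id'], 'div' => ['style']];
$dirty = '<p id="a" onmouseover="x()">Hi</p><script>alert(1)</script>';
echo whitelist_html($dirty, $allowed), PHP_EOL;
```

A sketch like this shows why a parser is a better fit than regex for the job, but for real user input a maintained library such as HTML Purifier remains the safer choice, since it also checks attribute values and many edge cases this code ignores.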