Question
I have a blog system where the user types content into an HTML textarea, including HTML tags such as <p>. This is stored in a database. If this input is then echoed to a web page using PHP, how can I escape the output to protect against XSS while preserving the meaning of the HTML tags, so that the blog post is formatted correctly? If I use htmlentities($blog_content), the HTML tags are printed literally on the page, so you see <p>hello this is a blog</p>.
Is this possible?
Recommended answer
What you want is selective filtering, or sanitization. In other words, you want to allow some HTML tags but not others that may be malicious. This is very tricky business: HTML syntax is complex, and overly simple sanitization attempts are prone to errors that still allow injection, for example through malformed HTML.
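To illustrate why a simple allow-list is not enough, here is a small sketch using PHP's built-in strip_tags(). The sample input is made up for this example. strip_tags() removes tags that are not on the allow-list, but it leaves the attributes of allowed tags untouched, so an event-handler attribute survives:

```php
<?php
// Hypothetical user input: an allowed <p> tag carrying an event handler,
// followed by a <script> element.
$input = '<p onmouseover="alert(document.cookie)">Hello</p><script>alert(1)</script>';

// Naive attempt: keep only <p> tags.
$filtered = strip_tags($input, '<p>');

// The <script> tags are gone (their inner text remains as plain text),
// but the onmouseover attribute on <p> is still there.
echo $filtered, PHP_EOL;
```

The result still contains executable JavaScript in the attribute, which is the kind of hole a proper sanitizer has to close.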
If possible, you should stay away from letting your users submit HTML at all. Use a dedicated markup language instead, such as Wiki markup, Markdown, BBCode or similar.
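A minimal sketch of that idea, assuming a tiny BBCode-like syntax with only [b] and [i]. The function name render_simple_markup and the supported tags are hypothetical, chosen for illustration; a real project would more likely use an existing Markdown or BBCode library. The key point is the order: escape everything first, then generate the only HTML that reaches the page yourself.

```php
<?php
// Hypothetical helper: renders a very small BBCode-like subset to HTML.
function render_simple_markup(string $input): string
{
    // 1. Escape ALL user input, so no user-supplied HTML survives.
    $safe = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // 2. Translate the known markup into HTML that we generate ourselves.
    $safe = preg_replace('#\[b\](.*?)\[/b\]#s', '<strong>$1</strong>', $safe);
    $safe = preg_replace('#\[i\](.*?)\[/i\]#s', '<em>$1</em>', $safe);

    // 3. Blank lines separate paragraphs; single newlines become <br />.
    $paragraphs = preg_split('/\R{2,}/', trim($safe));
    $html = array_map(function ($p) {
        return '<p>' . nl2br($p) . '</p>';
    }, $paragraphs);

    return implode("\n", $html);
}

echo render_simple_markup('[b]hi[/b] <script>alert(1)</script>'), PHP_EOL;
```

Because the input is escaped before any tags are produced, the <script> element in the sample is shown as text rather than executed.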
If you are sure you know what you're doing, you should choose a good, well-tested, robust library that provides such sanitization functions. HTML Purifier is the only one I know of that fits this description.
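A rough sketch of what using HTML Purifier can look like, assuming it was installed with Composer (package ezyang/htmlpurifier) so that vendor/autoload.php exists. The allow-list below and the sample input are assumptions for illustration, not a recommended policy; adjust them to what your blog posts actually need.

```php
<?php
// Assumption: HTML Purifier installed via Composer (ezyang/htmlpurifier).
require_once __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();
// Illustrative allow-list: a few formatting tags, links with href only.
$config->set('HTML.Allowed', 'p,br,strong,em,ul,ol,li,a[href]');
// Disable the on-disk definition cache to keep this sketch free of
// write-permission requirements; you would normally leave it enabled.
$config->set('Cache.DefinitionImpl', null);

$purifier = new HTMLPurifier($config);

// Hypothetical stored blog content containing an XSS attempt.
$blog_content = '<p onclick="alert(1)">Hello <script>alert(1)</script><strong>world</strong></p>';

// Purify before output, then echo the result directly (no htmlentities()).
$clean = $purifier->purify($blog_content);
echo $clean, PHP_EOL;
```

The purified string keeps the allowed formatting tags while the script element and the onclick attribute are dropped, so it can be echoed as HTML.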