Problem description
I'm trying to create an XML object using <cfxml>. I formatted all the data with XMLFormat(). The XML contains some invalid characters such as '»'. I added this character to the XML doctype as follows:
<!ENTITY raquo "»">
The HTML text is not very well formatted, but most of it works with my code. Some texts, however, contain control characters, and I'm getting the following error:
An invalid XML character (Unicode: 0x13) was found in the element content of the document.
I tried adding the Unicode character to the doctype, and I tried this solution. Neither worked...
Here is working cfscript code that cleans up our XML. It has two methods: one replaces the higher (non-ASCII) international characters, and one handles only the low ASCII character that was breaking our XML. If you find more characters, just extend the filter rules.
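<cfscript>
// Replaces non-ASCII characters: typographic quotes and the ellipsis become
// plain ASCII, everything else becomes a numeric character reference.
function cleanHighAscii(text){
    var buffer = createObject("java", "java.lang.StringBuffer").init();
    var pattern = createObject("java", "java.util.regex.Pattern").compile(javaCast("string", "[^\x00-\x7F]"));
    var matcher = pattern.matcher(javaCast("string", text));
    var value = "";
    var asciiValue = 0;
    while (matcher.find()){
        value = matcher.group();
        asciiValue = asc(value);
        if ((asciiValue == 8220) || (asciiValue == 8221))
            value = """";               // curly double quotes -> "
        else if ((asciiValue == 8216) || (asciiValue == 8217))
            value = "'";                // curly single quotes -> '
        else if (asciiValue == 8230)
            value = "...";              // horizontal ellipsis -> three dots
        else
            value = "&###asciiValue#;"; // ## is an escaped #, e.g. » becomes &#187;
        matcher.appendReplacement(buffer, javaCast("string", value));
    }
    matcher.appendTail(buffer);
    return buffer.toString();
}

// Replaces the SUB control character (0x1A) with the reference &#26;.
// Note: XML 1.0 also rejects character references to control characters,
// so replace with "" instead if the parser still complains.
function removeSubAscii(text){
    return reReplaceNoCase(text, "\x1A", "&##26;", "all");
}

// Applies both filters in sequence.
function XMLSafe(text){
    text = cleanHighAscii(text);
    text = removeSubAscii(text);
    return text;
}
</cfscript>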
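The error in the question names 0x13, which the filters above do not touch. As a minimal sketch of "extending the filter rules", the function below removes every C0 control character that XML 1.0 forbids in content (everything below 0x20 except tab, line feed and carriage return). The name stripInvalidXmlChars is hypothetical and not part of the original answer. It assumes that dropping these characters is acceptable for your data, and it uses Java's String.replaceAll in the same way the code above calls into java.util.regex.

```cfm
<cfscript>
// Hypothetical helper, not from the original answer.
// Removes the control characters XML 1.0 does not allow in content:
// 0x00-0x08, 0x0B, 0x0C and 0x0E-0x1F (tab, LF and CR are kept).
function stripInvalidXmlChars(text){
    // CFML does not treat the backslash as an escape, so the pattern
    // reaches the Java regex engine unchanged.
    var invalidChars = "[\x00-\x08\x0B\x0C\x0E-\x1F]";
    return javaCast("string", text).replaceAll(invalidChars, "");
}
</cfscript>
```

For example, stripInvalidXmlChars("a" & chr(19) & "b") should return "ab". One option is to call it on the raw text before XMLFormat() or XMLSafe(), so the remaining escaping only sees characters that XML can represent.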
Another possibility is to use the CF10 function encodeForXML():
https://learn.adobe.com/wiki/display/coldfusionen/EncodeForXML
Or use ESAPI directly: it ships with CF10, and for older CF versions you can add the ESAPI jars from the OWASP site https://www.owasp.org/index.php/ESAPI_Overview :
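// The wrapper name esapiEncodeForXML is illustrative; the original snippet showed only the body.
function esapiEncodeForXML(text){
    var esapi = createObject("java", "org.owasp.esapi.ESAPI");
    var esapiEncoder = esapi.encoder();
    return esapiEncoder.encodeForXML(text);
}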