Invalid characters in an XML document


Problem description

I have a program that generates XML files from data in a database. In short, the code does the following:

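using System.Data;
using System.Data.SqlClient;
using System.Xml;

string dsn = "a db connection string";
XmlDocument d = new XmlDocument();
using (SqlConnection con = new SqlConnection(dsn)) {
    con.Open();
    string sql = "select id as Id, comment as Comment from Test where ... ";
    using (SqlCommand cmd = new SqlCommand(sql, con))
    using (SqlDataAdapter da = new SqlDataAdapter(cmd)) {
        // The DataSet name becomes the root element, the table name the row element.
        DataSet ds = new DataSet("EXPORT");
        da.Fill(ds, "Test");
        // Serialize the DataSet to an XML string and load it into the document.
        d.LoadXml(ds.GetXml());
    }
}
d.Save(@"c:\test.xml");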

When I look at the XML file, it contains the invalid character reference &#x1A;:

<EXPORT>
  <Test>
    <Id>2</Id>
    <Comment> Keyboard NB&#x1A;5 linked</Comment>
  </Test>
</EXPORT>

Firefox cannot open this XML file and reports an invalid character.

That entity is reserved in ISO 8859-1 and CP1252 and should not be rendered by browsers. But why does XmlDocument output XML that cannot be parsed as valid? Or is it a valid XML document that browsers simply cannot parse and that Excel and similar tools cannot import? Is there an easy way to get rid of these reserved "invalid characters", or to encode them in a way that browsers can handle?

Many thanks for your opinions and tips.

Recommended answer

Not all characters are representable in XML.

In XML 1.0, no character with a value below 0x20 can be used, except TAB (0x09), LF (0x0A) and CR (0x0D). This holds even when the character is written as a character reference such as &#x1A;, so the file shown in the question is not well-formed XML 1.0.
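
As a rough illustration of the rule above, the check below tests a code point against the Char production of the XML 1.0 specification. The names Xml10Chars and IsLegal are made up for this sketch and are not part of any library.

```csharp
public static class Xml10Chars
{
    // Returns true if the code point is allowed in an XML 1.0 document
    // (the "Char" production of the XML 1.0 specification).
    public static bool IsLegal(int codePoint)
    {
        return codePoint == 0x09                                 // TAB
            || codePoint == 0x0A                                 // LF
            || codePoint == 0x0D                                 // CR
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)        // stops before the surrogate range
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)      // excludes 0xFFFE and 0xFFFF
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);  // supplementary planes
    }
}
```

With this check, Xml10Chars.IsLegal(0x1A) is false, while Xml10Chars.IsLegal(0x09) is true.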

In XML 1.1, just about anything except NUL (0x00) can be used.

If you have the option to use XML 1.1, and the receiving program supports XML 1.1 (not many do), then you can escape the 0x1A as &#26; or &#x1A;.
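
For the XML 1.1 route, a minimal sketch of the escaping step might look like the following. It assumes that the output is declared as version 1.1, that the receiver accepts it, and that '&' and '<' are escaped elsewhere. The names Xml11Escaper, IsRestricted and EscapeRestricted are hypothetical; the ranges follow the RestrictedChar production of the XML 1.1 specification.

```csharp
using System.Text;

public static class Xml11Escaper
{
    // True for characters that XML 1.1 permits only as character references
    // (the "RestrictedChar" production of the XML 1.1 specification).
    public static bool IsRestricted(char c)
    {
        return (c >= 0x01 && c <= 0x08)
            || c == 0x0B || c == 0x0C
            || (c >= 0x0E && c <= 0x1F)
            || (c >= 0x7F && c <= 0x84)
            || (c >= 0x86 && c <= 0x9F);
    }

    // Replaces restricted characters with hexadecimal character references.
    // Assumption: '&' and '<' in the text are escaped by some other step.
    public static string EscapeRestricted(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (c == '\0')
                continue; // NUL is not allowed in XML 1.1 in any form, so it is dropped
            if (IsRestricted(c))
                sb.Append("&#x").Append(((int)c).ToString("X")).Append(';');
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Under these assumptions, the comment text from the question would be written with 0x1A replaced by the reference &#x1A;, and all other characters left unchanged.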

Wrapping it in CDATA is not a solution either; CDATA is just a convenience for escaping groups of characters differently from the standard & mechanism, and it does not widen the set of characters that are allowed.

Otherwise, you will need to remove it prior to serializing.
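
One way to do that removal, sketched under the assumption that the data sits in a DataTable as in the question, is to clean every string cell before the DataSet is serialized. The names XmlSanitizer, StripIllegalXml10Chars and Clean are hypothetical helpers, not library APIs, and dropping the characters silently is a design choice you may want to replace with substitution or logging.

```csharp
using System.Data;
using System.Text;

public static class XmlSanitizer
{
    // Returns the input with every character that XML 1.0 forbids removed.
    public static string StripIllegalXml10Chars(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;
        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (char.IsHighSurrogate(c) && i + 1 < input.Length
                && char.IsLowSurrogate(input[i + 1]))
            {
                // A well-formed surrogate pair encodes a code point >= 0x10000, which is legal.
                sb.Append(c).Append(input[i + 1]);
                i++;
            }
            else if (c == 0x09 || c == 0x0A || c == 0x0D
                     || (c >= 0x20 && c <= 0xD7FF)
                     || (c >= 0xE000 && c <= 0xFFFD))
            {
                sb.Append(c);
            }
            // Everything else (control characters, lone surrogates, 0xFFFE, 0xFFFF) is dropped.
        }
        return sb.ToString();
    }

    // Cleans every non-null string cell of the table in place.
    public static void Clean(DataTable table)
    {
        foreach (DataRow row in table.Rows)
            foreach (DataColumn col in table.Columns)
                if (col.DataType == typeof(string) && !row.IsNull(col))
                    row[col] = StripIllegalXml10Chars((string)row[col]);
    }
}
```

In the question's code, this would be called as XmlSanitizer.Clean(ds.Tables["Test"]) after da.Fill(ds, "Test") and before ds.GetXml().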


07-23 23:11