Problem description
I am trying to deserialize the output of .NET's XML doc comment using an XmlSerializer. For reference, the output of xml documentation looks like:
<?xml version="1.0"?>
<doc>
  <assembly>
    <name>Apt.Lib.Data.Product</name>
  </assembly>
  <members>
    <member name="P:MyNamespace.MyType.MyProperty">
      <summary>See <see cref="T:MyNamespace.MyOthertype"/> for more info</summary>
    </member>
    ...
  </members>
</doc>
The object I'm using to generate the serializer is:
using System.Collections.Generic;
using System.Xml.Serialization;

[XmlRoot("doc")]
public class XmlDocumentation
{
    public static readonly XmlSerializer Serializer = new XmlSerializer(typeof(XmlDocumentation));

    [XmlElement("assembly")]
    public AssemblyName Assembly { get; set; }

    [XmlArray("members")]
    [XmlArrayItem("member")]
    public List<Member> Members { get; set; }

    public class AssemblyName
    {
        [XmlElement("name")]
        public string Name { get; set; }
    }

    public class Member
    {
        [XmlAttribute("name")]
        public string Name { get; set; }

        // Mapped as a plain string, so nested elements such as <see/> are not expected here.
        [XmlElement("summary")]
        public string Summary { get; set; }
    }
}
The problem is when the serializer encounters the embedded see cref tag. In that case the serializer throws the following exception:
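System.InvalidOperationException : There is an error in XML document (147, 27). ----> System.Xml.XmlException : Unexpected node type Element. ReadElementString method can only be called on elements with simple or empty content. Line 147, position 27.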
How can I capture the entire content of the summary tag as a string during deserialization?
Recommended answer
The cref tag itself contains illegal characters. Specifically <, > can't be embedded in the contents of an XML element. You should sanitize the strings before they are serialized or deserialized.
You can do something like this if you need to be able to apply specific rules to how certain characters are escaped or substituted:
string ScrubString(string dirty)
{
    // Copy characters that are safe for XML; handle the rest explicitly.
    StringBuilder strBldr = new StringBuilder(dirty.Length);
    foreach (char c in dirty)
    {
        if (IsXmlSafe(c))
        {
            strBldr.Append(c);
        }
        else
        {
            // Do something to escape or replace that character.
        }
    }
    return strBldr.ToString();
}
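bool IsXmlSafe(char c)
{
    int charInt = Convert.ToInt32(c);
    return charInt == 9                               // tab
        || charInt == 10                              // line feed
        || charInt == 13                              // carriage return
        || (charInt >= 32 && charInt <= 9728)
        || (charInt >= 9983 && charInt <= 55295)
        || (charInt >= 57344 && charInt <= 65533);
    // Code points 65536..1114111 never fit in a single 16-bit char;
    // they arrive as surrogate pairs and would need a separate check.
}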
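One possible way to fill in the "escape or replace" branch is sketched below. It is only an illustration, not the original answer's code: the names XmlScrubber and Scrub are hypothetical, and it assumes that escaping the markup characters <, > and & as entities and silently dropping other invalid characters is acceptable for your data. It relies on XmlConvert.IsXmlChar, which should be available from .NET Framework 4.0 onward.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlScrubber
{
    // Hypothetical helper: escapes markup characters and drops characters
    // that are not valid in XML 1.0 text.
    public static string Scrub(string dirty)
    {
        var sb = new StringBuilder(dirty.Length);
        foreach (char c in dirty)
        {
            if (c == '<') sb.Append("&lt;");
            else if (c == '>') sb.Append("&gt;");
            else if (c == '&') sb.Append("&amp;");
            // Surrogates are passed through unchanged; this sketch does not
            // verify that they form well-formed pairs.
            else if (XmlConvert.IsXmlChar(c) || char.IsSurrogate(c)) sb.Append(c);
            // Anything else (for example most control characters) is dropped.
        }
        return sb.ToString();
    }
}
```

For example, XmlScrubber.Scrub("a<b>\u0001c") would yield "a&lt;b&gt;c". Note that this is meant for individual string values; applying it to a whole XML document would also escape the document's own tags.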
You can also use some of the approaches here to just remove any illegal character using regex:
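The linked approaches are not reproduced here, but a regex-based removal might look roughly like the following sketch. The class and method names are hypothetical, and the pattern is an assumption based on the XML 1.0 character ranges: it treats every UTF-16 code unit from U+0020 to U+FFFD (surrogates included) plus tab, line feed and carriage return as legal, and removes everything else.

```csharp
using System;
using System.Text.RegularExpressions;

public static class XmlRegexScrubber
{
    // Matches any UTF-16 code unit outside the ranges allowed by this sketch:
    // tab, line feed, carriage return, and U+0020 through U+FFFD.
    private static readonly Regex IllegalChars =
        new Regex(@"[^\u0009\u000A\u000D\u0020-\uFFFD]", RegexOptions.Compiled);

    // Hypothetical helper: strips the matched characters instead of escaping them.
    public static string RemoveIllegalXmlChars(string text)
    {
        return IllegalChars.Replace(text, string.Empty);
    }
}
```

Unlike an escaping approach, this leaves <, > and & untouched, so it only helps when the problem is stray control characters rather than markup inside the text.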