Problem description
I'm working on some code to read an XML fragment which contains an XML declaration, e.g. <?xml version="1.0" encoding="utf-8"?>, and parse the encoding. According to MSDN, I should be able to do it like this:
var nt = new NameTable();
var mgr = new XmlNamespaceManager(nt);
var context = new XmlParserContext(null, mgr, null, XmlSpace.None);
var reader = new System.Xml.XmlTextReader(@"<?xml version=""1.0"" encoding=""UTF-8""?>",
System.Xml.XmlNodeType.XmlDeclaration, context);
However, the call to the System.Xml.XmlTextReader constructor throws a System.Xml.XmlException whose message says that XmlDeclaration is not supported for partial content parsing.
I've googled this error in quotes and got exactly zero results (edit: now there's one result, this post); searching without quotes yields nothing useful. I've also looked at the MSDN page for XmlNodeType, and it doesn't say anything about XmlDeclaration not being supported.
What am I missing here? How can I get an XmlTextReader instance from an XML declaration fragment?
Note, my goal here is just to determine the encoding of a partially-built XML document, on the assumption that it at least contains a declaration node; that is why I'm trying to get reader.Encoding. If there's another way to do that, I'm open to it.
At present, I'm parsing the declaration manually with a regex, which is not the best approach.
Recommended answer
Update: getting the encoding from an XML document or an XML fragment:
Here is one way to do it, using XmlReader.Create.
private static string GetXmlEncoding(string xmlString)
{
    if (string.IsNullOrWhiteSpace(xmlString)) throw new ArgumentException("The provided string value is null or empty.");
    using (var stringReader = new StringReader(xmlString))
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        using (var xmlReader = XmlReader.Create(stringReader, settings))
        {
            if (!xmlReader.Read()) throw new ArgumentException(
                "The provided XML string does not contain enough data to be valid XML (see https://msdn.microsoft.com/en-us/library/system.xml.xmlreader.read)");
            var result = xmlReader.GetAttribute("encoding");
            return result;
        }
    }
}
Here's the output, for both a full XML document and a fragment:
If you want a System.Text.Encoding instead of a string, you can modify the code to look like this:
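private static Encoding GetXmlEncoding(string xmlString)
{
    using (var stringReader = new StringReader(xmlString))
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        // Dispose the XmlReader as well as the underlying StringReader.
        using (var reader = XmlReader.Create(stringReader, settings))
        {
            reader.Read();
            // GetAttribute returns null when the current node has no encoding attribute.
            var encoding = reader.GetAttribute("encoding");
            if (encoding == null) throw new ArgumentException("The XML declaration does not specify an encoding.");
            return Encoding.GetEncoding(encoding);
        }
    }
}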
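Both versions above call GetAttribute on whatever node the first Read lands on, without checking that it is an XML declaration. A minimal sketch of a more defensive variant follows. The names XmlEncodingSketch and TryGetXmlEncoding, and the choice to return null when no encoding is declared, are assumptions made for illustration and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class XmlEncodingSketch
{
    // Hypothetical helper: returns the declared encoding, or null when the
    // input does not start with an XML declaration that names one.
    // Malformed XML still surfaces as an XmlException from Read(), and an
    // unrecognized encoding name as an ArgumentException from GetEncoding().
    public static Encoding TryGetXmlEncoding(string xmlString)
    {
        if (string.IsNullOrWhiteSpace(xmlString)) return null;

        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        using (var stringReader = new StringReader(xmlString))
        using (var reader = XmlReader.Create(stringReader, settings))
        {
            // Only the first node is inspected; the rest of the input is never parsed.
            if (!reader.Read() || reader.NodeType != XmlNodeType.XmlDeclaration) return null;

            var name = reader.GetAttribute("encoding");
            return name == null ? null : Encoding.GetEncoding(name);
        }
    }

    public static void Main()
    {
        var declared = TryGetXmlEncoding(@"<?xml version=""1.0"" encoding=""UTF-8""?>");
        Console.WriteLine(declared?.WebName ?? "(no encoding declared)");

        var missing = TryGetXmlEncoding("<note/>");
        Console.WriteLine(missing?.WebName ?? "(no encoding declared)");
    }
}
```

Returning null leaves the fallback policy to the caller, which may be preferable when a partially-built document is expected to lack a declaration some of the time.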
Old answer:
As you mentioned, XmlTextReader's Encoding property contains the encoding.
Here's the full source code of a console app, which will hopefully be useful:
using System;
using System.IO;
using System.Text;
using System.Xml;

class Program
{
    static void Main(string[] args)
    {
        var asciiXML = @"<?xml version=""1.0"" encoding=""ASCII""?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>";
        var utf8XML = @"<?xml version=""1.0"" encoding=""UTF-8""?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>";
        var asciiResult = GetXmlEncoding(asciiXML);
        var utfResult = GetXmlEncoding(utf8XML);
        Console.WriteLine(asciiResult);
        Console.WriteLine(utfResult);
        Console.ReadLine();
    }

    private static Encoding GetXmlEncoding(string s)
    {
        // The string is turned into UTF-8 bytes so that XmlTextReader can read it as a stream.
        var stream = new MemoryStream(Encoding.UTF8.GetBytes(s));
        using (var xmlreader = new XmlTextReader(stream))
        {
            // Moving to the root element makes the reader process the declaration first.
            xmlreader.MoveToContent();
            var encoding = xmlreader.Encoding;
            return encoding;
        }
    }
}
Here's the program's output:
If you know that the XML only contains the declaration, maybe you can append an empty root element? For example:
var fragmentResult = GetXmlEncoding(xmlFragment + "<root/>");
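To make the empty-root idea concrete, here is a minimal sketch that reuses the XmlTextReader approach from the console app. The names DeclarationOnlySketch and GetEncodingFromDeclaration and the placeholder element name root are assumptions for illustration; the sketch presumes the input holds nothing but the declaration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class DeclarationOnlySketch
{
    // Hypothetical helper: appends an empty placeholder root so that a
    // declaration-only string becomes a well-formed document.
    public static Encoding GetEncodingFromDeclaration(string declaration)
    {
        var bytes = Encoding.UTF8.GetBytes(declaration + "<root/>");
        using (var stream = new MemoryStream(bytes))
        using (var reader = new XmlTextReader(stream))
        {
            // Advancing to the root element forces the declaration to be read.
            reader.MoveToContent();
            return reader.Encoding;
        }
    }

    public static void Main()
    {
        var encoding = GetEncodingFromDeclaration(@"<?xml version=""1.0"" encoding=""UTF-8""?>");
        Console.WriteLine(encoding.WebName);
    }
}
```

Because the string is re-encoded as UTF-8 bytes before parsing, a declaration that names an ASCII-incompatible encoding such as UTF-16 is likely to make the reader throw rather than report that encoding.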