Problem description
Our C++ application reads configuration data from XML files that look something like this:
<data>
  <value id="FOO1" name="foo1" size="10" description="the foo" ... />
  <value id="FOO2" name="foo2" size="10" description="the other foo" ... />
  ...
  <value id="FOO300" name="foo300" size="10" description="the last foo" ... />
</data>
The complete application configuration consists of ~2500 of these XML files (which translates into more than 1.5 million key/value attribute pairs). The XML files come from many different sources/teams and are validated against a schema. However, sometimes the <value/> nodes look like this:
<value name="bar1" id="BAR1" description="the bar" size="20" ... />
Or like this:
<value id="BAT1" description="the bat" name="bat1" size="25" ... />
To make this process fast, we are using Expat to parse the XML documents. Expat exposes the attributes as an array - like this:
void ExpatParser::StartElement(const XML_Char* name, const XML_Char** atts)
{
    // The attributes are stored in an array of XML_Char* where:
    //   the nth element is the 'key'
    //   the n+1 element is the value
    //   the final element is NULL
    for (int i = 0; atts[i]; i += 2)
    {
        std::string key = atts[i];
        std::string value = atts[i + 1];
        ProcessAttribute(key, value);
    }
}
This puts all the responsibility onto our ProcessAttribute() function to read the 'key' and decide what to do with the value. Profiling the app has shown that ~40% of the total XML parsing time is spent dealing with these attributes by name/string.
The overall process could be sped up dramatically if I could guarantee/enforce the order of the attributes (for starters, no string comparisons in ProcessAttribute()). For example, if the 'id' attribute were always the first attribute, we could deal with it directly:
void ExpatParser::StartElement(const XML_Char* name, const XML_Char** atts)
{
    // The attributes are stored in an array of XML_Char* where:
    //   the nth element is the 'key'
    //   the n+1 element is the value
    //   the final element is NULL
    ProcessID(atts[1]);
    ProcessName(atts[3]);
    // etc.
}
According to the W3C schema specs, I can use <xs:sequence> in an XML schema to enforce the order of elements - but it doesn't seem to work for attributes - or perhaps I'm using it incorrectly:
<xs:element name="data">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="value" type="value_type" minOccurs="1" maxOccurs="unbounded" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:complexType name="value_type">
  <!-- This doesn't work -->
  <xs:sequence>
    <xs:attribute name="id" type="xs:string" />
    <xs:attribute name="name" type="xs:string" />
    <xs:attribute name="description" type="xs:string" />
  </xs:sequence>
</xs:complexType>
Is there a way to enforce attribute order in an XML document? If the answer is "no" - could anyone perhaps suggest an alternative that wouldn't carry a huge runtime performance penalty?
Recommended answer
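According to the XML specification, the order of attribute specifications in a start-tag or empty-element tag is not significant. You can check this in section 3.1.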
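Since the XML specification treats attribute order as not significant, a schema cannot enforce it. One possible alternative is an optimistic fast path: check whether each key sits in its usual slot with a single comparison, and fall back to a lookup only when it does not. The following is a minimal sketch, not the asker's actual code. ValueRecord, StoreField, FindSlot, ParseValueAttributes and the assumed "usual" order (id, name, size, description, taken from the first sample in the question) are hypothetical, and it assumes XML_Char is plain char (Expat built without XML_UNICODE).

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical record filled from one <value ... /> element.
struct ValueRecord {
    std::string id;
    std::string name;
    std::string description;
    int size = 0;
};

// Assumed "usual" attribute order, taken from the first sample in the question.
static const char* const kExpectedKeys[] = {"id", "name", "size", "description"};
static const int kKeyCount = 4;

// Store `value` into the field selected by `slot` (an index into kExpectedKeys).
static void StoreField(ValueRecord& rec, int slot, const char* value) {
    switch (slot) {
        case 0: rec.id = value; break;
        case 1: rec.name = value; break;
        case 2: rec.size = std::atoi(value); break;
        case 3: rec.description = value; break;
    }
}

// Slow path: compare the key against every known attribute name.
static int FindSlot(const char* key) {
    for (int s = 0; s < kKeyCount; ++s) {
        if (std::strcmp(key, kExpectedKeys[s]) == 0) return s;
    }
    return -1;  // unknown attribute; ignored in this sketch
}

// `atts` uses the layout described in the question:
// key, value, key, value, ..., NULL.
void ParseValueAttributes(const char** atts, ValueRecord& rec) {
    for (int i = 0; atts[i]; i += 2) {
        const int expected = i / 2;
        int slot;
        if (expected < kKeyCount &&
            std::strcmp(atts[i], kExpectedKeys[expected]) == 0) {
            slot = expected;            // fast path: key is where we expect it
        } else {
            slot = FindSlot(atts[i]);   // fallback: order differs or key is unknown
        }
        if (slot >= 0) StoreField(rec, slot, atts[i + 1]);
    }
}
```

This does not remove string comparisons altogether. It reduces the common case to one comparison per attribute and avoids building a std::string for every key, while still handling files whose attributes arrive in a different order. Whether it actually helps would need to be confirmed with the profiler on the real configuration files.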