Problem description
I have an XML file like this:
<?xml version="1.0" encoding="UTF-8"?>
<books>
<book id="101">3.1256
<auth-name>Idris Polk</auth-name>
<auth id="a1">The author is a Professor of Physics at MIT</auth>
<ph ll="p1">336451234</ph> <ph ll="p2">336051294</ph> <mail>[email protected]</mail> <ph ll="p3">336133291</ph>
</book>
<book id="105">4.2250
<auth-name>Andre Olo</auth-name>
<auth id="a2">The research fellow at NSF</auth>
<ph ll="p101">336200316</ph>, <ph ll="p102">336151093</ph>, <ph ll="p103">336151094</ph>, <mail>[email protected]</mail> <ph ll="p111">336900336</ph>, <ph ll="p112">336154094</ph>, <ph ll="p113">336151098</ph>, <mail>[email protected]</mail>
</book>
<ebook id="1">4.2350
<auth-name>John Bart</auth-name>
<auth id="ae1">The research fellow at Caltech</auth>
<ph ll="p50">336200313</ph>, <ph ll="p51">336151090</ph>, <ph ll="p52">336851091</ph>, <ph ll="p53">336151097</ph>, <mail>[email protected]</mail> <ph ll="p111">336000311</ph>, <ph ll="p112">336224094</ph>
</ebook>
...
</books>
How do I collect the ph nodes that have an ll attribute, within a particular parent node, into a collection when there are more than two ph nodes in a row separated only by whitespace, or by a comma and whitespace? If any other character or node (any kind of string) falls between one ph node and the next, those nodes should not be grouped together. For example, if a <book id="..."> node contains ph nodes in the form
<ph ll="1">...</ph> <ph ll="2">...</ph> <mail>...</mail> <ph ll="3">...</ph>
then nothing is added to the collection. However, if they appear in the order
<ph ll="1">...</ph> <ph ll="2">...</ph> <ph ll="3">...</ph> <mail>...</mail>
then <ph ll="1">...</ph> <ph ll="2">...</ph> <ph ll="3">...</ph> should be added to the collection as a single element, because the parent node contains more than two ph nodes separated only by whitespace.
Obviously a simple
var cls = doc.Descendants("ph")
    .Where(x => x.Attribute("ll") != null);
won't do. Can anyone help?
Recommended answer
Try the code below. I used LINQ to XML along with a helper method:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";

        static void Main(string[] args)
        {
            XDocument doc = XDocument.Load(FILENAME);
            // keep only the children of <books> that contain at least one sequence
            var books = doc.Descendants("books").Elements()
                .Select(x => new { book = x, sequence = TestChildren(x) })
                .Where(x => x.sequence != null)
                .ToList();
            // print each sequence on its own numbered line
            string results = string.Join("\n", books.SelectMany(x => x.sequence)
                .Select((x, i) => (i + 1).ToString() + ") " + string.Join("", x.Select(y => y.ToString()))));
            Console.WriteLine(results);
            Console.ReadLine();
        }

        // returns the runs of 3 or more ph children whose ll numbers increment by 1, or null if there are none
        static List<List<XElement>> TestChildren(XElement book)
        {
            List<List<XElement>> results = null;
            List<XElement> children = book.Elements().ToList();
            // get the numeric part of each ll (e.g. "p101" -> 101), make -1 if not ph
            // assumes every ph has an ll made of one letter followed by digits
            List<int> lls = children.Select(x => x.Name.LocalName != "ph" ? -1 : int.Parse(((string)x.Attribute("ll")).Substring(1))).ToList();
            // check for 3 in a row incrementing
            int startIndex = -1;
            int numberInSequence = 0;
            // i + 2 must stay in range, so the last start position is Count - 3
            for (int i = 0; i < lls.Count - 2; i++)
            {
                // test for 3 in a row; lls[i] >= 0 keeps the -1 placeholder out of a run
                if (lls[i] >= 0 && (lls[i] + 1 == lls[i + 1]) && (lls[i] + 2 == lls[i + 2]))
                {
                    // if first sequence found, set start index and length to 3
                    if (startIndex == -1)
                    {
                        startIndex = i;
                        numberInSequence = 3;
                    }
                    else
                    {
                        // increase length if more than 3
                        numberInSequence++;
                    }
                }
                else
                {
                    // if a sequence has been found, add it to the results
                    if (numberInSequence >= 3)
                    {
                        List<XElement> sequence = children.Skip(startIndex).Take(numberInSequence).ToList();
                        if (results == null) results = new List<List<XElement>>();
                        results.Add(sequence);
                        startIndex = -1;
                        numberInSequence = 0;
                    }
                }
            }
            // add a sequence that runs up to the last child
            if (numberInSequence >= 3)
            {
                List<XElement> sequence = children.Skip(startIndex).Take(numberInSequence).ToList();
                if (results == null) results = new List<List<XElement>>();
                results.Add(sequence);
            }
            return results;
        }
    }
}
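The code above finds runs by looking for incrementing ll numbers among the child elements, so it never inspects the text that sits between two ph elements. If the separator rule from the question matters (only whitespace, or a comma plus whitespace, between neighbours), one possible variation is to walk parent.Nodes() instead, which also yields text nodes. The following is a minimal sketch under assumptions, not part of the original answer: the names PhRuns, Find, Separator and minCount are hypothetical, and an empty gap between two ph elements is accepted because XDocument.Load drops whitespace-only text unless LoadOptions.PreserveWhitespace is passed.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper (not from the original answer).
public static class PhRuns
{
    // Text allowed between two ph elements: optional whitespace and at most one comma.
    private static readonly Regex Separator = new Regex(@"^\s*,?\s*$");

    // Returns every run of at least minCount ph elements (with an ll attribute)
    // that are direct children of parent and are separated only by allowed text.
    public static List<List<XElement>> Find(XElement parent, int minCount = 3)
    {
        var results = new List<List<XElement>>();
        var run = new List<XElement>();
        var between = new StringBuilder(); // text seen since the last ph
        bool broken = false;               // a non-text node was seen since the last ph

        foreach (XNode node in parent.Nodes())
        {
            if (node is XElement el && el.Name.LocalName == "ph" && el.Attribute("ll") != null)
            {
                bool continues = run.Count > 0 && !broken && Separator.IsMatch(between.ToString());
                if (!continues)
                {
                    // the previous run ended; keep it only if it is long enough
                    if (run.Count >= minCount) results.Add(run);
                    run = new List<XElement>();
                }
                run.Add(el);
                between.Clear();
                broken = false;
            }
            else if (node is XText text)
            {
                between.Append(text.Value);
            }
            else
            {
                broken = true; // another element, a comment, etc. ends the run
            }
        }

        // keep a run that reaches the end of the parent
        if (run.Count >= minCount) results.Add(run);
        return results;
    }
}
```

With the sample file, calling PhRuns.Find on each child of books would be expected to return no runs for book 101 (the mail element splits the ph nodes), two runs of three for book 105, and one run of four for the ebook, since its trailing pair is too short.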