Problem description
I have a string of XML data that I pulled from a web service. The data is ugly and has some invalid characters in the name tags of the XML. For example, I may see something like:
<Author>Scott the Coder</Author><Address#>My address</Address#>
The # in the Address tag name is invalid. I am looking for a regular expression that removes all invalid characters from the name tags but leaves every character in the value section of the XML untouched. In other words, I want to use a regex to remove characters only from the opening and closing name tags. Everything else should remain the same.
I don't have the full list of invalid characters yet, but this will get me started: #{}&()
Is it possible to do what I am trying to do?
Recommended answer
I had a simple form with two text boxes (textBox1 and textBox2) and one button. This seems to do the trick.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Text.RegularExpressions;
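namespace WindowsFormsApplication3
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            // Two alternatives: the first covers opening tags (<Name), the second
            // closing tags (</Name). Each matches one listed character that sits
            // directly between the word characters of the name and the closing '>'.
            // Note: only a single invalid character right before '>' is matched.
            Regex r = new Regex(@"(?<=\<\w+)[#\{\}\(\)\&](?=\>)|(?<=\</\w+)[#\{\}\(\)\&](?=\>)");

            // Replace every match with the empty string; text between tags is untouched.
            textBox2.Text = r.Replace(textBox1.Text, new MatchEvaluator(deleteMatch));
        }

        // Match evaluator that deletes whatever was matched.
        string deleteMatch(Match m) { return ""; }
    }
}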
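The pattern above only removes a single invalid character that sits immediately before the closing >. If a name can contain several invalid characters, or one in the middle of the name, a possible variation is to match each tag name as a whole and filter its characters in a match evaluator. The following is a minimal sketch under stated assumptions, not a definitive implementation: the class name TagNameCleaner and the method Clean are hypothetical, the character list is the one from the question, and the input is assumed to contain no CDATA sections with tag-like text.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class TagNameCleaner
{
    // Characters the question lists as invalid so far; extend as needed.
    const string InvalidChars = "#{}&()";

    // Group 1: "<" or "</". Group 2: the raw tag name, i.e. everything up to
    // whitespace, "/", "<" or ">". Names starting with "!" or "?" are skipped
    // so comments, DOCTYPE and processing instructions are left alone.
    static readonly Regex TagName = new Regex(@"(</?)([^\s<>/!?][^\s<>/]*)");

    // Hypothetical helper: strips the listed characters from tag names only.
    public static string Clean(string xml)
    {
        return TagName.Replace(xml, m =>
            m.Groups[1].Value +
            new string(m.Groups[2].Value
                .Where(c => InvalidChars.IndexOf(c) < 0)
                .ToArray()));
    }

    static void Main()
    {
        string input = "<Author>Scott the Coder</Author><Address#>My address</Address#>";
        Console.WriteLine(Clean(input));
        // prints: <Author>Scott the Coder</Author><Address>My address</Address>
    }
}
```

Because the match stops at the first whitespace, "/" or ">", attribute values and element text are never part of the match and stay unchanged. If the full list of invalid characters is not known, one option is to swap the blacklist test for System.Xml.XmlConvert.IsNCNameChar(c), which keeps only characters allowed in a non-colonized XML name; that would also drop the colon of prefixed names, so it is a trade-off rather than a drop-in fix.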