html - Replace every double quote in HTML text using regex

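I'm writing a web application in ASP.NET and need help with regular expressions. I need two expressions: the first should find, and eventually replace with a single quote, every double-quote character inside HTML tags; the second should find every double-quote character that is not inside an HTML tag and replace it with &quot;.
For example: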
This is a "wonderful long text". "Another wonderful ong text" At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>
should be changed to this:
This is a &quot;wonderful long text&quot;. &quot;Another wonderful ong text&quot; At least it should be. Here we have a <a href='http://wwww.site-to-nowhere.com' target='_blank'>link</a>
I tried the following expression:

"([^<>]*?)"(?=[^>]+?<)

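The problem is that it fails to capture "Another wonderful ong text", probably because it sits right next to a tag.
Can you help me solve this? Or is there another way to do this replacement in .NET?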
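
To make the "next to a tag" suspicion concrete, here is a minimal sketch that runs the pattern from the question against two made-up inputs. The class name, method name, and sample strings are hypothetical and not from the original post. The lookahead `(?=[^>]+?<)` needs at least one character between the closing quote and a later `<`, so a quoted run that is followed directly by a tag should be skipped.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteRegexDemo
{
    // The pattern from the question, unchanged.
    private const string Pattern = "\"([^<>]*?)\"(?=[^>]+?<)";

    // Counts how many quoted runs the pattern finds in the input.
    public static int CountMatches(string input)
    {
        return Regex.Matches(input, Pattern).Count;
    }

    public static void Main()
    {
        // Hypothetical input: the second quoted run is followed directly by a tag.
        string adjacent = "<p>\"one\" \"two\"</p>";
        // Hypothetical input: some text separates the second quoted run from the tag.
        string spaced = "<p>\"one\" \"two\" end</p>";

        Console.WriteLine(CountMatches(adjacent)); // prints 1: "two" is missed
        Console.WriteLine(CountMatches(spaced));   // prints 2: both runs are found
    }
}
```

Patching the lookahead may fix this particular case, but the pattern still has no real notion of where a tag starts or ends, which is why a parser tends to be the more robust choice.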

Best answer

Don't use regex to parse HTML. I can recommend HtmlAgilityPack:

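// Requires: using System.IO; (for StringWriter)
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);  // html is your HTML string
// SelectNodes returns null (not an empty list) when nothing matches
var textNodes = doc.DocumentNode.SelectNodes("//text()");
if (textNodes != null)
{
    foreach (HtmlAgilityPack.HtmlTextNode node in textNodes)
    {
        // Only text content is touched; attribute values are left alone
        node.Text = node.Text.Replace("\"", "&quot;");
    }
}
StringWriter sw = new StringWriter();
doc.Save(sw);
string result = sw.ToString();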

I tested it with your HTML example, and this is the (desired) result:

<p>This is a &quot;wonderful long text&quot;. &quot;Another wonderful ong text&quot;</p> At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>
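
The snippet above only covers quotes in text nodes. For the other half of the question (single quotes around attribute values), a possible sketch is shown below. It assumes the `QuoteType` property on `HtmlAttribute` and the `AttributeValueQuote` enum are available in the HtmlAgilityPack version you are using. The class and method names are hypothetical, and this has not been checked against every release of the library.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

public static class AttributeQuoteDemo
{
    // Re-serializes the HTML with single quotes around attribute values.
    public static string UseSingleQuotes(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Walk every node below the document root.
        foreach (HtmlNode node in doc.DocumentNode.Descendants())
        {
            // Text and comment nodes carry no attributes, so skip them.
            if (!node.HasAttributes)
            {
                continue;
            }

            foreach (HtmlAttribute attr in node.Attributes)
            {
                // Assumption: the writer honors this setting when saving.
                attr.QuoteType = AttributeValueQuote.SingleQuote;
            }
        }

        var sw = new StringWriter();
        doc.Save(sw);
        return sw.ToString();
    }
}
```

Calling `AttributeQuoteDemo.UseSingleQuotes(result)` on the output of the previous step would combine both replacements. Attribute values that themselves contain a single quote may need extra care, so it is worth checking the output on real data.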