Problem description
I am looking for the most efficient way to accept a string and tokenize it into an array, separating out any HTML tag groups.
Example Input (String):
"I can format my text so that <strong>This is bold</strong> and this is not."
Desired Output (String[] array):
"I can format my text so that",
"<strong>",
"This is bold",
"</strong>",
"and this is not."
Alternate Output, Just As Good (String[] array):
"I",
"can",
"format",
"my",
"text",
"so",
"that",
"<strong>",
"This",
"is",
"bold",
"</strong>",
"and",
"this",
"is",
"not."
I am unsure as to the best way to approach this problem. Any help would be appreciated.
Recommended answer
You can use Regex.Split() with a set of zero-length assertions to split at every position that is followed by < or preceded by >:
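using System.Text.RegularExpressions;

// Split before every '<' and after every '>' without consuming any characters.
string input = "I can format my text so that <strong>This is bold</strong> and this is not.";
string[] output = Regex.Split(input, "(?=<)|(?<=>)");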
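The raw split keeps the whitespace next to each tag (for example "I can format my text so that " with a trailing space), and it can yield an empty first or last element when the input starts or ends with a tag. A minimal sketch of one way to get exactly the desired output is to trim each piece and drop the empty ones. The class and method names here (HtmlTokenizer, SplitTagsAndText) are hypothetical and not part of the original answer:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class HtmlTokenizer
{
    // Hypothetical helper (not from the original answer): splits around
    // tags with the same pattern, then trims each piece and discards
    // pieces that are empty after trimming.
    public static string[] SplitTagsAndText(string input)
    {
        return Regex.Split(input, "(?=<)|(?<=>)")
            .Select(piece => piece.Trim())
            .Where(piece => piece.Length > 0)
            .ToArray();
    }
}
```

Calling HtmlTokenizer.SplitTagsAndText(input) on the example string should give the five elements listed under the desired output. This still assumes that < and > appear only as tag delimiters; a literal > in the text would also cause a split.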
(?=pattern) is known as a look-ahead assertion; it ensures that pattern follows the current position. (?<=pattern) is a look-behind assertion: the same concept, but it looks at the characters before the position. Both assertions are zero-length, so the split consumes no characters and the tags stay intact in the output.
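For the alternate, word-level output in the question, one possible approach is to match tokens instead of splitting: a token is either a whole tag or a run of characters containing no whitespace and no <. This is a sketch under that assumption, with hypothetical names (HtmlWordTokenizer, SplitTagsAndWords), not something taken from the original answer:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class HtmlWordTokenizer
{
    // Hypothetical helper: returns each tag as one token and each
    // whitespace-delimited word outside a tag as its own token.
    public static string[] SplitTagsAndWords(string input)
    {
        // <[^>]*>   matches a complete tag, including any attributes
        // [^<\s]+   matches a run of non-whitespace, non-'<' characters
        return Regex.Matches(input, @"<[^>]*>|[^<\s]+")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }
}
```

Because the tag alternative is tried first, a tag with attributes such as <a href="x"> stays a single token rather than being broken at its internal spaces.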