I'm trying to split a string on spaces or on certain symbols (currently *_-<>). Here are some example inputs and outputs:

"Hello how are you" -> [ "Hello", " ", "how", " ", "are", " ", "you" ]

"Hello *how* are *you*" -> [ "Hello", " ", "*how*", " ", "are", " ", "*you*" ]

"Hello *how*are_you_" -> [ "Hello", " ", "*how*", "are", "_you_" ]

"*how*are _you_ \*doing*_today_ hm?" -> [ "*how*", "are", " ", "_you_", " ", "\*doing*", "_today_", " ", "hm?" ]


Unfortunately, splitting on spaces leaves something like *how*_are_ as a single array item instead of breaking it into several items.

I also tried splitting with a regular expression, but that does not keep the symbols around each word.
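A minimal sketch of the two attempts described above. The regular expression in the second attempt is assumed for illustration; the post does not show the one actually tried.

```javascript
const input = "Hello *how*are_you_";

// Attempt 1: splitting on spaces keeps "*how*are_you_" together
// (and does not keep the spaces themselves).
const bySpace = input.split(" ");
console.log(bySpace);
// [ 'Hello', '*how*are_you_' ]

// Attempt 2 (assumed regex): splitting on any symbol or space
// throws the symbols away and leaves empty strings behind.
const byRegex = input.split(/[-*_<> ]/);
console.log(byRegex);
// [ 'Hello', '', 'how', 'are', 'you', '' ]
```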

Sorry if this is a bit confusing. Is there a good way to solve this?

Best answer

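Instead of using split, one option is to use .match: either match a symbol, followed by characters that are not that symbol, followed by the same symbol again; or match a single space; or match a run of characters that are neither spaces nor symbols: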



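// Put the dash first so it is treated as a literal inside the character class:
const delims = '-*_<>';

// Construct a pattern like:
// ([-*_<>])(?:(?!\1).)+\1| |[^-*_<> ]+
// String.raw keeps the backslash in \1, so it reaches RegExp as a backreference.
const patternStr = String.raw`([${delims}])(?:(?!\1).)+\1| |[^${delims} ]+`;
const pattern = new RegExp(patternStr, 'g');

// Return every token in order: a delimited word, a single space, or a plain word.
const doMatch = str => str.match(pattern);
console.log(doMatch("Hello how are you"));
console.log(doMatch("Hello *how*are_you_"));
// In an ordinary string literal, \* is just *, so no backslash reaches the pattern here.
console.log(doMatch("*how*are _you_ \*doing*_today_ hm?"));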
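One detail about the last call is worth checking: the `\*` in the question's fourth example only survives if the string is written with `String.raw` or a doubled backslash. The sketch below uses the same pattern written as a regex literal and compares the two spellings. The names `plain` and `raw` are introduced here for illustration only.

```javascript
// Same pattern as above, written as a regex literal.
const pattern = /([-*_<>])(?:(?!\1).)+\1| |[^-*_<> ]+/g;

// In a normal string literal, "\*" is an identity escape and yields "*".
const plain = "\*doing*_today_";
// String.raw keeps the backslash as a real character.
const raw = String.raw`\*doing*_today_`;

console.log(plain.length); // 14
console.log(raw.length);   // 15

console.log(plain.match(pattern)); // [ '*doing*', '_today_' ]
console.log(raw.match(pattern));   // [ '\\', '*doing*', '_today_' ]
```

So with a real backslash in the input, this pattern returns the backslash as its own item instead of the single "\*doing*" item that the question's example asks for. Treating escaped delimiters specially would need a different pattern.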





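([-*_<>])(?:(?!\1).)+\1| |[^-*_<> ]+ means:

([-*_<>])(?:(?!\1).)+\1 - first alternative:

    ([-*_<>]) - match and capture the opening delimiter
    (?:(?!\1).)+ - followed by one or more characters that are not that delimiter
    \1 - followed by the same delimiter again

" " (a literal space) - second alternative: match a single space
[^-*_<> ]+ - third alternative: match one or more characters that are neither delimiters nor spaces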
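The pieces above can be wrapped into a small reusable function. This is only a sketch: `tokenize` and `escapeForCharClass` are hypothetical names that do not appear in the original answer, and escaping the delimiters is an added assumption so that the dash no longer has to come first.

```javascript
// Hypothetical helper: escape the characters that are special inside
// a regex character class (backslash, closing bracket, caret, dash).
const escapeForCharClass = s => s.replace(/[\\\]^-]/g, '\\$&');

// Hypothetical wrapper around the .match approach described above.
function tokenize(str, delims = '*_-<>') {
  const cls = escapeForCharClass(delims);
  const pattern = new RegExp(
    // delimited word | single space | run of non-delimiter, non-space chars
    `([${cls}])(?:(?!\\1).)+\\1| |[^${cls} ]+`,
    'g'
  );
  // .match returns null when nothing matches, so fall back to an empty array.
  return str.match(pattern) || [];
}

console.log(tokenize("Hello *how*are_you_"));
// [ 'Hello', ' ', '*how*', 'are', '_you_' ]

console.log(tokenize("a*b"));
// [ 'a', 'b' ]
```

The second call shows a limitation of this pattern: a delimiter with no closing partner is not matched by any of the three alternatives, so it is silently dropped from the result.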
