Parsing strings: extracting words and phrases [JavaScript]

Problem description

I need to support exact phrases (enclosed in quotes) in an otherwise space-separated list of terms. Splitting the string on the space character is therefore no longer sufficient.

Example:

input : 'foo bar "lorem ipsum" baz'
output: ['foo', 'bar', 'lorem ipsum', 'baz']

I wonder whether this could be achieved with a single RegEx, rather than performing complex parsing or split-and-rejoin operations.

Any help would be greatly appreciated!

Recommended answer

var str = 'foo bar "lorem ipsum" baz';
// Match either a quoted phrase or a run of characters that are neither quotes nor whitespace.
var results = str.match(/("[^"]+"|[^"\s]+)/g);

... returns the array you're looking for.
Note, however:

  • Bounding quotes are included in the matches, so they can be removed by running replace(/^"([^"]+)"$/,"$1") on each result.
  • Whitespace between the quotes stays intact. So, if there are three spaces between lorem and ipsum, they'll be in the result. You can fix this by running replace(/\s+/g," ") on each result.
  • If there's no closing " after ipsum (i.e. an incorrectly quoted phrase), you'll end up with: ['foo', 'bar', 'lorem', 'ipsum', 'baz']
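
The notes above can be folded into one small helper. This is only a sketch, not part of the original answer: the function name parseTerms is hypothetical, and it assumes you want the bounding quotes stripped and whitespace runs inside a phrase collapsed to a single space.

```javascript
// Hypothetical helper (name and post-processing choices are assumptions):
// splits a search string into bare words and quoted phrases.
function parseTerms(input) {
  // A quoted phrase, or a run of characters that are neither quotes nor whitespace.
  // match() returns null when nothing matches, so fall back to an empty array.
  const matches = input.match(/"[^"]+"|[^"\s]+/g) || [];
  return matches.map((term) =>
    term
      .replace(/^"([^"]+)"$/, "$1") // strip the bounding quotes, if any
      .replace(/\s+/g, " ")         // collapse whitespace runs inside a phrase
  );
}

const terms = parseTerms('foo bar "lorem ipsum" baz');
// terms is ['foo', 'bar', 'lorem ipsum', 'baz']
console.log(terms);
```

An unclosed quote is not treated as an error here: the stray " is simply skipped and the remaining words come back as separate terms, which matches the behaviour described in the last note.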


10-10 05:34