Parsing strings: extracting words and phrases [JavaScript]
Question
I need to support exact phrases (enclosed in quotes) in an otherwise space-separated list of terms. Splitting the string on the space character is therefore no longer sufficient.
Example:
input : 'foo bar "lorem ipsum" baz'
output: ['foo', 'bar', 'lorem ipsum', 'baz']
I wonder whether this could be achieved with a single RegEx, rather than performing complex parsing or split-and-rejoin operations.
Any help would be greatly appreciated!
Recommended answer
var str = 'foo bar "lorem ipsum" baz';
var results = str.match(/("[^"]+"|[^"\s]+)/g);
... returns the array you're looking for.
Note, however:
- The bounding quotes are included in the results; they can be removed by running replace(/^"([^"]+)"$/,"$1") on each result.
- Whitespace between the quotes is left intact. So if there are three spaces between lorem and ipsum, they will appear in the result. You can fix this by running replace(/\s+/g," ") on each result.
- If there is no closing " after ipsum (i.e. an incorrectly quoted phrase), you will end up with: ['foo', 'bar', 'lorem', 'ipsum', 'baz']
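The match and the two clean-up steps from the notes can be combined into one function. This is only a minimal sketch: the name parseTerms is hypothetical and not part of the original answer, and the `|| []` fallback for input with no matches is an added assumption.

```javascript
// Hypothetical helper; the name parseTerms is an assumption for illustration.
function parseTerms(input) {
  // A token is either a quoted phrase or a run of characters that are
  // neither quotes nor whitespace. match() returns null when nothing
  // matches, so fall back to an empty array.
  var tokens = input.match(/("[^"]+"|[^"\s]+)/g) || [];

  return tokens.map(function (token) {
    return token
      .replace(/^"([^"]+)"$/, "$1") // strip the bounding quotes, if any
      .replace(/\s+/g, " ");        // collapse whitespace runs inside a phrase
  });
}

console.log(parseTerms('foo bar "lorem   ipsum" baz'));
// → [ 'foo', 'bar', 'lorem ipsum', 'baz' ]
```

An unclosed quote is still skipped silently, so 'foo bar "lorem ipsum baz' yields five separate words, as described in the last note.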