Regex to match individual words that are not inside quotes



Problem description


I'm looking to write a regex (C#) that will match words that aren't surrounded by quotes. An example input string would be:

dbo.test line_length "quoted words" notquoted


This would need to match:

dbo.test

line_length

notquoted

So that is 3 separate matches, and "quoted words" is not matched. The quoted phrase could be anywhere in the input: beginning, middle, end, etc.

I haven't been able to come up with a regex that matches words not in quotes when there could be a space inside the quotes. I've been able to match something like hello "world" and get only hello.

Is there a way to write the regex I'm trying to write?

Recommended answer


There are two ways to tackle this, depending on what you want to do with the output.

First, match (but don't capture) any text within quotation marks. (This specifically matches the stuff that you DON'T want.) Then, using the | pipe, add a capture group that selects everything you DO want to keep. Matches that come from the quoted branch leave the capture group empty, so keep only the matches in which the group actually captured something.

Example:

".*?"|(\b\S+\b)

You can see an example here.
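As a rough illustration of the capture-group approach, here is a minimal sketch. It uses Python's re module instead of C# purely for demonstration; the helper name unquoted_words is hypothetical and not part of the original answer. The pattern itself is the one shown above.

```python
import re

# Left branch: consume a double-quoted phrase without capturing it.
# Right branch: capture a whitespace-delimited word in group 1.
PATTERN = re.compile(r'".*?"|(\b\S+\b)')

def unquoted_words(text):
    """Return the words that lie outside double-quoted phrases.

    Hypothetical helper for illustration: it keeps only the matches
    in which group 1 participated, i.e. the unquoted branch.
    """
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]

print(unquoted_words('dbo.test line_length "quoted words" notquoted'))
# prints ['dbo.test', 'line_length', 'notquoted']
```

In C#, the same filtering step would presumably test match.Groups[1].Success for each match returned by Regex.Matches.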

The other option, using look-arounds, is to look backward from the beginning of each word and forward from its end to ensure that a " doesn't appear on either side:

(?<!")(\b\S+\b)(?!")

You can see this in action here.
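The look-around pattern can be tried out with a similar sketch, again in Python for illustration only. The helper name words_not_touching_quotes is an assumption made for this example. On the sample input from the question, where the quoted phrase has exactly two words, it returns the three expected words.

```python
import re

# A word is accepted only if no double quote sits immediately
# before its first character or immediately after its last one.
LOOKAROUND = re.compile(r'(?<!")(\b\S+\b)(?!")')

def words_not_touching_quotes(text):
    """Return words that are not directly adjacent to a double quote.

    Hypothetical helper for illustration only.
    """
    return [m.group(1) for m in LOOKAROUND.finditer(text)]

print(words_not_touching_quotes('dbo.test line_length "quoted words" notquoted'))
# prints ['dbo.test', 'line_length', 'notquoted']
```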

This may have a problem when a quoted phrase contains more than two words, because the words in the middle touch neither quotation mark and will still match. Still, this should get you on the right track, and you can indicate whether one of these methods works better for you than the other.
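To make the multi-word caveat concrete, the following sketch runs both patterns on an input whose quoted phrase has three words. It is written in Python for illustration; the function names and the test string are made up for this example.

```python
import re

ALTERNATION = re.compile(r'".*?"|(\b\S+\b)')
LOOKAROUND = re.compile(r'(?<!")(\b\S+\b)(?!")')

def via_alternation(text):
    # Keep only matches where the capture group took part.
    return [m.group(1) for m in ALTERNATION.finditer(text) if m.group(1) is not None]

def via_lookaround(text):
    # Every match is a word with no adjacent double quote.
    return [m.group(1) for m in LOOKAROUND.finditer(text)]

sample = 'a "one two three" b'  # made-up input with a three-word quoted phrase

print(via_alternation(sample))  # prints ['a', 'b']
print(via_lookaround(sample))   # prints ['a', 'two', 'b']
```

The middle word two slips through the look-around version because neither of its neighbours is a quotation mark, while the alternation version consumes the whole quoted phrase before it can look for words inside it.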


08-14 21:50