Java regular expressions (java.util.regex): searching for a dollar sign

Problem description

I have a search string. When it contains a dollar symbol, I want to capture all characters after it, up to but not including a dot or a subsequent dollar symbol. The latter would start a subsequent match. So for either of these search strings:

"/bla/$V_N.$XYZ.bla";
"/bla/$V_N.$XYZ";

I want to return:

  • V_N
  • XYZ

If the search string contains percent symbols, I also want to return what is between each pair of % symbols.

The following regex seems to do the trick for that:

 "%([^%]*?)%";

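Breaking it down:

  • It starts and ends with %.
  • It has a capture group: ()
  • It has a character class that matches any character except % (the caret means "not").
  • The repetition is non-greedy: *?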
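The breakdown above can be tried out with a small, self-contained sketch. The class name PercentDemo, the helper betweenPercents and the sample input are hypothetical and only serve the illustration; the pattern itself is the one quoted in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex comes from the question.
class PercentDemo {
    // Returns the text found between each pair of % symbols.
    static List<String> betweenPercents(String input) {
        Matcher m = Pattern.compile("%([^%]*?)%").matcher(input);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group(1)); // group 1 is the part inside the percent signs
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample input made up for this sketch.
        System.out.println(betweenPercents("/bla/%V_N%/%XYZ%.bla")); // prints [V_N, XYZ]
    }
}
```

Because the closing % is part of the pattern, the non-greedy *? and a greedy * give the same result here: the class [^%] cannot run past a percent sign either way.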

Where some languages use %1, %2 to refer to capture groups, Java uses the backslash-number syntax (\1, \2) instead, so this string compiles and generates output.

I suspect the dollar symbol and the dot need escaping, as they are special symbols:

  • $ usually means end of string.
  • . is a metacharacter that matches any character.
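A quick way to check this suspicion is to compare where the escaped and unescaped forms first match. The sketch below is only an illustration; EscapeDemo and firstMatchStart are hypothetical names, and the sample inputs are made up. It also shows that inside a character class both characters are taken literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for checking how $ and . behave in java.util.regex.
class EscapeDemo {
    // Returns the start index of the first match, or -1 if there is none.
    static int firstMatchStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(firstMatchStart("$", "a$b"));     // 3: unescaped $ matches at the end of input
        System.out.println(firstMatchStart("\\$", "a$b"));   // 1: escaped $ matches the literal dollar sign
        System.out.println(firstMatchStart(".", "a.b"));     // 0: unescaped . matches any character
        System.out.println(firstMatchStart("\\.", "a.b"));   // 1: escaped . matches the literal dot
        System.out.println(firstMatchStart("[.$]", "ab.c")); // 2: inside [...] both are literal
    }
}
```

The doubled backslash is needed because the Java string literal "\\$" produces the two-character regex \$.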

I have tried escaping them with double backslashes:

  • in character classes, e.g. [^\\.\\$%]
  • and with alternation, e.g. %|\\$

but in my attempts to combine this logic I can't seem to get anything to play ball.

I wonder whether another pair of eyes can see how to solve this conundrum!

My attempt so far:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Main {
    public static void main(String[] args) {
        String search = "/bla/$V_N.$XYZ.bla";
        String pattern = "([%\\$])([^%\\.\\$]*?)\\1?";
        /* Either % or $ in the first capture group: ([%\\$])
         * Second capture group: anything except %, dot or dollar sign
         * Non-greedy repetition: *?
         * Then an optional backreference to the first capture group: \\1?
         * Two backslashes are needed, since \ must itself be escaped in a Java string.
         */
        Pattern r = Pattern.compile(pattern);
        Matcher m = r.matcher(search);
        List<String> results = new ArrayList<String>();
        while (m.find()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                results.add(m.group(i));
            }
        }
        for (String result : results) {
            System.out.println(result);
        }
    }
}

The following links may be helpful:

  • An interactive Java playground where you can experiment and copy/paste code.
  • Regex101
  • Java RegexTester
  • Java backreferences (The optional backreference \\1 in the Regex).
  • Link that summarises Regex special characters often found in languages
  • Java Regex book EPub link
  • Regex Info Website
  • Matcher class in the Javadocs

Recommended answer

You can use

String search = "/bla/$V_N.$XYZ.bla";
String pattern = "[%$]([^%.$]*)";
Matcher matcher = Pattern.compile(pattern).matcher(search);
while (matcher.find()){
    System.out.println(matcher.group(1));
} // => V_N, XYZ

See the Java demo and the regex demo.
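For readers who want to run the answer as a complete program, here is a minimal sketch that wraps the same pattern in a helper and applies it to both search strings from the question. The names DollarNames and extract are hypothetical. Skipping empty captures is an addition of this sketch, not part of the answer: without it, the closing % of a %...% pair would add an empty string to the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; the regex is the one from the answer.
class DollarNames {
    private static final Pattern NAME = Pattern.compile("[%$]([^%.$]*)");

    // Collects the non-empty group 1 value of every match.
    static List<String> extract(String search) {
        List<String> names = new ArrayList<>();
        Matcher matcher = NAME.matcher(search);
        while (matcher.find()) {
            String name = matcher.group(1);
            // Assumption of this sketch: empty captures (e.g. from a closing %) are not wanted.
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(extract("/bla/$V_N.$XYZ.bla")); // prints [V_N, XYZ]
        System.out.println(extract("/bla/$V_N.$XYZ"));     // prints [V_N, XYZ]
        System.out.println(extract("/bla/%V_N%.bla"));     // prints [V_N]
    }
}
```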

Notes

  • You do not need the optional \1? at the end of the pattern. As it is optional, it does not restrict the match context, and it is redundant (the negated character class can match neither $ nor %).
  • [%$]([^%.$]*) matches % or $, then captures into Group 1 zero or more characters other than %, . and $. You only need the Group 1 value, hence matcher.group(1) is used.
  • In a character class, neither . nor $ is special, so they do not need escaping in [%.$] or [%$].
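Two further points can be checked with a short experiment. The first is not spelled out in the notes: the answer's pattern uses a greedy *, whereas the question's attempt used the non-greedy *?. With nothing required after the group, a non-greedy quantifier is satisfied by zero characters, so the capture comes back empty. The second is that the escaped and unescaped character classes behave the same. QuantifierDemo and groupOne are hypothetical names used only in this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing variants of the answer's pattern.
class QuantifierDemo {
    // Collects the group 1 value of every match of regex in input.
    static List<String> groupOne(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> values = new ArrayList<>();
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String search = "/bla/$V_N.$XYZ.bla";
        // Non-greedy: each match stops right after the $, so group 1 is empty both times.
        System.out.println(groupOne("[%$]([^%.$]*?)", search));    // prints [, ]
        // Greedy: group 1 runs up to the next dot, dollar or percent sign.
        System.out.println(groupOne("[%$]([^%.$]*)", search));     // prints [V_N, XYZ]
        // Escaping inside the character classes is harmless but changes nothing.
        System.out.println(groupOne("[%\\$]([^%\\.\\$]*)", search)); // prints [V_N, XYZ]
    }
}
```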

