Why doesn't finite repetition in lookbehind work in some flavors?

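Question

I want to parse the two digits in the middle of a date in dd/mm/yy format, while also allowing single digits for the day and month.

This is what I came up with:

(?<=^[\d]{1,2}\/)[\d]{1,2}

I want a 1- or 2-digit number, [\d]{1,2}, preceded by a 1- or 2-digit number and a slash, ^[\d]{1,2}\/.

This doesn't work on many combinations; I have tested 10/10/10, 11/12/13, etc.

But to my surprise, (?<=^\d\d\/)[\d]{1,2} worked.

But [\d]{1,2} should also match wherever \d\d does, or am I wrong?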

Answer

On lookbehind support

Major regex flavors support lookbehind to different degrees; some impose restrictions, and some don't support it at all.

JavaScript: not supported before ECMAScript 2018
Python: fixed length only
Java: finite length only
.NET: no restriction
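
As a rough illustration of what "fixed length only" means in practice, the sketch below probes Python's built-in re module with three lookbehinds of increasing flexibility. The helper name compiles and the labels are hypothetical, introduced only for this example; the other flavors in the list are not exercised here.

```python
import re

# Hypothetical helper (not part of the original answer): report whether
# Python's built-in re module accepts a given pattern.
def compiles(pattern):
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Candidate lookbehinds, from strictly fixed width to unbounded.
candidates = {
    "fixed width": r"(?<=^\d\d/)\d{1,2}",
    "bounded repetition": r"(?<=^\d{1,2}/)\d{1,2}",
    "unbounded repetition": r"(?<=^\d+/)\d{1,2}",
}

for label, pattern in candidates.items():
    print(label, "->", "accepted" if compiles(pattern) else "rejected")
```

Only the first pattern should be accepted by re; the bounded form is the kind Java allows, and the unbounded form is the kind .NET allows.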

References

regular-expressions.info/Flavor comparison

Python

In Python, where only fixed length lookbehind is supported, your original pattern raises an error because \d{1,2} obviously does not have a fixed length. You can "fix" this by alternating on two different fixed-length lookbehinds, e.g. something like this:

(?<=^\d\/)\d{1,2}|(?<=^\d\d\/)\d{1,2}

Or perhaps you can put both lookbehinds as alternates of a non-capturing group:

(?:(?<=^\d\/)|(?<=^\d\d\/))\d{1,2}

(note that you can just use \d without the brackets).
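
A quick way to convince yourself that the two workarounds are interchangeable is to run both over a few inputs. This is a minimal sketch; the sample dates and variable names are illustrative assumptions, not part of the original answer.

```python
import re

# The two workarounds from the answer: top-level alternation, and
# lookbehinds alternated inside a non-capturing group.
top_level = re.compile(r'(?<=^\d\/)\d{1,2}|(?<=^\d\d\/)\d{1,2}')
grouped = re.compile(r'(?:(?<=^\d\/)|(?<=^\d\d\/))\d{1,2}')

# Sample dates are illustrative only.
samples = ["12/34/56", "1/23/45", "10/10/10", "1/2/03"]

for date in samples:
    a = top_level.findall(date)
    b = grouped.findall(date)
    assert a == b  # both patterns pick out the same middle field
    print(date, "->", a)
```

Each lookbehind on its own has a fixed width (two or three characters), which is why re accepts both forms.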

That said, it's probably much simpler to use a capturing group instead:

^\d{1,2}\/(\d{1,2})

Note that findall returns what group 1 captures if the pattern has exactly one group. Capturing groups are more widely supported than lookbehind, and often lead to a more readable pattern (as in this case).

This snippet illustrates all of the above points:

import re

p = re.compile(r'(?:(?<=^\d\/)|(?<=^\d\d\/))\d{1,2}')

print(p.findall("12/34/56"))   # ['34']
print(p.findall("1/23/45"))    # ['23']

p = re.compile(r'^\d{1,2}\/(\d{1,2})')

print(p.findall("12/34/56"))   # ['34']
print(p.findall("1/23/45"))    # ['23']

try:
    # raises re.error: look-behind requires fixed-width pattern
    p = re.compile(r'(?<=^\d{1,2}\/)\d{1,2}')
except re.error as e:
    print("rejected:", e)
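
The remark about findall and group 1 is easy to trip over, so here is a small sketch of how the return value changes with the number of capturing groups. The variable names are illustrative only.

```python
import re

date = "12/34/56"

# No group: findall returns the whole match.
print(re.findall(r'^\d{1,2}/\d{1,2}', date))        # ['12/34']

# One group: findall returns only what that group captured.
print(re.findall(r'^\d{1,2}/(\d{1,2})', date))      # ['34']

# Two groups: findall returns one tuple per match.
print(re.findall(r'^(\d{1,2})/(\d{1,2})', date))    # [('12', '34')]

# re.search exposes both views at once via group numbers.
m = re.search(r'^\d{1,2}/(\d{1,2})', date)
print(m.group(0), m.group(1))                       # 12/34 34
```

If you need both the full match and the captured field, re.search or re.finditer is usually more convenient than findall.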

References

regular-expressions.info/Lookarounds, Character classes, Alternation, Capturing groups

Java

Java supports only finite-length lookbehind, so you can use \d{1,2} like in the original pattern. This is demonstrated by the following snippet:

    String text =
        "12/34/56 date\n" +
        "1/23/45 another date\n";

    Pattern p = Pattern.compile("(?m)(?<=^\\d{1,2}/)\\d{1,2}");
    Matcher m = p.matcher(text);
    while (m.find()) {
        System.out.println(m.group());
    } // "34", "23"

Note that (?m) is the embedded form of the Pattern.MULTILINE flag, so that ^ matches at the start of every line. Note also that since \ is an escape character in string literals, you must write "\\" to get one backslash in Java.
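
For comparison, Python's re module rejects the bounded lookbehind used here, so a rough Python counterpart has to fall back on the capturing group. A minimal sketch, assuming the same two-line sample text as the Java snippet; the name months is illustrative.

```python
import re

# Same sample text as the Java snippet above.
text = (
    "12/34/56 date\n"
    "1/23/45 another date\n"
)

# (?m) at the start of the pattern is the inline form of re.MULTILINE,
# so ^ matches at the start of every line.
p = re.compile(r'(?m)^\d{1,2}/(\d{1,2})')

months = p.findall(text)
print(months)  # ['34', '23']
```

A raw string (the r prefix) plays the same role as doubling the backslashes in Java.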

C-Sharp

C# supports the full regex syntax inside lookbehind. The following snippet shows how you can use + repetition in a lookbehind:

var text = @"
1/23/45
12/34/56
123/45/67
1234/56/78
";

Regex r = new Regex(@"(?m)(?<=^\d+/)\d{1,2}");
foreach (Match m in r.Matches(text)) {
  Console.WriteLine(m);
} // "23", "34", "45", "56"

Note that unlike Java, in C# you can use an @-quoted (verbatim) string so that you don't have to escape \.

For completeness, here's how you'd use the capturing group option in C#:

Regex r = new Regex(@"(?m)^\d+/(\d{1,2})");
foreach (Match m in r.Matches(text)) {
  Console.WriteLine("Matched [" + m + "]; month = " + m.Groups[1]);
}

Given the previous text, this prints:

Matched [1/23]; month = 23
Matched [12/34]; month = 34
Matched [123/45]; month = 45
Matched [1234/56]; month = 56
