Problem description
I found a brilliant regex to split a camelCase or TitleCase expression into its parts.
(?<!^)(?=[A-Z])
It works as expected:
- value -> value
- camelValue -> camel / Value
- TitleValue -> Title / Value
For example, in Java:
String s = "loremIpsum";
String[] words = s.split("(?<!^)(?=[A-Z])");
// words now equals new String[]{"lorem", "Ipsum"}
My problem is that it does not work in some cases:
- Case 1: VALUE -> V / A / L / U / E
- Case 2: eclipseRCPExt -> eclipse / R / C / P / Ext
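The two failing cases can be reproduced with a minimal sketch like the one below. The class name NaiveSplitDemo and the helper naiveSplit are made up for illustration; the regex is the one quoted above.

```java
public class NaiveSplitDemo {
    // Hypothetical helper: applies the regex from the question, which splits
    // before every uppercase letter except at the start of the string.
    public static String[] naiveSplit(String s) {
        return s.split("(?<!^)(?=[A-Z])");
    }

    public static void main(String[] args) {
        // Every letter of an all-caps run becomes its own token.
        System.out.println(String.join(" / ", naiveSplit("VALUE")));          // V / A / L / U / E
        System.out.println(String.join(" / ", naiveSplit("eclipseRCPExt"))); // eclipse / R / C / P / Ext
    }
}
```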
To my mind, the result should be:
- Case 1: VALUE
- Case 2: eclipse / RCP / Ext
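In other words, given a run of n uppercase chars:
- If the n chars are followed by lowercase chars, the groups should be: (n-1 chars) / (n-th char + lowercase chars).
- If the n chars are at the end of the string, the group should be: (n chars).
Any idea on how to improve this regex?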
Recommended answer
The following regex works for all of the above examples:
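// SplitDemo is only a wrapper class name so the snippet compiles on its own.
public class SplitDemo
{
    public static void main(String[] args)
    {
        // Prints "camel" and "Value" on separate lines.
        for (String w : "camelValue".split("(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])")) {
            System.out.println(w);
        }
    }
}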
It works by making the negative lookbehind reject not only a match at the start of the string, but also any match where the capital letter is preceded by another capital letter. This handles cases like "VALUE".
The first part of the regex on its own fails on "eclipseRCPExt" because it does not split between "RCP" and "Ext". That is the purpose of the second clause: (?<!^)(?=[A-Z][a-z]). This clause allows a split before every capital letter that is followed by a lowercase letter, except at the start of the string.
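To check both clauses together against every example from the question, a small sketch along these lines may help. The class name CamelCaseSplitter, the constant BOUNDARY and the split helper are illustrative names, not part of the original answer; only the regex itself comes from it.

```java
import java.util.regex.Pattern;

public class CamelCaseSplitter {
    // Clause 1: split before an uppercase letter unless it is at the start
    //           of the string or directly follows another uppercase letter.
    // Clause 2: split before an uppercase letter that is followed by a
    //           lowercase letter, except at the start of the string.
    private static final Pattern BOUNDARY =
            Pattern.compile("(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])");

    // Hypothetical helper that splits the input at the zero-width boundaries.
    public static String[] split(String s) {
        return BOUNDARY.split(s);
    }

    public static void main(String[] args) {
        String[] samples = {"value", "camelValue", "TitleValue", "VALUE", "eclipseRCPExt"};
        for (String s : samples) {
            System.out.println(s + " -> " + String.join(" / ", split(s)));
        }
    }
}
```

Like the regex in the answer, this only considers the ASCII ranges A-Z and a-z, so digits and non-ASCII letters never start a new word.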