// Fragment of a class; requires java.util.regex.Matcher and java.util.regex.Pattern.
public static final String POEM =
    "Twas brillig, and the slithy toves\n" +
    "Did gyre and gimble in the wabe.\n" +
    "All mimsy were the borogoves,\n" +
    "And the mome raths outgrabe.\n\n" +
    "Beware the Jabberwock, my son,\n" +
    "The jaws that bite, the claws that catch.\n" +
    "Beware the Jubjub bird, and shun\n" +
    "The frumious Bandersnatch.";
public static void main(String[] args) {
    // Capture the last three words of every line: group 1 is the first word,
    // group 2 is the last two words, groups 3 and 4 are those two words separately.
    Matcher m = Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$")
        .matcher(POEM);
    while (m.find()) {
        // Print group 0 (the whole match) followed by groups 1..groupCount().
        for (int j = 0; j <= m.groupCount(); j++) {
            System.out.print("[" + m.group(j) + "]");
        }
        System.out.println();
    }
}
output:
[the slithy toves][the][slithy toves][slithy][toves]
[in the wabe.][in][the wabe.][the][wabe.]
[were the borogoves,][were][the borogoves,][the][borogoves,]
[mome raths outgrabe.][mome][raths outgrabe.][raths][outgrabe.]
[Jabberwock, my son,][Jabberwock,][my son,][my][son,]
[claws that catch.][claws][that catch.][that][catch.]
[bird, and shun][bird,][and shun][and][shun]
[The frumious Bandersnatch.][The][frumious Bandersnatch.][frumious][Bandersnatch.]
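Explanation:
m.groupCount(): the number of capturing groups in the matcher's pattern, not counting group 0.
m.group(j): the text matched by group j. group(0) is the entire match.
(?m): multiline mode, in which ^ and $ match at the start and end of each line instead of only at the start and end of the whole input.
\S: a non-whitespace character
\s: a whitespace character, equivalent to [ \t\n\x0B\f\r]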
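The effect of (?m) on $ and the meaning of groupCount() can be checked on a much smaller input. The following is a minimal sketch, not part of the original example: the class name MultilineGroupsDemo, the helper allMatches, and the two-line test string are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original notes.
class MultilineGroupsDemo {
    // Collects group(0) of every match of regex found in input.
    static List<String> allMatches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "one two\nthree four";
        // Without (?m), $ matches only at the end of the input.
        System.out.println(allMatches("(\\S+)$", text));     // [four]
        // With (?m), $ also matches just before each line terminator.
        System.out.println(allMatches("(?m)(\\S+)$", text)); // [two, four]
        // groupCount() depends only on the pattern and excludes group 0.
        Matcher m = Pattern.compile("(\\S+)\\s+(\\S+)").matcher(text);
        System.out.println(m.groupCount());                  // 2
    }
}
```

This is why the loop in the example above runs with j <= m.groupCount(): index 0 is the whole match, and indices 1 through groupCount() are the parenthesized groups.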
2. Matcher.find() vs. lookingAt() vs. matches()
package com.westward;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Demo31 {
    public static String input =
        "As long as there is injustice, whenever a\n" +
        "Targathian baby cries out.wherever a distress\n" +
        "signal sounds among the stars ... We'll be there.\n" +
        "This fine ship, and this fine crew ...\n" +
        "Never give up! Never surrender!";

    // Prints the regex once, before the first message reported for it.
    private static class Display {
        private boolean regexPrinted = false;
        private String regex;
        Display(String regex) {
            this.regex = regex;
        }
        void display(String message) {
            if (!regexPrinted) {
                System.out.println(regex);
                regexPrinted = true;
            }
            System.out.println(message);
        }
    }

    // Tries find(), lookingAt() and matches() with the same regex on s.
    static void examine(String s, String regex) {
        Display d = new Display(regex);
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(s);
        while (m.find()) {
            d.display("find() '" + m.group() +
                "' start= " + m.start() + " end= " + m.end());
        }
        if (m.lookingAt()) {
            d.display("lookingAt() '" + m.group() +
                "' start= " + m.start() + " end= " + m.end());
        }
        if (m.matches()) {
            d.display("matches() '" + m.group() +
                "' start= " + m.start() + " end= " + m.end());
        }
    }

    public static void main(String[] args) {
        // Examine each line of the input separately.
        for (String in : input.split("\n")) {
            System.out.println("input :" + in);
            for (String regex : new String[]{"\\w*ere\\w*",
                    "\\w*ever", "T\\w+", "Never.*?!"}) {
                examine(in, regex);
            }
        }
    }
}
output:
input :As long as there is injustice, whenever a
\w*ere\w*
find() 'there' start= 11 end= 16
\w*ever
find() 'whenever' start= 31 end= 39
input :Targathian baby cries out.wherever a distress
\w*ere\w*
find() 'wherever' start= 26 end= 34
\w*ever
find() 'wherever' start= 26 end= 34
T\w+
find() 'Targathian' start= 0 end= 10
lookingAt() 'Targathian' start= 0 end= 10
input :signal sounds among the stars ... We'll be there.
\w*ere\w*
find() 'there' start= 43 end= 48
input :This fine ship, and this fine crew ...
T\w+
find() 'This' start= 0 end= 4
lookingAt() 'This' start= 0 end= 4
input :Never give up! Never surrender!
\w*ever
find() 'Never' start= 0 end= 5
find() 'Never' start= 15 end= 20
lookingAt() 'Never' start= 0 end= 5
Never.*?!
find() 'Never give up!' start= 0 end= 14
find() 'Never surrender!' start= 15 end= 31
lookingAt() 'Never give up!' start= 0 end= 14
matches() 'Never give up! Never surrender!' start= 0 end= 31
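Summary:
Matcher.find(): finds the next match anywhere in the input. Each call resumes after the previous match.
Matcher.lookingAt(): succeeds only if a match starts at the beginning of the input. The match does not have to cover the whole input.
Matcher.matches(): succeeds only if the entire input matches the pattern. String.matches() ultimately calls it.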
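The three operations can be compared side by side on a single short string. This is a minimal sketch rather than the original author's code: the class MatchModesDemo and the helper modes are hypothetical names introduced only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original notes.
class MatchModesDemo {
    // Reports which of the three operations succeed for regex on input.
    static String modes(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean found = m.find();        // a match anywhere in the input
        boolean prefix = m.lookingAt();  // a match starting at the beginning
        boolean whole = m.matches();     // a match covering the entire input
        return "find=" + found + " lookingAt=" + prefix + " matches=" + whole;
    }

    public static void main(String[] args) {
        System.out.println(modes("Never", "Never surrender!"));
        // find=true lookingAt=true matches=false
        System.out.println(modes("surrender", "Never surrender!"));
        // find=true lookingAt=false matches=false
        System.out.println(modes("Never.*!", "Never surrender!"));
        // find=true lookingAt=true matches=true

        // String.matches performs the same whole-input check.
        System.out.println("Never surrender!".matches("Never.*!")); // true
    }
}
```

Both lookingAt() and matches() start from the beginning of the input no matter how many find() calls came before, which is why examine() can call them on the same Matcher after the find() loop.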
3. Pattern flags (constants defined on the Pattern class)
public static void main(String[] args) {
    // Match "java" at the start of any line, ignoring case.
    Pattern p = Pattern.compile("^java", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
    Matcher m = p.matcher(
        "java has regex\nJava has regex\n" +
        "JAVA has pretty good regular expressions\n" +
        "Regular expressions are in Java");
    while (m.find()) {
        // m.group() with no argument is equivalent to m.group(0).
        System.out.println(m.group(0));
    }
}
output:
java
Java
JAVA
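Summary: multiple Pattern flags can be combined with the bitwise OR operator (|).
Pattern.CASE_INSENSITIVE (?i): case-insensitive matching
Pattern.MULTILINE (?m): multiline mode, in which ^ and $ match at the start and end of each line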
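The embedded forms (?i) and (?m) listed above can replace the compile-time constants, and they can be merged into one group such as (?im). The sketch below compares the two spellings. The class FlagsDemo, the helper findAll, and the sample text are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original notes.
class FlagsDemo {
    // Collects every match of the compiled pattern in input.
    static List<String> findAll(Pattern p, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "java one\nJava two\nthree JAVA";
        // Flags passed as constants, combined with bitwise OR.
        Pattern byFlags = Pattern.compile("^java",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        // The same flags embedded at the start of the expression.
        Pattern inline = Pattern.compile("(?im)^java");
        // "JAVA" on the last line is not at a line start, so it is skipped.
        System.out.println(findAll(byFlags, text)); // [java, Java]
        System.out.println(findAll(inline, text));  // [java, Java]
    }
}
```

The embedded form is convenient when the regex is the only thing that can be supplied, for example as an argument to String.matches() or String.split().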