Java added support for regular expressions in JDK 1.4.
Java's regex tooling lives mainly in the java.util.regex package, whose two main classes are Pattern and Matcher.
Pattern
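Declaration: public final class Pattern implements java.io.Serializable
The Pattern class is declared final, so it cannot be subclassed.
Meaning: the pattern class, a compiled representation of a regular expression.
Note: instances of this class are immutable and safe for use by multiple concurrent threads.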
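To make the thread-safety note concrete, here is a minimal sketch that compiles one Pattern and shares it across several worker threads. The class name SharedPatternDemo, the method matchAll, and the regex are made up for illustration. Each task creates its own Matcher, because only Pattern (not Matcher) is safe to share between threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the JDK.
public class SharedPatternDemo {
    // Compiled once; safe to share because Pattern instances are immutable.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Checks every input on a worker thread and returns the results in order.
    static List<Boolean> matchAll(List<String> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String input : inputs) {
                // matcher() creates a fresh Matcher for each task.
                Callable<Boolean> task = () -> DIGITS.matcher(input).matches();
                futures.add(pool.submit(task));
            }
            List<Boolean> results = new ArrayList<>();
            for (Future<Boolean> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints [true, false, true]
        System.out.println(matchAll(Arrays.asList("123", "abc", "42")));
    }
}
```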
Fields:
/** Enables Unix lines mode. */
public static final int UNIX_LINES = 0x01;

/** Enables case-insensitive matching. */
public static final int CASE_INSENSITIVE = 0x02;

/** Permits whitespace and comments in the pattern. */
public static final int COMMENTS = 0x04;

/** Enables multiline mode. */
public static final int MULTILINE = 0x08;

/** Enables literal parsing of the pattern. */
public static final int LITERAL = 0x10;

/** Enables dotall mode. */
public static final int DOTALL = 0x20;

/** Enables Unicode-aware case folding. */
public static final int UNICODE_CASE = 0x40;

/** Enables canonical equivalence. */
public static final int CANON_EQ = 0x80;
private static final long serialVersionUID = 5073258162644648461L;

/**
 * The original regular-expression pattern string.
 */
private String pattern;

/**
 * The original pattern flags.
 */
private int flags;

/**
 * Boolean indicating this Pattern is compiled; this is necessary in order
 * to lazily compile deserialized Patterns.
 */
private transient volatile boolean compiled = false;

/**
 * The normalized pattern string.
 */
private transient String normalizedPattern;

/**
 * The starting point of state machine for the find operation. This allows
 * a match to start anywhere in the input.
 */
transient Node root;

/**
 * The root of object tree for a match operation. The pattern is matched
 * at the beginning. This may include a find that uses BnM or a First
 * node.
 */
transient Node matchRoot;

/**
 * Temporary storage used by parsing pattern slice.
 */
transient int[] buffer;

/**
 * Temporary storage used while parsing group references.
 */
transient GroupHead[] groupNodes;

/**
 * Temporary null terminated code point array used by pattern compiling.
 */
private transient int[] temp;

/**
 * The number of capturing groups in this Pattern. Used by matchers to
 * allocate storage needed to perform a match.
 */
transient int capturingGroupCount;

/**
 * The local variable count used by parsing tree. Used by matchers to
 * allocate storage needed to perform a match.
 */
transient int localCount;

/**
 * Index into the pattern string that keeps track of how much has been
 * parsed.
 */
private transient int cursor;

/**
 * Holds the length of the pattern string.
 */
private transient int patternLength;
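The public flag constants at the top of the field list are single-bit masks (0x01, 0x02, 0x04, ...), so several of them can be combined with bitwise OR. The sketch below illustrates this, assuming a made-up class FlagDemo and a made-up regex; only Pattern.compile, Pattern.flags, and Matcher.find are real JDK calls.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the JDK.
public class FlagDemo {
    // Each flag occupies its own bit, so OR-ing them enables both at once.
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;

    // MULTILINE lets ^ and $ match at line boundaries;
    // CASE_INSENSITIVE makes "hello" match "HELLO" as well.
    private static final Pattern HELLO_LINE = Pattern.compile("^hello$", FLAGS);

    // Returns true if any single line of the text is "hello" in any letter case.
    static boolean findsHello(String text) {
        return HELLO_LINE.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(findsHello("first line\nHELLO\nlast line")); // prints true
        System.out.println(findsHello("say hello there"));              // prints false

        // flags() exposes the bit mask, so one flag can be tested with bitwise AND.
        System.out.println((HELLO_LINE.flags() & Pattern.CASE_INSENSITIVE) != 0); // prints true
    }
}
```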
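Groups and capturing
Capturing groups are numbered by counting their opening parentheses from left to right.
The expression ((A)(B(C))) contains four such groups; for the input ABC they capture: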
1 | ABC |
2 | A |
3 | BC |
4 | C |
Group zero always stands for the entire expression.
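The numbering can be checked with a short program. This is only a sketch: the class name GroupDemo and the helper groups are made up for illustration, while Matcher.groupCount and Matcher.group(int) are the standard JDK methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the JDK.
public class GroupDemo {
    // Returns the text of group 0 (the whole match) followed by each
    // capturing group, or an empty list if the input does not match.
    static List<String> groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> captured = new ArrayList<>();
        if (m.matches()) {
            // groupCount() does not include group 0, hence the <= bound.
            for (int i = 0; i <= m.groupCount(); i++) {
                captured.add(m.group(i));
            }
        }
        return captured;
    }

    public static void main(String[] args) {
        // prints [ABC, ABC, A, BC, C]
        System.out.println(groups("((A)(B(C)))", "ABC"));
    }
}
```

The first element is group zero; the remaining four line up with groups 1 to 4 in the table.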
Constructor:
private Pattern(String p, int f) {
    pattern = p;
    flags = f;

    // Reset group index count
    capturingGroupCount = 1;
    localCount = 0;

    if (pattern.length() > 0) {
        compile();
    } else {
        root = new Start(lastAccept);
        matchRoot = lastAccept;
    }
}
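The constructor is private, so a Pattern object cannot be created with new.
How, then, do we obtain an instance of the Pattern class?
Looking through the class's methods, we find: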
public static Pattern compile(String regex) {
    return new Pattern(regex, 0);
}

public static Pattern compile(String regex, int flags) {
    return new Pattern(regex, flags);
}
So a Pattern instance is obtained by calling the static method compile on the Pattern class.
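As a usage sketch, both overloads can be called as follows. The class name CompileDemo, its helper methods, and the regex are made up for illustration; compile, pattern, and matcher are the real JDK methods.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the JDK.
public class CompileDemo {
    // One-argument overload: equivalent to passing 0 as the flags.
    private static final Pattern LOWERCASE = Pattern.compile("[a-z]+");

    // Two-argument overload: the same regex with a match flag.
    private static final Pattern ANY_CASE =
            Pattern.compile("[a-z]+", Pattern.CASE_INSENSITIVE);

    static boolean matchesLowercase(String s) {
        return LOWERCASE.matcher(s).matches();
    }

    static boolean matchesAnyCase(String s) {
        return ANY_CASE.matcher(s).matches();
    }

    public static void main(String[] args) {
        // new Pattern("[a-z]+", 0) would not compile here: the constructor is private.
        System.out.println(matchesLowercase("Java")); // prints false
        System.out.println(matchesAnyCase("Java"));   // prints true
        System.out.println(LOWERCASE.pattern());      // prints [a-z]+
    }
}
```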