java - Java Regex: match sections that start with x and end with y

I'm trying to use the scanner.next(Pattern p) method to pick out sections of a large text file that start with jim and end with bob. For example:

hello hello jimbob jimhellohellobob hellojim hellobob

Calling next() three times on this would return "jimbob", "jimhellohellobob" and "jim hellobob",

but preferably not "jimbob jimhellohellobob hellojim hellobob". In other words, the word "jim" should be excluded from the text allowed between the start and the end.

I'm quite new to regex, let alone Java regex, so I haven't had much luck. This is where I am at the moment:

    String test = "hello hello jimbob jimhellohellobob hellojim hellobob ";

    Pattern p = Pattern.compile(".*jim.*bob.*");
    Scanner s = new Scanner(test);
    String temp;

    while (s.hasNext(p)) {
        temp = s.next(p);
        System.out.println(temp);
    }

This doesn't print anything. Any ideas where I'm going wrong?

Best answer

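You are using the wrong class. To find all occurrences of a regex, you need to use Matcher and its find method. Scanner.hasNext(Pattern) only tests whether the next whitespace-delimited token matches the pattern as a whole; the first token here is hello, which does not match, so the loop body never runs. Also, because of the .* at the start and end, your current regex accepts any string that contains jim followed later by bob. In addition, .* is greedy, so for data like hello jimbob hello bob the pattern jim.*bob would match jimbob hello bob instead of just the jimbob part. To make .* reluctant, add ? after it, as in .*?.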
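The difference between greedy and reluctant matching is easiest to see side by side. The following is a minimal sketch, not part of the original answer; the class name GreedyVsReluctant and the helper firstMatch are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Hypothetical helper: returns the first match of regex in input,
    // or null if the regex does not occur anywhere in input.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "hello jimbob hello bob";
        // Greedy: .* grabs as much as possible, then backtracks to the last "bob".
        System.out.println(firstMatch("jim.*bob", input));
        // Reluctant: .*? grabs as little as possible, stopping at the first "bob".
        System.out.println(firstMatch("jim.*?bob", input));
    }
}
```

Run as written, the greedy pattern prints jimbob hello bob and the reluctant one prints jimbob.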

So your code should look more like

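import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Depending on what you want, you may also need to add a word boundary (`\\b`).
Pattern p = Pattern.compile("jim.*?bob");
Matcher m = p.matcher(yourText); // yourText holds the text to search
while (m.find()) {               // advance to the next match, if any
    System.out.println(m.group());
}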
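For completeness, here is a self-contained sketch that runs the reluctant pattern against the sample string from the question. The class JimBobFinder, the helper findAll, and the stricter NO_INNER_JIM pattern are assumptions added for illustration, not something the answer proposes. The stricter variant uses a negative lookahead so that a match can never contain a second jim, which is one possible reading of the asker's wish to exclude "jim" between the start and the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JimBobFinder {
    // The reluctant pattern from the answer.
    public static final Pattern RELUCTANT = Pattern.compile("jim.*?bob");

    // Assumed stricter variant: (?!jim) refuses to consume a character
    // at any position where another "jim" begins.
    public static final Pattern NO_INNER_JIM = Pattern.compile("jim(?:(?!jim).)*?bob");

    // Hypothetical helper: collects every non-overlapping match of p in text, in order.
    public static List<String> findAll(Pattern p, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String test = "hello hello jimbob jimhellohellobob hellojim hellobob ";
        for (String match : findAll(RELUCTANT, test)) {
            System.out.println(match);
        }
        // The two patterns differ only when "jim" repeats before the next "bob".
        System.out.println(findAll(RELUCTANT, "jim jim bob"));
        System.out.println(findAll(NO_INNER_JIM, "jim jim bob"));
    }
}
```

On the question's input both patterns find the same three sections. Note that . does not match line terminators by default, so matching sections that span several lines of a file would additionally need Pattern.DOTALL.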