java - How to find a whole word in a String in Java

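I have a String that I have to parse for different keywords.
For example, I have the String:

"I will come and meet you at the 123woods"

And my keywords are

'123woods'
'woods'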

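I should report whenever I have a match, and where. Multiple occurrences should also be accounted for. However, for this String I should get a match only on '123woods', not on 'woods'. This rules out String.contains(), which matches substrings rather than whole words. I should also be able to take a list or set of keywords and check for all of them at the same time. In this example, if I have '123woods' and 'come', I should get two occurrences. The method should run reasonably fast on large texts.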

My idea is to use StringTokenizer, but I am not sure it will perform well. Any suggestions?

Best answer

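The example below is based on your comments. It takes a list of keywords and searches for each of them in the given String, using word boundaries (\b) so that only whole words match. It uses StringUtils from Apache Commons Lang to build the regular expression, and prints the matched groups.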

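import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
// Commons Lang 3; in Commons Lang 2.x the package is org.apache.commons.lang
import org.apache.commons.lang3.StringUtils;

String text = "I will come and meet you at the woods 123woods and all the woods";

// Keywords to search for as whole words
List<String> tokens = new ArrayList<String>();
tokens.add("123woods");
tokens.add("woods");

// Builds \b(123woods|woods)\b; \b matches only at a word boundary
String patternString = "\\b(" + StringUtils.join(tokens, "|") + ")\\b";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);

// Prints every whole-word match, in order of appearance
while (matcher.find()) {
    System.out.println(matcher.group(1));
}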
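The question also asks where each match occurs, and keywords may contain regex metacharacters. The sketch below is one possible variant of the same word-boundary approach that uses only the JDK: it escapes each keyword with Pattern.quote and records the start offset from Matcher.start(). The class name WholeWordFinder and the method findWholeWords are made up for this illustration. It assumes every keyword begins and ends with a word character; \b would not behave as intended for a keyword such as "C++".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answer
public class WholeWordFinder {

    /** Returns "keyword@offset" for every whole-word occurrence of any keyword. */
    static List<String> findWholeWords(String text, List<String> keywords) {
        // Pattern.quote escapes regex metacharacters inside each keyword
        String alternation = keywords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Assumption: keywords start and end with word characters, so \b applies
        Pattern pattern = Pattern.compile("\\b(?:" + alternation + ")\\b");
        Matcher matcher = pattern.matcher(text);

        List<String> hits = new ArrayList<>();
        while (matcher.find()) {
            hits.add(matcher.group() + "@" + matcher.start());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the 123woods";
        // "woods" is not reported: there is no word boundary between '3' and 'w'
        System.out.println(findWholeWords(text, Arrays.asList("123woods", "woods")));
        System.out.println(findWholeWords(text, Arrays.asList("123woods", "come")));
    }
}
```

With the String from the question, the first call reports only 123woods and the second reports both come and 123woods, each with its offset.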

If you are looking for more performance, you could have a look at StringSearch: high-performance pattern matching algorithms in Java.
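If the keyword list grows large, one alternative worth measuring is to scan the text once and look each word up in a HashSet, instead of building a long alternation. This is only a rough sketch of that idea, not the StringSearch API, and it has not been benchmarked here. The names KeywordScanner and scan are made up for the example, and it assumes every keyword is a single word made of \w characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; unrelated to the StringSearch library
public class KeywordScanner {

    // A "word" here is a maximal run of \w characters (letters, digits, underscore)
    private static final Pattern WORD = Pattern.compile("\\w+");

    /** Maps each keyword found as a whole word to the start offsets of its occurrences. */
    static Map<String, List<Integer>> scan(String text, Set<String> keywords) {
        Map<String, List<Integer>> positions = new LinkedHashMap<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            String word = matcher.group();
            // Assumption: keywords are single words, so a set lookup per word suffices
            if (keywords.contains(word)) {
                positions.computeIfAbsent(word, k -> new ArrayList<>()).add(matcher.start());
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the woods 123woods and all the woods";
        Set<String> keywords = new HashSet<>(Arrays.asList("123woods", "woods"));
        System.out.println(scan(text, keywords));
    }
}
```

Each word costs one hash lookup regardless of how many keywords there are, but the approach cannot match multi-word phrases.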