I am writing a program that needs to check whether a particular set of characters is present. My code currently is:
String checkerLoop = "ForeclosureResutls_CaseNum_";
Pattern checkerLoopPattern = Pattern.compile("(?<="+Pattern.quote(checkerLoop)+").*?(?="+checkerNumber+")");
Matcher checkerLoopMatcher = checkerLoopPattern.matcher(scraper.getPage().getWebResponse().getContentAsString());
while (checkerLoopMatcher.find()) {
    checker = true;
}
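The string I need to find is "ForeclosureResutls_CaseNum_" + checkerNumber, where checkerNumber is an int. I adapted this from earlier code that finds a set of characters between two groups, so I suspect that may be why it does not work.
A sample input string is as follows: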
<a id="SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_0" href="javascript:__doPostBack('ctl00$SheetContentPlaceHolder$ctl00$gvForeclosureResutls$ctl02$lbCaseNum','')" style="display:inline-block;width:100px;">CV-13-798497</a>
</td><td align="center">488-05-029</td><td align="center">I</td><td align="center">01/02/2013</td>
</tr><tr style="background-color:Gainsboro;">
<td align="left">UNKNOWN HEIRS, ETC OF D.C. RUFUS, ET AL </td><td align="left">10603 HAMPDEN AVENUE</td><td align="center">CLEVELAND</td><td align="center">44108-0000</td><td align="center">
<a id="SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_1" href="javascript:__doPostBack('ctl00$SheetContentPlaceHolder$ctl00$gvForeclosureResutls$ctl03$lbCaseNum','')" style="display:inline-block;width:100px;">CV-13-798498</a>
</td><td align="center">109-16-094</td><td align="center">A</td><td align="center">01/02/2013</td>
</tr><tr style="background-color:LightGrey;">
<td align="left">SHARECE MILLER, ET AL </td><td align="left">13514 ALVIN AVENUE</td><td align="center">GARFIELD HTS</td><td align="center">44105-0000</td><td align="center">
<a id="SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_2" href="javascript:__doPostBack('ctl00$Shee
Best answer
OK, here is what I have. It does not do exactly what you asked for, but it should be enough to get you on the right track.
First, ForeclosureResutls_CaseNum_ does not appear anywhere in this demo data. ForeclosureResutls_lbCaseNum_ does, so that is what I searched for.
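A quick way to confirm which prefix is actually present is a plain substring test before any regex is involved. This is only a sketch: the class name PrefixCheck and the helper hasPrefix are made up for illustration, and the content string is a shortened copy of the first sample line.

```java
class PrefixCheck {
    // Hypothetical helper: plain substring test, no regex involved.
    static boolean hasPrefix(String content, String prefix) {
        return content.contains(prefix);
    }

    public static void main(String[] args) {
        // Shortened copy of the first line of the sample input.
        String content = "<a id=\"SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_0\" href=\"...\">";
        System.out.println(hasPrefix(content, "ForeclosureResutls_CaseNum_"));   // false
        System.out.println(hasPrefix(content, "ForeclosureResutls_lbCaseNum_")); // true
    }
}
```

Because the lookbehind in the question requires the literal text ForeclosureResutls_CaseNum_, and that text never occurs, the original pattern cannot match this input.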
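Also, I ignored checkerNumber and assumed you want to check for any digit, since there are three of them in this input and I don't know how your number is derived. Hence \\d, which (written as a Java string literal) matches a single digit.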
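If you do need to test for one specific checkerNumber rather than any digit, something along these lines might work. It is a minimal sketch, not tested against the real page: the class name CaseNumChecker and the method containsCaseNum are hypothetical, and the prefix is the one found in the demo data.

```java
import java.util.regex.Pattern;

class CaseNumChecker {
    // Prefix as it appears in the sample HTML (note the "lb").
    static final String PREFIX = "ForeclosureResutls_lbCaseNum_";

    // Returns true if content contains PREFIX followed by exactly checkerNumber.
    static boolean containsCaseNum(String content, int checkerNumber) {
        // Pattern.quote makes the prefix and number literal text.
        // (?!\\d) rejects longer numbers, so 1 does not match "..._10".
        Pattern p = Pattern.compile(Pattern.quote(PREFIX + checkerNumber) + "(?!\\d)");
        return p.matcher(content).find();
    }

    public static void main(String[] args) {
        String content = "<a id=\"SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_1\" href=\"#\">CV-13-798498</a>";
        System.out.println(containsCaseNum(content, 1)); // true
        System.out.println(containsCaseNum(content, 2)); // false
    }
}
```

In the original program the content argument would be the page text returned by getContentAsString(), and the boolean result would take the place of the checker flag.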
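As far as I can tell, the regex in your post is far more complicated than what you need here. A lookbehind/lookahead pair is meant for capturing the text between two markers, whereas you only need to know whether the prefix followed by a number occurs. What I use is trivial by comparison.
Try this: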
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
   <P>{@code java ParseForclosureResultsXmpl}</P>
 **/
public class ParseForclosureResultsXmpl {
    public static final void main(String[] igno_red) {
        String sLS = System.getProperty("line.separator", "\n");

        // Demo input: the sample HTML from the question, one append per line.
        StringBuilder sdInput = new StringBuilder().
            append("<a id=\"SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_0\" href=\"javascript:__doPostBack('ctl00$SheetContentPlaceHolder$ctl00$gvForeclosureResutls$ctl02$lbCaseNum','')\" style=\"display:inline-block;width:100px;\">CV-13-798497</a>").append(sLS).
            append(" </td><td align=\"center\">488-05-029</td><td align=\"center\">I</td><td align=\"center\">01/02/2013</td>").append(sLS).
            append(" </tr><tr style=\"background-color:Gainsboro;\">").append(sLS).
            append(" <td align=\"left\">UNKNOWN HEIRS, ETC OF D.C. RUFUS, ET AL </td><td align=\"left\">10603 HAMPDEN AVENUE</td><td align=\"center\">CLEVELAND</td><td align=\"center\">44108-0000</td><td align=\"center\">").append(sLS).
            append(" <a id=\"SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_1\" href=\"javascript:__doPostBack('ctl00$SheetContentPlaceHolder$ctl00$gvForeclosureResutls$ctl03$lbCaseNum','')\" style=\"display:inline-block;width:100px;\">CV-13-798498</a>").append(sLS).
            append(" </td><td align=\"center\">109-16-094</td><td align=\"center\">A</td><td align=\"center\">01/02/2013</td>").append(sLS).
            append(" </tr><tr style=\"background-color:LightGrey;\">").append(sLS).
            append(" <td align=\"left\">SHARECE MILLER, ET AL </td><td align=\"left\">13514 ALVIN AVENUE</td><td align=\"center\">GARFIELD HTS</td><td align=\"center\">44105-0000</td><td align=\"center\">").append(sLS).
            append(" <a id=\"SheetContentPlaceHolder_ctl00_gvForeclosureResutls_lbCaseNum_2\" href=\"javascript:__doPostBack('ctl00$Shee").append(sLS);

        // The prefix followed by a single digit.
        String sRqdValuePrefix = "ForeclosureResutls_lbCaseNum_";
        Pattern checkerLoopPattern = Pattern.compile(sRqdValuePrefix + "\\d");

        // Placeholder input; the matcher is reset with each line inside the loop.
        Matcher m = checkerLoopPattern.matcher("");

        int iLn = 0;
        String[] asInput = sdInput.toString().split(sLS);
        for (String s : asInput) {
            iLn++; // 1st iteration: was zero, now 1

            // Reuse the matcher instead of creating a new one from the Pattern each iteration.
            m.reset(s);
            if (m.find()) {
                // The digit sits between the end of the prefix and the end of the match.
                int iCheckerNumber = Integer.parseInt(s.substring(m.start() + sRqdValuePrefix.length(), m.end()));
                System.out.println("Found on line " + iLn + ", at index " + m.start() + " with checker number " + iCheckerNumber);
            }
        }
    }
}
Output:
[C:\java_code\]java ParseForclosureResultsXmpl
Found on line 1, at index 39 with checker number 0
Found on line 5, at index 57 with checker number 1
Found on line 9, at index 57 with checker number 2
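The single \\d above stops at one digit, so a row numbered 10 or higher would be reported as 1. One possible variation, sketched below under the assumption that you want every number on the page, uses a capture group with \\d+ and scans the whole content in one pass instead of line by line. The names CaseNumCollector and findAll are hypothetical, and the content string is a stand-in for the real page text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseNumCollector {
    // Literal prefix, then one or more digits captured as group 1.
    static final Pattern CASE_NUM =
        Pattern.compile(Pattern.quote("ForeclosureResutls_lbCaseNum_") + "(\\d+)");

    // Collects every number that follows the prefix, in order of appearance.
    static List<Integer> findAll(String content) {
        List<Integer> found = new ArrayList<>();
        Matcher m = CASE_NUM.matcher(content);
        while (m.find()) {
            found.add(Integer.parseInt(m.group(1)));
        }
        return found;
    }

    public static void main(String[] args) {
        // Stand-in for the page text; only the id fragments matter here.
        String content = "..._gvForeclosureResutls_lbCaseNum_0\" ...\n"
            + "..._gvForeclosureResutls_lbCaseNum_1\" ...\n"
            + "..._gvForeclosureResutls_lbCaseNum_12\" ...";
        System.out.println(findAll(content)); // [0, 1, 12]
    }
}
```

Checking for a given checkerNumber then becomes findAll(content).contains(checkerNumber).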
Feel free to ask if you have any questions.