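I am currently trying to take a large text and split it sentence by sentence. I have code that splits the text into individual sentences, but of course the result also includes blank (empty or whitespace-only) entries. What do I need to add to my code so that these blanks are left out of the sentence array?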
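import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

String[] sentence;
ArrayList<String> sentenceList = new ArrayList<String>();
try {
    // Read the file line by line into sentenceList
    Scanner sentenceScanner = new Scanner(new File("data/" + fileName));
    while (sentenceScanner.hasNextLine()) {
        sentenceList.add(sentenceScanner.nextLine());
    }
    sentenceScanner.close();
} catch (FileNotFoundException e) {
    System.out.println("File Not Found");
}
// Assumed: sentenceArray holds the lines collected in sentenceList
String[] sentenceArray = sentenceList.toArray(new String[0]);
for (int r = 0; r < sentenceArray.length; r++) {
    // Split after '.', '!' or '?' and drop the whitespace that follows
    sentence = sentenceArray[r].split("(?<=[.!?])\\s*");
    for (int i = 0; i < sentence.length; i++) {
        System.out.println(sentence[i]);
    }
}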
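Best answer
Calling split on an empty line returns a single empty string, which is where the blank entries come from. It is best to filter while reading the input data, trimming each line and skipping it if nothing is left: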
while (sentenceScanner.hasNextLine()) {
    String line = sentenceScanner.nextLine().trim();
    if (!line.isEmpty()) {
        sentenceList.add(line);
    }
}
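
For reference, here is a minimal self-contained sketch that combines the filtering above with the split from the question. The class name SentenceSplitter and the method splitSentences are hypothetical names chosen for illustration, and the input is passed in as a list of lines instead of being read from a file so the example can run on its own. The regular expression is the one from the question, so it shares its limits: it would also split at a decimal point or after an abbreviation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SentenceSplitter {
    // Hypothetical helper: trims each line, skips blank ones,
    // and splits the rest into sentences.
    static List<String> splitSentences(List<String> lines) {
        List<String> sentences = new ArrayList<>();
        for (String rawLine : lines) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue; // skip empty or whitespace-only lines
            }
            // Split after '.', '!' or '?' and consume any whitespace that follows.
            for (String sentence : line.split("(?<=[.!?])\\s*")) {
                if (!sentence.isEmpty()) {
                    sentences.add(sentence);
                }
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Sample input standing in for the lines read from the file.
        List<String> lines = Arrays.asList("Hello world. How are you?", "", "   ", "Fine!");
        for (String s : splitSentences(lines)) {
            System.out.println(s);
        }
    }
}
```

With a Scanner, the same helper could be fed the lines collected in sentenceList, so the trimming and the empty check live in one place.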