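I'm currently trying to take a large text and split it up sentence by sentence. I have code that splits the text into individual sentences, but of course it also picks up blank (whitespace-only) entries. What do I need to add to my code so that those are left out of the sentence array?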

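    // Imports needed by this snippet
    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.ArrayList;
    import java.util.Scanner;

    String[] sentence;
    ArrayList<String> sentenceList = new ArrayList<String>();
    try {
        // Read the file line by line
        Scanner sentenceScanner = new Scanner(new File("data/" + fileName));
        while (sentenceScanner.hasNextLine()) {
            sentenceList.add(sentenceScanner.nextLine());
        }
        sentenceScanner.close();
    } catch (FileNotFoundException e) {
        System.out.println("File Not Found");
    }

    // Copy the list into the array that the loop below iterates over
    String[] sentenceArray = sentenceList.toArray(new String[0]);

    for (int r = 0; r < sentenceArray.length; r++) {
        // Split after '.', '!' or '?', consuming any whitespace that follows
        sentence = sentenceArray[r].split("(?<=[.!?])\\s*");
        for (int i = 0; i < sentence.length; i++) {
            System.out.println(sentence[i]);
        }
    }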

Best answer

It is best to filter while reading the input data, so that blank lines never reach the list:

    while (sentenceScanner.hasNextLine()) {
        String line = sentenceScanner.nextLine().trim();
        if (!line.isEmpty()) {
            sentenceList.add(line);
        }
    }
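For reference, here is a minimal, self-contained sketch that combines the read-time filter above with the split step from the question. The class name `SentenceSplitter` and the method `splitSentences` are hypothetical names chosen for illustration and are not part of the original code. It also assumes you are happy to use `\\s+` instead of `\\s*` in the regex. That is a deliberate change: with `\\s*` the pattern can match zero characters, so something like `3.14` gets split after the dot, whereas `\\s+` only splits when whitespace actually follows the punctuation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class SentenceSplitter {

    // Splits each line into sentences, skipping blank lines and empty pieces.
    static List<String> splitSentences(List<String> lines) {
        List<String> sentences = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // blank or whitespace-only line: nothing to split
            }
            // Same lookbehind as in the question, but requiring at least one
            // whitespace character after the punctuation (assumption: "\\s+").
            for (String piece : trimmed.split("(?<=[.!?])\\s+")) {
                String sentence = piece.trim();
                if (!sentence.isEmpty()) {
                    sentences.add(sentence);
                }
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Hello world. How are you?", "   ", "", "Fine!  ");
        for (String s : splitSentences(lines)) {
            System.out.println(s);
        }
    }
}
```

In your code you would pass `sentenceList` to `splitSentences` once the `Scanner` loop has finished. If the text contains abbreviations such as "e.g." followed by a space, this regex will still split there, so treat it as a starting point rather than a complete sentence detector.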

10-06 09:06