I have the following input text:
"rd_tagged_text"
" Amt<SPLIT>
\nSecurity<SPLIT> B<SPLIT> Px<SPLIT> A<SPLIT> Px<SPLIT> B<SPLIT> YTW<SPLIT> A<SPLIT> YTW<SPLIT> B<SPLIT> ZS<SPLIT> A<SPLIT> ZS<SPLIT> Out<SPLIT> S&am<SPLIT> Mood<SPLIT> Note<SPLIT>
\n--------------------------------------------------------------------------------<SPLIT>
\nAltice<SPLIT> France<SPLIT>
\nNUMFP<SPLIT> 4.875<SPLIT> 19<SPLIT> 99.875<SPLIT>-<SPLIT>100.375<SPLIT> 4.909<SPLIT>/<SPLIT>4.752<SPLIT> 371.<SPLIT>/<SPLIT>371.<SPLIT> 2.4MMM<SPLIT> B+<SPLIT> Ba3<SPLIT>
\nNUMFP<SPLIT> 6<SPLIT> 22<SPLIT> 102.000<SPLIT>-<SPLIT>102.500<SPLIT> 5.559<SPLIT>/<SPLIT>5.450<SPLIT> 422.<SPLIT>/<SPLIT>411.<SPLIT> 4MMM<SPLIT> B+<SPLIT> Ba3<SPLIT>
\nNUMFP<SPLIT> 6.25<SPLIT> 24<SPLIT> 103.000<SPLIT>-<SPLIT>103.750<SPLIT> 5.741<SPLIT>/<SPLIT>5.616<SPLIT> 420.<SPLIT>/<SPLIT>407.<SPLIT> 1.375M<SPLIT> B+<SPLIT> Ba3<SPLIT>
\nAltice<SPLIT> S.A.<SPLIT>
\nATCNA<SPLIT> 7.75<SPLIT> 22<SPLIT> 103.250<SPLIT>-<SPLIT>104.000<SPLIT> 7.005<SPLIT>/<SPLIT>6.837<SPLIT> 568.<SPLIT>/551.<SPLIT> 2.9MMM<SPLIT> B<SPLIT> B3<SPLIT>
\nATCNA<SPLIT> 7.625<SPLIT> 25<SPLIT> 101.875<SPLIT>-<SPLIT>102.375<SPLIT> 7.309<SPLIT>/<SPLIT>7.227<SPLIT> 573.<SPLIT>/<SPLIT>565.<SPLIT> 1.48MM<SPLIT> N.A.<SPLIT> B3e<SPLIT>
\n
\n{IMGR<GO>}<SPLIT>
\n "
" Amt<SPLIT>
Now I want to parse the text so that it has no quotes, no \n, no leading spaces, and no empty lines.
I used this:
public static void main(String[] args) throws Exception {
    CSVReader reader = new CSVReader(new FileReader("rawtext.txt"), ',', '"', 1);
    String csv = "ParsedRawText.txt";
    CSVWriter writer = new CSVWriter(new FileWriter(csv), CSVWriter.NO_ESCAPE_CHARACTER, CSVWriter.NO_QUOTE_CHARACTER);
    // Read all rows at once
    List<String[]> allRows = reader.readAll();
    for (String[] output : allRows) {
        // get current row
        String[] parsedRow = new String[output.length];
        for (int i = 0; i < output.length; i++) {
            parsedRow[i] = output[i].replaceAll("(?m)^n", "").trim();
            System.out.println(parsedRow[i]);
        }
        // write line
        writer.writeNext(parsedRow);
    }
    writer.close();
}
My result is:
Amt<SPLIT>
Security<SPLIT> B<SPLIT> Px<SPLIT> A<SPLIT> Px<SPLIT> B<SPLIT> YTW<SPLIT> A<SPLIT> YTW<SPLIT> B<SPLIT> ZS<SPLIT> A<SPLIT> ZS<SPLIT> Out<SPLIT> S&am<SPLIT> Mood<SPLIT> Note<SPLIT>
--------------------------------------------------------------------------------<SPLIT>
Altice<SPLIT> France<SPLIT>
NUMFP<SPLIT> 4.875<SPLIT> 19<SPLIT> 99.875<SPLIT>-<SPLIT>100.375<SPLIT> 4.909<SPLIT>/<SPLIT>4.752<SPLIT> 371.<SPLIT>/<SPLIT>371.<SPLIT> 2.4MMM<SPLIT> B+<SPLIT> Ba3<SPLIT>
NUMFP<SPLIT> 6<SPLIT> 22<SPLIT> 102.000<SPLIT>-<SPLIT>102.500<SPLIT> 5.559<SPLIT>/<SPLIT>5.450<SPLIT> 422.<SPLIT>/<SPLIT>411.<SPLIT> 4MMM<SPLIT> B+<SPLIT> Ba3<SPLIT>
NUMFP<SPLIT> 6.25<SPLIT> 24<SPLIT> 103.000<SPLIT>-<SPLIT>103.750<SPLIT> 5.741<SPLIT>/<SPLIT>5.616<SPLIT> 420.<SPLIT>/<SPLIT>407.<SPLIT> 1.375M<SPLIT> B+<SPLIT> Ba3<SPLIT>
Altice<SPLIT> S.A.<SPLIT>
ATCNA<SPLIT> 7.75<SPLIT> 22<SPLIT> 103.250<SPLIT>-<SPLIT>104.000<SPLIT> 7.005<SPLIT>/<SPLIT>6.837<SPLIT> 568.<SPLIT>/551.<SPLIT> 2.9MMM<SPLIT> B<SPLIT> B3<SPLIT>
ATCNA<SPLIT> 7.625<SPLIT> 25<SPLIT> 101.875<SPLIT>-<SPLIT>102.375<SPLIT> 7.309<SPLIT>/<SPLIT>7.227<SPLIT> 573.<SPLIT>/<SPLIT>565.<SPLIT> 1.48MM<SPLIT> N.A.<SPLIT> B3e<SPLIT>

{IMGR<GO>}<SPLIT>
Amt<SPLIT>
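So the only thing that does not work is the empty line, third from the end (right after the last ATCNA row).
Does anyone know how to fix this?
Desired result: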
Amt<SPLIT>
Security<SPLIT> B<SPLIT> Px<SPLIT> A<SPLIT> Px<SPLIT> B<SPLIT> YTW<SPLIT> A<SPLIT> YTW<SPLIT> B<SPLIT> ZS<SPLIT> A<SPLIT> ZS<SPLIT> Out<SPLIT> S&am<SPLIT> Mood<SPLIT> Note<SPLIT>
--------------------------------------------------------------------------------<SPLIT>
Altice<SPLIT> France<SPLIT>
NUMFP<SPLIT> 4.875<SPLIT> 19<SPLIT> 99.875<SPLIT>-<SPLIT>100.375<SPLIT> 4.909<SPLIT>/<SPLIT>4.752<SPLIT> 371.<SPLIT>/<SPLIT>371.<SPLIT> 2.4MMM<SPLIT> B+<SPLIT> Ba3<SPLIT>
NUMFP<SPLIT> 6<SPLIT> 22<SPLIT> 102.000<SPLIT>-<SPLIT>102.500<SPLIT> 5.559<SPLIT>/<SPLIT>5.450<SPLIT> 422.<SPLIT>/<SPLIT>411.<SPLIT> 4MMM<SPLIT> B+<SPLIT> Ba3<SPLIT>
NUMFP<SPLIT> 6.25<SPLIT> 24<SPLIT> 103.000<SPLIT>-<SPLIT>103.750<SPLIT> 5.741<SPLIT>/<SPLIT>5.616<SPLIT> 420.<SPLIT>/<SPLIT>407.<SPLIT> 1.375M<SPLIT> B+<SPLIT> Ba3<SPLIT>
Altice<SPLIT> S.A.<SPLIT>
ATCNA<SPLIT> 7.75<SPLIT> 22<SPLIT> 103.250<SPLIT>-<SPLIT>104.000<SPLIT> 7.005<SPLIT>/<SPLIT>6.837<SPLIT> 568.<SPLIT>/551.<SPLIT> 2.9MMM<SPLIT> B<SPLIT> B3<SPLIT>
ATCNA<SPLIT> 7.625<SPLIT> 25<SPLIT> 101.875<SPLIT>-<SPLIT>102.375<SPLIT> 7.309<SPLIT>/<SPLIT>7.227<SPLIT> 573.<SPLIT>/<SPLIT>565.<SPLIT> 1.48MM<SPLIT> N.A.<SPLIT> B3e<SPLIT>
{IMGR<GO>}<SPLIT>
Amt<SPLIT>
Result with Avinash's solution:
Amt<SPLIT>
Security<SPLIT> B<SPLIT> Px<SPLIT> A<SPLIT> Px<SPLIT> B<SPLIT> YTW<SPLIT> A<SPLIT> YTW<SPLIT> B<SPLIT> ZS<SPLIT> A<SPLIT> ZS<SPLIT> Out<SPLIT> S&am<SPLIT> Mood<SPLIT> Note<SPLIT>
--------------------------------------------------------------------------------<SPLIT>
Altice<SPLIT> France<SPLIT>
NUMFP<SPLIT> 4.875<SPLIT> 19<SPLIT> 99.875<SPLIT>-<SPLIT>100.375<SPLIT> 4.909<SPLIT>/<SPLIT>4.752<SPLIT> 371.<SPLIT>/<SPLIT>371.<SPLIT> 2.4MMM<SPLIT> B+<SPLIT> Ba3<SPLIT>
NUMFP<SPLIT> 6<SPLIT> 22<SPLIT> 102.000<SPLIT>-<SPLIT>102.500<SPLIT> 5.559<SPLIT>/<SPLIT>5.450<SPLIT> 422.<SPLIT>/<SPLIT>411.<SPLIT> 4MMM<SPLIT> B+<SPLIT> Ba3<SPLIT>
NUMFP<SPLIT> 6.25<SPLIT> 24<SPLIT> 103.000<SPLIT>-<SPLIT>103.750<SPLIT> 5.741<SPLIT>/<SPLIT>5.616<SPLIT> 420.<SPLIT>/<SPLIT>407.<SPLIT> 1.375M<SPLIT> B+<SPLIT> Ba3<SPLIT>
Altice<SPLIT> S.A.<SPLIT>
ATCNA<SPLIT> 7.75<SPLIT> 22<SPLIT> 103.250<SPLIT>-<SPLIT>104.000<SPLIT> 7.005<SPLIT>/<SPLIT>6.837<SPLIT> 568.<SPLIT>/551.<SPLIT> 2.9MMM<SPLIT> B<SPLIT> B3<SPLIT>
ATCNA<SPLIT> 7.625<SPLIT> 25<SPLIT> 101.875<SPLIT>-<SPLIT>102.375<SPLIT> 7.309<SPLIT>/<SPLIT>7.227<SPLIT> 573.<SPLIT>/<SPLIT>565.<SPLIT> 1.48MM<SPLIT> N.A.<SPLIT> B3e<SPLIT>n{IMGR<GO>}<SPLIT>
Amt<SPLIT>
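Best answer
Just add another replaceAll call. The replacement has to be a real newline ("\n"): in a Java replacement string, "\\n" is an escaped literal n, which is where the stray n before {IMGR<GO>} in the result above comes from.
parsedRow[i]=output[i].replaceAll("(?m)^n", "").replaceAll("[\\r\\n][\\r\\n]+", "\n").trim();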
Or:
parsedRow[i] = output[i].replaceAll("(?m)^n", "").replaceAll("(?m)([\\r\\n])[\\r\\n]+|^ +| +$", "$1");
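To try the clean-up without opencsv, here is a minimal, self-contained sketch. It assumes the field reaches the code the way the question suggests: one multi-line string in which every line after the first begins with a stray n. The class name, both helper names, and the sample string are made up for illustration. The first helper is a variant of the regex chain above with a real newline as the replacement; the second splits the field into lines and filters out the blank ones, which avoids replacement-string escaping altogether.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class BlankLineCleanup {

    // Hypothetical helper: the regex chain, with a real newline as the replacement.
    static String collapseWithRegex(String field) {
        return field
                .replaceAll("(?m)^n", "")       // drop the stray "n" at the start of each line
                .replaceAll("[\\r\\n]+", "\n")  // collapse each run of line breaks into one newline
                .trim();                        // strip whitespace at both ends of the field
    }

    // Hypothetical helper: the same clean-up, line by line, without a replacement string.
    static String cleanLineByLine(String field) {
        return Arrays.stream(field.split("\\R"))  // \R matches any line break (Java 8+)
                .map(line -> line.replaceFirst("^n", "").trim())
                .filter(line -> !line.isEmpty())  // skip lines that are empty after trimming
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Made-up sample shaped like the field in the question.
        String field = " Amt<SPLIT>\nnATCNA<SPLIT> B3e<SPLIT>\nn\nn{IMGR<GO>}<SPLIT>\nn ";
        System.out.println(collapseWithRegex(field));
        System.out.println(cleanLineByLine(field));
    }
}
```

Both helpers would also strip a legitimate lowercase n at the start of a line. The sample data has no such line, but real input might. The line-by-line version additionally trims spaces inside the field, not only at its two ends.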
Regarding "java - Java regex to replace characters and empty lines", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28631407/