I have a text file like the one below, and I want to parse information from it.
#title キミと☆Are You Ready?
#artist トライクロニカ
#mobile deresimu
#easy 0
#normal 22
#hard 27
#tag SHOW BY ROCK!!
#preset all
I parse it with this code.
File infoFile = new File(dir, "info.txt");
//parse info.txt
String songName = "?";
String artist = "?";
int difficulties[] = new int[5];
try {
    BufferedReader br = new BufferedReader(new FileReader(infoFile));
    String line = br.readLine();
    while (line != null) {
        Log.v(TAG, "line=" + line);
        //I hate BOM!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        /*
         * <a href="http://www.faqs.org/rfcs/rfc3629.html">RFC 3629 - UTF-8, a transformation format of ISO 10646</a>
         *
         * <p>The
         * <a href="http://www.unicode.org/unicode/faq/utf_bom.html">Unicode FAQ</a>
         * defines 5 types of BOMs:<ul>
         * <li><pre>00 00 FE FF = UTF-32, big-endian</pre></li>
         * <li><pre>FF FE 00 00 = UTF-32, little-endian</pre></li>
         * <li><pre>FE FF = UTF-16, big-endian</pre></li>
         * <li><pre>FF FE = UTF-16, little-endian</pre></li>
         * <li><pre>EF BB BF = UTF-8</pre></li>
         * </ul></p>
         *
         * https://stackoverflow.com/questions/1835430/byte-order-mark-screws-up-file-reading-in-java
         */
        line = line.replace("\u00EF\u00BB\u00BF", "");
        line = line.replace("\u0000 \u0000 \u00FE \u00FF", "");
        line = line.replace("\u00FF \u00FE \u0000 \u0000", "");
        line = line.replace("\u00FE \u00FF", "");
        line = line.replace("\u00FF \u00FE", "");
        if (line.startsWith("#title")) {
            Log.v(TAG, "startswith");
            line = line.replace("#title ", "").trim();
            songName = line;
        } else if (line.startsWith("#artist")) {
            line = line.replace("#artist ", "").trim();
            artist = line;
        } else if (line.startsWith("#easy")) {
            difficulties[0] = Integer.parseInt(line.replace("#easy ", "").trim());
        } else if (line.startsWith("#normal")) {
            difficulties[1] = Integer.parseInt(line.replace("#normal ", "").trim());
        } else if (line.startsWith("#hard")) {
            difficulties[2] = Integer.parseInt(line.replace("#hard ", "").trim());
        } else if (line.startsWith("#master")) {
            difficulties[3] = Integer.parseInt(line.replace("#master ", "").trim());
        } else if (line.startsWith("#apex")) {
            difficulties[4] = Integer.parseInt(line.replace("#apex ", "").trim());
            continue;
        }
        line = br.readLine();
    }
} catch (IOException | NumberFormatException e) {
    throw new RuntimeException(e);
}
//info.txt parse done.
Log.v(TAG, "Info.txt parse done.");
Log.v(TAG, "Song name=" + songName);
Log.v(TAG, "Difficulties=" + Arrays.toString(difficulties));
Log.v(TAG, "Artist=" + artist);
Log.v(TAG, "Folder=" + dir.getName());
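Every line parses fine except the first one. The condition
if (line.startsWith("#title")) {
never seems to be true for the given text file. When I change startsWith to contains, it works. At first I thought this was a BOM problem, so I added the five lines that strip BOM sequences, but that did not help. The variable songName is always "?" when I use startsWith on the first line. Any clue why this code fails to match #title? Thanks.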
Logcat output:
2019-03-10 23:00:22.872 23600-23600/sma.rhythmtapper V/NoteFile: line=#title キミと☆Are You Ready?
2019-03-10 23:00:22.872 23600-23600/sma.rhythmtapper V/NoteFile: line=#artist トライクロニカ
2019-03-10 23:00:22.872 23600-23600/sma.rhythmtapper V/NoteFile: line=#mobile deresimu
2019-03-10 23:00:22.873 23600-23600/sma.rhythmtapper V/NoteFile: line=#easy 0
2019-03-10 23:00:22.873 23600-23600/sma.rhythmtapper V/NoteFile: line=#normal 22
2019-03-10 23:00:22.873 23600-23600/sma.rhythmtapper V/NoteFile: line=#hard 27
2019-03-10 23:00:22.874 23600-23600/sma.rhythmtapper V/NoteFile: line=#tag SHOW BY ROCK!!
2019-03-10 23:00:22.876 23600-23600/sma.rhythmtapper V/NoteFile: line=#preset all
2019-03-10 23:00:22.876 23600-23600/sma.rhythmtapper V/NoteFile: Info.txt parse done.
2019-03-10 23:00:22.876 23600-23600/sma.rhythmtapper V/NoteFile: Song name=?
2019-03-10 23:00:22.877 23600-23600/sma.rhythmtapper V/NoteFile: Difficulties=[0, 22, 27, 0, 0]
2019-03-10 23:00:22.877 23600-23600/sma.rhythmtapper V/NoteFile: Artist=トライクロニカ
2019-03-10 23:00:22.877 23600-23600/sma.rhythmtapper V/NoteFile: Folder=キミと☆Are You Ready?
Edit
I found the problem by printing the byte sequences to logcat.
It says:
"#title キミと☆Are You Ready？" -> [-17, -69, -65, 35, 116, 105, 116, 108, 101, 32, -29, -126, -83, -29, -125, -97, -29, -127, -88, -30, -104, -122, 65, 114, 101, 32, 89, 111, 117, 32, 82, 101, 97, 100, 121, -17, -68, -97]
"#title" -> [35, 116, 105, 116, 108, 101]
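The post does not show how the bytes were printed. One way to get a dump in this shape, as a minimal sketch, is to encode the string as UTF-8 and pass the array to Arrays.toString. The class ByteDump and the method utf8Bytes are hypothetical names used only for illustration, and the literal U+FEFF stands in for the invisible character at the start of the first line.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ByteDump {
    // Hypothetical helper: renders the UTF-8 bytes of a string as signed byte values.
    static String utf8Bytes(String s) {
        return Arrays.toString(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Assumption: the first line starts with an invisible U+FEFF character.
        System.out.println(utf8Bytes("\uFEFF#title")); // [-17, -69, -65, 35, 116, 105, 116, 108, 101]
        System.out.println(utf8Bytes("#title"));       // [35, 116, 105, 116, 108, 101]
    }
}
```

On Android the same string could be passed to Log.v instead of System.out.println.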
So I need to remove -17, -69, -65 from the line variable. How can I do that without using an external library?
Best answer
The suspicion that the BOM was causing the problem turned out to be correct.
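To see why startsWith fails while contains succeeds, here is a small sketch that decodes the first nine bytes from the dump as UTF-8. It assumes the reader decodes the file as UTF-8. BomPrefixDemo, RAW and decoded are made-up names for this illustration, not part of the original code.

```java
import java.nio.charset.StandardCharsets;

final class BomPrefixDemo {
    // First bytes of the file as dumped in the question: EF BB BF followed by "#title".
    static final byte[] RAW = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 35, 116, 105, 116, 108, 101};

    // Decodes the raw bytes as UTF-8; Java's decoder keeps the BOM as a character.
    static String decoded() {
        return new String(RAW, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String line = decoded();
        System.out.println(line.length());                   // 7: one BOM char plus "#title"
        System.out.println((int) line.charAt(0) == 0xFEFF);  // true
        System.out.println(line.startsWith("#title"));       // false
        System.out.println(line.contains("#title"));         // true
    }
}
```

The three BOM bytes collapse into one invisible character at index 0, so the text "#title" begins at index 1 and only contains can find it.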
I changed the BOM removal code to this:
line=line.replace("\uEFBB\u00BF", "");
line=line.replace("\u0000\uFEFF","");
line=line.replace("\uFFFE\u0000","");
line=line.replace("\uFEFF","");
line=line.replace("\uFFFE","");
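Note that the spaces inside the string literals are gone, and that \u00EF != byte 0xEF: a Java String holds decoded characters, not the raw bytes of the file. When the file is decoded as UTF-8, the three BOM bytes EF BB BF arrive as the single character \uFEFF.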
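Since a BOM can only appear at the very start of a file, an alternative to running the replacements on every line is to drop a leading U+FEFF from the first line only. The following is a sketch of that idea, not the code from the post. BomSafeLines and readLines are hypothetical names, and the StringReader in main stands in for a reader opened on info.txt with an explicit UTF-8 charset.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

final class BomSafeLines {
    // Reads all lines and removes one leading U+FEFF from the first line, if present.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            boolean first = true;
            String line;
            while ((line = br.readLine()) != null) {
                if (first && line.startsWith("\uFEFF")) {
                    line = line.substring(1); // skip the decoded BOM character
                }
                first = false;
                lines.add(line);
            }
        } catch (IOException e) {
            // Mirrors the question's code, which rethrows as RuntimeException.
            throw new RuntimeException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumption: in the app this would be an InputStreamReader over the file, decoding UTF-8.
        List<String> lines = readLines(new StringReader("\uFEFF#title demo\n#easy 0\n"));
        System.out.println(lines.get(0).startsWith("#title")); // true
        System.out.println(lines);                             // [#title demo, #easy 0]
    }
}
```

Opening the file with an explicit charset instead of FileReader's platform default also keeps the decoding, and therefore the form the BOM takes, predictable across devices.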
Thanks to everyone who tried to help. I hope this post is useful to others who run into the same problem.