Problem description
I want to know the offset of every line in a text file.
So far I have tried:
Path path = FileSystems.getDefault().getPath(".", filename);
BufferedReader br = Files.newBufferedReader(path, Charset.defaultCharset());
int offset = 0; // offset of the first line
String strline = br.readLine();
offset += strline.length() + 1; // offset of the second line
This way I can loop through the entire file and record the offset of the beginning of every line. But if I use RandomAccessFile to seek through the file and access a line using an offset calculated by the above method, I find myself in the middle of some line. It seems that the offsets are not correct.
What's wrong? Is this method of calculating offsets incorrect? Are there any better and faster methods?
Recommended answer
Your code will only work for ASCII-encoded text, where every character is exactly one byte. Since some characters need more than one byte in other encodings, you have to change the following line
offset += strline.length() + 1;
to
offset += strline.getBytes(Charset.defaultCharset()).length + 1;
As stated in my comments below your question, you have to specify the correct encoding of your file, e.g. Charset.forName("UTF-8"), both here and where you initialize your BufferedReader.
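One caveat with the line above: the "+ 1" assumes every line ends with a single-byte terminator such as "\n", so a file with "\r\n" line endings would still drift by one byte per line. A possible alternative, sketched below, is to skip the reader and scan the raw bytes, recording the position that follows each "\n". The class LineOffsets and its methods lineStartOffsets and readLineAt are hypothetical names made up for this illustration, not part of the original answer. The sketch assumes UTF-8 or another ASCII-compatible encoding in which the byte 0x0A only ever means a newline (so it would not work for UTF-16), and it does not treat a lone "\r" as a line break.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; assumes a UTF-8 (or ASCII-compatible) file.
final class LineOffsets {

    // Returns the byte offset at which each line of the file starts.
    static List<Long> lineStartOffsets(Path file) throws IOException {
        List<Long> offsets = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long pos = 0;               // byte position of the byte about to be examined
            boolean atLineStart = true; // true until the first byte of a line is seen
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    offsets.add(pos);
                    atLineStart = false;
                }
                pos++;
                if (b == '\n') {        // covers both "\n" and "\r\n" endings
                    atLineStart = true;
                }
            }
        }
        return offsets;
    }

    // Seeks to a byte offset and decodes the bytes up to the next '\n' as UTF-8.
    static String readLineAt(Path file, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            int b;
            while ((b = raf.read()) != -1 && b != '\n') {
                buf.write(b);
            }
            String line = new String(buf.toByteArray(), StandardCharsets.UTF_8);
            // Drop the '\r' left over from a "\r\n" line ending, if any.
            return line.endsWith("\r") ? line.substring(0, line.length() - 1) : line;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("offsets", ".txt");
        // Two accented characters (2 bytes each in UTF-8) and mixed line endings.
        String text = "h\u00e9llo\r\nw\u00f6rld\nend";
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));

        List<Long> offsets = lineStartOffsets(file);
        System.out.println(offsets);                         // [0, 8, 15]
        System.out.println(readLineAt(file, offsets.get(1)));
        Files.delete(file);
    }
}
```

In the sample, "h\u00e9llo" has 5 chars but 6 UTF-8 bytes, and the first line ends with "\r\n", so the second line starts at byte 8 rather than at the 6 that length() + 1 would give. readLineAt decodes the bytes itself because RandomAccessFile.readLine() maps each byte to one char and would garble multi-byte characters. Reading one byte at a time from RandomAccessFile is slow on large files; it is kept here only for brevity.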