Problem description
I need advice from someone who knows Java very well, along with the memory issues involved. I have a large file (something like 1.5 GB) and I need to cut this file into many smaller files (100 small files, for example).
I know generally how to do it (using a BufferedReader), but I would like to know if you have any advice regarding memory, or tips on how to do it faster.
My file contains text, not binary data, and I have about 20 characters per line.
Recommended answer
First, if your file contains binary data, then using BufferedReader would be a big mistake (because you would be converting the data to String, which is unnecessary and could easily corrupt the data); you should use a BufferedInputStream instead. If it's text data and you need to split it along line breaks, then using BufferedReader is fine (assuming the file contains lines of a sensible length).
Regarding memory, there shouldn't be any problem if you use a decently sized buffer (I'd use at least 1 MB to make sure the HD is doing mostly sequential reading and writing).
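A minimal sketch of the approach described above (the input file name, output prefix, and lines-per-chunk count are assumptions for illustration): read lines through a BufferedReader with a 1 MB buffer and write each chunk through an equally buffered writer.

```java
import java.io.*;

public class FileSplitter {
    // 1 MB buffers keep the disk doing mostly sequential reads and writes.
    static final int BUF = 1 << 20;

    // Splits the text file at inPath into files of linesPerChunk lines each,
    // named outPrefix0.txt, outPrefix1.txt, ... Returns the number of chunks.
    static int split(String inPath, String outPrefix, int linesPerChunk) throws IOException {
        int chunk = 0;
        try (BufferedReader in = new BufferedReader(new FileReader(inPath), BUF)) {
            String line = in.readLine();
            while (line != null) {
                try (BufferedWriter out = new BufferedWriter(
                        new FileWriter(outPrefix + chunk + ".txt"), BUF)) {
                    for (int i = 0; i < linesPerChunk && line != null; i++) {
                        out.write(line);
                        out.newLine();
                        line = in.readLine();
                    }
                }
                chunk++;
            }
        }
        return chunk;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: "big.txt" is the large input; adjust linesPerChunk to
        // taste (e.g. total lines / 100 for roughly 100 output files).
        int chunks = split("big.txt", "chunk-", 1_000_000);
        System.out.println("wrote " + chunks + " files");
    }
}
```

Only one line and two 1 MB buffers are held in memory at any time, so the 1.5 GB input size is irrelevant to the heap footprint.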
If speed turns out to be a problem, you could have a look at the java.nio packages - those are supposedly faster than java.io.
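As a sketch of the java.nio route (file names and part count are assumptions), FileChannel.transferTo lets the OS move byte ranges between files without copying them through a Java-side buffer. Note that byte-range splits can cut a line in half, so this suits binary data or cases where line alignment doesn't matter:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class NioSplitter {
    // Splits input into numParts byte-range chunks named outPrefix0.bin, outPrefix1.bin, ...
    static void split(Path input, String outPrefix, int numParts) throws IOException {
        try (FileChannel in = FileChannel.open(input, StandardOpenOption.READ)) {
            long size = in.size();
            long partSize = (size + numParts - 1) / numParts; // ceiling division
            for (int i = 0; i < numParts; i++) {
                long pos = (long) i * partSize;
                if (pos >= size) break;                        // input smaller than expected
                long count = Math.min(partSize, size - pos);
                try (FileChannel out = FileChannel.open(Paths.get(outPrefix + i + ".bin"),
                        StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                        StandardOpenOption.TRUNCATE_EXISTING)) {
                    long written = 0;
                    while (written < count) {                  // transferTo may move fewer bytes than asked
                        written += in.transferTo(pos + written, count - written, out);
                    }
                }
            }
        }
    }
}
```

The loop around transferTo is required because the method is allowed to transfer fewer bytes than requested in a single call.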