Question
I'm currently looking into String concatenation options and the penalty they have on overall performance. My test case produces results that blow my mind, and I'm not sure whether I'm overlooking something.
Here is the deal: in Java, doing something + somethingElse gets turned by the compiler into code that creates a new StringBuilder every time it is executed.
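As a rough illustration of that claim, the sketch below puts the `+=` loop next to a hand-written expansion of it. The class and method names are made up for this example, and the expansion only approximates what a JDK 7 era javac emits (newer compilers use a different mechanism, but the result still has to be rebuilt as a new String on every iteration).

```java
public class ConcatDesugared {

    // The loop as it is usually written in source code.
    public static String withPlus(String[] lines) {
        String everything = "";
        for (String line : lines) {
            everything += line;
        }
        return everything;
    }

    // Approximately what the compiler generates for the loop body:
    // a fresh StringBuilder per iteration.
    public static String desugared(String[] lines) {
        String everything = "";
        for (String line : lines) {
            everything = new StringBuilder()
                    .append(everything) // copies everything accumulated so far
                    .append(line)       // copies the new line
                    .toString();        // copies the result into a new String
        }
        return everything;
    }

    public static void main(String[] args) {
        String[] lines = {"Lorem ", "ipsum ", "dolor"};
        // Both variants build the same string.
        System.out.println(withPlus(lines).equals(desugared(lines)));
    }
}
```

The point of the comparison is that the builder created inside the loop never lives long enough to save any copying.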
For my test case, I'm loading a file from my HDD that has 1661 lines of example data (classic "Lorem Ipsum"). This question is not about I/O performance, but about the performance of the different string concatenation methods.
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;

    public class InefficientStringConcat {

        public static void main(String[] args) throws Exception {
            System.out.println("Starting benchmark");
            // Read and measure:
            for (int i = 0; i < 10; i++) {
                // Get a file with example data:
                BufferedReader in = new BufferedReader(
                    new InputStreamReader(new FileInputStream(new File("data.txt")))
                );
                long start = System.currentTimeMillis();
                // Un-comment method to test:
                //inefficientRead(in);
                //betterRead(in);
                long end = System.currentTimeMillis();
                System.out.println("#" + (i + 1) + " Took " + (end - start) + "ms");
                in.close();
            }
        }

        // Appends every line to a single StringBuilder.
        public static String betterRead(BufferedReader in) throws IOException {
            StringBuilder b = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                b.append(line);
            }
            return b.toString();
        }

        // Builds the result with repeated String concatenation.
        public static String inefficientRead(BufferedReader in) throws IOException {
            String everything = "", line;
            while ((line = in.readLine()) != null) {
                everything += line;
            }
            return everything;
        }
    }
As you can see, the setup is the same for both tests. Here are the results:
Using the inefficientRead() method:
Starting benchmark
#1 Took 658ms
#2 Took 590ms
#3 Took 569ms
#4 Took 567ms
#5 Took 562ms
#6 Took 570ms
#7 Took 563ms
#8 Took 568ms
#9 Took 560ms
#10 Took 568ms
Using the betterRead() method:
Starting benchmark
#1 Took 42ms
#2 Took 10ms
#3 Took 5ms
#4 Took 7ms
#5 Took 16ms
#6 Took 3ms
#7 Took 4ms
#8 Took 5ms
#9 Took 5ms
#10 Took 13ms
I'm running the tests with no extra parameters to the java command. I'm running a MacMini3,1 from early 2009 and Sun JDK 7:
[luke@BlackBox ~]$ java -version
java version "1.7.0_09"
Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
Java HotSpot(TM) Client VM (build 23.5-b02, mixed mode)
This strikes me as a very large difference. Am I doing something wrong in measuring this, or is this supposed to happen?
Recommended answer
It's supposed to happen. Constructing a long string using repeated string concatenation is a known performance anti-pattern: every concatenation has to create a new string containing a copy of the original string plus a copy of the appended string. You end up with O(N²) performance. When you use StringBuilder, most of the time you're just copying the appended string into a buffer. Occasionally the buffer will run out of space and need to be expanded (by copying the existing data into a new buffer), but that doesn't happen often (due to the buffer expansion strategy).
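To put rough numbers on this, here is a small model that counts how many characters get copied under each approach. It is only a sketch under stated assumptions: every line has the same length (the 80 used in main is a guess, not a measurement of the question's file), the growable buffer simply doubles when it is full (the real growth policy is an implementation detail), and the class and method names are invented for this illustration.

```java
public class CopyCostModel {

    // Characters copied when building the result with repeated `+=`:
    // each step copies the whole accumulated string plus the new line.
    public static long copiedByConcat(int lines, int lineLength) {
        long copied = 0;
        long currentLength = 0;
        for (int i = 0; i < lines; i++) {
            currentLength += lineLength;
            copied += currentLength; // old content + appended line
        }
        return copied;
    }

    // Characters copied with a growable buffer that doubles when full.
    // initialCapacity must be positive, otherwise doubling never grows it.
    public static long copiedByBuilder(int lines, int lineLength, int initialCapacity) {
        long copied = 0;
        long capacity = initialCapacity;
        long length = 0;
        for (int i = 0; i < lines; i++) {
            while (length + lineLength > capacity) {
                copied += length; // move existing data into the bigger buffer
                capacity *= 2;
            }
            copied += lineLength; // copy the appended line itself
            length += lineLength;
        }
        return copied;
    }

    public static void main(String[] args) {
        int lines = 1661;     // line count from the question
        int lineLength = 80;  // assumed average line length
        System.out.println("concat:  " + copiedByConcat(lines, lineLength));
        System.out.println("builder: " + copiedByBuilder(lines, lineLength, 16));
    }
}
```

In this model the concatenation count grows with the square of the number of lines, while the builder count stays within a small constant factor of the final string length, which matches the shape of the gap in the measurements.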
See my article on string concatenation for details. It's a very old article, so it predates StringBuilder, but the fundamentals haven't changed. (Basically, StringBuilder is like StringBuffer, but without synchronization.)
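As a small illustration of that last remark, the two classes can be used interchangeably for this kind of single-threaded appending. The class and method names below are hypothetical, chosen only for this sketch.

```java
public class BuilderVsBuffer {

    // Unsynchronized variant: the usual choice when only one thread appends.
    public static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    // Same calls, but StringBuffer's methods are synchronized.
    public static String joinWithBuffer(String[] parts) {
        StringBuffer sb = new StringBuffer();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"Lorem ", "ipsum ", "dolor"};
        System.out.println(joinWithBuilder(parts));
        System.out.println(joinWithBuffer(parts));
    }
}
```

Both methods return the same string; the only difference is the locking that StringBuffer performs on each call.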