I am doing a simple line count on an InputStream (counting newline bytes, value 10):

for (int i = 0; i < readBytes ; i++) {
    if ( b[ i + off ] == 10 ) {                     // New Line (10)
        rowCount++;
    }
}


Can I make this faster, without iterating one byte at a time?
Ideally I am looking for a class that can use CPU-specific instructions (SIMD / SSE).

Full code:

@Override
public int read(byte[] b, int off, int len) throws IOException {

    int readBytes = in.read(b, off, len);

    for (int i = 0; i < readBytes ; i++) {
        hadBytes = true;                                // at least once we read something
        lastByteIsNewLine = false;
        if ( b[ i + off ] == 10 ) {                     // New Line (10)
            rowCount++;
            lastByteIsNewLine = (i == readBytes - 1);   // last byte in buffer was the newline
        }
    }

    if ( hadBytes && readBytes == -1 && ! lastByteIsNewLine ) {   // file is not empty + EOF + last byte was not NewLine
        rowCount++;
    }

    return readBytes;
}

Best answer

On my system, just moving the lastByteIsNewLine and hadBytes bookkeeping out of the loop gives an improvement of about 10%*:

  public int read(byte[] b, int off, int len) throws IOException {

    int readBytes = in.read(b, off, len);

    for (int i = 0; i < readBytes ; i++) {
      if ( b[ i + off ] == 10 ) {
        rowCount++;
      }
    }
    if (readBytes > 0) {                              // leave both flags untouched on EOF (-1)
      hadBytes = true;
      lastByteIsNewLine = (b[off + readBytes - 1] == 10);
    }

    if ( hadBytes && readBytes == -1 && ! lastByteIsNewLine ) {
      rowCount++;
    }

    return readBytes;
  }


*6000 ms vs 6700 ms for 1,000 iterations over a 10 MB buffer read from a ByteArrayInputStream filled with arbitrary text.
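
On the "without iterating one byte at a time" part of the question: plain Java has no direct SSE intrinsics, but one possible approach is to process eight bytes per step with ordinary long arithmetic (sometimes called SWAR). The following is only a sketch of that idea, not something I benchmarked. The class name SwarNewlineCounter and both method names are made up for illustration, and whether it beats the simple loop depends on the JVM and the data, so measure before adopting it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original question or answer.
final class SwarNewlineCounter {

    private static final long LOW_7_BITS = 0x7F7F7F7F7F7F7F7FL; // low 7 bits of every byte
    private static final long NEWLINES   = 0x0A0A0A0A0A0A0A0AL; // byte 10 repeated 8 times

    /** Counts bytes equal to 10 in b[off .. off+len), eight bytes per step. */
    static int countNewlines(byte[] b, int off, int len) {
        ByteBuffer buf = ByteBuffer.wrap(b, off, len);
        int count = 0;
        while (buf.remaining() >= Long.BYTES) {
            long x = buf.getLong() ^ NEWLINES;        // newline bytes become 0x00
            long t = (x & LOW_7_BITS) + LOW_7_BITS;   // bit 7 set where low 7 bits were non-zero
            long zeroMarks = ~(t | x | LOW_7_BITS);   // 0x80 exactly in bytes that were 0x00
            count += Long.bitCount(zeroMarks);        // one set bit per newline
        }
        while (buf.hasRemaining()) {                  // fewer than 8 bytes left
            if (buf.get() == 10) {
                count++;
            }
        }
        return count;
    }

    /** Reference implementation: the plain loop from the question. */
    static int countNewlinesSimple(byte[] b, int off, int len) {
        int count = 0;
        for (int i = 0; i < len; i++) {
            if (b[i + off] == 10) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] data = "first\nsecond line\n\nlast line without newline"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println("swar:   " + countNewlines(data, 0, data.length));
        System.out.println("simple: " + countNewlinesSimple(data, 0, data.length));
    }
}
```

Inside read(), the counting loop would then become something like rowCount += SwarNewlineCounter.countNewlines(b, off, readBytes), guarded by readBytes > 0. The hadBytes and lastByteIsNewLine handling stays outside the loop as in the answer above.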

09-04 08:47