Problem description
I'm new to concurrent programming in Java.
I need to read, analyze and process an extremely fast-growing logfile, so I have to be fast. My idea was to read the file line by line and, upon matching a relevant line, pass it to a separate thread that does further processing on it. I called these threads "IOThread" in the following example code.
My problem is that the BufferedReader readLine() call in IOThread.run() apparently never returns. What is a working way to read the stream inside the thread? Are there any better approaches than the one below?
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

class IOThread extends Thread {
    private InputStream is;
    private int t;

    public IOThread(InputStream is, int t) {
        this.is = is;
        this.t = t;
        System.out.println("iothread<" + t + ">.init");
    }

    public void run() {
        try {
            System.out.println("iothread<" + t + ">.run");
            String line;
            BufferedReader streamReader = new BufferedReader(new InputStreamReader(is));
            while ((line = streamReader.readLine()) != null) {
                System.out.println("iothread<" + t + "> got line " + line);
            }
            System.out.println("iothread " + t + " end run");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

public class Stm {
    public Stm(String filePath) {
        System.out.println("start");
        try {
            BufferedReader reader = new BufferedReader(new FileReader(filePath));
            PipedOutputStream po1 = new PipedOutputStream();
            PipedOutputStream po2 = new PipedOutputStream();
            PipedInputStream pi1 = new PipedInputStream(po1);
            PipedInputStream pi2 = new PipedInputStream(po2);
            IOThread it1 = new IOThread(pi1, 1);
            IOThread it2 = new IOThread(pi2, 2);
            it1.start();
            it2.start();
            // it1.join();
            // it2.join();
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println("got line " + line);
                if (line.contains("aaa")) {
                    System.out.println("passing to thread 1: " + line);
                    po1.write(line.getBytes());
                } else if (line.contains("bbb")) {
                    System.out.println("passing to thread 2: " + line);
                    po2.write(line.getBytes());
                }
            }
            reader.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new Stm(args[0]);
    }
}
An example input file looks like this:
line 1
line 2
line 3 aaa ...
line 4
line 5 bbb ...
line 6 aaa ...
line 7
line 8 bbb ...
line 9 bbb ...
line 10
Call the above code with the filename of the input file as the argument.
Recommended answer
IMHO you have got it backwards. Create multiple threads for the processing, not for reading data from the file. Reading from a file is bottlenecked on I/O anyway, so having multiple reader threads won't make any difference. The simplest solution is to read lines as fast as you can in a single thread and store them in a shared queue. This queue can then be accessed by any number of threads to do the relevant processing.
This way, you can actually do the processing concurrently while the I/O (reader) thread is busy reading or waiting for data. If possible, keep the logic in the reader thread to a minimum: just read the lines and let the worker threads do the real heavy lifting (pattern matching, further processing, etc.). Just go with a thread-safe queue and you should be kosher.
Edit: Use some variant of BlockingQueue, either array-based or linked-list-based.
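A minimal sketch of the reader-plus-workers layout described above, assuming a bounded LinkedBlockingQueue and a "poison pill" sentinel to shut the workers down. The class name QueueSketch, the process method, the queue capacity and the per-pattern counting are all made up for illustration; real processing would replace the counting.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueSketch {
    // Hypothetical sentinel: a distinct String object that no line read from
    // a file can ever be identical to, so workers can compare by identity.
    private static final String POISON = new String("<<EOF>>");

    // Reads lines on the calling thread and hands them to `workers` threads
    // through a shared queue. Returns how many lines matched each pattern.
    static Map<String, Integer> process(BufferedReader reader, int workers)
            throws IOException, InterruptedException {
        // Bounded, so a fast reader cannot fill memory if workers fall behind.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1024);
        Map<String, Integer> counts = new ConcurrentHashMap<>();

        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    while (true) {
                        String line = queue.take();   // blocks until a line arrives
                        if (line == POISON) {         // identity check on purpose
                            break;
                        }
                        // Matching happens in the worker, not in the reader.
                        if (line.contains("aaa")) {
                            counts.merge("aaa", 1, Integer::sum);
                        } else if (line.contains("bbb")) {
                            counts.merge("bbb", 1, Integer::sum);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }

        // Reader side: no logic, just read and enqueue.
        String line;
        while ((line = reader.readLine()) != null) {
            queue.put(line);                          // blocks while the queue is full
        }
        // One sentinel per worker so every worker terminates.
        for (int i = 0; i < workers; i++) {
            queue.put(POISON);
        }
        for (Thread t : pool) {
            t.join();
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // With a file argument, read that file; otherwise use a small sample.
        Reader source = args.length > 0
                ? new FileReader(args[0])
                : new StringReader(String.join("\n",
                        "line 1", "line 3 aaa ...", "line 5 bbb ...", "line 6 aaa ..."));
        try (BufferedReader reader = new BufferedReader(source)) {
            System.out.println(process(reader, 2));
        }
    }
}
```

Unlike the piped streams in the question, whole String objects travel through the queue, so no line terminators or stream closing are involved. For a file that keeps growing, the reader loop would have to keep polling after reaching end of file instead of sending the sentinels; that part is left out here.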