Problem description
I have created a parsing library that accepts a provided input and returns a stream of Records. A program then calls this library and processes the results. In my case, my program is using something like
recordStream.forEach(r -> insertIntoDB(r));
One of the types of input that can be provided to the parsing library is a flat file, which may have a header row. As such, the parsing library can be configured to skip a header row. If a header row is configured, it adds a skip(n) element to the return, e.g.
Files.lines(input).skip(1).parallel().map(r -> createRecord(r));
The parsing library returns the resulting stream.
But it seems that skip, parallel and forEach do not play nicely together. The end programmer must instead invoke forEachOrdered, but it is poor design to put this requirement on the programmer and expect them to know they must use forEachOrdered when the input is a file with a header row.
How can I enforce the ordered requirement myself when necessary, within the construction of the returned stream chain, to return a fully functional stream to the program writer, instead of a stream with hidden limitations? Is the answer to wrap the stream in another stream?
Recommended answer
forEachOrdered is necessary not because of the skip(), but because your Stream is parallel. Even if the stream is parallel, the stream will skip the first element, as indicated in the documentation.
It's clearly documented that forEach doesn't necessarily respect the order. Not using forEachOrdered when you care about the order is just a misuse of the Stream API.
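To make the difference concrete, here is a minimal sketch. The class name SkipOrderDemo, its helper methods, and the in-memory sample lines are assumptions made for illustration; they stand in for the file input from the question. Both variants drop the header, but only forEachOrdered promises to hand the remaining rows over in encounter order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

public class SkipOrderDemo {

    // Hypothetical stand-in for the lines of a flat file with one header row.
    static Stream<String> lines() {
        return Stream.of("HEADER", "row1", "row2", "row3", "row4");
    }

    // forEachOrdered processes elements in encounter order, even on a parallel stream.
    static List<String> collectOrdered() {
        List<String> out = Collections.synchronizedList(new ArrayList<>());
        lines().skip(1).parallel().forEachOrdered(out::add);
        return out;
    }

    // forEach on a parallel stream may visit the same elements in any order.
    static List<String> collectUnordered() {
        List<String> out = Collections.synchronizedList(new ArrayList<>());
        lines().skip(1).parallel().forEach(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println("forEachOrdered: " + collectOrdered());
        System.out.println("forEach:        " + collectUnordered());
    }
}
```

The list printed for forEach can differ from run to run, which is exactly the behavior the documentation allows.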
I would not return a parallel stream from the library. I would return a sequential one (where forEach would respect the order), and let the caller call parallel() and assume the consequences if it wants to.
Using a parallel stream by default is a bad idea.
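One way this suggestion could look is sketched below. The names SequentialParser, Record, and parse are hypothetical, and the sketch takes a Stream of lines rather than a file path so that it runs without any input file; a real library would presumably start from Files.lines(input). The library leaves the stream sequential, and a caller that wants parallelism asks for it explicitly and picks an order-preserving terminal operation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SequentialParser {

    // Hypothetical record type; the Record class from the question is not shown.
    static final class Record {
        final String value;
        Record(String value) { this.value = value; }
    }

    // Hypothetical library entry point: skips the header rows, never calls parallel().
    static Stream<Record> parse(Stream<String> lines, int headerRows) {
        return lines.skip(headerRows).map(Record::new);
    }

    // Hypothetical sample input with one header row.
    static Stream<String> sampleLines() {
        return Stream.of("HEADER", "a", "b", "c");
    }

    // Caller 1: plain forEach on the sequential stream returned by the library.
    static List<String> loadSequentially() {
        List<String> inserted = new ArrayList<>();
        parse(sampleLines(), 1).forEach(r -> inserted.add(r.value));
        return inserted;
    }

    // Caller 2: opts into parallelism and uses an order-preserving collect.
    static List<String> loadInParallel() {
        return parse(sampleLines(), 1)
                .parallel()
                .map(r -> r.value)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(loadSequentially());
        System.out.println(loadInParallel());
    }
}
```

With this split, the decision to go parallel, and the responsibility for choosing forEachOrdered or an ordered collect, sits with the caller who made that decision.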