问题描述
我想使用 PubSub 订阅
作为有界源,以最大限度地降低流管道一直运行的成本.在 具有无限源的批处理管道但没有解决方案.我遇到了这个答案 What PipelineRunners 说我们可以将 UnboundedSource
转换为 BoundedSource
以使用 withMaxNumRecords
进行测试.是否可以在此处使用 PubSubIO 作为输入,或者是否可以将 PubSubIO
读取unboundedSource
?
I wanted to use PubSub subscription
as a bounded source to minimise cost of streaming pipeline running all the time.Similar question was asked before Batch Pipeline with Unbounded Source but no solution. I came across this answer What PipelineRunners which says we can turn the UnboundedSource
into a BoundedSource
for testing using withMaxNumRecords
.Is it possible to use PubSubIO as input here or Is there a way to convert PubSubIO
read to unboundedSource
?
UnboundedSource<String> unboundedSource = .; // How to Use PubSub here?
PCollection<String> boundedPubsubCollection =
p.apply(Read.from(unboundedSource).withMaxNumRecords(10));
推荐答案
目前 PubSubIO 并不很好地支持这一点,而且对于Beam 模型"来说有点奇怪.一些选项:
This is currently not well supported by PubSubIO, and it's a little odd for 'the Beam model'. Some options:
- 您是否尝试过启动管道并定期排空管道?
- 如果这不起作用,您应该在 Beam 邮件列表或 问题跟踪器.
这篇关于有没有办法将 PubSubIO 读取转换为 UnboundedSource 源的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!