本文介绍了有没有办法将 PubSubIO 读取转换为 UnboundedSource 源的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想使用 PubSub 订阅 作为有界源,以最大限度地降低流管道一直运行的成本.在 具有无限源的批处理管道但没有解决方案.我遇到了这个答案 What PipelineRunners 说我们可以将 UnboundedSource 转换为 BoundedSource 以使用 withMaxNumRecords 进行测试.是否可以在此处使用 PubSubIO 作为输入,或者是否可以将 PubSubIO 读取unboundedSource?

I wanted to use PubSub subscription as a bounded source to minimise cost of streaming pipeline running all the time.Similar question was asked before Batch Pipeline with Unbounded Source but no solution. I came across this answer What PipelineRunners which says we can turn the UnboundedSource into a BoundedSource for testing using withMaxNumRecords.Is it possible to use PubSubIO as input here or Is there a way to convert PubSubIO read to unboundedSource?

UnboundedSource<String> unboundedSource  = .; // How to Use PubSub here?
PCollection<String> boundedPubsubCollection =
    p.apply(Read.from(unboundedSource).withMaxNumRecords(10));

推荐答案

目前 PubSubIO 并不很好地支持这一点,而且对于Beam 模型"来说有点奇怪.一些选项:

This is currently not well supported by PubSubIO, and it's a little odd for 'the Beam model'. Some options:

  1. 您是否尝试过启动管道并定期排空管道?
  2. 如果这不起作用,您应该在 Beam 邮件列表或 问题跟踪器.

这篇关于有没有办法将 PubSubIO 读取转换为 UnboundedSource 源的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

06-23 01:27