This article covers how to "mark" records for reprocessing when AWS KCL processRecords fails.

Problem description

I'm working with AWS DynamoDB Streams, whose API is based on the AWS KCL.

Sometimes I receive records that I fail to process, and I want those records to be available later so that they can be reprocessed. For instance, I'm trying to save them to a remote DB and I occasionally run into network issues.

My questions are:

  1. Can I use the Checkpointer in some way to indicate that I didn't handle the records?
  2. Should I just avoid calling Checkpointer.checkpoint()? Will that have any effect if I still call it in the next call of processRecords?
  3. Is there an exception I can use for that purpose?


Recommended answer

KCL does not provide this sort of built-in redrive mechanism. Once processRecords returns (whether it threw an exception or returned successfully), KCL considers those records processed and moves on, even if your processing failed internally.

If you want to reprocess some records at a later point, you need to capture those records and store them somewhere else for a later reprocessing attempt (with the obvious caveat that they will be processed out of order relative to the rest of the stream).

The simplest solution is to have your record processor logic identify the failed records (before returning to KCL) and send them to an SQS queue. That way the records aren't lost, and they're available for processing at your leisure (or by another process consuming the SQS queue, possibly with a DLQ mechanism for handling repeated failures and give-up scenarios).
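A minimal sketch of that idea in Python, under stated assumptions: records are modelled as plain dictionaries rather than real KCL record objects, and `handle` and `send_to_retry_queue` are hypothetical callables supplied by the caller. In a real deployment `send_to_retry_queue` might wrap an SQS client's send call; here it is left abstract so the example runs on its own.

```python
import json
from typing import Any, Callable, Dict, Iterable, List

# Stand-in for a stream record; real KCL records are library objects.
Record = Dict[str, Any]


def process_records(
    records: Iterable[Record],
    handle: Callable[[Record], None],
    send_to_retry_queue: Callable[[str], None],
) -> List[Record]:
    """Process each record and divert failures to a retry queue.

    Returns the records that failed, so the caller can log or count them.
    """
    failed: List[Record] = []
    for record in records:
        try:
            handle(record)
        except Exception as exc:  # broad on purpose; narrow it to suit your redrive rules
            # Capture the record before returning to the library,
            # because it will not be delivered again.
            message = json.dumps({"record": record, "error": str(exc)})
            send_to_retry_queue(message)  # if this raises, the error propagates to the caller
            failed.append(record)
    return failed


if __name__ == "__main__":
    retry_queue: List[str] = []  # in-memory stand-in for an SQS queue

    def save_to_db(record: Record) -> None:
        # Hypothetical handler that simulates a network failure for one record.
        if record["id"] == 2:
            raise ConnectionError("network issue")

    failed = process_records(
        [{"id": 1}, {"id": 2}, {"id": 3}], save_to_db, retry_queue.append
    )
    print(len(failed), len(retry_queue))
```

Records 1 and 3 are handled normally, while record 2 ends up in the retry queue for a separate consumer to pick up later.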

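To answer your specific questions:

  1. No. A checkpoint just says: "I've gotten this far, don't look at anything before the checkpoint."
  2. Think of the checkpoint as a global state. Once it's set, it covers everything before it. You also don't need to checkpoint on every call to processRecords; you can do it every X seconds, every Y records, and so on.
  3. Not at the KCL level. You can use special exception types internally and catch them at the outer level of your processRecords before returning to Kinesis. Or you could just catch all exceptions; it's up to you and your specific requirements for the redrive logic.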
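Points 2 and 3 can be sketched together. This is an illustration under assumptions, not KCL's actual API: `ReprocessLater` is a hypothetical internal exception type, and `handle`, `redrive`, and `checkpoint` are caller-supplied callables (in a real processor, `checkpoint` would delegate to the checkpointer the library hands you).

```python
from typing import Any, Callable, Iterable


class ReprocessLater(Exception):
    """Hypothetical internal exception: this record should be redriven later."""


class RecordProcessor:
    """Catches a special exception per record and checkpoints every N records."""

    def __init__(
        self,
        handle: Callable[[Any], None],
        redrive: Callable[[Any], None],
        checkpoint: Callable[[], None],
        checkpoint_every: int = 100,
    ) -> None:
        self._handle = handle
        self._redrive = redrive
        self._checkpoint = checkpoint
        self._checkpoint_every = checkpoint_every
        self._since_checkpoint = 0

    def process_records(self, records: Iterable[Any]) -> None:
        for record in records:
            try:
                self._handle(record)
            except ReprocessLater:
                # Caught at the outer level, before control returns to the library.
                self._redrive(record)
            self._since_checkpoint += 1
        # Checkpoint only once enough records have accumulated. The checkpoint
        # covers everything seen so far, including records handed to redrive.
        if self._since_checkpoint >= self._checkpoint_every:
            self._checkpoint()
            self._since_checkpoint = 0
```

Any exception other than `ReprocessLater` still propagates out of `process_records` in this sketch; whether to catch everything instead depends on your redrive requirements.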

