How to ensure Storm does not write a message to a local file twice

Question

I built a topology that reads messages from Kafka, greps them for some keywords, and, when a message matches, writes it to a local file.

I use storm-kafka's OpaqueTridentKafkaSpout to ensure tuples are neither lost nor duplicated, but consider this situation: while writing messages to the local file, an error occurs (for example, the disk runs out of space). At that moment some messages have already been written to the local file and others have not; if the spout resends the messages, the ones already on disk will be written twice.

How should I handle this?

Recommended answer

It's simple. The code that writes to the file needs to do the following:

1) Ack the tuple - only if the write to the file succeeded.
2) Fail the tuple - if the write to the file did NOT succeed.

The Kafka spout will NOT resend tuples that were ack'd. Failed tuples will be replayed by the spout.
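The ack/fail loop described above can be sketched as a small, self-contained simulation. The `Sink` class, the replay queue, and the `failNextWrite` flag are hypothetical stand-ins for the real file writer and the Kafka spout; in an actual bolt you would call `collector.ack(tuple)` and `collector.fail(tuple)` inside `execute()` instead.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class AckFailSketch {
    // Simulated local-file sink whose next write can be made to fail,
    // mimicking an error such as running out of disk space.
    static class Sink {
        final List<String> file = new ArrayList<>();
        boolean failNextWrite = false;

        void write(String msg) throws IOException {
            if (failNextWrite) {
                failNextWrite = false;
                throw new IOException("no space left on device");
            }
            file.add(msg);
        }
    }

    // Ack only after a successful write; on failure, re-queue the tuple
    // so the "spout" replays it. Returns the final file contents.
    static List<String> run() {
        Sink sink = new Sink();
        Deque<String> pending = new ArrayDeque<>(); // plays the role of the spout
        pending.add("hello");
        sink.failNextWrite = true;                  // first write attempt fails

        while (!pending.isEmpty()) {
            String tuple = pending.poll();
            try {
                sink.write(tuple);                  // success = ack: spout will not resend
            } catch (IOException e) {
                pending.add(tuple);                 // failure = fail: spout replays the tuple
            }
        }
        return sink.file;
    }

    public static void main(String[] args) {
        // The message reaches the file exactly once despite the failed attempt.
        System.out.println(run()); // prints "[hello]"
    }
}
```

Because the tuple is only ack'd after the write succeeds, a failed attempt leads to a replay rather than a loss, and a successful attempt is never replayed, so the message lands in the file exactly once.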
