Question
According to this:
Why does a Flume source need to recognize or understand the format of the message, when all it does is forward the message to one of the channels?
Recommended answer
As far as I understand, Flume encapsulates the data being transferred in an event packet made up of a header and a payload (the transferred data). From the documentation:
This comes just before the part of the documentation you quoted.
The format you specify is the format of the event packet, not the format of your data.
Let's suppose you have this agent:
plain_to_avro_translator.sources = plain-source avro-source
plain_to_avro_translator.sinks = avro-sink local-file-sink
plain_to_avro_translator.channels = mem-channel1 mem-channel2
plain_to_avro_translator.sources.plain-source.channels = mem-channel1
plain_to_avro_translator.sources.plain-source.type = exec
plain_to_avro_translator.sources.plain-source.restart = true
plain_to_avro_translator.sources.plain-source.restartThrottle = 40000
plain_to_avro_translator.sources.plain-source.command = cat /home/user/data.log
plain_to_avro_translator.sinks.avro-sink.channel = mem-channel1
plain_to_avro_translator.sinks.avro-sink.type = thrift
plain_to_avro_translator.sinks.avro-sink.hostname = 192.168.200.43
plain_to_avro_translator.sinks.avro-sink.port = 6000
plain_to_avro_translator.channels.mem-channel1.type = memory
plain_to_avro_translator.channels.mem-channel1.capacity = 100
plain_to_avro_translator.channels.mem-channel1.transactionCapacity = 100
plain_to_avro_translator.sources.avro-source.channels = mem-channel2
plain_to_avro_translator.sources.avro-source.type = thrift
plain_to_avro_translator.sources.avro-source.bind = 0.0.0.0
plain_to_avro_translator.sources.avro-source.port = 6000
plain_to_avro_translator.channels.mem-channel2.type = memory
plain_to_avro_translator.channels.mem-channel2.capacity = 100
plain_to_avro_translator.channels.mem-channel2.transactionCapacity = 100
plain_to_avro_translator.sinks.local-file-sink.channel = mem-channel2
plain_to_avro_translator.sinks.local-file-sink.type = file_roll
plain_to_avro_translator.sinks.local-file-sink.sink.directory = /home/user/flume_output
This works without problems and does not depend on the format of data.log (you can write whatever you need, in whatever format). If you set the avro-sink type to avro instead of thrift, you will get errors from avro-source, because it expects events in Thrift format.
Both the sink and the source need to know how to parse the event packet.
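To make the distinction between the event packet and the data concrete, here is a minimal conceptual sketch in Python. It is not Flume's API: the `Event` class and the two envelope codecs are hypothetical stand-ins for two wire formats such as Avro and Thrift. The body is treated as opaque bytes, so any payload survives the trip, while decoding an envelope with the wrong codec fails.

```python
import json
import struct
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical stand-in for an event: headers plus an opaque body."""
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def encode_json_envelope(event: Event) -> bytes:
    # Envelope format A (illustrative): a JSON document with a hex-encoded body.
    doc = {"headers": event.headers, "body": event.body.hex()}
    return json.dumps(doc).encode("utf-8")


def decode_json_envelope(data: bytes) -> Event:
    doc = json.loads(data.decode("utf-8"))
    return Event(headers=doc["headers"], body=bytes.fromhex(doc["body"]))


def encode_binary_envelope(event: Event) -> bytes:
    # Envelope format B (illustrative): 4-byte big-endian header length,
    # then the headers as JSON, then the raw body bytes.
    header_bytes = json.dumps(event.headers).encode("utf-8")
    return struct.pack(">I", len(header_bytes)) + header_bytes + event.body


def decode_binary_envelope(data: bytes) -> Event:
    (header_len,) = struct.unpack(">I", data[:4])
    headers = json.loads(data[4:4 + header_len].decode("utf-8"))
    return Event(headers=headers, body=data[4 + header_len:])


if __name__ == "__main__":
    # The body is arbitrary bytes; neither codec inspects its format.
    event = Event(headers={"host": "a"}, body=b"\x00\xffany format\n")

    # Matching codecs on both ends: the payload arrives unchanged.
    print(decode_binary_envelope(encode_binary_envelope(event)) == event)

    # Mismatched codecs: the receiver cannot parse the envelope.
    try:
        decode_json_envelope(encode_binary_envelope(event))
    except ValueError as exc:
        print("envelope mismatch:", type(exc).__name__)
```

The same reasoning applies to the agent above: the sending sink and the receiving source must agree on the envelope, while the contents of data.log are never interpreted.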
I hope I got this right. If I am wrong, please correct me.