Problem description
I am testing Flume to load data into HBase, and because of the speed gap between source and sink I am thinking about loading data in parallel using Flume's selector and interceptor.
So, what I want to do with Flume, within a single source-channel-sink flow, is:
- create an Event header with an interceptor of type regex_extractor
- multiplex Events by that header to two or more channels with a selector of type multiplexing
I tried the configuration below.
agent.sources = tailsrc
agent.channels = mem1 mem2
agent.sinks = std1 std2
agent.sources.tailsrc.type = exec
agent.sources.tailsrc.command = tail -F /home/flumeuser/test/in.txt
agent.sources.tailsrc.batchSize = 1
agent.sources.tailsrc.interceptors = i1
agent.sources.tailsrc.interceptors.i1.type = regex_extractor
agent.sources.tailsrc.interceptors.i1.regex = ^(\\d)
agent.sources.tailsrc.interceptors.i1.serializers = t1
agent.sources.tailsrc.interceptors.i1.serializers.t1.name = type
agent.sources.tailsrc.selector.type = multiplexing
agent.sources.tailsrc.selector.header = type
agent.sources.tailsrc.selector.mapping.1 = mem1
agent.sources.tailsrc.selector.mapping.2 = mem2
agent.sinks.std1.type = file_roll
agent.sinks.std1.channel = mem1
agent.sinks.std1.batchSize = 1
agent.sinks.std1.sink.directory = /var/log/flumeout/1
agent.sinks.std1.rollInterval = 0
agent.sinks.std2.type = file_roll
agent.sinks.std2.channel = mem2
agent.sinks.std2.batchSize = 1
agent.sinks.std2.sink.directory = /var/log/flumeout/2
agent.sinks.std2.rollInterval = 0
agent.channels.mem1.type = memory
agent.channels.mem1.capacity = 100
agent.channels.mem2.type = memory
agent.channels.mem2.capacity = 100
But it doesn't work!
When the selector part is removed, there are some interceptor debug messages in Flume's log. But when the selector and the interceptor are used together, nothing is logged.
Is there a wrong expression, or something I missed?
Thanks for reading. :)
Recommended answer
I found it.
The Flume log contained the warning message below.
2013-10-10 16:34:20,514 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:571)] Removed tailsrc due to Failed to configure component!
The source had not been bound to any channel, so I added the line below:
agent.sources.tailsrc.channels = mem1 mem2
And then it works!
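For reference, the source section with that line in place might look like the sketch below. It reuses only the names from the question (tailsrc, mem1, mem2, i1, t1). The commented-out selector.default line is an optional addition that was not in the original setup; it assumes you want events whose type header is neither 1 nor 2 to still land in a channel.

```properties
# Bind the source to its channels. Without this line the source fails
# validation and is removed ("Removed tailsrc due to Failed to configure component!").
agent.sources.tailsrc.channels = mem1 mem2

# Interceptor: copy the leading digit of each line into the "type" header.
agent.sources.tailsrc.interceptors = i1
agent.sources.tailsrc.interceptors.i1.type = regex_extractor
agent.sources.tailsrc.interceptors.i1.regex = ^(\\d)
agent.sources.tailsrc.interceptors.i1.serializers = t1
agent.sources.tailsrc.interceptors.i1.serializers.t1.name = type

# Selector: route on the "type" header set by the interceptor.
agent.sources.tailsrc.selector.type = multiplexing
agent.sources.tailsrc.selector.header = type
agent.sources.tailsrc.selector.mapping.1 = mem1
agent.sources.tailsrc.selector.mapping.2 = mem2

# Optional (an assumption, not part of the original config):
# fallback channel for events whose header matches no mapping.
# agent.sources.tailsrc.selector.default = mem1
```

The selector only decides which of the source's channels receives an event, so it is likely that every channel named in a mapping also needs to appear in the source's channels list, as it does here.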