How do I set up an HTTP source to test a Flume setup?

Problem description

I am a newbie to Flume and Hadoop. We are developing a BI module where we can store all the logs from different servers in HDFS.

For this I am using Flume. I just started trying it out. I successfully created a node, but now I want to set up an HTTP source and a sink that will write incoming HTTP requests to a local file.

Any suggestions?

Thanks in advance.

Solution

Hopefully this helps you get started. I'm having some problems testing this on my machine and don't have time to fully troubleshoot it right now, but I'll get to that...

Assuming you have Flume up and running right now, this is roughly what your flume.conf file needs to look like to use an HTTP POST source and a local file sink (note: this goes to a local file, not HDFS):

########## NEW AGENT ########## 
# flume-ng agent -f /etc/flume/conf/flume.httptest.conf -n httpagent
# 

# slagent = SysLogAgent
###############################
httpagent.sources = http-source
httpagent.sinks = local-file-sink
httpagent.channels = ch3

# Define / Configure Source (multiport seems to support newer "stuff")
###############################
httpagent.sources.http-source.type = org.apache.flume.source.http.HTTPSource
httpagent.sources.http-source.channels = ch3
httpagent.sources.http-source.port = 81


# Local File Sink
###############################
httpagent.sinks.local-file-sink.type = file_roll
httpagent.sinks.local-file-sink.channel = ch3
httpagent.sinks.local-file-sink.sink.directory = /root/Desktop/http_test
# file_roll options take the "sink." prefix; the interval is in seconds
httpagent.sinks.local-file-sink.sink.rollInterval = 5

# Channels
###############################
httpagent.channels.ch3.type = memory
httpagent.channels.ch3.capacity = 1000

Start Flume with the command on the second line. Tweak the config for your needs (port, sink.directory, and the roll interval especially). This is a pretty bare-minimum config file; there are more options available, so check out the Flume User Guide. As far as this goes, the agent starts and runs fine for me...
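Before starting the agent, it can help to sanity-check that every source and sink points at a declared channel, since a typo there is an easy mistake to make. The following is only a minimal sketch: parse_properties and check_wiring are hypothetical helper names, not part of Flume, and SAMPLE is a trimmed copy of the config above. It relies on the Flume convention that sources use the plural key "channels" while sinks use the singular key "channel".

```python
def parse_properties(text):
    """Parse simple "key = value" lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of wiring problems for the given agent (empty if none)."""
    problems = []
    declared = set(props.get(f"{agent}.channels", "").split())

    for source in props.get(f"{agent}.sources", "").split():
        # Sources use the plural key: one source may feed several channels.
        used = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not used:
            problems.append(f"source {source} has no channels")
        for channel in used:
            if channel not in declared:
                problems.append(f"source {source} uses undeclared channel {channel}")

    for sink in props.get(f"{agent}.sinks", "").split():
        # Sinks use the singular key: a sink drains exactly one channel.
        channel = props.get(f"{agent}.sinks.{sink}.channel")
        if channel is None:
            problems.append(f"sink {sink} has no channel")
        elif channel not in declared:
            problems.append(f"sink {sink} uses undeclared channel {channel}")

    return problems


# Trimmed copy of the config above, used here only as test input.
SAMPLE = """
httpagent.sources = http-source
httpagent.sinks = local-file-sink
httpagent.channels = ch3
httpagent.sources.http-source.type = org.apache.flume.source.http.HTTPSource
httpagent.sources.http-source.channels = ch3
httpagent.sources.http-source.port = 81
httpagent.sinks.local-file-sink.type = file_roll
httpagent.sinks.local-file-sink.channel = ch3
httpagent.sinks.local-file-sink.sink.directory = /root/Desktop/http_test
httpagent.channels.ch3.type = memory
httpagent.channels.ch3.capacity = 1000
"""

print(check_wiring(parse_properties(SAMPLE), "httpagent"))
```

An empty list means the wiring is consistent. To check a real file, read it with open(path).read() and pass the agent name you give to -n.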

Here's what I don't have time to test. The HTTP source, by default, accepts data in JSON format. You -should- be able to test this agent by sending a cURL request something like this:

curl -X POST -H 'Content-Type: application/json; charset=UTF-8' -d '{"username":"xyz","password":"123"}' http://yourdomain.com:81/

-X sets the request method to POST, -H sends a header, -d sends the data (valid JSON), and the last argument is the host:port to send it to. The problem for me is that I get an error:

WARN http.HTTPSource: Received bad request from client. org.apache.flume.source.http.HTTPBadRequestException: Request has invalid JSON Syntax.

in my Flume client. Invalid JSON? So something is being sent wrong. The fact that an error is popping up, though, shows that the Flume source is receiving data. Whatever you have that's POSTing should work as long as it sends a valid format.
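The "invalid JSON syntax" warning is most likely caused by the shape of the payload rather than by cURL itself. According to the Flume User Guide, the default JSONHandler expects a JSON array of events, each with a "headers" map and a string "body", even when only one event is sent, so a bare object like {"username": ...} is rejected. Below is a minimal sketch of building such a payload in Python. build_payload and post_events are hypothetical helper names, and the "source" header and the localhost URL are assumptions made for illustration.

```python
import json
import urllib.request


def build_payload(bodies, headers=None):
    """Wrap each body string in the event structure JSONHandler expects.

    The result is always a JSON array, even for a single event.
    """
    headers = headers or {}
    events = [{"headers": dict(headers), "body": body} for body in bodies]
    return json.dumps(events)


def post_events(url, payload, timeout=5):
    """POST the payload to the Flume HTTP source and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# The original test data, serialized as the string body of one event.
body = json.dumps({"username": "xyz", "password": "123"})
payload = build_payload([body], headers={"source": "curl-test"})
print(payload)

# Once the agent is running, send it (host and port are assumptions):
# print(post_events("http://localhost:81/", payload))
```

The printed string can also be passed to the cURL command above as the -d argument. If the request is accepted, the event body should appear in a file under the configured sink.directory after the next roll.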


09-24 18:33