Problem description
I have the following in my agent.json:
{
  "cloudwatch.emitMetrics": true,
  "kinesis.endpoint": "",
  "firehose.endpoint": "",
  "flows": [
    {
      "filePattern": "/home/ec2-user/ETLdata/contracts/Delta.csv",
      "kinesisStream": "ETL-rawdata-stream",
      "partitionKeyOption": "RANDOM",
      "dataProcessingOptions": [
        {
          "optionName": "CSVTOJSON",
          "customFieldNames": ["field1", "field2"],
          "delimiter": ","
        }
      ]
    }
  ]
}
When I add the specified file to the folder, literally nothing happens. I only see the below in the logs. Why is the file not being parsed at all? Does anyone have any idea?
Update: it works when I set the file pattern to /tmp/delta.csv. It looks like a permission issue, but there are no errors in the logs.
Recommended answer
I had a similar issue and was able to solve it by doing the following:
- Move the data to be sent to the Kinesis Firehose stream (a bunch of CSV files) from ~/ec2-user/out-data to another directory:
mv *.csv /tmp/out-data
- Edit the agent.json file so that the agent starts reading at the beginning of the file. Here is my agent.json file:
{
  "cloudwatch.emitMetrics": true,
  "firehose.endpoint": "firehose.eu-west-1.amazonaws.com",
  "flows": [
    {
      "filePattern": "/tmp/out-data/trx_headers_2017*",
      "deliveryStream": "TestDeliveryStream",
      "initialPosition": "START_OF_FILE"
    }
  ]
}
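Moving the files under /tmp, as in the first step, may help because of permissions: in a default install the agent runs as its own service account (aws-kinesis-agent-user), and a home directory such as /home/ec2-user is often not traversable by other users. Here is a minimal Python sketch, not part of my original setup, that walks a path and reports which components the "others" permission bits would block. The function name world_access_problems is hypothetical. It only looks at mode bits, so it ignores group membership and ACLs, and it should be run as the file owner or root so it can stat every component.

```python
import os
import stat
import sys


def world_access_problems(file_path):
    """List path components that users outside the owner/group cannot use.

    Directories need the "others" execute bit to be traversed, and the file
    itself needs the "others" read bit. Only mode bits are inspected.
    """
    path = os.path.abspath(file_path)

    # Collect every ancestor directory, from the parent up to the root.
    ancestors = []
    current = os.path.dirname(path)
    while True:
        ancestors.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent

    problems = []
    # Report from the root downwards so the first blocker comes first.
    for directory in reversed(ancestors):
        if not os.stat(directory).st_mode & stat.S_IXOTH:
            problems.append((directory, "directory is not traversable by other users"))
    if not os.stat(path).st_mode & stat.S_IROTH:
        problems.append((path, "file is not readable by other users"))
    return problems


if __name__ == "__main__":
    # Usage: python check_perms.py /home/ec2-user/ETLdata/contracts/Delta.csv
    for target in sys.argv[1:]:
        for component, reason in world_access_problems(target):
            print(f"{component}: {reason}")
```

If this reports the home directory, either keep the data somewhere the agent can reach (as I did with /tmp/out-data) or loosen the permissions on the directories leading to the file.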
My guess is that your Delta.csv file is being written to, so the Kinesis agent is checking the end of the file and finding no new records. If you add the "initialPosition": "START_OF_FILE" fix, it will start parsing at the beginning of the file.
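To show what that change would look like for the configuration in the question, here is a small Python sketch that adds the setting to every flow that does not already have one. The helper name with_start_of_file is hypothetical, and the dictionary simply mirrors the question's agent.json. I have not run this against the asker's setup, so treat it as an illustration rather than a verified fix.

```python
import json


def with_start_of_file(config):
    """Return a copy of an agent config whose flows read from the start of the file."""
    # A JSON round trip gives a deep copy of JSON-compatible data.
    patched = json.loads(json.dumps(config))
    for flow in patched.get("flows", []):
        # Leave any explicitly configured initialPosition untouched.
        flow.setdefault("initialPosition", "START_OF_FILE")
    return patched


# The configuration from the question, copied as-is.
question_config = {
    "cloudwatch.emitMetrics": True,
    "kinesis.endpoint": "",
    "firehose.endpoint": "",
    "flows": [
        {
            "filePattern": "/home/ec2-user/ETLdata/contracts/Delta.csv",
            "kinesisStream": "ETL-rawdata-stream",
            "partitionKeyOption": "RANDOM",
            "dataProcessingOptions": [
                {
                    "optionName": "CSVTOJSON",
                    "customFieldNames": ["field1", "field2"],
                    "delimiter": ",",
                }
            ],
        }
    ],
}

# Print the patched configuration in a form that can be pasted into agent.json.
print(json.dumps(with_start_of_file(question_config), indent=2))
```

The printed JSON is the same configuration with "initialPosition": "START_OF_FILE" added to the flow. The original dictionary is left unchanged.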