Problem description
I have the following JSON input that I want to dump into Logstash (and eventually search/dashboard in Elasticsearch/Kibana).
{"vulnerabilities":[
{"ip":"10.1.1.1","dns":"z.acme.com","vid":"12345"},
{"ip":"10.1.1.2","dns":"y.acme.com","vid":"12345"},
{"ip":"10.1.1.3","dns":"x.acme.com","vid":"12345"}
]}
I'm using the following Logstash configuration:
input {
  file {
    path => "/tmp/logdump/*"
    type => "assets"
    codec => "json"
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch { host => localhost }
}
Output
{
"message" => "{\"vulnerabilities\":[\r",
"@version" => "1",
"@timestamp" => "2014-10-30T23:41:19.788Z",
"type" => "assets",
"host" => "av12612sn00-pn9",
"path" => "/tmp/logdump/stack3.json"
}
{
"message" => "{\"ip\":\"10.1.1.30\",\"dns\":\"z.acme.com\",\"vid\":\"12345\"},\r",
"@version" => "1",
"@timestamp" => "2014-10-30T23:41:19.838Z",
"type" => "assets",
"host" => "av12612sn00-pn9",
"path" => "/tmp/logdump/stack3.json"
}
{
"message" => "{\"ip\":\"10.1.1.31\",\"dns\":\"y.acme.com\",\"vid\":\"12345\"},\r",
"@version" => "1",
"@timestamp" => "2014-10-30T23:41:19.870Z",
"type" => "shellshock",
"host" => "av1261wag2sn00-pn9",
"path" => "/tmp/logdump/stack3.json"
}
{
"ip" => "10.1.1.32",
"dns" => "x.acme.com",
"vid" => "12345",
"@version" => "1",
"@timestamp" => "2014-10-30T23:41:19.884Z",
"type" => "assets",
"host" => "av12612sn00-pn9",
"path" => "/tmp/logdump/stack3.json"
}
Obviously Logstash is treating each line as an event, so it thinks {"vulnerabilities":[ is an event on its own, and I'm guessing the trailing commas on the two subsequent nodes mess up the parsing; the last node appears correct. How do I tell Logstash to parse the events inside the vulnerabilities array and to ignore the commas at the end of the lines?
Updated: 2014-11-05
Following Magnus' recommendations, I added the json filter and it's working perfectly. However, it would not parse the last line of the JSON correctly without specifying start_position => "beginning" in the file input block. Any ideas why not? I know it parses from the bottom up by default, but I would have anticipated that the mutate/gsub would handle this smoothly.
input {
  file {
    path => "/tmp/logdump/*"
    type => "assets"
    start_position => "beginning"
  }
}
filter {
  if [message] =~ /^\[?{"ip":/ {
    mutate {
      gsub => [
        "message", "^\[{", "{",
        "message", "},?\]?$", "}"
      ]
    }
    json {
      source => "message"
      remove_field => ["message"]
    }
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch { host => localhost }
}
Answer
You could skip the json codec and use a multiline filter to join the message into a single string that you can feed to the json filter.
filter {
  multiline {
    pattern => '^{"vulnerabilities":\['
    negate => true
    what => "previous"
  }
  json {
    source => "message"
  }
}
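As a side note not from the original answer: in newer Logstash versions the multiline filter has been deprecated in favour of the multiline codec, which does the same joining on the input side before the events reach the filter chain. A rough equivalent sketch:
input {
  file {
    path => "/tmp/logdump/*"
    type => "assets"
    codec => multiline {
      # start a new event when the opening line of the JSON document appears;
      # every other line is appended to the previous event
      pattern => '^{"vulnerabilities":\['
      negate => true
      what => "previous"
    }
  }
}
Either way, the joined message then gets handed to the json filter.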
However, this produces the following unwanted results:
{
"message" => "<omitted for brevity>",
"@version" => "1",
"@timestamp" => "2014-10-31T06:48:15.589Z",
"host" => "name-of-your-host",
"tags" => [
[0] "multiline"
],
"vulnerabilities" => [
[0] {
"ip" => "10.1.1.1",
"dns" => "z.acme.com",
"vid" => "12345"
},
[1] {
"ip" => "10.1.1.2",
"dns" => "y.acme.com",
"vid" => "12345"
},
[2] {
"ip" => "10.1.1.3",
"dns" => "x.acme.com",
"vid" => "12345"
}
]
}
Unless there's a fixed number of elements in the vulnerabilities array, I don't think there's much we can do with this (without resorting to the ruby filter).
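As a side note not part of the original answer: if your Logstash version has a split filter that can explode an array field into separate events, the nested result above can be broken apart without a ruby filter. Roughly, and untested against this data:
filter {
  multiline {
    pattern => '^{"vulnerabilities":\['
    negate => true
    what => "previous"
  }
  json {
    source => "message"
    remove_field => ["message"]
  }
  # emit one event per element of the vulnerabilities array;
  # each event keeps its element under the "vulnerabilities" field
  split {
    field => "vulnerabilities"
  }
}
You would probably still want to move the nested [vulnerabilities][ip] etc. fields to the top level afterwards, e.g. with a mutate rename.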
How about just applying the json filter to lines that look like what we want and dropping the rest? Your question doesn't make it clear whether the whole log looks like this, so this may not be so useful.
filter {
  if [message] =~ /^\s+{"ip":/ {
    # Remove trailing commas
    mutate {
      gsub => ["message", ",$", ""]
    }
    json {
      source => "message"
      remove_field => ["message"]
    }
  } else {
    drop {}
  }
}
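With the sample input, this should mean that the {"vulnerabilities":[ and ]} wrapper lines fail the conditional and are dropped, while each matching line has its trailing comma stripped and is parsed into its own event carrying the ip, dns, and vid fields.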