Shipping the MariaDB audit log with fluentd

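Notes:

The MariaDB audit log is the log written by MariaDB's audit plugin (server_audit).

The goal is to split each log entry into tab-separated fields.

The fluentd configuration file follows.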

<system>

  log_level error

</system>

<source>

  @type tail

  path /data/mysql_audit/*

  limit_recently_modified 86400
  open_on_every_update true

  tag mysql_audit

  read_from_head true

  pos_file /tmp/fluentd.pos

  <parse>

    @type multiline

    format_firstline /^\d{8}/

    format1 /^(?<Date>\d{8}) (?<Hour>\d{2}):(?<Min>\d{2}):(?<Sec>\d{2}),(?<host>[^,]+),(?<user>[^,]+),(?<ip>[^,]+),(?<connid>[^,]+),(?<queryid>[^,]+),(?<action>[^,]+),(?<db>[^,]+),(?<message>.*),(?<retcode>\d+)$/

  </parse>

</source>
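
The `format1` expression above can be sanity-checked outside fluentd. The following is a minimal Python sketch, not part of the original setup. The pattern is rewritten with Python's `(?P<name>...)` group syntax, `re.DOTALL` stands in for the multiline parser letting `.` span newlines (an assumption about how the parser applies the pattern), and the function name and both sample lines are invented for illustration.

```python
import re

# Python port of the format1 pattern; only the named-group syntax changes.
AUDIT_LINE = re.compile(
    r"^(?P<Date>\d{8}) (?P<Hour>\d{2}):(?P<Min>\d{2}):(?P<Sec>\d{2}),"
    r"(?P<host>[^,]+),(?P<user>[^,]+),(?P<ip>[^,]+),"
    r"(?P<connid>[^,]+),(?P<queryid>[^,]+),(?P<action>[^,]+),"
    r"(?P<db>[^,]+),(?P<message>.*),(?P<retcode>\d+)$",
    re.DOTALL,
)

def parse_audit_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    m = AUDIT_LINE.match(line)
    return m.groupdict() if m else None

# Hypothetical sample entries, invented for this sketch.
sample = "20190101 12:34:56,db01,app,10.0.0.5,42,1001,QUERY,shop,'SELECT a, b FROM t',0"
no_db = "20190101 12:34:57,db01,app,10.0.0.5,42,1002,QUERY,,'SELECT 1',0"

record = parse_audit_line(sample)
print(record["message"])         # commas inside the query stay in message
print(parse_audit_line(no_db))   # an empty db field does not satisfy [^,]+
```

Because `message` is greedy and `retcode` is anchored at the end, commas inside the SQL text do not shift the fields. An entry whose database field is empty does not match this pattern, which is worth keeping in mind if such entries should be collected.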

<filter mysql_audit>

  @type grep

  <regexp>

    key action

    pattern QUERY

  </regexp>

  <exclude>

    key user

    pattern lagou_status

  </exclude>

  <exclude>

    key db

    pattern information_schema

  </exclude>

</filter>
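
The grep filter above keeps a record only when `action` matches `QUERY` and neither exclusion pattern matches. A rough Python sketch of that decision follows. It assumes that any matching `<exclude>` section drops the record and that patterns are unanchored regex searches; the function name `keep` and the sample records are hypothetical.

```python
import re

# (key, pattern) pairs mirroring the <regexp> and <exclude> sections.
REQUIRE = [("action", re.compile(r"QUERY"))]
EXCLUDE = [
    ("user", re.compile(r"lagou_status")),
    ("db", re.compile(r"information_schema")),
]

def keep(record):
    """True when every required pattern matches and no exclude pattern does."""
    required_ok = all(p.search(str(record.get(k, ""))) for k, p in REQUIRE)
    excluded = any(p.search(str(record.get(k, ""))) for k, p in EXCLUDE)
    return required_ok and not excluded

# Hypothetical records, as they might look after parsing.
records = [
    {"action": "QUERY", "user": "app", "db": "shop"},
    {"action": "CONNECT", "user": "app", "db": "shop"},
    {"action": "QUERY", "user": "lagou_status", "db": "shop"},
    {"action": "QUERY", "user": "app", "db": "information_schema"},
]
kept = [r for r in records if keep(r)]
print(kept)  # only the first record passes
```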

<filter mysql_audit>

  @type record_transformer

  enable_ruby true

  <record>

    # collapse each run of whitespace (tabs, newlines in multi-line SQL) into one space
    message ${record["message"].gsub(/\s+/, ' ')}

  </record>

</filter>
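
The `gsub(/\s+/, ' ')` call above replaces every run of whitespace, including the newlines in multi-line SQL, with one space, so each statement ends up on a single line with no tab characters in it. A minimal Python equivalent is sketched below. It uses `re.ASCII` to approximate Ruby's ASCII-only `\s` (an assumption), and the function name and sample statement are hypothetical.

```python
import re

WHITESPACE_RUN = re.compile(r"\s+", re.ASCII)

def collapse_whitespace(message):
    """Replace each run of spaces, tabs and newlines with a single space."""
    return WHITESPACE_RUN.sub(" ", message)

# Hypothetical multi-line statement containing a tab.
sql = "SELECT id,\n       name\tFROM users\n WHERE id = 1"
print(collapse_whitespace(sql))  # SELECT id, name FROM users WHERE id = 1
```

This matters for the output stage: a tab or newline left inside `message` would break a tab-delimited, one-record-per-line file.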

<match mysql_audit>

  #@type stdout

  @type webhdfs

  host oss-hadoop-namenode-bjc-002

  path /mysql_audit/${Date}/${host}_${Hour}

  append true

  compress gzip

  <format>

    @type csv

    fields Date,Hour,Min,Sec,host,user,ip,action,db,message,retcode

    delimiter \t

  </format>

  <buffer host,Date,Hour>

    @type memory

    flush_interval 20s

  </buffer>

</match>
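
The `<match>` block above groups records by `host`, `Date` and `Hour` (the buffer chunk keys), substitutes those values into `path`, and writes the listed fields as delimited rows. Below is a simplified Python sketch of that grouping and formatting, with no buffering, compression, or HDFS involved. It assumes a tab delimiter, as stated in the goal, and quotes every field on the assumption that the csv formatter's `force_quotes` default applies. The function `render` and the sample record are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Same field order as the `fields` parameter; connid and queryid are left out.
FIELDS = ["Date", "Hour", "Min", "Sec", "host", "user", "ip",
          "action", "db", "message", "retcode"]
PATH_TEMPLATE = "/mysql_audit/{Date}/{host}_{Hour}"

def render(records):
    """Map each target path to the tab-delimited text that would be appended to it."""
    chunks = defaultdict(io.StringIO)
    for rec in records:
        path = PATH_TEMPLATE.format(**rec)
        writer = csv.writer(chunks[path], delimiter="\t",
                            quoting=csv.QUOTE_ALL, lineterminator="\n")
        writer.writerow([rec[f] for f in FIELDS])
    return {path: buf.getvalue() for path, buf in chunks.items()}

# Hypothetical parsed record.
record = {
    "Date": "20190101", "Hour": "12", "Min": "34", "Sec": "56",
    "host": "db01", "user": "app", "ip": "10.0.0.5",
    "connid": "42", "queryid": "1001", "action": "QUERY",
    "db": "shop", "message": "SELECT 1", "retcode": "0",
}
for path, text in render([record]).items():
    print(path)
    print(text, end="")
```

With this layout, each database host gets one file per hour under a per-day directory.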

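fluentd uses far less memory than logstash.

Processing the same logs, logstash used about 700 MB while fluentd used about 35 MB.

CPU usage was about the same, though: on machines with heavy log volume it reached 100%.

Regex-based parsing and filtering of logs appears to be CPU-intensive.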