grok {
  match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:elb_name} %{IP:elb_client_ip}:%{INT:elb_client_port:int} (?:%{IP:elb_backend_ip}:%{NUMBER:elb_backend_port:int}|-) (?:%{NUMBER:request_processing_time:float}|-1) (?:%{NUMBER:backend_processing_time:float}|-1) (?:%{NUMBER:requestresponse_processing_time:float}|-1) %{INT:elb_status_code:int} %{INT:backend_status_code:int} %{INT:elb_received_bytes:int} %{INT:elb_sent_bytes:int} \"%{GREEDYDATA:elb_request}\" \"%{GREEDYDATA:userAgent}\" (?:%{NOTSPACE:elb_sslcipher}|-) (?:%{NOTSPACE:elb_sslprotocol}|-)" }
}
Test samples:
2017-03-01T02:20:13.897023Z cf 88.99.90.240:56230 - -1 -1 -1 504 0 0 0 "GET https://habib0987.xxx.com:443/value.php HTTP/1.1" "Mozilla/4.0 (compatible; cron-job.org; http://cron-job.org/abuse/)" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2
2017-03-01T02:24:06.194449Z cf 167.89.125.231:61926 10.10.81.5:80 0.000049 0.000447 0.000026 400 400 522 15 "POST http://embroker.xxx.com:80/events HTTP/1.1" "SendGrid Event API" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2
2017-03-01T02:14:34.610166Z cf 54.87.226.214:32068 - -1 -1 -1 504 0 0 0 "GET http://ayubkhan786.xxx.com:80/value.php HTTP/1.1" "-" - -
In addition, Ruby code combined with the Event API can replace grok in some cases, since grok is relatively resource-intensive:
https://www.elastic.co/blog/do-you-grok-grok
One more tip from experience:
> name = ['client','servername','url']
> name.zip(("chrome web http://www.google.com").split())
=> [["client", "chrome"], ["servername", "web"], ["url", "http://www.google.com"]]
> Hash[name.zip(("chrome web http://www.google.com").split())]
=> {"client"=>"chrome", "servername"=>"web", "url"=>"http://www.google.com"}
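One caveat with this trick, sketched below in plain Ruby: `zip` pads with `nil` when there are fewer tokens than names, and it silently drops extra tokens. Passing a limit to `split` keeps the remainder in the last field instead. The helper name `split_to_fields` and the sample strings are made up for illustration.

```ruby
# Hypothetical helper: map whitespace-separated tokens onto field names.
def split_to_fields(names, line)
  # The limit keeps everything after the first names.length - 1 tokens
  # together in the last field instead of dropping it.
  tokens = line.split(' ', names.length)
  Hash[names.zip(tokens)]
end

names = ['client', 'servername', 'url']

exact = split_to_fields(names, 'chrome web http://www.google.com')
short = split_to_fields(names, 'chrome web')
extra = split_to_fields(names, 'chrome web http://www.google.com trailing text')

p exact  # all three names filled
p short  # 'url' maps to nil because there is no third token
p extra  # 'url' holds the third token plus the trailing text
```

If a missing token should not produce a `nil` field, filter the hash before writing it back to the event.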
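Combined with the Event API (https://www.elastic.co/guide/en/logstash/current/event-api.html),
you can use the ruby filter to get a field, split it, and set the resulting pieces back on the event. This suits cases where the grok pattern would be very hard to write. For example: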
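filter {
  ruby {
    # Use an instance variable so the list defined in init is reachable from code.
    init => "@key = ['x','y'......]"
    # event.set takes a field name and a value, so the fields are set one by one.
    code => "
      new_event = Hash[@key.zip(event.get('YOUR_FIELD').split(' '))]
      new_event.each { |k, v| event.set(k, v) }
    "
  }
}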
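To try the same logic outside Logstash, here is a minimal sketch in plain Ruby. `FakeEvent` is a made-up stand-in that mimics only the `get`/`set` calls of the Event API, and the names in `KEYS`, the extra `rest` field, and the `message` source field are assumptions chosen for the start of the first ELB sample above.

```ruby
# FakeEvent is a hypothetical stand-in for a Logstash event; it only
# mimics the get/set calls of the Event API, backed by a plain Hash.
class FakeEvent
  def initialize(fields)
    @fields = fields
  end

  def get(name)
    @fields[name]
  end

  def set(name, value)
    @fields[name] = value
  end
end

# Assumed field names for the first three tokens of an ELB log line.
KEYS = ['timestamp', 'elb_name', 'client'].freeze

def split_field(event, source)
  # Limit the split so everything after the named tokens stays in 'rest'.
  tokens = event.get(source).split(' ', KEYS.length + 1)
  Hash[(KEYS + ['rest']).zip(tokens)].each do |name, value|
    # Skip names that had no matching token.
    event.set(name, value) unless value.nil?
  end
  event
end

event = FakeEvent.new('message' => '2017-03-01T02:20:13.897023Z cf 88.99.90.240:56230 - -1 -1 -1 504 0 0 0')
split_field(event, 'message')
puts event.get('timestamp')
puts event.get('client')
puts event.get('rest')
```

Inside a real ruby filter only the body of `split_field` would go into `code`, with the key list set up in `init`.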