Problem description
So I'm using a standard ELK stack to analyse Apache access logs, which is working well, but I'm looking to break out URL parameters as fields, using the KV filter, in order to allow me to write better queries.
My problem is that the app I'm analysing has 'cache-busting' dynamically generated parameters, which leads to tens of thousands of 'fields', each occurring once. ElasticSearch seems to have severe trouble with this, and the fields have no value to me, so I'd like to remove them. Below is an example of the pattern:
GET /page?rand123PQY=ABC&other_var=something
GET /page?rand987ZDQ=DEF&other_var=something
In the example above, the parameters I want to remove start with 'rand'. Currently my logstash.conf uses grok to extract fields from the access logs, followed by kv to extract query string parameters:
filter {
  grok {
    path => "/var/log/apache/access.log"
    type => "apache-access"
  }
  kv {
    field_split => "&?"
  }
}
Is there a way I can filter out any fields matching the pattern rand[A-Z0-9]*=[A-Z0-9]*? Most examples I've seen target fields by exact name, which I cannot use. I did wonder about regexing the request field into a new field, running KV on that, then removing it. Would that work?
Recommended answer
If the set of fields that you are interested in is known and well-defined, you could set target for the kv filter, move the interesting fields to the top level of the message with a mutate filter, and delete the field with the nested key/value pairs. I think this is pretty much what you suggested at the end.
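As a rough sketch of that first approach, the configuration could look something like the following. It assumes the query string lives in a field named request (the name the stock Apache grok patterns usually produce) and that other_var is the only parameter worth keeping; params is a hypothetical name for the nested field. Adjust all three to your setup.

```
filter {
  kv {
    source      => "request"   # assumed field holding the request line
    field_split => "&?"
    target      => "params"    # hypothetical name: all pairs are nested under [params]
  }
  mutate {
    # Promote only the parameters you care about to the top level.
    rename => { "[params][other_var]" => "other_var" }
  }
  mutate {
    # Drop the nested field, taking every rand* key with it.
    remove_field => [ "params" ]
  }
}
```

Because everything else is thrown away with params, the cache-busting keys never need to be matched by name at all.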
Alternatively, you could use a ruby filter:
filter {
  ruby {
    code => "
      event.to_hash.keys.each { |k|
        if k.start_with?('rand')
          event.remove(k)
        end
      }
    "
  }
}
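Note that start_with?('rand') also matches ordinary parameter names such as random_seed. If that matters, a stricter test against the key part of the pattern from the question may be safer. The snippet below is only a sketch of that matching logic on a plain Ruby Hash, outside Logstash; CACHE_BUSTER_KEY and strip_cache_busters are hypothetical names, and random_seed is an invented parameter used for illustration.

```ruby
# Hypothetical constant: the key half of the question's pattern,
# anchored so the whole key must match.
CACHE_BUSTER_KEY = /\Arand[A-Z0-9]*\z/

# Returns a copy of the hash without the cache-busting keys.
def strip_cache_busters(fields)
  fields.reject { |key, _value| key.match?(CACHE_BUSTER_KEY) }
end

fields = {
  'rand123PQY'  => 'ABC',        # cache buster, should be dropped
  'other_var'   => 'something',  # real parameter, should be kept
  'random_seed' => '7'           # starts with 'rand' but is lowercase after it, so kept
}
cleaned = strip_cache_busters(fields)
```

Inside the ruby filter, the same idea would presumably amount to replacing the start_with? check with a regex match on k.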