ELK Log Analysis
1. Why Use ELK
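For simple log analysis, running grep or awk directly against the log files is usually enough to
find what you need. At larger scale this approach becomes inefficient and raises several problems:
how to archive logs when the volume is too large, what to do when text search is too slow, and how
to query along multiple dimensions. What is needed is centralized log management that collects and
aggregates the logs from every server. The common solution is a centralized log collection system
that gathers, manages, and provides access to the logs of all nodes in one place.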
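As a point of comparison, the single-machine grep/awk approach can be sketched as below. This is a minimal sketch that assumes nginx's default "combined" access log format, in which the HTTP status is the ninth whitespace-separated field; the function names are hypothetical and not part of any tool.

```shell
# Count requests per HTTP status code in one access log.
# Assumption: default nginx "combined" format, status code in field 9.
count_status() {
  awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' "$1" | sort
}

# Print only the requests that ended in a 5xx status.
server_errors() {
  awk '$9 ~ /^5[0-9][0-9]$/' "$1"
}

# Example (path is an assumption):
#   count_status /var/log/nginx/access.log
#   server_errors /var/log/nginx/access.log | grep 'GET /api'
```

This works well for one file on one host, but it has to be repeated on every server, which is the limitation the centralized approach addresses.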
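A large system is usually deployed as a distributed architecture, with different service modules
running on different servers. When a problem occurs, you typically have to use the key information
it exposes to locate the specific server and service module involved. A centralized logging system
makes locating such problems faster.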
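A complete centralized logging system needs the following main capabilities:
Collection - gathers log data from many kinds of sources
Transport - reliably ships log data to the central system
Storage - stores the log data
Analysis - supports analysis through a UI
Alerting - provides error reporting and a monitoring mechanism
ELK provides a complete solution for all of these. Its components are all open source, are designed
to work together, and cover many use cases efficiently, which has made it one of the mainstream
logging systems.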
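2. Introduction to ELK
ELK is an acronym for three open-source projects: Elasticsearch, Logstash, and Kibana. A fourth
component, Filebeat, was added later. It is a lightweight log collection agent that uses few
resources, which makes it well suited to collecting logs on each server and forwarding them to
Logstash. Elastic also recommends it for this purpose.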
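Elasticsearch is an open-source distributed search engine that provides three main functions:
searching, analyzing, and storing data. Its features include distributed operation, zero
configuration, automatic discovery, automatic index sharding, index replicas, a RESTful interface,
multiple data sources, and automatic search load balancing.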
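Logstash is a tool for collecting, parsing, and filtering logs, and it supports a large number of
ways to ingest data. It typically runs in a client/server arrangement: the client is installed on
each host whose logs need to be collected, and the server filters and modifies the logs received
from the nodes and then forwards them all to Elasticsearch.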
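Kibana is also a free, open-source tool. It provides a friendly web interface for log analysis on
top of Logstash and Elasticsearch, and helps you aggregate, analyze, and search important log data.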
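Filebeat belongs to the Beats family, which originally consisted of four tools:
Packetbeat (collects network traffic data)
Topbeat (collects CPU and memory usage at the system, process, and filesystem level; replaced by Metricbeat since Beats 5.0)
Filebeat (collects file data)
Winlogbeat (collects Windows event log data)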
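3. Lab Deployment
This lab deploys Filebeat as the client and Logstash + Elasticsearch + Kibana as the server side.
The data flows as follows:
A business request reaches Nginx on the nginx-server machine;
Nginx responds to the request and appends an access record to access.log;
Filebeat collects the new log lines and uploads them through Logstash's port 5044;
Logstash passes the log data to Elasticsearch through port 9200 on the local machine;
a user searching the logs opens Kibana in a browser on server port 5601;
Kibana accesses Elasticsearch through port 9200.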
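Once the services are running, the three ports in this flow can be checked in one pass. The following is a minimal sketch that reads `netstat -ant` output on stdin and reports whether 5044, 9200, and 5601 are in the LISTEN state; the function name is hypothetical, and it assumes the usual netstat column layout (local address in column 4, state in column 6).

```shell
# Report which ELK ports appear as LISTEN in `netstat -ant` output (read from stdin).
# Assumption: column 4 is the local address and column 6 is the socket state.
check_elk_ports() {
  awk '
    $6 == "LISTEN" {
      n = split($4, parts, ":")      # the port is the last colon-separated part
      open[parts[n]] = 1
    }
    END {
      split("5044 9200 5601", want, " ")
      for (i = 1; i <= 3; i++)
        printf "%s %s\n", want[i], ((want[i] in open) ? "listening" : "not listening")
    }'
}

# Usage on the ELK server:
#   netstat -ant | check_elk_ports
```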
Lab environment:
This is a single-node ELK deployment that uses two machines (CentOS 7):
ELK server: 192.168.180.113
Nginx client: 192.168.180.112
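1. Preparation:
Configure the network yum repositories (download the .repo files into /etc/yum.repos.d/ so that yum picks them up):
# wget http://mirrors.aliyun.com/repo/Centos-7.repo
# wget http://mirrors.aliyun.com/repo/epel-7.repo
Stop and disable the firewall:
# systemctl stop firewalld
# systemctl disable firewalld
Disable SELinux by setting SELINUX=disabled in /etc/selinux/config.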
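The SELinux change can be scripted instead of edited by hand. This is a minimal sketch, assuming GNU sed (as shipped with CentOS 7) and the standard config location /etc/selinux/config; the function name is hypothetical.

```shell
# Rewrite the SELINUX= line of an SELinux config file to "disabled".
# The SELINUXTYPE= line is left untouched because the pattern requires "SELINUX=".
set_selinux_disabled() {
  sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Usage as root (the config change takes effect at the next boot;
# setenforce 0 switches the running system to permissive mode until then):
#   setenforce 0
#   set_selinux_disabled /etc/selinux/config
```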
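2. Download and install the packages:
# mkdir /elk;cd /elk
# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.2.3.tar.gz
# wget https://artifacts.elastic.co/downloads/logstash/logstash-6.2.3.tar.gz
# wget https://artifacts.elastic.co/downloads/kibana/kibana-6.2.3-linux-x86_64.tar.gz
Extract all of them and copy them to /usr/local/. (The terminal transcripts later in this section were captured on a machine where the packages were unpacked under /mnt instead.)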
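The extraction step can be sketched as a small loop. This is only an illustration under the paths named above (/elk as the download directory, /usr/local as the target); the function name is hypothetical.

```shell
# Extract every .tar.gz archive found in a source directory into a target directory.
extract_all() {
  src=$1
  dest=$2
  mkdir -p "$dest"
  for archive in "$src"/*.tar.gz; do
    [ -e "$archive" ] || continue   # skip when the glob matched nothing
    tar -xzf "$archive" -C "$dest"
  done
}

# Usage as root:
#   extract_all /elk /usr/local
```

Extracting straight into the target directory avoids a separate copy step.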
3. Install the JDK (Java) runtime:
# yum -y install java-1.8*
4. Configure Elasticsearch:
1) Create a dedicated es user and start Elasticsearch as that regular (non-root) user
# useradd es
# chown -R es:es /mnt/elasticsearch-6.2.3/
[root@topcheer elasticsearch-6.2.3]# su - es
Last login: Sun Sep 29 22:53:43 CST 2019 on pts/1
[es@topcheer ~]$ cd /mnt
[es@topcheer mnt]$ ll
total 4175516
drwxr-xr-x 9 root root 160 Oct 4 20:30 apache-tomcat-7.0.70
drwxr-xr-x 9 root root 160 Nov 4 01:25 apache-tomcat-7.0.70_1
-rw-r--r-- 1 root root 8924465 Oct 4 20:30 apache-tomcat-7.0.70.tar.gz
-rw-r--r-- 1 root root 540 Sep 21 22:14 Dockerfile
drwxr-xr-x 9 es es 155 Sep 21 22:58 elasticsearch-6.2.3
-rw-r--r-- 1 root root 29050159 Sep 21 22:45 elasticsearch-6.2.3.tar.gz
-rw-r--r-- 1 root root 128 Sep 29 21:47 elasticsearch.yml
-rw-r--r-- 1 root root 780598784 Sep 29 21:27 es.tar
-rw-r--r-- 1 root root 412774002 May 30 2018 gitlab-ce-10.8.2-ce.0.el7.x86_64.rpm
drwxr-xr-x 2 root root 100 Nov 4 01:08 harbor
-rw-r--r-- 1 root root 552897681 Nov 4 01:07 harbor-offline-installer-v1.8.0.tgz
-rw-r--r-- 1 root root 619113214 Nov 4 00:38 harbor-offline-installer-v1.9.1.tgz
-rw-r--r-- 1 root root 78245883 Sep 30 14:25 jenkins.war
drwxrwxr-x 12 wgr wgr 249 Sep 21 23:34 kibana-6.2.3-linux-x86_64
-rw-r--r-- 1 root root 83426328 Sep 21 22:45 kibana-6.2.3-linux-x86_64.tar.gz
-rw-r--r-- 1 root root 768809984 Sep 29 21:29 kibana.tar
-rw-r--r-- 1 root root 17446309 Sep 21 22:12 logstash-0.0.1-SNAPSHOT.jar
drwxr-xr-x 12 root root 289 Sep 22 12:32 logstash-6.2.3
-rw-r--r-- 1 root root 138221072 Sep 21 22:45 logstash-6.2.3.tar.gz
-rw-r--r-- 1 root root 677771264 Sep 29 21:31 logstash.tar
drwxrwxr-x 3 root root 48 Aug 4 2018 __MACOSX
drwxr-xr-x 10 root root 171 Oct 24 23:34 nacos
-rw-r--r-- 1 root root 44275341 Oct 24 23:25 nacos.tar.gz
drwxr-xr-x 9 es es 186 Oct 4 21:41 nginx-1.12.2
-rw-r--r-- 1 root root 981687 Oct 4 20:30 nginx-1.12.2.tar.gz
drwxr-xr-x 9 1169 1169 12288 Oct 4 21:40 pcre-8.37
-rw-r--r-- 1 root root 2041593 Oct 4 20:30 pcre-8.37.tar.gz
drwxr-xr-x 19 root root 4096 Nov 19 21:41 Python-3.7.0
-rw-r--r-- 1 root root 26047619 Nov 19 21:33 Python-3.7.0.zip
drwxr-xr-x 11 es es 4096 Oct 6 16:52 zookeeper-3.4.10
-rw-r--r-- 1 root root 35042811 Oct 6 15:27 zookeeper-3.4.10.tar.gz
[es@topcheer mnt]$ cd elasticsearch-6.2.3/
[es@topcheer elasticsearch-6.2.3]$ ll
total 220
drwxr-xr-x 2 es es 4096 Sep 21 22:49 bin
drwxr-xr-x 2 es es 75 Sep 29 22:48 config
drwxrwxr-x 3 es es 19 Sep 21 22:58 data
drwxr-xr-x 2 es es 4096 Mar 13 2018 lib
-rw-r--r-- 1 es es 11358 Mar 13 2018 LICENSE.txt
drwxr-xr-x 2 es es 268 Sep 29 22:44 logs
drwxr-xr-x 16 es es 289 Mar 13 2018 modules
-rw-r--r-- 1 es es 191887 Mar 13 2018 NOTICE.txt
drwxr-xr-x 2 es es 6 Mar 13 2018 plugins
-rw-r--r-- 1 es es 9268 Mar 13 2018 README.textile
[es@topcheer elasticsearch-6.2.3]$ ./bin/elasticsearch -d
2) Check whether the process started successfully (allow it a moment to come up)
[es@topcheer elasticsearch-6.2.3]$ lsof -i:9200
[es@topcheer elasticsearch-6.2.3]$ netstat -antp
(Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.)
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:6000 0.0.0.0:* LISTEN -
tcp 0 0 192.168.122.1:53 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6010 0.0.0.0:* LISTEN -
tcp 0 0 192.168.180.113:22 192.168.180.1:65135 ESTABLISHED -
tcp 0 0 192.168.180.113:22 192.168.180.1:65130 ESTABLISHED -
tcp 0 1 192.168.180.113:44876 192.168.77.129:2377 SYN_SENT -
tcp6 0 0 :::2377 :::* LISTEN -
tcp6 0 0 :::27017 :::* LISTEN -
tcp6 0 0 :::8009 :::* LISTEN -
tcp6 0 0 :::7946 :::* LISTEN -
tcp6 0 0 :::3306 :::* LISTEN -
tcp6 0 0 :::6061 :::* LISTEN -
tcp6 0 0 :::6062 :::* LISTEN -
tcp6 0 0 :::111 :::* LISTEN -
tcp6 0 0 :::80 :::* LISTEN -
tcp6 0 0 :::8080 :::* LISTEN -
tcp6 0 0 :::6000 :::* LISTEN -
tcp6 0 0 :::22 :::* LISTEN -
tcp6 0 0 ::1:631 :::* LISTEN -
tcp6 0 0 :::23 :::* LISTEN -
tcp6 0 0 ::1:25 :::* LISTEN -
tcp6 0 0 ::1:6010 :::* LISTEN -
tcp6 0 0 :::8091 :::* LISTEN -
tcp6 0 0 127.0.0.1:8005 :::* LISTEN
In this run lsof prints nothing and port 9200 is missing from the netstat output, so Elasticsearch did not start.
3) If an error occurs, check the logs
[es@topcheer elasticsearch-6.2.3]$ cd logs/
[es@topcheer logs]$ ll
total 60
-rw-rw-r-- 1 es es 2836 Sep 22 00:01 elasticsearch-2019-09-21-1.log.gz
-rw-rw-r-- 1 es es 1470 Sep 29 22:44 elasticsearch-2019-09-22-1.log.gz
-rw-rw-r-- 1 es es 2956 Dec 2 10:06 elasticsearch-2019-09-29-1.log.gz
-rw-rw-r-- 1 es es 1593 Sep 22 10:21 elasticsearch_deprecation.log
-rw-rw-r-- 1 es es 0 Sep 21 22:58 elasticsearch_index_indexing_slowlog.log
-rw-rw-r-- 1 es es 0 Sep 21 22:58 elasticsearch_index_search_slowlog.log
-rw-rw-r-- 1 es es 4214 Dec 2 10:06 elasticsearch.log
-rw-rw-r-- 1 es es 34796 Dec 2 10:06 gc.log.0.current
[es@topcheer logs]$ cat elasticsearch.log
[2019-12-02T10:06:40,446][INFO ][o.e.n.Node ] [] initializing ...
[2019-12-02T10:06:40,891][INFO ][o.e.e.NodeEnvironment ] [Wq-vR0u] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [3gb], net total_space [26.9gb], types [rootfs]
[2019-12-02T10:06:40,892][INFO ][o.e.e.NodeEnvironment ] [Wq-vR0u] heap size [990.7mb], compressed ordinary object pointers [true]
[2019-12-02T10:06:41,530][INFO ][o.e.n.Node ] node name [Wq-vR0u] derived from node ID [Wq-vR0uFTRGwj2HQ0-vyqw]; set [node.name] to override
[2019-12-02T10:06:41,531][INFO ][o.e.n.Node ] version[6.2.3], pid[33585], build[c59ff00/2018-03-13T10:06:29.741383Z], OS[Linux/3.10.0-957.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_181/25.181-b13]
[2019-12-02T10:06:41,531][INFO ][o.e.n.Node ] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.JndCl1Qt, -XX:+HeapDumpOnOutOfMemoryError, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.path.home=/mnt/elasticsearch-6.2.3, -Des.path.conf=/mnt/elasticsearch-6.2.3/config]
[2019-12-02T10:06:43,796][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [aggs-matrix-stats]
[2019-12-02T10:06:43,796][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [analysis-common]
[2019-12-02T10:06:43,796][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [ingest-common]
[2019-12-02T10:06:43,796][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [lang-expression]
[2019-12-02T10:06:43,796][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [lang-mustache]
[2019-12-02T10:06:43,796][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [lang-painless]
[2019-12-02T10:06:43,796][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [mapper-extras]
[2019-12-02T10:06:43,796][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [parent-join]
[2019-12-02T10:06:43,796][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [percolator]
[2019-12-02T10:06:43,797][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [rank-eval]
[2019-12-02T10:06:43,797][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [reindex]
[2019-12-02T10:06:43,797][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [repository-url]
[2019-12-02T10:06:43,797][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [transport-netty4]
[2019-12-02T10:06:43,797][INFO ][o.e.p.PluginsService ] [Wq-vR0u] loaded module [tribe]
[2019-12-02T10:06:43,798][INFO ][o.e.p.PluginsService ] [Wq-vR0u] no plugins loaded
[2019-12-02T10:06:54,458][INFO ][o.e.d.DiscoveryModule ] [Wq-vR0u] using discovery type [zen]
[2019-12-02T10:06:55,374][INFO ][o.e.n.Node ] initialized
[2019-12-02T10:06:55,374][INFO ][o.e.n.Node ] [Wq-vR0u] starting ...
[2019-12-02T10:06:55,818][INFO ][o.e.t.TransportService ] [Wq-vR0u] publish_address {172.17.0.1:9300}, bound_addresses {[::]:9300}
[2019-12-02T10:06:55,852][INFO ][o.e.b.BootstrapChecks ] [Wq-vR0u] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2019-12-02T10:06:55,866][ERROR][o.e.b.Bootstrap ] [Wq-vR0u] node validation exception
[1] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2019-12-02T10:06:55,877][INFO ][o.e.n.Node ] [Wq-vR0u] stopping ...
[2019-12-02T10:06:56,163][INFO ][o.e.n.Node ] [Wq-vR0u] stopped
[2019-12-02T10:06:56,164][INFO ][o.e.n.Node ] [Wq-vR0u] closing ...
[2019-12-02T10:06:56,196][INFO ][o.e.n.Node ] [Wq-vR0u] closed
[es@topcheer logs]$
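Fix: the bootstrap check at the end of the log failed because vm.max_map_count is too low, so raise it as root:
[root@topcheer ~]# sysctl -w vm.max_map_count=262144
vm.max_map_count = 262144
[root@topcheer ~]#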
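A value set with `sysctl -w` is lost at the next reboot. To keep it, the setting can also be written to /etc/sysctl.conf. The following is a minimal sketch with a hypothetical helper name; it assumes GNU sed and adds the key or updates it if it is already present.

```shell
# Add or update a "key=value" line in a sysctl configuration file.
# Defaults to /etc/sysctl.conf when no file is given.
persist_sysctl() {
  key=$1
  value=$2
  file=${3:-/etc/sysctl.conf}
  if grep -q "^$key[[:space:]]*=" "$file" 2>/dev/null; then
    sed -i "s|^$key[[:space:]]*=.*|$key=$value|" "$file"   # update existing entry
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"            # append new entry
  fi
}

# Usage as root, then reload the file:
#   persist_sysctl vm.max_map_count 262144
#   sysctl -p
```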
Start Elasticsearch again as the es user and check the ports. After a short wait, 9200 (HTTP) and 9300 (transport) are both listening:
[es@topcheer elasticsearch-6.2.3]$ ./bin/elasticsearch -d
[es@topcheer elasticsearch-6.2.3]$ lsof -i:9200
[es@topcheer elasticsearch-6.2.3]$ lsof -i:9200
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 36864 es 174u IPv6 83816708 0t0 TCP *:wap-wsp (LISTEN)
[es@topcheer elasticsearch-6.2.3]$ netstat -antp
(Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.)
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:6000 0.0.0.0:* LISTEN -
tcp 0 0 192.168.122.1:53 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6010 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6011 0.0.0.0:* LISTEN -
tcp 0 0 192.168.180.113:22 192.168.180.1:65135 ESTABLISHED -
tcp 0 36 192.168.180.113:22 192.168.180.1:65130 ESTABLISHED -
tcp 0 0 192.168.180.113:22 192.168.180.1:49838 ESTABLISHED -
tcp 0 0 192.168.180.113:22 192.168.180.1:49839 ESTABLISHED -
tcp 0 1 192.168.180.113:45014 192.168.77.129:2377 SYN_SENT -
tcp6 0 0 :::2377 :::* LISTEN -
tcp6 0 0 :::27017 :::* LISTEN -
tcp6 0 0 :::8009 :::* LISTEN -
tcp6 0 0 :::7946 :::* LISTEN -
tcp6 0 0 :::3306 :::* LISTEN -
tcp6 0 0 :::6061 :::* LISTEN -
tcp6 0 0 :::6062 :::* LISTEN -
tcp6 0 0 :::111 :::* LISTEN -
tcp6 0 0 :::9200 :::* LISTEN 36864/java
tcp6 0 0 :::80 :::* LISTEN -
tcp6 0 0 :::8080 :::* LISTEN -
tcp6 0 0 :::6000 :::* LISTEN -
tcp6 0 0 :::9300 :::* LISTEN 36864/java
tcp6 0 0 :::22 :::* LISTEN -
tcp6 0 0 ::1:631 :::* LISTEN -
tcp6 0 0 :::23 :::* LISTEN -
tcp6 0 0 ::1:25 :::* LISTEN -
tcp6 0 0 ::1:6010 :::* LISTEN -
tcp6 0 0 ::1:6011 :::* LISTEN -
tcp6 0 0 :::8091 :::* LISTEN -
tcp6 0 0 127.0.0.1:8005 :::* LISTEN -
tcp6 0 0 ::1:34054 ::1:9300 TIME_WAIT -
tcp6 0 0 127.0.0.1:46130 127.0.0.1:9300 TIME_WAIT -
[es@topcheer elasticsearch-6.2.3]
4) Test that Elasticsearch can be reached
[es@topcheer elasticsearch-6.2.3]$ curl localhost:9200
{
  "name" : "Wq-vR0u",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "PwmRsjs6Q_WrsRzqnZZlxQ",
  "version" : {
    "number" : "6.2.3",
    "build_hash" : "c59ff00",
    "build_date" : "2018-03-13T10:06:29.741383Z",
    "build_snapshot" : false,
    "lucene_version" : "7.2.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
[es@topcheer elasticsearch-6.2.3]$
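For scripted checks it can be handy to pull just the version number out of that response. The sketch below is one possible way to do it with sed alone; the function name is hypothetical, and it relies only on the `"number" : "..."` field visible in the output above rather than on a full JSON parser.

```shell
# Extract the value of the "number" field from the Elasticsearch root response (stdin).
es_version() {
  sed -n 's/.*"number" *: *"\([^"]*\)".*/\1/p'
}

# Usage:
#   curl -s localhost:9200 | es_version
# A broader check is the cluster health endpoint:
#   curl -s 'localhost:9200/_cluster/health?pretty'
```

If jq is installed, `jq -r .version.number` does the same job more robustly.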
5. Configure Logstash
Logstash parses the nginx logs with the grok filter plugin. grok is a Logstash filter plugin that
parses lines of text logs according to patterns and splits them into fields.
1) The grok pattern used by Logstash
[root@topcheer patterns]# pwd
/mnt/logstash-6.2.3/vendor/bundle/jruby/2.3.0/gems/logstash-patterns-core-4.1.2/patterns
[root@topcheer patterns]# vim grok-patterns
Add the following two pattern definitions to the file:
WZ ([^ ]*)
NGINXACCESS %{IP:remote_ip} \- \- \[%{HTTPDATE:timestamp}\] "%{WORD:method} %{WZ:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:status} %{NUMBER:bytes} %{QS:referer} %{QS:agent} %{QS:xforward}
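To make the field splitting concrete, the sketch below approximates what the NGINXACCESS pattern extracts, using a plain sed regular expression instead of grok. It is an illustration only, not how Logstash implements grok: the function name and the sample log line are made up, the IP and number classes are simplified, and only five of the fields are printed.

```shell
# Approximate the NGINXACCESS grok pattern with a sed extended regex and
# print a few of the captured fields. Lines that do not match print nothing.
parse_nginx_line() {
  printf '%s\n' "$1" | sed -nE \
    's/^([0-9.]+) - - \[([^]]+)\] "([A-Z]+) ([^ ]*) HTTP\/([0-9.]+)" ([0-9]+) ([0-9]+) "[^"]*" "[^"]*" "[^"]*"$/remote_ip=\1 method=\3 request=\4 status=\6 bytes=\7/p'
}

# Made-up sample line in the layout the pattern expects:
sample='192.168.180.1 - - [02/Dec/2019:10:30:00 +0800] "GET /index.html HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"'
parse_nginx_line "$sample"
# prints: remote_ip=192.168.180.1 method=GET request=/index.html status=200 bytes=612
```

Note that the pattern expects three quoted fields at the end of each line (referer, agent, xforward), so the nginx log_format must emit all three for a line to match.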
2) Create the Logstash configuration file
[root@topcheer logstash-6.2.3]# cat nginx.conf
input {
    beats {
        port => "5044"
    }
}
# Filter the data
filter {
    grok {
        match => { "message" => "%{NGINXACCESS}" }
    }
    geoip {
        # name of the field that holds the client IP (extracted by grok above)
        source => "remote_ip"
    }
}
# Output to port 9200 on the local machine, where the Elasticsearch service listens
output {
    elasticsearch {
        hosts => ["127.0.0.1:9200"]
    }
}
[root@topcheer logstash-6.2.3]#
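Before starting Logstash in the background, the configuration can be syntax-checked with Logstash's `--config.test_and_exit` option, which parses the file and exits without starting the pipeline. The wrapper below is a minimal sketch; the function name and its return code 2 for a missing installation are assumptions made for illustration.

```shell
# Syntax-check a Logstash pipeline file without starting the pipeline.
# $1: Logstash installation directory, $2: path to the config file.
validate_logstash_conf() {
  home=$1
  conf=$2
  if [ ! -x "$home/bin/logstash" ]; then
    echo "logstash not found under $home" >&2
    return 2
  fi
  "$home/bin/logstash" -f "$conf" --config.test_and_exit
}

# Usage (installation path as in the transcript above):
#   validate_logstash_conf /mnt/logstash-6.2.3 /mnt/logstash-6.2.3/nginx.conf
```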
3) Change to the /usr/local/logstash-6.2.3 directory and run the following commands
Start Logstash in the background:
[root@topcheer logstash-6.2.3]# nohup bin/logstash -f nginx.conf &
[1] 57261
[root@topcheer logstash-6.2.3]# nohup: ignoring input and appending output to "nohup.out"
[root@topcheer logstash-6.2.3]# ll
View the startup log: tailf nohup.out
Check that the port is listening:
[root@topcheer logstash-6.2.3]# netstat -napt|grep 5044
tcp6 0 0 :::5044 :::* LISTEN 57261/java
[root@topcheer logstash-6.2.3]#