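Link: https://www.jianshu.com/p/172118ca0262
Preface: I need to add monitoring to several production servers that each run five Tomcat instances, tracking GC, connection count, memory usage and so on for every instance. The stock template monitors a single Tomcat and does not fit this setup at all. Building a new template by hand would mean a dozen or so items per Tomcat, more than 60 items in total, plus triggers and graphs that multiply the same way, and all of it again whenever another Tomcat is added to the server. That is repetitive and tedious, so I looked for a way to discover the Tomcat ports automatically and push each instance's monitoring data straight to zabbix-server.
This article was finished after a lot of searching on Baidu and many failed experiments.
Reference 1: Submitting item data with zabbix_sender
Reference 2: Zabbix auto-discovery and monitoring of a host's TCP listening ports
1. Discover ports and PIDs
Note: the port is discovered to tell the programs apart, while the PID is discovered so that jstat can fetch each program's status data.
Here is the script: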
- > vim jstat.py
- #!/usr/bin/env python
- #coding=utf-8
- '''
- ##
- ## Purpose: call jstat to collect the JMX metrics
- ## Note: used for Zabbix auto-discovery and alerting
- ## Version: V1.0 2016-11-02
- ## Features: 1. uses threads to speed up script execution
- ##
- '''
- import sys
- import os
- import commands
- import subprocess
- import json
- import argparse
- import socket
- import threading
- jstat_cmd = commands.getoutput("which jstat")
- jstack_cmd = commands.getoutput("which jstack")
- jvmport_cmd = "netstat -tpnl|grep -oP '(?<=:)\d+.*\d+(?=/java)'|
- hostname = socket.gethostname()
- zbx_sender='/usr/local/zabbix/bin/zabbix_sender'
- zbx_cfg='/usr/local/zabbix/etc/zabbix_agentd.conf'
- zbx_tmp_file='/usr/local/zabbix/scripts/.zabbix_jmx_status'
- jstat_dict = {
- "S0":"Young.Space0.Percent",
- "S1":"Young.Space1.Percent",
- "E":"Eden.Space.Percent",
- "O":"Old.Space.Percent",
- "P":"Perm.Space.Percent",
- "FGC":"Old.Gc.Count",
- "FGCT":"Old.Gc.Time",
- "YGC":"Young.Gc.Count",
- "YGCT":"Young.Gc.Time",
- "GCT":"Total.Gc.Time",
- "PGCMN":"Perm.Gc.Min",
- "PGCMX":"Perm.Gc.Max",
- "PGC":"Perm.Gc.New",
- "PC":"Perm.Gc.Cur",
- "Tomcat.Thread":"Tomcat.Thread"
- }
- jmx_threads = []
- def get_status(cmd,opts,pid):
-     # Run e.g. "jstat -gcutil <pid>": line 1 is the header, line 2 the values
-     value = commands.getoutput('sudo %s -%s %s' % (cmd,opts,pid)).strip().split('\n')
-     kv = value[0].split()   # column names
-     vv = value[1].split()   # column values
-     data = dict(zip(kv,vv))
-     return data
- def get_thread(cmd,pid):
-     # Count the jstack lines mentioning "http" as a rough Tomcat thread count
-     value = commands.getoutput('sudo %s %s|grep http|wc -l' % (cmd,pid))
-     data = {"Tomcat.Thread":value}
-     return data
- def get_jmx(jport,jprocess):
-     '''
-     Collect Java performance metrics with jstat
-     '''
-     gcutil_data = get_status(jstat_cmd,"gcutil",jprocess)
-     gccapacity_data = get_status(jstat_cmd,"gccapacity",jprocess)
-     thread_data = get_thread(jstack_cmd,jprocess)
-     data_dict = dict(gcutil_data.items()+gccapacity_data.items()+thread_data.items())
-     for jmxkey in data_dict.keys():
-         if jmxkey in jstat_dict.keys():
-             cur_key = jstat_dict[jmxkey]
-             zbx_data = "%s jstat[%s,%s] %s" %(hostname,jport,cur_key,data_dict[jmxkey])
-             with open(zbx_tmp_file,'a') as file_obj: file_obj.write(zbx_data + '\n')
- def jvm_port_discovery():
-     output = subprocess.Popen(jvmport_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
-     jvm_port_lists = output.stdout.readlines()
-     jvm_port_proce = []
-     for jvm_port_tmp in jvm_port_lists:
-         jvm_port_proce.append(jvm_port_tmp.split())
-     return jvm_port_proce
- 
- def file_truncate():
-     '''
-     Empty the temporary file used by zabbix_sender
-     '''
-     with open(zbx_tmp_file,'w') as fn: fn.truncate()
- def zbx_tmp_file_create():
-     '''
-     Build the content of the file that zabbix_sender will send
-     '''
-     # Truncate once, before any worker thread starts: truncating inside
-     # get_jmx would let one thread wipe lines another thread already wrote
-     file_truncate()
-     jvmport_list = jvm_port_discovery()
-     for jvm_tmp in jvmport_list:
-         jvmport = jvm_tmp[0]
-         jvmprocess = jvm_tmp[1]
-         th = threading.Thread(target=get_jmx,args=(jvmport,jvmprocess))
-         th.start()
-         jmx_threads.append(th)
- def send_data_zabbix():
-     '''
-     Call zabbix_sender to send the collected keys and values to the Zabbix server
-     '''
-     zbx_tmp_file_create()
-     for get_jmxdata in jmx_threads:
-         get_jmxdata.join()
-     zbx_sender_cmd = "%s -c %s -i %s" %(zbx_sender,zbx_cfg,zbx_tmp_file)
-     print zbx_sender_cmd
-     zbx_sender_status,zbx_sender_result = commands.getstatusoutput(zbx_sender_cmd)
-     #print zbx_sender_status
-     print zbx_sender_result
- def zbx_discovery():
-     '''
-     Used by Zabbix to auto-discover the JVM ports
-     '''
-     jvm_zabbix = []
-     jvmport_list = jvm_port_discovery()
-     for jvm_tmp in jvmport_list:
-         jvm_zabbix.append({'{#JPORT}' : jvm_tmp[0],
-                            '{#JPROCESS}' : jvm_tmp[1],
-                            })
-     return json.dumps({'data': jvm_zabbix}, sort_keys=True, indent=7,separators=(',', ':'))
- def cmd_line_opts(arg=None):
-     class ParseHelpFormat(argparse.HelpFormatter):
-         def __init__(self, prog, indent_increment=5, max_help_position=50, width=200):
-             super(ParseHelpFormat, self).__init__(prog, indent_increment, max_help_position, width)
-     parse = argparse.ArgumentParser(description='JMX monitoring',
-                                     formatter_class=ParseHelpFormat)
-     parse.add_argument('--version', '-v', action='version', version="0.1", help='show the version')
-     parse.add_argument('--jvmport', action='store_true', help='discover the JVM ports')
-     parse.add_argument('--data', action='store_true', help='send the JMX metrics to Zabbix')
-     if arg:
-         return parse.parse_args(arg)
-     if not sys.argv[1:]:
-         return parse.parse_args(['-h'])
-     else:
-         return parse.parse_args()
- if __name__ == '__main__':
-     opts = cmd_line_opts()
-     if opts.jvmport:
-         print zbx_discovery()
-     elif opts.data:
-         send_data_zabbix()
-     else:
-         cmd_line_opts(arg=['-h'])
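The core of get_status is pairing the jstat header line with the value line. As a minimal standalone sketch of that idea (written for Python 3, with a hypothetical function name and made-up sample numbers in the Java 7 `-gcutil` column layout), it could look like this. On Java 8 and later the permanent generation columns are replaced by metaspace ones, so the key mapping in jstat_dict would need adjusting there.

```python
def parse_jstat_table(text):
    """Turn the two-line output of `jstat -gcutil <pid>` into a dict.

    The first non-empty line holds the column names and the second the
    values; both are split on runs of whitespace and paired by position.
    """
    lines = [line for line in text.strip().splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header line and a value line")
    header = lines[0].split()
    values = lines[1].split()
    return dict(zip(header, values))


# Illustrative sample only: the numbers are invented for this sketch.
SAMPLE = """
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00  12.50  40.31  55.02  60.10    120    1.234     3    0.456    1.690
"""

stats = parse_jstat_table(SAMPLE)
print(stats["O"], stats["YGC"])
```

The values stay strings here, which is all the sender file needs; converting them to numbers is left to whoever consumes the dict.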
Check what the discovery option returns:
- > sudo python jstat.py --jvmport
- {
- "data":[
- {
- "{#JPORT}":"8082",
- "{#JPROCESS}":"456"
- },
- {
- "{#JPORT}":"8083",
- "{#JPROCESS}":"11992"
- },
- {
- "{#JPORT}":"8084",
- "{#JPROCESS}":"7713"
- },
- {
- "{#JPORT}":"7074",
- "{#JPROCESS}":"11239"
- },
- {
- "{#JPORT}":"8899",
- "{#JPROCESS}":"14186"
- }
- ]
- }
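The discovery output above comes from a netstat pipeline in the script. As a rough illustration of the same idea in pure Python 3, the sketch below pulls the port and PID of every listening socket owned by a java process and wraps them in the discovery JSON. The regular expression, the function names and the sample lines are assumptions made for this sketch (it expects the usual Linux `netstat -tpnl` line layout); it is not the command used by the script.

```python
import json
import re

# Matches the local port and the PID of listening sockets owned by "java",
# e.g. "... 0.0.0.0:8082 ... LISTEN 456/java" (assumed netstat layout).
LISTEN_RE = re.compile(r":(\d+)\s+\S+\s+LISTEN\s+(\d+)/java\b")


def parse_java_listeners(netstat_output):
    """Return unique (port, pid) string pairs, sorted by port number."""
    pairs = set()
    for line in netstat_output.splitlines():
        match = LISTEN_RE.search(line)
        if match:
            pairs.add((match.group(1), match.group(2)))
    return sorted(pairs, key=lambda p: int(p[0]))


def to_lld_json(pairs):
    """Wrap the pairs in the {"data": [...]} envelope shown above."""
    data = [{"{#JPORT}": port, "{#JPROCESS}": pid} for port, pid in pairs]
    return json.dumps({"data": data}, indent=2)


# Hypothetical netstat lines used only to exercise the parser.
SAMPLE = """\
tcp   0   0 0.0.0.0:8082     0.0.0.0:*   LISTEN   456/java
tcp   0   0 0.0.0.0:22       0.0.0.0:*   LISTEN   900/sshd
tcp6  0   0 :::8083          :::*        LISTEN   11992/java
"""

pairs = parse_java_listeners(SAMPLE)
print(to_lld_json(pairs))
```

A Tomcat that listens on more than one port yields one pair per port with this approach, so in practice some filtering on the port list may be wanted.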
2. Create the template
To save trouble, I simply exported the template:
- <?xml version="1.0" encoding="UTF-8"?>
- <zabbix_export>
- <version>2.0</version>
- <date>2016-11-02T08:11:35Z</date>
- <groups>
- <group>
- <name>Template java</name>
- </group>
- </groups>
- <templates>
- <template>
- <template>jmx-im-jstat</template>
- <name>jmx-im-jstat</name>
- <groups>
- <group>
- <name>Template java</name>
- </group>
- </groups>
- <applications>
- <application>
- <name>JSTAT</name>
- </application>
- </applications>
- <items>
- <item>
- <name>jmxdata</name>
- <type>0</type>
- <snmp_community/>
- <multiplier>1</multiplier>
- <snmp_oid/>
- <key>jmxdata</key>
- <delay>30</delay>
- <history>90</history>
- <trends>365</trends>
- <status>0</status>
- <value_type>3</value_type>
- <allowed_hosts/>
- <units/>
- <delta>0</delta>
- <snmpv3_contextname/>
- <snmpv3_securityname/>
- <snmpv3_securitylevel>0</snmpv3_securitylevel>
- <snmpv3_authprotocol>0</snmpv3_authprotocol>
- <snmpv3_authpassphrase/>
- <snmpv3_privprotocol>0</snmpv3_privprotocol>
- <snmpv3_privpassphrase/>
- <formula>1</formula>
- <delay_flex/>
- <params/>
- <ipmi_sensor/>
- <data_type>0</data_type>
- <authtype>0</authtype>
- <username/>
- <password/>
- <publickey/>
- <privatekey/>
- <port/>
- <description/>
- <inventory_link>0</inventory_link>
- <applications>
- <application>
- <name>JSTAT</name>
- </application>
- </applications>
- <valuemap/>
- <logtimefmt/>
- </item>
- </items>
- <discovery_rules>
- <discovery_rule>
- <name>jmxport</name>
- <type>0</type>
- <snmp_community/>
- <snmp_oid/>
- <key>jmxport</key>
- <delay>30</delay>
- <status>0</status>
- <allowed_hosts/>
- <snmpv3_contextname/>
- <snmpv3_securityname/>
- <snmpv3_securitylevel>0</snmpv3_securitylevel>
- <snmpv3_authprotocol>0</snmpv3_authprotocol>
- <snmpv3_authpassphrase/>
- <snmpv3_privprotocol>0</snmpv3_privprotocol>
- <snmpv3_privpassphrase/>
- <delay_flex/>
- <params/>
- <ipmi_sensor/>
- <authtype>0</authtype>
- <username/>
- <password/>
- <publickey/>
- <privatekey/>
- <port/>
- <filter>:</filter>
- <lifetime>1</lifetime>
- <description/>
- <item_prototypes>
- <item_prototype>
- <name>port:$1 $2</name>
- <type>2</type>
- <snmp_community/>
- <multiplier>1</multiplier>
- <snmp_oid/>
- <key>jstat[{#JPORT},Young.Gc.Count]</key>
- <delay>0</delay>
- <history>90</history>
- <trends>365</trends>
- <status>0</status>
- <value_type>3</value_type>
- <allowed_hosts/>
- <units/>
- <delta>0</delta>
- <snmpv3_contextname/>
- <snmpv3_securityname/>
- <snmpv3_securitylevel>0</snmpv3_securitylevel>
- <snmpv3_authprotocol>0</snmpv3_authprotocol>
- <snmpv3_authpassphrase/>
- <snmpv3_privprotocol>0</snmpv3_privprotocol>
- <snmpv3_privpassphrase/>
- <formula>1</formula>
- <delay_flex/>
- <params/>
- <ipmi_sensor/>
- <data_type>0</data_type>
- <authtype>0</authtype>
- <username/>
- <password/>
- <publickey/>
- <privatekey/>
- <port/>
- <description/>
- <inventory_link>0</inventory_link>
- <applications>
- <application>
- <name>JSTAT</name>
- </application>
- </applications>
- <valuemap/>
- <logtimefmt/>
- </item_prototype>
- <item_prototype>
- <name>port:$1 $2</name>
- <type>2</type>
- <snmp_community/>
- <multiplier>1</multiplier>
- <snmp_oid/>
- <key>jstat[{#JPORT},Old.Gc.Count]</key>
- <delay>0</delay>
- <history>90</history>
- <trends>365</trends>
- <status>0</status>
- <value_type>3</value_type>
- <allowed_hosts/>
- <units/>
- <delta>0</delta>
- <snmpv3_contextname/>
- <snmpv3_securityname/>
- <snmpv3_securitylevel>0</snmpv3_securitylevel>
- <snmpv3_authprotocol>0</snmpv3_authprotocol>
- <snmpv3_authpassphrase/>
- <snmpv3_privprotocol>0</snmpv3_privprotocol>
- <snmpv3_privpassphrase/>
- <formula>1</formula>
- <delay_flex/>
- <params/>
- <ipmi_sensor/>
- <data_type>0</data_type>
- <authtype>0</authtype>
- <username/>
- <password/>
- <publickey/>
- <privatekey/>
- <port/>
- <description/>
- <inventory_link>0</inventory_link>
- <applications>
- <application>
- <name>JSTAT</name>
- </application>
- </applications>
- <valuemap/>
- <logtimefmt/>
- </item_prototype>
- <item_prototype>
- <name>port:$1 $2</name>
- <type>2</type>
- <snmp_community/>
- <multiplier>1</multiplier>
- <snmp_oid/>
- <key>jstat[{#JPORT},Tomcat.Thread]</key>
- <delay>0</delay>
- <history>90</history>
- <trends>365</trends>
- <status>0</status>
- <value_type>3</value_type>
- <allowed_hosts/>
- <units/>
- <delta>0</delta>
- <snmpv3_contextname/>
- <snmpv3_securityname/>
- <snmpv3_securitylevel>0</snmpv3_securitylevel>
- <snmpv3_authprotocol>0</snmpv3_authprotocol>
- <snmpv3_authpassphrase/>
- <snmpv3_privprotocol>0</snmpv3_privprotocol>
- <snmpv3_privpassphrase/>
- <formula>1</formula>
- <delay_flex/>
- <params/>
- <ipmi_sensor/>
- <data_type>0</data_type>
- <authtype>0</authtype>
- <username/>
- <password/>
- <publickey/>
- <privatekey/>
- <port/>
- <description/>
- <inventory_link>0</inventory_link>
- <applications>
- <application>
- <name>JSTAT</name>
- </application>
- </applications>
- <valuemap/>
- <logtimefmt/>
- </item_prototype>
- </item_prototypes>
- <trigger_prototypes>
- <trigger_prototype>
- <expression>{jmx-im-jstat:jstat[{#JPORT},Tomcat.Thread].last(0)}>500</expression>
- <name>Tomcat {#JPORT} Thread is too high</name>
- <url/>
- <status>0</status>
- <priority>0</priority>
- <description/>
- <type>0</type>
- </trigger_prototype>
- </trigger_prototypes>
- <graph_prototypes>
- <graph_prototype>
- <name>port:{#JPORT} Tomcat.Thread</name>
- <width>900</width>
- <height>200</height>
- <yaxismin>0.0000</yaxismin>
- <yaxismax>100.0000</yaxismax>
- <show_work_period>1</show_work_period>
- <show_triggers>1</show_triggers>
- <type>0</type>
- <show_legend>1</show_legend>
- <show_3d>0</show_3d>
- <percent_left>0.0000</percent_left>
- <percent_right>0.0000</percent_right>
- <ymin_type_1>0</ymin_type_1>
- <ymax_type_1>0</ymax_type_1>
- <ymin_item_1>0</ymin_item_1>
- <ymax_item_1>0</ymax_item_1>
- <graph_items>
- <graph_item>
- <sortorder>0</sortorder>
- <drawtype>0</drawtype>
- <color>C80000</color>
- <yaxisside>0</yaxisside>
- <calc_fnc>2</calc_fnc>
- <type>0</type>
- <item>
- <host>jmx-im-jstat</host>
- <key>jstat[{#JPORT},Tomcat.Thread]</key>
- </item>
- </graph_item>
- </graph_items>
- </graph_prototype>
- </graph_prototypes>
- <host_prototypes/>
- </discovery_rule>
- </discovery_rules>
- <macros/>
- <templates/>
- <screens/>
- </template>
- </templates>
- </zabbix_export>
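Note: the template is too long, so only a few of the item prototypes are included here.
A few screenshots of the configuration follow.
So far only the items are configured; alerts and graphs have not been added yet.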
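Since the exported template is abridged, it may help to list every item prototype key the script can actually feed. The sketch below (Python 3, hypothetical helper names) derives them from the metric names that appear as values of jstat_dict in the script; the key pattern `jstat[{#JPORT},<metric>]` is the one used in the template.

```python
# Metric names copied from the values of jstat_dict in the script.
METRICS = [
    "Young.Space0.Percent", "Young.Space1.Percent", "Eden.Space.Percent",
    "Old.Space.Percent", "Perm.Space.Percent",
    "Old.Gc.Count", "Old.Gc.Time", "Young.Gc.Count", "Young.Gc.Time",
    "Total.Gc.Time",
    "Perm.Gc.Min", "Perm.Gc.Max", "Perm.Gc.New", "Perm.Gc.Cur",
    "Tomcat.Thread",
]


def prototype_keys(metrics, macro="{#JPORT}"):
    """Build the item prototype key for every metric."""
    return ["jstat[%s,%s]" % (macro, name) for name in metrics]


def concrete_key(port, metric):
    """Key of the item that discovery creates for one port."""
    return "jstat[%s,%s]" % (port, metric)


for key in prototype_keys(METRICS):
    print(key)
```

The time and percentage metrics carry decimals in the data the script produces (for example 36.572), so their prototypes presumably need a floating-point value type rather than the unsigned integer type used for the counters in the export.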
3. Generate the data and send it to zabbix-server with zabbix_sender
The zabbix_agentd configuration file:
- > cat /usr/local/zabbix/etc/zabbix_agentd.conf.d/jstat.conf
- UserParameter=jmxport,sudo /usr/bin/python /usr/local/zabbix/scripts/jstat.py --jvmport
- UserParameter=jmxdata,sudo /usr/bin/python /usr/local/zabbix/scripts/jstat.py --data
Run the script:
- > sudo python jstat.py --data
- /usr/local/zabbix/bin/zabbix_sender -c /usr/local/zabbix/etc/zabbix_agentd.conf -i /usr/local/zabbix/scripts/.zabbix_jmx_status
- info from server: "processed: 45; failed: 0; total: 45; seconds spent: 0.000531"
- sent: 45; skipped: 0; total: 45
The result shows that 45 values were sent successfully. The data is first staged in /usr/local/zabbix/scripts/.zabbix_jmx_status; its content looks like this:
- > cat /usr/local/zabbix/scripts/.zabbix_jmx_status
- test-01 jstat[8083,Young.Gc.Time] 36.572
- test-01 jstat[8083,Old.Gc.Time] 3.971
- test-01 jstat[8083,Perm.Gc.New] 80384.0
- test-01 jstat[8083,Total.Gc.Time] 40.543
- ...
- test-01 jstat[7074,Old.Gc.Time] 1.734
- test-01 jstat[7074,Perm.Gc.New] 55296.0
- test-01 jstat[7074,Total.Gc.Time] 26.635
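The first column is the host name. It must match the configuration in zabbix_agentd.conf, and the host name added in the Zabbix web interface must be the same.
The second column is the key, which matches the keys defined in the template.
The third column is the value.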
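The three-column layout is simple enough to build and check in a few lines. This is a small sketch in Python 3 with hypothetical helper names; it only covers the simple case used here, where neither the host name nor the key contains whitespace.

```python
def format_sender_line(host, port, metric, value):
    """One line of zabbix_sender input: '<host> <key> <value>'."""
    return "%s jstat[%s,%s] %s" % (host, port, metric, value)


def parse_sender_line(line):
    """Split a line back into (host, key, value).

    Assumes that host and key contain no whitespace, as in this article.
    """
    host, key, value = line.split(None, 2)
    return host, key, value


line = format_sender_line("test-01", 8083, "Young.Gc.Time", "36.572")
print(line)
print(parse_sender_line(line))
```

Parsing the staged file back like this is a cheap way to confirm that every key in it has a matching item on the Zabbix side before blaming the server for rejected values.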
4. View the data
The data now shows up in Zabbix. From here we can add the corresponding alerts and graphs to make the template more complete.
#############################################
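Two ways to trigger the zabbix_sender submission
1. When the item is fetched actively (as zabbix_get does), it reports a timeout. You can raise the Timeout parameter in the server and agentd conf files to 300 and then restart the services. Note that many Zabbix releases only accept Timeout values from 1 to 30, in which case 30 is the ceiling.
2. Run the script on a schedule with cron.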
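When the script runs unattended from cron, nobody reads the summary that zabbix_sender prints, so it can be useful to parse it and react when values are rejected. The sketch below (Python 3, hypothetical function names) reads the counts from a summary line in the format shown earlier in section 3; it only looks at the first summary line it finds, which is an assumption that fits small batches like this one.

```python
import re

SUMMARY_RE = re.compile(
    r"processed:\s*(\d+);\s*failed:\s*(\d+);\s*total:\s*(\d+)"
)


def parse_sender_summary(output):
    """Extract processed/failed/total counts from zabbix_sender output.

    Returns None when no summary line is found.
    """
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    processed, failed, total = (int(g) for g in match.groups())
    return {"processed": processed, "failed": failed, "total": total}


def all_accepted(output):
    """True only if a summary was found and nothing failed."""
    summary = parse_sender_summary(output)
    return summary is not None and summary["failed"] == 0


# The sample text reproduces the output shown in section 3.
SAMPLE = (
    'info from server: "processed: 45; failed: 0; total: 45; '
    'seconds spent: 0.000531"\n'
    "sent: 45; skipped: 0; total: 45\n"
)
print(parse_sender_summary(SAMPLE))
```

A cron wrapper could, for instance, exit with a non-zero status when all_accepted returns False so that the failure shows up in cron mail or logs.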