Problem Description
I have set up kube-prometheus in my cluster (https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus). It contains some default alerts, such as "CoreDNSDown". How do I create my own alerts?
Could anyone provide me a sample example of how to create an alert that will send an email to my Gmail account?
I followed this: Alert when docker container pod is in Error or CrashLoopBackOff kubernetes. But I couldn't make it work.
Recommended Answer
To send an alert to your Gmail account, you need to set up the Alertmanager configuration in a file, say alertmanager.yaml:
cat <<EOF > alertmanager.yaml
route:
  group_by: [alertname]
  # Send all notifications to me.
  receiver: email-me
receivers:
- name: email-me
  email_configs:
  - to: $GMAIL_ACCOUNT
    from: $GMAIL_ACCOUNT
    smarthost: smtp.gmail.com:587
    auth_username: "$GMAIL_ACCOUNT"
    auth_identity: "$GMAIL_ACCOUNT"
    auth_password: "$GMAIL_AUTH_TOKEN"
EOF
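Before loading this file into the cluster, you can optionally validate its syntax locally with amtool, the CLI that ships with Alertmanager releases (this assumes amtool is installed on your machine):

```shell
# Validate the Alertmanager configuration file locally;
# amtool is included in the Alertmanager release tarball.
amtool check-config alertmanager.yaml
```

If the file is valid, amtool reports the parsed routes and receivers; otherwise it prints the parse error, which is much easier to debug than a crash-looping Alertmanager pod.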
Now, since you're using kube-prometheus, you will have a secret named alertmanager-main, which holds the default configuration for Alertmanager. You need to recreate the alertmanager-main secret with the new configuration using the following command:
kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring
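Note that kubectl create fails if the secret already exists, so with kube-prometheus you typically delete the default secret first and then verify the new one. A sketch, assuming the default monitoring namespace:

```shell
# Remove the default secret shipped with kube-prometheus,
# then recreate it from your new configuration file.
kubectl delete secret alertmanager-main -n monitoring
kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring

# Verify the secret now contains your file (data keys are base64-encoded):
kubectl get secret alertmanager-main -n monitoring \
  -o jsonpath='{.data.alertmanager\.yaml}' | base64 --decode
```

The Alertmanager pods managed by the operator should pick up the new secret and reload the configuration shortly afterwards.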
Now your Alertmanager is set up to send an email whenever it receives an alert from Prometheus.
Next, you need to set up an alert for which the mail will be sent. You can set up a DeadMansSwitch alert, which fires in every case and is used to check that your alerting pipeline works end to end:
groups:
- name: meta
  rules:
  - alert: DeadMansSwitch
    expr: vector(1)
    labels:
      severity: critical
    annotations:
      description: This is a DeadMansSwitch meant to ensure that the entire alerting pipeline is functional.
      summary: Alerting DeadMansSwitch
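With recent versions of the Prometheus Operator, rules can also be delivered as a PrometheusRule custom resource instead of a plain ConfigMap. A minimal sketch; the metadata name is illustrative, and the labels must match your Prometheus ruleSelector:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: deadmansswitch-rule   # illustrative name
  namespace: monitoring
  labels:
    prometheus: k8s           # must match your Prometheus ruleSelector
    role: alert-rules
spec:
  groups:
  - name: meta
    rules:
    - alert: DeadMansSwitch
      expr: vector(1)
      labels:
        severity: critical
      annotations:
        summary: Alerting DeadMansSwitch
```

Apply it with kubectl apply, and the operator mounts the rule into Prometheus automatically.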
After that, the DeadMansSwitch alert will fire and an email should be sent to your account.
The DeadMansSwitch alert should go in a ConfigMap which your Prometheus is reading. I will share the relevant snippets from my Prometheus here:
"spec": {
    "alerting": {
        "alertmanagers": [
            {
                "name": "alertmanager-main",
                "namespace": "monitoring",
                "port": "web"
            }
        ]
    },
    "baseImage": "quay.io/prometheus/prometheus",
    "replicas": 2,
    "resources": {
        "requests": {
            "memory": "400Mi"
        }
    },
    "ruleSelector": {
        "matchLabels": {
            "prometheus": "prafull",
            "role": "alert-rules"
        }
    },
The above config is from my prometheus.json file, which specifies the name of the Alertmanager to use and the ruleSelector, which selects rules based on the prometheus and role labels. So my rules ConfigMap looks like this:
kind: ConfigMap
apiVersion: v1
metadata:
  name: prometheus-rules
  namespace: monitoring
  labels:
    role: alert-rules
    prometheus: prafull
data:
  alert-rules.yaml: |+
    groups:
    - name: alerting_rules
      rules:
      - alert: LoadAverage15m
        expr: node_load15 >= 0.50
        labels:
          severity: major
        annotations:
          summary: "Instance {{ $labels.instance }} - high load average"
          description: "{{ $labels.instance }} (measured by {{ $labels.job }}) has high load average ({{ $value }}) over 15 minutes."
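To load the rules, apply the ConfigMap and then confirm the rule shows up in the Prometheus UI. A sketch, assuming the ConfigMap is saved as prometheus-rules.yaml and the default kube-prometheus service name:

```shell
# Load the rules ConfigMap into the cluster.
kubectl apply -f prometheus-rules.yaml

# Port-forward the Prometheus service (prometheus-k8s is the
# kube-prometheus default) and open http://localhost:9090/alerts
# to verify the rule was loaded:
kubectl -n monitoring port-forward svc/prometheus-k8s 9090
```

The new alert should appear on the Alerts page once Prometheus reloads its rule files.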
Replace the example rule with the DeadMansSwitch rule in the ConfigMap above.