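I am trying to get alert notifications from Bosun (using the Docker image). When CPU usage is high on a client VM, Bosun's UI shows the alert as critical, but no notification is sent. I would also like to know how to debug the configuration file.
My configuration file contains: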
tsdbHost = localhost:4242
stateFile = /data/bosun.state
template test {
    body = `Alert definition:
    Name: {{.Alert.Name}}
    Crit: {{.Alert.Crit}}
    Tags:
    <table>
        {{range $k, $v := .Group}}
            <tr><td>{{$k}}</td><td>{{$v}}</td></tr>
        {{end}}
    </table>`
    subject = {{.Last.Status}}: {{.Alert.Name}} on {{.Group.host}}
}
notification json {
    post = http://localhost:8080/alert
    body = {"text": {{.|json}}}
    contentType = application/json
    next = json
    timeout = 5s
    print = true
}
alert test {
    template = test
    $speed = avg(q("sum:rate{counter,,1}:linux.cpu{host=aaa}", "1h", ""))
    crit = $speed>195
    warn = $speed>180
    critNotification = json
    warnNotification = json
}
My log file "/var/log/supervisor/bosun stderr---supervisor nhxzko.log" contains:
2015/12/17 06:27:19 info: search.go:199: Backing up last data to redis
2015/12/17 06:29:20 info: search.go:199: Backing up last data to redis
2015/12/17 06:30:08 info: notify.go:122: Batching and sending unknown notifications
2015/12/17 06:30:08 info: notify.go:152: Done sending unknown notifications
2015/12/17 06:30:13 info: bolt.go:79: wrote notifications: 48.00B
2015/12/17 06:30:13 info: bolt.go:79: wrote silence: 140.00B
2015/12/17 06:30:13 info: bolt.go:79: wrote status: 767.00B
2015/12/17 06:30:13 info: bolt.go:103: save to db complete
2015/12/17 06:31:20 info: search.go:199: Backing up last data to redis
2015/12/17 06:33:21 info: search.go:199: Backing up last data to redis
The server I am running locally to receive the alerts contains:
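package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

// handleAlerts prints the body of each notification POSTed to /alert.
func handleAlerts(res http.ResponseWriter, req *http.Request) {
	defer req.Body.Close()
	// req.Body is an io.ReadCloser, so read it to get the payload
	// instead of printing the reader value itself.
	body, err := io.ReadAll(req.Body)
	if err != nil {
		http.Error(res, "could not read request body", http.StatusBadRequest)
		return
	}
	fmt.Printf("request received: %s\n", body)
}

func main() {
	http.HandleFunc("/alert", handleAlerts)
	fmt.Println("Starting server on 8080...")
	// ListenAndServe blocks and only returns when the server fails.
	if err := http.ListenAndServe(":8080", nil); err != nil {
		log.Fatal("Alerts: Server Down: ", err)
	}
}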
Best answer
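The two most common reasons people do not receive any notifications seem to be:
1. Bosun is running in quiet mode, which suppresses all notifications. This is the case when bosun is started with the -q switch.
2. Misunderstanding Bosun's alert flow. An incident that has already been critical during its lifetime (and has not been closed) will not notify again. See the "The lifetime of an incident" section of the documentation (http://bosun.org/usage#the-lifetime-of-an-incident).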
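
To make the second point concrete, here is a minimal sketch of how I understand the rule. It is a simplified model, not Bosun's actual code: the `Incident` type and its `Observe` and `Close` methods are names made up for illustration, and the model assumes a notification goes out only when an incident opens or when its severity rises above the worst status seen so far.

```go
package main

import "fmt"

// Status is a simplified severity scale; a higher value is worse.
type Status int

const (
	Normal Status = iota
	Warning
	Critical
)

// Incident is a hypothetical stand-in for the state Bosun keeps per alert.
type Incident struct {
	Open  bool
	Worst Status // worst status seen since the incident opened
}

// Observe records a check result and reports whether this simplified
// model would send a notification for it.
func (i *Incident) Observe(s Status) bool {
	if s == Normal {
		// Going back to normal does not close the incident by itself.
		return false
	}
	if !i.Open {
		i.Open, i.Worst = true, s
		return true
	}
	if s > i.Worst {
		// Escalation, for example warning to critical.
		i.Worst = s
		return true
	}
	// Same or lower severity on an incident that is still open.
	return false
}

// Close models a user closing the incident in the UI.
func (i *Incident) Close() { *i = Incident{} }

func main() {
	var inc Incident
	for _, s := range []Status{Warning, Critical, Normal, Critical} {
		fmt.Println(inc.Observe(s)) // true, true, false, false
	}
	inc.Close()
	fmt.Println(inc.Observe(Critical)) // true
}
```

Under this model, the second critical result is silent because the incident is still open and has already been critical. Only after the incident is closed does a critical result notify again, which matches seeing "critical" in the UI without receiving a new notification.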