Problem description
My app is hosted on an Amazon EC2 cluster. Each instance writes events to log files. I need to collect (and data-mine) these logs at the end of each day. What's a recommended way to collect these logs in a central location? I have thought of several options, but I'm not sure which way to go:
- scp them to a single instance with a cron job
- Log all events over TCP/IP to a single instance
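The first option could be sketched as a crontab entry on each instance. This is a minimal illustration, not a tested setup; the paths, the `collector.example.com` hostname, and the destination directory are assumptions. Note that `%` must be escaped as `\%` inside a crontab line.

```
# Hypothetical crontab entry on each EC2 instance: at 23:55 every day,
# copy the day's log file to a central collector host, tagging the
# copy with this instance's hostname and the date.
55 23 * * * scp /var/log/myapp/events.log collector.example.com:/data/logs/$(hostname)-$(date +\%F).log
```

A downside of this approach is that logs only become available in batches at the end of the day, which is one reason streaming approaches like the one in the answer below are often preferred.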
Recommended answer
We use Logstash on each host (deployed via Puppet) to gather log events and ship them to a message queue (RabbitMQ, but it could be Redis) on a central host. Another Logstash instance retrieves the events, processes them, and stuffs the results into ElasticSearch. A Kibana web interface is used to search through this database.
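This pipeline can be sketched as two Logstash configuration fragments. These are illustrative only: the log path, queue host, and exchange/queue names are assumptions, and the filter stage would be replaced by whatever parsing your log format needs.

```
# Shipper config (runs on each EC2 instance):
# tail the app's log files and publish each event to RabbitMQ.
input {
  file {
    path => "/var/log/myapp/*.log"   # hypothetical log location
  }
}
output {
  rabbitmq {
    host          => "queue.example.com"   # central queue host (assumption)
    exchange      => "logs"
    exchange_type => "direct"
    key           => "logstash"
  }
}
```

```
# Indexer config (runs on the central host):
# consume events from RabbitMQ, parse them, and index into ElasticSearch.
input {
  rabbitmq {
    host  => "localhost"
    queue => "logs"
    key   => "logstash"
  }
}
filter {
  # Example parse step; swap in a pattern matching your own log format.
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```

Decoupling the shippers from the indexer via the queue means instances keep logging even if the indexer is briefly down, and the indexer can be scaled independently.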
It's very capable, scales easily, and is very flexible. Logstash has tons of filters to process events from various inputs, and it can output to lots of services, ElasticSearch being one of them. We currently ship about 1.2 million log events per day from our EC2 instances, on light hardware. In our setup, the latency from a log event occurring to it being searchable is about 1 second.
Here's some documentation on this kind of setup: https://www.elastic.co/guide/en/logstash/current/getting-started-with-logstash.html, and a demo of the Kibana search interface with some live data.