How reliable is Elasticsearch as a primary datastore against factors like write loss and data availability?

Question

I am working on a project that requires a generic dashboard where users can do different kinds of grouping, filtering and drill-down on different fields. For this we are looking for a search store that allows us to slice and dice the data.

There will be multiple sources of data, all of which will be stored in the search store. Some pre-computation may be required on the source data, which can be done by an intermediate component.

I have looked through several blogs to understand whether ES can also be used reliably as a primary datastore. It mostly depends on the use case. Some information about ours:

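  • Approximately 300 million records per year, each 1-2 KB in size.
  • Assuming we store 1 year of data, we have 300 GB today, but allowing for growth the use case could reach 400-500 GB.
  • We are not yet sure how we will push the data, but roughly it could go up to about 2-3 million records every 5 minutes.
  • Search requests are infrequent but require complex queries, which can search data from the past 6 weeks to 6 months.
  • Documents will be indexed on almost all of their fields.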
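
As a rough cross-check of the figures above, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: it assumes decimal units (1 GB = 1,000,000 KB) and counts raw payload only, so replicas and the on-disk overhead of indexing almost every field are not included. The constant and function names are made up for illustration.

```python
# Back-of-the-envelope sizing using only the figures stated in the question.
RECORDS_PER_YEAR = 300_000_000
RECORD_SIZE_KB = (1, 2)                            # lower and upper bound per record
PEAK_RECORDS_PER_5_MIN = (2_000_000, 3_000_000)    # lower and upper bound per window


def raw_size_gb(records, size_kb):
    """Raw payload size in GB, assuming 1 GB = 1,000,000 KB."""
    return records * size_kb / 1_000_000


def docs_per_second(records_per_window, window_seconds=300):
    """Average ingest rate over one 5-minute window."""
    return records_per_window / window_seconds


yearly_gb = tuple(raw_size_gb(RECORDS_PER_YEAR, kb) for kb in RECORD_SIZE_KB)
peak_rate = tuple(docs_per_second(n) for n in PEAK_RECORDS_PER_5_MIN)

print(f"raw data per year: {yearly_gb[0]:.0f}-{yearly_gb[1]:.0f} GB")
print(f"peak ingest rate:  {peak_rate[0]:.0f}-{peak_rate[1]:.0f} docs/s")
```

Under these assumptions the raw volume works out to 300-600 GB per year, and the peak push rate to roughly 6,700-10,000 documents per second. The peak figure must describe bursts rather than a sustained rate: at 2 million records per 5 minutes, a whole year's 300 million records would arrive in about 12.5 hours.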

Some blogs say that it is reliable enough to use as a primary data store.

And some blogs say that ES has a few limitations.

Has anyone used Elasticsearch as the sole source of truth for their data, without a primary store such as PostgreSQL, DynamoDB or RDS? I have read that ES has certain issues, such as split brain and index corruption, that can lead to data loss. So I would like to know whether anyone has used ES this way and run into trouble with their data.

Thanks.

Recommended answer

Short answer: it depends on your use case, but you probably don't want to use it as a primary store.

Longer answer: You should really understand all of the possible issues that can come up around resiliency and data loss. Elastic has some great documentation on these issues, which you should understand before using it as a primary data store. In addition, Aphyr's post on the topic is a good resource.

If you understand the risks you are taking and you believe that those risks are acceptable (e.g. because losing a small amount of data is not a problem for your application), then you should feel free to go ahead and try it.
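
One concrete source of unnoticed write loss is worth illustrating: a bulk indexing request can succeed as a whole while individual documents in it are rejected. The sketch below assumes the response body has the shape documented for the Bulk API (a top-level `errors` flag and an `items` list with a per-item `status`). The helper name `failed_bulk_items`, the index name `events` and the sample response are hypothetical, and no client library is involved.

```python
def failed_bulk_items(bulk_response):
    """Return (position, action, status, error) for every item that was not written."""
    if not bulk_response.get("errors"):
        return []  # the top-level flag says every item succeeded
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item is a single-key dict such as {"index": {...}} or {"create": {...}}.
        action, result = next(iter(item.items()))
        status = result.get("status", 0)
        if status >= 300 or "error" in result:
            failures.append((position, action, status, result.get("error")))
    return failures


# Hypothetical response body, shaped like a Bulk API response (assumption).
sample = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_index": "events", "_id": "1", "status": 201, "result": "created"}},
        {"index": {"_index": "events", "_id": "2", "status": 429,
                   "error": {"type": "es_rejected_execution_exception",
                             "reason": "queue full"}}},
    ],
}

for position, action, status, error in failed_bulk_items(sample):
    print(position, action, status, error["type"])
```

If ES is the only copy of the data, the failed positions have to be retried or logged by the writer, because nothing upstream will re-send them. Whether that bookkeeping is acceptable is part of the risk assessment described above.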

