Problem description
In my Elasticsearch server I have one index, http://localhost:9200/blog. The blog index contains multiple types.
For example: http://localhost:9200/blog/posts and http://localhost:9200/blog/tags.
In the tags type I have created more than 1000 tags, and in the posts type I have created 10 posts.
For example, a post:
{
"_index":"blog",
"_type":"posts",
"_id":"1",
"_version":3,
"found":true,
"_source" : {
"catalogId" : "1",
"name" : "cricket",
"url" : "http://www.wikipedia/cricket"
}
}
For example, a tag:
{
"_index":"blog",
"_type":"tags",
"_id":"1",
"_version":3,
"found":true,
"_source" : {
"tagId" : "1",
"name" : "game"
}
}
I want to assign the existing tags to blog posts (i.e. relationship => mapping).
How do I assign the tags to posts in the mapping?
Recommended answer
There are four approaches you can use within Elasticsearch for managing relationships. They are well outlined in the Elasticsearch blog post "Managing Relations Inside Elasticsearch". I would recommend reading the entire article to get more details on each approach, and then selecting the approach that best meets your business needs while remaining technically appropriate.
Here are the highlights of the four approaches.
Inner Object
- Easy, fast, performant
- Only applicable when one-to-one relationships are maintained
- No need for special queries
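As a rough sketch of the inner-object approach applied to the question's data, the tags can simply be embedded in the post document. The field name `tags` and the helper `build_post_with_tags` are assumptions made for illustration; neither appears in the original index.

```python
import json


def build_post_with_tags(post, tags):
    """Return a copy of a post document with tags embedded as inner objects."""
    doc = dict(post)
    # "tags" is a hypothetical field name; each tag is copied into the post.
    doc["tags"] = [{"tagId": t["tagId"], "name": t["name"]} for t in tags]
    return doc


post = {"catalogId": "1", "name": "cricket", "url": "http://www.wikipedia/cricket"}
tags = [{"tagId": "1", "name": "game"}]

post_doc = build_post_with_tags(post, tags)
# Body one might send with: PUT http://localhost:9200/blog/posts/1
print(json.dumps(post_doc, indent=2))
```

With this layout an ordinary query on `tags.name` should find the post, which is what "no need for special queries" refers to. The trade-off is that renaming a tag means updating every post that embeds it.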
Nested
- Nested docs are stored in the same Lucene block as each other, which helps read/query performance. Reading a nested doc is faster than the equivalent parent/child.
- Updating a single field in a nested document (parent or nested children) forces ES to reindex the entire nested document. This can be very expensive for large nested docs
- "Cross referencing" nested documents is impossible
- Best suited for data that does not change frequently
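A minimal sketch of what the nested approach might look like for posts and tags, written as Python dictionaries that mirror the request bodies. It assumes a hypothetical `tags` field on the `posts` type and leaves the sub-field types to dynamic mapping; adjust both for your Elasticsearch version.

```python
import json

# Mapping body one might send when creating the index:
# PUT http://localhost:9200/blog
nested_mapping = {
    "mappings": {
        "posts": {
            "properties": {
                # Hypothetical field; "nested" keeps each tag as its own
                # hidden document stored alongside the post.
                "tags": {"type": "nested"}
            }
        }
    }
}


def nested_tag_query(tag_name):
    """Build a nested query that finds posts containing a tag with this name."""
    return {
        "query": {
            "nested": {
                "path": "tags",
                "query": {"match": {"tags.name": tag_name}},
            }
        }
    }


print(json.dumps(nested_mapping, indent=2))
# Body one might send with: POST http://localhost:9200/blog/posts/_search
print(json.dumps(nested_tag_query("game"), indent=2))
```

Unlike inner objects, nested fields have to be searched through the `nested` query, and changing one tag still reindexes the whole post.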
Parent/Child
- Children are stored separately from the parent, but are routed to the same shard, so parent/child is slightly less performant on read/query than nested
- Parent/child mappings have a bit of extra memory overhead, since ES maintains a "join" list in memory
- Updating a child doc does not affect the parent or any other children, which can potentially save a lot of indexing on large docs
- Sorting/scoring can be difficult with Parent/Child, since the Has Child/Has Parent operations can be opaque at times
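For the question's index, a parent/child setup could look roughly like the sketch below. It assumes an older Elasticsearch release in which one index may hold several types and a child type declares its parent through `_parent`; from 6.0 onward this was replaced by the `join` field type, so treat the bodies as illustrative. The helper names are hypothetical.

```python
import json

# Mapping body one might send when creating the index:
# PUT http://localhost:9200/blog
parent_child_mapping = {
    "mappings": {
        "posts": {},
        # Each tag document is declared a child of a post.
        "tags": {"_parent": {"type": "posts"}},
    }
}


def child_index_url(base_url, tag_id, post_id):
    """URL for indexing a tag as a child; `parent` routes it to the post's shard."""
    return "{}/blog/tags/{}?parent={}".format(base_url, tag_id, post_id)


def posts_with_tag_query(tag_name):
    """Build a has_child query returning posts that have a matching tag child."""
    return {
        "query": {
            "has_child": {
                "type": "tags",
                "query": {"match": {"name": tag_name}},
            }
        }
    }


print(json.dumps(parent_child_mapping, indent=2))
print(child_index_url("http://localhost:9200", "1", "1"))
print(json.dumps(posts_with_tag_query("game"), indent=2))
```

Note that a child document has exactly one parent, so a tag shared by several posts would need one child document per post under this scheme.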
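Denormalization
- You get to manage all the relations yourself
- Most flexible, most administrative overhead
- May be more or less performant depending on how you set it up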
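One possible way to manage the relation yourself is an application-side join: each post stores the IDs of its tags, and the application resolves names to IDs in a first search and then fetches posts in a second. The sketch below only builds the request bodies and parses a response; the `tagIds` field, the helper names, and the sample response are assumptions for illustration, not part of the original index.

```python
import json


def tag_lookup_query(tag_name):
    """Step 1: find tag documents by name (search against /blog/tags/_search)."""
    return {"query": {"match": {"name": tag_name}}}


def extract_tag_ids(search_response):
    """Pull the tagId values out of a search response's hits."""
    return [hit["_source"]["tagId"] for hit in search_response["hits"]["hits"]]


def posts_by_tag_ids_query(tag_ids):
    """Step 2: find posts whose hypothetical `tagIds` field holds any of the IDs."""
    return {"query": {"terms": {"tagIds": list(tag_ids)}}}


# Illustrative response only, shaped like a search result for the tag "game".
sample_response = {
    "hits": {"hits": [{"_id": "1", "_source": {"tagId": "1", "name": "game"}}]}
}

tag_ids = extract_tag_ids(sample_response)
print(json.dumps(tag_lookup_query("game")))
print(json.dumps(posts_by_tag_ids_query(tag_ids)))
```

This keeps tags and posts independent, so either can be updated cheaply, at the cost of two round trips and of keeping `tagIds` consistent in application code.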