

I was reading about inverted indexes (used by text search engines such as Solr and Elasticsearch), and as I understand it (taking "Person" as an example):


The attribute-to-Person relationship is inverted:

John -> PersonId(1), PersonId(2), PersonId(3)
London -> PersonId(1), PersonId(2), PersonId(5)


I can now search the person records for 'John who lives in London'
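For concreteness, that lookup can be sketched as an intersection of the two posting lists. This is only a minimal illustration using the IDs from the example above; the `search` helper is a hypothetical name, not something Solr or Elasticsearch exposes.

```python
# Inverted index from the example above: attribute value -> set of person IDs.
inverted_index = {
    "John": {1, 2, 3},
    "London": {1, 2, 5},
}


def search(index, *terms):
    """Return the sorted IDs that appear in the posting list of every term."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return []
    # A person matches only if the ID is present for all requested terms.
    return sorted(set.intersection(*postings))


print(search(inverted_index, "John", "London"))  # prints [1, 2]
```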


Doesn't this solve all the problems? Why do we have the forward (or regular database) index at all? In other words, in what cases is regular indexing useful? Please explain. Thanks.



The point that you're missing is that there is no real technical distinction between a forward index and an inverted index. "Forward" and "inverted" in this case are just descriptive terms to distinguish between:

  • The list of words contained in a document.
  • The list of documents that contain a word.


The concept of an inverted index only makes sense if the concept of a regular (forward) index already exists. In the context of a search engine, a forward index would be the term vector: a list of the terms contained within a particular document. The inverted index would be a list of the documents containing a given term.
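As a rough sketch of that relationship, the snippet below builds a forward index from two made-up documents and then inverts it. The documents, the whitespace tokenization, and the `invert` helper are all assumptions for illustration; a real search engine analyzes and stores terms quite differently.

```python
from collections import defaultdict

# Hypothetical documents: document ID -> text (tokenized on whitespace).
documents = {
    1: "john lives in london",
    2: "mary lives in paris",
}

# Forward index (term vector): document ID -> terms contained in that document.
forward_index = {doc_id: text.split() for doc_id, text in documents.items()}


def invert(forward):
    """Build term -> sorted list of document IDs from a forward index."""
    inverted = defaultdict(set)
    for doc_id, terms in forward.items():
        for term in terms:
            inverted[term].add(doc_id)
    return {term: sorted(ids) for term, ids in inverted.items()}


# Inverted index: term -> documents containing that term.
inverted_index = invert(forward_index)

print(forward_index[1])          # ['john', 'lives', 'in', 'london']
print(inverted_index["lives"])   # [1, 2]
print(inverted_index["london"])  # [1]
```

Both structures are plain mappings over the same data; only the direction of the key-to-value relationship differs.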

When you understand that the terms "forward" and "inverted" are really just relative terms used to describe the nature of the index you're talking about - and that really an index is just an index - your question doesn't really make sense any more.

