Are Filters faster than Queries in Lucene?


While reading "Lucene in Action", 2nd edition, I came across the description of the Filter classes, which can be used for result filtering in Lucene. Lucene has a lot of filters that duplicate Query classes, for example NumericRangeQuery and NumericRangeFilter.

The book says that NumericRangeFilter (NRF) does exactly the same as NumericRangeQuery (NRQ), but without document scoring. Does this mean that if I do not need scoring, or if I sort documents by a document field value, I should prefer filtering over querying from a performance point of view?


I received a great answer from Uwe Schindler; let me repost it here.

If you only want to "filter" ad hoc, e.g. by a variable numeric range such as a bounding box in a geographic search, use queries; queries are in most cases faster (e.g. range queries and similar queries, called MultiTermQueries, are internally implemented with the same BitSet algorithm as the Filter; in fact they are just Filters wrapped by a Scorer implementation). Moreover, the Scorer that ANDs the query and your "filter" query together (ConjunctionScorer) is generally faster than the code that applies a filter after searching. Some improvement may be possible there, but in general Filters are something in Lucene that is not really needed anymore, so there have already been some proposals to make Filters and Queries the same and then also be able to cache non-scoring queries. This would simplify a lot of code.
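To make the "same BitSet algorithm" point concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: document IDs are simply array indexes, and the class `BitSetFilterSketch` and its methods `rangeBits`, `conjunction` and `postFilter` are hypothetical names invented for illustration. It only shows that intersecting two range constraints and applying one of them as a filter after the fact select the same documents; it says nothing about how fast Lucene's actual ConjunctionScorer is.

```java
import java.util.BitSet;

// Toy model only: a document ID is an array index and a "filter" is a BitSet
// with one bit per document. None of this is Lucene's implementation.
final class BitSetFilterSketch {

    // Mark every document whose value lies in the inclusive range [min, max].
    static BitSet rangeBits(int[] values, int min, int max) {
        BitSet bits = new BitSet(values.length);
        for (int docId = 0; docId < values.length; docId++) {
            if (values[docId] >= min && values[docId] <= max) {
                bits.set(docId);
            }
        }
        return bits;
    }

    // Conjunction: intersect both constraints in a single step.
    static BitSet conjunction(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    // Post-filtering: walk the query hits and keep those the filter accepts.
    static BitSet postFilter(BitSet queryHits, BitSet filter) {
        BitSet result = new BitSet(queryHits.length());
        for (int docId = queryHits.nextSetBit(0); docId >= 0;
                 docId = queryHits.nextSetBit(docId + 1)) {
            if (filter.get(docId)) {
                result.set(docId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical bounding-box data: one latitude/longitude per document.
        int[] latitude  = {10, 48, 52, 51, 30};
        int[] longitude = { 5, 11, 13, 40, 12};

        BitSet latOk = rangeBits(latitude, 45, 55);   // documents 1, 2, 3
        BitSet lonOk = rangeBits(longitude, 10, 15);  // documents 1, 2, 4

        System.out.println(conjunction(latOk, lonOk)); // prints {1, 2}
        System.out.println(postFilter(latOk, lonOk));  // prints {1, 2}
    }
}
```

Both approaches return the same documents; the difference the answer describes is purely in how efficiently Lucene executes each path.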

Filters can bring a huge speed improvement with Lucene 4.0 if they are plugged on top of the IndexReader to filter the documents before scoring, but that is not yet implemented (see https://issues.apache.org/jira/browse/LUCENE-3212); I am working on it. We may also make Filters random access (this is easy, as they are bitsets), which could also improve after-query filtering. But I would then also make Queries partially random access where they can support it (like queries that are based only on FieldCache).
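The idea of filtering before scoring can be illustrated with another toy sketch in plain Java. Again, this is not Lucene code and does not reflect how LUCENE-3212 was or would be implemented: the class `FilterBeforeScoringSketch`, the `score` function and both search methods are hypothetical. The sketch only counts how many times a scoring function runs when a random-access bit check happens before scoring rather than after it.

```java
import java.util.BitSet;

// Toy model only: counts scoring calls under two orderings of the same work.
final class FilterBeforeScoringSketch {

    // Stand-in for an expensive per-document scoring computation (hypothetical).
    static double score(int docId) {
        return 1.0 / (docId + 1);
    }

    // Score every document first, then discard the ones the filter rejects.
    // Returns {hits, scoringCalls}.
    static int[] scoreThenFilter(int maxDoc, BitSet accepted) {
        int hits = 0;
        int scoringCalls = 0;
        for (int docId = 0; docId < maxDoc; docId++) {
            double s = score(docId);
            scoringCalls++;
            if (accepted.get(docId) && s > 0.0) {
                hits++;
            }
        }
        return new int[] {hits, scoringCalls};
    }

    // Check the filter bit first (random access), and score only accepted documents.
    // Returns {hits, scoringCalls}.
    static int[] filterThenScore(int maxDoc, BitSet accepted) {
        int hits = 0;
        int scoringCalls = 0;
        for (int docId = 0; docId < maxDoc; docId++) {
            if (!accepted.get(docId)) {
                continue; // rejected documents are never scored
            }
            double s = score(docId);
            scoringCalls++;
            if (s > 0.0) {
                hits++;
            }
        }
        return new int[] {hits, scoringCalls};
    }

    public static void main(String[] args) {
        BitSet accepted = new BitSet(1000);
        accepted.set(0, 10); // the filter accepts documents 0..9 only

        int[] after = scoreThenFilter(1000, accepted);
        int[] before = filterThenScore(1000, accepted);
        System.out.println("score then filter: hits=" + after[0] + ", scored=" + after[1]);
        System.out.println("filter then score: hits=" + before[0] + ", scored=" + before[1]);
    }
}
```

With a selective filter, the second ordering produces the same hits while scoring far fewer documents, which is the kind of saving the answer anticipates.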

