Solr proximity search: ordered vs. unordered

Question

In Solr you can perform an ordered proximity search using the syntax:

"word1 word2"~10

By ordered, I mean that word1 always comes before word2 in the document. Is there an easy way to perform an unordered proximity search, i.e. one where word1 and word2 occur within 10 words of each other and it doesn't matter which comes first?

One way to do this would be:

"word1 word2"~10 OR "word2 word1"~10

The above works, but I'm looking for something simpler, if possible.

Recommended answer

Slop is the number of position moves allowed when lining the query terms up with the document. So "a b" behaves differently from "b a", because the two orderings need a different number of moves to match the same text.

  • a foo b has positions (a,1), (foo,2), (b,3). For the query (a,1), (b,2) to match it, one change is needed: (b,2) => (b,3).
  • However, for the query (b,1), (a,2) to match it, you need (a,2) => (a,1) and (b,1) => (b,3), for a total of three position movements.

In general, if "a b"~n matches something, then "b a"~(n+2) will match it too.
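The position arithmetic above can be checked with a small sketch. This is a simplified model, not Lucene's or Solr's actual scoring code: it assumes each query term occurs exactly once in the document, and the function name required_slop is hypothetical. Each term's document position is shifted back by its offset in the query, and the spread of the shifted positions is taken as the slop the phrase needs.

```python
def required_slop(query_terms, doc_tokens):
    """Smallest slop at which the phrase would match, under a simplified model.

    Assumption: every query term appears exactly once in doc_tokens.
    """
    # Position of each token in the document (0-based).
    positions = {token: index for index, token in enumerate(doc_tokens)}
    # Shift each term back by its offset inside the query phrase.
    adjusted = [positions[term] - offset for offset, term in enumerate(query_terms)]
    # The spread of the shifted positions is the number of moves needed.
    return max(adjusted) - min(adjusted)


doc = ["a", "foo", "b"]
print(required_slop(["a", "b"], doc))  # 1
print(required_slop(["b", "a"], doc))  # 3
```

For the document a foo b this gives 1 for "a b" and 3 for "b a", the same counts as in the bullets and a difference of 2, in line with the n + 2 rule.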

I guess I never gave an answer. I see two options:

  1. If you want a slop of n, increase it to n + 2
  2. Manually split your search into a disjunction (OR), as you suggested

I think #2 is probably better, unless your slop is very large to begin with.
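A minimal sketch of the two options as query-string builders. It assumes plain terms with no quotes or other characters that would need escaping in the query syntax, and the function names are hypothetical helpers, not part of Solr.

```python
def phrase(terms, slop):
    """Format a sloppy phrase query such as "word1 word2"~10."""
    return '"{}"~{}'.format(" ".join(terms), slop)


def unordered_by_wider_slop(word1, word2, slop):
    """Option 1: a single phrase whose slop is widened by 2."""
    return phrase([word1, word2], slop + 2)


def unordered_by_disjunction(word1, word2, slop):
    """Option 2: OR together both orderings at the original slop."""
    return "{} OR {}".format(
        phrase([word1, word2], slop), phrase([word2, word1], slop)
    )


print(unordered_by_wider_slop("word1", "word2", 10))
print(unordered_by_disjunction("word1", "word2", 10))
```

If the n + 2 rule holds, the wider-slop query also accepts in-order matches that are up to two positions further apart than the disjunction would allow, so the two forms are not strictly equivalent.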

07-29 11:14