Problem description
I am trying to do a relational search with Lucene.NET 4.8 (actually, I compiled it from the latest sources) by following this post. I reference Lucene.Net, Lucene.Net.Analysis.Common, Lucene.Net.Grouping, Lucene.Net.Join, and Lucene.Net.QueryParser.
The problem is: I do not get any results. In my example below I consider the blog to be the parent and the comments to be the children. I want to find a blog which contains "first" and which has a comment containing "like" (that is, the blog with Id 1).
How can I fix this sample code?
static void BlockJoinQueryTest(string dbFolder)
{
    var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
    var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);
    config.SetOpenMode(IndexWriterConfig.OpenMode_e.CREATE_OR_APPEND);
    var indexPathBlog = dbFolder + "\\blog_db";
    if (System.IO.Directory.Exists(indexPathBlog))
    {
        System.IO.Directory.Delete(indexPathBlog, true);
    }
    System.IO.Directory.CreateDirectory(indexPathBlog);
    var indexDirectoryBlog = FSDirectory.Open(new System.IO.DirectoryInfo(indexPathBlog));
    var indexWriterBlog = new IndexWriter(indexDirectoryBlog, config);

    Document comment = new Document();
    comment.Add(new TextField("BlogId", "1", Field.Store.YES));
    comment.Add(new TextField("CommentContent", "I like your first blog!", Field.Store.YES));
    comment.Add(new TextField("Type", "comment", Field.Store.YES));
    comment.Add(new TextField("Note", "child", Field.Store.YES));
    indexWriterBlog.AddDocument(comment);

    comment = new Document();
    comment.Add(new TextField("BlogId", "1", Field.Store.YES));
    comment.Add(new TextField("CommentContent", "Not that great.", Field.Store.YES));
    comment.Add(new TextField("Type", "comment", Field.Store.YES));
    comment.Add(new TextField("Note", "child", Field.Store.YES));
    indexWriterBlog.AddDocument(comment);

    Document blog = new Document();
    blog.Add(new TextField("Id", "1", Field.Store.YES));
    blog.Add(new TextField("BlogContent", "Content of first blog", Field.Store.YES));
    blog.Add(new TextField("Type", "blog", Field.Store.YES));
    blog.Add(new TextField("Note", "parent", Field.Store.YES));
    indexWriterBlog.AddDocument(blog);

    blog = new Document();
    blog.Add(new TextField("Id", "2", Field.Store.YES));
    blog.Add(new TextField("BlogContent", "This is the second blog!", Field.Store.YES));
    blog.Add(new TextField("Type", "blog", Field.Store.YES));
    blog.Add(new TextField("Note", "parent", Field.Store.YES));
    indexWriterBlog.AddDocument(blog);
    indexWriterBlog.Commit();

    var searcher = new IndexSearcher(DirectoryReader.Open(indexDirectoryBlog));
    Console.WriteLine("Begin content enumeration:");
    for (int i = 0; i < searcher.IndexReader.MaxDoc; i++)
    {
        var doc = searcher.IndexReader.Document(i);
        Console.WriteLine("Document " + i + ": " + doc.ToString());
    }
    Console.WriteLine("End content enumeration.");

    Filter blogs = new CachingWrapperFilter(
        new QueryWrapperFilter(
            new TermQuery(
                new Term("Type", "blog"))));
    BooleanQuery commentQuery = new BooleanQuery();
    commentQuery.Add(new TermQuery(new Term("CommentContent", "like")), BooleanClause.Occur.MUST);
    //commentQuery.Add(new TermQuery(new Term("BlogId", "1")), BooleanClause.Occur.MUST);
    var commentJoinQuery = new ToParentBlockJoinQuery(
        commentQuery,
        blogs,
        ScoreMode.None);
    BooleanQuery query = new BooleanQuery();
    query.Add(new TermQuery(new Term("BlogContent", "first")), BooleanClause.Occur.MUST);
    query.Add(commentQuery, BooleanClause.Occur.MUST);
    var c = new ToParentBlockJoinCollector(
        Sort.RELEVANCE, // sort
        10, // numHits
        true, // trackScores
        false // trackMaxScore
    );
    searcher.Search(query, c);
    int maxDocsPerGroup = 10;
    var hits = c.GetTopGroups(
        commentJoinQuery,
        Sort.INDEXORDER,
        0, // offset
        maxDocsPerGroup, // maxDocsPerGroup
        0, // withinGroupOffset
        true // fillSortFields
    );
    if (hits != null)
    {
        Console.WriteLine("Found " + hits.TotalGroupCount + " groups:");
        for (int i = 0; i < hits.TotalGroupCount; i++)
        {
            var group = hits.Groups[i];
            Console.WriteLine("Group " + i + ": " + group.ToString());
            for (int j = 0; j < group.TotalHits && j < maxDocsPerGroup; j++)
            {
                Document doc = searcher.Doc(group.ScoreDocs[j].Doc);
                Console.WriteLine("Hit " + i + ": " + doc.ToString());
            }
        }
    }
    else
    {
        Console.WriteLine("No hits.");
    }
    Console.WriteLine("Done.");
}
Recommended answer
I also stumbled across this and managed to fix it.
- @Ant is correct in stating that the parent document must be the last document in its block.
But there were two remaining problems with the code:
- For some reason (sorry, I am not a Lucene expert), when CommentContent is a sentence ("I like your first blog!") and you search it with a TermQuery, you do not get any results. I guess this has something to do with how the field is analyzed. So what I did was replace the comment text with the single word "blog".
- Now the IndexSearcher seemed to find a result, but threw an error: "System.InvalidOperationException: 'parentFilter must return FixedBitSet; got Lucene.Net.Search.QueryWrapperFilter+DocIdSetAnonymousInnerClassHelper". Looking through the test cases of lucene.net (GitHub), I saw that I had to wrap the parentQuery in a FixedBitSetCachingWrapperFilter: Filter parentQuery = new FixedBitSetCachingWrapperFilter(new QueryWrapperFilter(new TermQuery(new Term("Type", "blog"))));
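On the analysis point: a TermQuery does not run its term through an analyzer; the term is compared verbatim against the tokens that were written to the index. The sketch below is only a rough illustration of that idea, not Lucene code. Tokenize is a hypothetical stand-in that splits on non-alphanumeric characters and lowercases, which approximates only part of what StandardAnalyzer does (stop words and Unicode segmentation rules are ignored here).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class AnalysisSketch
{
    // Hypothetical stand-in for an analyzer: split on anything that is not a
    // letter or digit, then lowercase. The real StandardAnalyzer does more.
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        foreach (char ch in text)
        {
            if (char.IsLetterOrDigit(ch))
            {
                current.Append(char.ToLowerInvariant(ch));
            }
            else if (current.Length > 0)
            {
                tokens.Add(current.ToString());
                current.Clear();
            }
        }
        if (current.Length > 0)
        {
            tokens.Add(current.ToString());
        }
        return tokens;
    }

    // Models a term lookup: the term must equal an indexed token exactly.
    public static bool TermMatches(string fieldText, string term)
    {
        return Tokenize(fieldText).Contains(term);
    }

    static void Main()
    {
        const string comment = "I like your first blog!";
        Console.WriteLine(string.Join(" | ", Tokenize(comment))); // i | like | your | first | blog
        Console.WriteLine(TermMatches(comment, "like"));  // True: lowercase single token
        Console.WriteLine(TermMatches(comment, "Like"));  // False: case differs from the indexed token
        Console.WriteLine(TermMatches(comment, "blog!")); // False: punctuation is not part of the token
    }
}
```

Under this model a lowercase single-word term such as "like" or "blog" corresponds to an indexed token, while a term containing capitals or punctuation does not, so terms passed to TermQuery should be written the way the analyzer would have emitted them.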
The complete code is:
var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);
config.SetOpenMode(OpenMode.CREATE_OR_APPEND);

// Start from an empty index directory.
var indexPathBlog = Path.Combine(Environment.CurrentDirectory, "index");
if (System.IO.Directory.Exists(indexPathBlog))
{
    System.IO.Directory.Delete(indexPathBlog, true);
}
System.IO.Directory.CreateDirectory(indexPathBlog);
var indexDirectoryBlog = FSDirectory.Open(new System.IO.DirectoryInfo(indexPathBlog));
var indexWriterBlog = new IndexWriter(indexDirectoryBlog, config);

// Each list is one block: child documents first, the parent document last.
var one = new List<Document>();
var two = new List<Document>();

Document commentOne = new Document();
commentOne.Add(new TextField("BlogId", "1", Field.Store.YES));
commentOne.Add(new TextField("CommentContent", "blog", Field.Store.YES));
commentOne.Add(new TextField("Type", "comment", Field.Store.YES));
commentOne.Add(new TextField("Note", "child", Field.Store.YES));
one.Add(commentOne);

var blogOne = new Document();
blogOne.Add(new TextField("Id", "1", Field.Store.YES));
blogOne.Add(new TextField("BlogContent", "Content of first blog!", Field.Store.YES));
blogOne.Add(new TextField("Type", "blog", Field.Store.NO));
blogOne.Add(new TextField("Note", "parent", Field.Store.YES));
one.Add(blogOne);

var commentTwo = new Document();
commentTwo.Add(new TextField("BlogId", "2", Field.Store.YES));
commentTwo.Add(new TextField("CommentContent", "Not that great.", Field.Store.YES));
commentTwo.Add(new TextField("Type", "comment", Field.Store.YES));
commentTwo.Add(new TextField("Note", "child", Field.Store.YES));
two.Add(commentTwo);

Document blogTwo = new Document();
blogTwo.Add(new TextField("Id", "2", Field.Store.YES));
blogTwo.Add(new TextField("BlogContent", "This is the second blog!", Field.Store.YES));
blogTwo.Add(new TextField("Type", "blog", Field.Store.NO));
blogTwo.Add(new TextField("Note", "parent", Field.Store.YES));
two.Add(blogTwo);

// AddDocuments (plural) indexes each list as one contiguous block.
indexWriterBlog.AddDocuments(one);
indexWriterBlog.AddDocuments(two);
indexWriterBlog.Commit();

var searcher = new IndexSearcher(DirectoryReader.Open(indexDirectoryBlog));

// The parent filter must return a FixedBitSet, hence the FixedBitSetCachingWrapperFilter.
Filter parentQuery =
    new FixedBitSetCachingWrapperFilter(
        new QueryWrapperFilter(
            new TermQuery(
                new Term("Type", "blog"))));
BooleanQuery childQuery = new BooleanQuery();
childQuery.Add(new TermQuery(new Term("CommentContent", "blog")), Occur.MUST);
var commentJoinQuery = new ToParentBlockJoinQuery(
    childQuery,
    parentQuery,
    ScoreMode.None);

// Note: 'query' is built here, but the search below runs commentJoinQuery directly.
BooleanQuery query = new BooleanQuery();
//query.Add(new TermQuery(new Term("Type", "blog")), BooleanClause.Occur.MUST);
query.Add(commentJoinQuery, Occur.MUST);
var c = new ToParentBlockJoinCollector(
    Sort.RELEVANCE, // sort
    10, // numHits
    false, // trackScores
    false // trackMaxScore
);
searcher.Search(commentJoinQuery, c);
int maxDocsPerGroup = 10;
var hits = c.GetTopGroups(
    commentJoinQuery,
    Sort.INDEXORDER,
    0, // offset
    maxDocsPerGroup, // maxDocsPerGroup
    0, // withinGroupOffset
    true // fillSortFields
);
if (hits != null)
{
    Console.WriteLine("Found " + hits.TotalGroupCount + " groups:");
    for (int i = 0; i < hits.TotalGroupCount; i++)
    {
        var group = hits.Groups[i];
        Console.WriteLine("Group " + i + ": " + group.ToString());
        for (int j = 0; j < group.TotalHits && j < maxDocsPerGroup; j++)
        {
            Document doc = searcher.Doc(group.ScoreDocs[j].Doc);
            // Print the index of the hit within the group (j), not the group index.
            Console.WriteLine("Hit " + j + ": " + doc.ToString());
        }
    }
}
else
{
    Console.WriteLine("No hits.");
}
Console.WriteLine("Done.");
Console.ReadKey();
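The code above indexes each blog with AddDocuments, children first and parent last. The following sketch is a simplified model of why that ordering matters; it is not Lucene's implementation. It assumes the join resolves a matching child to the next document at or after it that is flagged as a parent, which is consistent with the requirement that the parent close its block and with the join asking for a bit set of parents. JoinToParents, isParent, and childMatches are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

static class BlockJoinSketch
{
    // isParent[doc]     : models the parent filter's bit set, in index order.
    // childMatches[doc] : models which documents the child query matched.
    // Returns the document numbers of the parents that the matches are joined to.
    public static List<int> JoinToParents(bool[] isParent, bool[] childMatches)
    {
        var parents = new List<int>();
        for (int doc = 0; doc < isParent.Length; doc++)
        {
            if (isParent[doc] || !childMatches[doc])
            {
                continue; // only matching child documents take part in the join
            }
            // Walk forward to the next parent bit: that document closes the block.
            int parent = doc + 1;
            while (parent < isParent.Length && !isParent[parent])
            {
                parent++;
            }
            if (parent < isParent.Length && !parents.Contains(parent))
            {
                parents.Add(parent);
            }
        }
        return parents;
    }

    static void Main()
    {
        // Children first, parent last:
        // 0: comment of blog 1 (matches), 1: blog 1, 2: comment of blog 2, 3: blog 2
        var parentLast = JoinToParents(
            new[] { false, true, false, true },
            new[] { true, false, false, false });
        Console.WriteLine(string.Join(",", parentLast)); // 1 (blog 1, as intended)

        // Parent first, children after:
        // 0: blog 1, 1: comment of blog 1 (matches), 2: blog 2, 3: comment of blog 2
        var parentFirst = JoinToParents(
            new[] { true, false, true, false },
            new[] { false, true, false, false });
        Console.WriteLine(string.Join(",", parentFirst)); // 2 (blog 2, the wrong parent)
    }
}
```

In this model the relationship between child and parent is carried purely by document order, so the BlogId field plays no part in the join; documents added one at a time with AddDocument are also not guaranteed to stay adjacent, which is presumably why the block has to be added in a single AddDocuments call.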
Note that I used the following packages in a .NET Core console app:
<PackageReference Include="Lucene.Net" Version="4.8.0-beta00005" />
<PackageReference Include="Lucene.Net.Analysis.Common" Version="4.8.0-beta00005" />
<PackageReference Include="Lucene.Net.Join" Version="4.8.0-beta00005" />