Problem description
I am writing a custom filter in Solr that posts a token to Apache Stanbol for enhancement and indexes the response in a different field of the same document.
In my test code below I get the Stanbol response and add it to Solr as a new document. My requirement is instead to add the stanbolResponse as a field value on the same document that is being indexed. I think this can be done if I can retrieve the document ID from the TokenStream in the filter.
Can anyone help me with sample code, an example, or a link showing how to achieve this?
// input, charTermAttr, stanbol_endpoint and server are fields of the enclosing TokenFilter.
public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
        return false;
    }
    // Only the first length() chars of the term buffer belong to the current token.
    int length = charTermAttr.length();
    char[] buffer = charTermAttr.buffer();
    String content = new String(buffer, 0, length);

    // Post the token text to the Stanbol enhancer.
    Client client = Client.create();
    WebResource webResource = client.resource(stanbol_endpoint + "enhancer");
    ClientResponse response = webResource
            .type(MediaType.TEXT_PLAIN)
            .accept(new MediaType("application", "rdf+xml"))
            .entity(content, MediaType.TEXT_PLAIN)
            .post(ClientResponse.class);
    int status = response.getStatus();
    if (status != 200 && status != 201 && status != 202) {
        throw new RuntimeException("Failed : HTTP error code : " + status);
    }
    String output = response.getEntity(String.class);

    // Replace the token text with the Stanbol response.
    charTermAttr.setEmpty();
    char[] newBuffer = output.toCharArray();
    charTermAttr.copyBuffer(newBuffer, 0, newBuffer.length);

    // Index the response as a separate document with a hard-coded id.
    // This is the part that should instead target the document being indexed.
    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField("id", "id1", 1.0f);
    doc1.addField("stanbolResponse", output);
    try {
        server.add(doc1);
        server.commit();
    } catch (SolrServerException e) {
        System.out.println("error while indexing response to solr");
        e.printStackTrace();
    }
    return true;
}
Recommended answer
I covered this use case successfully by writing a custom UpdateRequestProcessor and configuring the /update request handler to use it through update.chain.
This let me process the document and add new fields to it before it was indexed. Below is how I configured the /update request handler with my custom processor.
Update request processor chain for the Stanbol step:
<updateRequestProcessorChain name="stanbolInterceptor">
  <processor class="com.solr.stanbol.processor.StanbolContentProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
Configure the request handler to use the chain above as its update.chain:
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">stanbolInterceptor</str>
  </lst>
</requestHandler>
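The processor class itself is not shown in this answer, so the following is only a rough, Solr-free sketch of the pattern the chain relies on: each step may modify the document and then hands it to the next step, and the last step indexes it. Everything here is a hypothetical stand-in: UpdateStep plays the role of Solr's UpdateRequestProcessor, IndexingStep the role of the processor behind solr.RunUpdateProcessorFactory, the "content" source field is an assumed name, and the enhancer is a stub function rather than a real HTTP call to Stanbol. In an actual Solr plugin one would presumably extend org.apache.solr.update.processor.UpdateRequestProcessor, override processAdd(AddUpdateCommand cmd), modify the document returned by cmd.getSolrInputDocument(), and finish with super.processAdd(cmd).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Stand-in for Solr's UpdateRequestProcessor: handles an add, then forwards it.
abstract class UpdateStep {
    private final UpdateStep next;

    UpdateStep(UpdateStep next) {
        this.next = next;
    }

    void processAdd(Map<String, Object> doc) {
        if (next != null) {
            next.processAdd(doc);
        }
    }
}

// Hypothetical counterpart of the processor built by StanbolContentProcessorFactory.
class StanbolContentStep extends UpdateStep {
    private final Function<String, String> enhancer; // stub for the Stanbol call

    StanbolContentStep(Function<String, String> enhancer, UpdateStep next) {
        super(next);
        this.enhancer = enhancer;
    }

    @Override
    void processAdd(Map<String, Object> doc) {
        Object content = doc.get("content"); // assumed source field name
        if (content != null) {
            // Add the response to the SAME document before it is indexed.
            doc.put("stanbolResponse", enhancer.apply(content.toString()));
        }
        super.processAdd(doc); // pass the modified document down the chain
    }
}

// Stand-in for the final "run update" step: records what would be indexed.
class IndexingStep extends UpdateStep {
    final List<Map<String, Object>> indexed = new ArrayList<>();

    IndexingStep() {
        super(null);
    }

    @Override
    void processAdd(Map<String, Object> doc) {
        indexed.add(new LinkedHashMap<>(doc));
    }
}

class StanbolChainSketch {
    // Builds the two-step chain and pushes one document through it.
    static Map<String, Object> runChain(String id, String content) {
        IndexingStep index = new IndexingStep();
        UpdateStep chain = new StanbolContentStep(text -> "enhanced:" + text, index);
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put("content", content);
        chain.processAdd(doc);
        return index.indexed.get(0);
    }

    public static void main(String[] args) {
        System.out.println(runChain("id1", "Paris is in France"));
    }
}
```

Because the processor works on the whole document rather than on a token stream, it never needs to look up the document ID: the new field is simply attached to the document already in hand.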