Problem description
While exploring the example for indexing Wikipedia data in Solr, how can we get the expected result (i.e. with the same structure as the imported data)?
Is there a way to achieve this through configuration rather than a group query? My data contains many nested tags.
I explored XSLT result transformation, but I am looking for a JSON response.
Imported document:
<page>
<title>AccessibleComputing</title>
<ns>0</ns>
<id>10</id>
<redirect title="Computer accessibility" />
<revision>
<id>381202555</id>
<parentid>381200179</parentid>
<timestamp>2010-08-26T22:38:36Z</timestamp>
<contributor>
<username>OlEnglish</username>
<id>7181920</id>
</contributor>
</revision>
</page>
solrConfig.xml:
<dataConfig>
<dataSource type="FileDataSource" encoding="UTF-8" />
<document>
<entity name="page"
processor="XPathEntityProcessor"
stream="true"
forEach="/mediawiki/page/"
url="data/enwiki-20130102-pages-articles.xml"
transformer="RegexTransformer,DateFormatTransformer"
>
<field column="id" xpath="/mediawiki/page/id" />
<field column="title" xpath="/mediawiki/page/title" />
<field column="revision" xpath="/mediawiki/page/revision/id" />
<field column="user" xpath="/mediawiki/page/revision/contributor/username" />
<field column="userId" xpath="/mediawiki/page/revision/contributor/id" />
<field column="text" xpath="/mediawiki/page/revision/text" />
<field column="timestamp" xpath="/mediawiki/page/revision/timestamp" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" />
<field column="$skipDoc" regex="^#REDIRECT .*" replaceWith="true" sourceColName="text"/>
</entity>
</document>
</dataConfig>
Response from the Solr query:
"response": {
"numFound": 1,
"start": 0,
"docs": [
{
"id": "10",
"timestamp": "2010-08-26T17:08:36Z",
"revision": 381202555,
"titleText": "AccessibleComputing",
"userId": 7181920,
"user": "OlEnglish"
}
]
}
Expected response:
"response": {
"numFound": 1,
"start": 0,
"docs": [
{
"id": "10",
"timestamp": "2010-08-26T17:08:36Z",
"revision": 381202555,
"titleText": "AccessibleComputing",
"contributor": [{
"userId": 7181920,
"user": "OlEnglish"
}]
}
]
}
Recommended answer
If you don't like the idea of using XsltResponseWriter (which can also help in outputting the results as JSON), you can create your own SearchComponent that modifies the output. With a custom SearchComponent you can still apply different ResponseWriters to the output (xml, json, csv, xslt, etc.).
You can find more in this article.
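A custom SearchComponent only runs once it is registered in solrconfig.xml. The fragment below is a minimal sketch of that registration, not a complete configuration: the component name `groupContributor` and the class `com.example.ContributorGroupingComponent` are hypothetical placeholders for a class you would write and deploy yourself, and it assumes queries go through the `/select` handler.

```xml
<!-- Hypothetical component: name and class are placeholders for your own code -->
<searchComponent name="groupContributor"
                 class="com.example.ContributorGroupingComponent"/>

<!-- In practice, add the last-components list to your existing /select handler -->
<requestHandler name="/select" class="solr.SearchHandler">
  <arr name="last-components">
    <!-- Runs after the standard components, once the result docs are available -->
    <str>groupContributor</str>
  </arr>
</requestHandler>
```

Listing the component under `last-components` places it after the default query components, so it can reshape the documents before whichever ResponseWriter is selected by `wt` serializes them.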
To use XsltResponseWriter, add this code to solrconfig.xml:
<queryResponseWriter name="xslt" class="org.apache.solr.response.XSLTResponseWriter"/>
Add a json.xsl file to the conf/xslt folder, containing the transformation rules for your XML output (what you get when you use wt=xml in your query), something like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*"/>
<xsl:output method="text" indent="no" media-type="application/json"/>
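<!-- Suppress the responseHeader so its text is not copied into the output -->
<xsl:template match="lst[@name='responseHeader']"/>
<xsl:template match="result">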
<xsl:text>{"response":{"docs":[</xsl:text>
<xsl:apply-templates select="doc"/>
<xsl:text>]}}</xsl:text>
</xsl:template>
<xsl:template match="doc">
<xsl:if test="position() > 1">
<xsl:text>,</xsl:text>
</xsl:if>
<xsl:text>{"contributor": [{"userId": </xsl:text><xsl:value-of select="*[@name='userId']"/><xsl:text>, "user": "</xsl:text><xsl:value-of select="*[@name='user']"/><xsl:text>"}]}</xsl:text>
</xsl:template>
</xsl:stylesheet>
Then you can get this response using a URL like:
http://localhost:8983/solr/select/?q=id:10&wt=xslt&tr=json.xsl
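For reference, the XML that the stylesheet receives for this query would look roughly like the sketch below. It is reconstructed from the JSON response shown in the question, so the element types (`int` versus `long`, for instance) and the `responseHeader` values are assumptions that depend on your schema and request. The point it illustrates is that Solr's XML format puts each field in a typed element identified by a `name` attribute, which is how a stylesheet has to address it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <!-- Header values are illustrative only -->
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">10</str>
      <date name="timestamp">2010-08-26T17:08:36Z</date>
      <!-- int vs. long is an assumption; it follows the schema field type -->
      <int name="revision">381202555</int>
      <str name="titleText">AccessibleComputing</str>
      <int name="userId">7181920</int>
      <str name="user">OlEnglish</str>
    </doc>
  </result>
</response>
```

An XPath such as `*[@name='user']`, evaluated on a `doc` element, selects the field regardless of its element type.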