Solr: DIH for multilingual index & multiValued field?

Question

I have a MySQL table:

CREATE TABLE documents (
    id INT NOT NULL AUTO_INCREMENT,
    language_code CHAR(2),
    tags CHAR(30),
    text TEXT,
    PRIMARY KEY (id)
);

I have 2 questions about Solr DIH:

1) The language_code field indicates what language the text field is in. Depending on the language, I want to index text into a different Solr field.

# pseudo code

if language_code == "en":
    index "text" to Solr field "text_en"
elif language_code == "fr":
    index "text" to Solr field "text_fr"
elif language_code == "zh":
    index "text" to Solr field "text_zh"
...

Can DIH handle a use case like this? How do I configure it to do so?

2) The tags field needs to be indexed into a Solr multiValued field. Multiple values are stored in a string, separated by a comma. For example, if tags contains the string "blue, green, yellow" then I want to index the 3 values "blue", "green", "yellow" into a Solr multiValued field.

How do I do that with DIH?

Thanks.

Recommended answer

First, your schema needs to accept the language-specific field names, with something like this:

<dynamicField name="text_*" type="string" indexed="true" stored="true" />
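Note that a string field is indexed verbatim, without tokenization or stemming. If the point of separate fields is language-specific analysis, one option is to declare the fields explicitly with analyzed types. The following is only a sketch: it assumes your schema defines field types named text_en, text_fr, and text_cjk (Solr's example schemas commonly ship them, but check yours), and the choice of text_cjk for Chinese is an assumption as well.

```xml
<!-- Assumption: the fieldType names below exist in your schema; adjust to match. -->
<field name="text_en" type="text_en" indexed="true" stored="true" />
<field name="text_fr" type="text_fr" indexed="true" stored="true" />
<field name="text_zh" type="text_cjk" indexed="true" stored="true" />
```

Explicitly declared fields take precedence over a matching dynamic field, so the text_* rule would still act as a fallback for any other language code.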

Then, in your DIH config, attach a script transformer to the entity, something like this:

<entity name="document" dataSource="ds1" transformer="script:ftextLang" query="SELECT * FROM documents" />

The script itself is defined just below the data source:

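<script><![CDATA[
  // Copy "text" into a language-specific field such as text_en or text_fr.
  function ftextLang(row) {
    var lang = row.get('language_code');
    var value = row.get('text');
    // Skip rows with no language code or text so no "text_null" field is created.
    if (lang != null && value != null) {
      row.put('text_' + lang, value);
    }
    return row;
  }
]]></script>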
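For the second question (splitting tags into a multiValued field), DIH's RegexTransformer has a splitBy attribute that may be enough. Below is a sketch of how the pieces could fit together in one data-config.xml. The connection settings (driver, url, user, password) are hypothetical placeholders, and the split pattern assumes tags are separated by a comma and optional whitespace.

```xml
<dataConfig>
  <!-- Hypothetical connection settings: replace driver, url, user, and password. -->
  <dataSource name="ds1" type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb"
              user="solr" password="secret" />
  <script><![CDATA[
    // Copy "text" into text_<language_code>, e.g. text_en.
    function ftextLang(row) {
      var lang = row.get('language_code');
      if (lang != null) {
        row.put('text_' + lang, row.get('text'));
      }
      return row;
    }
  ]]></script>
  <document>
    <!-- Transformers are applied in the order listed. -->
    <entity name="document" dataSource="ds1"
            transformer="script:ftextLang,RegexTransformer"
            query="SELECT * FROM documents">
      <!-- splitBy is a regular expression; ",\s*" also drops the space after each comma. -->
      <field column="tags" splitBy=",\s*" />
    </entity>
  </document>
</dataConfig>
```

For this to work, the schema would also need a tags field declared with multiValued="true". With the sample value "blue, green, yellow", the field should then receive the three values blue, green, and yellow.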

