Creating indexes in Solr on top of HBase

Question

Is there any way to create indexes in Solr so that I can run near-real-time (NRT) full-text search over data stored in HBase?

I do not want to store the whole text in my Solr indexes, so I set stored="false".
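
One consequence of stored="false" is that Solr can match on the text but cannot return it, so a search would typically return only document ids and the full text would then be read from HBase. Below is a minimal sketch of the Solr side only, assuming the Solr unique key is a field named `id` that holds the HBase row key. The base URL, the collection name `hbase_collection`, and the field name `body` are hypothetical and not taken from the question.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_select_url(base_url, collection, field, text, rows=10):
    """Build a Solr /select URL that asks only for the `id` field.

    Assumption: `id` is the Solr unique key and holds the HBase row key.
    """
    # Escape backslashes and double quotes so the phrase query stays well formed.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "q": f'{field}:"{escaped}"',  # phrase query on the indexed, non-stored field
        "fl": "id",                   # non-stored fields cannot be returned anyway
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


def row_keys_from_response(response):
    """Extract row keys from a parsed Solr JSON response."""
    return [doc["id"] for doc in response["response"]["docs"]]


if __name__ == "__main__":
    # Hypothetical host, collection, and field names.
    url = build_select_url(
        "http://localhost:8983/solr", "hbase_collection", "body", "near real time"
    )
    print(url)
```

The row keys returned by `row_keys_from_response` would then be used for point lookups in HBase with whichever HBase client is in use; that step is left out here because it depends on the cluster setup.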

Note: I am working with large datasets, on the order of terabytes to petabytes, and I want near-real-time search.

Cloudera Distribution: 5.4.x, used with the Cloudera Search components

Solr: 4.10.x

HBase: 1.0.x

Indexer service: Lily HBase Indexer with Cloudera Morphlines

Are there any other NRT indexer services or frameworks that could be used instead of Lily on Cloudera? Just a thought.

Recommended answer

Cloudera: please check the known issues for Cloudera Search.

Yes, you can consider Morphlines. They can be used for near-real-time applications as well as batch-processing applications.

I don't know much about the Hortonworks platform or how this could be achieved there.

