This post covers how to create indexes on HBase.

Problem description

Is there any way I can create indexes in Solr to perform near-real-time full-text search over data in HBase?

I don't want to store the whole text in my Solr indexes, so I set "stored=false".
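
For illustration only, here is a minimal sketch of the kind of field definition this refers to: a Solr schema.xml field that is indexed for search but not stored. The field name `body`, the field type `text_general`, and the helper `make_field` are assumptions made for this example, not details from the setup described here.

```python
import xml.etree.ElementTree as ET


def make_field(name, field_type="text_general", indexed=True, stored=False):
    """Build a schema.xml <field> element as a string.

    The field name and type passed in are hypothetical examples;
    use the names and types defined in your own schema.
    """
    field = ET.Element("field", {
        "name": name,
        "type": field_type,
        # Solr expects lowercase "true"/"false" attribute values.
        "indexed": str(indexed).lower(),
        "stored": str(stored).lower(),
    })
    return ET.tostring(field, encoding="unicode")


# Searchable, but the original text is not kept in the index.
field_xml = make_field("body")
print(field_xml)
```

With stored set to false, Solr can match on the field but cannot return its text, so the full content would still be fetched from HBase by row key.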

Note: I am working with large datasets and want near-real-time search. We are talking about TB/PB of data.

UPDATED

Cloudera Distribution: 5.4.x, used with the Cloudera Search components.

Solr: 4.10.x

HBase: 1.0.x

Indexer service: Lily HBase Indexer with Cloudera Morphlines

Are there any other NRT indexer services or frameworks that can be used instead of Lily on Cloudera? Just a thought.

Solution

Cloudera: please check this article and Hbase-Solr using Cloudera-search, which describe how to achieve that. See the screenshot below, as described in those articles. Also have a look at the known issues with Cloudera Search.

Yes, you can consider Morphlines. They can be used for near-real-time applications as well as batch processing applications.
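
To make the idea concrete, here is a conceptual sketch of the mapping step an indexer performs: turning an HBase row into a Solr document. This is not the Lily HBase Indexer or Morphlines API (Morphlines are configured declaratively, not written in Python); the function names, the column names such as `cf:title`, and the Solr field names are all assumptions made for illustration.

```python
import json


def row_to_solr_doc(row_key, cells, field_map):
    """Map one HBase row to a Solr document dict.

    cells:     {"family:qualifier": value} for the row (hypothetical layout)
    field_map: {"family:qualifier": solr_field_name}
    """
    # Using the row key as the Solr id lets a search hit lead back to HBase.
    doc = {"id": row_key}
    for column, solr_field in field_map.items():
        if column in cells:  # skip columns this row does not have
            doc[solr_field] = cells[column]
    return doc


def build_update_payload(rows, field_map):
    """Serialize rows as a JSON array of documents for a Solr update request."""
    docs = [row_to_solr_doc(key, cells, field_map) for key, cells in rows]
    return json.dumps(docs)


rows = [
    ("row1", {"cf:title": "HBase indexing", "cf:body": "full text here"}),
    ("row2", {"cf:title": "No body"}),
]
field_map = {"cf:title": "title", "cf:body": "body"}
payload = build_update_payload(rows, field_map)
print(payload)
```

Keeping the row key as the document id fits the stored=false setup in the question: Solr returns only the ids of matching documents, and the full text is then read from HBase.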

I don't know much about the Hortonworks platform or how this can be achieved there.
