What is the use of HCatalog in Hadoop?

Question

I'm new to Hadoop. I know that HCatalog is a table and storage management layer for Hadoop, but how exactly does it work, and how do I use it? Please give a simple example.

Solution

HCatalog supports reading and writing files in any format for which a Hive SerDe (serializer-deserializer) can be written. By default, HCatalog supports RCFile, CSV, JSON, and SequenceFile formats. To use a custom format, you must provide the InputFormat, OutputFormat, and SerDe.

HCatalog is built on top of the Hive metastore and incorporates components from the Hive DDL. HCatalog provides read and write interfaces for Pig and MapReduce and uses Hive’s command line interface for issuing data definition and metadata exploration commands.
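Since the question asks for a simple example, here is a minimal sketch of the command-line route. It only builds the DDL string and the argument list for the `hcat` tool (using its `-e` option to pass a statement inline) and does not run anything. The table name `web_logs`, its columns, and the helper function names are hypothetical and chosen for illustration only.

```python
import shlex


def build_create_table_ddl(table, columns, partition_keys, stored_as="RCFILE"):
    """Render a Hive DDL statement for a partitioned table."""
    cols = ", ".join(f"{name} {typ}" for name, typ in columns)
    parts = ", ".join(f"{name} {typ}" for name, typ in partition_keys)
    return (
        f"CREATE TABLE {table} ({cols}) "
        f"PARTITIONED BY ({parts}) "
        f"STORED AS {stored_as}"
    )


def build_hcat_command(ddl):
    """Return the argv list that passes one DDL statement to `hcat -e`."""
    return ["hcat", "-e", ddl]


# Hypothetical table: two data columns, partitioned by a date string.
ddl = build_create_table_ddl(
    "web_logs",
    [("ip", "STRING"), ("url", "STRING")],
    [("ds", "STRING")],
)
command = build_hcat_command(ddl)

# Print a shell-quoted form that could be pasted into a terminal
# on a machine where HCatalog is installed.
print(shlex.join(command))
```

On a cluster node the list could be handed to `subprocess.run(command, check=True)`. Once the table exists in the metastore, Pig can typically read it by table name through `HCatLoader` (started with `pig -useHCatalog`), without knowing the file location or format.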

It also presents a REST interface to allow external tools access to Hive DDL (Data Definition Language) operations, such as "create table" and "describe table".
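As a rough illustration of the REST route, the sketch below prepares (but does not send) two requests against the WebHCat DDL resource for a table: a `GET` to describe it and a `PUT` to create it. The host `localhost`, the user name, and the table are assumptions. Port `50111` and the `/templeton/v1` prefix are WebHCat's usual defaults as far as I know, so check them against your installation.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: adjust host and port for a real cluster.
WEBHCAT_BASE = "http://localhost:50111/templeton/v1"


def table_url(database, table, user):
    """Build the WebHCat DDL URL for one table."""
    path = "/ddl/database/{}/table/{}".format(
        urllib.parse.quote(database, safe=""),
        urllib.parse.quote(table, safe=""),
    )
    query = urllib.parse.urlencode({"user.name": user})
    return f"{WEBHCAT_BASE}{path}?{query}"


def describe_table_request(database, table, user):
    """Prepare a GET request that asks for the table description."""
    return urllib.request.Request(table_url(database, table, user), method="GET")


def create_table_request(database, table, user, columns):
    """Prepare a PUT request whose JSON body lists the table columns."""
    body = json.dumps(
        {"columns": [{"name": name, "type": typ} for name, typ in columns]}
    ).encode("utf-8")
    return urllib.request.Request(
        table_url(database, table, user),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical database, table, and user names.
describe_req = describe_table_request("default", "web_logs", "hadoop_user")
create_req = create_table_request(
    "default", "web_logs", "hadoop_user", [("ip", "string"), ("url", "string")]
)
print(describe_req.get_method(), describe_req.full_url)
```

Passing either request to `urllib.request.urlopen` would actually contact the server. That step is left out here so the sketch runs without a cluster.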

HCatalog presents a relational view of data. Data is stored in tables and these tables can be placed into databases. Tables can also be partitioned on one or more keys. For a given value of a key (or set of keys) there will be one partition that contains all rows with that value (or set of values).
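The partitioning rule can be modelled in a few lines. This is a toy in-memory sketch, not HCatalog code: it groups rows by their partition key values and labels each group with a Hive-style `key=value` spec. The sample rows and the keys `ds` and `country` are made up for illustration.

```python
from collections import defaultdict


def partition_spec(row, partition_keys):
    """Return a Hive-style partition spec such as 'ds=2024-01-01/country=us'."""
    return "/".join(f"{key}={row[key]}" for key in partition_keys)


def group_into_partitions(rows, partition_keys):
    """Place every row in the single partition that matches its key values."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_spec(row, partition_keys)].append(row)
    return dict(partitions)


# Made-up sample data with two partition keys.
rows = [
    {"ip": "10.0.0.1", "ds": "2024-01-01", "country": "us"},
    {"ip": "10.0.0.2", "ds": "2024-01-01", "country": "us"},
    {"ip": "10.0.0.3", "ds": "2024-01-02", "country": "de"},
]
partitions = group_into_partitions(rows, ["ds", "country"])

for spec, members in sorted(partitions.items()):
    print(spec, len(members))
```

Each distinct combination of key values yields exactly one partition, and all rows sharing that combination land in it, which is the property the paragraph above describes.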


Edit: Most of the text is from https://cwiki.apache.org/confluence/display/Hive/HCatalog+UsingHCat.


05-19 09:36