Inserting data into a Hive table


Problem description

I am new to Hive. I have successfully set up a single-node Hadoop cluster for development purposes, and on top of it I have installed Hive and Pig.

I created a dummy table in Hive:

create table foo (id int, name string);

Now I want to insert data into this table. Can I add data one record at a time, as in SQL? Is there a Hive command analogous to this one?

insert into foo (id, name) VALUES (12, "xyz");

I also have a CSV file that contains data in this format:

1,name1
2,name2
..
..
1000,name1000

How can I load this data into the dummy table?

Solution

I think the best way is:
a) Copy the data into HDFS (if it is not already there).
b) Create an external table over your CSV, like this:

CREATE EXTERNAL TABLE TableName (id int, name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'place in HDFS';

c) You can start using TableName right away by querying it.
d) If you want to insert the data into another Hive table:

insert overwrite table finalTable select * from TableName;
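
Steps b) through d), together with the single-record insert asked about in the question, could be sketched roughly as follows. This is only an illustration under stated assumptions: the staging table name foo_csv and the HDFS directory /user/hive/staging/foo_csv are hypothetical, and the CSV file is assumed to have been copied into that directory already. The INSERT ... VALUES form is available from Hive 0.14 onward; older versions can only populate tables by loading files or inserting from a query.

```sql
-- Hypothetical staging table over the CSV data; the table name and path are assumptions.
-- LOCATION must name an HDFS directory that contains the file(s), not a single file.
CREATE EXTERNAL TABLE foo_csv (id INT, name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/hive/staging/foo_csv';

-- Sanity check: the columns should hold parsed values rather than NULLs.
SELECT * FROM foo_csv LIMIT 5;
SELECT COUNT(*) FROM foo_csv;

-- Copy the rows into the table from the question.
-- INSERT OVERWRITE replaces whatever foo already holds; INSERT INTO would append instead.
INSERT OVERWRITE TABLE foo SELECT id, name FROM foo_csv;

-- Single-record insert (Hive 0.14 or later).
INSERT INTO TABLE foo VALUES (12, 'xyz');
```

Because foo_csv is an external table, dropping it removes only the table definition and leaves the CSV file in HDFS untouched. Note also that each INSERT ... VALUES statement typically writes a new small file, so row-at-a-time inserts are best kept for testing rather than bulk loading.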


08-19 10:06