Problem Description
As this is coming from a newbie...
I had Hadoop and Hive set up for me, so I can run Hive queries on my computer against data on an AWS cluster.
Can I run Hive queries on .csv data stored on my computer, like I did with MS SQL Server?
How do I load .csv data into Hive, then? What does it have to do with Hadoop, and which mode should I run it in?
What settings should I care about, so that if I do something wrong I can always go back and run queries on Amazon without compromising what was set up for me earlier?
Recommended Answer
If you have a Hive setup, you can load a local dataset into a Hive table (stored in HDFS/S3) directly using Hive's LOAD DATA command.
You will need to use the LOCAL keyword when writing your load command.
Syntax for the Hive LOAD DATA command:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
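As a minimal sketch of the syntax above (the table name, column layout, and file path here are hypothetical, not from the original answer), you would first create a table whose delimiter matches your CSV, then load the local file into it:

```sql
-- Hypothetical table matching a simple comma-separated CSV layout
CREATE TABLE my_csv_table (
  id INT,
  name STRING,
  amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- LOCAL tells Hive to read the file from the local filesystem;
-- without LOCAL, 'filepath' is resolved on HDFS/S3 instead
LOAD DATA LOCAL INPATH '/path/to/data.csv'
OVERWRITE INTO TABLE my_csv_table;
```

With LOCAL, Hive copies the file into the table's warehouse directory. OVERWRITE replaces any existing data in the table; omit it to append instead.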
Refer to the link below for more detailed information:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual%20DML#LanguageManualDML-Loadingfilesintotables
This concludes the article "Hadoop/Hive: loading data from a .csv on your local machine"; we hope the recommended answer is helpful.