Loading a text file into an Apache Kudu table


Question

How do you load a text file into an Apache Kudu table?

Does the source file need to be in HDFS first?

If Kudu doesn't share the same HDFS space as other Hadoop ecosystem programs (e.g. Hive, Impala), is there an Apache Kudu equivalent of:

hdfs dfs -put /path/to/file

that has to be run before trying to load the file?

Recommended answer

The file does not need to be in HDFS first. It can be read from an edge node or a local machine. Kudu is similar to HBase: it is a real-time store that supports key-indexed record lookup and mutation, but it cannot store a text file directly the way HDFS does. For Kudu to store the contents of a text file, the file has to be parsed and tokenised into rows. For that, you need a Spark job or the Java API, along with NiFi (or Apache Gobblin), to do the processing and then write the result to a Kudu table.
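As a rough illustration of the parse-then-write idea, here is a minimal sketch in Python rather than Spark or Java. It assumes a comma-delimited file and a hypothetical three-column schema (id, name, score). The write step assumes the kudu-python client package is installed and that a table with matching columns already exists; the host, port and table name are placeholders.

```python
import csv

# Hypothetical schema for illustration: (column name, converter) pairs.
SCHEMA = [("id", int), ("name", str), ("score", float)]


def parse_rows(lines, delimiter=","):
    """Turn delimited text lines into typed dicts, one per record."""
    for fields in csv.reader(lines, delimiter=delimiter):
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(SCHEMA):
            raise ValueError(f"expected {len(SCHEMA)} fields, got {len(fields)}")
        yield {name: convert(value)
               for (name, convert), value in zip(SCHEMA, fields)}


def write_rows(rows, master_host, table_name, master_port=7051):
    """Insert parsed rows into an existing Kudu table.

    Assumption: the kudu-python package is available; it is imported
    lazily so that the parsing step can be used without it.
    """
    import kudu

    client = kudu.connect(host=master_host, port=master_port)
    table = client.table(table_name)
    session = client.new_session()
    for row in rows:
        session.apply(table.new_insert(row))
    session.flush()
```

A call could look like `write_rows(parse_rows(open("/path/to/file")), "kudu.master", "events")`, where the host and table name are made up for the example. The file is read straight from the local disk, so HDFS is not involved.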

Or

You can integrate Kudu with Impala, which lets you insert, query, update, and delete data in Kudu tablets using Impala's SQL syntax, as an alternative to building a custom application on the Kudu APIs. The steps are:

Import the file into HDFS.
Create an external Impala table.
Then insert the data into that table.
Create a Kudu table using the STORED AS KUDU clause together with AS SELECT to copy the contents from the Impala table into Kudu.
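The Impala route could be sketched as follows. The table names, columns, HDFS directory and partitioning are all hypothetical, and the statements are held as Python strings so they can be sent through any DB-API style cursor connected to Impala (which driver to use is left open). The sketch assumes the file has already been copied into the staging directory with hdfs dfs -put, so the external table reads it in place and no separate INSERT is shown.

```python
# Hypothetical HDFS directory that already contains the uploaded text file.
STAGING_DIR = "/user/example/staging_events"

STATEMENTS = [
    # External Impala table over the delimited text file(s) in STAGING_DIR.
    f"""CREATE EXTERNAL TABLE staging_events (
            id BIGINT,
            name STRING,
            score DOUBLE
        )
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        STORED AS TEXTFILE
        LOCATION '{STAGING_DIR}'""",
    # Kudu table created and filled in one step with CREATE TABLE ... AS SELECT.
    """CREATE TABLE kudu_events
        PRIMARY KEY (id)
        PARTITION BY HASH (id) PARTITIONS 4
        STORED AS KUDU
        AS SELECT id, name, score FROM staging_events""",
]


def run_statements(cursor, statements=STATEMENTS):
    """Execute each statement in order on a DB-API style cursor."""
    for statement in statements:
        cursor.execute(statement)
```

The same two statements can also be pasted into impala-shell directly; the Python wrapper is only there to keep the steps in one script.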

For more information, see https://kudu.apache.org/docs/quickstart.html
