Problem description
I want to write files to HDFS from a Windows server. The Hadoop cluster runs on Linux. I searched everywhere, but all I found was Java code that has to be run with "hadoop jar".
Can somebody help me understand how to run Java code that writes files to HDFS from Windows? What is required on the Windows box? Even a pointer to a good link will do.
Recommended answer
You only need to write a simple Java program and run it like a normal .jar file.
In the project you need to import the Hadoop library.
This is a working example Maven project (I tested it on my cluster):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

public class WriteFileToHdfs {

    public static void main(String[] args) throws IOException, URISyntaxException {
        // NameNode address; note that it already ends with a slash.
        String dataNameLocation = "hdfs://[your-namenode-ip]:[the-port-where-hadoop-is-listening]/";

        Configuration configuration = new Configuration();

        // try-with-resources closes the stream and the file system even if the write fails.
        try (FileSystem hdfs = FileSystem.get(new URI(dataNameLocation), configuration);
             FSDataOutputStream out = hdfs.create(new Path(dataNameLocation + "myFile.txt"))) {
            // writeUTF stores a two-byte length before the text (DataOutputStream format).
            out.writeUTF("Some text ...");
        }
    }
}
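One detail of the example above is easy to miss: `FSDataOutputStream` extends `java.io.DataOutputStream`, so `writeUTF` does not write plain text. It writes a two-byte length followed by the string in modified UTF-8. The sketch below uses only the JDK (no Hadoop classes, no cluster) to compare that with writing raw UTF-8 bytes. The class name `WriteUtfDemo` and its helper methods are made up for illustration and are not part of the original answer.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what DataOutputStream.writeUTF puts on the wire.
public class WriteUtfDemo {

    // Bytes produced by writeUTF: 2-byte big-endian length, then modified UTF-8.
    static byte[] viaWriteUtf(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Bytes produced by writing the encoded string directly: the text only.
    static byte[] viaPlainBytes(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        String text = "Some text ...";
        System.out.println("writeUTF bytes:  " + viaWriteUtf(text).length);
        System.out.println("plain UTF-8 bytes: " + viaPlainBytes(text).length);
    }
}
```

If the file is meant to be read as ordinary text by other tools, replacing `out.writeUTF(...)` with `out.write(text.getBytes(StandardCharsets.UTF_8))` in the HDFS program is probably the better fit. If it will be read back with `readUTF`, the original call is fine.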
Remember to add the dependencies to your pom.xml, along with the configuration that writes the main class into the jar manifest:
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
    <mainClass>your.cool.package.WriteFileToHdfs</mainClass>
</properties>
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.6.1</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <artifactId>maven-dependency-plugin</artifactId>
            <executions>
                <execution>
                    <phase>install</phase>
                    <goals>
                        <goal>copy-dependencies</goal>
                    </goals>
                    <configuration>
                        <outputDirectory>${project.build.directory}/lib</outputDirectory>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <artifactId>maven-jar-plugin</artifactId>
            <configuration>
                <archive>
                    <manifest>
                        <addClasspath>true</addClasspath>
                        <classpathPrefix>lib/</classpathPrefix>
                        <mainClass>${mainClass}</mainClass>
                    </manifest>
                </archive>
            </configuration>
        </plugin>
    </plugins>
</build>
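After building with mvn install (the copy-dependencies step above is bound to the install phase and fills the lib/ directory next to the jar), launch the program like any other runnable jar: java -jar followed by the name of the jar file that Maven built.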
Of course, you need to edit the code to use your own package name and NameNode IP address.
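As a small illustration of that last step, here is one way the NameNode address could be assembled from a host and a port instead of being typed into a single string. This is only a sketch using the JDK's `java.net.URI`. The class `HdfsUriSketch`, its methods, the host `namenode.example.com`, and the port `8020` are all assumptions made for the example; use whatever host and port your cluster's `fs.defaultFS` setting actually specifies.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: builds the hdfs:// URIs that the example program needs.
public class HdfsUriSketch {

    // Root URI of the file system, e.g. hdfs://host:port/
    static URI nameNodeUri(String host, int port) {
        try {
            return new URI("hdfs", null, host, port, "/", null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid NameNode address", e);
        }
    }

    // URI of a file directly under the root; resolve() avoids doubled slashes.
    static URI fileUri(URI root, String fileName) {
        return root.resolve(fileName);
    }

    public static void main(String[] args) {
        // Assumed values for illustration only.
        URI root = nameNodeUri("namenode.example.com", 8020);
        System.out.println(root);
        System.out.println(fileUri(root, "myFile.txt"));
    }
}
```

The root URI can be passed to `FileSystem.get(...)`, and the file URI can be wrapped in `new Path(...)`, which has a constructor that accepts a `java.net.URI`.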