I am trying to set up a multi-node Hadoop cluster between 2 Windows machines. I am using Hadoop 2.9.2.
How can I achieve this?
Best Answer
Very similar to Linux systems, these are the steps you have to follow to run a Hadoop cluster on Windows:
1. First, you need Java installed on your system, and JAVA_HOME must be added to your environment variables. You can download Java from the Oracle website and install it.
2. Add your Hadoop master and slave IPs to the hosts file. Open "C:\Windows\System32\drivers\etc\hosts" and add:
your-master-ip hadoopMaster
your-slave-ip hadoopSlave
You can use these names in the configuration files below.
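To make sure the two machines can actually reach each other by these names, you might ping them from each node (a minimal check; it assumes ICMP is not blocked by the Windows firewall):
ping hadoopMaster
ping hadoopSlave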
Set the following environment variables, pointing them at the folder where you extracted Hadoop and at your JDK, and add %HADOOP_BIN% to your PATH:
HADOOP_HOME="root of your hadoop extracted folder\hadoop-2.9.2"
HADOOP_BIN="root of your hadoop extracted folder\hadoop-2.9.2\bin"
JAVA_HOME=<root of your JDK installation>
Then open a command prompt and check that they are picked up:
echo %HADOOP_HOME%
echo %HADOOP_BIN%
echo %PATH%
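If you prefer to set these persistently from an elevated command prompt rather than through the System Properties dialog, setx can do it. This is only a sketch: C:\hadoop\hadoop-2.9.2 and the JDK path are placeholder locations, so substitute your own.
setx /M HADOOP_HOME "C:\hadoop\hadoop-2.9.2"
setx /M HADOOP_BIN "C:\hadoop\hadoop-2.9.2\bin"
setx /M JAVA_HOME "C:\Program Files\Java\jdk1.8.0_261"
rem Append %HADOOP_BIN% to PATH via the Environment Variables dialog; setx may truncate a long PATH.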
Configure Hadoop:
10. Open "your hadoop root\hadoop-2.9.2\etc\hadoop\hadoop-env.cmd" and add the following lines at the bottom of the file:
set HADOOP_PREFIX=%HADOOP_HOME%
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin
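Once hadoop-env.cmd is saved and the environment variables are in place, it is worth opening a new command prompt and confirming that Hadoop is reachable from the PATH; if everything is wired up, this simply prints the 2.9.2 version and build information:
hadoop version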
11. Open "your hadoop root\hadoop-2.9.2\etc\hadoop\hdfs-site.xml" and add the following (an example of concrete values for the two directory properties follows this block):
<property>
<name>dfs.name.dir</name>
<value>your desired address</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>your desired address</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>hadoopMaster:50070</value>
<description>Your NameNode hostname for http access.</description>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoopMaster:50090</value>
<description>Your Secondary NameNode hostname for http access.</description>
</property>
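For "your desired address" in dfs.name.dir and dfs.data.dir you could point at local folders, for example file:///C:/hadoop/data/namenode and file:///C:/hadoop/data/datanode (hypothetical paths, pick your own); creating them up front avoids permission surprises:
mkdir C:\hadoop\data\namenode
mkdir C:\hadoop\data\datanode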
12. Open "your hadoop root\hadoop-2.9.2\etc\hadoop\core-site.xml" and add the following:
<property>
<name>fs.default.name</name>
<value>hdfs://hadoopMaster:9000</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>your-temp-directory</value>
<description>A base for other temporary directories.</description>
</property>
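Likewise, "your-temp-directory" for hadoop.tmp.dir can simply be a local folder you create up front, e.g. (again a placeholder path):
mkdir C:\hadoop\tmp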
13. Edit "your hadoop root\hadoop-2.9.2\etc\hadoop\mapred-site.xml" and add the following:
<property>
<name>mapred.job.tracker</name>
<value>hadoopMaster:9001</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
14. Edit yarn-site.xml and add the following (a note on opening these ports in the Windows firewall follows this block):
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>Long running service which executes on Node Manager(s) and provides MapReduce Sort and Shuffle functionality.</description>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
<description>Enable log aggregation so application logs are moved onto HDFS and are viewable via the web UI after the application has completed. The default location on HDFS is '/log' and can be changed via the yarn.nodemanager.remote-app-log-dir property.</description>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoopMaster:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoopMaster:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoopMaster:8032</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoopMaster:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoopMaster:8088</value>
</property>
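Because the master and slave have to reach each other on these ports, you may also need to allow them through Windows Defender Firewall on both machines. A sketch, using the ports that appear in the configuration above (run from an elevated prompt and adjust the list; the DataNode and NodeManager ports are not listed in this answer and may need opening too):
netsh advfirewall firewall add rule name="Hadoop" dir=in action=allow protocol=TCP localport=9000,50070,50090,8030-8033,8088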
Open the "slaves" file under "your hadoop root\hadoop-2.9.2\etc\hadoop" and add the hostname of your slave node:
hadoopSlave
18. Format your NameNode:
hadoop namenode -format
19. Run the following commands from the sbin folder (on Windows these are the .cmd scripts):
start-dfs.cmd
start-yarn.cmd
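To verify the daemons actually started, you can run jps (it ships with the JDK) on both machines and open the web UIs configured above:
jps
rem On the master expect NameNode and ResourceManager (plus DataNode/NodeManager if it also acts as a worker);
rem on the slave expect DataNode and NodeManager.
start http://hadoopMaster:50070
start http://hadoopMaster:8088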
Related question on Stack Overflow: "hadoop - Setting up a Hadoop multi-node cluster on 2 Windows 10 machines": https://stackoverflow.com/questions/64130723/