Giraph ZooKeeper port issue

Problem description

I am trying to run the SimpleShortestPathsVertex (aka SimpleShortestPathComputation) example described in the Giraph Quick Start. I am running this on a Hortonworks Sandbox instance (HDP 2.1) using VirtualBox, and I packaged giraph.jar using profile hadoop_2.0.0.

When I try to run the example using

hadoop jar giraph.jar org.apache.giraph.GiraphRunner \
  org.apache.giraph.examples.SimpleShortestPathsVertex \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /user/hue/tinygraph.txt \
  -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/hue/output/shortestpaths \
  -w 1

I get the following exception:

2014-04-30 07:22:15,390 INFO [main] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to sandbox.hortonworks.com:22181 with poll msecs = 3000
2014-04-30 07:22:15,396 WARN [main] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:701)
at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java:357)
at org.apache.giraph.graph.GraphTaskManager.setup(GraphTaskManager.java:188)
at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:60)
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:90)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

I have found a workaround: it seems that Giraph expects ZooKeeper to be running on port 22181, while it is actually running on 2181. I simply used the Ambari interface to set ZooKeeper to run on 22181 (go to http://127.0.0.1:8080/, log in as admin/admin, open the Services tab, select ZooKeeper, change the port to 22181, save, then choose Service Actions -> Restart All).
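Before changing ports on either side, it may help to confirm which client port ZooKeeper is actually configured with. The sketch below reads the `clientPort` setting from a `zoo.cfg` file. The function name `zk_client_port` and the default path `/etc/zookeeper/conf/zoo.cfg` are assumptions made for illustration and may differ on your installation.

```shell
#!/bin/sh
# Print the client port configured in a ZooKeeper zoo.cfg file.
# $1: path to zoo.cfg (the default below is an assumed location; adjust as needed).
zk_client_port() {
  cfg="${1:-/etc/zookeeper/conf/zoo.cfg}"
  # Match "clientPort=<digits>", tolerating spaces around the "=" sign.
  sed -n 's/^[[:space:]]*clientPort[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$cfg"
}

# Demonstration against a throwaway config file, so the sketch runs without a cluster.
tmp_cfg="$(mktemp)"
printf 'tickTime=2000\nclientPort=2181\n' > "$tmp_cfg"
zk_client_port "$tmp_cfg"   # prints 2181
rm -f "$tmp_cfg"
```

On a real node, calling `zk_client_port` with the path to the actual `zoo.cfg` should show whether ZooKeeper is listening on 2181 or 22181.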

Does anyone have a better solution for this problem? Is there a configuration option for specifying the port, or is the port in the Giraph source code a typo?

Solution

Yes. You can specify the ZooKeeper address each time you run a Giraph job by passing the option -Dgiraph.zkList=localhost:2181.
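As a rough sketch of how that option fits into the command from the question: the script below prints the full command line with `-Dgiraph.zkList` added. It assumes that `-D` is handled as a Hadoop generic option and therefore goes right after the `GiraphRunner` class name, before the Giraph-specific arguments. The `ZK_LIST` variable and the `giraph_cmd` function are hypothetical names introduced here, not part of Giraph.

```shell
#!/bin/sh
# ZooKeeper address the job should use; localhost:2181 is the value from the answer.
ZK_LIST="${ZK_LIST:-localhost:2181}"

# Print the command instead of running it, so it can be reviewed first.
# Assumption: -D generic options must precede the Giraph-specific arguments.
giraph_cmd() {
  printf '%s ' \
    hadoop jar giraph.jar org.apache.giraph.GiraphRunner \
    "-Dgiraph.zkList=${ZK_LIST}" \
    org.apache.giraph.examples.SimpleShortestPathsVertex \
    -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
    -vip /user/hue/tinygraph.txt \
    -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
    -op /user/hue/output/shortestpaths \
    -w 1
  printf '\n'
}

giraph_cmd
```

Running the printed command on the sandbox should point the job at the existing ZooKeeper on port 2181, so ZooKeeper itself would not need to be moved to 22181.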

You can also set it in the Hadoop configuration so that you don't have to pass this option each time you submit a Giraph job. To do that, add the following property to the conf/core-site.xml file:

<property><name>giraph.zkList</name><value>localhost:2181</value></property>

[Please check the syntax; I don't recall it off the top of my head, and I currently don't have access to a cluster to verify it.]


06-17 20:13