Problem Description
I get the following error when I try to run the code below. I am using the Python client for Spark.
lines = sc.textFile("hdfs://...")
lines.take(10)
I suspect that the Spark and Hadoop versions might not be compatible. Here is the output of hadoop version:

Hadoop 2.5.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r cc72e9b000545b86b75a61f4835eb86d57bfafc0
Compiled by jenkins on 2014-11-14T23:45Z
Compiled with protoc 2.5.0
From source with checksum df7537a4faa4658983d397abf4514320
This command was run using /etc/hadoop-2.5.2/share/hadoop/common/hadoop-common-2.5.2.jar
I also have Spark 1.3.1. The full stack trace is:
File "/etc/spark/python/pyspark/rdd.py", line 1194, in take
totalParts = self._jrdd.partitions().size()
File "/etc/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
File "/etc/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o21.partitions.
: java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$AppendRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2436)
at java.lang.Class.privateGetPublicMethods(Class.java:2556)
at java.lang.Class.privateGetPublicMethods(Class.java:2566)
at java.lang.Class.getMethods(Class.java:1412)
at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:409)
at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:306)
at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:610)
at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:690)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:92)
at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:537)
at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:366)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:262)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:153)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:602)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:547)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:139)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2625)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2607)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:256)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:203)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.api.java.JavaRDDLike$class.partitions(JavaRDDLike.scala:64)
at org.apache.spark.api.java.AbstractJavaRDDLike.partitions(JavaRDDLike.scala:46)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:259)
I have been searching for this problem; some people point to the version of protobuf, but I am not very familiar with how to set it up correctly. Any ideas?
Recommended Answer
Check the pom.xml of the build you compiled against and search for the protobuf version. Aligning it might solve the problem.
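The hadoop version output above shows the HDFS client classes were generated with protoc 2.5.0, and a VerifyError about the final getUnknownFields method is the classic symptom of loading such classes against an older protobuf-java 2.4.x jar, where that method was still final. Assuming a Maven build, a minimal sketch of the dependency entry to look for (com.google.protobuf:protobuf-java is the standard coordinate; pinning to 2.5.0 is an assumption matched to the protoc version above, not a confirmed value for your build):

<!-- Pin protobuf-java to the same version as the protoc that
     generated Hadoop's classes (2.5.0 per the output above). -->
<dependency>
  <groupId>com.google.protobuf</groupId>
  <artifactId>protobuf-java</artifactId>
  <version>2.5.0</version>
</dependency>

To see which protobuf versions actually end up on the classpath, Maven's dependency plugin can list them:

mvn dependency:tree -Dincludes=com.google.protobuf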
Or the problem might be something else, as mentioned in this Jira thread.
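To confirm the suspected version mismatch from the running session, you can also ask the driver JVM which Hadoop version Spark was built against. A quick sketch, assuming an active SparkContext named sc as in the question (sc._jvm is PySpark's internal handle to the Py4J gateway):

# Print the Hadoop version on the Spark driver's classpath; if it
# does not say 2.5.2, the bundled client jars and the cluster disagree.
print(sc._jvm.org.apache.hadoop.util.VersionInfo.getVersion())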