I am trying to set up a development environment for a Spark Streaming project that needs to write data into Hive. I have a cluster with 1 master, 2 slaves, and 1 development machine (coding in IntelliJ IDEA 14).
Within the Spark shell everything seems to work fine, and I am able to store data into the default database in Hive via Spark 1.5 using DataFrame.write.insertInto("testtable").
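For reference, the spark-shell flow looks roughly like the sketch below. The sample rows and column names are made-up placeholders, and it assumes the shell's built-in sqlContext is a HiveContext (i.e. Spark was built with Hive support); only the table name "testtable" comes from above.

// In spark-shell (Spark 1.5), sqlContext is a HiveContext when Hive support is compiled in.
import sqlContext.implicits._

// Placeholder data; the real rows come from the streaming job.
val df = sc.parallelize(Seq((1, "alice"), (2, "bob"))).toDF("id", "name")

// Appends the rows into the existing Hive table "testtable" in the default database.
df.write.insertInto("testtable")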
However, when I create a Scala project in IDEA and run it against the same cluster with the same settings, an error is thrown while creating the transactional connection factory for the metastore database, which is supposed to be "metastore_db" in MySQL.
Here's the hive-site.xml:
<configuration>
<property>
<name>hive.metastore.uris</name>
<value>thrift://10.1.50.73:9083</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://10.1.50.73:3306/metastore_db?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>sa</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>huanhuan</value>
</property>
</configuration>
The machine on which I run IDEA can remotely log in to MySQL and Hive to create tables, so there should be no problem with permissions. Here's the log4j output:
> /home/stdevelop/jdk1.7.0_79/bin/java -Didea.launcher.port=7536 -Didea.launcher.bin.path=/home/stdevelop/idea/bin -Dfile.encoding=UTF-8 -classpath /home/stdevelop/jdk1.7.0_79/jre/lib/plugin.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/deploy.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/jfxrt.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/charsets.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/javaws.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/jfr.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/jce.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/jsse.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/rt.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/resources.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/management-agent.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/zipfs.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/sunec.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/sunpkcs11.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/sunjce_provider.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/localedata.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/dnsns.jar:/home/stdevelop/IdeaProjects/StreamingIntoHive/target/scala-2.10/classes:/root/.sbt/boot/scala-2.10.4/lib/scala-library.jar:/home/stdevelop/SparkDll/spark-assembly-1.5.0-hadoop2.5.2.jar:/home/stdevelop/SparkDll/datanucleus-api-jdo-3.2.6.jar:/home/stdevelop/SparkDll/datanucleus-core-3.2.10.jar:/home/stdevelop/SparkDll/datanucleus-rdbms-3.2.9.jar:/home/stdevelop/SparkDll/mysql-connector-java-5.1.35-bin.jar:/home/stdevelop/idea/lib/idea_rt.jar com.intellij.rt.execution.application.AppMain StreamingIntoHive 10.1.50.68 8080
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/09/22 19:43:18 INFO SparkContext: Running Spark version 1.5.0
15/09/22 19:43:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/22 19:43:22 INFO SecurityManager: Changing view acls to: root
15/09/22 19:43:22 INFO SecurityManager: Changing modify acls to: root
15/09/22 19:43:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/09/22 19:43:26 INFO Slf4jLogger: Slf4jLogger started
15/09/22 19:43:26 INFO Remoting: Starting remoting
15/09/22 19:43:26 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:58070]
15/09/22 19:43:26 INFO Utils: Successfully started service 'sparkDriver' on port 58070.
15/09/22 19:43:26 INFO SparkEnv: Registering MapOutputTracker
15/09/22 19:43:26 INFO SparkEnv: Registering BlockManagerMaster
15/09/22 19:43:26 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-e7fdc896-ebd2-4faa-a9fe-e61bd93a9db4
15/09/22 19:43:26 INFO MemoryStore: MemoryStore started with capacity 797.6 MB
15/09/22 19:43:27 INFO HttpFileServer: HTTP File server directory is /tmp/spark-fb07a3ad-8077-49a8-bcaf-12254cc90282/httpd-0bb434c9-1418-49b6-a514-90e27cb80ab1
15/09/22 19:43:27 INFO HttpServer: Starting HTTP Server
15/09/22 19:43:27 INFO Utils: Successfully started service 'HTTP file server' on port 38865.
15/09/22 19:43:27 INFO SparkEnv: Registering OutputCommitCoordinator
15/09/22 19:43:29 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/09/22 19:43:29 INFO SparkUI: Started SparkUI at http://10.1.50.68:4040
15/09/22 19:43:29 INFO SparkContext: Added JAR /home/stdevelop/SparkDll/mysql-connector-java-5.1.35-bin.jar at http://10.1.50.68:38865/jars/mysql-connector-java-5.1.35-bin.jar with timestamp 1442922209496
15/09/22 19:43:29 INFO SparkContext: Added JAR /home/stdevelop/SparkDll/datanucleus-api-jdo-3.2.6.jar at http://10.1.50.68:38865/jars/datanucleus-api-jdo-3.2.6.jar with timestamp 1442922209498
15/09/22 19:43:29 INFO SparkContext: Added JAR /home/stdevelop/SparkDll/datanucleus-rdbms-3.2.9.jar at http://10.1.50.68:38865/jars/datanucleus-rdbms-3.2.9.jar with timestamp 1442922209534
15/09/22 19:43:29 INFO SparkContext: Added JAR /home/stdevelop/SparkDll/datanucleus-core-3.2.10.jar at http://10.1.50.68:38865/jars/datanucleus-core-3.2.10.jar with timestamp 1442922209564
15/09/22 19:43:30 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/09/22 19:43:30 INFO AppClient$ClientEndpoint: Connecting to master spark://10.1.50.71:7077...
15/09/22 19:43:32 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20150922062654-0004
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor added: app-20150922062654-0004/0 on worker-20150921191458-10.1.50.71-44716 (10.1.50.71:44716) with 1 cores
15/09/22 19:43:32 INFO SparkDeploySchedulerBackend: Granted executor ID app-20150922062654-0004/0 on hostPort 10.1.50.71:44716 with 1 cores, 1024.0 MB RAM
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor added: app-20150922062654-0004/1 on worker-20150921191456-10.1.50.73-36446 (10.1.50.73:36446) with 1 cores
15/09/22 19:43:32 INFO SparkDeploySchedulerBackend: Granted executor ID app-20150922062654-0004/1 on hostPort 10.1.50.73:36446 with 1 cores, 1024.0 MB RAM
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor added: app-20150922062654-0004/2 on worker-20150921191456-10.1.50.72-53999 (10.1.50.72:53999) with 1 cores
15/09/22 19:43:32 INFO SparkDeploySchedulerBackend: Granted executor ID app-20150922062654-0004/2 on hostPort 10.1.50.72:53999 with 1 cores, 1024.0 MB RAM
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/1 is now LOADING
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/0 is now LOADING
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/2 is now LOADING
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/0 is now RUNNING
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/1 is now RUNNING
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/2 is now RUNNING
15/09/22 19:43:33 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 60161.
15/09/22 19:43:33 INFO NettyBlockTransferService: Server created on 60161
15/09/22 19:43:33 INFO BlockManagerMaster: Trying to register BlockManager
15/09/22 19:43:33 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.50.68:60161 with 797.6 MB RAM, BlockManagerId(driver, 10.1.50.68, 60161)
15/09/22 19:43:33 INFO BlockManagerMaster: Registered BlockManager
15/09/22 19:43:34 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
15/09/22 19:43:35 INFO SparkContext: Added JAR /home/stdevelop/Builds/streamingintohive.jar at http://10.1.50.68:38865/jars/streamingintohive.jar with timestamp 1442922215169
15/09/22 19:43:39 INFO SparkDeploySchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://[email protected]:40110/user/Executor#-132020084]) with ID 2
15/09/22 19:43:39 INFO SparkDeploySchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://[email protected]:38248/user/Executor#-1615730727]) with ID 0
15/09/22 19:43:40 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.50.72:37819 with 534.5 MB RAM, BlockManagerId(2, 10.1.50.72, 37819)
15/09/22 19:43:40 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.50.71:48028 with 534.5 MB RAM, BlockManagerId(0, 10.1.50.71, 48028)
15/09/22 19:43:42 INFO HiveContext: Initializing execution hive, version 1.2.1
15/09/22 19:43:42 INFO ClientWrapper: Inspected Hadoop version: 2.5.2
15/09/22 19:43:42 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.5.2
15/09/22 19:43:42 INFO SparkDeploySchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://[email protected]:56385/user/Executor#1871695565]) with ID 1
15/09/22 19:43:43 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.50.73:43643 with 534.5 MB RAM, BlockManagerId(1, 10.1.50.73, 43643)
15/09/22 19:43:45 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/09/22 19:43:45 INFO ObjectStore: ObjectStore, initialize called
15/09/22 19:43:47 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/09/22 19:43:47 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/09/22 19:43:47 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/09/22 19:43:48 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/09/22 19:43:58 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/09/22 19:44:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/09/22 19:44:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/09/22 19:44:10 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/09/22 19:44:10 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/09/22 19:44:12 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
15/09/22 19:44:12 INFO ObjectStore: Initialized ObjectStore
15/09/22 19:44:13 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/09/22 19:44:14 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/09/22 19:44:15 INFO HiveMetaStore: Added admin role in metastore
15/09/22 19:44:15 INFO HiveMetaStore: Added public role in metastore
15/09/22 19:44:16 INFO HiveMetaStore: No user is added in admin role, since config is empty
15/09/22 19:44:16 INFO HiveMetaStore: 0: get_all_databases
15/09/22 19:44:16 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
15/09/22 19:44:17 INFO HiveMetaStore: 0: get_functions: db=default pat=*
15/09/22 19:44:17 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
15/09/22 19:44:17 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
15/09/22 19:44:18 INFO SessionState: Created local directory: /tmp/9ee94679-df51-46bc-bf6f-66b19f053823_resources
15/09/22 19:44:18 INFO SessionState: Created HDFS directory: /tmp/hive/root/9ee94679-df51-46bc-bf6f-66b19f053823
15/09/22 19:44:18 INFO SessionState: Created local directory: /tmp/root/9ee94679-df51-46bc-bf6f-66b19f053823
15/09/22 19:44:18 INFO SessionState: Created HDFS directory: /tmp/hive/root/9ee94679-df51-46bc-bf6f-66b19f053823/_tmp_space.db
15/09/22 19:44:19 INFO HiveContext: default warehouse location is /user/hive/warehouse
15/09/22 19:44:19 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
15/09/22 19:44:19 INFO ClientWrapper: Inspected Hadoop version: 2.5.2
15/09/22 19:44:19 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.5.2
15/09/22 19:44:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/22 19:44:22 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/09/22 19:44:22 INFO ObjectStore: ObjectStore, initialize called
15/09/22 19:44:23 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/09/22 19:44:23 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/09/22 19:44:23 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/09/22 19:44:25 WARN HiveMetaStore: Retrying creating default database after error: Error creating transactional connection factory
javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
at java.security.AccessController.doPrivileged(Native Method)
at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:227)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:186)
at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:393)
at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:175)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:178)
at StreamingIntoHive$.main(StreamingIntoHive.scala:42)
at StreamingIntoHive.main(StreamingIntoHive.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
NestedThrowablesStackTrace:
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
at java.security.AccessController.doPrivileged(Native Method)
at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:227)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:186)
at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:393)
at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:175)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:178)
at StreamingIntoHive$.main(StreamingIntoHive.scala:42)
at StreamingIntoHive.main(StreamingIntoHive.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.lang.OutOfMemoryError: PermGen space
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1.doLoadClass(IsolatedClientLoader.scala:165)
at org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1.loadClass(IsolatedClientLoader.scala:153)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at org.datanucleus.store.rdbms.connectionpool.DBCPConnectionPoolFactory.createConnectionPool(DBCPConnectionPoolFactory.java:59)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:238)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:131)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:85)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
………………
………………
Process finished with exit code 1
Can anyone help me find out the reason? Thanks.
I believe adding a HiveContext may help, if it is not already added in the code.
// Create the HiveContext first, then import its members so that
// the implicit conversions and sql("...") become available.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
import hiveContext.implicits._
import hiveContext.sql
A HiveContext adds support for finding tables in the MetaStore and writing queries using HiveQL. Users who do not have an existing Hive deployment can still create a HiveContext. When not configured by hive-site.xml, the context automatically creates metastore_db and a warehouse directory in the current directory. -- from the Spark examples
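Under that assumption, a minimal standalone sketch of creating the context and writing into Hive might look like the following. The app name, sample rows, and column names are placeholders rather than the asker's actual code, the master URL is taken from the log above, and "testtable" must already exist in the metastore.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object HiveWriteSketch {
  def main(args: Array[String]): Unit = {
    // Master URL copied from the log output above.
    val conf = new SparkConf().setAppName("HiveWriteSketch").setMaster("spark://10.1.50.71:7077")
    val sc = new SparkContext(conf)

    // Picks up hive-site.xml from the classpath; without it, Spark creates a
    // local Derby metastore_db and warehouse in the working directory.
    val hiveContext = new HiveContext(sc)
    import hiveContext.implicits._

    // Placeholder rows; a streaming job would produce these instead.
    val df = sc.parallelize(Seq((1, "alice"), (2, "bob"))).toDF("id", "name")
    df.write.insertInto("testtable")

    sc.stop()
  }
}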
The issue was resolved as follows:
Actually, I had created metastore_db manually, so if Spark had connected to MySQL it would have passed the direct-SQL DbType test. Since it did not pass the MySQL direct-SQL check, Spark fell back to Derby as the default metastore DB, as shown in the log4j output above. This means Spark was not connecting to metastore_db in MySQL even though hive-site.xml was configured correctly. I found a solution, which is described in the question comments. Hope it helps.
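Separately, note that the nested trace above ends in "Caused by: java.lang.OutOfMemoryError: PermGen space". When running the driver from IDEA on JDK 7, one thing worth trying (my own assumption, not necessarily the fix referenced in the comments) is raising the PermGen limit in the run configuration's VM options, for example:

// IDEA run configuration -> VM options (JDK 7 HotSpot flags)
-XX:PermSize=128m -XX:MaxPermSize=256m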