I have set up Docker for Hadoop YARN, and I am trying to set up an Apache Livy server so that I can submit jobs through its REST API.
The log below shows that the Livy server starts and then stops automatically a short while later:
19/08/17 07:09:35 INFO utils.LineBufferedStream: Welcome to
19/08/17 07:09:35 INFO utils.LineBufferedStream: ____ __
19/08/17 07:09:35 INFO utils.LineBufferedStream: / __/__ ___ _____/ /__
19/08/17 07:09:35 INFO utils.LineBufferedStream: _\ \/ _ \/ _ `/ __/ '_/
19/08/17 07:09:35 INFO utils.LineBufferedStream: /___/ .__/\_,_/_/ /_/\_\ version 2.2.1
19/08/17 07:09:35 INFO utils.LineBufferedStream: /_/
19/08/17 07:09:35 INFO utils.LineBufferedStream:
19/08/17 07:09:35 INFO utils.LineBufferedStream: Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_222
19/08/17 07:09:35 INFO utils.LineBufferedStream: Branch
19/08/17 07:09:35 INFO utils.LineBufferedStream: Compiled by user felixcheung on 2017-11-24T23:19:45Z
19/08/17 07:09:35 INFO utils.LineBufferedStream: Revision
19/08/17 07:09:35 INFO utils.LineBufferedStream: Url
19/08/17 07:09:35 INFO utils.LineBufferedStream: Type --help for more information.
19/08/17 07:09:35 INFO recovery.StateStore$: Using BlackholeStateStore for recovery.
19/08/17 07:09:35 INFO sessions.BatchSessionManager: Recovered 0 batch sessions. Next session id: 0
19/08/17 07:09:35 INFO sessions.InteractiveSessionManager: Recovered 0 interactive sessions. Next session id: 0
19/08/17 07:09:35 INFO sessions.InteractiveSessionManager: Heartbeat watchdog thread started.
19/08/17 07:09:35 INFO util.log: Logging initialized @1944ms
19/08/17 07:09:36 INFO server.Server: jetty-9.3.24.v20180605, build timestamp: 2018-06-05T17:11:56Z, git hash: xxx0x0x0xx00xxxx0x0x0x0x0x0x0x0xxxx
19/08/17 07:09:36 INFO handler.ContextHandler: Started o.e.j.s.ServletContextHandler@3543df7d{/,file:///livy/apache-livy-0.6.0-incubating-bin/bin/src/main/org/apache/livy/server,AVAILABLE}
19/08/17 07:09:36 INFO server.AbstractNCSARequestLog: Opened /livy/apache-livy-0.6.0-incubating-bin/logs/2019_08_17.request.log
19/08/17 07:09:36 INFO server.AbstractConnector: Started ServerConnector@686449f9{HTTP/1.1,[http/1.1]}{x.x.x.x:8080}
19/08/17 07:09:36 INFO server.Server: Started @2304ms
19/08/17 07:09:36 INFO server.WebServer: Starting server on http://x.x.x.x:8080
19/08/17 07:10:01 INFO server.LivyServer: Shutting down Livy server.
19/08/17 07:10:01 INFO handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler@3543df7d{/,file:///livy/apache-livy-0.6.0-incubating-bin/bin/src/main/org/apache/livy/server,UNAVAILABLE}
19/08/17 07:10:01 INFO server.AbstractConnector: Stopped ServerConnector@686449f9{HTTP/1.1,[http/1.1]}{x.x.x.x:8080}
I have provided a livy.conf specifying the host IP and port the Livy server should run on. I have also completed the setup for submitting Spark jobs on YARN, so I am attaching the relevant files below.
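For reference, my livy.conf contains settings along these lines (the values are illustrative; the port matches what appears in the log above, while Livy's default is 8998):

```
livy.server.host = 0.0.0.0
livy.server.port = 8080
livy.spark.master = yarn
livy.spark.deploy-mode = cluster
```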
docker-compose.yml
version: "2"
services:
  livy:
    image: namenode/hadoopspark:2.2.1
    command: /livy/apache-livy-0.6.0-incubating-bin/bin/livy-server start
    network_mode: "host"
    ports:
      - 8080:8080
#####################BASE DOCKERFILE#################
FROM ubuntu:14.04
ENV DAEMON_RUN=true
ENV SPARK_VERSION=2.2.1
ENV HADOOP_VERSION=2.7
ENV SPARK_HOME=/spark
ENV HADOOP_HOME=/hadoop
RUN apt-get update \
&& apt-get install -y software-properties-common openssh-server net-tools curl nano vim wget ca-certificates jq gnupg unzip
RUN add-apt-repository ppa:openjdk-r/ppa
RUN apt-get update
RUN apt-get install -y openjdk-8-jdk \
supervisor
RUN ssh-keygen -q -N "" -t rsa -f /root/.ssh/id_rsa
RUN cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
RUN wget https://www-eu.apache.org/dist/incubator/livy/0.6.0-incubating/apache-livy-0.6.0-incubating-bin.zip \
 && unzip apache-livy-0.6.0-incubating-bin.zip \
 && mkdir -p /livy \
 && mv apache-livy-0.6.0-incubating-bin /livy
RUN wget https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz \
&& tar -xzf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz \
&& mv spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION} /spark
RUN wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz \
&& tar -xzvf hadoop-2.7.3.tar.gz \
&& mv hadoop-2.7.3 /hadoop
ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
ENV HADOOP_CONF_DIR=/hadoop/etc/hadoop
RUN echo "export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/" >> /hadoop/etc/hadoop/hadoop-env.sh \
 && echo "export HADOOP_HOME=/hadoop" >> /hadoop/etc/hadoop/hadoop-env.sh \
 && echo "export HADOOP_CONF_DIR=/hadoop/etc/hadoop" >> /hadoop/etc/hadoop/hadoop-env.sh \
 && echo "export HADOOP_SSH_OPTS=\"-p 22\"" >> /hadoop/etc/hadoop/hadoop-env.sh
ENV PATH=$SPARK_HOME/bin:$PATH
ENV PATH=$PATH:/hadoop/bin:/hadoop/sbin
################NAMENODE DOCKERFILE####################
FROM base/hadoopspark:2.2.1
COPY conf/* /tmp/
RUN cp /tmp/hdfs-site.xml $HADOOP_HOME/etc/hadoop/hdfs-site.xml && \
cp /tmp/core-site.xml $HADOOP_HOME/etc/hadoop/core-site.xml && \
cp /tmp/mapred-site.xml $HADOOP_HOME/etc/hadoop/mapred-site.xml && \
cp /tmp/yarn-site.xml $HADOOP_HOME/etc/hadoop/yarn-site.xml && \
cp /tmp/hdfs-site.xml $SPARK_HOME/conf/ && \
cp /tmp/core-site.xml $SPARK_HOME/conf/ && \
cp /tmp/mapred-site.xml $SPARK_HOME/conf/ && \
cp /tmp/yarn-site.xml $SPARK_HOME/conf/ && \
cp /tmp/spark-defaults.conf $SPARK_HOME/conf/ && \
cp /tmp/livy.conf /livy/apache-livy-0.6.0-incubating-bin/conf
COPY Docker_WordCount_Spark-1.0.jar /opt/Docker_WordCount_Spark-1.0.jar
COPY sample.txt /opt/sample.txt
#RUN hdfs dfs -put /opt/Docker_WordCount_Spark-1.0.jar Docker_WordCount_Spark-1.0.jar
#RUN hdfs dfs -put /opt/sample.txt sample.txt
ENV LD_LIBRARY_PATH=/hadoop/lib/native:$LD_LIBRARY_PATH
RUN sudo service ssh restart
RUN sudo /hadoop/bin/hadoop namenode -format
EXPOSE 8998 8080
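For completeness, I build the two images in order so that the namenode Dockerfile can reference the base image by tag, then bring up the service (the directory layout here is an assumption):

```shell
# Build the base image first, since the namenode Dockerfile starts FROM it,
# then build the namenode image and start the livy service.
docker build -t base/hadoopspark:2.2.1 ./base
docker build -t namenode/hadoopspark:2.2.1 ./namenode
docker-compose up -d livy
```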
Is there anything else I need to do to get the Livy server running? Thanks!
Best answer
Docker requires the container's main command to keep running in the foreground. Otherwise, it assumes the application has stopped and shuts the container down. Since the livy-server start script forks the server into a background process and no other foreground process runs afterwards, the container exits as soon as the script finishes. There are several ways to fix this; a simple one is to add the following command to the Dockerfile to start the Livy server (and remove the command from docker-compose.yml):
CMD /livy/apache-livy-0.6.0-incubating-bin/bin/livy-server start && /bin/bash
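Note that `/bin/bash` exits immediately when the container has no attached TTY, so a sturdier variant (a sketch; the exact log file name under logs/ is an assumption) is to block on Livy's log so Docker always sees a foreground process:

```shell
# Start Livy in the background, then tail its log in the foreground so the
# container keeps running even without an attached terminal.
CMD /livy/apache-livy-0.6.0-incubating-bin/bin/livy-server start && \
    tail -f /livy/apache-livy-0.6.0-incubating-bin/logs/*.out
```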
The resulting Livy server Dockerfile is:
FROM base/hadoopspark:2.2.1
COPY conf/* /tmp/
ENV SPARK_HOME=/spark
ENV HADOOP_HOME=/hadoop
RUN cp /tmp/hdfs-site.xml $HADOOP_HOME/etc/hadoop/hdfs-site.xml && \
cp /tmp/core-site.xml $HADOOP_HOME/etc/hadoop/core-site.xml && \
cp /tmp/mapred-site.xml $HADOOP_HOME/etc/hadoop/mapred-site.xml && \
cp /tmp/yarn-site.xml $HADOOP_HOME/etc/hadoop/yarn-site.xml && \
cp /tmp/hdfs-site.xml $SPARK_HOME/conf/ && \
cp /tmp/core-site.xml $SPARK_HOME/conf/ && \
cp /tmp/mapred-site.xml $SPARK_HOME/conf/ && \
cp /tmp/yarn-site.xml $SPARK_HOME/conf/ && \
cp /tmp/spark-defaults.conf $SPARK_HOME/conf/ && \
cp /tmp/livy.conf /livy/apache-livy-0.6.0-incubating-bin/conf
COPY Docker_WordCount_Spark-1.0.jar /opt/Docker_WordCount_Spark-1.0.jar
COPY sample.txt /opt/sample.txt
ENV LD_LIBRARY_PATH=/hadoop/lib/native:$LD_LIBRARY_PATH
RUN sudo service ssh restart
RUN sudo /hadoop/bin/hadoop namenode -format
ENV PATH=$SPARK_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
EXPOSE 8998 8080
CMD /livy/apache-livy-0.6.0-incubating-bin/bin/livy-server start && /bin/bash
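Once the container stays up, a batch job can be submitted through Livy's REST `/batches` endpoint. A minimal sketch follows; the HDFS paths and the class name are assumptions to adapt to the actual job, and the curl call is left commented out since it needs a running server:

```shell
#!/bin/sh
# Hypothetical batch submission via Livy's REST API. Livy's default port is
# 8998 unless overridden in livy.conf (8080 in this setup).
LIVY_URL="http://localhost:8080"

# JSON body for POST /batches; "file" must point at a location the cluster
# can read, e.g. the jar uploaded to HDFS. Class name is a placeholder.
PAYLOAD='{
  "file": "hdfs:///user/root/Docker_WordCount_Spark-1.0.jar",
  "className": "com.example.WordCount",
  "args": ["hdfs:///user/root/sample.txt"]
}'

echo "$PAYLOAD"

# Uncomment once the server is reachable:
# curl -s -H "Content-Type: application/json" -d "$PAYLOAD" "$LIVY_URL/batches"
```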
Regarding docker - Is there any other configuration to be done with the Livy server (livy.conf)?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57534640/