Question
I am a little confused about how to pass the arguments as REST API JSON.
Consider the spark-submit command below.
spark-submit --packages org.apache.hudi:hudi-utilities-bundle_2.11:0.5.3,org.apache.spark:spark-avro_2.11:2.4.4 \
--master yarn \
--deploy-mode cluster \
--num-executors 10 \
--executor-memory 3g \
--driver-memory 6g \
--conf spark.driver.extraJavaOptions="-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCTimeStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/varadarb_ds_driver.hprof" \
--conf spark.executor.extraJavaOptions="-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCTimeStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/varadarb_ds_executor.hprof" \
--queue hadoop-platform-queue \
--conf spark.scheduler.mode=FAIR \
--conf spark.yarn.executor.memoryOverhead=1072 \
--conf spark.yarn.driver.memoryOverhead=2048 \
--conf spark.task.cpus=1 \
--conf spark.executor.cores=1 \
--conf spark.task.maxFailures=10 \
--conf spark.memory.fraction=0.4 \
--conf spark.rdd.compress=true \
--conf spark.kryoserializer.buffer.max=200m \
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
--conf spark.memory.storageFraction=0.1 \
--conf spark.shuffle.service.enabled=true \
--conf spark.sql.hive.convertMetastoreParquet=false \
--conf spark.ui.port=5555 \
--conf spark.driver.maxResultSize=3g \
--conf spark.executor.heartbeatInterval=120s \
--conf spark.network.timeout=600s \
--conf spark.eventLog.overwrite=true \
--conf spark.eventLog.enabled=true \
--conf spark.eventLog.dir=hdfs:///user/spark/applicationHistory \
--conf spark.yarn.max.executor.failures=10 \
--conf spark.sql.catalogImplementation=hive \
--conf spark.sql.shuffle.partitions=100 \
--driver-class-path $HADOOP_CONF_DIR \
--class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
--table-type MERGE_ON_READ \
--source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
--source-ordering-field ts \
--target-base-path /user/hive/warehouse/stock_ticks_mor \
--target-table stock_ticks_mor \
--props /var/demo/config/kafka-source.properties \
--schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
--continuous
How do I pass this as JSON to the Livy server? And how do I pass the jar files as files, along with the other configurations?
Answer
Posting here in case it helps someone.
We found out that we can pass args as a list in the HTTP request to the Livy server. In args we can pass all the Hudi-related options, e.g. ["key1","value1","key2","value2","--hoodie-conf","confname=value", ...]. We are able to submit jobs via the Livy server this way.
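For reference, here is a minimal sketch of what the corresponding Livy POST /batches payload could look like for the spark-submit command above. The jar path (hdfs:///jars/hudi-utilities-bundle_2.11-0.5.3.jar) is an assumed placeholder, since the original command does not show the application jar; everything after the application class goes into args, and the spark-submit --conf entries (only a few are shown here) map into the conf object:

{
  "file": "hdfs:///jars/hudi-utilities-bundle_2.11-0.5.3.jar",
  "className": "org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer",
  "queue": "hadoop-platform-queue",
  "numExecutors": 10,
  "executorMemory": "3g",
  "driverMemory": "6g",
  "args": [
    "--table-type", "MERGE_ON_READ",
    "--source-class", "org.apache.hudi.utilities.sources.JsonKafkaSource",
    "--source-ordering-field", "ts",
    "--target-base-path", "/user/hive/warehouse/stock_ticks_mor",
    "--target-table", "stock_ticks_mor",
    "--props", "/var/demo/config/kafka-source.properties",
    "--schemaprovider-class", "org.apache.hudi.utilities.schema.FilebasedSchemaProvider",
    "--continuous"
  ],
  "conf": {
    "spark.jars.packages": "org.apache.hudi:hudi-utilities-bundle_2.11:0.5.3,org.apache.spark:spark-avro_2.11:2.4.4",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.task.maxFailures": "10",
    "spark.sql.hive.convertMetastoreParquet": "false"
  }
}

It can then be submitted with something like: curl -X POST -H "Content-Type: application/json" -d @payload.json http://<livy-host>:8998/batches (host and port are assumptions; 8998 is Livy's default). The remaining --conf options from the spark-submit command go into the same conf map, and extra jars or files can be passed via Livy's "jars" and "files" list fields.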