The input() function in PySpark


Problem Description

My problem here is that when I enter the value of p, nothing happens; execution does not continue. Is there a way to fix this?

import sys
from pyspark import SparkContext

sc = SparkContext("local", "simple App")

p = input("Enter the word")
rdd1 = sc.textFile("monfichier")
rdd2 = rdd1.map(lambda l: l.split("\t"))
rdd3 = rdd2.map(lambda l: l[1])
print(rdd3.take(6))
rdd5 = rdd3.filter(lambda l: p in l)

sc.stop()

Recommended Answer

You have to distinguish between two different cases:

  • Script submitted with $SPARK_HOME/bin/spark-submit script.py

In this case you execute a Scala application which in turn starts a Python interpreter. Since the Scala application doesn't expect any interaction from standard input, let alone pass it on to the Python interpreter, your Python script will simply hang, waiting for data that will never come.

  • Script executed directly using the Python interpreter (python script.py)

You should be able to use input directly, but at the cost of handling all the configuration details, normally taken care of by spark-submit / org.apache.spark.deploy.SparkSubmit, manually in your code, as in the sketch below.
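
A minimal sketch of this second case, assuming only the standard PySpark API (pyspark must be importable, e.g. installed via pip or with SPARK_HOME on the Python path); the master and app name below are illustrative choices, and "monfichier" is taken from the question:

from pyspark import SparkConf, SparkContext

# Configuration that spark-submit would normally supply is set in code.
conf = SparkConf().setMaster("local[*]").setAppName("simple App")
sc = SparkContext(conf=conf)

# stdin belongs to this interpreter, so input() behaves as expected.
p = input("Enter the word: ")

rdd = (sc.textFile("monfichier")
         .map(lambda l: l.split("\t"))
         .map(lambda l: l[1])
         .filter(lambda l: p in l))
print(rdd.take(6))

sc.stop()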

In general, all required arguments for your script can be passed on the command line:

$SPARK_HOME/bin/spark-submit script.py some_app_arg another_app_arg

and accessed using standard methods such as sys.argv or argparse; using input is neither necessary nor useful.
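
A minimal sketch of that spark-submit-friendly approach, where the word is read from the command line instead of stdin; the --word argument name is an illustrative assumption, not something from the original question:

import argparse
from pyspark import SparkContext

# Application arguments placed after the script name in spark-submit
# are forwarded to the script and parsed here.
parser = argparse.ArgumentParser()
parser.add_argument("--word", required=True, help="word to filter lines by")
args = parser.parse_args()

sc = SparkContext("local", "simple App")
rdd = (sc.textFile("monfichier")
         .map(lambda l: l.split("\t"))
         .map(lambda l: l[1])
         .filter(lambda l: args.word in l))
print(rdd.take(6))
sc.stop()

Submitted, for example, as $SPARK_HOME/bin/spark-submit script.py --word foo.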
