Problem description
I am running PySpark from an Azure Machine Learning notebook and trying to move a file using the dbutils module.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def get_dbutils(spark):
    try:
        from pyspark.dbutils import DBUtils
        dbutils = DBUtils(spark)
    except ImportError:
        import IPython
        dbutils = IPython.get_ipython().user_ns["dbutils"]
    return dbutils

dbutils = get_dbutils(spark)
dbutils.fs.cp("file:source", "dbfs:destination")
I got this error: ModuleNotFoundError: No module named 'pyspark.dbutils'. Is there a workaround for this?
Here is the error from another Azure Machine Learning notebook:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-1-183f003402ff> in get_dbutils(spark)
4 try:
----> 5 from pyspark.dbutils import DBUtils
6 dbutils = DBUtils(spark)
ModuleNotFoundError: No module named 'pyspark.dbutils'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-1-183f003402ff> in <module>
10 return dbutils
11
---> 12 dbutils = get_dbutils(spark)
<ipython-input-1-183f003402ff> in get_dbutils(spark)
7 except ImportError:
8 import IPython
----> 9 dbutils = IPython.get_ipython().user_ns["dbutils"]
10 return dbutils
11
KeyError: 'dbutils'
Recommended answer
This is a known issue with Databricks Utilities (dbutils).
Most of dbutils isn't supported with Databricks Connect. The only parts that work are fs and secrets.
Reference: Databricks Connect - Limitations and known issues.
Note: Currently fs and secrets work (locally). Widgets (!!!), libraries, etc. do not work. This shouldn't be a major issue. If you execute on Databricks using a Python Task, dbutils will fail with the error:
ImportError: No module named 'pyspark.dbutils'
I'm able to execute the query successfully when running it as a notebook.
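Since dbutils only exists where Databricks provides it, one option is to have the lookup return None instead of raising, so the KeyError from the question's traceback never happens. The following is only a minimal sketch of that idea: get_dbutils_or_none is a hypothetical helper name, not part of any Databricks or PySpark API, and it does not make dbutils available in an Azure Machine Learning notebook.

```python
def get_dbutils_or_none(spark=None):
    """Return a dbutils handle if this runtime provides one, else None.

    Hypothetical helper for illustration; not part of PySpark or Databricks.
    """
    try:
        # This module ships only with Databricks runtimes / Databricks Connect.
        from pyspark.dbutils import DBUtils
        return DBUtils(spark)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so it lands here.
        pass

    try:
        import IPython
    except ImportError:
        return None

    shell = IPython.get_ipython()
    if shell is None:
        # Not running inside an IPython/Jupyter session.
        return None

    # dict.get avoids the KeyError raised by user_ns["dbutils"].
    return shell.user_ns.get("dbutils")
```

The caller can then check the result and report a clear message (for example, that the code must run on Databricks) rather than surfacing a chained traceback.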