Unable to get pip install to work on an EMR cluster

Problem description

I have an EMR (emr-5.30.0) cluster I'm trying to start with a bootstrap file in S3. The contents of the bootstrap file are:

#!/bin/bash
sudo pip3 install --user \
     matplotlib \
     pandas \
     pyarrow \
     pyspark

And the error in my stderr file is:

WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Command "python setup.py egg_info" failed with error code 1 in /mnt/tmp/pip-build-br9bn1h3/pyspark/

Seems pretty simple...no idea what is going on. Any help is appreciated.

Tried @Dennis Traub's suggestion and got the same error. The new EMR bootstrap looks like this:

#!/bin/bash
sudo pip3 install --upgrade setuptools
sudo pip3 install --user matplotlib pandas pyarrow pyspark

Recommended answer

#!/bin/bash

sudo python3 -m pip install matplotlib pandas pyarrow

DO NOT install pyspark. It should already be present on EMR with the required configuration. Installing it again may cause problems.
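
If the package list is shared with other environments and may still contain pyspark, one option is to filter it out inside the bootstrap script before calling pip. The following is only a sketch under that assumption: `filter_packages` and the `DRY_RUN` variable are hypothetical names introduced here for illustration, not part of EMR or pip.

```shell
#!/bin/bash
# Sketch of an EMR bootstrap action that skips pyspark.
# filter_packages and DRY_RUN are hypothetical names, not EMR or pip features.
set -eu

# Print the given package names on one line, dropping pyspark,
# since EMR is expected to provide its own configured copy.
filter_packages() {
    local kept=""
    local pkg
    for pkg in "$@"; do
        if [ "$pkg" != "pyspark" ]; then
            # Append with a separating space only when kept is non-empty.
            kept="${kept:+$kept }$pkg"
        fi
    done
    echo "$kept"
}

packages="$(filter_packages matplotlib pandas pyarrow pyspark)"

# DRY_RUN defaults to 1 so this sketch only prints the command.
# Set DRY_RUN=0 in the real bootstrap file to actually install.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: sudo python3 -m pip install $packages"
else
    # $packages is left unquoted on purpose so each name becomes its own argument.
    sudo python3 -m pip install $packages
fi
```

Run as-is, the script only prints the pip command it would execute, which makes it easy to check the filtered list locally before uploading the file to S3.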
