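- On a brand-new Ubuntu 20 desktop, install nvidia-470-server from Additional Drivers. (I first installed 450, but with that driver the CUDA version only goes up to 11.0 and some torch libraries do not work. You can switch drivers directly: click Apply Changes and reboot.)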
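- nvidia-smi now reports a CUDA Version of up to 11.4. Manually install 11.2: download and install it from the address below, add it to the environment variables once the installer finishes, then run nvcc -V to test whether the installation succeeded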
https://developer.nvidia.com/cuda-11.2.0-download-archive?
sudo sh cuda_11.2.0_460.27.04_linux.run
// Configure the CUDA environment variables
gedit ~/.bashrc
export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda
source ~/.bashrc
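After reloading ~/.bashrc it can help to confirm that the exports actually took effect before moving on. The following is a minimal sketch rather than part of the original steps: the helper name check_cuda_env is hypothetical, and it assumes the toolkit was installed under /usr/local/cuda-11.2 as in the export lines above.

```python
import os
import shutil


def check_cuda_env(env, cuda_root="/usr/local/cuda-11.2"):
    """Report whether the CUDA directories exported in ~/.bashrc are visible.

    `env` is a mapping such as os.environ. `cuda_root` is an assumption
    taken from the export lines in these notes.
    """
    # Linux separates PATH-style entries with ":".
    path_dirs = env.get("PATH", "").split(":")
    lib_dirs = env.get("LD_LIBRARY_PATH", "").split(":")
    return {
        "bin_on_path": f"{cuda_root}/bin" in path_dirs,
        "lib64_on_ld_library_path": f"{cuda_root}/lib64" in lib_dirs,
        "cuda_home_set": bool(env.get("CUDA_HOME")),
    }


if __name__ == "__main__":
    for name, ok in check_cuda_env(os.environ).items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
    # shutil.which returns None when nvcc cannot be found on PATH.
    print("nvcc:", shutil.which("nvcc") or "not found")
```

If any entry prints MISSING, open a new terminal or run source ~/.bashrc again before testing with nvcc -V.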
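- Choose a PyTorch version that suits torch-geometric: torch-1.9.0+cu111. Install torch and its companion libraries first, then install torch_geometric pinned to version 2.0.4 (do not go too high). When everything is installed, run conda list torch to check the result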
https://pytorch-geometric.com/whl/
Download torch-1.9.0+cu111-cp38-cp38-linux_x86_64.whl and install it manually
Download page: https://download.pytorch.org/whl/torch_stable.html
Press Ctrl+F and search for cu111
wget https://download.pytorch.org/whl/cu111/torch-1.9.0%2Bcu111-cp38-cp38-linux_x86_64.whl
pip install --no-index torch-scatter -f https://pytorch-geometric.com/whl/torch-1.9.0+cu111.html
pip install --no-index torch-sparse -f https://pytorch-geometric.com/whl/torch-1.9.0+cu111.html
pip install --no-index torch-cluster -f https://pytorch-geometric.com/whl/torch-1.9.0+cu111.html
pip install --no-index torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.9.0+cu111.html
pip install torch_geometric==2.0.4 -i https://pypi.doubanio.com/simple
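The four -f URLs above all follow one pattern: the torch version plus the CUDA tag. As a rough sketch (the helper names wheel_index_url and report_torch are hypothetical, not something from PyTorch or torch-geometric), the snippet below rebuilds that URL from the two values and reports what the installed torch sees, so a mismatch shows up before the companion libraries are installed.

```python
def wheel_index_url(torch_version, cuda_tag):
    """Build the wheel index URL passed to `pip install -f` in these notes."""
    return f"https://pytorch-geometric.com/whl/torch-{torch_version}+{cuda_tag}.html"


def report_torch():
    """Return (version, CUDA build, GPU visible), or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    # torch.version.cuda is the CUDA version the wheel was built against
    # (None for CPU-only builds).
    return torch.__version__, torch.version.cuda, torch.cuda.is_available()


if __name__ == "__main__":
    print(wheel_index_url("1.9.0", "cu111"))
    print(report_torch())
```

For the setup in these notes the first element should read 1.9.0+cu111; if it does not, the -f URL needs to change to match.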
- Installing JAX works the same way as it did with CUDA 11.0
- Earlier, with CUDA 11.0 installed, I checked which pytorch build was present
conda list torch
pytorch 1.7.1 py3.8_cuda11.0.221_cudnn8.0.5_0 pytorch
- Pick the matching JAX build from the page below: cuda11, with the same cuDNN as pytorch (805)
https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
wget https://storage.googleapis.com/jax-releases/cuda11/jaxlib-0.3.18+cuda11.cudnn805-cp38-cp38-manylinux2014_x86_64.whl
pip install jaxlib-0.3.18+cuda11.cudnn805-cp38-cp38-manylinux2014_x86_64.whl
- Install the matching jax 0.3.18
pip install jax==0.3.18
- Print the GPU devices to test whether the installation succeeded
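One possible way to do this check is sketched below. It is only an illustration under the assumption that jax and jaxlib were installed as described; the helper name jax_devices_summary is hypothetical. It lists the devices JAX can see and warns when only the CPU shows up.

```python
def jax_devices_summary():
    """Return a list of (platform, description) pairs, or None if jax is missing."""
    try:
        import jax
    except ImportError:
        return None
    # jax.devices() lists the devices of the default backend. When no
    # accelerator backend can be initialised, JAX falls back to the CPU.
    return [(d.platform, str(d)) for d in jax.devices()]


if __name__ == "__main__":
    summary = jax_devices_summary()
    if summary is None:
        print("jax is not installed in this environment")
    else:
        for platform, name in summary:
            print(platform, name)
        if all(platform == "cpu" for platform, _ in summary):
            print("No GPU visible: check that the jaxlib wheel matches CUDA and cuDNN")
```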
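- Version matching is critical: every component must match the versions of the others
- References for problems that may come up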
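JAX: library installation and GPU usage, fixing the GPU not being recognized https://blog.csdn.net/Papageno_Xue/article/details/125754893
- whl is not a supported wheel on this platform: check whether the right conda environment is activated and whether its Python version is 3.8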
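The "not a supported wheel" error usually means the tags in the wheel file name do not match the running interpreter. Here is a small sketch for comparing the two before running pip; the helper names are hypothetical, and it assumes a CPython interpreter and the standard wheel naming layout (name-version-python tag-abi tag-platform tag.whl).

```python
import sys


def wheel_python_tag(wheel_filename):
    """Extract the Python tag (e.g. 'cp38') from a wheel file name.

    The Python tag is the third field from the right once the
    '.whl' suffix is removed.
    """
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[: -len(".whl")]
    return stem.split("-")[-3]


def interpreter_tag(version_info=sys.version_info):
    """Return the CPython tag for a version tuple, e.g. (3, 8) gives 'cp38'."""
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_matches_interpreter(wheel_filename, version_info=sys.version_info):
    """True when the wheel's Python tag equals the interpreter's tag."""
    return wheel_python_tag(wheel_filename) == interpreter_tag(version_info)


if __name__ == "__main__":
    wheel = "jaxlib-0.3.18+cuda11.cudnn805-cp38-cp38-manylinux2014_x86_64.whl"
    print("wheel tag:", wheel_python_tag(wheel))
    print("interpreter tag:", interpreter_tag())
    print("match:", wheel_matches_interpreter(wheel))
```

Run it inside the activated conda environment. If the tags differ, the environment's Python is not the 3.8 that the cp38 wheels expect.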
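- If nothing works: the mild fix is to delete the environment with conda remove, the drastic one is to reinstall the system and the driver
- Detailed installation and configuration of CUDA and cuDNN on Ubuntu 20.04 (illustrated) https://blog.csdn.net/weixin_37926734/article/details/123033286
- For runtime errors like /usr/lib64/libstdc++.so.6: version GLIBCXX_3.4.21 not found, there is no need to upgrade GCC. Posts on Zhihu suggest either downgrading scipy or adding conda's lib directory in .bashrc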
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/miniconda/lib
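To see whether a given libstdc++.so.6 actually provides the missing symbol version, the library file can be scanned for its GLIBCXX strings, much like running strings on it and filtering for GLIBCXX. This is a minimal sketch with hypothetical helper names; the example path assumes the miniconda location from the export line above.

```python
import os
import re


def glibcxx_versions(libstdcxx_path):
    """Return the GLIBCXX_x.y[.z] strings found in a library file, sorted as text."""
    with open(libstdcxx_path, "rb") as f:
        data = f.read()
    found = set(re.findall(rb"GLIBCXX_\d+(?:\.\d+)+", data))
    return sorted(v.decode("ascii") for v in found)


def has_glibcxx(libstdcxx_path, version="GLIBCXX_3.4.21"):
    """True when the library file contains the given version string."""
    return version in glibcxx_versions(libstdcxx_path)


if __name__ == "__main__":
    # Assumed location, based on the LD_LIBRARY_PATH entry in these notes.
    candidate = os.path.expanduser("~/miniconda/lib/libstdc++.so.6")
    if os.path.exists(candidate):
        print(candidate, "has GLIBCXX_3.4.21:", has_glibcxx(candidate))
    else:
        print(candidate, "not found")
```

Pointing the same function at /usr/lib64/libstdc++.so.6 shows whether the system copy lacks the version, which is the case where putting conda's lib directory on LD_LIBRARY_PATH helps.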