The error message is as follows:
!!!!PCI Resource ERROR!!!!
PCI OUT OF RESOURCES CONDITION:
Error: Insufficient PCI Resources Detected!!!
System is running with Insufficient PCI Resources!
In order to display this message some
PCI devices were set to disabled state!
It is strongly recommended to Power Off the system and remove some PCI/PCI Express cards from the system!
To continue booting, proceed to Menu Option and select Boot Device or .
WARNING: If you choose to continue booting some Operating
Systems might not be able to complete boot correctly!
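Solution:
- Power on the machine and enter the BIOS setup.
- Find the setting: BIOS > Advanced > PCIe/PCI/PnP Configuration > Above 4G Decoding
- Set it to Enabled, so that 64-bit capable PCI devices can be mapped above the 4 GB address boundary.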
This guide describes how to set up a TensorFlow 1.4.0 development environment with GPU support on Ubuntu 16.04 LTS.
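Do you need GPU support?
That depends on whether you have a CUDA-capable NVIDIA graphics card. If you do not, the CPU-only version is your only option. If you do, read on.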
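As a quick way to answer that question, the sketch below filters the output of `lspci` for NVIDIA display controllers. It assumes the `pciutils` package (which provides `lspci`) is installed; the function names are made up for this illustration. Finding an NVIDIA card is only a first hint: whether the model supports CUDA still has to be checked against NVIDIA's CUDA GPUs list.

```python
import subprocess


def nvidia_devices(lspci_output):
    """Return the lspci lines that describe NVIDIA display controllers."""
    # lspci labels graphics hardware with one of these device classes.
    classes = ("VGA compatible controller", "3D controller")
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if "NVIDIA" in line and any(c in line for c in classes)
    ]


def read_lspci():
    """Run lspci and return its text output (assumes pciutils is installed)."""
    return subprocess.check_output(["lspci"]).decode("utf-8", "replace")
```

Calling `nvidia_devices(read_lspci())` on the target machine returns an empty list when no NVIDIA graphics device is present.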
Install the NVIDIA dependencies
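- Install CUDA Toolkit 9.0: on the CUDA Downloads page, select your operating system and version, and choose deb (network) as the installer type. The page then gives a download link and a series of commands similar to:
sudo dpkg -i cuda-repo-ubuntu1604_9.0.176-1_amd64.deb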
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-9-0 # pins CUDA 9.0; the plain cuda package installs the newest release in the repository
Add the CUDA binaries to the PATH environment variable:
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
Add the CUDA libraries to the LD_LIBRARY_PATH environment variable:
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
- Install the graphics driver. At the time of writing, the latest driver version is 384:
sudo apt install nvidia-384
Once the driver is installed, check the GPU status with:
nvidia-smi
- Install cuDNN 7: on the cuDNN download page, click Download and fill in the survey, then download and install the packages that match your system. Example for a 64-bit system:
sudo dpkg -i libcudnn7_7.0.3.11-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.0.3.11-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-doc_7.0.3.11-1+cuda9.0_amd64.deb
- Install the libcupti-dev library:
sudo apt install libcupti-dev
Add the CUPTI libraries to the LD_LIBRARY_PATH environment variable:
export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
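The `${VAR:+:${VAR}}` expansion in the export commands appends a colon and the old value only when the variable is already set and non-empty, which avoids a stray trailing colon. The following sketch mirrors that rule in Python and adds a small check for entries that are still missing from a search path. The function names are hypothetical helpers written for this illustration; the directories in the usage note are the ones used in this guide.

```python
SEP = ":"  # separator used in PATH-style variables on Linux


def prepend_path(directory, current):
    """Mimic DIR${VAR:+:${VAR}}: append ':' + VAR only if VAR is non-empty."""
    if current:
        return directory + SEP + current
    return directory


def missing_entries(expected, current):
    """Return the expected directories that are absent from a path string."""
    entries = [e for e in (current or "").split(SEP) if e]
    return [d for d in expected if d not in entries]
```

For example, `missing_entries(["/usr/local/cuda-9.0/lib64", "/usr/local/cuda/extras/CUPTI/lib64"], os.environ.get("LD_LIBRARY_PATH"))`, after `import os`, lists the library directories that the current shell has not exported yet.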
Install TensorFlow
You can either install a prebuilt package or build from source.
Native pip installation
Both Python 2.7 and Python 3.n are supported.
- First install and upgrade pip:
# for Python 2.7
sudo apt-get install python-pip python-dev
sudo pip install -U pip setuptools
# for Python 3.n
sudo apt-get install python3-pip python3-dev
sudo pip3 install -U pip setuptools
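- Install TensorFlow by running only the one command that matches your setup. Note that the prebuilt tensorflow-gpu 1.4.0 packages are built against CUDA 8.0 and cuDNN 6, so they will not load with the CUDA 9.0 and cuDNN 7 libraries installed above; for that combination, build from source as described in the next section.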
pip install tensorflow # Python 2.7; CPU support (no GPU support)
pip3 install tensorflow # Python 3.n; CPU support (no GPU support)
pip install tensorflow-gpu # Python 2.7; GPU support
pip3 install tensorflow-gpu # Python 3.n; GPU support
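The four commands above differ along two axes only: the Python major version decides between `pip` and `pip3`, and GPU support decides between the `tensorflow` and `tensorflow-gpu` packages. A minimal sketch of that decision table follows; the function names are invented for this illustration and the code only builds the command string, it does not run pip.

```python
import sys


def pip_install_command(python_major, gpu):
    """Build the install command for a Python major version and GPU choice."""
    pip = "pip" if python_major == 2 else "pip3"
    package = "tensorflow-gpu" if gpu else "tensorflow"
    return "{} install {}".format(pip, package)


def command_for_this_interpreter(gpu):
    """Same as above, using the major version of the running interpreter."""
    return pip_install_command(sys.version_info[0], gpu)
```

Printing `command_for_this_interpreter(gpu=True)` shows which of the four lines applies to the interpreter that runs the script.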
Building from source
- Clone the source repository with git and switch to the r1.4 branch:
git clone https://github.com/tensorflow/tensorflow
git -C tensorflow checkout r1.4
- Install bazel:
sudo apt-get install openjdk-8-jdk
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
sudo apt-get update && sudo apt-get install bazel
- Install TensorFlow's Python dependencies:
sudo apt-get install python-numpy python-dev python-pip python-wheel # for Python 2.7
sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel # for Python 3.n
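- Run the configuration script, paying close attention to the answer at each prompt:
cd tensorflow # enter the root of the repository cloned in the first step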
./configure
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python2.7
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use. Default is [/usr/lib/python2.7/dist-packages]
Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with MKL support? [y/N]
No MKL support will be enabled for TensorFlow
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n]
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N]
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N]
No XLA support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N]
No VERBS support will be enabled for TensorFlow
Do you wish to build TensorFlow with OpenCL support? [y/N]
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] Y
CUDA support will be enabled for TensorFlow
Do you want to use clang as CUDA compiler? [y/N]
nvcc will be used as CUDA compiler
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 9.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 7
Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: "3.5,5.2"]: 6.1
Do you wish to build TensorFlow with MPI support? [y/N]
MPI support will not be enabled for TensorFlow
Configuration finished
- Build the pip package:
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
- Install the generated pip package. The whl file is in the /tmp/tensorflow_pkg directory; its exact name may differ slightly:
sudo pip install /tmp/tensorflow_pkg/tensorflow-1.4.0-cp27-cp27mu-linux_x86_64.whl
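Because the wheel's file name depends on the Python version and platform used for the build, it can be easier to look it up than to type it. The sketch below lists the TensorFlow wheels in the output directory; `find_wheels` is a hypothetical helper name, and the default directory is the one passed to `build_pip_package` in this guide.

```python
import glob
import os


def find_wheels(directory="/tmp/tensorflow_pkg"):
    """Return the sorted paths of TensorFlow wheel files in a directory."""
    pattern = os.path.join(directory, "tensorflow-*.whl")
    return sorted(glob.glob(pattern))
```

The returned path can then be passed to `pip install` in place of the hard-coded file name. An empty list means the build step did not produce a wheel in that directory.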
Verify the installation
- Start Python:
$ python
- Enter the following code line by line:
# Python
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
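If you see the following output:
Hello, TensorFlow!
the installation succeeded. Under Python 3 the string is printed as a bytes literal, b'Hello, TensorFlow!'.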
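The hello-world check confirms that TensorFlow imports and runs, but it also passes on a CPU-only installation. To see whether the GPU is actually visible, one option is to list the local devices through `tensorflow.python.client.device_lib`, a helper module that ships with TensorFlow 1.x. This is a sketch under that assumption; `gpu_names` and `visible_gpus` are names chosen for this example.

```python
def gpu_names(devices):
    """Return the names of the entries whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def visible_gpus():
    """List the GPUs TensorFlow can see (requires TensorFlow 1.x)."""
    # Imported lazily so the pure helper above works without TensorFlow.
    from tensorflow.python.client import device_lib
    return gpu_names(device_lib.list_local_devices())
```

Running `print(visible_gpus())` in the same Python session returns an empty list when TensorFlow falls back to the CPU, which usually points to a driver, CUDA, or cuDNN mismatch.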
More details
See the detailed official documentation:
- Installing TensorFlow on Ubuntu
- Installing TensorFlow from Sources
- NVIDIA CUDA Installation Guide for Linux
- NVIDIA cuDNN
- CUDA GPUs