How to check if TensorFlow is using all available GPUs











I am learning to use TensorFlow for object detection. To speed up training, I am using an AWS g3.16xlarge instance, which has 4 GPUs. I start the training process with the following commands:



export CUDA_VISIBLE_DEVICES=0,1,2,3
python object_detection/train.py --logtostderr --pipeline_config_path=/home/ubuntu/builder/rcnn.config --train_dir=/home/ubuntu/builder/experiments/training/


Inside rcnn.config I have set the batch size to 1. At runtime I get the following console output:



2018-11-09 07:25:50.104310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Device peer to peer matrix
2018-11-09 07:25:50.104385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1051] DMA: 0 1 2 3
2018-11-09 07:25:50.104395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1061] 0: Y N N N
2018-11-09 07:25:50.104402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1061] 1: N Y N N
2018-11-09 07:25:50.104409: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1061] 2: N N Y N
2018-11-09 07:25:50.104416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1061] 3: N N N Y
2018-11-09 07:25:50.104429: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla M60, pci bus id: 0000:00:1b.0, compute capability: 5.2)
2018-11-09 07:25:50.104439: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: Tesla M60, pci bus id: 0000:00:1c.0, compute capability: 5.2)
2018-11-09 07:25:50.104446: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: Tesla M60, pci bus id: 0000:00:1d.0, compute capability: 5.2)
2018-11-09 07:25:50.104455: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:3) -> (device: 3, name: Tesla M60, pci bus id: 0000:00:1e.0, compute capability: 5.2)


When I run nvidia-smi, I get the following output:



+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M60           Off  | 0000:00:1B.0     Off |                    0 |
| N/A   52C    P0   129W / 150W |   7382MiB /  7612MiB |     92%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla M60           Off  | 0000:00:1C.0     Off |                    0 |
| N/A   33C    P0    38W / 150W |   7237MiB /  7612MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla M60           Off  | 0000:00:1D.0     Off |                    0 |
| N/A   40C    P0    38W / 150W |   7237MiB /  7612MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla M60           Off  | 0000:00:1E.0     Off |                    0 |
| N/A   34C    P0    39W / 150W |   7237MiB /  7612MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     97860    C   python                                        7378MiB |
|    1     97860    C   python                                        7233MiB |
|    2     97860    C   python                                        7233MiB |
|    3     97860    C   python                                        7233MiB |
+-----------------------------------------------------------------------------+


and nvidia-smi dmon provides the following output:



# gpu   pwr  temp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     %     %     %     %   MHz   MHz
    0   158    69    90    69     0     0  2505  1177
    1    38    36     0     0     0     0  2505   556
    2    38    45     0     0     0     0  2505   556
    3    39    37     0     0     0     0  2505   556


I am confused by these outputs. I read the console output as showing that the program recognizes all 4 GPUs. In the nvidia-smi output, however, the volatile GPU-Util percentage is non-zero only for the first GPU; for the other three it is zero. Yet the process table at the bottom of the same output shows memory in use on all 4 GPUs. Likewise, nvidia-smi dmon shows a non-zero sm value only for the first GPU. From this blog I understand that a zero in dmon indicates that the GPU is idle.



What I want to understand is whether train.py uses all 4 GPUs on my instance. If it does not, how do I make sure that TensorFlow's object_detection/train.py uses all of them?










  • Maybe also: unix.stackexchange.com/questions/16407/…
    – Ciro Santilli 新疆改造中心 六四事件 法轮功
    Nov 9 at 7:38















python tensorflow gpu






asked Nov 9 at 7:36









Apricot

1 Answer
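First check whether TensorFlow can see a GPU at all:



tf.test.gpu_device_name()


This returns the name of a GPU device if one is available, or the empty string otherwise. Note that it returns a single device name, not a list of all GPUs.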
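Because tf.test.gpu_device_name() reports only one device name, a fuller listing may be more useful on a multi-GPU machine. The following is a minimal sketch, assuming TensorFlow is installed and that the tensorflow.python.client.device_lib module is available in your version; the helper names gpu_device_names and local_devices are made up for this illustration. Keep in mind that a device being listed only shows that TensorFlow can see it, not that the training job runs work on it.

```python
def gpu_device_names(devices):
    """Return the names of entries whose device type is GPU.

    `devices` is an iterable of (name, device_type) pairs.
    """
    return [name for name, device_type in devices if device_type == "GPU"]


def local_devices():
    """List (name, device_type) pairs for the devices TensorFlow can see."""
    try:
        # Assumption: TensorFlow is installed and exposes this module.
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    return [(d.name, d.device_type) for d in device_lib.list_local_devices()]


if __name__ == "__main__":
    # On the 4-GPU instance from the question, this would be expected
    # to list four GPU device names.
    print(gpu_device_names(local_devices()))
```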



Then you can do something like this to place work explicitly on several GPUs:



import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.ConfigProto)

# Creates a graph with one matmul on each of the listed GPUs.
c = []
for d in ['/device:GPU:2', '/device:GPU:3']:
    with tf.device(d):
        a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
        b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
        c.append(tf.matmul(a, b))
# Adds the per-GPU results together on the CPU.
with tf.device('/cpu:0'):
    total = tf.add_n(c)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(total))


You should see output like the following:



Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K20m, pci bus id: 0000:02:00.0
/job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: Tesla K20m, pci bus id: 0000:03:00.0
/job:localhost/replica:0/task:0/device:GPU:2 -> device: 2, name: Tesla K20m, pci bus id: 0000:83:00.0
/job:localhost/replica:0/task:0/device:GPU:3 -> device: 3, name: Tesla K20m, pci bus id: 0000:84:00.0
Const_3: /job:localhost/replica:0/task:0/device:GPU:3
Const_2: /job:localhost/replica:0/task:0/device:GPU:3
MatMul_1: /job:localhost/replica:0/task:0/device:GPU:3
Const_1: /job:localhost/replica:0/task:0/device:GPU:2
Const: /job:localhost/replica:0/task:0/device:GPU:2
MatMul: /job:localhost/replica:0/task:0/device:GPU:2
AddN: /job:localhost/replica:0/task:0/cpu:0
[[  44.   56.]
 [  98.  128.]]
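To check whether the GPUs are actually busy while training runs, rather than merely visible, the nvidia-smi dmon output from the question can be inspected programmatically. The sketch below is only an illustration: parse_dmon and idle_gpus are hypothetical helper names, and it assumes the two-line "#" header layout shown in the question. It is worth remembering that, by default, TensorFlow reserves most of the memory on every GPU visible to the process, so high memory usage on a GPU does not by itself mean that computation is running there; the sm column is the more telling signal.

```python
def parse_dmon(text):
    """Parse `nvidia-smi dmon` text into {gpu_index: {column: value}}.

    If a GPU appears in several samples, the latest sample wins.
    """
    columns = None
    rows = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # The first header line names the columns; the second gives units.
            if columns is None:
                columns = line.lstrip("#").split()
            continue
        record = dict(zip(columns, line.split()))
        gpu = int(record.pop("gpu"))
        # dmon may print "-" for unsupported metrics; store those as None.
        rows[gpu] = {k: int(v) if v.isdigit() else None for k, v in record.items()}
    return rows


def idle_gpus(rows, threshold=0):
    """Return the sorted indices of GPUs whose sm utilization is <= threshold."""
    return sorted(
        gpu for gpu, r in rows.items()
        if r["sm"] is not None and r["sm"] <= threshold
    )


# Sample taken from the dmon output in the question.
SAMPLE = """
# gpu   pwr  temp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     %     %     %     %   MHz   MHz
    0   158    69    90    69     0     0  2505  1177
    1    38    36     0     0     0     0  2505   556
    2    38    45     0     0     0     0  2505   556
    3    39    37     0     0     0     0  2505   556
"""

print(idle_gpus(parse_dmon(SAMPLE)))  # → [1, 2, 3]
```

For the sample in the question this reports GPUs 1, 2 and 3 as idle, which matches the 0% GPU-Util values in the nvidia-smi table: only GPU 0 is doing compute work.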





        answered Nov 9 at 7:45









        CSMaverick
