Automation Suite on Linux Installation Guide
Last updated Nov 21, 2024

GPU node affected by resource unavailability

Description

When configuring a GPU node in Automation Suite 2023.4.0 or 2023.4.1, the node might not report its GPU as an allocatable resource (nvidia.com/gpu).

To check if the GPU node is affected by this issue, run the following command:

kubectl describe node <GPU>
If the Allocatable section of the output does not list nvidia.com/gpu, as in the following sample, the node is affected by this issue.
Allocatable:
  cpu:                5400m
  ephemeral-storage:  51938908890
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             113173836Ki
  pods:               500
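
For scripting this check, the same test can be sketched as a small shell helper. This is an illustration only: the function name gpu_in_allocatable and the GPU_NODE variable are hypothetical, and the helper assumes that the kubectl describe node output has the layout shown in the sample above (an unindented "Allocatable:" heading followed by indented resource lines).

```shell
# Hypothetical helper: reads `kubectl describe node` output on stdin and
# succeeds (exit status 0) only if the Allocatable section lists nvidia.com/gpu.
gpu_in_allocatable() {
  awk '
    # Start of the Allocatable section.
    /^Allocatable:/ { in_section = 1; next }
    # Any other unindented line ends the section.
    in_section && /^[^[:space:]]/ { in_section = 0 }
    # Resource lines are indented; the first field is the resource name.
    in_section && $1 == "nvidia.com/gpu:" { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Usage (GPU_NODE is an assumed variable holding the node name):
#   if kubectl describe node "$GPU_NODE" | gpu_in_allocatable; then
#     echo "nvidia.com/gpu is allocatable"
#   else
#     echo "nvidia.com/gpu is missing; the node is affected"
#   fi
```

Restricting the match to the Allocatable section avoids a false positive when nvidia.com/gpu appears elsewhere in the output.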

Solution

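To fix this issue, run the following commands on the GPU node. The awk command writes a copy of the containerd configuration to config.toml.tmpl with default_runtime_name set to "nvidia", and the remaining commands restart the RKE2 agent so that the change takes effect: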

awk '1;/plugins."io.containerd.grpc.v1.cri".containerd]/{print " default_runtime_name = \"nvidia\""}' /var/lib/rancher/rke2/agent/etc/containerd/config.toml > /var/lib/rancher/rke2/agent/etc/containerd/config.toml.tmpl
systemctl stop rke2-agent
rke2-killall.sh
systemctl start rke2-agent
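
To preview what the awk command changes before stopping the agent, the same awk program can be wrapped in a function that reads from standard input and writes to standard output. This is a sketch under assumptions: the function name insert_nvidia_default_runtime is hypothetical, and the preview relies on the configuration containing a [plugins."io.containerd.grpc.v1.cri".containerd] section header, which is what the awk pattern matches.

```shell
# Hypothetical wrapper around the awk program from the commands above.
# It prints every input line unchanged and, right after the containerd
# CRI section header, adds the default_runtime_name setting.
insert_nvidia_default_runtime() {
  awk '1;/plugins."io.containerd.grpc.v1.cri".containerd]/{print " default_runtime_name = \"nvidia\""}'
}

# Usage: show the difference without writing any file.
#   CONFIG=/var/lib/rancher/rke2/agent/etc/containerd/config.toml
#   insert_nvidia_default_runtime < "$CONFIG" | diff "$CONFIG" -
```

If the preview shows no added line, the section header was not found. If the configuration already contains default_runtime_name, running the program again would add a second copy of the line.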

To verify that the GPU resource now shows up, run the following command:

kubectl describe node <GPU>
In the following sample, you can see that nvidia.com/gpu is present, so the GPU issue no longer occurs.
Allocatable:
  cpu:                5400m
  ephemeral-storage:  51938908890
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             113173836Ki
  nvidia.com/gpu:     1
  pods:               500
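
As a more compact alternative for verification, the allocatable GPU count can be read directly from the node object instead of scanning the describe output. This is a sketch, not part of the official procedure: the function names are hypothetical, and it assumes that kubectl prints an empty string when the nvidia.com/gpu key is absent from .status.allocatable.

```shell
# Hypothetical helper: prints the allocatable nvidia.com/gpu value for the
# node named in $1. The dot inside the resource name is escaped for JSONPath.
gpu_allocatable_count() {
  kubectl get node "$1" -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
}

# Hypothetical helper: reports whether the node named in $1 exposes a GPU.
report_gpu_status() {
  count=$(gpu_allocatable_count "$1")
  if [ -n "$count" ] && [ "$count" != "0" ]; then
    echo "nvidia.com/gpu allocatable: $count"
  else
    echo "nvidia.com/gpu not allocatable"
    return 1
  fi
}

# Usage (GPU_NODE is an assumed variable holding the node name):
#   report_gpu_status "$GPU_NODE"
```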

© 2005-2024 UiPath. All rights reserved.