Adding a Node in Your Saagie Cluster
Adding Nodes With GPU
- Configure the Node resource by adding the nvidia.com/gpu=present:NoSchedule taint with the following command line:

      kubectl taint nodes [NODE_NAME] nvidia.com/gpu=present:NoSchedule

  For more information, see the Kubernetes documentation about Taints and Tolerations.
- Activate the GPU support in the Saagie settings component for each platform where jobs using GPU can be scheduled. This can be done during the installation of the platform.
- Retrieve your configuration status by running the following command lines:

      # Authentication query (no spaces around "=" in a shell assignment)
      TOKEN=$(curl -X POST \
        -H "Content-Type:application/json" \
        -H "Saagie-Realm:<realm>" \
        "https://<saagie_host>/authentication/api/open/authenticate" \
        --data '{"login":"<username>", "password":"<password>"}')

      # Query reading GPU setting
      curl -X GET \
        -H "Content-Type:application/json" \
        -H "Saagie-Realm:<realm>" \
        -H "Authorization: Bearer $TOKEN" \
        "https://<saagie_host>/settings/api/v1/settings/platform/<platform_id>/gpu"

  Where:
  - <realm> is the prefix that was determined during Saagie installation.
  - <prefix> must be replaced with the same value determined for your DNS entry at the beginning of the installation process.
  - <saagie_host> is your Saagie URL.
  - <username> and <password> must be the credentials of an admin user.
  - <platform_id> is the ID of the platform being configured.
- Activate the ExtendedResourceToleration admission controller on the Kubernetes cluster to schedule jobs on the GPU node.
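To check the taint step above, the following sketch tests whether a node carries the nvidia.com/gpu=present:NoSchedule taint. It assumes kubectl is already configured for the target cluster. The function names node_taints and node_has_gpu_taint are illustrative helpers and are not part of Saagie or Kubernetes.

```shell
# Minimal sketch, assuming kubectl points at the cluster that hosts Saagie.
# Taint expected on GPU nodes, as given in the taint step.
GPU_TAINT="nvidia.com/gpu=present:NoSchedule"

# Print every taint of the node as key=value:effect, one per line.
# $1: node name
node_taints() {
  kubectl get node "$1" \
    -o jsonpath='{range .spec.taints[*]}{.key}={.value}:{.effect}{"\n"}{end}'
}

# Exit 0 if the node has the GPU taint, non-zero otherwise.
# $1: node name
node_has_gpu_taint() {
  node_taints "$1" | grep -qxF "$GPU_TAINT"
}
```

For example, node_has_gpu_taint [NODE_NAME] && echo "GPU taint present" prints the message only when the taint has been applied to that node.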
Adding Nodes Dedicated to the saagie-common Namespace
When a node is dedicated to the saagie-common namespace, the Saagie platform will only run from that node.
A node with the label saagie-common will not run anything other than the Saagie platform.
If you need to dedicate multiple nodes to the saagie-common namespace, make sure all dedicated nodes have the same label key/value pairs.
If you have installed Saagie without assigning nodes to the saagie-common namespace and would like to do so now, contact the Saagie support team.
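When several nodes are dedicated, one way to confirm that they all carry the same label is to compare how many nodes have the label key with how many have the exact key/value pair. The sketch below assumes kubectl is configured for the cluster. The label key and value are passed as arguments because this page does not state them, and the function names are illustrative only.

```shell
# Minimal sketch, assuming kubectl points at the cluster that hosts Saagie.

# Count the nodes matching a label selector.
# $1: label selector, either "key" or "key=value"
count_nodes() {
  kubectl get nodes -l "$1" -o name | wc -l | tr -d ' '
}

# Exit 0 if at least one node has the label key and every node that has
# the key also has the expected value.
# $1: label key (hypothetical, use the key chosen for your installation)
# $2: expected label value
dedicated_labels_consistent() {
  with_key=$(count_nodes "$1")
  with_pair=$(count_nodes "$1=$2")
  [ "$with_key" -gt 0 ] && [ "$with_key" -eq "$with_pair" ]
}
```

A non-zero exit status means that either no node has the label key or at least one labeled node has a different value.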