This document explains how to enable and disable the Slurm Operator add-on for your Google Kubernetes Engine (GKE) clusters. The Slurm Operator add-on is only supported on Standard clusters; it is not supported on Autopilot clusters. The Slurm Operator add-on runs components in the Kubernetes control plane to manage Slurm workloads within the cluster.
Before you begin
Before you start, make sure that you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the gcloud components update command. Earlier gcloud CLI versions might not support running the commands in this document.
Enable the Slurm Operator add-on on a GKE cluster
You can enable the Slurm Operator add-on on new or existing Standard GKE clusters by using the Google Cloud CLI. To create a new cluster with the add-on enabled, run the following command:
gcloud container clusters create CLUSTER_NAME \
--location=LOCATION \
--cluster-version=VERSION \
--addons=SlurmOperator
Replace the following:
- CLUSTER_NAME: the name of the new cluster.
- LOCATION: the region of the cluster.
- VERSION: the GKE version, which must be 1.35.2-gke.1842000 or later. You can also use the --release-channel option to select a release channel. The release channel must have a default version of 1.35.2-gke.1842000 or later.
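If you script cluster creation, you might want to check a candidate version against the 1.35.2-gke.1842000 minimum before passing it to --cluster-version. The following is a minimal sketch in POSIX shell and is not part of the gcloud CLI: the helper names split_gke_version and version_at_least are hypothetical, and the sketch assumes version strings of the form MAJOR.MINOR.PATCH-gke.BUILD.

```shell
# Minimum GKE version stated in this document.
MIN_VERSION="1.35.2-gke.1842000"

# Print the four numeric fields of a version such as 1.35.2-gke.1842000
# as "MAJOR MINOR PATCH BUILD". Prints nothing if the format is unexpected.
split_gke_version() {
  printf '%s\n' "${1:-}" | sed -n \
    's/^\([0-9]\{1,\}\)\.\([0-9]\{1,\}\)\.\([0-9]\{1,\}\)-gke\.\([0-9]\{1,\}\)$/\1 \2 \3 \4/p'
}

# Return 0 if version $1 is at least version $2, 1 if it is lower,
# and 2 if either string cannot be parsed.
version_at_least() {
  have="$(split_gke_version "${1:-}")"
  want="$(split_gke_version "${2:-}")"
  if [ -z "${have}" ] || [ -z "${want}" ]; then
    return 2
  fi
  # Word splitting is intentional: $1..$4 hold the candidate fields,
  # $5..$8 hold the minimum fields.
  set -- ${have} ${want}
  for pair in "$1:$5" "$2:$6" "$3:$7"; do
    left="${pair%%:*}"
    right="${pair##*:}"
    if [ "${left}" -gt "${right}" ]; then return 0; fi
    if [ "${left}" -lt "${right}" ]; then return 1; fi
  done
  # Major, minor, and patch are equal, so the build number decides.
  [ "$4" -ge "$8" ]
}
```

For example, `if version_at_least "$VERSION" "$MIN_VERSION"; then ...; fi` lets a script stop early instead of sending a version that is too old to the create command.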
You can enable the Slurm Operator add-on on an existing cluster by using the gcloud container clusters update command with the --update-addons=SlurmOperator=ENABLED flag.
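Put together, the command and flag described above might look like the following sketch. The wrapper function enable_slurm_operator is a hypothetical name used only for illustration, and the --location flag is assumed to apply here the same way it does in the other commands in this document.

```shell
# Enable the Slurm Operator add-on on an existing Standard cluster.
# Usage: enable_slurm_operator CLUSTER_NAME LOCATION
# (enable_slurm_operator is a hypothetical helper, not a gcloud command.)
enable_slurm_operator() {
  cluster_name="${1:-}"
  location="${2:-}"
  if [ -z "${cluster_name}" ] || [ -z "${location}" ]; then
    echo "usage: enable_slurm_operator CLUSTER_NAME LOCATION" >&2
    return 2
  fi
  gcloud container clusters update "${cluster_name}" \
    --location="${location}" \
    --update-addons=SlurmOperator=ENABLED
}
```

After sourcing the function, you would call it with your own values in place of CLUSTER_NAME and LOCATION.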
Verify the Slurm Operator add-on is enabled
You can verify that the Slurm Operator add-on is enabled on a cluster by using the gcloud CLI.
gcloud container clusters describe CLUSTER_NAME \
--location=LOCATION
Replace the following:
- CLUSTER_NAME: the name of the cluster.
- LOCATION: the region of the cluster.
The output should be similar to the following:
# Several lines omitted
addonsConfig:
  slurmOperatorConfig:
    enabled: true
This output indicates that the Slurm Operator add-on is enabled for the cluster.
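For scripts, you might prefer to read only the enabled field instead of scanning the full output. The following sketch assumes the field path addonsConfig.slurmOperatorConfig.enabled shown in the output above and uses the gcloud CLI --format flag with a value() projection. The function name slurm_operator_enabled is hypothetical, and the comparison ignores case because the formatter may print the boolean as True rather than true.

```shell
# Return 0 if the Slurm Operator add-on is reported as enabled,
# 1 if it is not, and 2 if the describe command itself fails.
# Usage: slurm_operator_enabled CLUSTER_NAME LOCATION
# (slurm_operator_enabled is a hypothetical helper, not a gcloud command.)
slurm_operator_enabled() {
  cluster_name="${1:-}"
  location="${2:-}"
  # Ask gcloud for only the add-on's enabled field.
  state="$(gcloud container clusters describe "${cluster_name}" \
    --location="${location}" \
    --format='value(addonsConfig.slurmOperatorConfig.enabled)')" || return 2
  # Normalize case so that both "True" and "true" are accepted.
  state="$(printf '%s' "${state}" | tr '[:upper:]' '[:lower:]')"
  [ "${state}" = "true" ]
}
```

An empty or false value makes the function return 1, so the same check can also confirm that the add-on has been turned off.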
Disable the Slurm Operator add-on for a cluster
To disable the Slurm Operator add-on on an existing cluster, run the following command:
gcloud container clusters update CLUSTER_NAME \
--location=LOCATION \
--update-addons=SlurmOperator=DISABLED
Replace the following:
- CLUSTER_NAME: the name of the existing cluster.
- LOCATION: the region of the cluster.
You can verify that the Slurm Operator add-on is disabled by re-running the gcloud container clusters describe command. The slurmOperatorConfig section should show enabled: false.
What's next
- Learn how to deploy a full Slurm cluster on GKE in the Quickstart: Deploy a Slurm cluster on GKE