Enable the Slurm Operator add-on for GKE

This document explains how to enable and disable the Slurm Operator add-on for your Google Kubernetes Engine (GKE) clusters. The Slurm Operator add-on is only supported on Standard clusters; it is not supported on Autopilot clusters. The Slurm Operator add-on runs components in the Kubernetes control plane to manage Slurm workloads within the cluster.

Before you begin

Before you start, make sure that you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the gcloud components update command. Earlier gcloud CLI versions might not support running the commands in this document.

Enable the Slurm Operator add-on on a GKE cluster

You can enable the Slurm Operator add-on on new or existing Standard GKE clusters by using the Google Cloud CLI. To create a new cluster with the add-on enabled, run the following command:

gcloud container clusters create CLUSTER_NAME \
    --location=LOCATION \
    --cluster-version=VERSION \
    --addons=SlurmOperator

Replace the following:

  • CLUSTER_NAME: the name of the new cluster.
  • LOCATION: the region of the cluster.
  • VERSION: the GKE version, which must be 1.35.2-gke.1842000 or later. You can also use the --release-channel option to select a release channel. The release channel must have a default version of 1.35.2-gke.1842000 or later.
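
The version requirement can be checked before you create a cluster. The following is a minimal sketch, not an official tool: meets_min_version is a hypothetical helper name, and the comparison assumes GNU sort, whose -V flag orders version strings numerically.

```shell
# Minimum GKE version for the Slurm Operator add-on, as stated above.
MIN_VERSION="1.35.2-gke.1842000"

# Hypothetical helper: succeeds if the version in $1 is MIN_VERSION or later.
# Assumes GNU sort; -V sorts version strings component by component.
meets_min_version() {
  local candidate="$1"
  local lowest
  # After a version sort, the first line is the lower of the two versions.
  lowest="$(printf '%s\n%s\n' "$MIN_VERSION" "$candidate" | sort -V | head -n 1)"
  [ "$lowest" = "$MIN_VERSION" ]
}
```

For example, you could pass the helper the default version of a release channel, which the gcloud container get-server-config --location=LOCATION command reports.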

You can enable the Slurm Operator add-on on an existing cluster by using the gcloud container clusters update command and appending the --update-addons=SlurmOperator=ENABLED flag.
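
The update described above might look like the following sketch. It is written as a shell function so that the cluster name and location are passed as arguments; enable_slurm_operator is a hypothetical name used for illustration, and the function only wraps the command and flag named in this section.

```shell
# Hypothetical wrapper: enables the Slurm Operator add-on on an existing cluster.
#   $1: name of the existing cluster
#   $2: region of the cluster
enable_slurm_operator() {
  local cluster_name="$1"
  local location="$2"
  gcloud container clusters update "$cluster_name" \
      --location="$location" \
      --update-addons=SlurmOperator=ENABLED
}
```

To use it, call the function with your own values in place of the placeholders: enable_slurm_operator CLUSTER_NAME LOCATION.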

Verify the Slurm Operator add-on is enabled

To verify that the Slurm Operator add-on is enabled on a cluster, run the following gcloud CLI command:

gcloud container clusters describe CLUSTER_NAME \
    --location=LOCATION

Replace the following:

  • CLUSTER_NAME: the name of the cluster.
  • LOCATION: the region of the cluster.

The output should be similar to the following:

# Several lines omitted
addonsConfig:
  slurmOperatorConfig:
    enabled: true

This output indicates that the Slurm Operator add-on is enabled for the cluster.
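
If you want to check the setting from a script instead of reading the full output, the gcloud CLI --format flag can print only one field. The following is a sketch under assumptions: slurm_operator_enabled is a hypothetical helper name, and the field path is taken from the sample output above.

```shell
# Hypothetical helper: succeeds only if the add-on is reported as enabled.
#   $1: name of the cluster
#   $2: region of the cluster
slurm_operator_enabled() {
  local cluster_name="$1"
  local location="$2"
  local value
  # Print only the enabled field; the output is empty if the field is absent.
  value="$(gcloud container clusters describe "$cluster_name" \
      --location="$location" \
      --format="value(addonsConfig.slurmOperatorConfig.enabled)")"
  # The boolean may be printed with a capital letter, so normalize the case
  # before comparing.
  [ "$(printf '%s' "$value" | tr '[:upper:]' '[:lower:]')" = "true" ]
}
```

The helper returns a non-zero exit status when the field is missing or is not true, so it can be used directly in an if statement.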

Disable the Slurm Operator add-on for a cluster

To disable the Slurm Operator add-on on an existing cluster, run the following command:

gcloud container clusters update CLUSTER_NAME \
    --location=LOCATION \
    --update-addons=SlurmOperator=DISABLED

Replace the following:

  • CLUSTER_NAME: the name of the existing cluster.
  • LOCATION: the region of the cluster.

You can verify that the Slurm Operator add-on is disabled by re-running the gcloud container clusters describe command. The slurmOperatorConfig section should show enabled: false.

What's next