Kubernetes Task Runner
Available on: Enterprise Edition · Kestra Cloud · >= 0.18.0
Run tasks as Kubernetes pods.
Overview
This plugin is available only in the Enterprise Edition (EE) and Kestra Cloud. The task runner is container-based, so the containerImage property must be set. To access the task's working directory, use either the {{workingDir}} Pebble expression or the WORKING_DIR environment variable. Input files and namespace files are available in this directory.
To generate output files, you can either:
- Use the outputFiles property of the task and create a file with the same name in the task's working directory, or
- Create any file in the output directory, accessible via the {{outputDir}} Pebble expression or the OUTPUT_DIR environment variable, as shown in the sketch below.
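For example, the following task produces output files both ways (a minimal sketch; the cluster connection config is omitted and the file names are illustrative):

id: output_files_example
namespace: company.team
tasks:
  - id: produce_outputs
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      # cluster connection config omitted for brevity
    outputFiles:
      - "out.txt"
    commands:
      # Method 1: create a file matching an outputFiles entry in the working directory
      - echo "via outputFiles" > out.txt
      # Method 2: write any file to the output directory
      - echo "via output dir" > "$OUTPUT_DIR/other.txt"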
When the Kestra Worker running this task is terminated, the pod continues until completion. After restarting, the Worker resumes processing on the existing pod unless resume is set to false.
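For example, to always start a fresh pod rather than reattach after a Worker restart, set resume on the task runner (a minimal sketch showing only the relevant properties):

    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      resume: false # do not reattach to the existing pod; start a new one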
If your cluster is configured with RBAC, the service account running your pod must have the following authorizations:
- pods: get, create, delete, watch, list
- pods/log: get, watch
- pods/exec: get, watch
Here is an example role that grants these authorizations:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: task-runner
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "create", "delete", "watch", "list"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["get", "watch"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get", "watch"]
How to use the Kubernetes task runner
The Kubernetes task runner executes tasks in a specified Kubernetes cluster. It is useful for declaring resource limits and resource requests.
Here is an example of a workflow with a task running shell commands in a Kubernetes pod:
id: kubernetes_task_runner
namespace: company.team
description: |
  To get the kubeconfig file, run: `kubectl config view --minify --flatten`.
  Then, copy the values to the configuration below.
  Here is how Kubernetes task runner properties (on the left) map to the kubeconfig file's properties (on the right):
  - clientKeyData: client-key-data
  - clientCertData: client-certificate-data
  - caCertData: certificate-authority-data
  - masterUrl: server, e.g., https://docker-for-desktop:6443
  - oauthToken: token (if using OAuth, e.g., GKE/EKS)
inputs:
  - id: file
    type: FILE
tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    inputFiles:
      data.txt: "{{ inputs.file }}"
    outputFiles:
      - "*.txt"
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      config:
        clientKeyData: client-key-data
        clientCertData: client-certificate-data
        caCertData: certificate-authority-data
        masterUrl: server # e.g., https://docker-for-desktop:6443
    commands:
      - echo "Hello from a Kubernetes task runner!"
      - cp data.txt out.txt
To deploy Kubernetes with Docker Desktop, see this guide.
To install kubectl, see this guide.
File handling
If your script task has inputFiles or namespaceFiles configured, an init container uploads files into the main container.
If your script task has outputFiles configured, a sidecar container downloads files from the main container.
All containers use an in-memory emptyDir volume for file exchange.
Failure scenarios
If a task is resubmitted (for example, due to a retry or a Worker crash), the new Worker will reattach to the existing (or completed) pod instead of starting a new one.
Specifying resource requests for Python scripts
Some Python scripts may require more resources than others. You can specify resource requests in the resources property of the task runner.
id: kubernetes_resources
namespace: company.team
tasks:
  - id: python_script
    type: io.kestra.plugin.scripts.python.Script
    containerImage: ghcr.io/kestra-io/pydata:latest
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      namespace: default
      pullPolicy: ALWAYS
      config:
        username: docker-desktop
        masterUrl: https://docker-for-desktop:6443
        caCertData: xxx
        clientCertData: xxx
        clientKeyData: xxx
      resources:
        request:
          cpu: "500m"
          memory: "128Mi"
    outputFiles:
      - "*.json"
    script: |
      import platform
      import socket
      import sys
      import json
      from kestra import Kestra
      print("Hello from a Kubernetes runner!")
      host = platform.node()
      py_version = platform.python_version()
      platform_info = platform.platform()
      os_arch = f"{sys.platform}/{platform.machine()}"
      def print_environment_info():
          print(f"Host name: {host}")
          print(f"Python version: {py_version}")
          print(f"Platform: {platform_info}")
          print(f"OS/Arch: {os_arch}")
          env_info = {
              "host": host,
              "platform": platform_info,
              "os_arch": os_arch,
              "python_version": py_version,
          }
          Kestra.outputs(env_info)
          with open("environment_info.json", "w") as json_file:
              json.dump(env_info, json_file, indent=4)
      if __name__ == "__main__":
          print_environment_info()
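To also cap resource usage, you can declare a limit alongside the request (a sketch; the limit block is an assumption mirroring the request syntax above, with illustrative values):

      resources:
        request:
          cpu: "500m"
          memory: "128Mi"
        limit: # assumed to mirror the request syntax; values are illustrative
          cpu: "1"
          memory: "256Mi"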
For a full list of Kubernetes task runner properties, see the Kubernetes plugin documentation or explore them in the built-in Code Editor in the Kestra UI.
Using plugin defaults to avoid repetition
You can use pluginDefaults to avoid repeating configuration across multiple tasks. For example, you can set the pullPolicy to ALWAYS for all tasks in a namespace. Setting forced: true makes the default take precedence even when a task defines its own value:
id: k8s_taskrunner
namespace: company.team
tasks:
  - id: parallel
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: run_command
        type: io.kestra.plugin.scripts.python.Commands
        containerImage: ghcr.io/kestra-io/kestrapy:latest
        commands:
          - pip show kestra
      - id: run_python
        type: io.kestra.plugin.scripts.python.Script
        containerImage: ghcr.io/kestra-io/pydata:latest
        script: |
          import socket

          # hostname was undefined in the original script; resolve it first
          hostname = socket.gethostname()
          ip_address = socket.gethostbyname(hostname)
          print("Hello from AWS EKS and Kestra!")
          print(f"Host IP Address: {ip_address}")
pluginDefaults:
  - type: io.kestra.plugin.scripts.python
    forced: true
    values:
      taskRunner:
        type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
        namespace: default
        pullPolicy: ALWAYS
        config:
          username: docker-desktop
          masterUrl: https://docker-for-desktop:6443
          caCertData: |-
            placeholder
          clientCertData: |-
            placeholder
          clientKeyData: |-
            placeholder
Guides
Below are several guides to help you set up the Kubernetes task runner on different platforms.
Google Kubernetes Engine (GKE)
Before you begin
Before starting, ensure you have the following:
- A Google Cloud account.
- A Kestra instance (version 0.18.0 or later) with Google credentials stored as secrets or environment variables.
Set up Google Cloud
In Google Cloud, perform the following steps:
- Create and select a project.
- Create a GKE cluster.
- Enable the Kubernetes Engine API.
- Set up the gcloud CLI with kubectl.
- Create a service account.
To authenticate with Google Cloud, create a service account and add a JSON key to Kestra. Read more in our Google credentials guide. For GKE, ensure the Kubernetes Engine default node service account role is assigned to your service account.
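For example, you can generate the JSON key with the gcloud CLI (a sketch; the service account email and key file name are placeholders):

gcloud iam service-accounts keys create key.json \
  --iam-account=my-service-account@my-project.iam.gserviceaccount.com

Then store the contents of key.json as a secret in Kestra.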
Creating a flow
Here's an example flow using the Kubernetes task runner with GKE. To authenticate, use OAuth with a service account.
id: gke_task_runner
namespace: company.team
tasks:
  - id: metadata
    type: io.kestra.plugin.gcp.gke.ClusterMetadata
    clusterId: kestra-dev-gke
    clusterZone: "europe-west1"
    clusterProjectId: kestra-dev
  - id: auth
    type: io.kestra.plugin.gcp.auth.OauthAccessToken
  - id: pod
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: ubuntu
    commands:
      - echo "Hello from a Kubernetes task runner!"
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      namespace: default
      config:
        caCertData: "{{ outputs.metadata.masterAuth.clusterCaCertificate }}"
        masterUrl: "https://{{ outputs.metadata.endpoint }}"
        oauthToken: "{{ outputs.auth.accessToken['tokenValue'] }}"
Use the gcloud CLI to get credentials such as masterUrl and caCertData:
gcloud container clusters get-credentials clustername --region myregion --project projectid
Update the following arguments with your own values:
- clusterId: the name of your cluster.
- clusterZone: the region of your cluster (for example, europe-west2).
- clusterProjectId: the ID of your Google Cloud project.
After running the command, access your config with kubectl config view --minify --flatten to replace caCertData, masterUrl, and username.
Amazon Elastic Kubernetes Service (EKS)
Here's an example flow using the Kubernetes task runner with AWS EKS. To authenticate, you need an OAuth token.
id: eks_task_runner
namespace: company.team
tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    containerImage: centos
    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
      config:
        caCertData: "{{ secret('certificate-authority-data') }}"
        masterUrl: https://xxx.xxx.region.eks.amazonaws.com
        username: arn:aws:eks:region:xxx:cluster/cluster_name
        oauthToken: xxx
    commands:
      - echo "Hello from a Kubernetes task runner!"