
Agent Substrate#

This example walks from an empty machine to a declarative agent running inside a gVisor actor on Agent Substrate. Everything pulls from published OCI charts — no source builds, no repo clones.

By the end you'll have:

  • A kind cluster running Agent Substrate 0.0.6 in ate-system.
  • kagent 0.9.7 or later installed with the substrate integration enabled. Earlier releases lack the controller wiring that lets a SandboxAgent target substrate.
  • A SandboxAgent running on substrate, reachable from the kagent UI.

For background on what substrate is and how it differs from a per-pod agent runtime, see the Agent Substrate concept page. This guide does not cover the AgentHarness path on substrate.

Before you begin#

You will need:

  • kind, kubectl, and helm on your PATH.
  • A running Docker daemon (Docker Desktop or equivalent).
  • An OpenAI API key exported in your shell.
export OPENAI_API_KEY="sk-..."
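
The prerequisites above can be checked up front with a small helper. This is a minimal bash sketch, not part of kagent or substrate; the function name missing_tools is made up for illustration.

```shell
# Hypothetical helper: prints each required tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

if missing_tools kind kubectl helm; then
  echo "all required tools found"
fi
```

To confirm the Docker daemon is reachable as well, docker info exits non-zero when it cannot connect.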

Step 1: Create a kind cluster#

kind create cluster --name kagent-substrate

The substrate 0.0.6 chart defaults to JWT auth backed by Kubernetes ServiceAccount tokens, so a vanilla kind cluster works — no feature gates or custom kind config are required.

Step 2: Install Agent Substrate#

Install the CRDs first, then the substrate control plane and data plane.

helm upgrade --install substrate-crds \
  oci://ghcr.io/kagent-dev/substrate/helm/substrate-crds \
  --version 0.0.6 \
  --namespace ate-system --create-namespace --wait

helm upgrade --install substrate \
  oci://ghcr.io/kagent-dev/substrate/helm/substrate \
  --version 0.0.6 \
  --namespace ate-system --wait --timeout 10m

Verify the substrate pods are running.

kubectl get pods -n ate-system

You should see ate-api-server, ate-controller, atelet-*, atenet-router, valkey-cluster-{0..5}, and rustfs all Running, plus a few Completed init jobs.
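
With this many pods, it can help to print only the ones that are not yet healthy. The filter below is a minimal sketch that assumes the default kubectl get pods column layout (STATUS in the third column). The function name unhealthy_pods and the sample lines are made up for illustration.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and prints every pod
# whose STATUS column (field 3) is neither Running nor Completed.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " (" $3 ")" }'
}

# Made-up sample lines for illustration only, not real cluster output.
unhealthy_pods <<'EOF'
ate-api-server-abc12   1/1   Running     0   2m
rustfs-0               0/1   Pending     0   2m
init-job-xyz           0/1   Completed   0   2m
EOF
# prints: rustfs-0 (Pending)
```

Against a live cluster, pipe the real listing into it: kubectl get pods -n ate-system --no-headers | unhealthy_pods. Empty output means everything is Running or Completed.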

Step 3: Install kagent with substrate enabled#

Pin the chart to 0.9.7 or later. The controller.substrate.* and substrateWorkerPool.* values used below were introduced in 0.9.7; against an older chart they will be silently ignored and the controller will start without the substrate integration.

Confirm the OpenAI key is set in this shell before running helm. If it's empty, the install still succeeds but does not create the kagent-openai Secret, and the default agent pods land in CreateContainerConfigError.

[[ -n "${OPENAI_API_KEY:-}" ]] && echo "key is set (len=${#OPENAI_API_KEY})" || echo "OPENAI_API_KEY is empty — export it first"

Don't put the variable assignment and the helm command on the same line. In OPENAI_API_KEY="$(cat ...)" helm ... --set providers.openAI.apiKey="${OPENAI_API_KEY}", the shell expands ${OPENAI_API_KEY} before the inline assignment takes effect, so helm receives an empty string. Either export the variable on its own line first, or splice the value in directly with --set providers.openAI.apiKey="$(cat ~/path/to/key)".
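
The expansion-order pitfall can be reproduced without helm. This bash sketch uses printf as a stand-in for the helm command and a throwaway variable, DEMO_KEY, which is made up for illustration.

```shell
unset DEMO_KEY

# Inline assignment: the current shell expands "${DEMO_KEY:-}" while building
# the argument list, before the assignment applies to the command.
inline_result="$(DEMO_KEY="secret" printf '%s' "arg=[${DEMO_KEY:-}]")"
echo "$inline_result"     # prints: arg=[]

# Export on its own line first: the value exists when the argument is expanded.
export DEMO_KEY="secret"
exported_result="$(printf '%s' "arg=[${DEMO_KEY:-}]")"
echo "$exported_result"   # prints: arg=[secret]
```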

Install the CRDs, then kagent with the substrate flags.

helm upgrade --install kagent-crds \
  oci://ghcr.io/kagent-dev/kagent/helm/kagent-crds \
  --version 0.9.7 \
  --namespace kagent --create-namespace --wait

helm upgrade --install kagent \
  oci://ghcr.io/kagent-dev/kagent/helm/kagent \
  --version 0.9.7 \
  --namespace kagent --timeout 10m --wait \
  --set providers.openAI.apiKey="${OPENAI_API_KEY}" \
  --set providers.default=openAI \
  --set controller.substrate.enabled=true \
  --set controller.substrate.ateApiEndpoint=dns:///api.ate-system.svc:443 \
  --set controller.substrate.ateApiInsecure=true \
  --set substrateWorkerPool.create=true \
  --set substrateWorkerPool.replicas=1 \
  --set substrateWorkerPool.ateomImage=ghcr.io/kagent-dev/substrate/ateom-gvisor:v0.0.6

The controller.substrate.* and substrateWorkerPool.* flags are what turn on the substrate integration. The rest is a standard kagent install.
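
If you prefer a values file to a long list of --set flags, the same settings can be written out as YAML. This is a sketch: the file name kagent-substrate-values.yaml is made up, the keys simply mirror the flags in the install command, and the image tag assumes Agent Substrate 0.0.6.

```shell
# Hypothetical file name; the keys mirror the --set flags from the install command.
cat > kagent-substrate-values.yaml <<'EOF'
providers:
  default: openAI
controller:
  substrate:
    enabled: true
    ateApiEndpoint: dns:///api.ate-system.svc:443
    ateApiInsecure: true
substrateWorkerPool:
  create: true
  replicas: 1
  # Assumes Agent Substrate 0.0.6, the version this guide installs.
  ateomImage: ghcr.io/kagent-dev/substrate/ateom-gvisor:v0.0.6
EOF

echo "wrote kagent-substrate-values.yaml"
```

Pass the file to the kagent install with -f kagent-substrate-values.yaml, and keep --set providers.openAI.apiKey="${OPENAI_API_KEY}" on the command line so the key never lands on disk.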

On a cold start the controller restarts a couple of times while it waits for postgres, so helm can hit its --timeout 10m before the release settles. If that happens, wait for the controller manually and continue.

kubectl wait deploy/kagent-controller -n kagent --for=condition=Available --timeout=10m

Sanity-check that the key landed and the default agents are healthy.

kubectl get secret kagent-openai -n kagent # should exist with 1 data entry
kubectl get pods -n kagent | grep -v Running # only header + Completed jobs expected

If you see CreateContainerConfigError on the default agent pods, the secret didn't get created — re-run the kagent helm command with --reuse-values --set providers.openAI.apiKey="$(cat ~/path/to/key)" to patch it in. The deployments will roll to new pods automatically.

Tuning the WorkerPool size#

substrateWorkerPool.replicas=1 is the chart default. One worker is enough for a declarative-only walkthrough: session actors release their slot the moment they snapshot back to object storage, so a single worker can serve many sequential sessions. Bump it when:

  • You add a long-lived AgentHarness. The ahr-<...> actor pins a slot for the lifetime of the CR, so you need at least 1 + (number of harnesses).
  • You want simultaneous, overlapping declarative sessions.

You can change the size three ways, depending on how permanent you want it.

# 1) Quick and ephemeral: scale the live CR. Reverts on the next helm upgrade.
kubectl scale workerpool kagent-default -n kagent --replicas=3

# 2) Persist it in the helm release so it survives upgrades.
helm upgrade kagent oci://ghcr.io/kagent-dev/kagent/helm/kagent \
  --version 0.9.7 --namespace kagent --reuse-values \
  --set substrateWorkerPool.replicas=3

# 3) Fresh install: change the value on the Step 3 install command above.
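
The sizing rule from this section (one slot for sequential declarative sessions, plus one pinned slot per long-lived AgentHarness) is simple arithmetic. A minimal bash sketch follows; the function name min_worker_replicas is made up for illustration. It gives a lower bound only and does not account for overlapping declarative sessions.

```shell
# Minimum replicas per the sizing rule: 1 slot for sequential declarative
# sessions plus 1 pinned slot per long-lived AgentHarness.
min_worker_replicas() {
  local harnesses="${1:-0}"
  echo $(( 1 + harnesses ))
}

min_worker_replicas 0   # prints 1
min_worker_replicas 2   # prints 3
```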

Step 4: Open the kagent UI#

kubectl port-forward -n kagent svc/kagent-ui 8001:8080

Open http://localhost:8001. Skip the first-run wizard if it appears.

Step 5: Create a declarative agent on substrate#

A SandboxAgent with platform: substrate and a substrate.workerPoolRef runs as a substrate actor instead of a plain Deployment. Pick one of the two paths below.

Option A: Via the UI#

  1. CreateAgent → choose Declarative as the type.
  2. Set the basics:
    • Name: hello-substrate
    • Namespace: kagent
    • Model config: default-model-config
    • Runtime: Go (required — the Python ADK isn't supported on substrate today)
    • System message: You are a friendly assistant living inside an Agent Substrate sandbox. When asked who you are, say "I am hello-substrate, a Go ADK declarative agent running inside a gVisor actor."
  3. In the Sandbox / Platform section (the label depends on UI version), set Platform to substrate and select the worker pool kagent-default.
  4. Save.

Option B: Via kubectl#

kubectl apply -f - <<EOF
apiVersion: kagent.dev/v1alpha2
kind: SandboxAgent
metadata:
  name: hello-substrate
  namespace: kagent
spec:
  type: Declarative
  description: Tiny declarative agent running inside a substrate actor
  declarative:
    runtime: go
    modelConfig: default-model-config
    systemMessage: |
      You are a friendly assistant living inside an Agent Substrate sandbox.
      When asked who you are, say "I am hello-substrate, a Go ADK declarative
      agent running inside a gVisor actor."
  platform: substrate
  substrate:
    workerPoolRef:
      name: kagent-default
EOF

Wait for the agent to become ready. The first-time golden snapshot takes about 60–90 seconds.

kubectl wait sandboxagent/hello-substrate -n kagent --for=condition=Ready --timeout=5m

Step 6: Chat with the agent#

In the UI at http://localhost:8001, pick kagent/hello-substrate from the Agents list and send:

What are you, and where are you running? Answer in one sentence.

Expected reply:

I am hello-substrate, a Go ADK declarative agent running inside a gVisor actor.

Behind the scenes, a per-session gVisor actor was restored from the golden snapshot, ran the LLM call, and snapshotted itself back to object storage. Open View → Substrate to see the actor in the inventory — between requests it will sit Suspended.

Cleanup#

kind delete cluster --name kagent-substrate

Next steps#

  • Agent Substrate concept page — runtime architecture and how snapshots, actors, and worker pools fit together.
  • AgentHarness — provision long-running OpenClaw, NemoClaw, or Hermes sandboxes. Set runtime: substrate to run a harness on Agent Substrate too.
  • Agent Sandbox — the upstream kubernetes-sigs/agent-sandbox backend for SandboxAgent, a separate path from substrate.